Better Batches with PyTorchText BucketIterator

How to use PyTorchText BucketIterator to sort text data for better batching.

Disclaimer: The format of this tutorial notebook is very similar to that of my other tutorial notebooks. This is done intentionally to keep readers familiar with my format.

This notebook is a simple tutorial on how to use the powerful PyTorchText BucketIterator functionality to group examples (I use "examples" and "sequences" interchangeably) of similar lengths into batches. This lets us build more efficient batches when training models with text data.

Having batches of similar-length examples provides a lot of gain for recurrent models (RNN, GRU, LSTM) and transformer models (BERT, RoBERTa, GPT-2, XLNet, etc.), because padding will be minimal.

Basically, any model that takes variable-length text sequences as input will benefit from this tutorial.
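To make the padding argument concrete, here is a minimal plain-Python sketch of the bucketing idea. It is not the BucketIterator implementation: the function names `bucket_batches` and `padding_tokens`, the `bucket_size_multiplier` parameter, and the toy data are all hypothetical and chosen only for illustration. The sketch shuffles the examples, sorts each large chunk by length, slices the chunk into batches, and then counts how many pad positions each batching strategy would need.

```python
import random


def bucket_batches(examples, batch_size, bucket_size_multiplier=100, seed=0):
    """Group sequences into batches of similar length (illustrative only).

    Examples are shuffled, split into chunks of
    batch_size * bucket_size_multiplier items, sorted by length inside
    each chunk, and sliced into batches. The batches are shuffled at the
    end so training does not see them in order of length.
    """
    rng = random.Random(seed)
    examples = list(examples)
    rng.shuffle(examples)

    chunk_size = batch_size * bucket_size_multiplier
    batches = []
    for start in range(0, len(examples), chunk_size):
        # Sorting within a chunk puts similar lengths next to each other.
        chunk = sorted(examples[start:start + chunk_size], key=len)
        for i in range(0, len(chunk), batch_size):
            batches.append(chunk[i:i + batch_size])

    rng.shuffle(batches)
    return batches


def padding_tokens(batches):
    """Count pad positions needed to pad every batch to its longest example."""
    total = 0
    for batch in batches:
        longest = max(len(seq) for seq in batch)
        total += sum(longest - len(seq) for seq in batch)
    return total


# Toy data (an assumption for this sketch): 1000 token lists of length 1 to 50.
data_rng = random.Random(1)
data = [["tok"] * data_rng.randint(1, 50) for _ in range(1000)]

# Baseline: batches taken in the original (random) order.
naive = [data[i:i + 32] for i in range(0, len(data), 32)]
bucketed = bucket_batches(data, batch_size=32)

print("padding with naive batches:   ", padding_tokens(naive))
print("padding with bucketed batches:", padding_tokens(bucketed))
```

On this toy data the bucketed batches need far fewer pad positions than the naive ones, which is the gain described above. Sorting only within chunks, rather than over the whole dataset, is a common compromise that keeps some randomness in which examples end up in the same batch.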