experanto.utils.MultiEpochsDataLoader
- class MultiEpochsDataLoader(*args, **kwargs)
Bases: DataLoader
DataLoader that keeps workers alive across epochs.
Works around an issue where the standard DataLoader re-spawns its worker processes at the start of each epoch, which causes significant overhead. Here, workers are initialized once and reused throughout training.
- Parameters:
*args – Positional arguments forwarded to torch.utils.data.DataLoader.
shuffle_each_epoch (bool, default=False) – If True and the underlying dataset has a shuffle_valid_screen_times method, that method is called at the start of every epoch.
**kwargs – Keyword arguments forwarded to torch.utils.data.DataLoader.
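The behaviour described above can be illustrated with a small, torch-free sketch. It is not the experanto implementation: ToyMultiEpochLoader, RepeatSampler, and ToyDataset are hypothetical names invented for this example, and the long-lived iterator merely stands in for the persistent worker processes. The sketch assumes the usual pattern of wrapping the sampler so it repeats indefinitely, creating one iterator up front, and drawing one epoch's worth of items from it on each pass.

```python
from itertools import islice


class RepeatSampler:
    """Hypothetical helper: yields the wrapped sampler's indices forever."""

    def __init__(self, sampler):
        self.sampler = sampler

    def __iter__(self):
        while True:
            yield from iter(self.sampler)


class ToyMultiEpochLoader:
    """Toy stand-in (no torch) for a loader that reuses one iterator."""

    def __init__(self, dataset, shuffle_each_epoch=False):
        self.dataset = dataset
        self.shuffle_each_epoch = shuffle_each_epoch
        # Created once; in a real loader this is where workers would start.
        self._iterator = iter(RepeatSampler(range(len(dataset))))

    def __len__(self):
        return len(self.dataset)

    def __iter__(self):
        # Optional per-epoch hook, called only if the dataset provides it.
        if self.shuffle_each_epoch and hasattr(
            self.dataset, "shuffle_valid_screen_times"
        ):
            self.dataset.shuffle_valid_screen_times()
        # Draw exactly one epoch from the long-lived iterator.
        for index in islice(self._iterator, len(self)):
            yield self.dataset[index]


class ToyDataset:
    """Hypothetical dataset that counts how often the hook is called."""

    def __init__(self):
        self.items = [10, 20, 30]
        self.shuffle_calls = 0

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]

    def shuffle_valid_screen_times(self):
        self.shuffle_calls += 1


dataset = ToyDataset()
loader = ToyMultiEpochLoader(dataset, shuffle_each_epoch=True)
first_iterator = loader._iterator
epochs = [list(loader) for _ in range(3)]
```

In this sketch the same iterator object serves all three epochs, and the hook runs once per epoch. One design consequence worth noting: because the iterator is shared, breaking out of an epoch early leaves it mid-pass, so the next epoch would resume from that position.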
References
https://discuss.pytorch.org/t/enumerate-dataloader-slow/87778
Methods
__init__(*args[, shuffle_each_epoch])