Why no custom collate? #21

Open
alok opened this issue Apr 16, 2021 · 2 comments

Comments

@alok

alok commented Apr 16, 2021

I find them pretty handy in PyTorch.

@darsnack

Don't see a reason why there can't be. We'll just need to update BatchViewCollated to accept a user collate function.

@lorenzoh
Owner

As Kyle pointed out, this will not be quite as straightforward if we want to support in-place data loading with custom collate functions. Below is a sketch of a possible solution; which part applies depends on the use case.

Currently a batch is recursively defined as either:

  • an AbstractArray with one dimension being the batch size
  • a Tuple of batches
  • a NamedTuple of batches
  • a Dict of batches
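
To make the recursive definition concrete, here is a minimal sketch of a collate function that produces exactly these four kinds of batches. `sketch_collate` is a hypothetical name, not the DataLoaders.jl implementation, and it assumes the batch dimension is the last array dimension.

```julia
# Hypothetical stand-in for a default collate; NOT the DataLoaders.jl source.
# Assumption: the batch dimension is a new, last array dimension.

# Base case: arrays of equal size are stacked along dimension N + 1.
function sketch_collate(samples::AbstractVector{<:AbstractArray{T,N}}) where {T,N}
    return cat(samples...; dims = N + 1)
end

# Tuple of batches: collate each position separately.
function sketch_collate(samples::AbstractVector{<:Tuple})
    n = length(first(samples))
    return Tuple(sketch_collate([s[i] for s in samples]) for i in 1:n)
end

# NamedTuple of batches: collate each field separately, keeping the names.
function sketch_collate(samples::AbstractVector{<:NamedTuple})
    ks = keys(first(samples))
    return NamedTuple{ks}(Tuple(sketch_collate([s[k] for s in samples]) for k in ks))
end

# Dict of batches: collate each key separately.
function sketch_collate(samples::AbstractVector{<:Dict})
    return Dict(k => sketch_collate([s[k] for s in samples]) for k in keys(first(samples)))
end

# Usage: four observations, each a (2×2 image, 1-element label) tuple.
samples = [(rand(Float32, 2, 2), [i]) for i in 1:4]
batch = sketch_collate(samples)
```

Because every branch bottoms out in plain arrays, each observation corresponds to a slice of those arrays, which is what makes in-place loading possible for the default case.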

Importantly, getobs! is a property of the data container, not the BatchViewCollated. Let's say we have a data container DC with observations of type O so we have: getobs(::DC, idx)::O and getobs!(::O, ::DC, idx)::O.
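
As an illustration of that pair of signatures, here is a toy data container. `ColumnContainer` is made up for this example, and `getobs`/`getobs!` are defined locally as stand-ins rather than as extensions of the real interface functions.

```julia
# Toy data container DC: each observation O is one column of a matrix.
# `ColumnContainer` is hypothetical; the two functions below are local
# stand-ins that mirror the signatures described above.
struct ColumnContainer
    data::Matrix{Float32}
end

# Allocating version: getobs(::DC, idx)::O
getobs(dc::ColumnContainer, idx::Int) = dc.data[:, idx]

# In-place version: getobs!(::O, ::DC, idx)::O
# Writes the observation into `buf` and returns the same buffer.
function getobs!(buf::AbstractVector{Float32}, dc::ColumnContainer, idx::Int)
    copyto!(buf, view(dc.data, :, idx))
    return buf
end

# Usage: the buffer may be a plain vector or a view into a larger batch array.
dc = ColumnContainer(Float32[1 2 3; 4 5 6])
batchbuf = zeros(Float32, 2, 3)
getobs!(view(batchbuf, :, 2), dc, 2)   # fills only the second column
```

The container only ever sees a buffer of type O, so it needs no knowledge of how the batch that owns that buffer is laid out.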

The question is what you want to achieve with a custom collatefn. If you want to return custom data types as batches, then the following would work:

  • have a custom collatefn that returns batches of type B
  • define a method of DataLoaders.obsslices(::B, ::DataLoaders.BatchDim) that returns an iterator over views of type O. For example, if O is an array type, then it should return array views.
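
A rough sketch of these two steps, under stated assumptions: `FeatureBatch`, `sketch_collatefn` and `sketch_obsslices` are hypothetical names. The last one stands in for a method of `DataLoaders.obsslices` and leaves out the `DataLoaders.BatchDim` argument, since its exact semantics are not spelled out here.

```julia
# Hypothetical custom batch type B: a thin wrapper around a feature matrix
# whose columns are the individual observations (O = vector of Float32).
struct FeatureBatch
    data::Matrix{Float32}
end

# Step 1: a custom collate function that returns batches of type B.
sketch_collatefn(samples::AbstractVector{<:AbstractVector{Float32}}) =
    FeatureBatch(reduce(hcat, samples))

# Step 2: an iterator over per-observation views into the batch.
# Stand-in for a method of DataLoaders.obsslices; the BatchDim argument
# is omitted in this sketch.
sketch_obsslices(b::FeatureBatch) = (view(b.data, :, i) for i in axes(b.data, 2))

# Usage: writing into a slice mutates the batch, which is what an
# in-place getobs! needs.
fbatch = sketch_collatefn([Float32[1, 2], Float32[3, 4]])
slices = collect(sketch_obsslices(fbatch))
slices[2] .= 0
```

Since the slices are views rather than copies, a buffered loader could reuse one `FeatureBatch` and refill it observation by observation.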

Of course, if we don't want to support buffering and custom collate functions (as is the case in PyTorch if I'm not mistaken), we could simply make buffered and collatefn arguments on DataLoader mutually exclusive.
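
The mutually exclusive variant could be as simple as a check when the loader is constructed. This is only a sketch: `check_loader_options` is a hypothetical helper, and the keyword names `buffered` and `collatefn` are taken from the discussion above rather than from an existing constructor signature.

```julia
# Hypothetical validation helper; not part of DataLoaders.jl.
# Rejects the combination of buffering and a user-supplied collate function.
function check_loader_options(; buffered::Bool = false, collatefn = nothing)
    if buffered && collatefn !== nothing
        throw(ArgumentError(
            "`buffered = true` cannot be combined with a custom `collatefn`"))
    end
    return (buffered = buffered, collatefn = collatefn)
end

# Usage: either option alone is accepted.
opts_buffered = check_loader_options(buffered = true)
opts_custom = check_loader_options(collatefn = identity)
```

This keeps the buffered path limited to the default batch types, at the cost of extra allocations for anyone who needs a custom collate function.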


3 participants