
Add batch processing mode #26

Closed
auroracramer opened this issue Jun 18, 2019 · 10 comments
@auroracramer
Collaborator

Something else to consider is a batch processing mode, i.e. making more efficient use of the GPU by predicting on multiple files at once.

Probably the least messy option would be to split some of the interior code of get_audio_embedding into separate helper functions and add a get_audio_embedding_batch function that calls most of the same helpers. We would also have a corresponding process_audio_file_batch function.
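A minimal sketch of that shape, with made-up helper names (`_center_audio`, `_embed_windows` are hypothetical stand-ins for the real preprocessing and model-inference code, not the actual openl3 internals):

```python
import numpy as np

def _center_audio(audio, target_len=8):
    """Hypothetical shared helper: pad or trim a clip to a fixed length."""
    if len(audio) >= target_len:
        return audio[:target_len]
    pad = target_len - len(audio)
    return np.pad(audio, (pad // 2, pad - pad // 2), mode='constant')

def _embed_windows(windows):
    """Hypothetical stand-in for the model forward pass over a batch."""
    return windows.mean(axis=1, keepdims=True)

def get_audio_embedding(audio):
    """Single-clip path: preprocess one clip, then predict."""
    windows = _center_audio(np.asarray(audio))[np.newaxis, :]
    return _embed_windows(windows)

def get_audio_embedding_batch(audio_list):
    """Batch path: reuses the same helpers, but makes one model
    call for all clips, which is where the GPU efficiency comes from."""
    windows = np.stack([_center_audio(np.asarray(a)) for a in audio_list])
    return _embed_windows(windows)
```

The point is that both entry points stay thin wrappers around the same shared code, so there is no duplicated preprocessing logic to keep in sync.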

I thought about changing get_audio_embedding so that it can take either a single audio array or a list of audio arrays (and presumably a list of corresponding sample rates). While this would consolidate multiple use cases into one function, it would probably get pretty messy, so it's probably best we don't do this.

We could ask the same question for visual frame embedding extraction, though there may be more nuance depending on whether we allow individual images to be processed (I think we should). In the case of videos, multiple frames are already provided at once, which raises the question (for me at least) of whether get_vframe_embedding (as I'm currently calling it) should support both a single frame and multiple frames. This also raises the question of whether we allow frames of multiple sizes.

Thoughts?

@justinsalamon
Collaborator

Isn't the most elegant option to just have a single function that takes in a single sample OR a list of samples, and then computes the embedding for everything? Basically what you propose in the middle paragraph? Not sure what the downsides to that are?

@auroracramer
Collaborator Author

I was thinking the downsides would be all of the type introspection and type checking involved, which maybe isn't so bad if we're very clear about it. I just get nervous in Python when implementing things where the types of the inputs change how the function works, particularly when dealing with iterable types. I suppose if we either allow non-np.ndarray iterables or specifically restrict batches to lists, it might be fine. I was thinking there would be fewer surprises with separate functions for single-file and batch processing.

@justinsalamon
Collaborator

All valid points. My concern on the other end is API creep (paraphrasing on feature creep), where the API gets a little crowded. Any chance you can outline the current set of functions we envision for the API (including audio & vision) but excluding batch processing, so we see where we're at?

@auroracramer
Collaborator Author

Sure, I'll put that in #19.

@auroracramer
Collaborator Author

I've updated the proposed API changes in #19. Regarding batch processing, if we want to avoid adding too many functions, we could do the following:

For get_audio_embedding, batch mode would be used if audio is a non-np.ndarray iterable. sr could also be an iterable of the same length as audio if there are variable sample rates. There would have to be a check that if sr is iterable, it matches audio in length (and audio must also be a non-np.ndarray iterable). We could also add a batch_size argument to control how big the batches are (and how much is loaded into memory at once).

For get_vframe_embedding, a similar approach would be taken, where image_arr being a non-np.ndarray iterable would result in batch processing. Similarly, frame_rate could be an iterable of the same length as image_arr (if all items of the batch are videos). Again, a batch_size argument could control how big the batches are (and how much is loaded into memory at once).

For process_audio_file, process_vframe_file, and process_video_file, batch mode would run if filepath is an iterable of non-strings (checked against six.string_types). Though for batch mode we'd have to do some extra work outside of the loop to make the most efficient use of batching without loading all of the files in at once.

Thoughts?

@justinsalamon
Collaborator

Sounds reasonable I guess? Is there strong motivation to support any non-ndarray iterable as opposed to forcing it to be a native python list? My thinking being that a stricter type requirement could help prevent confusion?

@auroracramer
Collaborator Author

The main motivation is to allow users to provide generators. But we can always limit it to lists for now and see if there's demand for that use case.
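For what it's worth, supporting generators would mostly mean chunking the input lazily; a sketch of what that could look like (`iter_batches` is a hypothetical helper, not proposed API):

```python
from itertools import islice

def iter_batches(iterable, batch_size):
    """Hypothetical helper: consume any iterable, including a generator,
    in fixed-size chunks so the full input never sits in memory at once."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

The length-matching check between audio and sr would have to move inside the loop in this case, since a generator has no len().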

@justinsalamon
Collaborator

Maybe I'd start with only supporting lists (simpler API, easier unit tests), and we can decide to expand that if there's demand down the line.

@auroracramer
Collaborator Author

FYI: being addressed in #31

@auroracramer
Collaborator Author

Closed by #37.
