Benchmark: Add archive support (generation and training) #744

NicolasHug · 2022-08-18T14:53:04Z

This PR adds support for generating archive-based datasets (tar archives, pickles, torch.save) with different binary data storage formats (BytesIO, tensor). It also enables training on such archive-based datasets with the torchvision recipes.
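For readers unfamiliar with the idea, here is a minimal sketch of what "archive-based" storage can look like. It is not the code from this PR: the helper names (`make_fake_samples`, `write_tar`, `read_tar`, `write_pickle`, `read_pickle`) and the fake sample data are hypothetical, and it only covers tar and pickle archives holding raw bytes, using the standard library. The torch.save and tensor-storage variants are left out to keep the sketch dependency-free.

```python
import io
import pickle
import tarfile


def make_fake_samples(n):
    # Hypothetical stand-in for encoded image files: (name, raw bytes, label).
    return [(f"img_{i}.bin", bytes([i % 256]) * 16, i % 10) for i in range(n)]


def write_tar(samples, fileobj):
    # Store each sample's raw bytes as one member of a tar archive.
    with tarfile.open(fileobj=fileobj, mode="w") as tar:
        for name, data, _label in samples:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def read_tar(fileobj):
    # Walk the members in order and return (name, bytes) pairs.
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        return [(m.name, tar.extractfile(m).read()) for m in tar if m.isfile()]


def write_pickle(samples, fileobj):
    # Store the whole list of (bytes, label) pairs as a single pickle.
    pickle.dump([(data, label) for _name, data, label in samples], fileobj)


def read_pickle(fileobj):
    # Load the full list back in one call.
    return pickle.load(fileobj)


samples = make_fake_samples(4)

# Round-trip through an in-memory tar archive.
tar_buf = io.BytesIO()
write_tar(samples, tar_buf)
tar_buf.seek(0)
tar_items = read_tar(tar_buf)

# Round-trip through an in-memory pickle.
pkl_buf = io.BytesIO()
write_pickle(samples, pkl_buf)
pkl_buf.seek(0)
pkl_items = read_pickle(pkl_buf)
```

In this sketch the tar variant keeps only names and bytes, so labels would have to be stored separately (for example, encoded in the member name), while the pickle variant keeps bytes and labels together but has to be loaded in one piece.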

I'm not sure if this is something in scope for merging or not. If anything, it's useful as a reference for our future discussions regarding these topics.

If this is in scope and you'd like to merge it, a few things can probably be simplified / cleaned up a bit. E.g. it also adds support for tinyimagenet and simplifies the logger output, but that's unrelated to the goal of this PR - I'm happy to clean that up if needed.

I'm currently running these benchmarks on the cluster, which we can discuss tomorrow in our meeting!

Benchmark: Add archive support (generation and training)

a60427f

facebook-github-bot added the CLA Signed label Aug 18, 2022

NicolasHug added 2 commits August 18, 2022 15:57

minor fix, remove unnecessary transform

6f4a61e

comment

46119d5
