Read/write original files as arrays, and a completely memory-based processing flow #294

Open
fisheggg opened this issue Mar 15, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@fisheggg
Contributor

fisheggg commented Mar 15, 2023

Read/write original files as arrays

For further processing and training purposes, it would be nice to be able to read the original files of MgVideo and MgAudio objects as arrays.

I understand that this can already be done with OpenCV or librosa, but a shortcut function would be handy, for example:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')

video_array, fps = video.numpy() # returns a numpy array of the video file

Conversely, we could also consider adding a way to initialize an MgVideo object from an array, for example:

video_array, fps = video.numpy()

# this initializes a new MgVideo object and saves the array as a file
# (maybe with a default path and filename if not specified)
new_object = musicalgestures.MgVideo(array=video_array, fps=fps, path='your/custom/path', filename='your_filename.avi')

The function name, style, parameters and default values need to be further discussed.

Memory-based processing flow

The current workflow creates a new file after each processing step. This might become an issue when running a batch process over a large dataset, say 100,000 video files (although I guess other performance issues are more critical in such a situation...)

So I suggest adding a flag that tells the program not to save a new file but to return an array instead. For example:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_grid = video.grid(height=300, rows=1, cols=9, return_array=True) # video_grid will be a numpy array, and no files will be created

In fact, I would recommend making the memory-based flow the default, since it is safer in terms of storage management. I would suggest only creating new files when the user explicitly asks for them.
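To make the idea concrete, here is a minimal sketch of what a grid computed entirely in memory could look like. `grid_from_frames` is a hypothetical helper written for illustration, not the musicalgestures implementation; it assumes the video is already loaded as an array of shape `(n_frames, height, width, channels)` and it skips the resizing that a `height=300` argument would imply.

```python
import numpy as np


def grid_from_frames(frames: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Tile rows*cols evenly spaced frames into one image, without touching disk.

    Hypothetical helper for illustration only.
    frames: array of shape (n_frames, height, width, channels).
    """
    n_frames, height, width, channels = frames.shape
    # Pick rows*cols frame indices spread evenly over the whole video.
    idx = np.linspace(0, n_frames - 1, rows * cols).round().astype(int)
    picked = frames[idx]  # shape: (rows*cols, height, width, channels)
    # Arrange the picked frames as (rows, cols, H, W, C), then interleave the
    # axes so that each tile lands at its row/column position in the image.
    tiles = picked.reshape(rows, cols, height, width, channels)
    return tiles.transpose(0, 2, 1, 3, 4).reshape(rows * height, cols * width, channels)
```

With a flag such as `return_array=True`, a method could return this array directly and only encode it to a file when the user asks for one.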

@fisheggg fisheggg changed the title Read/write original files as arrays Read/write original files as arrays, and a completely memory-based processing flow Mar 15, 2023
@balintlaczko
Collaborator

Very good suggestions. I agree that the file-based workflow can generate a lot of clutter if you don't manage it proactively, and that can become tedious with large batches of input files. One issue I originally thought of is that you might not have enough RAM to buffer entire videos as uncompressed matrices. One example @alexarje often brought up was the 8-hour Bergensbanen video, a train's dash cam recording of the trip from Bergen to Oslo, which he used for some examples. But I think that one is more of an exception, in which case the user could choose to render files instead. In most "everyday" cases the memory-based workflow would probably have much less inertia and would connect well to other CV-related workflows.
The only thing is that with the current FFmpeg backend it might take some experimentation to get the piping right. I haven't really used it like this before, but I know it's possible.
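For a rough sense of scale, the size of an uncompressed video is just frames × width × height × channels × bytes per sample. The sketch below works that out; the resolution (1920×1080) and frame rate (25 fps) are assumptions chosen for illustration, not properties of the Bergensbanen file.

```python
def uncompressed_video_bytes(duration_s, fps, width, height,
                             channels=3, bytes_per_sample=1):
    """Bytes needed to hold every frame of a video as a uint8 array."""
    n_frames = int(duration_s * fps)
    return n_frames * width * height * channels * bytes_per_sample


# Assumed parameters: 1920x1080 RGB at 25 fps.
eight_hours = uncompressed_video_bytes(8 * 3600, fps=25, width=1920, height=1080)
ten_seconds = uncompressed_video_bytes(10, fps=25, width=1920, height=1080)

print(f"8 h:  {eight_hours / 1e12:.2f} TB")  # 8 h:  4.48 TB
print(f"10 s: {ten_seconds / 1e9:.2f} GB")   # 10 s: 1.56 GB
```

Under these assumptions even a ten-second clip takes more than a gigabyte, so reading frame by frame (or downscaling first) matters well before the eight-hour case.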

@alexarje
Contributor

Yes, there are some good points here. Personally, I find it very practical to work with files written to disk. It works well for large files that don't fit in memory (I often work with hour-long files), and it saves me when things crash along the way, since I don't need to start everything again. But I see that it would be practical to have an option for choosing whether to write files to disk. How much work would it entail to add such an option, @balintlaczko?

@balintlaczko
Collaborator

I can imagine that _utils.ffmpeg_cmd could be modified to pipe output that we in turn read in as a numpy array, either frame-by-frame or the whole video at once, depending on the use case and memory constraints.
There is a promising thread about this here.
If we could implement it right in that function, then it might not take that long, actually.
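A minimal sketch of that piping idea, assuming FFmpeg is on the PATH and the frame width and height are already known (for example from ffprobe). The function names here are hypothetical and are not the actual `_utils.ffmpeg_cmd` code; the FFmpeg options used (`-f rawvideo`, `-pix_fmt rgb24`, `pipe:1`) are standard ones for writing raw RGB frames to stdout.

```python
import subprocess

import numpy as np


def rawvideo_cmd(path):
    """FFmpeg command that decodes `path` to raw RGB frames on stdout."""
    return ["ffmpeg", "-loglevel", "error", "-i", path,
            "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:1"]


def iter_frames(stream, width, height):
    """Yield one (height, width, 3) uint8 frame at a time from a raw RGB byte stream."""
    frame_size = width * height * 3
    while True:
        chunk = stream.read(frame_size)
        if len(chunk) < frame_size:  # end of stream, or a truncated last frame
            break
        # frombuffer gives a read-only view of the bytes; copy() it if you need to modify it.
        yield np.frombuffer(chunk, dtype=np.uint8).reshape(height, width, 3)


def load_video(path, width, height):
    """Decode the whole video into one (n_frames, height, width, 3) array."""
    with subprocess.Popen(rawvideo_cmd(path), stdout=subprocess.PIPE) as proc:
        frames = list(iter_frames(proc.stdout, width, height))
    return np.stack(frames)  # raises ValueError if no frame was decoded
```

`iter_frames` covers the frame-by-frame case with a constant memory footprint, while `load_video` buffers everything at once; the same generator backs both, so the choice can be left to the caller.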

@joachimpoutaraud
Contributor

joachimpoutaraud commented Nov 7, 2023

Thanks for the cool suggestions @fisheggg!

I have added a new numpy() function to the MgVideo class to load the video frames as an array with FFmpeg. For that, I have also updated the ffmpeg_cmd function in order to be able to either 'read' the video frame by frame or 'load' all the frames in memory.

It is now possible to load the video as an array as follows:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_array, fps = video.numpy() # returns a numpy array of the video file 

@joachimpoutaraud joachimpoutaraud added the enhancement New feature or request label Nov 7, 2023
joachimpoutaraud added a commit that referenced this issue Dec 5, 2023
@joachimpoutaraud
Contributor

joachimpoutaraud commented Dec 5, 2023

I have added the possibility to generate an MgVideo object from an array. For example, if you first read the original video file as an array:

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
video_array, fps = video.numpy() # returns a numpy array of the video file 

You can then generate a new MgVideo object from that array as follows:

# Generates a new MgVideo object and saves the array as a video file
new_mg = musicalgestures.MgVideo(filename='your_filename.avi', array=video_array, fps=fps, path='your/custom/path')

Finally, I have also updated the _grid.py script with a memory-based processing flow. Now, it is possible to return an array instead of writing the video file to disk.

from matplotlib import pyplot as plt

video = musicalgestures.MgVideo('/path/to/a/video.mp4')
# video_grid will be a numpy array, and no files will be created
video_grid = video.grid(height=300, rows=3, cols=3, return_array=True) 

# Plot the grid image
plt.figure(figsize=(40, 5))
plt.imshow(video_grid) 
plt.show()

4 participants