FFV1 CUDA accelerated version #269
Comments
AFAIK there is no CUDA implementation but this is on my todo-list.
I expect that many parts are highly parallelizable, but the range coder (which consumes the largest share of the time) may be trickier to parallelize; we'll know the gain only once it is implemented. |
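To make the split between the parallel-friendly parts and the serial entropy coder a bit more concrete, here is a minimal Python sketch of the median prediction step that FFV1 uses. It is an illustration under stated assumptions, not code from any existing implementation: the function names are hypothetical, samples outside the plane are treated as 0 (a simplification of the border rules in the specification), and the modular wrap of the residuals is left out.

```python
def median3(a, b, c):
    """Return the middle value of three integers."""
    return sorted((a, b, c))[1]


def predict(plane, x, y):
    """Median prediction from left (L), top (T) and top-left (TL) neighbours.

    Assumption: positions outside the plane read as 0.
    """
    def at(xx, yy):
        return plane[yy][xx] if xx >= 0 and yy >= 0 else 0

    left, top, top_left = at(x - 1, y), at(x, y - 1), at(x - 1, y - 1)
    return median3(left, top, left + top - top_left)


def residuals(plane):
    """Encoder side: each residual reads only original samples,
    so every (x, y) could be computed independently of the others."""
    height, width = len(plane), len(plane[0])
    return [[plane[y][x] - predict(plane, x, y) for x in range(width)]
            for y in range(height)]


def reconstruct(res):
    """Decoder side: each sample needs already-decoded neighbours,
    so the loop has to run in raster order."""
    height, width = len(res), len(res[0])
    plane = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            plane[y][x] = res[y][x] + predict(plane, x, y)
    return plane


plane = [[10, 12, 13], [11, 12, 14]]
res = residuals(plane)
```

In a lossless encoder the prediction only looks at original samples, so this step maps naturally to one GPU thread per pixel. The adaptive entropy coder that follows updates its state symbol by symbol, which is why it is the harder part to spread across threads within a slice.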
We have used an in-house implementation of FFV1 on CUDA (GeForce) daily since before the pandemic. The parallelisation could be improved by slightly modifying the bitstream syntax. |
Thank you for your feedback! @retokromer: Have you considered putting it in the public domain? If not, do you have an estimate of the developer-hours needed for the development? Can you give a bit more detail about what you are encoding (resolution/FPS) and what performance you get? I'm also considering VC-2 as a codec, but FFV1 has existed for a long time and seems to be the de facto standard in the video industry. However, the VC-2 website provides a good overview of the achievable performance/compression ratio. Do you know where I can find such a resource for FFV1? |
Yes, that’s the plan. Yet no schedule is set, as I do this in my spare time.
I guess @pjotrek-b and @digitensions have some. On my end, I prepared figures for NTTW4 in 2019 in Budapest, but in the end I could not attend because of a date conflict. One of the aspects I explored was the influence of the |
We compared FFV1 with JPEG-2000 in a study a few years ago. An old but still relevant study compares FFV1 with a couple of other more or less open lossless formats (but not VC-2/Dirac), here is an excerpt: There is also a quick study including formats requiring royalties. That said, we definitely lack up-to-date comparison charts about speed and compression. |
We might put some resources internally on this project, is there anything we can do to make things happen?
Great, do not hesitate to share, I will start experimenting with FFV1 next week :)
This document is gold, thank you for sharing your work! Maybe it can find its place in the Wiki section of this repo?
Found this one, but I guess every codec has evolved since then! |
How exciting! 😄 @lp35: My performance stats @retokromer refers to can be found here: Since they were generated for/during development of FFV1.2+, they're not only dated (2012) but also a bit hard to read. In a nutshell:
|
Hi @retokromer, IIUC the FFV1 implementation that you authored is currently the only GPU-accelerated version. Could you consider publishing your implementation under an open license? I think an open GPU-based FFV1 encoder would be incredibly helpful, but I want to prevent redundant work. |
@dericed That is indeed the plan! A good and blessed year. |
Cool. 😎 |
@dericed What is your deadline? Can this wait until February? |
@retokromer could you share whether your GPU-based FFV1 encoder supports range coding, or just Golomb-Rice? |
@dericed I am actually a big fan of arithmetic coding, and indeed we have implemented range coding. |
Hi @retokromer, with your implementation of range coding on the GPU, did you implement 10-16 bit coding? If so, did you notice any substantial change in the compression ratio? Did you adjust the slice size in any way to accommodate this? |
@retokromer: cool. 😎 |
I know it's Vulkan and not CUDA, but possibly interesting in this context here? Received a commit yesterday titled "ffv1enc: add a Vulkan encoder" |
@dericed We mainly use 12 bit and 16 bit. I already posted somewhere how we use slices: in short, a small multiple (power of 2) of the available cores gives the best performance. Recently I explored further the optimisation possibilities between the different bands of a multispectral scan, but in the past I was also interested in optimising classic RGB (or CMY or YCbCr or YCoCg). In the real world, the compression ratio depends on many factors, an important one being the resolution. If you choose a higher resolution, you increase the noise, which is hard to compress (and people often tend to “overkill” on resolution). @pjotrek-b I don’t know how it could be used as a patch for FFmpeg. It is another implementation of the codec. At the beginning I wrote it in order to gain an in-depth understanding of how it works (that was during the standardisation process). This also gave me some ideas for improvements to version 4. I posted many (all?) of them here on GitHub and I also presented them at various editions of No Time to Wait. |
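As a rough illustration of the slice rule of thumb above (a power-of-2 multiple of the available cores), the following Python sketch picks a slice count and splits it into a near-square horizontal-by-vertical grid. The helper name `slice_grid` and the parameter `k` are hypothetical, and this is only one possible reading of the rule, not the method used in the implementation discussed here. A real encoder may accept only certain slice counts, so the result would still need checking against it.

```python
import math


def slice_grid(cores, k=1):
    """Return (horizontal, vertical) slice counts for cores * 2**k slices.

    Assumption: the grid is made as square as possible by taking the
    largest divisor of the total that does not exceed its square root.
    """
    if cores < 1 or k < 0:
        raise ValueError("cores must be >= 1 and k must be >= 0")
    total = cores * (2 ** k)
    vertical = math.isqrt(total)
    while total % vertical:
        vertical -= 1
    return total // vertical, vertical


grid = slice_grid(8, k=1)  # 16 slices on 8 cores
```

Since slices are coded independently of each other, a grid like this gives every core at least one slice to work on, and the extra factor of 2**k leaves some slack when slices differ in how long they take.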
Off-topic, but my Vulkan encoder was just sent to the ML. It supports all pixel formats, along with all version 3 and 4 features. It's got some interesting optimizations, with more coming up. |
It’s a bit more complicated than that if you want the most optimised code possible, but yes, it’s not rocket science. Our first implementation on CUDA was before the pandemic; in the meantime we also worked with Vulkan. Both solutions have pros and cons … I personally do not have a real preference. |
Hi,
Not sure I'm in the right place to ask this question, but I would like to know if there is any implementation of FFV1 available for CUDA.
If not, do you have any advice on the complexity/feasibility of porting FFV1 to the CUDA platform? Is it highly parallelizable?
Don't hesitate to point me to other online resources/repos/people if I'm not in the right place!
Thanks for your time