-
Couldn't load subscription status.
- Fork 323
encode & decode 32-bit float (IEEE 754 standard) samples. #842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…standard) samples. I've also added support to encode & decode raw, wave riff & aiff float formats. The float encoding feature achieves a near 70% compression ratio, which is better than nothing. no oss-fuzz or tests have been added yet (sorry). the replay gain feature should also be expanded in the future to support the new feature. sorry for the big PR. it's pretty readable tho! I tried to make it as modular & independent as possible. documentation is good. should this make it to a release version, an update to the standard RFC could also be considered. the changes are backwards-compatible and are as follows: 1. when storing float samples, the bps bits in the streaminfo metadata block should be 0b00000. 2. when storing float samples, the zero-padded bit, originally reserved, after the bit-depth bits of each frame header should be 1. 3. when storing float samples, the actual data samples stored in each subframe are obtained by doing some bit manipulation before encoding and after decoding. That part is in src/libFLAC/transform_float.c and is necessary to boost the compression (more info in the file comments). Monkey's Audio .ape does something similar, but this one seems to achieve better ratios (~8% improvement). sorry for the unorthodox type conversions. will fix it if I see it being considered for a merge.
|
Hi, Many thanks for filing a PR. I haven't tested it, and as you might understand, this will need a lot of testing :) I've tried doing this in the past, but didn't get very far. float PCM coding has been a often-requested feature, so this would be a great addition in that regard. I have some "objections on principal grounds" to floating point audio in FLAC. For example, floating point audio isn't meant for playback. Also, because this breaks forward compatibility. It feels to me like this will cause confusion for a lot of users, as these FLAC files will not playback on current equipment. But then again, that was also the case (and still is for some newly manufactured equipment) for 24-bit audio. I'm not saying this won't be merged, but I'll need to hear a few opinions of other people involved in the FLAC project. Still, I already know a lot of people want this. I am surprised you didn't need any new coding methods (subframe types or residual coding methods). I think this is really nice. FLAC is also used for compression measurement signals (for scientific experiments and such) where this also might be useful. Not sure whether your bit manipulation works for that as well. After I test this myself (which might take a while), I would like to propose this change through various channels with FLAC enthousiasts. However, before that, I would like to make the format changes future-proof. The problem is that the reserved bit after the bit depth bits that you use, is the last remaining reserved bit. Use of this makes it impossible for the frame header to extended any further in the future. I would like this bit to have a different function: signal an extension to the frame header with one byte of extra flag bits (or feature bits) and a channel mask. Anyway, many thanks for taking on this task. Please be patient, as merging might take quite a while, and publishing a release after that a while longer. |
|
By the way, perhaps we should also define a new STREAMINFO metadata block variant. |
|
Thx!
There's also another reserved bit in the frame header, just after the sync bits at the very beginning (
I don't think it'd be necessary. A bps of 1-3 is forbidden, according to the current standard rfc. Using them seems like a no-brainer to me. It also suggests (to the older decoders) to stop playing these (cause of the forbidden value), and requires minimal changes in the standard and other implementations. p.s. Sorry for closing the PR, wrong button 😅 |
No, there isn't. The code still refers to it, but the RFC made it part of the sync code. That is to make it distinguishable from MPEG. See here: https://lists.xiph.org/pipermail/flac-dev/2008-December/002607.html So, we only have one bit remaining.
When it is necessary to deviate from the standard, I'd like to do it in a clear and conscious way. So, there'll be more features incorporated in a new streaminfo metadata block, like increasing the max number of samples or perhaps increasing the max samplerate and the total number of samples. Still really niche stuff, so it is really only necessary for exotic stuff. The thing is, like using floats (which is a niche), there are others using FLAC for stuff it wasn't made for, like RF captures. So, this will take a long time. |
Oh, OK. |
|
I completely agree both on
There is also the 64-bit float format (in both endiannesses) - not that it is any more urgently needed for listening, but for compatibility (edit: with DAW plugins, for example) it wouldn't be a bad thing for a FLAC plugin to be able to handle "everything" the application could save. (64-bits would likely need more than 5-bit Rice, but who cares if that element also makes today's decoders err out on something they cannot decode.) Here is a part of a possible solution, if we adopt the following view:
The FLAC format admits sample rate "0" for non-audio. It could also be used for "audio in files that are stored as non-audio", with the FLAC format being used to compress the file and not just the audio stream: interpret the use of "0" as "files you need to treat as a full file", and a player/DAW should then skip it unless it knows what it is doing. There are a bunch of APPLICATION block types left to be used. You got ones for foreign metadata already, but here you might consider ones for mandatory file headers/footers (I think footer would be potentially more crucial for float than for integer, with possible metadata chunks for volume?!), and maybe one for audio properties the decoder needs to reconstruct the files (endianness, and signendness although that isn't applicable for float) - and source file extension (like WavPack does)? So the workflow of a "player"/DAW would then be:
This could also make it possible to store AIFF(/CAF) with non-integer sampling rates - or object-based audio in the BW64 format (like Monkey's and WavPack do), as there would be no need for FLAC to "understand" the objects metadata. Sure the _en_coder does need to know what a sample and a channel is, and if it is not aware of the input format it could be force-fed with "raw format options" plus OptimFROG-stype --headersize and --tailsize. As long as the source file is structured in the order header--audio--footer and nothing between audio and audio, then it would future-proof against new file types? (And past-proof enough to encode .au and A-law and µ-law ... just what the world has not been waiting for.)
In frame headers then:
And more:
|
|
How does WavPack (or its patented competitor, DST) compress 1-bit PWM streams, I don't know. I suppose we also need to reserve some space for defining new subframe types later? AFAIK, with As for the frame header, I will use bit_depth= I'm more in favour of defining a new STREAMINFO_EXTENSION metadata block type (instead of wrapping it inside an APPLICATION block), with the same workflow you said.
|
|
Oof my bad, I wrote APPLICATION when I meant "metadata". I agree yes, different (new) type - there are 120 left, not running out soon. STREAMINFO_EXTENSION to inform a "new audio type-aware" decoder/player how to play, and a "this block type-aware but won't play" decoder how to order the decoded bits wrt. endianness and signedness and interleaving. Then I suggest types to inform about and store file header and footer when those are "mandatory" to get output right; If full file headers/footers are indeed indispensable, there should be a way to get them past the 16MiB by spanning them over several blocks if necessary; whether the STREAMINFO_EXTENSION contains the info about that or there are just block types for "first" and "continued" header and ditto footer ... there are many ways.
Though I tend to disagree (I think that "subset" should not become more permissible, that would put new demands on finalized software that claims to decode subset), and maybe it has to be sorted out before committing, but ... not needed yet. |
|
Well, now that we're modifying the STREAMINFO block, we can also replace the 3-byte block size indicator with an elias gamma representation... |
I've also added support to encode & decode raw, wave riff, & aiff float formats.
The float encoding feature achieves a near 70% compression ratio, which is better than nothing. No oss-fuzz or tests have been added yet (sorry). The replay gain feature should also be expanded in the future to support the new feature. Sorry for the big PR. It's pretty readable tho! I tried to make it as modular & independent as possible. Documentation is good. Should this make it to a release version, an update to the standard RFC could also be considered. The changes are backwards-compatible and are as follows: