A question #8

Open
francqz31 opened this issue Oct 1, 2024 · 2 comments

@francqz31

Really amazing job. This field has been abandoned for a couple of years now.
Do you think the approach from this paper could be used to turn MP3s into lossless FLAC if it were trained on a bigger, better dataset?
Also, did you try turning a compressed MP3 into a high-bitrate lossless FLAC (e.g. >1411 kbps) or a high-bitrate WAV?

Thanks in advance
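
For context on the 1411 kbps figure in the question: it corresponds to uncompressed CD-quality PCM (44.1 kHz, 16-bit, stereo). The sketch below only reproduces that arithmetic; the function name `pcm_bitrate_kbps` is made up for illustration and is not part of this project. Note that FLAC is losslessly compressed, so the bitrate of an actual FLAC file is usually lower than the PCM rate of the audio it decodes to.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bit_depth * channels / 1000


# CD-quality audio: 44.1 kHz, 16-bit, stereo
cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
print(cd_kbps)  # 1411.2

# A 96 kHz / 24-bit stereo stream, for comparison
hires_kbps = pcm_bitrate_kbps(96_000, 24, 2)
print(hires_kbps)  # 4608.0
```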

@JusperLee
Owner

I've experimented with higher bitrate WAV files, and the results are promising. However, I believe we need more data to improve the outcomes further. Looking forward to any suggestions or insights!

@Manoa1911

Manoa1911 commented Dec 10, 2024

The technology is very good :) I was surprised by the quality of inference on my vinyl records (some of which were degraded). I hope a 96 kHz model becomes available in the future, if it is possible at all to train for vinyl degradation. For that you would need not only correct sample alignment but also the exact same signal in digital form as on the vinyl (a CD source won't work, because the mix/mastering for CD is not at all the same as for vinyl). Then there is also the problem of different turntables, heads, styli, preamps and sound cards...
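
The sample-alignment step mentioned above could be approached, as a first pass, with FFT-based cross-correlation. This is only a minimal sketch using NumPy, not anything from the Apollo codebase: the function name `estimate_lag` and the synthetic test signal are assumptions for illustration. It estimates a single global offset, so it would not correct for turntable speed drift, and it does nothing about the mastering differences described above.

```python
import numpy as np


def estimate_lag(reference: np.ndarray, capture: np.ndarray) -> int:
    """Return how many samples `capture` lags behind `reference`.

    A positive result means `capture` starts later than `reference`;
    a negative result means it starts earlier.
    """
    n = len(reference) + len(capture) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two, avoids circular wrap-around
    ref_spec = np.fft.rfft(reference, nfft)
    cap_spec = np.fft.rfft(capture, nfft)
    # Cross-correlation via the correlation theorem
    xcorr = np.fft.irfft(cap_spec * np.conj(ref_spec), nfft)
    lag = int(np.argmax(xcorr))
    # Indices past the end of `capture` represent negative lags
    if lag >= len(capture):
        lag -= nfft
    return lag


# Synthetic check: delay white noise by 37 samples and recover the offset
rng = np.random.default_rng(0)
reference = rng.standard_normal(2048)
delayed = np.concatenate([np.zeros(37), reference])[:2048]
print(estimate_lag(reference, delayed))  # 37
```

On real recordings the digital source and the vinyl capture would first have to be at the same sample rate, and a drifting offset would need the lag estimated per segment rather than once for the whole file.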

Would you like me to send you some files that get misresolved?
Some files resolve flat-out wrong, while others get a significant presence reduction: you can see a big decrease in frequency density in the spectrum. Sometimes this is good, but other times it sounds dull. At first I thought this was a problem related to downsampling, but it isn't.

It's kind of like this: when a track has high frequency density, with a significant amount of inter-frequency crosstalk, Apollo makes it sound clear, because a high amount of crosstalk causes interference, so when Apollo separates the frequencies it sounds better. The problem is when the source file has low frequency density/crosstalk and Apollo does the same thing, except this time the effect has a negative consequence: it dulls the presence of the track :(

It also kills echoes and reverbs :(
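
One rough way to put a number on the presence loss described above is to compare the share of spectral energy above some cutoff before and after restoration. The sketch below is an assumed proxy, not a metric used by Apollo: the function name `high_band_energy_ratio`, the 8 kHz cutoff, and the synthetic tones are all illustrative choices.

```python
import numpy as np


def high_band_energy_ratio(signal: np.ndarray, sample_rate_hz: int,
                           cutoff_hz: float = 8000.0) -> float:
    """Fraction of total spectral energy at or above `cutoff_hz`."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    total = power.sum()
    if total == 0:
        return 0.0  # silent input: no energy in any band
    return float(power[freqs >= cutoff_hz].sum() / total)


# Synthetic example: a 440 Hz tone with and without a quieter 12 kHz component
sample_rate = 44_100
t = np.arange(sample_rate) / sample_rate  # exactly one second
dull = np.sin(2 * np.pi * 440 * t)
bright = dull + 0.5 * np.sin(2 * np.pi * 12_000 * t)

ratio_dull = high_band_energy_ratio(dull, sample_rate)
ratio_bright = high_band_energy_ratio(bright, sample_rate)
print(ratio_dull < ratio_bright)  # True
```

Computing this ratio on an input file and on its restored output would give a simple number to attach to a bug report; a large drop would match the dull-sounding cases.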

I would also recommend training the model on AAC, AC3, OGG and other codecs, so that users have something that fits the distortions of each specific codec :)


3 participants