Refactor noise processing #617

Open · wants to merge 5 commits into main
Conversation

@danielhuang danielhuang commented Nov 2, 2024

The current implementation (dropping samples and adding latency) tends to introduce a delay that persists even after CPU load drops and processing is able to keep up again. It also causes unrelated processing (e.g. EasyEffects output processing) to stutter when this happens.

Instead of tracking a delay, the processing thread now reads samples from a ring buffer, and samples are only dropped when the ring buffer is full.
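The drop-on-full behavior can be sketched with a bounded channel standing in for the ring buffer. This is a minimal illustration, not the PR's actual code: `run_pipeline`, the capacity, and the hop size of 480 are all illustrative, and the real plugin runs actual DeepFilterNet inference where this sketch halves samples.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Feed `n_frames` hops into a bounded queue. The caller never blocks;
/// frames are dropped when the queue is full. Returns (processed, dropped).
fn run_pipeline(n_frames: usize, capacity: usize) -> (usize, usize) {
    let (tx, rx) = sync_channel::<Vec<f32>>(capacity);

    // Processing thread blocks on recv(), so it idles when there is no input.
    let worker = thread::spawn(move || {
        let mut processed = 0usize;
        while let Ok(frame) = rx.recv() {
            // Stand-in for the real per-frame noise processing.
            let _out: Vec<f32> = frame.iter().map(|s| s * 0.5).collect();
            processed += 1;
        }
        processed
    });

    let mut dropped = 0usize;
    for _ in 0..n_frames {
        let frame = vec![0.0f32; 480]; // one hop of samples (illustrative)
        match tx.try_send(frame) {
            Ok(()) => {}
            // Queue full: the "Processing thread is overloaded! Dropping frame" case.
            Err(TrySendError::Full(_)) => dropped += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }

    // Dropping the sender lets the worker exit cleanly instead of leaking.
    drop(tx);
    (worker.join().unwrap(), dropped)
}

fn main() {
    let (processed, dropped) = run_pipeline(32, 8);
    // Every frame is either processed or dropped; none are lost silently.
    assert_eq!(processed + dropped, 32);
    println!("processed={} dropped={}", processed, dropped);
}
```

Closing the sender when the plugin is deactivated is also what fixes the leak mentioned below: the worker's `recv()` returns `Err` and the thread exits rather than polling forever.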

Also fixes a resource leak (the background thread continued to poll for samples after deactivation).

Author

danielhuang commented Nov 10, 2024

Mostly done: the calling thread for the plugin no longer blocks (EasyEffects expects the plugin's run method to return quickly, otherwise other audio streams can stutter), and CPU load no longer leads to audio stutters. Since there are no more spin loops, the processing thread stays idle when there is no activity. Latency is also lower when there is no other CPU activity.
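The non-blocking output side can be sketched as follows: the host-facing run path drains whatever the worker has finished via `try_recv` and pads the remainder with silence instead of waiting. `fill_output` and `pending` are hypothetical names for illustration, not identifiers from this PR.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{channel, Receiver};

/// Copy finished samples into the host's output buffer without ever
/// blocking; under-runs are filled with silence. Returns the number of
/// silent samples emitted this cycle.
fn fill_output(rx: &Receiver<Vec<f32>>, pending: &mut VecDeque<f32>, out: &mut [f32]) -> usize {
    // Drain completed frames without blocking the host's audio thread.
    while let Ok(frame) = rx.try_recv() {
        pending.extend(frame);
    }
    let mut silent = 0;
    for s in out.iter_mut() {
        // Under-run: emit silence instead of stalling the caller.
        *s = pending.pop_front().unwrap_or_else(|| {
            silent += 1;
            0.0
        });
    }
    silent
}

fn main() {
    let (tx, rx) = channel::<Vec<f32>>();
    let mut pending = VecDeque::new();

    // Worker has only produced 100 samples so far...
    tx.send(vec![1.0; 100]).unwrap();

    // ...but the host asks for 480; the gap is filled with silence.
    let mut out = vec![0.0f32; 480];
    let silent = fill_output(&rx, &mut pending, &mut out);
    assert_eq!(silent, 380);
    println!("silent samples this cycle: {}", silent);
}
```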

The code should be ready to use; it just needs more testing.

@danielhuang danielhuang marked this pull request as ready for review November 10, 2024 08:17
@Safari77

Does not work with

ffmpeg -i in.wav -af ladspa=file=libdeep_filter_ladspa:plugin=deep_filter_mono:sample_rate=48000:controls="40|-15|35|35|4|0.02" out.wav

It produces only zero-value samples. The Rikorose branch is okay.

2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsifying node #49 "/df_convp/df_convp.4/Relu.low" Max
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsified node #49 "/df_convp/df_convp.4/Relu.low" Max with PulsingWrappingOp
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsifying node #50 "/Add_1" Add
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsified node #50 "/Add_1" Add with PulsingWrappingOp
2025-01-25T13:36:08.230Z | INFO |  df::tract | Init DF decoder
2025-01-25T13:36:08.251Z | INFO |  df::tract | Running with model type deepfilternet3 lookahead 0
2025-01-25T13:36:08.252Z | INFO |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Initialized plugin in 384.5ms
[ladspa/src/lib.rs:213:9] &channels = 1
2025-01-25T13:36:08.252Z | INFO |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | activate
[Parsed_ladspa_1 @ 0x78f618019a80] [debug] handles: 1
[auto_aresample_1 @ 0x78f618031c40] [SWR @ 0x78f618031d40] [debug] Using fltp internally between filters
[auto_aresample_1 @ 0x78f618031c40] [verbose] ch:1 chl:mono fmt:fltp r:48000Hz -> ch:1 chl:mono fmt:s16 r:48000Hz
2025-01-25T13:36:08.252Z | DEBUG |  df::tract | Loading model DeepFilterNet3_ll_onnx.tar.gz
[info] Output #0, wav, to 'dfn.wav':
[info]   Metadata:
[info]     ISFT            : Lavf61.9.106
[info]   Stream #0:0, 0, 1/48000: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, mono, s16, 768 kb/s
[info]     Metadata:
[info]       encoder         : Lavc61.31.101 pcm_s16le
[out#0/wav @ 0x587d9d287580] [verbose] Starting thread...
2025-01-25T13:36:08.252Z | WARN |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Processing thread is overloaded! Dropping frame
[ladspa/src/lib.rs:403:17] &e = "Full(..)"
[ladspa/src/lib.rs:404:17] self.hop_size = 480
2025-01-25T13:36:08.252Z | WARN |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Processing thread is overloaded! Dropping frame
[ladspa/src/lib.rs:403:17] &e = "Full(..)"
[ladspa/src/lib.rs:404:17] self.hop_size = 480

Author

danielhuang commented Feb 17, 2025

The Rikorose branch is okay.

The original implementation was designed for online (real-time) processing; it happens to work with FFmpeg because it only tracks wall-clock time. If processing ran slower than real time (filtering 1 s of audio took longer than 1 s), the delay would grow gradually, silent samples would be inserted into the output stream, and it would eventually crash with "Processing too slow!". It worked on your computer because your hardware is fast enough.

My new implementation is also designed for online processing with programs such as EasyEffects, but it drops samples more aggressively (inserting silent samples in their place), since EasyEffects lags the entire audio stream (all inputs and outputs) if samples are not received on time.

The more correct solution for offline processing (e.g. using FFmpeg on files) would be to ignore timing entirely and instead block the calling thread for each incoming sample as needed. That second approach could also work for real-time processing, but it relies on the host program handling the case where processing falls behind. EasyEffects doesn't do this, hence the first approach.
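The blocking alternative described above can be sketched with the same bounded channel, using a blocking `send` instead of `try_send`: a slow processor then simply slows the feeding side down (backpressure) and no samples are lost. This is a hypothetical illustration of the approach, not code from this PR; frame counts and the sleep are artificial.

```rust
use std::thread;
use std::time::Duration;
use std::sync::mpsc::sync_channel;

/// Feed `n_frames` hops through a deliberately small bounded queue.
/// `send` blocks when the queue is full, so nothing is ever dropped.
/// Returns the total number of samples the worker processed.
fn process_all(n_frames: usize) -> usize {
    let (tx, rx) = sync_channel::<Vec<f32>>(4);

    let worker = thread::spawn(move || {
        let mut total = 0usize;
        while let Ok(frame) = rx.recv() {
            // Simulate processing slower than the producer feeds.
            thread::sleep(Duration::from_millis(1));
            total += frame.len();
        }
        total
    });

    for _ in 0..n_frames {
        // Blocks here when the queue is full: backpressure, not frame drops.
        tx.send(vec![0.0f32; 480]).unwrap();
    }
    drop(tx);
    worker.join().unwrap()
}

fn main() {
    let total = process_all(64);
    // Every sample made it through despite the slow worker.
    assert_eq!(total, 64 * 480);
    println!("processed {} samples, none dropped", total);
}
```

The trade-off is exactly the one noted above: a host like EasyEffects would stall its whole graph on that blocking `send`, which is why the real-time path drops frames instead.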
