Refactor noise processing #617

Open · wants to merge 5 commits into main
Conversation

@danielhuang danielhuang commented Nov 2, 2024

The current implementation (dropping samples and adding latency) tends to introduce a delay that persists even after CPU load drops and processing is able to keep up again. It also causes unrelated processing (e.g. EasyEffects output processing) to stutter when this happens.

Instead of tracking a delay, the processing thread now reads samples from a ring buffer, and samples are only dropped when the ring buffer is full.
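The drop-on-full behavior can be sketched with a bounded channel standing in for the ring buffer. This is a minimal illustration, not the PR's actual code: `run_pipeline`, the capacity, and the hop size of 480 are all illustrative, and the real plugin runs actual DeepFilterNet inference where this sketch halves samples.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Feed `n_frames` hops into a bounded queue. The caller never blocks;
/// frames are dropped when the queue is full. Returns (processed, dropped).
fn run_pipeline(n_frames: usize, capacity: usize) -> (usize, usize) {
    let (tx, rx) = sync_channel::<Vec<f32>>(capacity);

    // Processing thread blocks on recv(), so it idles when there is no input.
    let worker = thread::spawn(move || {
        let mut processed = 0usize;
        while let Ok(frame) = rx.recv() {
            // Stand-in for the real per-frame noise processing.
            let _out: Vec<f32> = frame.iter().map(|s| s * 0.5).collect();
            processed += 1;
        }
        processed
    });

    let mut dropped = 0usize;
    for _ in 0..n_frames {
        let frame = vec![0.0f32; 480]; // one hop of samples (illustrative)
        match tx.try_send(frame) {
            Ok(()) => {}
            // Queue full: the "Processing thread is overloaded! Dropping frame" case.
            Err(TrySendError::Full(_)) => dropped += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }

    // Dropping the sender lets the worker exit cleanly instead of leaking.
    drop(tx);
    (worker.join().unwrap(), dropped)
}

fn main() {
    let (processed, dropped) = run_pipeline(32, 8);
    // Every frame is either processed or dropped; none are lost silently.
    assert_eq!(processed + dropped, 32);
    println!("processed={} dropped={}", processed, dropped);
}
```

Closing the sender when the plugin is deactivated is also what fixes the leak mentioned below: the worker's `recv()` returns `Err` and the thread exits rather than polling forever.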

Also fixes a resource leak (the background thread continued to poll for samples after deactivation).

Author

danielhuang commented Nov 10, 2024

Mostly done: the calling thread for the plugin no longer blocks (EasyEffects expects the plugin's run method to return quickly, otherwise other audio streams can stutter), and CPU load no longer leads to audio stutters. Since there are no more spin loops, the processing thread stays idle when there is no activity. Latency is also lower when there is no other CPU activity.
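The non-blocking output side can be sketched as follows: the host-facing run path drains whatever the worker has finished via `try_recv` and pads the remainder with silence instead of waiting. `fill_output` and `pending` are hypothetical names for illustration, not identifiers from this PR.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{channel, Receiver};

/// Copy finished samples into the host's output buffer without ever
/// blocking; under-runs are filled with silence. Returns the number of
/// silent samples emitted this cycle.
fn fill_output(rx: &Receiver<Vec<f32>>, pending: &mut VecDeque<f32>, out: &mut [f32]) -> usize {
    // Drain completed frames without blocking the host's audio thread.
    while let Ok(frame) = rx.try_recv() {
        pending.extend(frame);
    }
    let mut silent = 0;
    for s in out.iter_mut() {
        // Under-run: emit silence instead of stalling the caller.
        *s = pending.pop_front().unwrap_or_else(|| {
            silent += 1;
            0.0
        });
    }
    silent
}

fn main() {
    let (tx, rx) = channel::<Vec<f32>>();
    let mut pending = VecDeque::new();

    // Worker has only produced 100 samples so far...
    tx.send(vec![1.0; 100]).unwrap();

    // ...but the host asks for 480; the gap is filled with silence.
    let mut out = vec![0.0f32; 480];
    let silent = fill_output(&rx, &mut pending, &mut out);
    assert_eq!(silent, 380);
    println!("silent samples this cycle: {}", silent);
}
```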

The code should be ready to use; it just needs more testing.

@danielhuang danielhuang marked this pull request as ready for review November 10, 2024 08:17
@Safari77

Does not work with

ffmpeg -i in.wav -af ladspa=file=libdeep_filter_ladspa:plugin=deep_filter_mono:sample_rate=48000:controls="40|-15|35|35|4|0.02" out.wav

It produces only zero-value samples. The Rikorose branch is okay.

2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsifying node #49 "/df_convp/df_convp.4/Relu.low" Max
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsified node #49 "/df_convp/df_convp.4/Relu.low" Max with PulsingWrappingOp
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsifying node #50 "/Add_1" Add
2025-01-25T13:36:08.230Z | DEBUG |  tract_pulse::model | Pulsified node #50 "/Add_1" Add with PulsingWrappingOp
2025-01-25T13:36:08.230Z | INFO |  df::tract | Init DF decoder
2025-01-25T13:36:08.251Z | INFO |  df::tract | Running with model type deepfilternet3 lookahead 0
2025-01-25T13:36:08.252Z | INFO |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Initialized plugin in 384.5ms
[ladspa/src/lib.rs:213:9] &channels = 1
2025-01-25T13:36:08.252Z | INFO |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | activate
[Parsed_ladspa_1 @ 0x78f618019a80] [debug] handles: 1
[auto_aresample_1 @ 0x78f618031c40] [SWR @ 0x78f618031d40] [debug] Using fltp internally between filters
[auto_aresample_1 @ 0x78f618031c40] [verbose] ch:1 chl:mono fmt:fltp r:48000Hz -> ch:1 chl:mono fmt:s16 r:48000Hz
2025-01-25T13:36:08.252Z | DEBUG |  df::tract | Loading model DeepFilterNet3_ll_onnx.tar.gz
[info] Output #0, wav, to 'dfn.wav':
[info]   Metadata:
[info]     ISFT            : Lavf61.9.106
[info]   Stream #0:0, 0, 1/48000: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, mono, s16, 768 kb/s
[info]     Metadata:
[info]       encoder         : Lavc61.31.101 pcm_s16le
[out#0/wav @ 0x587d9d287580] [verbose] Starting thread...
2025-01-25T13:36:08.252Z | WARN |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Processing thread is overloaded! Dropping frame
[ladspa/src/lib.rs:403:17] &e = "Full(..)"
[ladspa/src/lib.rs:404:17] self.hop_size = 480
2025-01-25T13:36:08.252Z | WARN |  deep_filter_ladspa | DF 7ae5ccfbf6d1 | Processing thread is overloaded! Dropping frame
[ladspa/src/lib.rs:403:17] &e = "Full(..)"
[ladspa/src/lib.rs:404:17] self.hop_size = 480

Author

danielhuang commented Feb 17, 2025

The Rikorose branch is okay.

The original implementation was designed for online (real-time) processing; it happens to work with FFmpeg because it only tracks wall-clock time. If processing ran slower than real time (filtering 1 s of audio took longer than 1 s), the delay would grow gradually, silent samples would be inserted into the output stream, and it would eventually crash with "Processing too slow!". It worked on your computer because your hardware is fast enough.

My new implementation is also designed for online processing with programs such as EasyEffects, but it drops samples more aggressively (inserting silent samples in their place), since EasyEffects lags the entire audio stream (all inputs and outputs) if samples are not received on time.

The more correct solution for offline processing (e.g. using FFmpeg on files) would be to ignore timing entirely and instead block the calling thread for each incoming sample as needed. That second approach could also work for real-time processing, but it relies on the host program handling the case where processing falls behind. EasyEffects doesn't do this, hence the first approach.
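The blocking alternative described above can be sketched with the same bounded channel, using a blocking `send` instead of `try_send`: a slow processor then simply slows the feeding side down (backpressure) and no samples are lost. This is a hypothetical illustration of the approach, not code from this PR; frame counts and the sleep are artificial.

```rust
use std::thread;
use std::time::Duration;
use std::sync::mpsc::sync_channel;

/// Feed `n_frames` hops through a deliberately small bounded queue.
/// `send` blocks when the queue is full, so nothing is ever dropped.
/// Returns the total number of samples the worker processed.
fn process_all(n_frames: usize) -> usize {
    let (tx, rx) = sync_channel::<Vec<f32>>(4);

    let worker = thread::spawn(move || {
        let mut total = 0usize;
        while let Ok(frame) = rx.recv() {
            // Simulate processing slower than the producer feeds.
            thread::sleep(Duration::from_millis(1));
            total += frame.len();
        }
        total
    });

    for _ in 0..n_frames {
        // Blocks here when the queue is full: backpressure, not frame drops.
        tx.send(vec![0.0f32; 480]).unwrap();
    }
    drop(tx);
    worker.join().unwrap()
}

fn main() {
    let total = process_all(64);
    // Every sample made it through despite the slow worker.
    assert_eq!(total, 64 * 480);
    println!("processed {} samples, none dropped", total);
}
```

The trade-off is exactly the one noted above: a host like EasyEffects would stall its whole graph on that blocking `send`, which is why the real-time path drops frames instead.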
