Skip to content

Conversation

yger
Copy link
Collaborator

@yger yger commented Jul 17, 2025

Putative attempt to fix #4038, by explicitly exposing job_kwargs to the silence_period function, in order to allow the user to set them explicitly. In addition, noise_levels are now saved explicitly in extra_kwargs. In theory, once a first call to noise_levels has been made, they should be cached and thus no recomputation should be made.

mode="zeros",
noise_levels=None,
seed=None,
job_kwargs=dict(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
job_kwargs=dict(),
**job_kwargs,

Should it be like this?

And then load the _shared_job_kwargs doc?

I'll test this PR now!

@zm711
Copy link
Member

zm711 commented Jul 17, 2025

So it appropriately used the global_job kwargs now during detect_peaks. But it didn't use the cached noise levels during the node pipeline it recalculated them.

Exposing job_kwargs worked fine for silence_periods.

Copy of script I ran and terminal output.

> import spikeinterface.full as si
> from spikeinterface.sortingcomponents.peak_detection import detect_peaks
> rec, _ = si.generate_ground_truth_recording([600], 30000, 64, 30)
> rec = si.common_reference(si.bandpass_filter(rec))
> rec_silenced = si.silence_periods(rec, [[30000,30040],[50000, 50400]], mode='noise', job_kwargs=dict(n_jobs=7))
noise_level (workers: 7 processes): 100%|██████████| 20/20 [00:00<00:00, 50.22it/s]
> si.set_global_job_kwargs(**dict(n_jobs=7))
> peaks = detect_peaks(rec_silenced)
noise_level (workers: 7 processes): 100%|██████████| 20/20 [00:00<00:00, 46.42it/s]
detect peaks using locally_exclusive (workers: 7 processes): 100%|██████████| 600/600 [00:13<00:00, 43.03it/s]

I will say this went much faster than when I was testing and each process would go back and recalculate noise_levels with n_jobs=1.

@yger
Copy link
Collaborator Author

yger commented Jul 18, 2025

@samuelgarcia @zm711 This should also now fix the problem with detect_peaks.

@zm711
Copy link
Member

zm711 commented Jul 18, 2025

At least on my computer it still said it was computing noise_levels.

import spikeinterface.full as si
rec, _ = si.generate_ground_truth_recording([600], 30000, 64, 30)
rec = si.common_reference(si.bandpass_filter(rec))
rec1 = si.silence_periods(rec, [[30000,30040],[50000, 50400]], mode='noise', job_kwargs=dict(n_jobs=7))
from spikeinterface.sortingcomponents.peak_detection import detect_peaks
si.set_global_job_kwargs(**dict(n_jobs=7))
peaks= detect_peaks(rec1)
noise_level (workers: 7 processes): 100%|██████████| 20/20 [00:00<00:00, 46.48it/s]
noise_level (workers: 7 processes): 100%|██████████| 20/20 [00:00<00:00, 46.86it/s]
detect peaks using locally_exclusive (workers: 7 processes): 100%|██████████| 600/600 [00:36<00:00, 16.55it/s]

@zm711
Copy link
Member

zm711 commented Jul 18, 2025

Is this because it is assuming silence_periods has a different noise level now that we removed some "artifacts". Maybe this is the right behavior for this preprocessor?

@alejoe91 alejoe91 added the preprocessing Related to preprocessing module label Sep 1, 2025
Copy link
Member

@zm711 zm711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I tested this it fixed the most immediate problems with the PR. Do we want to ask the user who reported this to test this?

self._kwargs = dict(
recording=recording, list_periods=list_periods, mode=mode, seed=seed, noise_levels=noise_levels
)
self._kwargs.update(noise_levels_kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't play with the _kwargs. Will this update lead to incompatible keys between spikeinterface versions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is a good question, and we should discuss this PR with @samuelgarcia . Indeed, moving the computation of noise levels into the detect_peaks() and not in the node themselves is the only option to control finely the job_kwargs. Not sure we need to save the noise_level_kwargs because the noise_levels are cached per recording I think, and not recomputed during parallel processing. I'll double check with @samuelgarcia

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preprocessing Related to preprocessing module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

parallel processing with silence_periods
3 participants