Added implementation of zapline for power noise removal #1032

Draft
wants to merge 9 commits into
base: main

Conversation

ariguiba

Information about this PR:

Current issues:

  • The algorithm takes too long to run, even on a small dataset
  • Some artifacts are still visible
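
One way to put a number on the "artifacts still visible" point is to measure the remaining amplitude at the line frequency before and after cleaning. The sketch below is illustrative only: `line_amplitude` is a hypothetical helper (not part of this PR or of meegkit), and the 50 Hz line frequency, 1000 Hz sampling rate, and synthetic signal are assumptions.

```python
import cmath
import math


def line_amplitude(x, sfreq, fline):
    """Estimate the sinusoid amplitude at `fline` Hz via a single-bin DFT.

    Hypothetical helper for illustration; the estimate is exact only when
    the signal contains a whole number of cycles of `fline`.
    """
    n = len(x)
    acc = sum(
        sample * cmath.exp(-2j * math.pi * fline * i / sfreq)
        for i, sample in enumerate(x)
    )
    return 2.0 * abs(acc) / n


# Synthetic 1 s recording (assumed values): a 10 Hz rhythm plus 50 Hz line noise.
sfreq = 1000.0
t = [i / sfreq for i in range(1000)]
signal = [
    math.sin(2 * math.pi * 10 * ti) + 0.5 * math.sin(2 * math.pi * 50 * ti)
    for ti in t
]

# Amplitude left at the line frequency; compare this before and after cleaning.
residual = line_amplitude(signal, sfreq, 50.0)
```

Running the same measurement on the raw and the cleaned data would give a rough per-channel indication of how much line noise remains.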


welcome bot commented Dec 17, 2024

Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴🏽‍♂️

@ariguiba
Author

@behinger

@behinger

Thanks Boshra!

  • This already looks good to me. I think zapline is in the conceptually right place (a "replacement" for notch filtering).
  • meegkit as a requirement: someone from the mne-bids-pipeline team has to chime in here for sure. Is that too large? Is it OK? Can it be made optional, or how does the dependency management work?
  • The failing unit tests caused by the deprecated use of numpy.core.numerictype are a problem that still needs to be fixed. Maybe this is something to update upstream in the pyriemann package, can you check? I'm also wondering if we can use meegkit without ASR etc., just the dss.py imports, but I don't know enough about Python.

@larsoner
Member

meegkit as a requirement: someone from the mne-bids-pipeline team has to chime in here for sure. Is that too large? Is it OK? Can it be made optional, or how does the dependency management work?

We could make it optional but really:

$ pip show meegkit
...
Requires: joblib, matplotlib, numpy, pandas, pyriemann, scikit-learn, scipy, statsmodels, tqdm
...

...we already require all of these except statsmodels and pyriemann, so I think it's okay just to add it, assuming it's on PyPI and conda-forge, and it does appear to be in both places.
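
For comparison, if the dependency were made optional instead, a guard along these lines could be used. This is only a sketch: `has_module` and `require_optional` are hypothetical helper names, not existing mne-bids-pipeline functions, and the error wording is an assumption.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level package `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def require_optional(name: str, feature: str) -> None:
    """Raise a readable error when an optional dependency is missing.

    Hypothetical helper: a pipeline step could call this before importing
    the package it needs.
    """
    if not has_module(name):
        raise RuntimeError(
            f"{feature} requires the optional package '{name}'. "
            f"Install it with: pip install {name}"
        )
```

A step that needs the package would call `require_optional("meegkit", "zapline")` first, so users without it get a clear message rather than a bare `ImportError`.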

The failing unit tests caused by the deprecated use of numpy.core.numerictype are a problem that still needs to be fixed. Maybe this is something to update upstream in the pyriemann package, can you check? I'm also wondering if we can use meegkit without ASR etc., just the dss.py imports, but I don't know enough about Python.

Either meegkit could make some of these imports optional, or we can just ignore the dtype issue locally in our tests. It would be okay to add another ignore to mne_bids_pipeline/tests/conftest.py
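
To illustrate what such an ignore does, here is a standard-library sketch, assuming the test suite escalates warnings to errors. The warning message below is a stand-in (the real text emitted by numpy or pyriemann is not shown in this thread), and `emit_deprecation` and `run_with_filters` are hypothetical names.

```python
import warnings

# Stand-in message; the actual warning text is an assumption.
MSG = "numpy.core.numerictypes is deprecated"


def emit_deprecation():
    """Mimic a third-party import that triggers a DeprecationWarning."""
    warnings.warn(MSG, DeprecationWarning, stacklevel=2)


def run_with_filters(ignore: bool) -> str:
    """Return "error" if the warning is escalated, "ok" if it is ignored."""
    with warnings.catch_warnings():
        # Escalate every warning to an exception, as strict test setups do.
        warnings.simplefilter("error")
        if ignore:
            # Filters added later are checked first, so this one wins.
            # `message` is a regex matched against the start of the text.
            warnings.filterwarnings(
                "ignore",
                message=r"numpy\.core\.numerictypes",
                category=DeprecationWarning,
            )
        try:
            emit_deprecation()
        except DeprecationWarning:
            return "error"
        return "ok"
```

Without the extra filter the warning fails the run; with it, only that specific message is silenced while all other warnings still raise.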

@hoechenberger
Member

I'm okay with depending on meegkit. If it ever starts to cause trouble, we can simply drop the functionality again -- it's not a "core" functionality we critically depend on.

@agramfort WDYT?

@behinger

Is there an update on this? How should we move this forward?

@larsoner
Member

@ariguiba do you still want to work on this? If so I'm happy to do a quick review, looks like it might be a few small tweaks then we could get it in!

@behinger if there is no response for a little bit (maybe a week?) then you could take over if you want

@ariguiba
Author

I'm done with my part and honestly don't know what more to tweak. I think a decision needs to be made about the following:
As I understand it, the errors occur because the MEEGKit code we're using relies on a deprecated or otherwise problematic numpy method.
Also, in my opinion, using the dss_line method may not be the best idea, because it seems to be very slow even on a not-so-big dataset.
So I think the best choice would be to take the source code and adapt it to our use case: (1) to remove the problematic numpy method and (2) maybe to make it faster when integrated into our pipeline.
But I don't know if it's possible to just reuse the code. What do you think?
Or how would you move forward with this? Are there some small tweaks I can still do?

@larsoner
Member

So I think the best choice would be to take the source code and adapt it to our use case: (1) to remove the problematic numpy method and (2) maybe to make it faster when integrated into our pipeline.

I think it would be better to improve meegkit directly if possible -- have you raised the issue over there yet? Better to improve the upstream package rather than start maintaining a parallel implementation

In the meantime I can hopefully push some commits next week to make CIs happy

@larsoner
Member

... actually meegkit 0.1.9 landed three days ago, I'll restart CIs to see if it's fixed already

@larsoner
Member

Looks like it ran out of memory. I'll try an 8 GB machine, but if that dies too, then I think the implementation will need to be improved before this can proceed.
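
As a rough back-of-the-envelope aid for the memory question, the footprint of dense float64 data can be estimated from its shape. This is a sketch under assumptions: `estimate_gib` is a hypothetical helper, the recording dimensions are made up, and how many working copies the algorithm holds at once is not known from this thread.

```python
def estimate_gib(
    n_channels: int,
    n_samples: int,
    n_copies: int = 1,
    bytes_per_value: int = 8,  # float64
) -> float:
    """Estimate the memory needed for dense sample data, in GiB."""
    total_bytes = n_channels * n_samples * bytes_per_value * n_copies
    return total_bytes / 1024**3


# Hypothetical recording: 64 channels, one hour at 1000 Hz.
one_copy = estimate_gib(n_channels=64, n_samples=3_600_000)
# Assumed: the data, the cleaned output, and one intermediate buffer.
three_copies = estimate_gib(n_channels=64, n_samples=3_600_000, n_copies=3)
```

Comparing such an estimate against the CI machine's RAM would show whether a few simultaneous copies of the data alone could explain the failure, or whether the implementation allocates far more than that.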

4 participants