Download Repository does not apply terms to anonymize, unlike the web interface #342

Open
eringrant opened this issue Oct 16, 2024 · 4 comments

@eringrant

eringrant commented Oct 16, 2024

Thanks for creating this tool and supporting open science!

When I download my repository with the "Download Repository" button, some instances of the terms listed under "terms to anonymize" are not anonymized in the downloaded copy. Which instances are missed appears to be somewhat random.

@aric-fowler

I have this issue as well. I was hoping there would be a way to turn off downloads entirely for a repo in the options.

@tdurieux
Owner

I don't see how this is possible, because the download uses the same pipeline as the web interface. Could you provide a repo for me to try?

@aric-fowler

The repo I'd hand you is currently under double-blind review for a publication, so I had to go through it and scrub sensitive terms manually. When the review process is finished, I will try to re-create the issue and provide you with the repo, but no promises: I do not know whether the error will occur again.

@eringrant
Author

eringrant commented Oct 22, 2024

Here's a repro: I anonymized eringrant/test-anonymous-github at https://anonymous.4open.science/r/test-anonymous-github-B58A. The anonymized terms are:

mercury   # XXXX-1
venus     # XXXX-2
mars      # XXXX-3
jupiter   # XXXX-4
saturn    # XXXX-5
uranus    # XXXX-6
neptune   # XXXX-7
pluto     # XXXX-8
haumea    # XXXX-9
makemake  # XXXX-10
eris      # XXXX-11

Click "Download Repository", unzip the archive, and search for non-anonymized terms:

$ grep "mercury\|venus\|mars\|jupiter\|saturn\|uranus\|neptune\|pluto\|haumea\|makemake\|eris" src/*
src/file30.txt:mercury
src/file4.txt:mars
src/file41.txt:saturn
src/file43.txt:mars
src/file50.txt:makemake

These terms are anonymized in the web interface. Downloading the repo again and looking for non-anonymized terms gives a different set of instances:

$ grep "mercury\|venus\|mars\|jupiter\|saturn\|uranus\|neptune\|pluto\|haumea\|makemake\|eris" src/*
src/file11.txt:mercury
src/file13.txt:pluto
src/file19.txt:makemake
src/file24.txt:neptune
src/file26.txt:mercury
src/file28.txt:pluto
src/file31.txt:mercury
src/file49.txt:makemake
src/file58.txt:makemake
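
For anyone trying to reproduce this across several downloads, the grep above could be wrapped in a small helper along these lines. This is only a sketch: `find_leaks` and the `terms.txt` file (one term per line, no blank lines) are hypothetical names of mine, not part of Anonymous GitHub.

```shell
# Hypothetical helper: print every line under <directory> that still contains
# one of the terms that should have been anonymized.
# Usage: find_leaks <terms-file> <directory>
# Assumes <terms-file> holds one term per line and no blank lines
# (a blank line would make grep -F match every line).
find_leaks() {
  local terms_file=$1 dir=$2
  # -r recurse, -n show line numbers, -i ignore case,
  # -F treat terms as fixed strings, -f read the terms from a file.
  # grep exits non-zero when nothing matches; "|| true" keeps that from
  # aborting callers that run with "set -e" or "pipefail".
  # The output is sorted so results from two downloads are easy to compare.
  { grep -rniF -f "$terms_file" "$dir" || true; } | sort
}

# Example (paths are placeholders):
#   find_leaks terms.txt download-1/src > leaks-1.txt
#   find_leaks terms.txt download-2/src > leaks-2.txt
```

Running it on each unzipped download and comparing the two result files should show whether the set of leaked instances changes between downloads, as it did in the two runs above.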


3 participants