SyntaxWarning for invalid escape sequences in Python 3.12 #1457
Comments
Thanks a lot for the report! This comes from a copy of joblib that we imported once, a long time ago. Surprisingly, it has not been fixed in their repository so far; otherwise, we could have just updated it. A pull request is most welcome and appreciated! A quick search in the main folders with Python code (ext/src/python_libs, src/projects/spades/pipeline) didn't show me any suspicious places, but a second look would be great. Also, joblib is only used for running gzip in parallel, which, I believe, can be done using built-in Python methods, so maybe we don't even need joblib. Best
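As a rough sketch of what "running gzip in parallel with built-in Python methods" could look like, the following uses only the standard library (`gzip`, `shutil`, `concurrent.futures`). The function names `gzip_file` and `gzip_files` are hypothetical and not taken from the SPAdes pipeline, and the sketch assumes the goal is simply to write a `.gz` copy next to each input file.

```python
import gzip
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def gzip_file(path):
    """Compress one file to <path>.gz and return the new path (hypothetical helper)."""
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    # Stream the data in chunks so large files are never held in memory at once.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def gzip_files(paths, max_workers=4):
    """Compress several files concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(gzip_file, paths))
```

Threads are used here on the assumption that the compression work is done in C code that can run alongside other threads; if that does not give enough speed-up, `ProcessPoolExecutor` from the same module accepts the same `map` call.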
Using the following two regexes
on all Python scripts, I found several more places with problematic escape sequences and have tried to fix them. The regexes also match doubled backslashes (which are valid), but these are rare enough not to flood the output. In most cases, the issues were indeed related to the … Besides, I noticed several other issues that may be worth some attention, viz. Python 2.x code (which is no longer supported and is a security risk), like
The last thing I noticed was shebang lines placed after the copyright header, as in test.py. Since the kernel only interprets shebang lines when the …
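An alternative to a regex-based search, sketched here under stated assumptions, is to let the Python compiler report the offending literals itself: compile each file with warnings recorded and keep those that mention an invalid escape sequence. This is a different technique from the two regexes used above. The names `invalid_escape_warnings` and `scan_tree` are hypothetical, and the substring match on the warning message is an assumption about CPython's wording.

```python
import warnings
from pathlib import Path


def invalid_escape_warnings(source, filename="<string>"):
    """Compile source text and return sorted (line, message) pairs for invalid escapes."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, whether DeprecationWarning or SyntaxWarning.
        warnings.simplefilter("always")
        compile(source, filename, "exec")
    hits = {
        (w.lineno, str(w.message))
        for w in caught
        # Assumption: the interpreter's message contains this phrase.
        if "invalid escape sequence" in str(w.message)
    }
    return sorted(hits)


def scan_tree(root):
    """Yield (path, hits) per *.py file; hits is None if the file does not compile."""
    for path in sorted(Path(root).rglob("*.py")):
        try:
            source = path.read_text(encoding="utf-8")
            hits = invalid_escape_warnings(source, str(path))
        except (SyntaxError, UnicodeDecodeError, ValueError):
            # For example Python 2-only syntax; such files need a manual look.
            yield path, None
            continue
        if hits:
            yield path, hits
```

Files that only contain Python 2 syntax cannot be compiled by Python 3, so the sketch reports them with `None` instead of silently skipping them.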
@Simon-Brandt You can limit yourself to the code in https://github.com/ablab/spades/tree/main/src/projects/spades/pipeline and in https://github.com/ablab/spades/tree/main/ext/src/python_libs. Everything else is auxiliary scripts that are not used or run in the pipeline.
Description of bug
In Python >= 3.6, invalid escape sequences for Unicode strings emit a DeprecationWarning, changed to a SyntaxWarning in Python 3.12, to finally become a SyntaxError in a future Python version. SPAdes uses several of these invalid escape sequences across the Python scripts, most notably (or only?) in regular expressions. I obtained two of these warnings for a metaSPAdes run, in:
spades/ext/src/python_libs/joblib3/func_inspect.py
Lines 50 to 52 in 80f282e
and in:
spades/ext/src/python_libs/joblib3/_memory_helpers.py
Line 10 in 80f282e
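For illustration only, here is how a literal of this kind is typically repaired. The pattern is a hypothetical stand-in and not the joblib source at the referenced lines. The sketch shows that a raw string and a doubled backslash spell the same pattern, and that escaping `<` and `>` makes no difference to Python's `re` module.

```python
import re

# Hypothetical pattern, not the joblib code: digits inside angle brackets.
doubled = "<(\\d+)>"  # backslash doubled for Python's string parser
raw = r"<(\d+)>"      # raw string, backslash passed through unchanged

# Both literals produce the same seven-character pattern.
assert doubled == raw

# '<' and '>' are ordinary characters in Python's regex syntax, so
# escaping them is optional and does not change what is matched.
escaped = r"\<(\d+)\>"
text = "value <42> here"

print(re.search(raw, text).group(1))      # prints 42
print(re.search(escaped, text).group(1))  # prints 42
```

The raw-string form is usually the more readable of the two, since the pattern appears exactly as the regex engine sees it.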
To my knowledge, < and > have never carried a special meaning in Python's regex flavor and thus never needed to be escaped, whereas the character classes \s and \d do need the backslash. As the SyntaxWarning will eventually become a SyntaxError, SPAdes will break in the future. Since the same error has already been reported in #1320 and in #1326, but was only selectively fixed in 60ad35e, it may be preferable to go through the code base, find all strings with escape sequences, and fix them, either by doubling the backslashes for Python's parser or, preferably, by marking them as raw strings. If you want, I could try fixing this myself via a pull request.
spades.log
Since my dataset contains sensitive information (including the names of file paths), I cannot upload the spades.log and am only able to provide the following snippets. Since, however, the error should be clear, I hope more data isn't needed.
params.txt
For the params.txt, I replaced the sensitive paths with /path/to.
SPAdes version
SPAdes 4.1.0
Operating System
Linux-5.4.0-208-generic-x86_64-with-glibc2.39
Python Version
Python 3.12.3
Method of SPAdes installation
Manual compilation, virtualized as Docker container, run as Singularity image
No errors reported in spades.log