escaped forward slashes? #20

Closed
davidszotten opened this issue Sep 14, 2019 · 5 comments

@davidszotten

Hi,

When trying to use rure on https://github.com/ua-parser/uap-core, I came across this behaviour where rure isn't consistent with re. I'm not sure who's "right".

$ python
Python 3.7.3 (default, May  6 2019, 08:10:06)
[Clang 9.0.0 (clang-900.0.38)] on darwin
>>> import re, rure
>>> re.match('\/', '/')
<re.Match object; span=(0, 1), match='/'>
>>> rure.match('\/', '/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
[snip]
rure.exceptions.RegexSyntaxError: regex parse error:
    \/
    ^^
error: unrecognized escape sequence
@BurntSushi

This isn't a behavior of the binding. This is a behavior of the underlying regex library. The regex library does not let you escape characters that don't need to be escaped. Forward slashes are not special characters in the regex language.
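
As a small illustration of the "not special" point, here is a sketch that uses only Python's standard `re` module (rure is not needed to run it). It shows that the escaped and unescaped slash match identically in `re`, whereas escaping a real metacharacter such as `.` changes what matches. The variable names are just for illustration.

```python
import re

# '/' has no special meaning in regex syntax, so Python's re treats
# the escaped and unescaped forms as the same pattern.
escaped = re.match(r"\/", "/")
plain = re.match("/", "/")
print(escaped.span(), plain.span())  # (0, 1) (0, 1)

# By contrast, '.' is a metacharacter: escaping it changes the meaning.
any_char = re.match("a.c", "abc")       # '.' matches any character
literal_dot = re.match(r"a\.c", "abc")  # '\.' matches only a literal dot
print(any_char is not None, literal_dot is not None)  # True False
```

So dropping the backslash before `/` does not change what a pattern matches; it only matters to parsers that reject unnecessary escapes.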

More generally, there is no intent for rure to be drop in compatible with any other regex engine. There are a number of differences between them beyond escaping.

@davidszotten
Author

Thanks for the quick reply. So I guess the Python implementation is just more lenient, and uap escapes slashes in case they're used by e.g. a JavaScript engine?

@BurntSushi

I'm on vacation, so I don't know off hand with certainty, but I don't think / is a special character in JavaScript regexes. It is special in regex literals of course, but I don't think it is when using new RegExp.

So I don't know why they escape them. You'd have to ask them. They probably needlessly escape other things too.

@davidblewett
Owner

> More generally, there is no intent for rure to be drop in compatible with any other regex engine. There are a number of differences between them beyond escaping.

The Python wrapper does make some attempt to be as drop-in as possible. This package provides a compatibility layer that translates the Python standard library method signatures and flags to those used by regex. However, it does not try to work around more fundamental differences, such as this escaping issue or arbitrary lookaround expressions.

This package will always be more strict than the standard library when parsing input expressions.
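
For anyone hitting this with third-party patterns such as the uap-core ones, one possible caller-side workaround is to strip the unneeded escapes before handing the pattern to the stricter engine. The helper below, `unescape_slashes`, is hypothetical and not part of rure or the standard library. It is only a sketch, assuming `\/` is the only problematic escape in your patterns. It walks escape pairs left to right so that an escaped backslash followed by a slash (`\\/`) is left alone.

```python
import re

# Matches a backslash plus the character it escapes, so escape pairs are
# consumed in order and '\\/' is read as (escaped backslash) + '/'.
_ESCAPE_PAIR = re.compile(r"\\(.)", re.DOTALL)


def unescape_slashes(pattern: str) -> str:
    """Hypothetical helper: rewrite '\\/' to '/' and keep other escapes."""

    def fix(match):
        # Only the escaped forward slash is rewritten; every other
        # escape pair is returned unchanged.
        return "/" if match.group(1) == "/" else match.group(0)

    return _ESCAPE_PAIR.sub(fix, pattern)


cleaned = unescape_slashes(r"Mozilla\/(\d+)\.(\d+)")
print(cleaned)
```

The cleaned pattern still matches the same strings under `re`, so the result can be checked against the standard library before it is passed on. Whether the stricter engine then accepts the whole pattern depends on what else the pattern contains.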

davidblewett pinned this issue Sep 16, 2019
davidblewett added a commit that referenced this issue Sep 16, 2019
Include note about the engine rejecting some expressions the standard library will accept (see #20).
@davidblewett
Owner

I left a note about this in the readme (ea26d71).

@davidszotten if you would like further explanation in the readme, let me know.
