Add support for modes such as (?i) #35

Open
exyzzy opened this issue Jan 15, 2020 · 5 comments

Comments

exyzzy commented Jan 15, 2020

Trying in lexer.go to parse:
lexer.Add([]byte(`(?i)(varchar\([0-9]+\))`), token("VARCHARID"))

results: (debug=true)
...
01/14 20:10:48 enter alternation 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter atomicOps 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter atomicOp 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter atomic 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter char 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter CHAR 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit CHAR 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter charRange 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit char 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 char Regex parse error in production 'charClass' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : expected '[' at 0 got '(' of '(?i)(varchar([0-9]+))'
Regex parse error in production 'CHAR' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : unexpected operator, (
Regex parse error in production 'char' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : Expected a CHAR or charRange at 0, (?i)(varchar([0-9]+))
2020/01/14 20:10:48 enter group 0 '(?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter alternation 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter atomicOps 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter atomicOp 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter atomic 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter char 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter CHAR 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit CHAR 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter charRange 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit char 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 char Regex parse error in production 'charClass' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected '[' at 1 got '?' of '?i)(varchar([0-9]+))'
Regex parse error in production 'CHAR' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : unexpected operator, ?
Regex parse error in production 'char' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : Expected a CHAR or charRange at 1, (?i)(varchar([0-9]+))
2020/01/14 20:10:48 enter group 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit group 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 group Regex parse error in production 'group' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected '(' at 1 got '?' of '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit atomic 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 atomic Regex parse error in production 'group' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected '(' at 1 got '?' of '?i)(varchar([0-9]+))'
Regex parse error in production 'charClass' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected '[' at 1 got '?' of '?i)(varchar([0-9]+))'
Regex parse error in production 'CHAR' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : unexpected operator, ?
Regex parse error in production 'char' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : Expected a CHAR or charRange at 1, (?i)(varchar([0-9]+))
Regex parse error in production 'atomic' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : Expected group or char
2020/01/14 20:10:48 exit atomicOp 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit atomicOps 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter alternation_ 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit alternation_ 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit alternation 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit group 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 group Regex parse error in production 'group' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected ')' at 1 got '?' of '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit atomic 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 atomic Regex parse error in production 'group' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected ')' at 1 got '?' of '?i)(varchar([0-9]+))'
Regex parse error in production 'charClass' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : expected '[' at 0 got '(' of '(?i)(varchar([0-9]+))'
Regex parse error in production 'CHAR' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : unexpected operator, (
Regex parse error in production 'char' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : Expected a CHAR or charRange at 0, (?i)(varchar([0-9]+))
Regex parse error in production 'atomic' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : Expected group or char
2020/01/14 20:10:48 exit atomicOp 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit atomicOps 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 enter alternation_ 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit alternation_ 1 '?i)(varchar([0-9]+))'
2020/01/14 20:10:48 exit alternation 1 '?i)(varchar([0-9]+))'
panic: Regex parse error in production 'group' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected ')' at 1 got '?' of '?i)(varchar([0-9]+))'
Regex parse error in production 'charClass' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : expected '[' at 0 got '(' of '(?i)(varchar([0-9]+))'
Regex parse error in production 'CHAR' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : unexpected operator, (
Regex parse error in production 'char' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : Expected a CHAR or charRange at 0, (?i)(varchar([0-9]+))
Regex parse error in production 'atomic' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : Expected group or char
Regex parse error in production 'group' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected '(' at 1 got '?' of '?i)(varchar([0-9]+))'
Regex parse error in production 'charClass' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : expected '[' at 1 got '?' of '?i)(varchar([0-9]+))'
Regex parse error in production 'CHAR' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : unexpected operator, ?
Regex parse error in production 'char' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : Expected a CHAR or charRange at 1, (?i)(varchar([0-9]+))
Regex parse error in production 'atomic' : at index 1 line 0 column 2 '?i)(varchar([0-9]+))' : Expected group or char
Regex parse error in production 'Parse' : at index 0 line 0 column 1 '(?i)(varchar([0-9]+))' : unconsumed input

timtadh (Owner) commented Jan 15, 2020

Hi @exyzzy,

Lexmachine does not support every variation of regular expression syntax. I will tag this as a feature request to add support for modes in the parser. For reference, I document the portions of the regular expression syntax that lexmachine supports here: https://github.com/timtadh/lexmachine#regular-expressions

Note that just because lexmachine doesn't support (?i) doesn't mean you can't achieve case insensitivity. For example, your expression could be rewritten as:

([Vv][Aa][Rr][Cc][Hh][Aa][Rr]\([0-9]+\))

I recognize this is more work.
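
One way to reduce that work is to generate the character classes instead of typing them by hand. The following is only a sketch: `caseInsensitive` is a hypothetical helper, not part of lexmachine, and it assumes the keyword consists of plain letters and digits (it does no escaping of regex operators).

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// caseInsensitive is a hypothetical helper (not a lexmachine function).
// It expands every cased letter of keyword into a two-character class,
// e.g. 'v' becomes "[Vv]". Other runes are copied through unchanged,
// so the keyword must not contain regex operators.
func caseInsensitive(keyword string) string {
	var b strings.Builder
	for _, r := range keyword {
		upper, lower := unicode.ToUpper(r), unicode.ToLower(r)
		if upper != lower {
			b.WriteRune('[')
			b.WriteRune(upper)
			b.WriteRune(lower)
			b.WriteRune(']')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	// Build the rewritten pattern from the keyword plus the literal tail.
	pattern := "(" + caseInsensitive("varchar") + `\([0-9]+\)` + ")"
	fmt.Println(pattern)
}
```

The resulting string can then be passed as the `[]byte` pattern argument to `lexer.Add`, in the same way as the pattern in the original report.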

timtadh changed the title from "Does not handle case insensitive?" to "Add support for modes such as (?i)" on Jan 15, 2020

timtadh (Owner) commented Jan 15, 2020

The modes the Go regexp parser supports are documented here: https://golang.org/pkg/regexp/syntax/
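
For comparison, here is a small sketch of how the `(?i)` flag behaves in Go's standard `regexp` package, which is the behavior this feature request asks for. It uses only the standard library and says nothing about how lexmachine would implement modes; `matchesVarchar` is an illustrative name.

```go
package main

import (
	"fmt"
	"regexp"
)

// varcharRE uses the (?i) flag from Go's regexp/syntax, so letters in the
// pattern match regardless of case.
var varcharRE = regexp.MustCompile(`(?i)(varchar\([0-9]+\))`)

// matchesVarchar reports whether s contains a match for varcharRE.
func matchesVarchar(s string) bool {
	return varcharRE.MatchString(s)
}

func main() {
	fmt.Println(matchesVarchar("VarChar(255)")) // prints true
	fmt.Println(matchesVarchar("varchar()"))    // prints false
}
```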

exyzzy (Author) commented Jan 15, 2020

Hi Tim, Thanks. Yes, I also came up with the workaround you mention above and it does do the trick, just a little clumsier. So I am not blocked. Thanks for making LexMachine, it's really cool.

rolyagca commented Jun 5, 2020

Hey Tim! I was trying to use a regex like "^-?[0-9]+$", but I think "$" is not supported. What could I do instead?

timtadh (Owner) commented Jun 5, 2020

@rolyagca correct, $ is not supported (nor is ^). If you would like to see them supported, please open a separate bug.
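
If the goal is to check that a whole string is an integer, rather than to tokenize a stream, one possible alternative is to do that check outside lexmachine with Go's standard `regexp` package, which does support `^` and `$`. This is a sketch under that assumption; `isInteger` is an illustrative name and not part of lexmachine.

```go
package main

import (
	"fmt"
	"regexp"
)

// integerRE is anchored at both ends, so it only matches when the entire
// input is an optional minus sign followed by one or more digits.
var integerRE = regexp.MustCompile(`^-?[0-9]+$`)

// isInteger reports whether s as a whole matches integerRE.
func isInteger(s string) bool {
	return integerRE.MatchString(s)
}

func main() {
	fmt.Println(isInteger("-42")) // prints true
	fmt.Println(isInteger("4-2")) // prints false
}
```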
