-
-
Notifications
You must be signed in to change notification settings - Fork 351
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add re.findall to pick out re matches #805
Conversation
Draft because it needs documentation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I don't mind a re.findall()
, but people might be tempted to use it to "parse" HTML (which is not possible using regex) when the CSS or XPath filters are more suited for the job.
If we make it clear in the docs when it isn't and when it is the right tool for the job, I'm happy to merge this :)
Yeah, I was thinking that but was too lazy to think of a good example :) I've borrowed something from another test, and added some doc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments, rest looks good. Thanks!
docs/source/filters.rst
Outdated
XPATH expressions. HTML and XML cannot be parsed properly using regular | ||
expressions. If the CSS selector or XPATH cannot provide the targeted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could add a link to https://stackoverflow.com/a/1732454/1047040 on "cannot be parsed" for extra entertainment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a classic :)
Actually using re.finditer so we can apply a repl to the result. This allows users to pick out matches and reformat them in one step. Fixes thp#804 Signed-off-by: James Hewitt <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Actually using re.finditer so we can apply a repl to the result. This allows users to pick out matches and reformat them in one step.
Fixes #804