Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex capture groups silently require whitespace #397

Open
ohmann opened this issue Sep 15, 2015 · 2 comments
Open

Regex capture groups silently require whitespace #397

ohmann opened this issue Sep 15, 2015 · 2 comments

Comments

@ohmann
Copy link
Contributor

ohmann commented Sep 15, 2015

Capture groups within the regex(es) used to parse log lines silently require whitespace to precede them. This is undocumented and is only detectable when using the --debugParse flag. In other words, it is likely to confuse users.

Example command:
synoptic.sh -r "(?<ITIME>),(?<ip>),(?<TYPE>)" [... other args ...]

Example debug parse snippet:

INFO: input: (?<ITIME>),(?<ip>),(?<TYPE>) 
INFO: processed: (?:\s*(?<ITIME>\S+)\s*),(?:\s+(?<ip>\S+)\s*),(?:\s+(?<TYPE>\S+)\s*) 
INFO: standard: (?:\s*(\S+)\s*),(?:\s+(\S+)\s*),(?:\s+(\S+)\s*)

Note that the capture group (?<ip>) is transformed into (?:\s+(?<ip>\S+)\s*), i.e., the capture group won't match a log line unless it is preceded by 1+ whitespace characters. This is somewhat deceptive, since the regex the user passed does not reference whitespace at all. The user might think that a line like 123,4.4.4.4,event would be matched, but in fact it will not be.

This only affects default behavior. Manually specifying capture group formatting, e.g., (?<ip>\S+), works as expected and does not silently require whitespace before it.

@bestchai
Copy link
Member

This is an important usability bug to fix. However, note that you'll have to fix existing args.txt files in the repository, as well. I think this is worthwhile, but the hardest part is to make sure that everything continues to work as before, at least for the examples we distribute. It might also be prudent to document this change on the main synoptic homepage, or in some form of release notes?

@ohmann
Copy link
Contributor Author

ohmann commented Sep 17, 2015

One option I considered is to deprecate the current -r argument (and state
so) and to add a new argument (e.g., -R) that becomes the preferred one and
does not mess with whitespace. This might seem messy, however, so updating
everything is perhaps the better option. Adding to the usage message a
warning that the behavior has changed might be useful if any current
Synoptic users currently rely on the old behavior, sometimes update
Synoptic, but don't regularly check the online documentation.

On Thu, Sep 17, 2015 at 4:41 PM Ivan Beschastnikh [email protected]
wrote:

This is an important usability bug to fix. However, note that you'll have
to fix existing args.txt files in the repository, as well. I think this is
worthwhile, but the hardest part is to make sure that everything continues
to work as before, at least for the examples we distribute. It might also
be prudent to document this change on the main synoptic homepage, or in
some form of release notes?


Reply to this email directly or view it on GitHub
#397 (comment)
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants