Add log-parsing and standardize JSON output #24
Merged
This PR adds two things:
1. Event log parsing and searching, so that only matching events are returned instead of the entire file. A new --log_type argument selects the log type; for now it supports only the value "cloudtrail", with commented stubs for other log formats. --log_type simply sets predefined values for two other arguments, --log_format ("json") and --log_properties (["Records"]), and both of these can also be set directly. Users can therefore define any log format and property list on the fly: cloudgrep walks down the listed properties to retrieve the list of log entries that the search is applied against, with no additional code required.
2. A more formal output format. All cloudgrep results are now JSON objects (json.dump of dict objects containing the matched lines), and no non-result strings are printed. All other output now goes through logging.info or logging.warning, so calling code can parse cloudgrep results on stdout programmatically without informational print lines polluting the pipeline.
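For reviewers who want a concrete picture, here is a minimal sketch of both ideas, not the code in this PR. It assumes a JSON log whose entries live under the keys named in --log_properties, and it assumes results go to stdout while logging goes to stderr (the default for Python's logging module). The names LOG_TYPE_PRESETS, extract_log_entries, search_entries, emit_result, and the "line" key are hypothetical and chosen for illustration only; only the "cloudtrail" values ("json", ["Records"]) come from the description above.

```python
import json
import logging
import re
import sys
from typing import Any, Iterator, List, TextIO

# Hypothetical preset table: --log_type just fills in the two other arguments.
LOG_TYPE_PRESETS = {
    "cloudtrail": {"log_format": "json", "log_properties": ["Records"]},
}


def extract_log_entries(raw: str, log_format: str, log_properties: List[str]) -> List[Any]:
    """Walk down log_properties and return the list of individual log entries."""
    if log_format != "json":
        raise ValueError(f"unsupported log format in this sketch: {log_format}")
    data = json.loads(raw)
    for prop in log_properties:
        data = data[prop]  # descend one level per property name
    # Tolerate a single object where a list of entries was expected.
    return data if isinstance(data, list) else [data]


def search_entries(entries: List[Any], pattern: str) -> Iterator[dict]:
    """Yield one result dict per entry whose serialized form matches pattern."""
    regex = re.compile(pattern)
    for entry in entries:
        line = json.dumps(entry)
        if regex.search(line):
            yield {"line": line}  # hypothetical result shape


def emit_result(result: dict, stream: TextIO = sys.stdout) -> None:
    """Write exactly one JSON object per line, and nothing else, to the stream."""
    json.dump(result, stream)
    stream.write("\n")


if __name__ == "__main__":
    # basicConfig attaches a handler that writes to stderr, so stdout stays clean.
    logging.basicConfig(level=logging.INFO)
    sample = json.dumps(
        {"Records": [{"eventName": "ConsoleLogin"}, {"eventName": "GetObject"}]}
    )
    preset = LOG_TYPE_PRESETS["cloudtrail"]
    entries = extract_log_entries(sample, preset["log_format"], preset["log_properties"])
    logging.info("Searching %d log entries", len(entries))
    for hit in search_entries(entries, "ConsoleLogin"):
        emit_result(hit)
```

With this separation, a caller can read stdout line by line and pass each line to json.loads, while informational messages remain visible on stderr.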