Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: ECS compatibility (added headers_target and attachments_target) #55

Merged
merged 22 commits into from
Dec 6, 2021

Conversation

kares
Copy link
Contributor

@kares kares commented Nov 10, 2021

When ECS compatibility is disabled, mail headers and attachments are targeted at the root level.
When targeting an ECS versionn, headers and attachments target @metadata sub-fields unless configured otherwise in order
to avoid conflict with ECS fields.

This is achieved by having 2 new options:

  1. headers_target (for header names e.g. "From", "Subject", "X-Spam-Score")
  2. attachments_target

NOTE: headers are lower-cased by default, this settings stays the same in ECS mode.
Further, decided NOT to tr('-', '_') as an attempt to under_score names e.g. "X-Spam-Score" -> "x_spam_score", this would likely need another configuration option to maintain backwards compatibility (things could get a bit out of hand with multiple things changing due ECS mode).


There's an RFC for providing email.* fields in ECS, this isn't part of the PR, however there's a draft waiting for the RFC to become stable or at least the stage 2 RFC to get pushed (e.g. as these changes were developed the RFC changed the mapping in a way that would have created ECS conflicts).


Also includes some minor fixes that came along the way, notably the Date header is no longer skipped as that only worked when lowercase_headers => false was specified.

@kares kares changed the title Feat: ECS compatibility (using a headers_target) Feat: ECS compatibility (added headers_target and attachments_target) Nov 10, 2021
@kares kares marked this pull request as ready for review November 10, 2021 15:47
Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In gneral, this looks to be on-track. I've left a nitpick about whether pre-normalizing the target is strictly necessary, and another about helpful logging at startup being reduced slightly.

next if name == "Date"

case (field = event.get(name))
targeted_name = "#{@headers_target}[#{name}]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example: pre-normalizing the @headers_target isn't strictly necessary, as the syntax allows for Composite References

Suggested change
targeted_name = "#{@headers_target}[#{name}]"
targeted_name = "[#{@headers_target}][#{name}]"

Copy link
Contributor Author

@kares kares Nov 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, would still rather keep the 'canonical' form here - subjectively find it less confusing than the composite.
we end up using what the user set with the exception of the bare foo form being wrapped ([foo]), the normalization code is similar to what we have in the field reference validator.

it's also clearer this way to handle the case when headers_target is nil -> [name]

Copy link
Contributor Author

@kares kares Nov 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this brings up a good question though, when user explicitly sets a headers_target => ''
think we should treat it as a way to disabled the target -> user does not want headers (already present in #56 but I think it makes sense to ship here) instead of treating it as if headers_target was not set and setting stuff at top level.

or does this sound confusing?
(would give us a way for users to opt-out of attachments_target once we ship email.* field support in the future)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that setting headers_target => '' should read as an explicit instruction to not place the headers anywhere, and distinctly different than not supplying a headers_target instruction. We will need to call this out in the docs, and will need to add specs to ensure that it passes validation.

When the email RFC gets finalized, we will need a separate way to opt into the RFC format, and when this future mode is active we can also inject better defaults for related options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that setting headers_target => '' should read as an explicit instruction to not place the headers anywhere, and distinctly different than not supplying a headers_target instruction. We will need to call this out in the docs, and will need to add specs to ensure that it passes validation.

✔️ (pending 6.x compatibility headers_target => '' logstash-plugins/logstash-mixin-validator_support#2)

there's one extra change since last review - not skipping the date header (despite the header being duplicate information as it's also used for event.timestamp = LogStash::Timestamp.new(mail.date.to_time)) - initially considered it a bug but the date header was present with the plugins defaults due:

  • lowercase_headers being on by default
  • the plugin was checking the header name to == "Date" after the lower-case operation

so while it looks like a bug the "fix" (to properly skip the header regardless or lowercase_headers setting) might break pipelines thus an alternate resolution to never skip the date headers seems more appropriate.

@@ -221,7 +266,6 @@ def parse_mail(mail)

def stop
Stud.stop!(@run_thread)
$stdin.close
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😬 whoops. Good catch.

@kares
Copy link
Contributor Author

kares commented Nov 30, 2021

🔴 CI due incompatibility with header_target => '' under LS 6.x (logstash-plugins/logstash-mixin-validator_support#2)

@kares kares requested a review from yaauie November 30, 2021 10:28
@yaauie
Copy link
Contributor

yaauie commented Dec 2, 2021

LGTM pending resolution of logstash-plugins/logstash-mixin-validator_support#2

@kares kares merged commit 2de0776 into logstash-plugins:main Dec 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants