Skip to content

enh: collector_mail_attach Decrypt GPG attachments #2623

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

e3rd
Copy link
Member

@e3rd e3rd commented Jul 15, 2025

Our partner sends us a GPG-encrypted attachment. This PR solves this.

Before merging, let's discuss. Should it be part of a collector bot, or should it be rather a GPG parser?

Current pipeline (current PR)

graph LR;
a["mail collector attach"]--> b["generic csv parser"];
Loading

Another possibility

graph LR;
a["mail collector attach"]--> c["gpg parser"] --> b["generic csv parser"];
Loading

Another possibility

graph LR;
a["mail collector attach"]--> b["generic csv parser allowing GPG decryption"];
Loading

Another possibility

graph LR;
a["mail collector attach"]--> b["gpg parser, inheriting from generic csv parser"];
Loading

@e3rd e3rd force-pushed the mail-attach-encrypt branch 2 times, most recently from 57b27f8 to b3c0b05 Compare July 15, 2025 16:09
@sebix sebix added this to the 3.5.0 milestone Jul 15, 2025
@sebix
Copy link
Member

sebix commented Jul 15, 2025

A separate OpenPGP parser (first variant) would strictly adhere to the rule "one bot one job", but the current code is not designed for nor tested with two parsers in a row.
As you did it now, adding the decryption to the collector, is in line with other post-collection processing, such as extracting from archives, so that's fine for me.

Forth variant makes less sense than the third. In third, the CSV parser could use an "OpenPGP Mixin".

@sebix sebix removed this from the 3.5.0 milestone Jul 15, 2025
@e3rd
Copy link
Member Author

e3rd commented Jul 16, 2025

Great! Thanks for having investigated, sebix! :) I am then ready to be merged – or is there anything I could do better?

Copy link
Contributor

@kamil-certat kamil-certat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the recommendation from @sebix I like the change, it looks good - however I'd slightly suggest adding negative test to ensure proper handling GPG errors

@kamil-certat
Copy link
Contributor

Ah, unfortunately, you need to add license files for not code test files (see example https://github.com/certtools/intelmq/blob/develop/docs/static/images/Logo_Intel_MQ.png.license)

@e3rd e3rd force-pushed the mail-attach-encrypt branch 4 times, most recently from 22a7423 to b86e313 Compare July 17, 2025 15:18
@e3rd
Copy link
Member Author

e3rd commented Jul 17, 2025

tests added, license solved :) I'm done now

Copy link
Member

@sebix sebix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this extensive work, including nice test cases.

I'm missing the documentation though (in docs/user/bots.md).

@@ -28,11 +31,19 @@ class MailAttachCollectorBot(MailCollectorBot):
mail_user: str = "<user>"
rate_limit: int = 60
subject_regex: str = "<subject>"
decrypt: bool = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As OpenPGP is not the only e-mail encryption method (S/MIME is also widespread), I suggest naming the parameter decrypt_openpgp or similar.

Suggested change
decrypt: bool = False
decrypt_openpgp: Optional[bool] = False

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about decrypt_gpg to be in line with other params? (If you mean it, decrypt_openpgp is alright.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The encryption standard is called OpenPGP. gnupg/gpg is the program we are using to decrypt it.

So I think opengpg is a better suffix, as the important information is the standard's name, not the name of the program that the bot uses internally.

Analogously, S/MIME is the standard, and gpgsm is one of the programs for en-/decryption.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems legit. Should I then use opengpg for other params too? (Which is longer)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd switch only _passphrase as gpg_home is related only to the gnupg itself. In my opinion _openpgp sounds generally better, but _gpg is also acceptable 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we have now three attributes:

  • decrypt_openpgp
  • openpgp_passphrase
  • gpg_home

Good luck for the users who are not accustomed to PGP nomenclature maze :D . But I admit it's right now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the first glance, it looks awful, but glancing at it for five minutes, it all makes sense xD
Luckily, they are grouped in the docs.

If the parameters were hierarchical, they looked more sensible:

  • decrypt
    • openpgp
      • enabled: yes
      • passphrase: xxx
      • gpg_home: yyy

The more I think of it, openpgp_passphrase is also ambiguous, as OpenPGP allows for asymmetric and symmetric encryption, so it could be the passphrase for symmetric decryption.
(If the parameter was also called gnupgp_passphrase, that's again ambiguous, as that could also mean a passphrase for unlocking gnupg itself, eg, the keyring or whatever).

So, finally openpgp_key_passphrase?
Advantage: If in the future someone adds a parameter to select a specific key, it would be called openpgp_key_id or openpgp_key_file, etc.

I hope that's my last thought on this topic 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, I might make them hierarchical, that would be the best!

"decrypt.openpgp": bool
"decrypt.openpgp.key_passphrase": str
"decrypt.openpgp.gpg_home": Path

Isn't it the best?

IMO, the key cannot be chosen, at least by GnuPG. It either is in the keyring, or not. (For encrypting, we choose keys, not for decrypting.)

@sebix sebix added feature Indicates new feature requests or new features component: bots labels Jul 19, 2025
@e3rd e3rd force-pushed the mail-attach-encrypt branch 4 times, most recently from 93e66e6 to 72c3c49 Compare July 21, 2025 10:42
@e3rd
Copy link
Member Author

e3rd commented Jul 21, 2025

All should be done :)

@e3rd e3rd force-pushed the mail-attach-encrypt branch from 72c3c49 to a8f56c7 Compare July 24, 2025 12:22
@e3rd e3rd force-pushed the mail-attach-encrypt branch from a8f56c7 to 6716704 Compare July 24, 2025 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: bots feature Indicates new feature requests or new features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants