This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

Unofficial antivirus definitions using Fangfrisch #212

Open. Wants to merge 50 commits into base branch master.

Conversation

gchamon
Contributor

@gchamon gchamon commented Mar 2, 2023

This PR merges the clamd usage from #112, which also enables scanning configuration (missing from clamscan, but present in clamdscan). It also uses #202 (comment) as inspiration to fix the missing libpcre.so issues.

Related issue: #18

The original logic for downloading database definitions is kept intact. There is a new step added to download all extra database definitions regardless of database prefix (rfxn.hdb and rfxn.yara are actually two different databases, for instance).

The upload logic is refactored but is largely the same.

The Dockerfile is refactored to include fewer layers (which reduced build time by half).

Fangfrisch (https://github.com/rseichter/fangfrisch) is installed as a CLI in a build stage based on the official AWS Lambda Docker image. This avoids having to rewrite the shebang of the fangfrisch executable, which inherits the location of the Python interpreter from the build environment (/usr/bin/python3, which is Python 3.7 in Amazon Linux 2, whereas the AWS Lambda runtime environment actually provides 3.6, which breaks sqlalchemy).

The configuration for fangfrisch and clamdscan was taken from this blog post: https://blog.frehi.be/2021/01/25/using-fangfrisch-to-improve-malware-e-mail-detection-with-clamav/

The fangfrisch configuration is static, because making it configurable would require a flexible inventory of the database definition files, which would in turn require more extensive work in the upload/download logic. The only option exposed is whether to use the extra unofficial definitions, controlled through the environment variable AV_EXTRA_VIRUS_DEFINITIONS.
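As a minimal sketch of how such a toggle could be read, assuming a str_to_bool helper similar to the one mentioned in the commits below. The accepted truthy strings, the default value, and the function extra_definitions_enabled are assumptions made for illustration, not code taken from this PR:

```python
import os

# Assumed set of truthy spellings; the helper in the PR may accept others.
_TRUTHY = {"1", "true", "yes", "on"}


def str_to_bool(value):
    """Interpret a string as a boolean (hypothetical re-implementation)."""
    return str(value).strip().lower() in _TRUTHY


def extra_definitions_enabled(environ=None):
    """Return True when AV_EXTRA_VIRUS_DEFINITIONS holds a truthy value."""
    environ = os.environ if environ is None else environ
    # Defaulting to "false" is an assumption made for this sketch.
    return str_to_bool(environ.get("AV_EXTRA_VIRUS_DEFINITIONS", "false"))
```

Passing the environment as an argument keeps the check easy to exercise without touching the real process environment.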

- add extra config variables to control fangfrisch usage
- refactor use of str_to_bool to avoid repetition
- running aws sync as subprocess
- running fangfrisch as subprocess
uncompressed lambda zip was hitting size limit, maybe downgrading will
shrink size
must still verify if extra virus definitions get pulled from S3
from bluesentry#112
@CLAassistant

CLAassistant commented Mar 2, 2023

CLA assistant check
All committers have signed the CLA.

/tmp/usr/sbin/clamd \
/tmp/usr/bin/freshclam \
/tmp/usr/lib64/* \
/usr/lib64/libpcre.so* \
Contributor Author

@gchamon gchamon Mar 2, 2023


must copy all libpcre.so variants using wildcard, because libpcre.so.1 is a symbolic link sometimes

Comment on lines +100 to +105
if older_files:
    print("Not downloading the following older files in series:")
    print(json.dumps(list(older_files)))
if md5_matches:
    print("Not downloading the following files because local md5 matches s3:")
    print(json.dumps(list(md5_matches)))
Contributor Author


this is to avoid extremely long logs in cloudwatch, which makes it harder to debug issues
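To illustrate the idea, here is a rough sketch of collecting the skipped files first and logging them once as a summary, rather than emitting one log entry per file. The helper names (md5_of_file, split_by_md5, log_skipped) and the two input mappings are hypothetical and only assumed for this example; they are not the functions used in the PR:

```python
import hashlib
import json


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a local file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_by_md5(local_paths, remote_md5s):
    """Partition definition files into (to_download, md5_matches).

    local_paths: hypothetical mapping of filename -> local path.
    remote_md5s: hypothetical mapping of filename -> MD5 recorded for the S3 object.
    """
    to_download, md5_matches = [], []
    for name, remote_md5 in sorted(remote_md5s.items()):
        path = local_paths.get(name)
        try:
            local_md5 = md5_of_file(path) if path else None
        except OSError:
            local_md5 = None  # missing or unreadable local copy: download again
        (md5_matches if local_md5 == remote_md5 else to_download).append(name)
    return to_download, md5_matches


def log_skipped(md5_matches):
    """Print one summary for all skipped files instead of one entry per file."""
    if md5_matches:
        print("Not downloading the following files because local md5 matches s3:")
        print(json.dumps(md5_matches))
```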

Comment on lines +47 to +48
if clamav.update_defs_from_freshclam(AV_DEFINITION_PATH, CLAMAVLIB_PATH) != 0:
    return 1
Contributor Author

@gchamon gchamon Mar 2, 2023


we must know that freshclam broke, otherwise the lambda would return a success status while the virus definitions in the s3 bucket would stay outdated
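The same principle can be sketched for any updater that runs as a subprocess: check each exit code and stop at the first failure. The functions run_step and update_definitions and the shape of the steps argument are assumptions for illustration only, not the code in this PR:

```python
import subprocess
import sys


def run_step(command):
    """Run one update command and return its exit code."""
    completed = subprocess.run(command, check=False)
    return completed.returncode


def update_definitions(steps):
    """Run each step in order and stop at the first failure.

    `steps` is a hypothetical list of argv lists (for example a freshclam
    call followed by a fangfrisch call). Returns 0 only if all succeed.
    """
    for command in steps:
        code = run_step(command)
        if code != 0:
            print("Update step failed with exit code %d: %s" % (code, command[0]))
            return code
    return 0
```

A caller such as the lambda handler could then return or raise on a non-zero result, so that the invocation is not reported as successful while the definitions in the bucket are stale.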

@waycarbon-buildserver waycarbon-buildserver deleted the feature/fangfrisch-extra-defs branch May 3, 2023 19:49

2 participants