Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hyperscan: add caching mechanism for hyperscan contexts v9 #12048

Closed

Conversation

lukashino
Copy link
Contributor

@lukashino lukashino commented Oct 28, 2024

Followup of #11888

Cache Hyperscan serialized databases to disk to prevent compilation of the same databases when Suricata is run again with the same ruleset.
Hyperscan binary files are stored per rulegroup in the designated
folder, by default in the cached library folder.
Since caching is per signature group heads,
some chunk of the ruleset can change and it still can reuse part of
the unchanged signature groups.

Loading fresh ET Open ruleset: 19 seconds
Loading cached ET Open ruleset: 07 seconds

Link to ticket: https://redmine.openinfosecfoundation.org/issues/7170

Describe changes:
v9:

  • rebase
  • split HS MPM codebase into multiple files - core functions, Hyperscan MPM functions, HS caching functions
  • recommitted the code
  • minor tweaks from the previous PR, docs update
  • made DetectEngineMpmCachingEnabled private (possible also for DetectEngineMpmCachingGetPath), but checkout v8 thread - decision needed
  • some discussion threads in v8 can/should be brought up again and be decided

v7: (v6 was private)

  • fix docs and add ticket number to the commit
  • fix privilege drop issue, files are created after privilege drop, tested also rule reload - it worked fine
  • refactor the util-mpm-hs code, primarily prepare function
  • rebase

v5:

  • rebased
  • commit message update
  • docs update

v4:

  • rebased
  • changed the default caching directory to somewhere /var/lib/suricata/cache/hs
  • custom cache directory path option added
  • docs added
  • the default settings changed - enabled on the config generation, disabled when the option is not present in the config

v3

  • rebased
  • MPM caching is still left on by default.

v2

  • improved styling to follow Suricata code styleguide
  • increased cache file name length from 10 to 20 characters
  • cache file name is a hash of the patterns - now only HS relevant fields are hashed - as long as the group of patterns itself is not changed then it is reused
  • minor refactors
  • added a safe variant of littlehash2 function
  • added suricata.yaml option to enable/disable caching
  • changed the storage location to the configured logging directory

v1

  • initial work to cache and load Hyperscan databases from the disk

Lukas Sismis added 8 commits October 28, 2024 17:28
This variant of hashlittle2() ensures that it avoids
accesses beyond the last byte of the string, which will
cause warnings from tools like Valgrind or Address
Sanitizer.
Cache Hyperscan serialized databases to disk to prevent compilation
of the same databases when Suricata is run again with the same
ruleset.
Hyperscan binary files are stored per rulegroup in the designated
folder, by default in the cached library folder.
Since caching is per signature group heads,
some chunk of the ruleset can change and it still can reuse part of
the unchanged signature groups.

Loading *fresh* ET Open ruleset:  19 seconds
Loading *cached* ET Open ruleset: 07 seconds

Ticket: 7170
Copy link

codecov bot commented Oct 28, 2024

Codecov Report

Attention: Patch coverage is 12.12121% with 174 lines in your changes missing coverage. Please review.

Project coverage is 68.14%. Comparing base (89aa525) to head (3e98865).
Report is 183 commits behind head on master.

❗ There is a different number of reports uploaded between BASE (89aa525) and HEAD (3e98865). Click for more details.

HEAD has 2 uploads less than BASE
Flag BASE (89aa525) HEAD (3e98865)
suricata-verify 1 0
unittests 1 0
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #12048       +/-   ##
===========================================
- Coverage   83.42%   68.14%   -15.29%     
===========================================
  Files         910      847       -63     
  Lines      257642   155185   -102457     
===========================================
- Hits       214934   105746   -109188     
- Misses      42708    49439     +6731     
Flag Coverage Δ
fuzzcorpus 61.58% <10.60%> (+0.01%) ⬆️
livemode 19.39% <11.61%> (-0.02%) ⬇️
pcap 44.45% <11.11%> (-0.05%) ⬇️
suricata-verify ?
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

@suricata-qa
Copy link

Information: QA ran without warnings.

Pipeline 23194

@catenacyber
Copy link
Contributor

Loading cached ET Open ruleset: 07 seconds

7 or 0.7 ?

Copy link
Member

@victorjulien victorjulien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incomplete review: noted I had some pending comments so here they are

static bool DetectEngineMpmCachingEnabled(void)
{
const char *strval = NULL;
if (ConfGet("detect.sgh-mpm-caching", &strval) != 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this double get of detect.sgh-mpm-caching should not be needed

@@ -1045,6 +1045,9 @@ typedef struct DetectEngineCtx_ {

/* number of signatures using filestore, limited as u16 */
uint16_t filestore_cnt;

/* If enabled, MPM matchers can store compiled pattern databases to disk */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: use doxygen style comment /** ... */

@lukashino
Copy link
Contributor Author

Loading cached ET Open ruleset: 07 seconds

7 or 0.7 ?

7 seconds, because well it's not just loading the ruleset it's the complete Suricata startup

@lukashino
Copy link
Contributor Author

Next #12394

@lukashino lukashino closed this Jan 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants