docs(user guide): Add benchmarks

crowdsecurity · Jun 27, 2024 · 98ae9c2
1 parent 0d28764
commit 98ae9c2
Showing 1 changed file with 155 additions and 1 deletion.
diff --git a/docs/USER_GUIDE.md b/docs/USER_GUIDE.md
@@ -1,6 +1,6 @@
 ![CrowdSec Logo](images/logo_crowdsec.png)
 
-# OpenCTI CrowdSec internal enrichment connector
+# OpenCTI CrowdSec external import connector
 
 ## User Guide
 
@@ -82,6 +82,160 @@ You will find a `config.yml.sample` file as example.
 
 
 
+### Example: Enrichment with recommended settings
+
+In this example, we chose `146.70.186.190` because it is currently reported for CVEs and MITRE techniques.
+
+The result of a CrowdSec enrichment should be similar to the following description:
+
+- With regard to the observable itself, you should see:
+  - A list of dark olive green scenario name labels (`crowdsecurity/http-admin-interface-probing`, `crowdsecurity/http-bad-user-agent`, etc.)
+  - A list of purple CVE labels (`cve-2021-41773`, etc.)
+  - A red `malicious` reputation label
+  - An external reference to the [CrowdSec CTI URL](https://app.crowdsec.net/cti/146.70.186.190)
+  - A note with some content (confidence, first seen, last seen, behaviors, targeted countries, etc.)
+  - A list of relationships:
+    - `related` relationships leading to vulnerabilities created from CVEs
+    - `based-on` relationship leading to a CrowdSec CTI indicator
+  - A sighting related to CrowdSec with the first and last seen information
+- As the `CROWDSEC_INDICATOR_CREATE_FROM` recommended setting contains the `malicious` reputation, an indicator has been created with:
+  - An external reference to the blocking list from which the flagged IP originates.
+  - A list of `indicates` relationships leading to attack patterns created from MITRE techniques
+    - If you follow one of these relationships, you can navigate to the created attack pattern, where you will see:
+      - An external reference to the MITRE ATT&CK URL
+      - A list of `targets` relationships leading to locations created from targeted countries (`Canada`, `Poland`, etc.)
+
 ### Quotas
 
 An API key is limited to 24 queries per day. This should be taken into account when setting the import job frequency (see above  `CROWDSEC_IMPORT_INTERVAL` configuration ).
+
+
+
+## Performance and metrics
+
+As mentioned in [this blog article](https://blog.filigran.io/opencti-platform-performances-e3431b03f822), it's really hard to say how long it will take to ingest thousands of IPs.
+The most honest answer is:
+
+> Well, it depends...
+
+We carried out benchmarking tests on 2 separate servers (Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz), using the OpenCTI 6.1 platform.
+
+- **Server 1** with 8GB RAM and 2 cores: we allocated 2GB to Elasticsearch
+- **Server 2** with 32GB RAM and 8 cores: we allocated 16GB to Elasticsearch
+
+We used the default `docker-compose.yml` file provided by OpenCTI. In particular, we have left 3 workers in both cases.
+
+The benchmarks were performed on a fresh installation, with no other connectors running (other than the default ones).
+
+We used two types of import configuration: 
+
+- a **LIGHT** import with the poorest possible enrichment: only an observable is created, with an external reference pointing to the CrowdSec CTI URL:
+
+```yaml
+- CROWDSEC_LABELS_REPUTATION=false
+- CROWDSEC_LABELS_SCENARIO_NAME=false
+- CROWDSEC_LABELS_SCENARIO_LABEL=false
+- CROWDSEC_LABELS_CVE=false
+- CROWDSEC_LABELS_MITRE=false
+- CROWDSEC_LABELS_BEHAVIOR=false
+- CROWDSEC_INDICATOR_CREATE_FROM=''
+- CROWDSEC_ATTACK_PATTERN_CREATE_FROM_MITRE=false
+- CROWDSEC_VULNERABILITY_CREATE_FROM_CVE=false
+- CROWDSEC_CREATE_NOTE=false
+- CROWDSEC_CREATE_TARGETED_COUNTRIES_SIGHTINGS=false
+- CROWDSEC_CREATE_SIGHTING=false
+```
+
+
+
+- a **FULL** import with the richest possible enrichment: all labels, objects and relationships:
+
+
+```yaml
+- CROWDSEC_LABELS_REPUTATION=true
+- CROWDSEC_LABELS_SCENARIO_NAME=true
+- CROWDSEC_LABELS_SCENARIO_LABEL=true
+- CROWDSEC_LABELS_CVE=true
+- CROWDSEC_LABELS_MITRE=true
+- CROWDSEC_LABELS_BEHAVIOR=true
+- CROWDSEC_INDICATOR_CREATE_FROM='malicious,suspicious,known'
+- CROWDSEC_ATTACK_PATTERN_CREATE_FROM_MITRE=true
+- CROWDSEC_VULNERABILITY_CREATE_FROM_CVE=true
+- CROWDSEC_CREATE_NOTE=true
+- CROWDSEC_CREATE_TARGETED_COUNTRIES_SIGHTINGS=true
+- CROWDSEC_CREATE_SIGHTING=true
+```
+
+
+
+We analyzed 3 metrics:
+
+- the *CrowdSec Python process time*: the time required by the connector's Python process to handle a given number of IPs. It measures the time needed to retrieve the CrowdSec dump, analyze all the IPs, format all the available data to enrich an observable and send all the necessary bundles to the OpenCTI workers.
+- the *Total time for ingestion*: the total time to ingest all IPs. This is the time elapsed between the start and end of the import.
+- the *Average number of bundles ingested per second*: the number of bundles sent divided by the total time for ingestion.
+
+
+
+We have obtained the following benchmarks.
+
+### Light import vs Full import
+
+To compare a light import and a full import, we imported 2000 IP addresses on Server 1.
+
+|                                               | Light Import | Full Import      |
+| --------------------------------------------- | ------------ | ---------------- |
+| CrowdSec Python process time                  | 90s          | 380s             |
+| Number of bundles sent                        | 2000         | 39119            |
+| Total time for ingestion                      | 393s (6m33s) | 7050s (1h57m30s) |
+| Average number of bundles ingested per second | 5.08         | 5.55             |
+
+
+
+We can see that, depending on enrichment quality, the total ingestion time varies by a factor of about 18 (393s vs 7050s).
+
+
+
+### Server 1 vs Server 2
+
+To compare Server 1 and Server 2 performances, we used a full import of 2000 IP addresses.
+
+As the IP addresses retrieved and the associated CTI data vary from import to import, the number of bundles sent also varies.
+
+|                                               | Server 1         | Server 2        |
+| --------------------------------------------- | ---------------- | --------------- |
+| CrowdSec Python process time                  | 380s             | 289s            |
+| Number of bundles sent                        | 39119            | 38980           |
+| Total time for ingestion                      | 7050s (1h57m30s) | 4925s (1h22m5s) |
+| Average number of bundles ingested per second | 5.55             | 7.91            |
+
+
+
+We can see that, depending on server configuration, the total ingestion time varies by a factor of about 1.4 (7050s vs 4925s).
+
+
+
+### Comparison by number of IPs
+
+We also compared the time needed to ingest different numbers of IP addresses using Server 2 and a full import configuration.
+
+|                                               | 2000 IPs        | 10000 IPs         | 50000 IPs             |
+| --------------------------------------------- | --------------- | ----------------- | --------------------- |
+| CrowdSec Python process time                  | 289s            | 1436s (23m56s)    | 2236s (1h36m6s)       |
+| Number of bundles sent                        | 38980           | 195519            | 971287                |
+| Total time for ingestion                      | 4925s (1h22m5s) | 30346s (8h25m46s) | 305125s (3d12h45m25s) |
+| Average number of bundles ingested per second | 7.91            | 6.44              | 3.18                  |
+
+
+
+We can see that, with the current server configuration, the more IP addresses we ingest, the lower the average number of bundles ingested per second. 
+
+
+
+
+
+
+
+
+
+
+
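The derived figures in the benchmark tables of this diff (human-readable durations and average bundles per second) can be rechecked with a few lines of Python. This is a minimal sketch and not part of the connector: `format_duration` and `bundles_per_second` are hypothetical helper names, and the input figures are simply copied from the "Number of bundles sent" and "Total time for ingestion" rows.

```python
def format_duration(seconds: int) -> str:
    """Render a number of seconds in the tables' style, e.g. '1h22m5s'."""
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    parts = []
    if days:
        parts.append(f"{days}d")
    if days or hours:
        parts.append(f"{hours}h")
    if days or hours or minutes:
        parts.append(f"{minutes}m")
    parts.append(f"{secs}s")
    return "".join(parts)


def bundles_per_second(bundles: int, total_seconds: int) -> float:
    """Average ingestion rate: bundles sent divided by total ingestion time."""
    return bundles / total_seconds


# (bundles sent, total ingestion time in seconds), copied from the tables.
runs = {
    "Server 1, light, 2000 IPs": (2000, 393),
    "Server 1, full, 2000 IPs": (39119, 7050),
    "Server 2, full, 2000 IPs": (38980, 4925),
    "Server 2, full, 10000 IPs": (195519, 30346),
    "Server 2, full, 50000 IPs": (971287, 305125),
}

for name, (bundles, seconds) in runs.items():
    rate = bundles_per_second(bundles, seconds)
    print(f"{name}: {format_duration(seconds)}, {rate:.2f} bundles/s")
```

Dividing two total times in the same way gives the ratios quoted in the text, for example `7050 / 393` for light versus full import.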