Releases: vulnerability-lookup/VulnTrain

Release 2.0.0

05 Sep 12:35
v2.0.0
1809d72

News

  • Dataset generation: Introduced a new script to build datasets of structured vulnerabilities enriched with CWE identifiers and corresponding patches.
    Each entry now includes the Git commit message and the full diff (Base64-encoded).
    #10 by @3LS3-1F
  • Model generation: Added a new trainer for predicting CWE classifications from vulnerability descriptions and associated patches (commit messages).
    #10 by @3LS3-1F
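
Since the diff in each entry is Base64-encoded, a consumer has to decode it before reading it or feeding it to a tokenizer. The following is a minimal sketch of that step. The field names `commit_message` and `patch_b64` and the sample entry are assumptions made for illustration; they are not taken from the dataset's documented schema.

```python
import base64


def decode_patch(entry: dict) -> str:
    """Return the diff text stored as Base64 in a dataset entry.

    The key "patch_b64" is a hypothetical field name used for this sketch.
    """
    raw = base64.b64decode(entry["patch_b64"])
    # A diff may contain bytes that are not valid UTF-8, so replace them
    # instead of raising an exception.
    return raw.decode("utf-8", errors="replace")


# Build a small, made-up entry to show the round trip.
diff_text = (
    "--- a/app.py\n"
    "+++ b/app.py\n"
    "@@ -1 +1 @@\n"
    "-eval(user_input)\n"
    "+safe_parse(user_input)\n"
)
entry = {
    "cwe": "CWE-95",
    "commit_message": "Replace eval with a safe parser",
    "patch_b64": base64.b64encode(diff_text.encode("utf-8")).decode("ascii"),
}

decoded = decode_patch(entry)
print(decoded)
```

Decoding with `errors="replace"` is a design choice for this sketch: it keeps the pipeline running on patches that touch binary files, at the cost of losing the exact bytes.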

Related resources shared via Hugging Face: https://huggingface.co/collections/CIRCL/vlai-for-cwe-guessing-68bab22e3d71b513146d13b3

Changes

  • Improved documentation and reorganized modules for better clarity and maintainability.
  • Updated dependencies to their latest stable versions.

Release 1.5.0

25 Jul 14:58
v1.5.0
51adde5

News

  • Dataset generation: Git fixes are now associated with the Common Weakness Enumerations (CWEs) found
    in security advisories. (#4)
  • Documentation is now available. (8a345ca)

Changes

  • Model generation: Added a boolean parameter to map_cvss_to_severity that switches between using the first non-null CVSS score and using the mean of all available CVSS scores. (ff6616e)
  • Dataset generation: Removed unneeded keys in extract_cnvd. (b7d694)
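
The two score-selection strategies can be illustrated with a short sketch. It is not the project's map_cvss_to_severity; the function names, the `use_mean` parameter, and the "Unknown" label are hypothetical. The severity bands follow the standard CVSS v3 qualitative rating scale, which is assumed here to be the target labeling.

```python
from statistics import mean
from typing import Optional, Sequence


def pick_cvss_score(
    scores: Sequence[Optional[float]], use_mean: bool = False
) -> Optional[float]:
    """Pick one score: the first non-null value, or the mean of all non-null values."""
    available = [s for s in scores if s is not None]
    if not available:
        return None
    return mean(available) if use_mean else available[0]


def severity_label(score: Optional[float]) -> str:
    """Map a base score to the CVSS v3 qualitative rating."""
    if score is None:
        return "Unknown"  # hypothetical label for entries without any score
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Example: scores from several CVSS versions, some of them missing.
scores = [None, 9.8, 5.0]
first_label = severity_label(pick_cvss_score(scores))
mean_label = severity_label(pick_cvss_score(scores, use_mean=True))
print(first_label, mean_label)
```

With these sample scores the two strategies give different labels, which is why the choice is worth exposing as a parameter: the first non-null score favors one CVSS version, while the mean smooths out disagreement between versions.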

Release 1.4.0

01 Jul 08:42
v1.4.0
d03079a

This version adds support for creating new AI-ready datasets based on the China National Vulnerability Database (CNVD). It also introduces a new training module that classifies vulnerabilities using text classification models tailored for CNVD data. By default, hfl/chinese-macbert-base is used, but hfl/chinese-bert-wwm-ext or google-bert/bert-base-chinese can be used instead.
By @3LS3-1F

Release 1.3.1

28 Apr 07:28
v1.3.1
b27bba3

Updated dependencies and fixed issues caused by changes in transformers.

Release 1.3.0

28 Apr 05:12
v1.3.0
f1c14a3

Changes

  • Updated dependencies.

Release 1.2.0

11 Mar 07:31
v1.2.0
d405b7d

Changes

  • Dataset generation: CVSS scores are now extracted from GitHub and PySec security advisories.
  • Dataset generation: CVSS, CPE, title and description (summary) are now extracted from CSAF documents.
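
As an illustration of the advisory extraction, the sketch below pulls a CVSS vector out of an advisory record. It assumes the record follows the OSV schema, in which `severity` is a list of objects with a `type` (such as `CVSS_V3`) and a `score` holding the vector string. The function name, the preference order, and the sample record are hypothetical and do not reflect the project's own code.

```python
from typing import Optional, Sequence


def extract_cvss_vector(
    advisory: dict,
    preferred: Sequence[str] = ("CVSS_V4", "CVSS_V3", "CVSS_V2"),
) -> Optional[str]:
    """Return the CVSS vector of the most preferred type present, if any."""
    by_type = {
        item.get("type"): item.get("score")
        for item in advisory.get("severity", [])
    }
    for cvss_type in preferred:
        vector = by_type.get(cvss_type)
        if vector:
            return vector
    return None


# Made-up OSV-style record; the identifier is a placeholder.
advisory = {
    "id": "GHSA-xxxx-xxxx-xxxx",
    "summary": "Example advisory",
    "severity": [
        {"type": "CVSS_V3", "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"},
    ],
}

vector = extract_cvss_vector(advisory)
print(vector)
```

Records without a `severity` list yield `None`, which lets the caller decide whether to skip the entry or fall back to another source.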

Release 1.1.0

27 Feb 07:44
v1.1.0
c94d3d0

News

  • Trainers: Added support for roberta-base in the text classifier, with improved
    settings for TrainingArguments.
  • Validators: Added a validator for severity classification.

Release 1.0.0

25 Feb 07:40
v1.0.0
3f11a97

News

  • Introduced a new trainer to automatically classify vulnerabilities based on their descriptions,
    even when CVSS scores are unavailable.
  • Added CVSS parsing to the dataset generation script.

Changes

  • Refactored the project structure for better organization.
  • Improved CPE parsing.
  • Enhanced the dataset generation script.
  • Optimized the trainer for text generation on vulnerability descriptions.
  • Improved command-line argument parsing.
  • Improved the process of pushing the tokenizer and trainer to Hugging Face.

Release 0.5.1

21 Feb 23:02
v0.5.1
2a250c1

Fixed the configuration module name.

Release 0.5.0

21 Feb 22:43
v0.5.0
6aaa31f

Added support for a configuration file.