Extract useful semantics from CVE descriptions using NLP

Phat3/CVE-analyzer

CVE analyzer

This project aims to extract semantic information from a collection of vulnerability reports written in plain English. The information is encoded and retrieved by running Named Entity Recognition (NER) on each description. The currently available labels are:

  • FUNCTION: Vulnerable function name.
  • VERSION: Vulnerable version of the target program.
  • SOURCECE: Path to the source code that contains the vulnerable function/functions.
  • DRIVER: Driver that the attacker needs to interact with to trigger the exploit.
  • STRUCT: Malformed struct that contains the bug.
  • VULNERABILITY: Type of the vulnerability (e.g. buffer overflow).
  • CAPABILITY: Capability that the attacker gains after successfully exploiting the vulnerability (e.g. remote code execution).
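
To make the label scheme concrete, here is a toy tagger that marks a few of the labels above in a description. It is only a sketch: it uses hand-written regular expressions, not the trained NER model this project relies on, and it is not part of the `cve_analyzer` package. The patterns, the `tag_description` helper, and the example sentence are all invented for illustration.

```python
import re

# Hypothetical hand-written patterns, one per label. The project itself uses
# NER; these regexes only illustrate what each label is meant to capture.
PATTERNS = {
    "FUNCTION": re.compile(r"\b(\w+) function\b"),
    "SOURCECE": re.compile(r"\b[\w/.-]+\.[ch]\b"),
    "VERSION": re.compile(r"\bbefore (\d+(?:\.\d+)+)"),
    "VULNERABILITY": re.compile(r"buffer overflow|use-after-free"),
    "CAPABILITY": re.compile(r"gain privileges|execute arbitrary code"),
}

# Invented sentence written in the style of a CVE description (not a real CVE).
EXAMPLE = (
    "The foo_read function in drivers/foo/bar.c in the Linux kernel "
    "before 4.14.2 allows local users to cause a buffer overflow "
    "and gain privileges."
)


def tag_description(description):
    """Return (start, end, label, text) tuples sorted by position."""
    entities = []
    for label, pattern in PATTERNS.items():
        # Use the first capture group when the pattern defines one,
        # otherwise the whole match.
        group = 1 if pattern.groups else 0
        for match in pattern.finditer(description):
            entities.append(
                (match.start(group), match.end(group), label, match.group(group))
            )
    return sorted(entities)


if __name__ == "__main__":
    for start, end, label, text in tag_description(EXAMPLE):
        print(f"{label:<14}{text}")
```

Each entity is a labelled character span, which is the same shape of output an NER model produces. A trained model generalises to phrasings that fixed patterns like these would miss.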

Dataset

The dataset on which the initial version of the project was developed and tested is the list of Common Vulnerabilities and Exposures (CVE) entries for the Linux kernel from 2017 and 2018. The dataset can be found on the CVE Details website.

The dataset is formatted as Comma-Separated Values (CSV), but it has been simplified from its original version: only the description field is taken into account.
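
As a rough illustration of the simplification described above, the following sketch reads a CSV file and keeps only the description text. The column name `description` and the `load_descriptions` helper are assumptions made for this example; the actual header in the dataset may differ. The sample rows are invented.

```python
import csv
import io


def load_descriptions(csv_file, column="description"):
    """Return the non-empty values of one column from an open CSV file.

    The default column name is an assumption; adjust it to match the
    header of the real dataset.
    """
    reader = csv.DictReader(csv_file)
    descriptions = []
    for row in reader:
        # Missing cells come back as None, so fall back to an empty string.
        text = (row.get(column) or "").strip()
        if text:
            descriptions.append(text)
    return descriptions


# Invented sample data standing in for the real file.
SAMPLE_CSV = (
    "description\n"
    '"First invented description, with a comma."\n'
    "Second invented description.\n"
    "\n"
)

if __name__ == "__main__":
    for text in load_descriptions(io.StringIO(SAMPLE_CSV)):
        print(text)
```

With a real file, pass a handle opened with `open(path, newline="")`, as the `csv` module documentation recommends.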

Installation

Install the project and all its dependencies with:

pip install cve_analyzer