Paimon

Paimon: Patch Identification Monster

0. Introduction

Paimon is an extended version of GraphSPD [1], which is a graph-based security patch detection program.

Compared with GraphSPD, Paimon makes the following changes:

  • multiple bug fixes, a more modular design, and more configurable arguments (see args.py).
  • a new (statement) node alignment and graph merging algorithm with lower overhead and faster execution.
  • accelerated graph slicing with a reworked algorithm based on graph theory and matrix operations.
  • node embeddings generated with more advanced methods, e.g., CodeBERT.
  • updated default hyper-parameters for the graph learning models.

Citation:

[1] Shu Wang, Xinda Wang, Kun Sun, Sushil Jajodia, Haining Wang, and Qi Li, “GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics,” 2023 IEEE Symposium on Security and Privacy (S&P 2023), San Francisco, CA, USA, 2023, pp. 2409-2426, doi: 10.1109/SP46215.2023.10179479.

1. Dependencies

Paimon runs in a conda environment set up as follows. The setup assumes a GPU with CUDA 11.7.

$ conda create -n paimon python=3.9
$ conda activate paimon
$ conda install numpy scipy transformers
$ conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
$ conda install pyg -c pyg

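If you use pip instead, run the following commands. The PyG wheel index in the last command targets PyTorch 1.13.0 with CUDA 11.7, so the installed torch version should match it.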

$ pip install numpy scipy transformers
$ pip install torch torchvision torchaudio
$ pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv torch_geometric -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
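After either setup, a quick check can confirm that the packages are visible to the interpreter. The following is a minimal sketch, not part of Paimon: the helper name `find_missing` and the list `REQUIRED_MODULES` are hypothetical, and the module names are assumed to be the import names of the packages installed above.

```python
import importlib.util

# Assumption: these are the top-level import names of the packages installed above.
REQUIRED_MODULES = ["numpy", "scipy", "transformers", "torch", "torch_geometric"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only locates the modules and does not import them, so it stays fast. Once torch imports successfully, `torch.cuda.is_available()` reports whether a GPU is visible to PyTorch.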

2. Model Training

All commands are executed from the root folder, <PATH_TO_FOLDER>/Paimon/, which is referred to as <root> in the following instructions.

Option 1: Train the model for the first time.

If you are training the model for the first time, use the following command.

python paimon.py --task train

You can list the available arguments with python paimon.py --help

Option 2: Train the model with processed data.

If you have already processed the dataset, you can train the model with the --train_only flag, which saves a lot of data-processing time.

python paimon.py --task train --train_only

Option 3: Train the twin network model.

If you do not use PatchCPG, you can train the twin network model with the following command.

python paimon.py --task train --twin 

The --train_only flag is also available if you have already processed the dataset.

3. Model Testing

Test the model using the following command.

python paimon.py --task test

If you test the twin network model, also include the --twin flag.
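The training and testing commands can be chained from a small driver script. The following is a minimal sketch, not part of Paimon: the functions `build_command` and `run_pipeline` are hypothetical helpers, and they use only the flags shown in this README (--task, --twin, --train_only). paimon.py may accept other tasks and arguments; see python paimon.py --help.

```python
import subprocess
import sys


def build_command(task, twin=False, train_only=False):
    """Assemble the argument list for one paimon.py invocation."""
    # Only the two tasks shown in this README are accepted by this sketch.
    if task not in ("train", "test"):
        raise ValueError(f"task not covered by this sketch: {task!r}")
    # The README describes --train_only for training only.
    if train_only and task != "train":
        raise ValueError("--train_only is only described for training")
    cmd = [sys.executable, "paimon.py", "--task", task]
    if twin:
        cmd.append("--twin")
    if train_only:
        cmd.append("--train_only")
    return cmd


def run_pipeline(twin=False, train_only=False):
    """Train, then test; raise CalledProcessError if a step exits non-zero."""
    for task in ("train", "test"):
        cmd = build_command(
            task, twin=twin, train_only=train_only and task == "train"
        )
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline(twin=True)` from <root> would run the twin network training followed by the twin network test, passing --twin to both steps as described above.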

Appendix

The old version of GraphSPD is also included in this repo. See the Old_ReadMe for instructions.
