This repository accompanies my term paper, in which I conduct a comparative study of log anomaly detection models based on reconstruction learning and representation learning, respectively.
The first model is AutoLog[1], an autoencoder that represents the reconstruction learning approach in my paper. It supports semi-supervised learning and detects anomalous log timeframes by setting a threshold on the reconstruction error of the model's output. (The authors' official implementation can be found here.)
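As a minimal sketch of this idea (not AutoLog's exact architecture; the layer sizes and the thresholding helper are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SketchAutoencoder(nn.Module):
    """Illustrative autoencoder; layer widths are arbitrary choices."""
    def __init__(self, n_features: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, 16))
        self.decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, n_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def flag_anomalies(model: nn.Module, x: torch.Tensor, threshold: float) -> torch.Tensor:
    # Per-sample reconstruction error; timeframes above the threshold
    # are flagged as anomalous.
    with torch.no_grad():
        err = ((model(x) - x) ** 2).mean(dim=1)
    return err > threshold
```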
The second model is CLDTLog[2], a fine-tuned version of Google's BERT[3] language model. It is trained in a supervised setting, using a triplet loss to separate log embeddings and a focal loss to address the class imbalance between normal and anomalous samples.
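For intuition, here is a hedged sketch of how the two objectives can be combined; the margin, alpha, gamma, and the weighting scheme are illustrative assumptions, not the exact formulation from the CLDTLog paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

triplet_loss = nn.TripletMarginLoss(margin=1.0)  # margin value is an illustrative choice

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # Standard binary focal loss: down-weights easy (mostly normal) examples
    # so the rare anomalous class contributes more to the gradient.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-ce)                                  # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

def combined_loss(anchor, positive, negative, logits, targets, w=0.5):
    # Weighted sum of the embedding-separation and classification objectives;
    # the fixed weight w is an assumption for illustration.
    return w * triplet_loss(anchor, positive, negative) + (1 - w) * focal_loss(logits, targets)
```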
Both models are trained and tested on the public Loghub[4] Blue Gene/L (BGL) supercomputer log dataset, which contains log data labeled as either normal or anomalous. (Can be found here.)
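For reference, a minimal sketch of how labels are commonly derived from the raw BGL data, where the first token of a line is "-" for normal entries and an alert category tag otherwise (the file name below is illustrative):

```python
def label_bgl_line(line: str) -> int:
    # In raw BGL, the first token is "-" for normal lines and an alert
    # category tag for anomalous ones.
    return 0 if line.split(maxsplit=1)[0] == "-" else 1

with open("BGL.log", encoding="utf8", errors="replace") as f:  # path is illustrative
    labels = [label_bgl_line(line) for line in f if line.strip()]
```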
The goal of the paper is to compare the AutoLog and CLDTLog models on real-world log data, both in terms of anomaly detection performance, measured by recall, precision, and F1 score, and in terms of efficiency, measured by model throughput.
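A minimal sketch of how such an evaluation can be computed (function and variable names here are assumptions, not this repository's API; scikit-learn is used for the classification metrics):

```python
import time
from sklearn.metrics import precision_recall_fscore_support

def evaluate(predict_fn, samples, y_true):
    # predict_fn maps one sample to a 0/1 prediction; illustrative interface.
    start = time.perf_counter()
    y_pred = [predict_fn(s) for s in samples]
    elapsed = time.perf_counter() - start

    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary"
    )
    ms_per_sample = elapsed / len(samples) * 1000  # throughput in ms per sample
    return precision, recall, f1, ms_per_sample
```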
Warning
The code was developed and tested exclusively with Python version 3.13.2 and might not work on older versions.
The installation script install_deps.bat is executable on Windows. If you wish to run this project on another operating system, please check the source code and manually replicate the performed steps.
If you want to run the code yourself, you will need an NVIDIA GPU. If you wish to train and evaluate the models on your CPU instead, you will have to alter the source code yourself, since tensor and model conversions to CUDA are hard-coded in some places.
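If you attempt such a CPU port, the usual device-agnostic PyTorch idiom looks like this (a generic pattern, not a drop-in patch for this codebase):

```python
import torch
import torch.nn as nn

# Pick CUDA when available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(8, 1).to(device)         # instead of nn.Linear(8, 1).cuda()
batch = torch.randn(4, 8, device=device)   # instead of torch.randn(4, 8).cuda()
output = model(batch)
```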
- Clone this repository, either:
  - by cloning with git on the command line:
    git clone https://github.com/fietensen/wab-3-log-anomaly-detection.git
  - or by downloading and extracting the ZIP file after clicking the <> Code button at the top of this page
- Install Python (version 3.13.2) from python.org
- Install the NVIDIA CUDA Toolkit (version 11) from developer.nvidia.com
- Install NVIDIA cuDNN (version 10) from developer.nvidia.com
- Install the Python dependencies and run the model:
  - Press WIN+R, type cmd, and press Enter to open a Command Prompt
  - Navigate to the cloned repository by executing:
    cd C:\Path\To\wab-3-log-anomaly-detection\
  - Install the prerequisites by executing:
    .\install_deps.bat
  - Activate the created virtual environment:
    .\venv\Scripts\activate.bat
  - Run the model training and evaluation script:
    python -m model_eval
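After installation, a quick sanity check that the GPU is visible can save a failed training run. This assumes install_deps.bat installs PyTorch into the virtual environment (the CUDA-dependent code suggests it does):

```python
# Run inside the activated virtual environment. If the import fails,
# check what install_deps.bat actually installs.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```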
[1] Catillo, M., Pecchia, A., & Villano, U. (2022). AutoLog: Anomaly Detection by Deep Autoencoding of System Logs. Expert Systems with Applications, 191, 116263. https://doi.org/10.1016/j.eswa.2021.116263
[2] Tian, G., Luktarhan, N., Wu, H., & Shi, Z. (2023). CLDTLog: System Log Anomaly Detection Method Based on Contrastive Learning and Dual Objective Tasks. Sensors, 23(11), 5042. https://doi.org/10.3390/s23115042
[3] Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint, arXiv:1810.04805. https://arxiv.org/abs/1810.04805
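[4] He, S., Zhu, J., He, P., & Lyu, M. R. (2020). Loghub: A Large Collection of System Log Datasets towards Automated Log Analytics. arXiv preprint, arXiv:2008.06448. https://arxiv.org/abs/2008.06448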
This repository makes use of code from the following external sources (libraries excluded):
@AktGPT - Fast Online Triplet Mining in PyTorch:
Implements classes for efficiently performing semi-hard triplet mining.
@rjnclarke - Fine-Tune an Embedding Model with Triplet Margin Loss in PyTorch:
Implements a batch sampler ensuring that batches for CLDTLog always contain both normal and anomalous samples, so that triplets can be mined in an online fashion (see the sketch below). The sampler was slightly altered for this implementation.
KDE plot of logging-entity scores for the AutoLog model
Training and validation loss curves for the AutoLog model (50 epochs)
Training and validation loss curves for the CLDTLog model (10 epochs)
Comparison of AutoLog and CLDTLog metrics: F1 score, precision & recall
Comparison of AutoLog and CLDTLog throughput: per sample in milliseconds (log scale)