This is a research project for CSCE 665. It analyses OSQuery logs, trains a model in an unsupervised fashion, and predicts anomalies on a test set.
Code Details:
- Libraries required:
  - pandas
  - numpy
  - scikit-learn
  - matplotlib
Steps to run: Before executing any of the following commands, 'train.log' and 'test.log' (for Windows) or 'train_lin.log' and 'test_lin.log' (for Linux) must be present in the 'data' directory.
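A quick way to confirm this prerequisite is a small check like the one below. It is only a sketch and is not part of main.py: the helper name missing_logs is hypothetical, and it assumes the 'data' directory is relative to the current working directory.

```python
import os

# File names are taken from this README; everything else here is illustrative.
REQUIRED_LOGS = {
    "windows": ["train.log", "test.log"],
    "linux": ["train_lin.log", "test_lin.log"],
}


def missing_logs(platform, data_dir="data"):
    """Return the required log files for `platform` that are absent from data_dir."""
    return [
        name
        for name in REQUIRED_LOGS[platform]
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


if __name__ == "__main__":
    for platform in sorted(REQUIRED_LOGS):
        missing = missing_logs(platform)
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print("{}: {}".format(platform, status))
```

Only the log files for the platform you intend to select need to be present.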
- Step 1: Create and save features
  - Run the command: $ python main.py create
  - When prompted, select the platform: enter 1 for Windows or 2 for Linux.
  - This creates the features and saves them in CSV format in the data directory.
- Step 2: Generate predictions from the features created in step 1
  - Run the command: $ python main.py test
  - You will be prompted to select the type of classifier.
  - After you choose an option, the selected model is created.
  - See anomalous.csv to analyse the anomalies predicted from the test data.
- The code was developed using Python 2.7 and should be compatible with later versions as well.
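For readers unfamiliar with the unsupervised workflow described above, here is a minimal sketch of the idea: fit a model on unlabelled training features, then flag test rows that look unusual. It is not the code in main.py. This README does not name the classifiers offered at the prompt, so scikit-learn's IsolationForest is used purely as an assumed example. The function name flag_anomalies, the column names f1 to f3, the synthetic data, and the output file name anomalous_example.csv are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest


def flag_anomalies(train_features, test_features, random_state=0):
    """Fit on unlabelled training features; return the test rows flagged as anomalous."""
    # IsolationForest is an assumed stand-in for whichever model is selected.
    model = IsolationForest(random_state=random_state)
    model.fit(train_features)
    labels = model.predict(test_features)  # +1 = inlier, -1 = anomaly
    return test_features[labels == -1]


if __name__ == "__main__":
    rng = np.random.RandomState(0)
    columns = ["f1", "f2", "f3"]
    # Synthetic stand-ins for the feature CSVs produced in step 1.
    train = pd.DataFrame(rng.normal(size=(200, 3)), columns=columns)
    test = pd.DataFrame(rng.normal(size=(50, 3)), columns=columns)
    test.iloc[0] = [8.0, 8.0, 8.0]  # plant one obvious outlier

    anomalous = flag_anomalies(train, test)
    anomalous.to_csv("anomalous_example.csv", index=False)
    print("{} of {} test rows flagged".format(len(anomalous), len(test)))
```

Because the model never sees labels, the flagged rows are candidates for manual review rather than confirmed anomalies.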