/src/
contains the source code for this project. The assemble_dataset.py
script assembles the multimodal dataset of MalNet images and function call graphs. The Jupyter notebook experiments.ipynb
contains the experiments and results presented in our paper:
- CNNs on binary images
- Semantic information encoding (RGB vs greyscale)
- Transfer learning (ImageNet)
- Plain CNN grid optimisation
- Advanced CNN architectures (ResNet18, DenseNet121, MobileNetV2)
- GNNs on function call graphs (GCN, GIN)
- Fusion strategies (2 intermediate, 2 late)
- Optimising late-fusion ensembles (meta-classifier)
  - Grid optimisation
    - Plain CNN + GCN
    - Plain CNN + GIN
  - Bayesian optimisation
    - MobileNetV2 + GCN
    - DenseNet121 + GCN
    - DenseNet121 + GIN
    - MobileNetV2 + DenseNet121 + GCN + GIN
  - Grid optimisation
- Tables and figures
  - Fusion strategies
  - Base and late-fusion models
  - Confusion matrices (plain CNN + GCN, DenseNet121 + GIN)
  - UMAP analysis (DenseNet121 + GIN)
  - ROC curve (DenseNet121 + GIN)
  - SHAP graphs (DenseNet121 + GIN)
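As a rough illustration of the grid-optimised late-fusion idea listed above: late fusion combines the per-class probabilities produced by each modality's base model, and a simple grid search can pick the mixing weight. Everything below is a synthetic, self-contained sketch (the data, model names, and helper function are illustrative assumptions, not the repository's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic class probabilities from two hypothetical base models
# (e.g. a CNN over images and a GCN over call graphs): 200 samples, 5 classes.
n, k = 200, 5
labels = rng.integers(0, k, size=n)

def noisy_probs(boost):
    """Simulate a classifier's softmax outputs; larger `boost` = more accurate."""
    logits = rng.normal(0.0, 1.0, size=(n, k))
    logits[np.arange(n), labels] += boost  # favour the true class
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

p_cnn = noisy_probs(2.0)
p_gcn = noisy_probs(1.5)

# Late fusion: weighted average of probabilities; choose the weight by grid search.
best_w, best_acc = 0.0, 0.0
for w in np.linspace(0.0, 1.0, 21):
    fused = w * p_cnn + (1 - w) * p_gcn
    acc = (fused.argmax(axis=1) == labels).mean()
    if acc > best_acc:
        best_w, best_acc = w, acc

print(f"best weight = {best_w:.2f}, fused accuracy = {best_acc:.3f}")
```

Because the grid includes the endpoints w = 0 and w = 1, the fused ensemble can never score below the better of the two base models on the tuning data; the meta-classifier variant in the notebook replaces this fixed weighting with a learned model over the stacked probabilities.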
Code authored by James Arrowsmith; please direct correspondence to [email protected]
Citation
Arrowsmith, J.; Susnjak, T.; Jang-Jaccard, J. Multimodal Deep Learning for Android Malware Classification. Mach. Learn. Knowl. Extr. 2025, 7, 23. https://doi.org/10.3390/make7010023
@article{arrowsmith2025multimodal,
  author  = {Arrowsmith, James and Susnjak, Teo and Jang-Jaccard, Julian},
  title   = {Multimodal Deep Learning for Android Malware Classification},
  journal = {Machine Learning and Knowledge Extraction},
  year    = {2025},
  volume  = {7},
  number  = {1},
  pages   = {23},
  doi     = {10.3390/make7010023},
  url     = {https://doi.org/10.3390/make7010023}
}