A data augmentation-based system for future malware prediction

Adesina Simon Sodiya; Enoch Okikijesu Oluwumi; Saidat Adebukola Onashoga; Olubunmi Adewale Akinola

Published:

Sep 13, 2022

Keywords:

Fully-connected neural network; Generative adversarial network; Malware strains; New datapoints.

Issue

Vol. 5 No. 2 (2021)

Section

Articles

Abstract

Malware detection is important for computer security. However, existing signature-based malware detection systems remain imperfect because they are designed to recognize only already established patterns of malicious code. In this work, a malware prediction model was developed to intelligently discover future malware strains, in order to improve the capability of malware detection systems. The method adopts the "malware in vision context" paradigm, in which malware is represented as images. In particular, the method generates new data points from a malware data distribution using a generative adversarial network (GAN) parameterized with a fully-connected neural network (FCN) architecture. The developed model generates malware images from a 100-dimensional Gaussian noise distribution and learns to distinguish them from real malware images. The generated malware is similar to, but not the same as, the real malware, as it contains modified features when compared with real malware. To establish the feasibility of the proposed method for malware research, an experiment was conducted using Malimg, an image-based malware dataset. Due to certain technical constraints discussed in the study, 52.83% of the Malimg dataset was used to generate new malware data, which yielded 224.98%, amounting to 98.66% of the original dataset. Metrics such as Mean Squared Error (MSE), Structural Similarity Index (SSIM) and a customized enhancer (ABV) were used to evaluate the generated images. The best scores obtained for MSE, SSIM and ABV are 0.02, 0.91 and 1.00 respectively, while the worst scores are 0.07, 0.02 and 0.68 respectively. The uniqueness of the generated malware was also established. These metrics showcase an exemplary, yet simplistic, approach to malware prediction and data augmentation.
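To make the pipeline described in the abstract more concrete, the following is a minimal sketch in plain Python of two of its ingredients: a fully-connected generator that maps a 100-dimensional Gaussian noise vector to a flat image, and the MSE score between two images. It is an illustration under stated assumptions, not the authors' implementation: the layer sizes, the ReLU and sigmoid activations, the 8x8 output size, and all function names are hypothetical, the network is untrained, and the discriminator, the adversarial training loop, SSIM and ABV are omitted.

```python
import math
import random


def sample_noise(dim=100, rng=None):
    """Draw one latent vector from a standard Gaussian distribution."""
    rng = rng or random.Random()
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def init_layer(n_in, n_out, rng):
    """Randomly initialise one fully-connected layer (weights, biases)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation):
    """Apply y = activation(W x + b) for one fully-connected layer."""
    weights, biases = layer
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def relu(a):
    return max(0.0, a)


def sigmoid(a):
    # tanh form of the logistic function; avoids overflow for large |a|.
    return 0.5 * (math.tanh(0.5 * a) + 1.0)


def generate_image(noise, layers):
    """Map a noise vector to a flat image with pixel values in [0, 1]."""
    hidden = noise
    for layer in layers[:-1]:
        hidden = dense(hidden, layer, relu)
    return dense(hidden, layers[-1], sigmoid)


def mse(img_a, img_b):
    """Mean squared error between two flat images of equal length."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)


# Hypothetical sizes: 100-d noise -> 32 hidden units -> 8x8 = 64 pixels.
rng = random.Random(0)
layers = [init_layer(100, 32, rng), init_layer(32, 64, rng)]
fake_image = generate_image(sample_noise(100, rng), layers)
reference_image = [0.5] * 64  # stand-in for a real, normalised malware image
score = mse(fake_image, reference_image)
```

In a full GAN, a second fully-connected network (the discriminator) would be trained to separate real from generated images, while the generator weights are updated to make that separation harder. Here `score` only shows how an MSE value between a generated and a reference image would be computed; lower values mean the two images are closer pixel by pixel.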

International Journal of Information Security, Privacy and Digital Forensics