With the fast growth of malware's volume circulating in the wild, to obtain a timely and correct classification is increasingly difficult. Traditional approaches to automatic classification suffer from some limitations. The first one concerns the feature extraction: static approaches are hindered by code obfuscation techniques, while dynamic approaches are time consuming and evasion techniques often impede the correct execution of the code. The second limitation regards the building of the prediction models: the adequateness of a training dataset may degrade over time or can not be sufficient for some malware families or instances. With this paper we investigate the effectiveness of a new approach that uses malware visualization, for overcoming the problems related to the features selection and extraction, along with deep learning classification, whose performances are less sensitive to a small dataset than machine learning. The experiments carried out on twelve different neural network architectures and with a dataset of 20,199 malware, demonstrate that the proposed approach is successful as produced an F-measure of 99.97%.
Malware detection employed by visualization and deep neural network
Visaggio C. A.;
2021-01-01
Abstract
With the fast growth of malware's volume circulating in the wild, to obtain a timely and correct classification is increasingly difficult. Traditional approaches to automatic classification suffer from some limitations. The first one concerns the feature extraction: static approaches are hindered by code obfuscation techniques, while dynamic approaches are time consuming and evasion techniques often impede the correct execution of the code. The second limitation regards the building of the prediction models: the adequateness of a training dataset may degrade over time or can not be sufficient for some malware families or instances. With this paper we investigate the effectiveness of a new approach that uses malware visualization, for overcoming the problems related to the features selection and extraction, along with deep learning classification, whose performances are less sensitive to a small dataset than machine learning. The experiments carried out on twelve different neural network architectures and with a dataset of 20,199 malware, demonstrate that the proposed approach is successful as produced an F-measure of 99.97%.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.