Data-aware process discovery for malware detection: an empirical study
Bernardi M. L.
2023-01-01
Abstract
Mobile devices have become essential to daily life. Mobile applications enrich the computing experience and make it easier to access and exchange data. However, mobile devices are also the target of many malware attacks, and new malware is usually obtained by evolving existing malicious code. This makes it possible for researchers and practitioners to recognize malware applications by their similarity to known infected applications. This study uses a multi-perspective declarative language to model the behavior of infected and trusted applications, discovering the models from their system call traces. The resulting models are used to detect malware applications and to evaluate whether they belong to a known malware family. The approach has been evaluated on a dataset obtained by capturing system call traces from more than 160K trusted and infected applications, the latter gathered from 27 known malware families. The empirical study shows that the approach performs well both in identifying infected applications and in assigning them to a specific malware family. In addition, the approach exhibits a high level of robustness to code transformations and major evasion techniques.