Protecting data and applications from malware and other forms of malicious code has assumed a great relevance in the current era of pervasive web-based applications. Attackers often use code obfuscation to hide harmful programs from automatic detection. Several researchers have proposed methods to classify an unknown program as malicious or benign; however, little work has been done to identify obfuscated code. A promising approach to detect obfuscated code consists of using a set of metrics, collected by static analysis, to classify a program. In this paper we present an empirical evaluation of three text-based metrics to identify obfuscated code. Our experiment shows that the effectiveness of these metrics depends on the obfuscators: there are cases in which the metrics allow the proliferation of false positives (i.e., misclassification of clear code as obfuscated code), which is bothering but not dangerous, and cases where false negatives (i.e. misclassification of obfuscated as clear code) proliferate, which is definitely more dangerous. Based on our experiment, we propose a combination of these three metrics and show how this combination outperforms the individual metrics.
|Titolo:||An empirical study of metric-based methods to detect obfuscated code|
|Data di pubblicazione:||2013|
|Appare nelle tipologie:||1.1 Articolo in rivista|