Transferability of machine learning models learned from public intrusion detection datasets: the CICIDS2017 case study
Marta Catillo; Antonio Pecchia; Umberto Villano
2022-01-01
Abstract
Intrusion detection is a primary concern in any modern computer system due to the ever-growing number of intrusions. Machine learning represents an effective solution to detect and prevent network intrusions. Many existing intrusion detection approaches capitalize on machine learning models trained on individual public datasets and achieve detection accuracy close to 1. These high-performing detectors depend strongly on the training data, which may not be representative of real-life production environments. This paper explores this proposition in the context of denial of service attacks. Different intrusion detectors trained on CICIDS2017 (an established public dataset widely used as a benchmark) are tested against an unseen, although closely related, dataset. The test dataset is based on the same mixture of denial of service attacks found in CICIDS2017, plus some additional variants. The results indicate that the perfect detection figures obtained on a public dataset may not transfer in practice.