Detecting the Usage of Large Language Models Exploiting Generative Adversarial Networks
Aversano L.; Bernardi M. L.
2024-01-01
Abstract
The adoption of Large Language Models (LLMs) in education has increased strongly in recent years. A wide range of possible applications shows the great opportunities that LLMs offer for learning and teaching tasks. However, LLMs also introduce the risk that students cheat by using existing tools to generate academic content, making it extremely difficult for teachers to evaluate their performance. This has driven great interest among researchers and developers in new approaches for distinguishing generated content from human-written content. However, existing approaches cannot adapt to the rapid improvement and evolution of content generators, which have become ever more effective at imitating human writing. Starting from these considerations, this paper proposes a new approach that can adapt to continuous changes in the generator market thanks to the adoption of generative adversarial networks (GANs). The proposed approach includes a generator that, starting from human-written content, produces new generated content through a continuous retraining process. The approach is evaluated on a dataset of 150k human-written and LLM-generated texts, built from a freely available dataset. The empirical validation shows that the proposed approach discriminates the two kinds of content well, achieving an accuracy of 0.86.
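The abstract gives no implementation details, so the following is only a minimal, purely illustrative sketch of the adversarial-retraining idea it describes: a discriminator is repeatedly retrained against freshly produced "generated" text. All names are assumptions, the toy generator (which merely inserts telltale filler words) stands in for an LLM, and the logistic-regression discriminator stands in for the paper's GAN discriminator — none of this is the authors' actual method.

```python
import math
import random
import re
from collections import Counter

# Toy "machine style" markers; a stand-in for an LLM's statistical fingerprint.
FILLERS = ["furthermore", "moreover", "consequently"]

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def build_vocab(texts):
    return sorted({w for t in texts for w in tokenize(t)})

def features(text, vocab):
    # Bag-of-words counts over a fixed vocabulary.
    counts = Counter(tokenize(text))
    return [float(counts[w]) for w in vocab]

class Discriminator:
    """Logistic regression: label 1 = human-written, 0 = generated."""
    def __init__(self, dim, lr=0.5):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def step(self, x, y):
        g = self.predict(x) - y  # gradient of the log-loss w.r.t. z
        self.w = [wi - self.lr * g * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * g

def toy_generator(text, rng):
    # Stand-in for an LLM: rewrites human text, leaving a detectable trace.
    words = text.split()
    words.insert(rng.randrange(len(words) + 1), rng.choice(FILLERS))
    return " ".join(words)

def adversarial_rounds(human_texts, rounds=5, epochs=30, seed=0):
    # Each round, the generator produces fresh fakes from the human corpus
    # and the discriminator is retrained on the updated human-vs-fake mix,
    # mirroring the continuous-retraining loop the abstract describes.
    rng = random.Random(seed)
    vocab = build_vocab(human_texts + FILLERS)
    disc = Discriminator(len(vocab))
    for _ in range(rounds):
        fakes = [toy_generator(t, rng) for t in human_texts]
        data = [(features(t, vocab), 1.0) for t in human_texts] + \
               [(features(t, vocab), 0.0) for t in fakes]
        for _ in range(epochs):
            rng.shuffle(data)
            for x, y in data:
                disc.step(x, y)
    return disc, vocab
```

In this sketch the discriminator ends up assigning scores above 0.5 to human sentences and below 0.5 to generator output; in the paper's setting, the generator is an actual text generator and the discriminator a trained neural network, but the retraining loop plays the same role.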