Towards Robust Plagiarism Detection in Programming Education: Introducing Tolerant Token Matching Techniques to Counter Novel Obfuscation Methods

Maisch, Robin; Hagel, Nathan; Bartel, Alexander

doi:10.1145/3723010.3723019

Maisch, Robin¹; Hagel, Nathan¹; Bartel, Alexander¹

¹ Institut für Informationssicherheit und Verlässlichkeit (KASTEL), Karlsruher Institut für Technologie (KIT)

Abstract (English):

With the rise of AI-generated code, programming courses face
new challenges in detecting code plagiarism. Traditional methods
struggle against obfuscation techniques that modify code structure
through statement insertion and deletion. To address this, we propose
a novel approach based on tolerant token matching designed
to enhance resilience against such attacks. We evaluate our method
through three experiments on a real-life dataset with AI-obfuscated
plagiarisms. The results show that our approach increased the median
similarity gap between originals and plagiarisms by 1 to 6
percentage points.

Affiliated institution(s) at KIT	Institut für Informationssicherheit und Verlässlichkeit (KASTEL)
Publication type	Proceedings paper
Publication date	2 June 2025
Language	English
Identifier	ISBN: 979-8-4007-1282-1; KITopen-ID: 1000180637
Published in	ECSEE '25: Proceedings of the 6th European Conference on Software Engineering Education, 2nd – 4th June 2025, Seeon, Germany
Event	6th European Conference on Software Engineering Education (2025), Seeon, 2 June 2025 – 4 June 2025
Publisher	Association for Computing Machinery (ACM)
External relations	Supplement
Keywords	Software Plagiarism Detection, Source Code Plagiarism Detection, Plagiarism Obfuscation, Obfuscation Attacks, Code Normalization, Tokenization, Computer Science Education

KITopen download

DOI: 10.5445/IR/1000180637

Freely accessible from 3 June 2025

External links

Original publication
DOI: 10.1145/3723010.3723019
