Towards Robust Plagiarism Detection in Programming Education: Introducing Tolerant Token Matching Techniques to Counter Novel Obfuscation Methods

Maisch, Robin 1; Hagel, Nathan 1; Bartel, Alexander
1 Institut für Informationssicherheit und Verlässlichkeit (KASTEL), Karlsruher Institut für Technologie (KIT)

Abstract (English):

With the rise of AI-generated code, programming courses face
new challenges in detecting code plagiarism. Traditional methods
struggle against obfuscation techniques that modify code structure
through statement insertion and deletion. To address this, we propose
a novel approach based on tolerant token matching designed
to enhance resilience against such attacks. We evaluate our method
through three experiments on a real-life dataset with AI-obfuscated
plagiarisms. The results show that our approach increased the median
similarity gap between originals and plagiarisms by 1 to 6
percentage points.
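
To illustrate the general idea of tolerating inserted statements while matching token sequences, the following is a minimal sketch and not the authors' implementation. The function names, the token names, and the `tolerance` parameter are hypothetical, and the sketch only handles tokens inserted into the suspect sequence, not deletions.

```python
def tolerant_run(original, suspect, i, j, tolerance):
    """Count matching tokens from original[i] and suspect[j] onward,
    skipping at most `tolerance` non-matching tokens in `suspect`."""
    matched = 0
    skips = 0
    while i < len(original) and j < len(suspect):
        if original[i] == suspect[j]:
            matched += 1
            i += 1
            j += 1
        elif skips < tolerance:
            # Assumption: treat suspect[j] as an inserted token and skip it.
            skips += 1
            j += 1
        else:
            break
    return matched


def best_run(original, suspect, tolerance=0):
    """Longest (tolerant) matching run over all start positions."""
    return max(
        (tolerant_run(original, suspect, i, j, tolerance)
         for i in range(len(original))
         for j in range(len(suspect))),
        default=0,
    )


# Hypothetical token streams: the suspect inserts one extra "VAR" statement.
original = ["VAR", "ASSIGN", "LOOP", "CALL", "RETURN"]
suspect = ["VAR", "ASSIGN", "VAR", "LOOP", "CALL", "RETURN"]

strict = best_run(original, suspect, tolerance=0)    # 3
tolerant = best_run(original, suspect, tolerance=1)  # 5
```

With strict matching, the inserted token splits the match into two shorter runs. Allowing one skipped token recovers the full five-token sequence, which is the kind of effect a tolerant matcher aims for.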

Affiliated institution(s) at KIT Institut für Informationssicherheit und Verlässlichkeit (KASTEL)
Publication type Proceedings contribution
Publication date 2 June 2025
Language English
Identifier ISBN: 979-8-4007-1282-1
KITopen-ID: 1000180637
Published in ECSEE '25: Proceedings of the 6th European Conference on Software Engineering Education, 2nd – 4th June 2025, Seeon, Germany
Event 6th European Conference on Software Engineering Education (2025), Seeon, 2 June 2025 – 4 June 2025
Publisher Association for Computing Machinery (ACM)
External relations Supplement
Keywords Software Plagiarism Detection, Source Code Plagiarism Detection, Plagiarism Obfuscation, Obfuscation Attacks, Code Normalization, Tokenization, Computer Science Education

Postprint
DOI: 10.5445/IR/1000180637
Freely accessible from 3 June 2025
Original publication
DOI: 10.1145/3723010.3723019