
Detecting Automatic Software Plagiarism via Token Sequence Normalization

Sağlam, Timur 1; Brödel, Moritz 2; Schmid, Larissa 1; Hahner, Sebastian 1
1 Institut für Informationssicherheit und Verlässlichkeit (KASTEL), Karlsruher Institut für Technologie (KIT)
2 Karlsruher Institut für Technologie (KIT)

Abstract (English):

While software plagiarism detectors have been used for decades, the assumption that evading detection requires programming proficiency is challenged by the emergence of automated plagiarism generators. These generators enable effortless obfuscation attacks, exploiting vulnerabilities in existing detectors by inserting statements to disrupt the matching of related programs. Thus, we present a novel, language-independent defense mechanism that leverages program dependence graphs, rendering such attacks infeasible. We evaluate our approach with multiple real-world datasets and show that it defeats plagiarism generators by offering resilience against automated obfuscation while maintaining a low rate of false positives.


Postprint
DOI: 10.5445/IR/1000167588
Published on 26.01.2024
Original publication
DOI: 10.1145/3597503.3639192
Citations (Dimensions): 3
Affiliated institution(s) at KIT: Institut für Informationssicherheit und Verlässlichkeit (KASTEL)
Publication type: Proceedings paper
Publication year: 2024
Language: English
Identifier: ISBN: 979-8-4007-0217-4
KITopen-ID: 1000167588
HGF program: 46.23.03 (POF IV, LK 01) Engineering Security for Mobility Systems
Further HGF programs: 46.23.01 (POF IV, LK 01) Methods for Engineering Secure Systems
Published in: Proceedings of the 46th International Conference on Software Engineering, ICSE ’24, Lisbon, April 14-20
Event: 46th International Conference on Software Engineering (ICSE 2024), Lisbon, Portugal, 14.04.2024 – 20.04.2024
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication note: in press
Keywords: Software Plagiarism Detection, Plagiarism Obfuscation, Obfuscation Attacks, Code Normalization, Program Dependence Graph, Tokenization
Indexed in: Dimensions