
DaCToR: A data collection tool for the RELATER project

Hussain, J.; Zenkri, O.; Stüker, S.; Waibel, A.

Abstract:
Collecting domain-specific data for under-resourced languages, e.g., dialects of languages, can be very expensive, potentially financially prohibitive, and time-consuming. Moreover, in the case of rarely written languages, the normalization of non-canonical transcription might be another time-consuming but necessary task. To collect domain-specific data in such circumstances in a time- and cost-efficient way, collecting read data of pre-prepared texts is often a viable option. To collect data in the domain of psychiatric diagnosis in Arabic dialects for the project RELATER, we have prepared the data collection tool DaCToR for collecting read texts by speakers in the respective countries and districts in which the dialects are spoken. In this paper we describe our tool, its purpose within the project RELATER, and the dialects which we have started to collect with the tool.



Publisher's version
DOI: 10.5445/IR/1000127261
Published on 8 December 2020
Affiliated institution(s) at KIT: Institut für Anthropomatik und Robotik (IAR)
Publication type: Proceedings contribution
Publication year: 2020
Language: English
Identifier: ISBN 979-1-09-554634-4
KITopen-ID: 1000127261
Published in: Proceedings of the 12th Language Resources and Evaluation Conference. Ed.: N. Calzolari
Publisher: European Language Resources Association (ELRA)
Pages: 6627-6632
Note on the publication: The event "12th International Conference on Language Resources and Evaluation, LREC 2020, 11-16 May 2020, Marseille, France" was cancelled due to the coronavirus pandemic.
External relations: Abstract/Full text
Indexed in: Scopus