
Improved Fast Similarity Search in Dictionaries

Karch, Daniel; Luxen, Dennis; Sanders, Peter

Abstract:

We engineer an algorithm to solve the approximate dictionary matching problem.
Given a list of words W, a maximum distance d fixed at preprocessing time, and a query word q, we would like to retrieve all words from W that can be transformed into q with d or fewer edit operations.
We present data structures that support fault-tolerant queries by generating an index.
On top of that, we present a generalization of the method that significantly reduces memory consumption and preprocessing time.
At the same time, running times of queries are virtually unaffected.
We are able to match in lists of hundreds of thousands of words and beyond within microseconds for reasonable distances.
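
To make the problem statement concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not describe the internals of the index, so the sketch assumes one well-known approach: index every word under all strings obtained by deleting at most d characters, look up the same deletion variants of the query, and verify each candidate with an exact edit-distance computation. All function names (`edit_distance`, `brute_force`, `deletion_neighborhood`, `build_index`, `query`) are hypothetical, and "edit operations" is assumed to mean insertions, deletions, and substitutions.

```python
from collections import defaultdict
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def brute_force(words, q, d):
    """Reference answer: scan the whole list W (no index)."""
    return sorted(w for w in set(words) if edit_distance(w, q) <= d)


def deletion_neighborhood(word: str, d: int) -> set:
    """All strings obtained by deleting at most d characters from word."""
    n = len(word)
    out = set()
    for k in range(min(d, n) + 1):
        # Choose which n - k positions to keep; order is preserved.
        for keep in combinations(range(n), n - k):
            out.add("".join(word[i] for i in keep))
    return out


def build_index(words, d):
    """Assumed index layout: deletion variant -> set of words producing it."""
    index = defaultdict(set)
    for w in set(words):
        for variant in deletion_neighborhood(w, d):
            index[variant].add(w)
    return index


def query(index, q, d):
    """Collect candidates sharing a deletion variant with q, then verify."""
    candidates = set()
    for variant in deletion_neighborhood(q, d):
        candidates |= index.get(variant, set())
    # A shared variant only guarantees distance <= 2d, so filter exactly.
    return sorted(w for w in candidates if edit_distance(w, q) <= d)


words = ["sanders", "sander", "slander", "dictionary", "search"]
index = build_index(words, d=2)   # d is fixed at preprocessing time
matches = query(index, "sandres", d=2)
```

In this sketch, d is baked into the index at build time, which mirrors the "fixed at preprocessing time" condition in the abstract. The number of deletion variants grows quickly with d and with word length, so an index of this kind trades memory and preprocessing time for query speed.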

Affiliated institution(s) at KIT: Institut für Theoretische Informatik (ITI)
Publication type: Book
Publication year: 2010
Language: English
Identifier: urn:nbn:de:swb:90-203781
KITopen-ID: 1000020378

DOI: 10.5445/IR/1000020378