
Directional Semantic Drift Across Grokipedia Versions: A Diffusion-Manifold Analysis

Batzdorfer, Veronika 1; Tölkes, Oliver
1 Institut für Technikzukünfte (ITZ), Karlsruher Institut für Technologie (KIT)

Abstract:

Large language models increasingly mediate access to factual knowledge, yet little is known about how their representations diverge from established encyclopedic sources. We quantify semantic drift between Wikipedia and Grokipedia, a large language model-generated encyclopedia explicitly designed to counter perceived Wikipedia biases, as well as across two Grokipedia versions released before and after a platform update (v0.1, v0.2). Focusing on articles about cities in the United States and Germany, we embed 1,387 matched entries, project them onto a shared diffusion manifold, and operationalize semantic drift as Euclidean displacement in latent space. Both cross-sectional and longitudinal analyses reveal systematic structure in this drift: larger cities and articles containing politically salient framing exhibit significantly greater divergence. Regression models further show that changes in article length and political content predict drift even after controlling for population size, country, and time. Articles in Grokipedia v0.1 that were explicitly attributed as adapted from Wikipedia lost this attribution in v0.2 and exhibited a pronounced and statistically significant semantic shift, indicating that these articles were newly generated. ...
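The drift measure described in the abstract (joint embedding on a shared diffusion manifold, then Euclidean displacement between matched entries) can be illustrated with a minimal sketch. The abstract does not specify the embedding model, kernel, bandwidth, or number of manifold dimensions, so everything below is an assumption for illustration: the function names `diffusion_map` and `semantic_drift` are hypothetical, the kernel is a plain Gaussian with a median-distance bandwidth, and two components are kept. It is not the authors' implementation.

```python
import numpy as np

def diffusion_map(X, n_components=2, epsilon=None, t=1):
    """Basic diffusion map of the rows of X (illustrative assumptions throughout)."""
    # Pairwise squared Euclidean distances between all rows.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if epsilon is None:
        # Assumed heuristic: median of the non-zero squared distances.
        epsilon = np.median(d2[d2 > 0])
    K = np.exp(-d2 / epsilon)              # Gaussian affinity matrix
    deg = K.sum(axis=1)                    # node degrees
    # Symmetric conjugate of the random-walk matrix P = D^-1 K.
    S = K / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]         # sort eigenpairs, largest first
    vals, vecs = vals[order], vecs[:, order]
    # Right eigenvectors of P are D^-1/2 times the eigenvectors of S.
    psi = vecs / np.sqrt(deg)[:, None]
    # Drop the trivial constant eigenvector; scale by eigenvalue^t.
    keep = slice(1, n_components + 1)
    return psi[:, keep] * (vals[keep] ** t)

def semantic_drift(emb_a, emb_b, n_components=2):
    """Per-article Euclidean displacement on a manifold fitted to both sources jointly."""
    n = emb_a.shape[0]
    # Stacking both sources gives every article coordinates in one shared space.
    coords = diffusion_map(np.vstack([emb_a, emb_b]), n_components)
    return np.linalg.norm(coords[:n] - coords[n:], axis=1)
```

Row `i` of `emb_a` and row `i` of `emb_b` are assumed to be the embeddings of the same article in the two sources (for example Wikipedia and Grokipedia, or Grokipedia v0.1 and v0.2). Fitting the manifold on the stacked matrix, rather than on each source separately, is what makes the displacements comparable: separately fitted diffusion maps would have arbitrary signs and scales per source.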


Publisher's version
DOI: 10.5445/IR/1000194450
Published on 22.06.2026
Affiliated KIT institution(s): Institut für Technikzukünfte (ITZ)
Publication type: Proceedings contribution
Publication date: 26.05.2026
Language: English
Identifier: ISBN 979-8-4007-2504-3
KITopen-ID: 1000194450
Published in: Proceedings of the 18th ACM Web Science Conference 2026
Event: 18th ACM Web Science Conference (Websci 2026), Braunschweig, Germany, 26.05.2026 – 29.05.2026
Publisher: Association for Computing Machinery (ACM)
Pages: 326 - 331
Published online in advance on 25.05.2026
Keywords: Wikipedia, Grokipedia, v0.2, v0.1, Generative AI, large language models, information retrieval, diffusion manifold, semantic drift, temporal analyses
Indexed in: Scopus, OpenAlex