Topic Detection and Classification in Consumer Web Communication Data

Nakayama, Atsuho

In this study, we examined temporal variation in topics regarding new products by classifying words into clusters based on their co-occurrence in Twitter entries. Analysis of consumer tweet data has received much attention as a way to identify market trends. We collected Twitter entries about new products based on their specific expressions of sentiment or interest. The matrix obtained from the Twitter entries is sparse and high-dimensional, so dimensionality reduction is required; we applied non-negative matrix factorization for this purpose. We also clarified temporal variation by using the weight coefficients, which show the strength of association between entries and topics. When detecting trending topics by classifying words into clusters based on co-occurrence, it is important to consider how these topics vary over time.
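
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the toy entries, the period labels, the choice of two topics, and the Lee-Seung multiplicative update rule are all assumptions made for illustration, and the abstract does not state which NMF variant, preprocessing, or rank the study used. The sketch builds a word-by-entry count matrix, factorizes it into non-negative word-topic and topic-entry matrices, and sums each topic's entry weights per period to show temporal variation.

```python
import random

# Hypothetical toy entries as (period, text); not data from the study.
ENTRIES = [
    (0, "new phone camera great"),
    (0, "phone camera sharp photos"),
    (1, "new drink taste sweet"),
    (1, "drink taste too sweet"),
    (1, "phone battery camera"),
]

def build_matrix(entries):
    """Return (vocab, V) where V[i][j] counts word i in entry j."""
    vocab = sorted({w for _, text in entries for w in text.split()})
    index = {w: i for i, w in enumerate(vocab)}
    V = [[0.0] * len(entries) for _ in vocab]
    for j, (_, text) in enumerate(entries):
        for w in text.split():
            V[index[w]][j] += 1.0
    return vocab, V

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate V (words x entries) by W (words x k) times H (k x entries).

    Uses multiplicative updates for the Frobenius-norm objective (an assumed
    variant); W and H stay non-negative because every factor is non-negative.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def frob_error(V, W, H):
    """Frobenius norm of the reconstruction residual V - W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH)
               for v, x in zip(rv, rx)) ** 0.5

def topic_weight_by_period(entries, H):
    """Sum each topic's entry weights (columns of H) within each period."""
    totals = {}
    for j, (period, _) in enumerate(entries):
        row = totals.setdefault(period, [0.0] * len(H))
        for a in range(len(H)):
            row[a] += H[a][j]
    return totals

vocab, V = build_matrix(ENTRIES)
W, H = nmf(V, k=2)
trend = topic_weight_by_period(ENTRIES, H)
```

Here each column of `W` ranks words for one topic, and `trend[period]` gives the summed topic weights for that period; comparing these sums across periods is one simple way to read off how topic strength changes over time.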


Publisher's version
DOI: 10.5445/KSP/1000087327/11
Published on 7 February 2020
Affiliated institution(s) at KIT: Institut für Informationswirtschaft und Marketing (IISM)
Publication type: Journal article
Publication year: 2018
Language: English
Identifiers: ISSN 2363-9881, KITopen-ID 1000105698
Published in: Archives of Data Science, Series A (Online First)
Volume: 5
Issue: 1
Pages: A11, 18 pp., online