Extended Abstract
How can the impact of events on public opinion be quantified? A key challenge in many liberal democracies is increasing polarization, as different social groups respond divergently to global events. Especially in times of polycrisis, vulnerable groups are often disproportionately affected. There is therefore a growing need for robust tools that can analyze how polarization and fragmentation between groups arise and how different crises affect specific parts of society. In this study, we introduce a novel tool to the computational social science community and explore its potential to advance research on societal polarization and democratic resilience. A straightforward approach to relating public opinion to specific crises is to augment longitudinal survey data on sentiments with recorded media coverage of those crises. While survey data enable the tracking of changes in opinion dynamics over time, media reports provide insights into the evolving nature of crises at specific periods. However, existing approaches often face methodological challenges in integrating tabular survey data with unstructured news text. In this work, we present a novel approach that combines natural language processing (NLP) and longitudinal survey data through a multimodal neural network architecture. It provides post-hoc explanations using a new multimodal Explainable Artificial Intelligence (XAI) architecture. Our model predicts and explains how specific news items influence the opinions of survey participants. This makes it possible to analyze how real or hypothetical news content affects sentiments and opinions for individual, group-level, or society-level samples.
To evaluate the model's performance, we apply it to predict public opinion on whether Germany should provide military support to Ukraine during the Russo-Ukrainian war – a prominently featured topic in German and global public discourse throughout the survey period. To assess the model's ability to explain the impact of events on public opinion, we group individuals into k personas based on cluster means, select a set of news headlines, and then examine how opinion predictions change depending on the persona and the given news context. Machine learning (ML) is increasingly used to model public opinion from digital content, particularly social media and web-tracking data. However, these approaches often suffer from biased data, limited generalizability (Wicaksono et al., 2016; Isotalo et al., 2016; Lopez et al., 2017), or a lack of interpretability (Chu et al., 2023; Kirkizh et al., 2024; Bach et al., 2022; Albanese et al., 2020). NLP techniques provide new ways to incorporate text data, but their black-box nature poses additional challenges for explaining model behaviour. The first component of the multimodal dataset used in this work is a survey dataset. It stems from a longitudinal survey conducted biweekly in Germany from November 2022 until July 2024. It includes items on various topics, e.g., political partisanship, fears and wellbeing, media consumption, trust, and demographics. The second component is a collection of the most-viewed German news reports about the war for each wave of the survey. They are retrieved from EventRegistry, a news intelligence platform that provides data on global news and events (Leban et al., 2014). These two sources, biweekly survey data and news texts, make up our dataset for multimodal modelling.
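To illustrate the retrieval step, a minimal sketch using EventRegistry's Python SDK is given below; the API key, the example wave window, and the ranking criterion (relevance) are placeholder assumptions, as the abstract does not specify the exact query parameters.

```python
from eventregistry import EventRegistry, QueryArticlesIter, QueryItems

# Hypothetical credentials and survey-wave window; the study's actual
# query parameters are not specified in this abstract.
er = EventRegistry(apiKey="YOUR_API_KEY")

query = QueryArticlesIter(
    conceptUri=QueryItems.AND([
        er.getConceptUri("Russia"),
        er.getConceptUri("Ukraine"),
    ]),
    lang="deu",              # German-language coverage
    dateStart="2022-11-01",  # example biweekly wave window
    dateEnd="2022-11-14",
)

# Collect the highest-ranked articles for this wave
# (sortBy="rel" is an assumption; other rankings are available).
headlines = [
    article["title"]
    for article in query.execQuery(er, sortBy="rel", maxItems=20)
]
```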
A specialized architecture is designed for a feed-forward neural network by incorporating two distinct input branches: one branch processes the tabular (survey) data, while the other handles the raw text (news item) data directly instead of precomputed embeddings. Subsequently, we apply model-agnostic post-hoc approaches to trace the model's behavior back to its multimodal input features (survey data and embeddings). These are suitable for both global and local levels of explanation. Since no standardized approach exists for multimodal explanations, we adapt SHapley Additive exPlanations (SHAP) to the given multimodality. SHAP is a framework for explaining ML model predictions introduced by Lundberg and Lee (2017). SHAP can also be used for text-based models to analyse how individual words influence predictions. Using pre-trained embeddings or transformer-based models, the contribution of each token is determined by modifying text segments and observing the resulting changes in model output. Aggregating SHAP values across multiple predictions provides a global explanation of the model's behaviour. However, text-based explainers have limitations when applied to certain ML models and text preprocessing techniques. For instance, SHAP's specialized text explainers are primarily designed for deep learning architectures, because the text masker operates on raw text input by modifying individual tokens to analyze their impact on the model's predictions. This approach relies on the model's capability to process raw text and transform it into embeddings.
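A minimal sketch of such a two-branch network, assuming a Keras implementation with a simple learned-embedding text branch and a dense fusion head; the actual layer sizes, text encoder, and fusion strategy are not specified in this abstract.

```python
import tensorflow as tf
from tensorflow.keras import layers

N_TABULAR = 40                  # placeholder: number of survey features
VOCAB, SEQ_LEN = 20_000, 64     # placeholder vocabulary and sequence length

# Text branch: raw strings are vectorized and embedded inside the model,
# so post-hoc explainers can perturb tokens end to end.
text_in = tf.keras.Input(shape=(1,), dtype=tf.string, name="news_text")
vectorize = layers.TextVectorization(max_tokens=VOCAB, output_sequence_length=SEQ_LEN)
vectorize.adapt(["Beispiel-Schlagzeile"])   # in practice: fit on the news corpus
x_text = layers.Embedding(VOCAB, 64)(vectorize(text_in))
x_text = layers.GlobalAveragePooling1D()(x_text)

# Tabular branch for the survey responses
tab_in = tf.keras.Input(shape=(N_TABULAR,), name="survey_features")
x_tab = layers.Dense(64, activation="relu")(tab_in)

# Late fusion and a regression head for the 7-point Likert response
x = layers.Concatenate()([x_text, x_tab])
x = layers.Dense(64, activation="relu")(x)
likert_out = layers.Dense(1, name="likert")(x)

model = tf.keras.Model(inputs=[text_in, tab_in], outputs=likert_out)
model.compile(optimizer="adam", loss="mse")
```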
Therefore, for the present work, a custom approach is introduced: one input branch is held at a fixed default value while the other varies, enabling an analysis of its isolated influence on the model's predictions. To explain tabular input contributions, the multimodal model's text input is set to a neutral placeholder string, forcing the explanations to rely only on the tabular data. For text input explanations, the tabular branch is set to the mean value of each tabular feature. The XAI model then processes only the textual input, allowing for an isolated interpretation of how specific words or phrases contribute to the model's predictions. Analogously, this enables an analysis of the text's influence on predictions for a generic persona or a real survey participant. To assess our multimodal model's predictions, we train it on the individual participants' sentiment towards German military support to Ukraine, the topic we chose to assess our tool. One item in the survey operationalizes this sentiment (agreement with the statement "Germany should provide military support to Ukraine as long as the war continues."). For the biweekly news items, we query EventRegistry for the highest-ranked articles on the concepts Russia and Ukraine. K-means clustering on various features from the survey dataset yields three clusters, which we call Trusting Democrats, Skeptical Moderates, and Agitated Extremes on account of some of their characteristics. Table 1 shows how their predicted opinions on military support to Ukraine change when selected headlines are presented to the model. Furthermore, SHAP explanations are shown for the survey data (Figure 1) and an example headline (Figure 2). This makes it possible to interpret which types of news evoke which opinion shifts in particular groups of survey participants: for example, sympathy with the party "Die Grünen" has a strong influence, as does the wording "Ukraine war" (German: "Ukraine-Krieg") for the Agitated Extremes cluster.
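The branch-isolation procedure described above can be sketched as follows, continuing the hypothetical Keras model from the previous listing; NEUTRAL_TEXT, the feature means, and the background samples are illustrative stand-ins rather than the study's actual choices.

```python
import numpy as np
import shap

NEUTRAL_TEXT = ""                    # neutral placeholder for the text branch
feature_means = np.zeros(40)         # stand-in: per-feature survey means (or a cluster centroid)
background_tab = np.zeros((20, 40))  # stand-in: SHAP background sample
X_tab_sample = np.zeros((5, 40))     # stand-in: respondents to explain

def predict_tabular_only(X_tab):
    """Predict with the text branch held at the neutral placeholder."""
    texts = np.full((len(X_tab), 1), NEUTRAL_TEXT, dtype=object)
    return model.predict([texts, np.asarray(X_tab)], verbose=0).ravel()

def predict_text_only(texts):
    """Predict with the tabular branch fixed at the chosen default values."""
    texts = np.asarray(texts, dtype=object).reshape(-1, 1)
    X_tab = np.tile(feature_means, (len(texts), 1))
    return model.predict([texts, X_tab], verbose=0).ravel()

# Tabular contributions via the model-agnostic KernelExplainer
tab_explainer = shap.KernelExplainer(predict_tabular_only, background_tab)
tab_shap = tab_explainer.shap_values(X_tab_sample)

# Token contributions via SHAP's text masker on raw headlines
text_explainer = shap.Explainer(predict_text_only, shap.maskers.Text())
text_shap = text_explainer(["Beispiel-Schlagzeile zum Ukraine-Krieg"])
```

Explaining a specific persona or a real participant amounts to fixing the tabular branch at the corresponding cluster centroid or at that participant's survey responses instead of the global feature means.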
This work demonstrates the potential of multimodal learning for public opinion prediction while taking up current challenges in interpretability. The proposed approach lays a foundation for future research on interpretable multimodal ML in quantitative social science and presents a computational framework for the study of crisis-event and other news coverage. On the one hand, we would like to discuss with the research community how multimodal models like ours can improve and enable research designs; on the other hand, how novel methods in XAI can capture the role of unstructured text even better.
References
Albanese, F., Pinto, S., Semeshenko, V., and Balenzuela, P. (2020). Analyzing mass media
influence using natural language processing and time series analysis. Journal of Physics:
Complexity, 1(2):025005.
Bach, R. L., Kern, C., Bonnay, D., and Kalaora, L. (2022). Understanding political news media
consumption with digital trace data and natural language processing. Journal of the Royal
Statistical Society Series A: Statistics in Society, 185(Supplement 2):S246–S269.
Chu, E., Andreas, J., Ansolabehere, S., and Roy, D. (2023). Language models trained on media
diets can predict public opinion.
Isotalo, V., Saari, P., Paasivaara, M., Steineker, A., and Gloor, P. (2016). Predicting 2016 US presidential election polls with online and media variables, pages 45–53.
Kirkizh, N., Ulloa, R., Stier, S., and Pfeffer, J. (2024). Predicting political attitudes from web tracking data: A machine learning approach. Journal of Information Technology & Politics, 21(4):564–577.
Leban, G., Fortuna, B., Brank, J., and Grobelnik, M. (2014). Event registry: learning about world events from news. In Proceedings of the 23rd International Conference on World Wide Web, WWW ’14 Companion, pages 107–110, New York, NY, USA. Association for Computing Machinery.
Lopez, J. A. D., Collignon-Delmar, S., Benoit, K., and Matsuo, A. (2017). Predicting the Brexit vote by tracking and classifying public opinion using Twitter data. Statistics, Politics and Policy, 8(1):85–104.
Lundberg, S. M. and Lee, S.-I. (2017). A unified approach to interpreting model predictions.
In Advances in Neural Information Processing Systems (NeurIPS), volume 30, pages 4765–
4774.
Wicaksono, A. J., Suyoto, and Pranowo (2016). A proposed method for predicting US presidential election by analyzing sentiment in social media. In 2016 2nd International Conference on Science in Information Technology (ICSITech), pages 276–280.
Table 1: Predicted Likert values for cluster personas across different news headlines. Low values correspond to high agreement with military support to Ukraine and vice versa, on a 7-point Likert scale from Completely agree (1) to Completely disagree (7).