
How well do DeepSeek, ChatGPT, and Gemini respond to water science questions?

Hosseini, Seyed Hossein ; Pourzangbar, Ali 1
1 Institute for Water and River Basin Management (IWG), Karlsruhe Institute of Technology (KIT)

Abstract:

This study evaluates the performance of three prominent large language models (LLMs), DeepSeek R1, ChatGPT-4o, and Gemini 2, in addressing key questions within four core fields of hydrology and water science: machine learning and optimization, remote sensing, flood modeling, and sediment transport. The LLMs' responses are systematically compared with benchmark responses derived from review articles in the respective fields. To assess the LLMs' performance, the study introduces a novel evaluation rubric with four criteria: relevancy, accuracy, authenticity, and novelty. The findings show that each model can address the core aspects of the benchmark questions. DeepSeek R1 achieved the highest overall scores in machine learning and optimization, flood modeling, and sediment transport, while ChatGPT-4o performed best in remote sensing. Notably, DeepSeek R1 and Gemini 2 exhibited the lowest response similarity in 95% of the evaluated questions, whereas ChatGPT-4o and Gemini 2 showed the highest similarity in 70% of cases.


Publisher's version
DOI: 10.5445/IR/1000192895
Published on: 4 May 2026
Original publication
DOI: 10.1016/j.envsoft.2025.106772
Scopus citations: 3
Affiliated institution(s) at KIT: Institute for Water and River Basin Management (IWG)
Publication type: Journal article
Publication month/year: 01/2026
Language: English
Identifiers: ISSN 1364-8152, KITopen-ID 1000192895
Published in: Environmental Modelling & Software
Publisher: Elsevier
Volume: 196
Pages: 106772
Indexed in: Scopus, OpenAlex