Analysis methodology of pro-Kremlin disinformation in internet news articles


Bibliographic details
Date: 2026
Main authors: Terentiev, Oleksandr, Prosyankina-Zharova, Tetyana, Abroskin, Yurii, Duda, Volodymyr
Format: Article
Language: Ukrainian
Published: Kyiv National University of Construction and Architecture 2026
Online access: https://es-journal.in.ua/article/view/364965
Journal title: Environmental safety and natural resources

Institution

Environmental safety and natural resources
Description
Summary: The article proposes and implements a methodology for analyzing pro-Kremlin disinformation in Internet sources that integrates automated data collection, natural language processing, topic modeling, and statistical analysis. The study used an open multilingual dataset of 18,249 links to web articles in 42 languages, developed within the European anti-disinformation initiatives VERA.AI and EUvsDisinfo. The methodology comprises automated extraction of texts from web resources, text preprocessing, language filtering, thematic clustering, and the development of classification models in the SAS Text Miner system. To collect textual content automatically, the authors developed a specialized Python application using the PLAYWRIGHT and ASYNCIO libraries, optimized for high-performance processing of large-scale web article corpora. The results revealed a significant relationship between the type of content, the language of the source, and the need for VPN access to retrieve texts. The Pearson chi-square statistic was 8847 with 10 degrees of freedom and a p-value < 0.000001, indicating that the relationship is highly statistically significant. Russian-language disinformation resources in most cases require VPN access because of sanctions and geographical access restrictions, whereas trustworthy English-language and Ukrainian-language sources are substantially more open and more stably accessible. Thematic analysis showed that pro-Kremlin disinformation is concentrated around anti-Ukrainian, anti-NATO, and conspiracy-oriented narratives, with high thematic repetition and characteristics of coordinated foreign information manipulation and interference (FIMI) campaigns. The methodology can be applied in information security, OSINT analytics, information space monitoring, and the development of automated disinformation detection systems.
DOI:10.32347/2411-4049.2026.2.154-160
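
The summary reports a Pearson chi-square test of independence (statistic 8847, 10 degrees of freedom). As a rough illustration of how such a statistic is computed from a contingency table, here is a minimal sketch in plain Python. It is not the authors' code. The function name `chi_square_independence`, the table shape, and every count below are hypothetical and do not come from the article. The sketch assumes that every row and column total is positive.

```python
from typing import List, Tuple


def chi_square_independence(table: List[List[float]]) -> Tuple[float, int]:
    """Return (Pearson chi-square statistic, degrees of freedom).

    `table` is a contingency table of observed counts.
    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    # Degrees of freedom: (rows - 1) * (columns - 1).
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts, NOT taken from the article:
# rows = source category, columns = (VPN required, VPN not required).
hypothetical_counts = [
    [90, 10],
    [20, 80],
    [15, 85],
]

statistic, dof = chi_square_independence(hypothetical_counts)
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
```

The p-value is the upper tail of the chi-square distribution at the computed statistic. With SciPy installed it could be obtained with `scipy.stats.chi2.sf(statistic, dof)`, or the whole test could be run with `scipy.stats.chi2_contingency`.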