Analysis of text analytics methods for knowledge extraction from Ukrainian-language social media

Bibliographic details
Date:	2026
Main authors:	Terentiev, Oleksandr; Abroskin, Yurii; Duda, Volodymyr; Prosyankina-Zharova, Tetyana
Format:	Article
Language:	Ukrainian
Published:	Kyiv National University of Construction and Architecture 2026
Keywords:	text analytics; data processing; Coherence Score; F1-score; LSA; NMF; LDA; Top2Vec; BERTopic; OSINT
Online access:	https://es-journal.in.ua/article/view/358171
Journal title:	Environmental safety and natural resources

Institution

Environmental safety and natural resources

Description
Summary:	The purpose of the study is to review and systematize current text analytics and natural language processing methods for knowledge extraction from unstructured social media content, with a focus on Ukrainian-language sources. A comparative analysis of topic modelling methods (LSA, NMF, LDA, HDP, Top2Vec, BERTopic), ontology construction approaches, OSINT data collection tools, and the F1 evaluation metric for named entity recognition tasks was conducted. Comparative analysis of four topic modelling methods applied to real Twitter datasets demonstrated that BERTopic (coherence score 0.62) outperforms LDA (0.45) and Top2Vec (0.56) for short texts; the NER-UK 2.0 corpus provides a baseline solution for Ukrainian named entity recognition with an F1 score of 0.89. Theoretically, the selection of methods that take into account the temporal dynamics of topics is justified. Practically, a five-block pipeline architecture for knowledge extraction from Ukrainian-language social media is proposed. The originality of the work lies in the adaptation of the Methontology-based approach to ontology generation for short unstructured Ukrainian-language texts. Further prospects include practical implementation and validation of the proposed pipeline on real Ukrainian social media datasets.
DOI:	10.32347/2411-4049.2026.1.161-170