Determining the weights of links in networks of terms
Date: | 2019 |
---|---|
Authors: | , |
Format: | Article |
Language: | Ukrainian |
Published: | Institute for Information Recording of the NAS of Ukraine, 2019 |
Subjects: | |
Online access: | http://drsp.ipri.kiev.ua/article/view/199357 |
Journal title: | Data Recording, Storage & Processing |
Repositories
Data Recording, Storage & Processing
Summary: | One of the most urgent problems in natural language processing is considered: the formalization and creation of ontological models of subject domains based on thematic text corpora. Using text mining and natural language processing, together with linguo-statistical methods and computational linguistics, network models of subject domains have been created to provide better interaction between human communicative acts, presented in sign and verbal form, and computer systems. A new approach is proposed for determining the weights of links in a network of terms that correspond to certain concepts of the subject domain under consideration. As a test of the approach, a terminological ontology of the subject domain related to the climate emergency has been created. Further analysis of the resulting model made it possible to determine the most influential and significant links between nodes in the networks of terms, which in turn correspond to concepts of the subject domain. The software implementation of the proposed approaches and methods uses the Python programming language and selected functions of the NLTK module (Natural Language Toolkit, an open-source library). The directed networks of terms were visualized with Gephi, software for modelling and visualizing graphs. The weighted directed networks of terms built according to the proposed approach can be used for automatically creating terminological ontologies of subject domains with the participation of experts. The results can also be used to create personal search interfaces for users of information retrieval systems and in database navigation systems, which should help users of such systems find relevant information more easily. Tabl.: 2. Fig.: 2. Refs: 18 titles. |
---|
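The summary describes a pipeline that turns a text corpus into a weighted directed network of terms and then visualizes it in Gephi. The record does not give the article's actual weighting formula, so the following is only a minimal sketch under stated assumptions: terms are single words supplied by hand, an edge `a -> b` is counted whenever term `a` precedes term `b` in the same sentence, and weights are counts normalized by the largest count. The names `term_network` and `to_edge_csv`, the sample text, and the weighting scheme are all hypothetical. The sketch uses only the standard library rather than NLTK, and it writes a CSV edge list with `Source`, `Target` and `Weight` columns, a layout that Gephi's spreadsheet import accepts.

```python
import re
from collections import Counter
from itertools import combinations


def term_network(text, terms):
    """Return {(source, target): weight} for terms co-occurring in a sentence.

    Illustrative weighting only (not the article's method): an edge a -> b is
    counted each time term a first appears before term b in the same sentence;
    counts are divided by the largest count so weights fall in (0, 1].
    """
    vocabulary = set(terms)
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Keep only known single-word terms, in order of first appearance.
        found = []
        for token in re.findall(r"[a-z]+", sentence):
            if token in vocabulary and token not in found:
                found.append(token)
        # combinations() preserves input order, so each pair is (earlier, later).
        counts.update(combinations(found, 2))
    if not counts:
        return {}
    top = max(counts.values())
    return {edge: n / top for edge, n in counts.items()}


def to_edge_csv(weights):
    """Format the edges as a CSV edge list with Source, Target, Weight columns."""
    rows = ["Source,Target,Weight"]
    for (source, target), weight in sorted(weights.items()):
        rows.append(f"{source},{target},{weight:.3f}")
    return "\n".join(rows)


# Hypothetical three-sentence corpus on the climate theme.
sample = (
    "Climate change drives emissions policy. "
    "Emissions shape climate policy. "
    "Climate policy limits emissions."
)
network = term_network(sample, ["climate", "emissions", "policy"])
print(to_edge_csv(network))
```

In this sketch, the heaviest edges play the role of the "most influential" links mentioned in the summary. A real pipeline would replace the hand-picked term list and the regular-expression tokenizer with proper term extraction.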