Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach

Bibliographic details
Date: 2020
Authors: Palagin, O.V., Velychko, V.Yu., Malakhov, K.S., Shchurov, O.S.
Format: Article
Language: English
Published: Інститут програмних систем НАН України (Institute of Software Systems of the NAS of Ukraine), 2020
Publication title: Проблеми програмування (Problems in Programming)
Online access: http://dspace.nbuv.gov.ua/handle/123456789/180480
Journal title: Digital Library of Periodicals of National Academy of Sciences of Ukraine
Cite as: Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach / O.V. Palagin, V.Yu. Velychko, K.S. Malakhov, O.S. Shchurov // Проблеми програмування. 2020. № 2-3. P. 341-351. Bibliography: 50 titles. In English.

Repositories

Digital Library of Periodicals of National Academy of Sciences of Ukraine
Description
Abstract: We design a new technique for distributional semantic modeling that uses a neural network-based approach to learn distributed term representations (term embeddings), yielding term vector space models. The technique is inspired by the recent ontology-related approach to the identification of terms (term extraction) and of relations between them (relation extraction), called semantic pre-processing technology (SPT), which uses different types of contextual knowledge such as syntactic, terminological, and semantic knowledge. Our method relies on automatic term extraction from natural language texts and the subsequent formation of problem-oriented or application-oriented (and deeply annotated) text corpora in which the fundamental entity is the term, including both non-compositional and compositional terms. This lets us move from distributed word representations (word embeddings) to distributed term representations (term embeddings). The main practical result of our work is a development kit (a set of toolkits exposed as web service APIs and a web application) that provides all the routines needed for basic linguistic pre-processing and semantic pre-processing of natural language texts in Ukrainian for the future training of term vector space models.
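
The shift from word embeddings to term embeddings described in the abstract can be illustrated with a minimal sketch. It assumes that a list of multi-word terms has already been extracted and rewrites a tokenized sentence so that each such term becomes a single token, which any word-embedding trainer could then treat as one vocabulary entry. The function name `merge_terms`, the underscore joiner, and the sample terms and sentence are hypothetical and are not taken from the authors' development kit or from SPT.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def merge_terms(
    tokens: Sequence[str],
    terms: Iterable[Tuple[str, ...]],
    joiner: str = "_",
) -> List[str]:
    """Replace each known multi-word term with one token (greedy, longest match first)."""
    term_set: Set[Tuple[str, ...]] = {tuple(t) for t in terms}
    max_len = max((len(t) for t in term_set), default=1)
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first so a nested term never splits a longer one.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in term_set:
                merged.append(joiner.join(candidate))
                i += n
                break
        else:
            # No multi-word term starts at this position; keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical term list and sentence, for illustration only.
terms = [("vector", "space", "model"), ("term", "extraction")]
sentence = "we train a vector space model after term extraction".split()
print(merge_terms(sentence, terms))
# ['we', 'train', 'a', 'vector_space_model', 'after', 'term_extraction']
```

A corpus rewritten this way has the term, not the word, as its basic unit, so an embedding model trained on it learns one vector per term. The real pipeline would presumably also need lemmatization and annotation for Ukrainian text, which this sketch leaves out.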