Порівняльний аналіз модифікованих алгоритмів навчання з частковим залученням учителя на малій кількості розмічених даних

The paper is devoted to improving semi-supervised clustering methods and comparing their accuracy and robustness. The proposed approach is based on expanding a clustering algorithm for using an available set of labels by replacing the distance function. Using the distance function considers not only...

Повний опис

Збережено в:
Бібліографічні деталі
Дата:2022
Автори: Lyubchyk, Leonid, Yamkovyi, Klym
Формат: Стаття
Мова:English
Опубліковано: The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" 2022
Теми:
Онлайн доступ:http://journal.iasa.kpi.ua/article/view/239726
Теги: Додати тег
Немає тегів, Будьте першим, хто поставить тег для цього запису!
Назва журналу:System research and information technologies

Репозитарії

System research and information technologies
Опис
Резюме:The paper is devoted to improving semi-supervised clustering methods and comparing their accuracy and robustness. The proposed approach is based on expanding a clustering algorithm for using an available set of labels by replacing the distance function. Using the distance function considers not only spatial data but also available labels. Moreover, the proposed distance function could be adopted for working with ordinal variables as labels. An extended approach is also considered, based on a combination of unsupervised k-medoids methods, modified for using only labeled data during the medoids calculation step, supervised method of k nearest neighbor, and unsupervised k-means. The learning algorithm uses information about the nearest points and classes’ centers of mass. The results demonstrate that even a small amount of labeled data allows us to use semi-supervised learning, and proposed modifications improve accuracy and algorithm performance, which was found during experiments.