Модель вторинних некорельованих семантичних полів для анализу текстових даних

The model of derived uncorrelated semantic fields generated by the method of principal components and singular decomposition of the matrix of semantic fields frequencies has been considered. This model describes a new semantic space with orthonormal basis of displaying text documents. The dimension...

Full description

Saved in:
Bibliographic Details
Date:2014
Main Author: Pavlyshenko, B. M.
Format: Article
Language:Ukrainian
Published: The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" 2014
Online Access:http://journal.iasa.kpi.ua/article/view/33341
Tags: Add Tag
No Tags, Be the first to tag this record!
Journal Title:System research and information technologies

Institution

System research and information technologies
Description
Summary:The model of derived uncorrelated semantic fields generated by the method of principal components and singular decomposition of the matrix of semantic fields frequencies has been considered. This model describes a new semantic space with orthonormal basis of displaying text documents. The dimension of the space of derived semantic fields is significantly less than the dimension of the space of initial semantic fields as a result of replacement of interconnected components by uncorrelated semantic characteristics. The analysis of the test sample of text documents showed the possibility to take into consideration only those components of secondary semantic fields which are described by the first singular numbers. The use of the low-dimension orthonormal basis of derived semantic fields can be effective in the problems of the text data classification and clustering.