Means and methods of the unstructured data analysis

Bibliographic Details
Date:	2019
Main Author:	Rogushina, J.V.
Format:	Article
Language:	Ukrainian
Published:	PROBLEMS IN PROGRAMMING 2019
Subjects:	unstructured data; ontology; Text Mining; Semantic Web; Wiki; UDC 004.853, 004.55
Online Access:	https://pp.isofts.kiev.ua/index.php/ojs1/article/view/348
Journal Title:	Problems in programming

Institution

Problems in programming

Description
Summary:	Unstructured text data are now widely used, and software tools for processing them are developing rapidly; both trends make this research direction urgent and call for intelligent information systems to perform such processing. A significant part of Big Data consists of unstructured texts, which require further development of specialized Text Mining methods and machine learning algorithms. Unstructured data consisting of natural-language text do not, in the general case, have a predetermined data model. Their ambiguity, heterogeneity, and context dependence considerably complicate the classification of documents, the identification of their components, and the automated extraction of user-oriented knowledge from their content, while the large volume and dynamism of such data rule out efficient manual processing. The paper considers means and methods of data structuring and their various software implementations, and analyzes the prospects of using background knowledge for such structuring. It substantiates the feasibility of applying W3C standards such as RDF and OWL. The use of semantic Wiki technologies for developing distributed information resources simplifies the structuring of natural-language text by users and also creates a source of background knowledge for analyzing arbitrary texts of the corresponding domains. The models and methods proposed in the work make it possible to improve this process.
Citation:	Problems in programming 2019; 1: 57-77
DOI:	10.15407/pp2019.01.057
