The concept and evaluating of big data quality in the semantic environment

Big data refers to large volumes, complex data sets with various autonomous sources, characterized by continuous growth. Data storage and data collection capabilities are now rapidly expanding in all fields of science and technology due to the rapid development of networks. Evaluating the quality of...

Bibliographic details
Date: 2023
Author: Novitsky, A.V.
Format: Article
Language: English
Published: PROBLEMS IN PROGRAMMING 2023
Topics:
Online access: https://pp.isofts.kiev.ua/index.php/ojs1/article/view/527
Journal title: Problems in programming

Repositories

Problems in programming
_version_ 1859502220854165504
author Novitsky, A.V.
author_facet Novitsky, A.V.
author_sort Novitsky, A.V.
baseUrl_str https://pp.isofts.kiev.ua/index.php/ojs1/oai
collection OJS
datestamp_date 2023-06-25T06:57:27Z
description Big data refers to large volumes, complex data sets with various autonomous sources, characterized by continuous growth. Data storage and data collection capabilities are now rapidly expanding in all fields of science and technology due to the rapid development of networks. Evaluating the quality of data is a difficult task in the context of big data, because the speed of semantic data reasoning directly depends on its quality. The appropriate strategies are necessary to evaluate and assess data quality according to the huge amount of data and its rapid generation. Managing a large volume of heterogeneous and distributed data requires defining and continuously updating metadata describing various aspects of data semantics and its quality, such as conformance to metadata schema, provenance, reliability, accuracy and other properties. The article examines the problem of evaluating the quality of big data in the semantic environment. The definition of big data and its semantics is given below and there is a short excursion on a theory of quality assessment. The model and its components which allow to form and specify metrics for quality have already been developed. This model includes such components as: quality characteristics; quality metric; quality system; quality policy. A quality model for big data that defines the main components and requirements for data evaluation has already been proposed. In particular, such evaluation components as: accessibility, relevance, popularity, compliance with the standard, consistency, etc. are highlighted. The problem of inference complexity is demonstrated in the article. Approaches to improving fast semantic inference through materialization and division of the knowledge base into two components, which are expressed by different dialects of descriptive logic, are also considered below. The materialization of big data makes it possible to significantly speed up the processing of requests for information extraction. 
It is demonstrated how the quality of metadata affects materialization. The proposed model of the knowledge base allows increasing the qualitative indicators of the reasoning speed. Problems in programming 2022; 3-4: 260-270
first_indexed 2025-07-17T09:41:12Z
format Article
fulltext Data analytics software tools. UDC 004.05 http://doi.org/10.15407/pp2022.03-04.260

THE CONCEPT AND EVALUATING OF BIG DATA QUALITY IN THE SEMANTIC ENVIRONMENT

Oleksandr Novytskyi

Big data refers to large-volume, complex data sets with various autonomous sources, characterized by continuous growth. Data storage and data collection capabilities are now rapidly expanding in all fields of science and technology due to the rapid development of networks. Evaluating the quality of data is a difficult task in the context of big data, and for semantic data the quality and speed of reasoning depend directly on data quality. Given the huge amount of data and its rapid generation, appropriate strategies are necessary to evaluate data quality. Managing a large volume of heterogeneous and distributed data requires defining and continuously updating metadata describing various aspects of data semantics and quality, such as conformance to a metadata schema, provenance, reliability, accuracy and other properties. The article examines the problem of evaluating the quality of big data in the semantic environment. A definition of big data and its semantics is given, followed by a short excursion into the theory of quality assessment. A model is developed whose components make it possible to form and specify quality metrics. The model includes the following components: quality characteristic, quality metric, quality system, and quality policy. A quality model for big data is proposed that defines the main components and requirements for data evaluation. In particular, evaluation components such as accessibility, relevance, popularity, compliance with the standard, and consistency are highlighted. The problem of inference complexity is demonstrated in the article.
Approaches to improving fast semantic inference through materialization and division of the knowledge base into two components, which are expressed by different dialects of description logic, are also considered. The materialization of big data makes it possible to significantly speed up the processing of information extraction queries. It is demonstrated how the quality of metadata affects materialization. The proposed model of the knowledge base improves the reasoning speed.

1. Introduction

The concept of Big Data in the broad sense is used to describe data processing, distribution, and analytics (Stuart Ward & Barker, 2013). The main distinguishing feature of such data is its exponential growth. Much effort is directed at the problem of big data (BD) because new methods and algorithms for BD processing are needed. Defining big data is difficult primarily because it is hard to give a quantitative definition of a set of information objects. The most widely accepted definition comes from the report (Laney, 2001), where the problem of managing large data sets is based on the three Vs: Volume, Velocity, and Variety. They reflect the growth of data volumes and the heterogeneity of data formats and metadata, which complicate rapid data management. Later, the criterion of Veracity (Schroeck, et al., 2012) was added to the definition of big data. The term was then clarified and supplemented with criteria concerning the complexity and unstructuredness of the data (Intel IT Center, 2012), (Suthaharan, 2014). A number of big data definitions came from real business problems. However, we assume that for semantic big data the semantics and structure are given through external ontologies and fixed through metadata. We do not consider the problem of normalization and data extraction but evaluate the quality of such data.
But this does not solve the problems of operating with such data, and it creates additional problems related to reasoning over information from such a BD set. Our semantic data model must satisfy the requirements of Findable, Accessible, Interoperable and Reusable data or metadata (Wilkinson, et al., 2016).

© O.V. Novytskyi, 2022. ISSN 1727-4907. Problems in Programming. 2022. No. 3-4. Special issue

2. Big Data Semantics

The issue of semantics was studied in (Ceravolo, et al., 2018), where big data was considered on the basis that data semantics refers to the meaningful and effective use of a data object to represent a concept or object in the real world. Such a general concept unites a wide variety of applications (Amsler, 1972). Big Data semantic knowledge refers to numerous aspects of rules, expert knowledge and domain information (Woods, 1975). A specific feature of big data in a semantic environment is the complexity of reasoning, even when the data does not seem large at first glance. Online web applications are very sensitive to response delays, so combining reasoning with web technology places high demands on the velocity of big data. This article surveys the problem of big data quality for web applications and the means of increasing velocity.

3. Quality Model of Big Data

The practical suitability of BD is determined primarily by its quality. The urgency of solving the BD quality problem is determined by the scale of its creation and distribution. Let us consider the main concepts related to the quality of BD (Novytskyi, et al., 2014); some of them were taken from the digital library domain and adapted to big data.

Quality is a set of properties of objects that give them the ability to satisfy the stipulated or anticipated needs of the consumer in accordance with the object's purpose.

The quality characteristic is a property or a set of object properties with the help of which quality can be described and evaluated. Each object has its own nomenclature of characteristics. A characteristic can be a composition of other characteristics, forming a hierarchical structure.

A metric is a formula or rule for determining the degree to which an object possesses a characteristic.

A quality indicator is a quantitative or qualitative value obtained as a result of the procedure for evaluating the quality of a characteristic according to the evaluation methodology. Quantitative indicators have a numerical expression within a certain scale. Qualitative indicators have a verbal expression within a certain ordered verbal scale.

Quality level is the degree of acceptability of the obtained quality indicator from the point of view of the expected (planned) quality.

The quality system is a set of organizational structures, methods, processes, procedures and resources necessary for the general direction and management of quality by established methods. It includes the quality policy, the quality model, the quality achievement system, and the quality system documentation.

The quality policy is a document developed by the responsible management. It expresses the goals in the quality field, the acceptable level of quality, the duties of various persons and structures for quality assurance, and a set of measures to achieve quality. The quality policy is defined based on the tasks set in the quality field.

The quality model covers the set of objects for which quality is described, evaluated and supported. It also includes quality characteristics, methods and means of quality assessment, and metrics and algorithms for determining quality indicators. A specific quality model is selected based on the developed quality policy and other factors.

Quality achievement is the set of organizational structures, responsibilities, procedures, processes and resources that implement general quality management (Novytskyi, et al., 2014).
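The definitions of characteristic, metric and indicator above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the class name `Characteristic`, the plain averaging of sub-characteristics, the thresholds of the verbal scale and the sample record are all hypothetical and are not prescribed by the source.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Characteristic:
    """A quality characteristic; a composite one holds sub-characteristics."""
    name: str
    # Metric: a rule returning the degree (0..1) to which an object has the characteristic.
    metric: Optional[Callable[[dict], float]] = None
    children: List["Characteristic"] = field(default_factory=list)

    def indicator(self, obj: dict) -> float:
        """Quantitative quality indicator of this characteristic for `obj`."""
        if self.children:
            # Assumption: a composite characteristic is the plain mean of its parts.
            return sum(c.indicator(obj) for c in self.children) / len(self.children)
        if self.metric is None:
            raise ValueError(f"elementary characteristic {self.name!r} has no metric")
        return self.metric(obj)


# Hypothetical ordered verbal scale: (lower bound, label), checked from best to worst.
VERBAL_SCALE = [(0.8, "high"), (0.5, "acceptable"), (0.0, "low")]


def qualitative(value: float) -> str:
    """Map a quantitative indicator to a qualitative (verbal) one."""
    for lower, label in VERBAL_SCALE:
        if value >= lower:
            return label
    return VERBAL_SCALE[-1][1]


# Two elementary characteristics with hypothetical metrics.
accessibility = Characteristic(
    "accessibility", metric=lambda o: 1.0 if o.get("url") else 0.0
)
consistency = Characteristic(
    "consistency",
    metric=lambda o: o.get("valid_triples", 0) / max(o.get("triples", 1), 1),
)
# A composite characteristic forming a two-level hierarchy.
quality = Characteristic("quality", children=[accessibility, consistency])

record = {"url": "https://example.org/r/1", "triples": 10, "valid_triples": 5}
score = quality.indicator(record)
print(score, qualitative(score))
```

The hierarchy is evaluated recursively, so a deeper nomenclature of characteristics can be expressed by nesting `children`; a different aggregation rule (for example a weighted one) would replace the mean in `indicator`.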
The quality management system is an organizational structure that includes personnel who implement quality management functions using established methods.

Quality management is the general management of quality, provided with resources, particularly human resources. It organizes quality assurance work, interacts with the external environment, defines policies, goals and plans in the quality field, and makes strategic and important operational decisions regarding quality.

Quality assurance creates confidence that quality requirements will be met. It includes administrative and procedural measures carried out within the framework of the quality system to ensure the fulfillment of requirements and goals. It comprises systematic measurement, comparison with a standard, process monitoring, and technological or other process adjustments needed to achieve the required quality.

Quality control is a set of measures, procedures, methods and means that allow a systematic and independent analysis to be performed. This analysis makes it possible to determine whether activities and results in the quality field comply with the planned measures, how effectively the measures are implemented, and whether they meet the set goals. The quality assurance system is the subject of the system analysis.

[Figure: 3.1 The system of quality achieving. Diagram labels: Quality management, Quality assurance, Quality control, Quality assessment, Tools of support, Methods, Approach, Tools. Relation labels: Manage, Monitors execution, Evaluates efficiency.]

Quality assessment measures the achieved or expected level of quality at every stage of the BD life cycle. There is a distinction between objective and subjective assessment. Objective assessment is a clearly defined assessment process, usually fixed by mathematical formulas, which does not depend on subjective perception. Subjective assessment is based on personal feelings, views and opinions.
We propose considering the main requirements for the quality model (Spirin, et al., 2012), which also apply to BD.

A. The quality model should make it possible to distinguish the quality of the product itself and of its interaction with the environment. The following components are distinguished in this context:
- the quality of the product itself, without taking into account its behavior in the external environment (internal quality);
- the quality of the product regarding its behavior in the external environment (external quality);
- the quality of the technological processes of product development (process quality);
- the quality of the product when used in different contexts, that is, the quality experienced by the user in specific scenarios of product use (quality in use).
B. The quality model should include all stages of the BD development and use life cycle, starting from requirements development and ending with industrial operation.
C. The quality model is relevant to all structural elements of BD. It covers all types of support for the software system: functional, informational, mathematical, technical, etc.
D. An important component of the quality model is the structure of quality characteristics and metrics that assess elementary characteristics.
BD consists of two components: the data and the database application; information is retrieved from a computerized BD by a computer program. The semantic information model for BD is defined as a set of information objects in which each predicate is defined through a top-level ontology.

Each information object (IO) in the BD environment is specified as a directed acyclic graph, where the information object consists of a list of statements of the form «subject - predicate - object». Each such statement is called a triple. The set of such triples forms a directed graph in which the vertices are subjects and objects and the edges are predicates. Certain metadata describes each node of such a graph. That is, the model of the information object in the BD environment is defined as

$IO = \langle s(m), p(m), o(m) \rangle$.

Evaluating the quality of elementary characteristics involves determining their metrics, represented by formulas or rules for determining the degree to which an object has an elementary characteristic (Novitsky, et al., 2016). The metric of an elementary characteristic reflects the degree to which an object or a set of objects possesses a certain property.

Let a set of equivalent objects $M_i \in M$ ($i = 1, \dots, N$) be given, each of which may or may not have a certain property $p$. We define the following characteristic function:

$$\chi(M_i, p) = \begin{cases} 1, & \text{if object } M_i \text{ has property } p; \\ 0, & \text{otherwise.} \end{cases} \qquad (1)$$

Then the estimate of the degree to which the set of objects $M$ has the property $p$ is equal to:

$$\mu(M, p) = \frac{1}{N} \sum_{j=1}^{N} \chi(M_j, p). \qquad (2)$$

If the objects $M_i$ ($i = 1, \dots, N$) are unequal and a weighting factor $K_i$, $0 \le K_i \le 1$ ($i = 1, \dots, N$), which determines the relative importance of the objects, is given for each of them, then the above formula takes the following form:

$$\mu(M, p) = \frac{1}{N} \sum_{j=1}^{N} K_j \, \chi(M_j, p). \qquad (3)$$

Similarly, a metric can be defined for a situation where one object can have multiple properties and it is necessary to determine to what extent they are inherent to the object.
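The triple-based information object and the estimates (1)-(3) can be sketched in Python as follows. This is a minimal sketch, not the author's implementation: the names `Node`, `chi`, `mu`, `make_io` and the property `has_provenance` (every subject carries a `source` metadata entry) are hypothetical, and the weighted estimate divides by N as in formula (3) rather than by the sum of the weights.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass
class Node:
    """A subject, predicate or object together with the metadata describing it."""
    value: str
    meta: Dict[str, str] = field(default_factory=dict)


# An information object is a list of (subject, predicate, object) statements.
Triple = Tuple[Node, Node, Node]
InformationObject = List[Triple]
Property = Callable[[InformationObject], bool]


def chi(io: InformationObject, prop: Property) -> int:
    """Characteristic function (1): 1 if the object has the property, else 0."""
    return 1 if prop(io) else 0


def mu(objects: Sequence[InformationObject], prop: Property,
       weights: Optional[Sequence[float]] = None) -> float:
    """Estimate (2), or the weighted estimate (3) when weights K_j are given."""
    n = len(objects)
    if n == 0:
        raise ValueError("the set of objects is empty")
    if weights is None:
        weights = [1.0] * n  # equal objects: formula (2)
    if len(weights) != n or any(not 0.0 <= k <= 1.0 for k in weights):
        raise ValueError("one weight in [0, 1] is required per object")
    return sum(k * chi(o, prop) for k, o in zip(weights, objects)) / n


def has_provenance(io: InformationObject) -> bool:
    """Hypothetical property: every subject has a 'source' metadata entry."""
    return all("source" in s.meta for s, _, _ in io)


def make_io(subject: str, source: Optional[str] = None) -> InformationObject:
    """Build a one-statement information object for the example."""
    meta = {"source": source} if source else {}
    return [(Node(subject, meta), Node("dc:title"), Node("Some title"))]


objects = [
    make_io("ex:a", "repo1"),
    make_io("ex:b", "repo1"),
    make_io("ex:c", "repo2"),
    make_io("ex:d"),  # no provenance metadata
]
weights = [1.0, 0.5, 0.5, 1.0]

print(mu(objects, has_provenance))           # three of four objects qualify
print(mu(objects, has_provenance, weights))  # weighted by relative importance
```

Passing the property as a function keeps the estimate independent of any particular characteristic, so the same `mu` can be reused for conformance to a metadata schema, accessibility, or other properties expressed as predicates over an information object.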
Establishing acceptable values for certain characteristics and assigning a qualitative measure to the corresponding range is important for metrics. This range can be determined experimentally or algorithmically; in many cases it is established by an expert. For example, let expert $j$ with competence $K_j$ specify a range of values $[X_{ij}, Y_{ij}]$ for the $i$-th characteristic, where $Y_{ij}$ is the optimal value of the characteristic and $X_{ij}$ is its worst value. Suppose $M$ experts evaluated the characteristics. The final score for the range of values is calculated as follows:
It contains all types of support for the software system — functional, informational, mathematical, technical, etc. D. An important component of the quality model is the structure of quality characteristics and metrics that assess elementary characteristics. BD consist of two components are data and data base application, information is retrieved from a computerized BD by using a computer program. The semantic information model for BD defines as a set of information objects in which each predicate define through top-level ontology. Each information IO object in the BD environment is specified in a certain directed acyclic graph where the information object consists of a list of statements in the form «subject - predicate - object». Each such statement is called a triplet. The set of such triplets forms a directed graph, in which vertices are subjects and objects, and edges are predicates. Certain metadata describes each node of such a graph. That is, the model of the information object in the BD environment is defined as ( ), ( ), ( )IO s m p m o m . Evaluating the quality of elementary characteristics involves determining their metrics represented by formulas or rules for determining the degree to which an object has an elementary characteristic (Novitsky, et al., 2016). The metric of an elementary characteristic reflects the degree to which an object or a set of objects possesses a certain property. Let a set of equivalent objects i M M ( 1,...,i N ), be given, which may or may not have a certain property. We define the following characteristic function: 1, ; ( , ) 0, . i i object M has property p M p anothercase . (1) Then the estimate of the degree to which the set of objects M has the property p is equal to: 1 , N j j M p M p N . 
(2) If the objects i M ( 1,...,i N ) are unequal and their weighting factor : 0 1 i i K K ( 1,...,i N ), is given for each of them, which determines the relative importance of the objects, then the above formula takes the following form: 1 , N j j J K M p M p N . (3) Similarly, a metric can be defined for a situation where one object can have multiple properties and it is necessary to determine to what extent they are inherent to the object. Establishing acceptable values for certain characteristics and adding a qualitative measure to the appropriate range is important for metrics. This range can be determined experimentally or algorithmically. An expert establishes it in many cases. For example, let's imagine j as an expert with j K competence specifying a range of values for the i with its characteristics: , ij ij X Y , ij Y - where the optimal value of the characteristic is ij X with its worst value. M experts evaluated the characteristics. The final score for the range of values is calculated as follows: , be given, which may or may not have a certain property. We define the following characteristic function: Програмні засоби аналітики даних [Введите текст] A. The quality model should provide an opportunity to highlight the quality of the product itself and its interaction with the environment. The following components are distinguished in this context as: − the quality of the product itself, without taking into account its behavior with the external environment (internal quality); − product quality regarding its behavior in the external environment (external quality); − the quality of technological processes of product development (process quality); − the quality of the product to its use in different contexts (and the quality experienced by the user in specific scenarios of product use (quality during use)). B. 
The quality model should include all stages of the BD development and use life cycle starting from requirements development and ending with the industrial operation. С. The quality model is relevant to all structural elements of BD. It contains all types of support for the software system — functional, informational, mathematical, technical, etc. D. An important component of the quality model is the structure of quality characteristics and metrics that assess elementary characteristics. BD consist of two components are data and data base application, information is retrieved from a computerized BD by using a computer program. The semantic information model for BD defines as a set of information objects in which each predicate define through top-level ontology. Each information IO object in the BD environment is specified in a certain directed acyclic graph where the information object consists of a list of statements in the form «subject - predicate - object». Each such statement is called a triplet. The set of such triplets forms a directed graph, in which vertices are subjects and objects, and edges are predicates. Certain metadata describes each node of such a graph. That is, the model of the information object in the BD environment is defined as ( ), ( ), ( )IO s m p m o m . Evaluating the quality of elementary characteristics involves determining their metrics represented by formulas or rules for determining the degree to which an object has an elementary characteristic (Novitsky, et al., 2016). The metric of an elementary characteristic reflects the degree to which an object or a set of objects possesses a certain property. Let a set of equivalent objects i M M ( 1,...,i N ), be given, which may or may not have a certain property. We define the following characteristic function: 1, ; ( , ) 0, . i i object M has property p M p anothercase . (1) Then the estimate of the degree to which the set of objects M has the property p is equal to: 1 , N j j M p M p N . 
(2) If the objects i M ( 1,...,i N ) are unequal and their weighting factor : 0 1 i i K K ( 1,...,i N ), is given for each of them, which determines the relative importance of the objects, then the above formula takes the following form: 1 , N j j J K M p M p N . (3) Similarly, a metric can be defined for a situation where one object can have multiple properties and it is necessary to determine to what extent they are inherent to the object. Establishing acceptable values for certain characteristics and adding a qualitative measure to the appropriate range is important for metrics. This range can be determined experimentally or algorithmically. An expert establishes it in many cases. For example, let's imagine j as an expert with j K competence specifying a range of values for the i with its characteristics: , ij ij X Y , ij Y - where the optimal value of the characteristic is ij X with its worst value. M experts evaluated the characteristics. The final score for the range of values is calculated as follows: (1) Then the estimate of the degree to which the set of objects M has the property p is equal to: Програмні засоби аналітики даних [Введите текст] A. The quality model should provide an opportunity to highlight the quality of the product itself and its interaction with the environment. The following components are distinguished in this context as: − the quality of the product itself, without taking into account its behavior with the external environment (internal quality); − product quality regarding its behavior in the external environment (external quality); − the quality of technological processes of product development (process quality); − the quality of the product to its use in different contexts (and the quality experienced by the user in specific scenarios of product use (quality during use)). B. 
For example, suppose that expert $j$ with competence $K_j$ specifies a range of values $[X_{ij}, Y_{ij}]$ for the $i$-th characteristic, where $Y_{ij}$ is the optimal value of the characteristic and $X_{ij}$ is its worst value, and that $M$ experts evaluate the characteristics. The final score for the range of values is calculated as follows:
$$X_i = \frac{\sum_{j=1}^{M} K_j X_{ij}}{\sum_{j=1}^{M} K_j}, \qquad Y_i = \frac{\sum_{j=1}^{M} K_j Y_{ij}}{\sum_{j=1}^{M} K_j}. \tag{4}$$
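The competence-weighted averaging in formula (4) can be sketched in a few lines of Python. The function name `aggregate_expert_ranges` and the sample numbers are illustrative assumptions, not part of the article; the sketch only assumes that each expert supplies one competence value and one (worst, optimal) pair for the characteristic.

```python
from typing import Sequence, Tuple


def aggregate_expert_ranges(
    competences: Sequence[float],
    ranges: Sequence[Tuple[float, float]],
) -> Tuple[float, float]:
    """Formula (4): competence-weighted mean of the experts' range bounds.

    competences[j] is K_j; ranges[j] is (X_ij, Y_ij) for one characteristic i.
    """
    if not competences or len(competences) != len(ranges):
        raise ValueError("competences and ranges must be non-empty and equal in length")
    total = sum(competences)
    if total <= 0:
        raise ValueError("the sum of competences must be positive")
    x = sum(k * r[0] for k, r in zip(competences, ranges)) / total
    y = sum(k * r[1] for k, r in zip(competences, ranges)) / total
    return x, y


# Two hypothetical experts: the second is three times as competent,
# so the aggregated bounds are pulled towards the second expert's range.
bounds = aggregate_expert_ranges([1.0, 3.0], [(0.0, 8.0), (4.0, 16.0)])
```

Dividing by the sum of competences keeps the aggregated bounds inside the span of the individual experts' bounds, whatever scale is used for competence.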
It should be noted that the intervals $[X_{ij}, Y_{ij}]$ are set by experts or determined algorithmically only for elementary characteristics. At other levels, i.e. for integral characteristics, the minimum and maximum values are calculated by the defined formulas from the given or calculated values of the previous levels (Novitsky, et al., 2016).
4. Quality properties of information objects in Big Data
Next, we consider the evaluation of the quality of semantic information objects.
IO quality characteristics. Accessibility is a complex function that depends on many factors, including:
− the IO is actually available in the BD (the information object may be in the BD but, for some reason, may have been removed from public access, or, because of the amount of data, may not be identifiable among the set of objects);
− there is a service that can find the IO (one way to remove an information object from public access is to deactivate its search characteristics);
− the network and the data transmission system in the network are operational;
− there are no restrictions on access to the IO, or, if there are such restrictions, they do not apply to the specific persons or groups of persons concerned.
It should be noted that, in this context, availability of the IO refers to a single operation, reading. Our review does not cover other possible operations on the IO (modification, deletion, administration). For BD, this is availability to a specific service that interacts with BD. As a rule, a distinction is made between availability to all services and availability to certain services. In this case, the restriction of the access rights of service $S_i$ to $IO_j$, denoted $SAcc(S_i, IO_j)$, is a function that takes the following values: 1 if the service has no access restrictions or belongs to a group to which access is open; 0 otherwise.
Now, if the other availability indicators, apart from access-right restrictions, are denoted by $P_1, \ldots, P_n$, each taking the value 1 if the indicator is satisfied and 0 if it is not, then the general availability is calculated as follows:

$$\min\bigl(P_1, \ldots, P_n, SAcc(S_i, IO_j)\bigr). \tag{5}$$

Relevance is the degree to which the information content of the information object meets the information needs of the user. Neither can be strictly formalized. The assessment largely depends on how well users understand their current information needs and the tasks facing them. The user's current information needs are expressed through an information search query formed as a result of knowledge reasoning. The query implicitly defines the context in which relevance is evaluated. The user (who may be a group of people) evaluates this correspondence after receiving a response to the query. The relevance evaluation function $Relevance(IO_i, S_j, Query_k)$ is as follows:

$$Relevance(IO_i, S_j, Query_k) = \begin{cases} 1, & \text{if service } S_j \text{ confirms that } IO_i \text{ is relevant for } Query_k; \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$

Accuracy of storage. Over its lifetime, an object can pass through different states caused by transitions to other software and technology platforms. Big data is characterized by constant change, and errors in the data tend to accumulate and scale [14]; such changes include changing the storage format, moving to newer versions of the BD, etc. All this can reduce the storage accuracy of the new version of the information object compared with the old one. This characteristic assesses the degree of loss of storage accuracy in the cases described above (Novitsky, et al., 2016).

Credibility means that the IO is able to confirm that it is what it should be. The ability to verify and measure the extent to which an IO is what it is claimed to be is fundamentally important for its correct perception and use. Reliability determines the extent to which the IO can be relied upon. It is largely determined by the credibility of the developer and of the source of origin. The credibility of the IO can be measured by:
− the attitude of users towards the IO itself;
− the attitude of users towards the source of the IO;
− the availability of information on the chronology of IO changes;
− the attitude of users towards the BD in which the IO is located.
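The general availability of formula (5) can be sketched in Python as follows. Because all indicators are binary, the minimum acts as a logical AND: a single unsatisfied indicator makes the IO unavailable to the service. The helper names `s_acc` and `availability`, the group-based access check and the sample values are hypothetical and serve only as an illustration.

```python
from typing import Iterable, Set


def s_acc(service_groups: Set[str], allowed_groups: Set[str],
          restricted: bool) -> int:
    """Access-rights function SAcc: 1 if access is open to the service, else 0.

    Assumption: access is modelled through group membership.
    """
    if not restricted:
        return 1
    return 1 if service_groups & allowed_groups else 0


def availability(indicators: Iterable[int], access: int) -> int:
    """Formula (5): minimum over the binary indicators P_1..P_n and SAcc."""
    values = list(indicators) + [access]
    if any(v not in (0, 1) for v in values):
        raise ValueError("all indicators must be 0 or 1")
    return min(values)


# Hypothetical indicators: IO is stored, a search service finds it, network is up.
open_access = s_acc({"analytics"}, {"analytics", "admin"}, restricted=True)
no_access = s_acc({"guest"}, {"admin"}, restricted=True)

print(availability([1, 1, 1], open_access))  # 1
print(availability([1, 0, 1], open_access))  # 0
print(availability([1, 1, 1], no_access))    # 0
```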
Integrity determines the extent to which the IO is complete and correct from the point of view of the software object it represents. Integrity helps to increase trust in the IO [13].

Accuracy of reproduction determines how accurately the IO reproduces its original. For example, a text document reproducing an ancient book can reproduce the text accurately and completely ignore its artistic design.

Timeliness indicates that the IO is introduced and updated on time, an issue that is specific to BD. This characteristic evaluates how quickly the set $(s(m), p(m), o(m))$ in the IO is updated compared with the real state of affairs. It is measured by the ratio of the actual delay time to the permissible one:

$$Timeliness_{IO}(s, p, o) = \frac{\text{real time delay}}{\text{expected time delay}}. \tag{7}$$

Origin is a characteristic of the quality of an IO. It indicates how well (correctly, completely, with what quality) the entire history of the origin and changes of an IO is presented, and how accurately and over what period the history of the IO's existence can be traced. This is an important characteristic, since inference over semantic data depends on the data itself. Understanding the history of the data helps to determine why the behavior of the system changed, which is not a trivial task in the BD environment.

Susceptibility indicates how easily a person can understand and accept the IO. It can be used to analyze which set of IOs is most easily perceived by a group of persons, given the tasks they are solving.

Practical aspects of assessment of the quality of BD.
One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems concern completeness, data integrity and the absence of contradictions. The difficulty is that, under BD conditions, the time needed to detect such issues may exceed the time allowed for receiving a response from the BD. That is why methods are needed that detect such problems at an early stage. There are various approaches to this task, such as controlling all data entered into the system through an ontology. In practice, it is often not known what the data model should be, since the requirements for the BD system can change as the data grows, and they can be updated constantly. This means that data previously entered into the BD management environment in a previously specified structure may, after some time, no longer correspond to the quality model. Identifying these problems is difficult because of the scale.

One of the criteria of the quality model is the ability of the BD to respond quickly to user requests. The most effective method of increasing this speed is materialization [15]. Materialization improves performance at query time by making the required information explicit in advance, so that the necessary information is not recomputed for each separate request. However, this method can be ineffective if materialization is excessive.

Consider a graph of semantic data G in which the connections between concepts are built on the basis of description logic (DL). We briefly describe the DL ALC, which is the basis for all DLs of the family. ALC stands for "Attributive Language with Complements"; it is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the complement (negation) constructor was added.
Syntax describes a set of correctly constructed language expressions, and semantics indicates their formal meaning. Let 1 , . . . , m CN A A і 1 , . . . , n RN R R be finite, non-empty sets of atomic concepts and atomic roles. The ALC syntax is defined as follows: − M and L are concepts; − an arbitrary atomic concept A is a concept; − if C is an arbitrary concept, then C , C Dh and C Dg are concepts. he corresponding constructors are called addition, intersection and union; − if C is a concept, R is an atomic role, then .RCj and .RCi are arbitrary concepts. semantics is defined through the concept of interpretation. An interpretation is a pair of . ), ( II , where Δ – is a non-empty set, called the domain of interpretation, Ia is an interpreting function that assigns the measure ΔIA 8 to each atomic A concept and R to each atomic role as an binary relation Δ ΔR ×I8 . Other formulas are interpreted as follows: ΔI I= , =M L ; (8) \ ,   ( ) (  ) ( ), I I I I I I I IA A C D C D C D C Dy h 1 g 2 (9) { | (( ) )}. , I IRC a b a b R b Cj 9 j 9 9 o 9 (10) { | (( ) )}. , I IRC a b a b R b Ci 9 i 9 9 9 (11) is updated compared to the real state of affairs. The charac- teristic is measured by the ratio of the actual delay time compared to the permissible one: Програмні засоби аналітики даних [Введите текст] Integrity determines to what extent the IO is complete and correct from the point of view of the software object it represents. Integrity contributes to increasing trust in the IO [13]. Accuracy of reproduction determines the degree of accuracy of the reproduction of the IO of its original. For example, a text document reproducing an ancient book can accurately reproduce the text and completely ignore its artistic design. Timeliness indicates that the IO is introduced and updated on time, as this issue is specific to BD. This characteristic evaluates how quickly the set ( ), ( ), ( )s m p m o m in IO is updated compared to the real state of affairs. 
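As an illustration of the ratio in (7), here is a minimal Python sketch. The function name `timeliness` and its parameters are assumptions made for this example and do not come from the article. Under this reading, a value of at most 1 means that the IO was updated within the permissible delay.

```python
from datetime import timedelta


def timeliness(real_delay: timedelta, expected_delay: timedelta) -> float:
    """Ratio of the actual update delay to the permissible (expected) delay."""
    if expected_delay <= timedelta(0):
        raise ValueError("expected_delay must be positive")
    # Dividing one timedelta by another yields a plain float.
    return real_delay / expected_delay


# Hypothetical example: the IO was refreshed 6 hours after the real-world
# change, while the quality policy allows up to 24 hours.
ratio = timeliness(timedelta(hours=6), timedelta(hours=24))
print(ratio)
```

A ratio above 1 would then flag an IO whose update lagged behind the permitted window.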
Origin (provenance) is a characteristic of the quality of an IO. It indicates how well (correctly, completely, qualitatively) the entire history of the origin and modification of an IO is presented, and how accurately and over what period the history of the IO's existence can be traced. This is an important characteristic, since inference over semantic data depends on the data itself. Understanding the historical information about the data helps to determine the reasons for changes in the system's behavior, which is not a trivial task in the BD environment.

Susceptibility indicates how easily a person can understand and accept an IO. It can be used to analyze which set of IOs is most easily perceived by a group of people, given the tasks they solve.

Practical aspects of assessment of the quality of BD

One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems concern completeness, data integrity and consistency (the absence of contradictions). The difficulty is that, in a BD setting, the time needed to detect such issues may exceed the time allowed for answering a request to the BD. That is why it is necessary to develop methods that allow such problems to be detected at an early stage. There are various approaches to this task; one is to control all data entering the system through the ontology. In practice, it is often not known what the data model should be, since the requirements for the BD system can change as the data grows, and these requirements can be updated constantly. This means that data previously entered into the BD management environment in a previously specified structure may no longer correspond to the quality model after some time. Identifying these problems is difficult because of the scale.
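To illustrate controlling incoming data through the ontology, here is a minimal Python sketch that checks a triple against a tiny schema before it is stored. The names `SCHEMA`, `TYPES` and `check_triple` are hypothetical, as are the sample data. The check also treats the property's domain and range as closed-world constraints, which is an assumption of this sketch rather than something stated in the article.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical schema: property -> (required subject class, required object class).
SCHEMA: Dict[str, Tuple[str, str]] = {
    "authorOf": ("Person", "Document"),
}

# Hypothetical class assertions already known to the system.
TYPES: Dict[str, str] = {
    "alice": "Person",
    "report1": "Document",
    "kyiv": "City",
}


def check_triple(triple: Triple) -> List[str]:
    """Return a list of quality problems found for one incoming triple."""
    s, p, o = triple
    if p not in SCHEMA:
        return [f"unknown property: {p}"]
    domain, range_ = SCHEMA[p]
    problems = []
    if TYPES.get(s) != domain:
        problems.append(f"subject {s} is not a {domain}")
    if TYPES.get(o) != range_:
        problems.append(f"object {o} is not a {range_}")
    return problems


print(check_triple(("alice", "authorOf", "report1")))
print(check_triple(("alice", "authorOf", "kyiv")))
```

Running such a check at ingestion time is one way to surface integrity problems early. If the schema itself changes later, data that was already accepted would have to be re-checked.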
One of the criteria of the quality model is the ability of BD to give a quick response to user requests. The most effective method of increasing this speed is materialization [15]. Materialization can be used to improve performance at query time by making the required information explicit in advance. Thus, recalculation of the necessary information for each separate request is avoided. However, this method can be ineffective if there is excessive materialization.

Consider a certain graph of semantic data G in which the connections between concepts are built on the basis of description logic (DL). We will briefly describe the DL ALC, which is the basis for all DLs of the family. ALC stands for «Attributive Language with Complements». It is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the complement constructor (negation) was added. Syntax describes the set of well-formed language expressions, and semantics indicates their formal meaning.
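The materialization idea described above can be sketched in Python as follows. This is a minimal illustration that assumes a graph stored as a set of triples and two RDFS-style rules (transitivity of `subClassOf` and propagation of `type`). The rule set, the predicate names and the sample graph are assumptions made for this example, not the procedure from [15].

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def materialize(graph: Set[Triple]) -> Set[Triple]:
    """Forward-chain two simple rules until no new triple appears."""
    closed = set(graph)
    changed = True
    while changed:
        changed = False
        new: Set[Triple] = set()
        for (a, p, b) in closed:
            if p != "subClassOf":
                continue
            for (c, q, d) in closed:
                # Transitivity: a subClassOf b, b subClassOf d => a subClassOf d
                if q == "subClassOf" and c == b:
                    new.add((a, "subClassOf", d))
                # Type propagation: c type a, a subClassOf b => c type b
                if q == "type" and d == a:
                    new.add((c, "type", b))
        if not new <= closed:
            closed |= new
            changed = True
    return closed


G = {
    ("Article", "subClassOf", "Document"),
    ("Document", "subClassOf", "InformationObject"),
    ("a1", "type", "Article"),
}
M = materialize(G)
# After materialization, the query is a set lookup with no reasoning at query time.
print(("a1", "type", "InformationObject") in M)
```

The materialized set is larger than the original graph. This is the trade-off the text points to: the more is made explicit in advance, the more there is to store and to keep up to date.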
Let $CN = \{A_1, \ldots, A_m\}$ and $RN = \{R_1, \ldots, R_n\}$ be finite, non-empty sets of atomic concepts and atomic roles. The ALC syntax is defined as follows:
– $\top$ and $\bot$ are concepts;
– an arbitrary atomic concept $A$ is a concept;
– if $C$ and $D$ are arbitrary concepts, then $\neg C$, $C \sqcap D$ and $C \sqcup D$ are concepts (the corresponding constructors are called complement, intersection and union);
– if $C$ is a concept and $R$ is an atomic role, then $\exists R.C$ and $\forall R.C$ are concepts.
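The inductive definition of ALC concepts can be mirrored by a small data structure. The Python sketch below is only an illustration: the class names (`Top`, `Bottom`, `Atom`, `Not`, `And`, `Or`, `Exists`, `Forall`) and the helper `show` are assumptions of this example, not part of the article.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Top:
    pass


@dataclass(frozen=True)
class Bottom:
    pass


@dataclass(frozen=True)
class Atom:
    name: str  # an atomic concept from CN


@dataclass(frozen=True)
class Not:
    c: "Concept"


@dataclass(frozen=True)
class And:
    left: "Concept"
    right: "Concept"


@dataclass(frozen=True)
class Or:
    left: "Concept"
    right: "Concept"


@dataclass(frozen=True)
class Exists:
    role: str  # an atomic role from RN
    c: "Concept"


@dataclass(frozen=True)
class Forall:
    role: str
    c: "Concept"


Concept = Union[Top, Bottom, Atom, Not, And, Or, Exists, Forall]


def show(c: Concept) -> str:
    """Render a concept in the usual DL notation."""
    if isinstance(c, Top):
        return "⊤"
    if isinstance(c, Bottom):
        return "⊥"
    if isinstance(c, Atom):
        return c.name
    if isinstance(c, Not):
        return "¬" + show(c.c)
    if isinstance(c, And):
        return f"({show(c.left)} ⊓ {show(c.right)})"
    if isinstance(c, Or):
        return f"({show(c.left)} ⊔ {show(c.right)})"
    if isinstance(c, Exists):
        return f"∃{c.role}.{show(c.c)}"
    if isinstance(c, Forall):
        return f"∀{c.role}.{show(c.c)}"
    raise TypeError(f"not an ALC concept: {c!r}")


print(show(Exists("R", And(Atom("A"), Not(Atom("B"))))))
```

Each class corresponds to one clause of the syntax, so any value built from these constructors is a well-formed ALC concept by construction.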
ALC semantics is defined through the concept of interpretation. An interpretation is a pair $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$
The ALC syntax is defined as follows: − M and L are concepts; − an arbitrary atomic concept A is a concept; − if C is an arbitrary concept, then C , C Dh and C Dg are concepts. he corresponding constructors are called addition, intersection and union; − if C is a concept, R is an atomic role, then .RCj and .RCi are arbitrary concepts. semantics is defined through the concept of interpretation. An interpretation is a pair of . ), ( II , where Δ – is a non-empty set, called the domain of interpretation, Ia is an interpreting function that assigns the measure ΔIA 8 to each atomic A concept and R to each atomic role as an binary relation Δ ΔR ×I8 . Other formulas are interpreted as follows: ΔI I= , =M L ; (8) \ ,   ( ) (  ) ( ), I I I I I I I IA A C D C D C D C Dy h 1 g 2 (9) { | (( ) )}. , I IRC a b a b R b Cj 9 j 9 9 o 9 (10) { | (( ) )}. , I IRC a b a b R b Ci 9 i 9 9 9 (11) , where Δ – is a non-empty set, called the domain of interpretation, Програмні засоби аналітики даних [Введите текст] Integrity determines to what extent the IO is complete and correct from the point of view of the software object it represents. Integrity contributes to increasing trust in the IO [13]. Accuracy of reproduction determines the degree of accuracy of the reproduction of the IO of its original. For example, a text document reproducing an ancient book can accurately reproduce the text and completely ignore its artistic design. Timeliness indicates that the IO is introduced and updated on time, as this issue is specific to BD. This characteristic evaluates how quickly the set ( ), ( ), ( )s m p m o m in IO is updated compared to the real state of affairs. The characteristic is measured by the ratio of the actual delay time compared to the permissible one: ( , , ) exp real timedelay Timeliness IO s p o ected timedelay . (7) Origin is a characteristic of the quality of an IO. 
It indicates how well (correctly, completely, qualitatively) the entire prehistory of the origin and change of an IO is presented, and how accurately and during what period it is possible to trace the prehistory of the existence of an IO. This is an important characteristic since inference over semantic data depends on the data itself. Understanding the historical information about the data helps to determine the reasons for changing the system's behavior, which is not a trivial task in the BD environment. Susceptibility indicates how easily a person can understand and accept IO. It can be used to analyze which set of IO is most easily perceived by a group of persons due to the solved tasks. Practical aspects of assessment of the quality of BD. One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems include completeness, the integrity of data and lack of contradictions. The problem lies in that in the conditions of the BD, the time to detect such issues may exceed the time requirements for receiving a response to the information from the BD. That is why it is necessary to develop methods that will allow the detection of such problems at an early stage. There are various approaches to deal with the task, like the way to control all data entered into the system through the ontology. In practice, it is often not known what the data model should be since the requirements for the BD system can change as the data increases. These requirements can be constantly updated. This means that data previously entered into the BD management environment in the previously specified structure may not correspond to the quality model after some time. Identifying these problems due to the scale is a difficult problem. One of the criteria of the quality model is the ability of BD to give a quick response to user requests. The most effective method of increasing such speed is materialization [15]. 
Materialization can be used to improve performance at query time by making the required information explicit in advance. Thus, recalculation of the necessary information for each separate request is avoided. However, this method can be ineffective if there is excessive materialization. Consider a certain graph of semantic data G in which the connections between concepts are built on the basis of descriptive logic. We will briefly describe the DL, which is the basis for all DL of the family. means «Attributive Language with Complements». It is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the addition constructor (negation) was added. Syntax describes a set of correctly constructed language expressions, and semantics indicates their formal meaning. Let 1 , . . . , m CN A A і 1 , . . . , n RN R R be finite, non-empty sets of atomic concepts and atomic roles. The ALC syntax is defined as follows: − M and L are concepts; − an arbitrary atomic concept A is a concept; − if C is an arbitrary concept, then C , C Dh and C Dg are concepts. he corresponding constructors are called addition, intersection and union; − if C is a concept, R is an atomic role, then .RCj and .RCi are arbitrary concepts. semantics is defined through the concept of interpretation. An interpretation is a pair of . ), ( II , where Δ – is a non-empty set, called the domain of interpretation, Ia is an interpreting function that assigns the measure ΔIA 8 to each atomic A concept and R to each atomic role as an binary relation Δ ΔR ×I8 . Other formulas are interpreted as follows: ΔI I= , =M L ; (8) \ ,   ( ) (  ) ( ), I I I I I I I IA A C D C D C D C Dy h 1 g 2 (9) { | (( ) )}. , I IRC a b a b R b Cj 9 j 9 9 o 9 (10) { | (( ) )}. 
, I IRC a b a b R b Ci 9 i 9 9 9 (11) is an interpreting function that assigns the measure Програмні засоби аналітики даних [Введите текст] Integrity determines to what extent the IO is complete and correct from the point of view of the software object it represents. Integrity contributes to increasing trust in the IO [13]. Accuracy of reproduction determines the degree of accuracy of the reproduction of the IO of its original. For example, a text document reproducing an ancient book can accurately reproduce the text and completely ignore its artistic design. Timeliness indicates that the IO is introduced and updated on time, as this issue is specific to BD. This characteristic evaluates how quickly the set ( ), ( ), ( )s m p m o m in IO is updated compared to the real state of affairs. The characteristic is measured by the ratio of the actual delay time compared to the permissible one: ( , , ) exp real timedelay Timeliness IO s p o ected timedelay . (7) Origin is a characteristic of the quality of an IO. It indicates how well (correctly, completely, qualitatively) the entire prehistory of the origin and change of an IO is presented, and how accurately and during what period it is possible to trace the prehistory of the existence of an IO. This is an important characteristic since inference over semantic data depends on the data itself. Understanding the historical information about the data helps to determine the reasons for changing the system's behavior, which is not a trivial task in the BD environment. Susceptibility indicates how easily a person can understand and accept IO. It can be used to analyze which set of IO is most easily perceived by a group of persons due to the solved tasks. Practical aspects of assessment of the quality of BD. One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems include completeness, the integrity of data and lack of contradictions. 
The problem lies in that in the conditions of the BD, the time to detect such issues may exceed the time requirements for receiving a response to the information from the BD. That is why it is necessary to develop methods that will allow the detection of such problems at an early stage. There are various approaches to deal with the task, like the way to control all data entered into the system through the ontology. In practice, it is often not known what the data model should be since the requirements for the BD system can change as the data increases. These requirements can be constantly updated. This means that data previously entered into the BD management environment in the previously specified structure may not correspond to the quality model after some time. Identifying these problems due to the scale is a difficult problem. One of the criteria of the quality model is the ability of BD to give a quick response to user requests. The most effective method of increasing such speed is materialization [15]. Materialization can be used to improve performance at query time by making the required information explicit in advance. Thus, recalculation of the necessary information for each separate request is avoided. However, this method can be ineffective if there is excessive materialization. Consider a certain graph of semantic data G in which the connections between concepts are built on the basis of descriptive logic. We will briefly describe the DL, which is the basis for all DL of the family. means «Attributive Language with Complements». It is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the addition constructor (negation) was added. Syntax describes a set of correctly constructed language expressions, and semantics indicates their formal meaning. Let 1 , . . . , m CN A A і 1 , . . . , n RN R R be finite, non-empty sets of atomic concepts and atomic roles. 
The ALC syntax is defined as follows: − M and L are concepts; − an arbitrary atomic concept A is a concept; − if C is an arbitrary concept, then C , C Dh and C Dg are concepts. he corresponding constructors are called addition, intersection and union; − if C is a concept, R is an atomic role, then .RCj and .RCi are arbitrary concepts. semantics is defined through the concept of interpretation. An interpretation is a pair of . ), ( II , where Δ – is a non-empty set, called the domain of interpretation, Ia is an interpreting function that assigns the measure ΔIA 8 to each atomic A concept and R to each atomic role as an binary relation Δ ΔR ×I8 . Other formulas are interpreted as follows: ΔI I= , =M L ; (8) \ ,   ( ) (  ) ( ), I I I I I I I IA A C D C D C D C Dy h 1 g 2 (9) { | (( ) )}. , I IRC a b a b R b Cj 9 j 9 9 o 9 (10) { | (( ) )}. , I IRC a b a b R b Ci 9 i 9 9 9 (11) to each atomic A concept and R to each atomic role as an binary relation Програмні засоби аналітики даних [Введите текст] Integrity determines to what extent the IO is complete and correct from the point of view of the software object it represents. Integrity contributes to increasing trust in the IO [13]. Accuracy of reproduction determines the degree of accuracy of the reproduction of the IO of its original. For example, a text document reproducing an ancient book can accurately reproduce the text and completely ignore its artistic design. Timeliness indicates that the IO is introduced and updated on time, as this issue is specific to BD. This characteristic evaluates how quickly the set ( ), ( ), ( )s m p m o m in IO is updated compared to the real state of affairs. The characteristic is measured by the ratio of the actual delay time compared to the permissible one: ( , , ) exp real timedelay Timeliness IO s p o ected timedelay . (7) Origin is a characteristic of the quality of an IO. 
It indicates how well (correctly, completely, qualitatively) the entire prehistory of the origin and change of an IO is presented, and how accurately and during what period it is possible to trace the prehistory of the existence of an IO. This is an important characteristic since inference over semantic data depends on the data itself. Understanding the historical information about the data helps to determine the reasons for changing the system's behavior, which is not a trivial task in the BD environment. Susceptibility indicates how easily a person can understand and accept IO. It can be used to analyze which set of IO is most easily perceived by a group of persons due to the solved tasks. Practical aspects of assessment of the quality of BD. One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems include completeness, the integrity of data and lack of contradictions. The problem lies in that in the conditions of the BD, the time to detect such issues may exceed the time requirements for receiving a response to the information from the BD. That is why it is necessary to develop methods that will allow the detection of such problems at an early stage. There are various approaches to deal with the task, like the way to control all data entered into the system through the ontology. In practice, it is often not known what the data model should be since the requirements for the BD system can change as the data increases. These requirements can be constantly updated. This means that data previously entered into the BD management environment in the previously specified structure may not correspond to the quality model after some time. Identifying these problems due to the scale is a difficult problem. One of the criteria of the quality model is the ability of BD to give a quick response to user requests. The most effective method of increasing such speed is materialization [15]. 
Materialization can be used to improve performance at query time by making the required information explicit in advance. Thus, recalculation of the necessary information for each separate request is avoided. However, this method can be ineffective if there is excessive materialization. Consider a certain graph of semantic data G in which the connections between concepts are built on the basis of descriptive logic. We will briefly describe the DL, which is the basis for all DL of the family. means «Attributive Language with Complements». It is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the addition constructor (negation) was added. Syntax describes a set of correctly constructed language expressions, and semantics indicates their formal meaning. Let 1 , . . . , m CN A A і 1 , . . . , n RN R R be finite, non-empty sets of atomic concepts and atomic roles. The ALC syntax is defined as follows: − M and L are concepts; − an arbitrary atomic concept A is a concept; − if C is an arbitrary concept, then C , C Dh and C Dg are concepts. he corresponding constructors are called addition, intersection and union; − if C is a concept, R is an atomic role, then .RCj and .RCi are arbitrary concepts. semantics is defined through the concept of interpretation. An interpretation is a pair of . ), ( II , where Δ – is a non-empty set, called the domain of interpretation, Ia is an interpreting function that assigns the measure ΔIA 8 to each atomic A concept and R to each atomic role as an binary relation Δ ΔR ×I8 . Other formulas are interpreted as follows: ΔI I= , =M L ; (8) \ ,   ( ) (  ) ( ), I I I I I I I IA A C D C D C D C Dy h 1 g 2 (9) { | (( ) )}. , I IRC a b a b R b Cj 9 j 9 9 o 9 (10) { | (( ) )}. , I IRC a b a b R b Ci 9 i 9 9 9 (11) . 
Other formulas are interpreted as follows: Програмні засоби аналітики даних [Введите текст] Integrity determines to what extent the IO is complete and correct from the point of view of the software object it represents. Integrity contributes to increasing trust in the IO [13]. Accuracy of reproduction determines the degree of accuracy of the reproduction of the IO of its original. For example, a text document reproducing an ancient book can accurately reproduce the text and completely ignore its artistic design. Timeliness indicates that the IO is introduced and updated on time, as this issue is specific to BD. This characteristic evaluates how quickly the set ( ), ( ), ( )s m p m o m in IO is updated compared to the real state of affairs. The characteristic is measured by the ratio of the actual delay time compared to the permissible one: ( , , ) exp real timedelay Timeliness IO s p o ected timedelay . (7) Origin is a characteristic of the quality of an IO. It indicates how well (correctly, completely, qualitatively) the entire prehistory of the origin and change of an IO is presented, and how accurately and during what period it is possible to trace the prehistory of the existence of an IO. This is an important characteristic since inference over semantic data depends on the data itself. Understanding the historical information about the data helps to determine the reasons for changing the system's behavior, which is not a trivial task in the BD environment. Susceptibility indicates how easily a person can understand and accept IO. It can be used to analyze which set of IO is most easily perceived by a group of persons due to the solved tasks. Practical aspects of assessment of the quality of BD. One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems include completeness, the integrity of data and lack of contradictions. 
The problem lies in that in the conditions of the BD, the time to detect such issues may exceed the time requirements for receiving a response to the information from the BD. That is why it is necessary to develop methods that will allow the detection of such problems at an early stage. There are various approaches to deal with the task, like the way to control all data entered into the system through the ontology. In practice, it is often not known what the data model should be since the requirements for the BD system can change as the data increases. These requirements can be constantly updated. This means that data previously entered into the BD management environment in the previously specified structure may not correspond to the quality model after some time. Identifying these problems due to the scale is a difficult problem. One of the criteria of the quality model is the ability of BD to give a quick response to user requests. The most effective method of increasing such speed is materialization [15]. Materialization can be used to improve performance at query time by making the required information explicit in advance. Thus, recalculation of the necessary information for each separate request is avoided. However, this method can be ineffective if there is excessive materialization. Consider a certain graph of semantic data G in which the connections between concepts are built on the basis of descriptive logic. We will briefly describe the DL, which is the basis for all DL of the family. means «Attributive Language with Complements». It is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the addition constructor (negation) was added. Syntax describes a set of correctly constructed language expressions, and semantics indicates their formal meaning. Let 1 , . . . , m CN A A і 1 , . . . , n RN R R be finite, non-empty sets of atomic concepts and atomic roles. 
The ALC syntax is defined as follows: − M and L are concepts; − an arbitrary atomic concept A is a concept; − if C is an arbitrary concept, then C , C Dh and C Dg are concepts. he corresponding constructors are called addition, intersection and union; − if C is a concept, R is an atomic role, then .RCj and .RCi are arbitrary concepts. semantics is defined through the concept of interpretation. An interpretation is a pair of . ), ( II , where Δ – is a non-empty set, called the domain of interpretation, Ia is an interpreting function that assigns the measure ΔIA 8 to each atomic A concept and R to each atomic role as an binary relation Δ ΔR ×I8 . Other formulas are interpreted as follows: ΔI I= , =M L ; (8) \ ,   ( ) (  ) ( ), I I I I I I I IA A C D C D C D C Dy h 1 g 2 (9) { | (( ) )}. , I IRC a b a b R b Cj 9 j 9 9 o 9 (10) { | (( ) )}. , I IRC a b a b R b Ci 9 i 9 9 9 (11) (8) Програмні засоби аналітики даних [Введите текст] Integrity determines to what extent the IO is complete and correct from the point of view of the software object it represents. Integrity contributes to increasing trust in the IO [13]. Accuracy of reproduction determines the degree of accuracy of the reproduction of the IO of its original. For example, a text document reproducing an ancient book can accurately reproduce the text and completely ignore its artistic design. Timeliness indicates that the IO is introduced and updated on time, as this issue is specific to BD. This characteristic evaluates how quickly the set ( ), ( ), ( )s m p m o m in IO is updated compared to the real state of affairs. The characteristic is measured by the ratio of the actual delay time compared to the permissible one: ( , , ) exp real timedelay Timeliness IO s p o ected timedelay . (7) Origin is a characteristic of the quality of an IO. 
It indicates how well (correctly, completely, qualitatively) the entire prehistory of the origin and change of an IO is presented, and how accurately and during what period it is possible to trace the prehistory of the existence of an IO. This is an important characteristic since inference over semantic data depends on the data itself. Understanding the historical information about the data helps to determine the reasons for changing the system's behavior, which is not a trivial task in the BD environment. Susceptibility indicates how easily a person can understand and accept IO. It can be used to analyze which set of IO is most easily perceived by a group of persons due to the solved tasks. Practical aspects of assessment of the quality of BD. One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems include completeness, the integrity of data and lack of contradictions. The problem lies in that in the conditions of the BD, the time to detect such issues may exceed the time requirements for receiving a response to the information from the BD. That is why it is necessary to develop methods that will allow the detection of such problems at an early stage. There are various approaches to deal with the task, like the way to control all data entered into the system through the ontology. In practice, it is often not known what the data model should be since the requirements for the BD system can change as the data increases. These requirements can be constantly updated. This means that data previously entered into the BD management environment in the previously specified structure may not correspond to the quality model after some time. Identifying these problems due to the scale is a difficult problem. One of the criteria of the quality model is the ability of BD to give a quick response to user requests. The most effective method of increasing such speed is materialization [15]. 
Materialization can be used to improve performance at query time by making the required information explicit in advance. Thus, recalculation of the necessary information for each separate request is avoided. However, this method can be ineffective if there is excessive materialization. Consider a certain graph of semantic data G in which the connections between concepts are built on the basis of descriptive logic. We will briefly describe the DL, which is the basis for all DL of the family. means «Attributive Language with Complements». It is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the addition constructor (negation) was added. Syntax describes a set of correctly constructed language expressions, and semantics indicates their formal meaning. Let 1 , . . . , m CN A A і 1 , . . . , n RN R R be finite, non-empty sets of atomic concepts and atomic roles. The ALC syntax is defined as follows: − M and L are concepts; − an arbitrary atomic concept A is a concept; − if C is an arbitrary concept, then C , C Dh and C Dg are concepts. he corresponding constructors are called addition, intersection and union; − if C is a concept, R is an atomic role, then .RCj and .RCi are arbitrary concepts. semantics is defined through the concept of interpretation. An interpretation is a pair of . ), ( II , where Δ – is a non-empty set, called the domain of interpretation, Ia is an interpreting function that assigns the measure ΔIA 8 to each atomic A concept and R to each atomic role as an binary relation Δ ΔR ×I8 . Other formulas are interpreted as follows: ΔI I= , =M L ; (8) \ ,   ( ) (  ) ( ), I I I I I I I IA A C D C D C D C Dy h 1 g 2 (9) { | (( ) )}. , I IRC a b a b R b Cj 9 j 9 9 o 9 (10) { | (( ) )}. 
, I IRC a b a b R b Ci 9 i 9 9 9 (11) (9) Програмні засоби аналітики даних [Введите текст] Integrity determines to what extent the IO is complete and correct from the point of view of the software object it represents. Integrity contributes to increasing trust in the IO [13]. Accuracy of reproduction determines the degree of accuracy of the reproduction of the IO of its original. For example, a text document reproducing an ancient book can accurately reproduce the text and completely ignore its artistic design. Timeliness indicates that the IO is introduced and updated on time, as this issue is specific to BD. This characteristic evaluates how quickly the set ( ), ( ), ( )s m p m o m in IO is updated compared to the real state of affairs. The characteristic is measured by the ratio of the actual delay time compared to the permissible one: ( , , ) exp real timedelay Timeliness IO s p o ected timedelay . (7) Origin is a characteristic of the quality of an IO. It indicates how well (correctly, completely, qualitatively) the entire prehistory of the origin and change of an IO is presented, and how accurately and during what period it is possible to trace the prehistory of the existence of an IO. This is an important characteristic since inference over semantic data depends on the data itself. Understanding the historical information about the data helps to determine the reasons for changing the system's behavior, which is not a trivial task in the BD environment. Susceptibility indicates how easily a person can understand and accept IO. It can be used to analyze which set of IO is most easily perceived by a group of persons due to the solved tasks. Practical aspects of assessment of the quality of BD. One of the most challenging tasks in achieving data quality metrics is the early detection of data-related problems. Typical problems include completeness, the integrity of data and lack of contradictions. 
The problem lies in that in the conditions of the BD, the time to detect such issues may exceed the time requirements for receiving a response to the information from the BD. That is why it is necessary to develop methods that will allow the detection of such problems at an early stage. There are various approaches to deal with the task, like the way to control all data entered into the system through the ontology. In practice, it is often not known what the data model should be since the requirements for the BD system can change as the data increases. These requirements can be constantly updated. This means that data previously entered into the BD management environment in the previously specified structure may not correspond to the quality model after some time. Identifying these problems due to the scale is a difficult problem. One of the criteria of the quality model is the ability of BD to give a quick response to user requests. The most effective method of increasing such speed is materialization [15]. Materialization can be used to improve performance at query time by making the required information explicit in advance. Thus, recalculation of the necessary information for each separate request is avoided. However, this method can be ineffective if there is excessive materialization. Consider a certain graph of semantic data G in which the connections between concepts are built on the basis of descriptive logic. We will briefly describe the DL, which is the basis for all DL of the family. means «Attributive Language with Complements». It is defined in [16]. The language is based on the previously introduced language AL (Attributive Language), to which the addition constructor (negation) was added. Syntax describes a set of correctly constructed language expressions, and semantics indicates their formal meaning. Let 1 , . . . , m CN A A і 1 , . . . , n RN R R be finite, non-empty sets of atomic concepts and atomic roles. 
The $\mathcal{ALC}$ syntax is defined as follows:
− $\top$ and $\bot$ are concepts;
− an arbitrary atomic concept $A$ is a concept;
− if $C$ and $D$ are arbitrary concepts, then $\neg C$, $C \sqcap D$ and $C \sqcup D$ are concepts. The corresponding constructors are called complement, intersection and union;
− if $C$ is a concept and $R$ is an atomic role, then $\exists R.C$ and $\forall R.C$ are concepts.

The semantics of $\mathcal{ALC}$ is defined through the concept of interpretation. An interpretation is a pair $I = (\Delta, \cdot^I)$, where $\Delta$ is a non-empty set called the domain of interpretation, and $\cdot^I$ is an interpretation function that assigns to each atomic concept $A$ a set $A^I \subseteq \Delta$ and to each atomic role $R$ a binary relation $R^I \subseteq \Delta \times \Delta$. Other formulas are interpreted as follows:

$$\top^I = \Delta, \quad \bot^I = \emptyset; \qquad (8)$$

$$(\neg C)^I = \Delta \setminus C^I, \quad (C \sqcap D)^I = C^I \cap D^I, \quad (C \sqcup D)^I = C^I \cup D^I; \qquad (9)$$

$$(\exists R.C)^I = \{ a \in \Delta \mid \exists b \in \Delta \, ((a, b) \in R^I \wedge b \in C^I) \}; \qquad (10)$$

$$(\forall R.C)^I = \{ a \in \Delta \mid \forall b \in \Delta \, ((a, b) \in R^I \rightarrow b \in C^I) \}. \qquad (11)$$

Next, the essence of the terminology (TBox) is revealed for the DL $\mathcal{ALC}$.
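The interpretation rules (8)-(11) can be checked mechanically on a small finite interpretation. The sketch below is only an illustration: the nested-tuple encoding of concepts, the function name `extension` and the sample individuals are assumptions made for this example.

```python
def extension(concept, delta, conc, role):
    """Extension of an ALC concept over a finite interpretation.

    delta: the domain; conc: atomic concept name -> set of elements;
    role: role name -> set of pairs. Concepts are nested tuples
    (an encoding assumed for this sketch).
    """
    def ev(c):
        op = c[0]
        if op == "top":
            return set(delta)
        if op == "bot":
            return set()
        if op == "atom":
            return set(conc[c[1]])
        if op == "not":
            return set(delta) - ev(c[1])
        if op == "and":
            return ev(c[1]) & ev(c[2])
        if op == "or":
            return ev(c[1]) | ev(c[2])
        if op in ("some", "all"):
            r, filler = role[c[1]], ev(c[2])
            if op == "some":  # rule (10)
                return {a for a in delta
                        if any((a, b) in r and b in filler for b in delta)}
            # rule (11): the implication is written as "not ... or ..."
            return {a for a in delta
                    if all((a, b) not in r or b in filler for b in delta)}
        raise ValueError(f"unknown constructor: {op}")

    return ev(concept)


# Hypothetical interpretation.
delta = {"doc1", "doc2", "ann"}
conc = {"Article": {"doc1", "doc2"}, "Person": {"ann"}}
role = {"hasAuthor": {("doc1", "ann")}}

some_author = ("some", "hasAuthor", ("atom", "Person"))
all_author = ("all", "hasAuthor", ("atom", "Person"))
print(sorted(extension(some_author, delta, conc, role)))  # ['doc1']
print(sorted(extension(all_author, delta, conc, role)))   # ['ann', 'doc1', 'doc2']
```

The second result shows that the universal restriction holds vacuously for elements that have no followers at all.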
However, all the introduced concepts are easily transferred to other DLs. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their membership in concepts and roles), the DL offers a system of facts about individuals, or ABox. For this purpose, a set $IN$ of individual names is introduced into the DL. There are two types of facts: a statement that an individual $a$ belongs to a concept $C$ (written $C(a)$), and a statement that a pair of individuals $a$ and $b$ belongs to a role $R$ (written $R(a, b)$). A system of facts, or ABox, is a finite set of statements of the form $C(a)$ and $R(a, b)$, where $a, b \in IN$ are individuals, $C$ is an arbitrary concept and $R$ is a role.

Below are some $\mathcal{ALC}$ extensions that were used to fulfill the tasks of the dissertation work. An R-follower of $e$ is an individual that stands on the right-hand side of the role $R$ for $e$. We denote the set of R-followers of $e \in \Delta$ by $R^I(e)$, where

$$R^I(e) = \{ d \mid (e, d) \in R^I \}.$$

We denote the cardinality of this set by $|R^I(e)|$.

The following constructors are called numerical role constraints. If $R$ is a role, $C$ is a concept and $n \ge 0$ is a natural number, then:
− $(\le 1\, R)$ is a concept for the functionality restriction;
− $(\ge n\, R)$ and $(\le n\, R)$ are concepts for the quantitative restriction;
− $(\ge n\, R.C)$ and $(\le n\, R.C)$ are concepts for the qualitative restriction.

These constructors are interpreted as follows:

$$(\le 1\, R)^I = \{ e \mid |R^I(e)| \le 1 \}, \qquad (12)$$

$$(\ge n\, R)^I = \{ e \mid |R^I(e)| \ge n \}, \qquad (13)$$

$$(\le n\, R)^I = \{ e \mid |R^I(e)| \le n \}, \qquad (14)$$

$$(\ge n\, R.C)^I = \{ e \mid |R^I(e) \cap C^I| \ge n \}, \qquad (15)$$

$$(\le n\, R.C)^I = \{ e \mid |R^I(e) \cap C^I| \le n \}. \qquad (16)$$

There are cases when, in order to describe the real world, it is necessary to describe specific characteristics of an object, for example, the number of pages in an information resource. To solve this problem, a concrete domain with a fixed set of predicates is introduced (Lutz, 2002). A concrete domain is a pair $(D, \Phi)$, where $D$ is a non-empty set and $\Phi$ is a set of predicates on $D$.
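Before the concrete domain is developed further, the R-follower sets and the numerical role constraints (12)-(16) can be made concrete on a finite interpretation. This is a hedged sketch: the helper names `followers`, `at_least` and `at_most`, and the sample role and individuals, are hypothetical.

```python
def followers(role_ext, e):
    """The set of R-followers of e: all d with (e, d) in the role extension."""
    return {d for (x, d) in role_ext if x == e}


def _count(role_ext, e, c_ext):
    """Number of R-followers of e, optionally only those inside c_ext."""
    succ = followers(role_ext, e)
    return len(succ if c_ext is None else succ & c_ext)


def at_least(n, role_ext, delta, c_ext=None):
    """Elements with at least n R-followers (inside c_ext if it is given)."""
    return {e for e in delta if _count(role_ext, e, c_ext) >= n}


def at_most(n, role_ext, delta, c_ext=None):
    """Elements with at most n R-followers (inside c_ext if it is given)."""
    return {e for e in delta if _count(role_ext, e, c_ext) <= n}


# Hypothetical interpretation.
delta = {"doc1", "doc2", "ann", "bob"}
has_author = {("doc1", "ann"), ("doc1", "bob"), ("doc2", "ann")}
person = {"ann", "bob"}

print(sorted(followers(has_author, "doc1")))            # ['ann', 'bob']
print(sorted(at_least(2, has_author, delta)))           # ['doc1']
print(sorted(at_most(1, has_author, delta)))            # ['ann', 'bob', 'doc2']
print(sorted(at_least(1, has_author, delta, person)))   # ['doc1', 'doc2']
```

Calling `at_most(1, ...)` without a filler concept corresponds to the functionality restriction: it keeps exactly the elements on which the role behaves like a partial function.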
We assume that a set of predicate symbols $PN$ is given, where each predicate symbol $P \in PN$ has an arity $n$ and is associated with an $n$-ary relation $P \subseteq D^n$. Note that $\Phi$ always contains a unary predicate $\top_D$, that is, $PN$ always includes the symbol $\top_D$, which is interpreted as $D$. $\Phi$ is also always closed under complement, that is, for every $n$-ary predicate symbol $P$ in $PN$ there is an $n$-ary predicate symbol $\overline{P}$ in $PN$, which is interpreted as $D^n \setminus P$.

Let a concrete domain $D$ with a set of predicate symbols $PN$ be given. Also let finite sets of symbols be given: $CN$ of atomic concepts, $RN$ of atomic roles, $AF \subseteq RN$ of atomic abstract attributes, and $CF$ of atomic concrete attributes. A sequence $f_1 \dots f_k h$, $k \ge 1$, of atomic abstract attributes $f_i \in AF$ followed by one concrete attribute $h \in CF$ is called a complex concrete attribute. Concepts of the logic $\mathcal{ALC}(D)$ are defined by the grammar (Lutz, 2002):

$$\top \mid \bot \mid A \mid \neg C \mid C \sqcap D \mid C \sqcup D \mid \exists R.C \mid \forall R.C \mid \exists u_1, \dots, u_n.P \qquad (17)$$

where $A \in CN$, $R \in RN$, $u_1, \dots, u_n$ are arbitrary attributes, and $P \in PN$ is an $n$-ary concrete predicate. The semantics of the logic is given by an interpretation $I = (\Delta, \cdot^I)$ with the following additions:
- the sets $\Delta$ and $D$ must not intersect;
- each atomic abstract attribute $f \in AF$ is assigned a partial function $f^I : \Delta \to \Delta$;
To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . 
It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; is a concept for qualitative limitation. The following constructors are interpreted as follows: Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. 
Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. 
A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; (12) Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. 
The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; (13) Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . 
However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . 
It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; (14) Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). 
A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. 
Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; (15) Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. 
If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. 
The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; (16) 265 Програмні засоби аналітики даних There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. 
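The interpretation of the numerical role constraints (12)-(16) can be checked mechanically on a small finite interpretation. The following is a minimal sketch, not the author's implementation: the function names (`followers`, `at_least`, `at_most`, `functional`) and the example data (`domain`, `has_author`, `person`) are hypothetical and chosen only for illustration.

```python
from typing import Optional, Set, Tuple

Role = Set[Tuple[str, str]]  # extension R^I of a role: a set of pairs (e, d)


def followers(role: Role, e: str) -> Set[str]:
    """R^I(e): the set of R-followers of the individual e."""
    return {d for (x, d) in role if x == e}


def _count(role: Role, e: str, concept: Optional[Set[str]]) -> int:
    """|R^I(e)|, or |R^I(e) ∩ C^I| when a concept extension is given."""
    fs = followers(role, e)
    return len(fs if concept is None else fs & concept)


def at_least(n: int, role: Role, domain: Set[str],
             concept: Optional[Set[str]] = None) -> Set[str]:
    """(>= n R)^I as in (13), or (>= n R.C)^I as in (15)."""
    return {e for e in domain if _count(role, e, concept) >= n}


def at_most(n: int, role: Role, domain: Set[str],
            concept: Optional[Set[str]] = None) -> Set[str]:
    """(<= n R)^I as in (14), or (<= n R.C)^I as in (16)."""
    return {e for e in domain if _count(role, e, concept) <= n}


def functional(role: Role, domain: Set[str]) -> Set[str]:
    """(<= 1 R)^I as in (12)."""
    return at_most(1, role, domain)


# Hypothetical example: two resources r1, r2 and three candidate authors.
domain = {"r1", "r2", "a", "b", "c"}
has_author = {("r1", "a"), ("r1", "b"), ("r2", "c")}
person = {"a", "c"}  # extension C^I of an assumed concept

print(sorted(at_least(2, has_author, domain)))          # ['r1']
print(sorted(at_least(1, has_author, domain, person)))  # ['r1', 'r2']
```

Representing a role as a set of pairs keeps each function a direct transcription of the corresponding set-builder definition. The approach only works for finite, explicitly listed interpretations and is not a reasoning procedure.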
The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; , where D is a non-empty set and Ф is a set of predicates in the D. 
It can be assumed that given a set of predicate symbols PN where each predi- cate symbol Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). 
A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; is associated with an n arity and Ф maps an n-relation to it as Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. 
There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . 
Let a concrete domain $\mathcal{D}$ with a set of predicate symbols PN be given. Also let finite sets of symbols be given: CN are atomic concepts, RN are atomic roles, $\mathrm{AF} \subseteq \mathrm{RN}$ are atomic abstract attributes, and CF are atomic concrete attributes. A sequence $f_1 \dots f_k h$ with $k \geq 1$, atomic abstract attributes $f_i \in \mathrm{AF}$ and one concrete attribute $h \in \mathrm{CF}$ is called a complex concrete attribute. Concepts of the logic $\mathcal{ALC}(\mathcal{D})$ are defined by the grammar (Lutz, 2002):

$$A \mid \neg C \mid C \sqcap D \mid C \sqcup D \mid \exists R.C \mid \forall R.C \mid \exists u_1, \dots, u_n.P \tag{17}$$

where $A \in \mathrm{CN}$, $R \in \mathrm{RN}$, $u_1, \dots, u_n$ are arbitrary attributes, and $P \in \mathrm{PN}$ is an $n$-ary concrete predicate.

The semantics of $\mathcal{ALC}(\mathcal{D})$ is given by an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ with the following additions:
- the sets $\Delta^{\mathcal{I}}$ and $\Delta_{\mathcal{D}}$ must not intersect;
- each atomic abstract attribute $f \in \mathrm{AF}$ is assigned a partial function $f^{\mathcal{I}} : \Delta^{\mathcal{I}} \to \Delta^{\mathcal{I}}$.
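How a complex concrete attribute and a concept of the form $\exists u_1, \dots, u_n.P$ behave over a finite interpretation can be sketched in a few lines of Python. This is an illustrative sketch only, under assumptions not found in the article: attributes are stored as dictionaries (partial functions), abstract individuals are strings and concrete values are integers (so the two sets are disjoint), and the names `follow`, `exists_pred`, `has_part`, `pages` and the sample individuals are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Set, Union

Value = Union[str, int]      # abstract individuals: str, concrete values: int
Feature = Dict[str, Value]   # a partial function stored as a dictionary


def follow(chain: Sequence[Feature], start: str) -> Optional[Value]:
    """Apply f_1, ..., f_k, h in order; None where the chain is undefined."""
    current: Value = start
    for feature in chain:
        if current not in feature:
            return None
        current = feature[current]
    return current


def exists_pred(domain: Set[str],
                chains: Sequence[Sequence[Feature]],
                predicate: Callable[..., bool]) -> Set[str]:
    """Extension of 'exists u_1, ..., u_n . P' over a finite domain."""
    result = set()
    for e in domain:
        values = [follow(chain, e) for chain in chains]
        # Every attribute chain must be defined on e and P must hold.
        if all(v is not None for v in values) and predicate(*values):
            result.add(e)
    return result


# Assumed sample interpretation: three information resources.
domain = {"report", "appendix", "note"}
has_part: Feature = {"report": "appendix"}         # abstract attribute
pages: Feature = {"report": 120, "appendix": 15}   # concrete attribute

# Resources with at least 100 pages.
long_resources = exists_pred(domain, [[pages]], lambda x: x >= 100)

# Resources that have more pages than their part (chains: pages; has_part pages).
bigger_than_part = exists_pred(
    domain, [[pages], [has_part, pages]], lambda x, y: x > y
)
```

With the sample data both `long_resources` and `bigger_than_part` contain only `"report"`. The individual `"note"` has no `pages` value, so every chain is undefined on it and it falls outside both extensions, which reflects the partiality of attribute functions.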
We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. 
The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . 
(16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles, Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. 
Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . 
It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; are atomic abstract attributes, CF are atomic concrete attributes. A sequence of Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. 
Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. 
A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; з k≥1 with atomic abstract attributes Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. 
If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. 
The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; and one concrete attribute Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . 
(16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; will be called a complex, concrete attribute. Concepts of Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. 
To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . 
It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; logic are defined by grammar (Lutz, 2002): Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. 
Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. 
A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; (17) where Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. 
The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . It can be assumed that given a set of predicate symbols PN where each predicate symbol  P PN is associated with an n arity and maps an n-relation to it as nP D8 . It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; are arbitrary attributes, Програмні засоби аналітики даних Next the essence of the (TBox ) terminology is revealed for DL . 
However, all introduced concepts are easily transferred to other DL. Terminologies describe general knowledge about concepts and roles. To describe knowledge about specific individuals (their belonging to concepts and roles), the DL offers a system of facts about individuals or ABox. For this, a set of names of individuals is entered into the DL. There are two types of facts: a statement about an individual's belonging to a concept (written as C a ); the statement about the belonging of a pair of individuals a and b and a role (written as ,R a b ). A system of facts or ABox is a finite set of statements of form C a and ,R a b , where a and b IN are individuals, C is an arbitrary concept and R is a role. Here are some ALC extensions that were used to fulfill the tasks of the dissertation work. R-follower is an individual who is the right part of the role R. We denote the set of R-followers for e that can be written as ( )IR e , where e : ( ) | ,I IR e d e d R . We denote the power of such a set by I|R |e . The following constructors are called numerical role constraints. If R is a concept, n and 0 is a natural number, then: − 1R is a concept for limitation of functionality; − nR and nR is a concept for quantitative limitation; − .nRC and .nRC is a concept for qualitative limitation. The following constructors are interpreted as follows: 1 | ( ) 1 I IR e R e , (12) | ( ) I InR e R e n , (13) | ( ) I InR e R e n , (14) . | ( ) I I InRC e R e C n , (15) . | ( ) I I InRC e R e C n1 . (16) There are cases when it is necessary to describe specific characteristics of an object In order to describe the real world, for example, the number of pages in an information resource. To solve this problem, a specific area with a fixed set of predicates is created (Lutz, 2002). A concrete domain is a pair ,D  , where D is a non-empty set and is a set of predicates in the D . 
It is assumed that a set of predicate symbols PN is given, where each predicate symbol $P \in PN$ has an arity $n$ and is interpreted as an $n$-ary relation $P^{\mathcal{D}} \subseteq \Delta_{\mathcal{D}}^{n}$. Note that $\Phi$ always contains the unary predicate $\top_{\mathcal{D}}$, that is, PN always includes the symbol $\top_{\mathcal{D}}$, which is interpreted as $\Delta_{\mathcal{D}}$. Also, $\Phi$ is always closed under complement, that is, for every $n$-ary predicate symbol $P$ in PN there is an $n$-ary predicate symbol $\overline{P}$ in PN, which is interpreted as $\Delta_{\mathcal{D}}^{n} \setminus P^{\mathcal{D}}$.

Let a concrete domain $\mathcal{D}$ with a set of predicate symbols PN be given. Also let finite sets of symbols be given: CN are atomic concepts, RN are atomic roles, $AF \subseteq RN$ are atomic abstract attributes, and CF are atomic concrete attributes. A sequence $u = f_1 \dots f_k h$ with $k \geq 1$, atomic abstract attributes $f_i \in AF$ and one concrete attribute $h \in CF$ is called a composite concrete attribute. Concepts of the logic are defined by the grammar (Lutz, 2002):

$$C, D ::= \top \mid \bot \mid A \mid \neg C \mid C \sqcap D \mid C \sqcup D \mid \exists R.C \mid \forall R.C \mid \exists u_1, \dots, u_n.P \qquad (17)$$

where $A \in CN$, $R \in RN$, $u_1, \dots, u_n$ are arbitrary attributes, and $P \in PN$ is an $n$-ary concrete predicate.

The semantics of the logic is given by an interpretation $\mathcal{I} = (\Delta, \cdot^{\mathcal{I}})$ with the following additions:
- the sets $\Delta$ and $\Delta_{\mathcal{D}}$ must not intersect;
- each atomic abstract attribute $f \in AF$ is assigned a partial function $f^{\mathcal{I}}: \Delta \to \Delta$;
It should be noted that always contains a single predicateD , that is PN always includes M symbol and is interpreted as DM . Also is always closed with respect to the complement, that is for every n-predicate symbol P in PN there is an n-predicate symbol in P, which is interpreted as \n P . Let be a given concrete area D with a set of predicate symbols PN. Also let a finite set of symbols be given: CN are atomic concepts, RN are atomic roles,AF RN8 are atomic abstract attributes, CF are atomic concrete attributes. A sequence of 1 k f f h з k≥1 with atomic abstract attributes i f AF and one concrete attributeh CF will be called a complex, concrete attribute. Concepts of logic are defined by grammar (Lutz, 2002): 1 | | | | | | |. . . n A C D C D RC RC u u PML y h g j i j (17) where A CN , R RN , 1 ,..., n u u are arbitrary attributes, P PN is the n-concrete predicate. The semantics of logic is considered as .,( )II interpretation with the following additions: - sets Δ and D must not intersect; - each atomic abstract attribute f AF is assigned a partial function :If ; ; – each atomic abstract attribute Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . 
I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. 
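The count $C_6^2 = 15$ used above, and the way it grows with the number of components, can be checked with a few lines of Python. This is only an illustrative sketch: the labels `slot1` ... `slot6` and the list of larger sizes are hypothetical, and the figures 6 and 2 are taken from the formula as given in the text.

```python
from itertools import combinations
from math import comb  # available since Python 3.8

# Hypothetical labels for the 6 items from which 2 are chosen.
slots = [f"slot{i}" for i in range(1, 7)]

# Enumerate every unordered pair explicitly, as a materialization would.
pairs = list(combinations(slots, 2))
pair_count = len(pairs)

# The same number in closed form: C(6, 2).
closed_form = comb(6, 2)

# Illustrative growth of the number of pairs when the number of items doubles.
growth = {n: comb(n, 2) for n in (6, 12, 24, 48)}

for n, count in growth.items():
    print(f"{n} items -> {count} pairs to materialize")
```

The number of pairs grows quadratically in the number of items, and each additional constraint (such as the maximum supported memory size) multiplies the relationships that a materialized graph has to store.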
This means that the complexity depends on the size of the input data, so to solve the problems of inference and concept satisfiability it is necessary to reduce the set of input data. To avoid this problem, it is proposed to divide the knowledge base, which traditionally consists of a TBox and an ABox, into two components, so that the subject area is described by a DL and then (Fig. 4.2)
The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. 
RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) (19) Indeed, the condition Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . 
So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. 
To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) means that either one of functions Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 
4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) is undefined at point е or the tuple Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . 
: , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. 
Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) does not belong to the predicate Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. 
For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) but belongs to its complement Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . 
As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. 
RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) . So, the G graph we have is given by BD Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . 
$$C, D ::= \top \mid \bot \mid A \mid \neg C \mid C \sqcap D \mid C \sqcup D \mid \exists R.C \mid \forall R.C \mid {\geq} n\,R \mid {\leq} n\,R \mid {\geq} n\,R.C \mid {\leq} n\,R.C \mid \exists u_1, \dots, u_n.P. \tag{20}$$
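The semantics in (18) and the equivalence (19) can be checked on a toy interpretation. The following is a minimal sketch, not the author's implementation: the elements `pc1`, `pc2`, `pc3`, the attribute values, and the predicate `fits` are all hypothetical, and partial functions are modelled as dictionaries in which a missing key means "undefined".

```python
from typing import Callable, Dict, Iterable, Sequence, Set

# A partial function from abstract elements to concrete values:
# a missing key means the attribute is undefined at that element.
Attr = Dict[str, int]


def exists_p(domain: Iterable[str], attrs: Sequence[Attr],
             pred: Callable[..., bool]) -> Set[str]:
    """Elements where every attribute is defined and the predicate holds, as in (18)."""
    return {
        e for e in domain
        if all(e in u for u in attrs) and pred(*(u[e] for u in attrs))
    }


def defined(domain: Iterable[str], attr: Attr) -> Set[str]:
    """Elements where the attribute is defined (the 'always true' predicate)."""
    return {e for e in domain if e in attr}


# Hypothetical toy interpretation.
domain = {"pc1", "pc2", "pc3"}
ram_size = {"pc1": 32, "pc2": 12}               # undefined for pc3
max_memory = {"pc1": 32, "pc2": 8, "pc3": 128}  # defined everywhere


def fits(size: int, limit: int) -> bool:
    """Hypothetical binary predicate P: the RAM size does not exceed the limit."""
    return size <= limit


def not_fits(size: int, limit: int) -> bool:
    """Complement of P on the concrete domain."""
    return not fits(size, limit)


attrs = [ram_size, max_memory]

# Left-hand side of (19): complement of the existential concept.
lhs = domain - exists_p(domain, attrs, fits)

# Right-hand side of (19): complement predicate holds, or some attribute is undefined.
rhs = exists_p(domain, attrs, not_fits)
for u in attrs:
    rhs |= domain - defined(domain, u)

print(sorted(lhs), sorted(rhs))
```

Here `pc2` falls into the complement because the predicate fails, and `pc3` because one attribute is undefined, which are exactly the two cases named in the explanation of (19).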
When a materialization is built, rules are set according to which it should be constructed. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, take two computer components, the motherboard and the RAM. The concept that determines the compatibility of these two components is defined as follows:

$$\mathit{RamDDR4\_2\_MaimboardDDR4} \equiv (\exists\,\mathit{hasSlotTypeDDR4}.\mathit{MainboardSlotTypeDDR4}) \sqcap (\exists\,\mathit{hasMemory}.\mathit{RamDDR4}) \tag{21}$$
The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . 
A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . So, the G graph we have is given by BD 1 | | | | | | | 1 | | | . | . | . . . n R nR P A C D C D RC R C C un n uR nRC R ML y h g j i j . (20) When building a materialization, rules are set according to which it should be built. Consider the problem of excessive materialization, which can be caused by the following way of constructing concepts. For example, let's take the computer components motherboard and RAM. The concept that will determine the compatibility of these two components will be defined as follows: 4_2_ 4 ) ( ).( 4 1 . 4hasSlotTypeDDR R h amDDR Mai a mboardDDR eMemory MainboarsSlotTyp DDR dg h g (21) RAM Main Board Slots Ram Model 1 DDR4 MainBoard Model 1 DDR 4 2 Ram Model 2 DDR4 MainBoard Model 2 DDR 4 4 As a result of the materialization, we will get the next G graph that will be set 2 6 15C possible combinations that will determine the concept 4_2_ 4RamDDR MaimboardDDR . If we take into account that the motherboard also has limitations in terms of supporting the maximum size of RAM and the real situation will become even more complicated. 
RAM RAM Size Main Board RAM Slots Max Memory support Ram Model 1 DDR4 32 MainBoard Model 1 DDR 4 2 32 Ram Model 2 DDR4 12 MainBoard Model 2 DDR 4 4 128 Such dependence means that even with a small number of components, the knowledge base representation system will have to store a huge number of relationships that will determine the materialization. Accordingly, the inference on such a graph will work very slowly due to the huge number of combinations that form nodes of the graph available for search, as stated in (Lutz, 2002), such an inference problem belongs to the P Space class. This means that the complexity depends on the size of the input data and to solve the problems of inference and feasibility of concepts, it is necessary to reduce the set of input data. To avoid such a problem, it is proposed to divide the knowledge base, which traditionally consists of TBox and ABox into two components, so that the subject area is described DL and then (Pic. 4.2) possible combinations that will determine the concept Програмні засоби аналітики даних [Введите текст] - each atomic abstract attribute h CF is assigned a partial function :If D . A composite concrete attribute 1 k u f f h is interpreted as a composition of partial functions 1 (I I I I k u x h f f x . As a result, a partial function :Iu D is formed. The only new (compared to ) type of concept is interpreted as follows: 1 1 1 1 1 ( ) { | } . : , , I I n n I D n n n u u P e x x D u e x u e x x x P j j o o o . (18) The set of points on which the attribute u is defined is expressed by the concept u., where  is a specific predicate that is always present in the PN signature. The following equivalence is valid: 1 1 1 1 , , . . . , , . . n n u u P u u u u Py y Mg h y Mg y (19) Indeed, the condition 1 ( ), , . I n e u u Py j means that either one of functions I i u is undefined at point е or the tuple 1 , ,I I n u e u e does not belong to the predicate P , P, but belongs to its complement Py . 
[Figure labels: ABox, TBox, Knowledge base, Description logic, Reasoning; Big data, Rules, Abox(D), Tbox(D), Knowledge base, Description logic, Reasoning]

Pic. 4.2. Knowledge base with separation

Thus, the knowledge base consists of two TBoxes $\mathcal{T}$, $\mathcal{T}_D$ and two ABoxes $\mathcal{A}$, $\mathcal{A}_D$: $\mathcal{K} = (\mathcal{T}, \mathcal{T}_D, \mathcal{A}, \mathcal{A}_D)$. An interpretation $I$ satisfies $\mathcal{K}$ if $I \models \mathcal{T}, \mathcal{T}_D$ and $I \models \mathcal{A}, \mathcal{A}_D$. In this case $\mathcal{K}$ is called satisfiable, and the interpretation $I$ is called a model of $\mathcal{K}$, written $I \models \mathcal{K}$.

5. Estimating the quality of metadata and an information object family in Big Data

Metadata quality assessment is intended to find out to what extent the metadata or metadata schemas present in a BD meet the tasks that were set for the BD when it was designed. High-quality metadata contributes to the proper functioning of the semantics in the BD. The quality of metadata affects many processes related to the use of inference and to building connections between IO descriptions, as well as IO input, storage, identification, search and access.

There are two aspects of quality related to metadata. The first refers to IO metadata: what the IO metadata is, how fully it describes the IO, and whether it meets a certain metadata schema standard. The second refers to the metadata schema: whether the schema is standard, and to what extent the chosen schema meets the needs of describing IOs in a specific subject area. The quality of both aspects is described below.

Compliance with the standard. This characteristic indicates whether a standard IO metadata description schema is used. The use of a standard metadata schema is fundamental to organizing the search and retrieval of knowledge. The presence in the DB of IOs whose metadata do not meet, or only partly meet, the standard significantly hampers the resolution of fundamentally important issues facing the DB and reduces its quality. The measure of compliance with the standard can be derived from the ratio of the number of non-standard metadata to the total number of metadata used in the description of the IO (Novitsky, et al., 2016):

$$Standard(IO) = 1 - \frac{w(IO(md))}{n(IO(md))}, \tag{22}$$

where $n(IO(md))$ is the total number of IO metadata, and $w(IO(md))$ is the number of metadata that do not meet the standard adopted for this BD model.

The completeness of the description of the IO in relation to the metadata schema. This characteristic indicates the extent to which the metadata schema is fully used to describe the IO. Note that not all metadata of the selected schema can be applied to some types of IOs. Several metadata schemas can be used simultaneously in the DB network, but completeness is determined relative to only those metadata that participate in the construction of semantic links between IOs. Therefore, the degree of completeness of the description of the IO, according to the selected metadata schema MS, is determined as follows:

$$Completeness(IO, MS) = \frac{Present(IO(md))}{Required(IO(md))}, \tag{23}$$

where md is metadata, MS is the metadata schema, Present(md) is the number of metadata required to describe the IO that are actually present in the IO description, and Required(md) is the total number of MS metadata required to describe the IO.
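The two measures (22) and (23) can be sketched in a few lines of code. This is only an illustration under stated assumptions: an IO description is modelled as a dictionary from field name to value, an empty value counts as absent, and the field names and the sets `STANDARD_FIELDS` and `REQUIRED_FIELDS` are hypothetical, not taken from the article.

```python
# Hypothetical schema: which field names are standard and which are required.
STANDARD_FIELDS = {"title", "creator", "date", "subject"}
REQUIRED_FIELDS = {"title", "creator", "date", "subject"}


def standard_compliance(io_metadata, standard_fields):
    """Formula (22): 1 - w / n.

    n is the total number of metadata in the IO description,
    w is the number of metadata that are not part of the standard.
    """
    n = len(io_metadata)
    if n == 0:
        raise ValueError("the IO description contains no metadata")
    w = sum(1 for field in io_metadata if field not in standard_fields)
    return 1 - w / n


def completeness(io_metadata, required_fields):
    """Formula (23): Present / Required.

    Assumption: a required field counts as present only if it
    appears in the description with a non-empty value.
    """
    required = set(required_fields)
    if not required:
        raise ValueError("the schema defines no required metadata")
    present = sum(1 for f in required if io_metadata.get(f) not in (None, ""))
    return present / len(required)


# Hypothetical IO description: one non-standard field, one empty
# required field ("date") and one missing required field ("subject").
io = {"title": "Report", "creator": "A. Author", "date": "", "x-internal-id": "42"}

print(standard_compliance(io, STANDARD_FIELDS))  # 0.75
print(completeness(io, REQUIRED_FIELDS))         # 0.5
```

Both values lie in the interval [0, 1], so they can be compared across IOs or aggregated over an IO family; how empty values are treated is a design choice that a real system would need to fix explicitly.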
This characteristic indicates the extent to which the metadata schema is fully used to describe the IO. Please note that not all metadata of the selected scheme can be applied to some types of IOs. Several metadata schemes can be used simultaneously in the DB network, but the completeness is determined relative to only those metadata that participate in the construction of semantic links between IOs. Therefore, the degree of completeness of the description of the IO, according to the selected MS metadata scheme, is determined as follows: Pr ( ( )) , Re ( ( )) esent IO md Completeness IO MS quired IO md , (23) where: md is metadata, MS is metadata schema, Present(md) is the total number of metadata required to describe the IO, which is actually present in the IO description, Required(md) is the total number of MS metadata required to describe the IO. . 5. Estimate quality of metadata and an information object family in Big Data Metadata quality assessment is intended to find out to what extent certain metadata or metadata schemas present in a BD meet the tasks that were set before the BD when it was designed. They contribute to the quality functioning of the semantics in the BD. The quality of metadata affects many processes related to the use of inference, building connections between the IO description, their input, storage, identification, search and access. There are two aspects of quality related to metadata. The first of them refers to IO metadata (what IO metadata is, how fully it describes IO, whether it meets a certain metadata schema standard). The second aspect is related to the schema of metadata (is the schema of metadata standard, to what extent the chosen schema meets the needs of the descrip- tion of IS in a specific subject area). The quality of both aspects is described below. Compliance with the standard. This characteristic indicates whether a standard IO metadata description scheme is used. 
The use of a standard metadata schema is fundamental to organizing the search and retrieval of knowledge. The existence in the BD of IOs whose metadata do not meet, or only partly meet, the standard makes it significantly harder to solve the fundamental tasks facing the BD and reduces its quality. The measure of compliance with the standard can be defined as one minus the ratio of the number of non-standard metadata to the total number of metadata used in the description of the IO (Novitsky, et al., 2016):

$$Standard(IO) = 1 - \frac{w(IO(md))}{n(IO(md))}, \qquad (22)$$

where $n(IO(md))$ is the total number of IO metadata and $w(IO(md))$ is the number of metadata that do not meet the standard adopted for this BD model.

The completeness of the description of the IO in relation to the metadata schema. This characteristic indicates the extent to which the metadata schema is used to describe the IO. Note that not all metadata of the selected schema can be applied to some types of IOs. Several metadata schemas can be used simultaneously in the BD network, but completeness is determined only relative to those metadata that participate in the construction of semantic links between IOs.
Therefore, the degree of completeness of the description of the IO, according to the selected metadata schema MS, is determined as follows:

$$Completeness(IO, MS) = \frac{Present(IO(md))}{Required(IO(md))}, \qquad (23)$$

where $md$ is metadata, $MS$ is the metadata schema, $Present(IO(md))$ is the number of metadata required to describe the IO that are actually present in the IO description, and $Required(IO(md))$ is the total number of MS metadata required to describe the IO.

Compliance with metadata schema. A metadata schema can set certain properties to its metadata.
The charac- teristic of matching the metadata scheme determines how well the properties of the metadata of the IO correspond to the properties of the corresponding metadata of the selected scheme. Such properties include the type of data or attributes of relations between IOs, which in general are also included in the quality model. Let n is the number of metadata in the MS scheme, mi – is the number of properties of metadata mdi, Програмні засоби аналітики даних [Введите текст] Compliance with metadata schema. A metadata schema can set certain properties to its metadata. The characteristic of matching the metadata scheme determines how well the properties of the metadata of the IO correspond to the properties of the corresponding metadata of the selected scheme. Such properties include the type of data or attributes of relations between IOs, which in general are also included in the quality model. Let n is the number of metadata in the MS scheme, mi – is the number of properties of metadata mdi, ,Conformance i j is compliance of property j of metadata mdi ІО with the standard specification of the MS schema. ,Conformance i j is calculated by the formula: , 0 otherwise 1 iff i metadata propertybelong to j property fromMS Conformance i j (24) The correspondence of the IO to the i MS schema metadata is calculated according to the formula: 1 , im j i i Conformance i j Conformance md m (25) Then the correspondence to the Conformance(MS) metadata schema is calculated using the formula: 1 n i i Conformance md Conformance MS n (26) Metadata scheme quality characteristics. A set of specially selected metadata make up a metadata schema. 
In the general case, such a set can be arbitrary, but this significantly reduces the quality of the DB, because our BD environment becomes isolated from other data sets and will not be able to take (at least fully) in the process of integration and reasoning information, in a sense the system becomes isolated because even using mappings between data schemas will be inefficient due to the scale of the data. In this regard, efforts are being made to develop and use standard metadata schemas, which are usually aimed at describing IOs of a certain class. There are many metadata schemes. In this connection, the question of choosing the most suitable for a certain subject area arises. This task is facilitated by the evaluation of the quality of the metadata scheme. Compliance with standard metadata schema. This characteristic evaluates the extent to which all DB information objects conform to the standard. For IO, the characteristic of compliance with the standard is also significant, but it is at the IO level. In general, compliance with the standard scheme is evaluated as the arithmetic mean of compliance with the IO standard 1 n i i S dard IO S dard MS n tan tan . (27) The completeness (usage) of the metadata scheme. This characteristic provides an opportunity to assess how much a certain scheme is used to describe the entire population of BD IOs. It is based on the characteristic of the completeness of the description of the IO in relation to the metadata scheme and is its arithmetic average for all IOs of the BD: 1 , n i i Completeness IO MS Completeness MS n . (28) This characteristic makes it possible to assess to what extent the decision to use a certain metadata scheme is justified, and, if necessary, to make a decision to replace it. Let's introduce metrics for evaluating the IO family. 
A family is a systematized set of IOs that are united into a single whole based on some meaningful or formal criteria of belonging, for example, regarding the general content, sources, purpose, semantic independence, method of use, etc. Completeness of the family. This characteristic establishes to what degree of completeness the family contains those IOs that it should contain. Completeness can be measured only when it is known what exactly the collection should contain, that is, when the original family, which acts as a sample, is known [13]. As a rule, families are distinguished on the basis of IO attributes. The formula for measuring family completeness is as follows: : is compliance of property j of metadata mdi ІО with the standard specification of the MS schema. Програмні засоби аналітики даних [Введите текст] Compliance with metadata schema. A metadata schema can set certain properties to its metadata. The characteristic of matching the metadata scheme determines how well the properties of the metadata of the IO correspond to the properties of the corresponding metadata of the selected scheme. Such properties include the type of data or attributes of relations between IOs, which in general are also included in the quality model. Let n is the number of metadata in the MS scheme, mi – is the number of properties of metadata mdi, ,Conformance i j is compliance of property j of metadata mdi ІО with the standard specification of the MS schema. ,Conformance i j is calculated by the formula: , 0 otherwise 1 iff i metadata propertybelong to j property fromMS Conformance i j (24) The correspondence of the IO to the i MS schema metadata is calculated according to the formula: 1 , im j i i Conformance i j Conformance md m (25) Then the correspondence to the Conformance(MS) metadata schema is calculated using the formula: 1 n i i Conformance md Conformance MS n (26) Metadata scheme quality characteristics. A set of specially selected metadata make up a metadata schema. 
In the general case, such a set can be arbitrary, but this significantly reduces the quality of the DB, because our BD environment becomes isolated from other data sets and will not be able to take (at least fully) in the process of integration and reasoning information, in a sense the system becomes isolated because even using mappings between data schemas will be inefficient due to the scale of the data. In this regard, efforts are being made to develop and use standard metadata schemas, which are usually aimed at describing IOs of a certain class. There are many metadata schemes. In this connection, the question of choosing the most suitable for a certain subject area arises. This task is facilitated by the evaluation of the quality of the metadata scheme. Compliance with standard metadata schema. This characteristic evaluates the extent to which all DB information objects conform to the standard. For IO, the characteristic of compliance with the standard is also significant, but it is at the IO level. In general, compliance with the standard scheme is evaluated as the arithmetic mean of compliance with the IO standard 1 n i i S dard IO S dard MS n tan tan . (27) The completeness (usage) of the metadata scheme. This characteristic provides an opportunity to assess how much a certain scheme is used to describe the entire population of BD IOs. It is based on the characteristic of the completeness of the description of the IO in relation to the metadata scheme and is its arithmetic average for all IOs of the BD: 1 , n i i Completeness IO MS Completeness MS n . (28) This characteristic makes it possible to assess to what extent the decision to use a certain metadata scheme is justified, and, if necessary, to make a decision to replace it. Let's introduce metrics for evaluating the IO family. 
A family is a systematized set of IOs that are united into a single whole based on some meaningful or formal criteria of belonging, for example, regarding the general content, sources, purpose, semantic independence, method of use, etc. Completeness of the family. This characteristic establishes to what degree of completeness the family contains those IOs that it should contain. Completeness can be measured only when it is known what exactly the collection should contain, that is, when the original family, which acts as a sample, is known [13]. As a rule, families are distinguished on the basis of IO attributes. The formula for measuring family completeness is as follows: : is calculated by the formula: Програмні засоби аналітики даних [Введите текст] Compliance with metadata schema. A metadata schema can set certain properties to its metadata. The characteristic of matching the metadata scheme determines how well the properties of the metadata of the IO correspond to the properties of the corresponding metadata of the selected scheme. Such properties include the type of data or attributes of relations between IOs, which in general are also included in the quality model. Let n is the number of metadata in the MS scheme, mi – is the number of properties of metadata mdi, ,Conformance i j is compliance of property j of metadata mdi ІО with the standard specification of the MS schema. ,Conformance i j is calculated by the formula: , 0 otherwise 1 iff i metadata propertybelong to j property fromMS Conformance i j (24) The correspondence of the IO to the i MS schema metadata is calculated according to the formula: 1 , im j i i Conformance i j Conformance md m (25) Then the correspondence to the Conformance(MS) metadata schema is calculated using the formula: 1 n i i Conformance md Conformance MS n (26) Metadata scheme quality characteristics. A set of specially selected metadata make up a metadata schema. 
In the general case, such a set can be arbitrary, but this significantly reduces the quality of the DB, because our BD environment becomes isolated from other data sets and will not be able to take (at least fully) in the process of integration and reasoning information, in a sense the system becomes isolated because even using mappings between data schemas will be inefficient due to the scale of the data. In this regard, efforts are being made to develop and use standard metadata schemas, which are usually aimed at describing IOs of a certain class. There are many metadata schemes. In this connection, the question of choosing the most suitable for a certain subject area arises. This task is facilitated by the evaluation of the quality of the metadata scheme. Compliance with standard metadata schema. This characteristic evaluates the extent to which all DB information objects conform to the standard. For IO, the characteristic of compliance with the standard is also significant, but it is at the IO level. In general, compliance with the standard scheme is evaluated as the arithmetic mean of compliance with the IO standard 1 n i i S dard IO S dard MS n tan tan . (27) The completeness (usage) of the metadata scheme. This characteristic provides an opportunity to assess how much a certain scheme is used to describe the entire population of BD IOs. It is based on the characteristic of the completeness of the description of the IO in relation to the metadata scheme and is its arithmetic average for all IOs of the BD: 1 , n i i Completeness IO MS Completeness MS n . (28) This characteristic makes it possible to assess to what extent the decision to use a certain metadata scheme is justified, and, if necessary, to make a decision to replace it. Let's introduce metrics for evaluating the IO family. 
A family is a systematized set of IOs that are united into a single whole based on some meaningful or formal criteria of belonging, for example, regarding the general content, sources, purpose, semantic independence, method of use, etc. Completeness of the family. This characteristic establishes to what degree of completeness the family contains those IOs that it should contain. Completeness can be measured only when it is known what exactly the collection should contain, that is, when the original family, which acts as a sample, is known [13]. As a rule, families are distinguished on the basis of IO attributes. The formula for measuring family completeness is as follows: : (24) The correspondence of the IO to the i MS schema metadata is calculated according to the formula: Програмні засоби аналітики даних [Введите текст] Compliance with metadata schema. A metadata schema can set certain properties to its metadata. The characteristic of matching the metadata scheme determines how well the properties of the metadata of the IO correspond to the properties of the corresponding metadata of the selected scheme. Such properties include the type of data or attributes of relations between IOs, which in general are also included in the quality model. Let n is the number of metadata in the MS scheme, mi – is the number of properties of metadata mdi, ,Conformance i j is compliance of property j of metadata mdi ІО with the standard specification of the MS schema. ,Conformance i j is calculated by the formula: , 0 otherwise 1 iff i metadata propertybelong to j property fromMS Conformance i j (24) The correspondence of the IO to the i MS schema metadata is calculated according to the formula: 1 , im j i i Conformance i j Conformance md m (25) Then the correspondence to the Conformance(MS) metadata schema is calculated using the formula: 1 n i i Conformance md Conformance MS n (26) Metadata scheme quality characteristics. 
A set of specially selected metadata make up a metadata schema. In the general case, such a set can be arbitrary, but this significantly reduces the quality of the DB, because our BD environment becomes isolated from other data sets and will not be able to take (at least fully) in the process of integration and reasoning information, in a sense the system becomes isolated because even using mappings between data schemas will be inefficient due to the scale of the data. In this regard, efforts are being made to develop and use standard metadata schemas, which are usually aimed at describing IOs of a certain class. There are many metadata schemes. In this connection, the question of choosing the most suitable for a certain subject area arises. This task is facilitated by the evaluation of the quality of the metadata scheme. Compliance with standard metadata schema. This characteristic evaluates the extent to which all DB information objects conform to the standard. For IO, the characteristic of compliance with the standard is also significant, but it is at the IO level. In general, compliance with the standard scheme is evaluated as the arithmetic mean of compliance with the IO standard 1 n i i S dard IO S dard MS n tan tan . (27) The completeness (usage) of the metadata scheme. This characteristic provides an opportunity to assess how much a certain scheme is used to describe the entire population of BD IOs. It is based on the characteristic of the completeness of the description of the IO in relation to the metadata scheme and is its arithmetic average for all IOs of the BD: 1 , n i i Completeness IO MS Completeness MS n . (28) This characteristic makes it possible to assess to what extent the decision to use a certain metadata scheme is justified, and, if necessary, to make a decision to replace it. Let's introduce metrics for evaluating the IO family. 
A family is a systematized set of IOs united into a single whole by some meaningful or formal membership criteria, for example common content, sources, purpose, semantic independence, or method of use.

Completeness of the family. This characteristic establishes how completely the family contains the IOs that it should contain. Completeness can be measured only when it is known exactly what the collection should contain, that is, when the original family, which serves as a reference sample, is known [13].
As a rule, families are distinguished on the basis of IO attributes. The formula for measuring family completeness is as follows:

$$\mathrm{Completeness}(F) = \frac{\sum_{i=1}^{n} (IO_i \in F)}{\sum_{i=1}^{n} (IO_i \in F_{original})} \tag{29}$$

that is, the number of IOs present in the family F relative to the number of IOs in the original family F_original.

Conformity of the collection to the standard. This characteristic determines the extent to which the IOs of the collection conform to the standard. Compliance of the family with the standard can be taken as the arithmetic mean of the compliance of its IOs with the standard:

$$\mathrm{Standard}(F) = \frac{\sum_{i=1}^{n} \mathrm{Standard}(IO_i \in F)}{n} \tag{30}$$

where n is the number of IOs in the collection.

Variety of standards. It is believed that a family should be based on a single standard metadata schema specified in the external ontology, since the use of many schemas degrades operational characteristics. The quality of this feature can be measured as the inverse of the number of metadata schema standards used.

Consistency. There are many different situations in which a collection can be considered inconsistent (conflicting). Without loss of generality, we consider only one of them: two IOs have absolutely identical metadata values. Let the function IdentMd(IO_i, IO_j) take the following values:

$$\mathrm{IdentMd}(IO_i, IO_j) = \begin{cases} 1, & \text{if } IO_i \text{ and } IO_j \text{ have the same set of metadata} \\ 0, & \text{otherwise} \end{cases} \tag{31}$$

Then the family consistency function is defined as follows:

$$\mathrm{Consistency}(F) = 1 - \frac{\sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} \mathrm{IdentMd}(IO_i, IO_j)}{n(n-1)} \tag{32}$$
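The family-level metrics, formulas (29) to (32), can be sketched in Python as follows. This is an illustrative reading rather than the author's implementation, and all function names and sample values are hypothetical. Two assumptions are made: completeness counts only the IOs that also belong to the reference family, so that the value stays in [0, 1], and consistency is read as one minus the share of ordered IO pairs with identical metadata.

```python
def family_completeness(family_ids, original_ids) -> float:
    """Formula (29): share of the reference family's IOs present in the family."""
    original = set(original_ids)
    if not original:
        raise ValueError("the original family must not be empty")
    # Assumption: IOs outside the reference family are not counted.
    return len(set(family_ids) & original) / len(original)


def family_standard(io_scores) -> float:
    """Formula (30): arithmetic mean of per-IO standard-compliance scores."""
    scores = list(io_scores)
    if not scores:
        raise ValueError("the family must contain at least one IO")
    return sum(scores) / len(scores)


def variety_of_standards(schemas_used) -> float:
    """Inverse of the number of distinct metadata schema standards in use."""
    distinct = set(schemas_used)
    if not distinct:
        raise ValueError("at least one schema standard is required")
    return 1 / len(distinct)


def ident_md(io_a: dict, io_b: dict) -> int:
    """Formula (31): 1 if both IOs carry identical metadata, otherwise 0."""
    return 1 if io_a == io_b else 0


def family_consistency(ios: list[dict]) -> float:
    """Formula (32): 1 minus the share of ordered pairs with identical metadata."""
    n = len(ios)
    if n < 2:
        return 1.0  # assumption: fewer than two IOs cannot conflict
    identical = sum(
        ident_md(ios[i], ios[j]) for i in range(n) for j in range(n) if i != j
    )
    return 1 - identical / (n * (n - 1))


# Hypothetical family of three IOs, two of which have identical metadata.
family = [
    {"type": "memory", "model": "DDR4-8"},
    {"type": "memory", "model": "DDR4-8"},
    {"type": "memory", "model": "DDR4-16"},
]
print(family_completeness(["m1", "m2", "x9"], ["m1", "m2", "m3", "m4"]))
print(family_standard([1.0, 0.5]))
print(variety_of_standards(["schema A", "schema A", "schema B"]))
print(family_consistency(family))
```

The pairwise loop in `family_consistency` is quadratic in the family size; for large families, grouping IOs by a hash of their metadata would give the same count of identical pairs at lower cost.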
Pic. 5.3. Part of the graph representation for semantic big data

Three approaches were tested on the test data set. In the first approach, $q_1$, relations were built taking into account all possible variations, including the quantities of the selected components. In the second approach, $q_2$, components were grouped by common attribute values so as to avoid building additional connections. In the last optimization, $q_3$, all compatible components were found first, and only then were the quantitative restrictions of the concrete domain checked.
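The two-step idea behind the third approach can be sketched in Python as follows. This is a minimal sketch, not the author's implementation: the precomputed `compatible` mapping stands in for the materialized graph, and the attributes `ram_slots`, `max_memory_gb`, and `size_gb` as well as the functions `fits_quantities` and `query_q3` are hypothetical names chosen for the example.

```python
def fits_quantities(board, modules):
    """Concrete-domain check: slot count and maximum supported memory."""
    return (
        len(modules) <= board["ram_slots"]
        and sum(m["size_gb"] for m in modules) <= board["max_memory_gb"]
    )


def query_q3(boards, compatible, modules):
    """Step 1: look up compatible boards in the materialized relation.
    Step 2: check the quantitative restrictions only for those candidates."""
    wanted = {m["id"] for m in modules}
    candidates = [b for b in boards if wanted <= compatible.get(b["id"], set())]
    return [b["id"] for b in candidates if fits_quantities(b, modules)]


# Illustrative data; identifiers and limits are assumptions.
boards = [
    {"id": "B1", "ram_slots": 2, "max_memory_gb": 32},
    {"id": "B2", "ram_slots": 4, "max_memory_gb": 64},
    {"id": "B3", "ram_slots": 4, "max_memory_gb": 64},
]
compatible = {"B1": {"M1"}, "B2": {"M1", "M2"}, "B3": {"M2"}}
m1 = {"id": "M1", "size_gb": 16}

# The same module may be selected several times.
print(query_q3(boards, compatible, [m1, m1, m1]))
```

The quantitative check runs only on the boards that survive the lookup, so the expensive part of the reasoning is applied to a small candidate set.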
One problem is that the same component can be installed two or more times, depending on the number of previously selected components. For example, if the motherboard has 8 RAM sockets, 8 identical memory modules may be selected, or 8 different ones. Moreover, for the motherboard we must check not only the limit on the number of occupied sockets but also the maximum amount of memory the motherboard supports.

Optimization type and query | Execution time | Result count
$q_1$: list all mainboards | 5612 ms | 6850
$q_1$: list all mainboards for the specific memory modules | 4630 ms | 6432
$q_2$: list all mainboards (specification was grouped) | 3400 ms | 6850
$q_2$: list all mainboards for the specific memory modules (specification was grouped) | 2530 ms | 6432
$q_3$: list all mainboards (specification was grouped and quantity restriction included) | 780 ms (total time for the two queries) | 6850
$q_3$: list all mainboards for the specific memory modules (specification was grouped and quantity restriction included) | 43 ms (total time for the two queries) | 6432

As we can see, simplifying the queries significantly increases the speed of execution. However, the results of big data systems depend on characteristics such as the completeness of the description and compliance with the metadata scheme. It should be noted that, according to expert evaluation of work with web resources, the response time of a web service should not exceed 600 ms. Of the measured queries, only the $q_3$ query for specific memory modules (43 ms) meets this threshold.

6. Conclusions

The complexity of big data applications, combined with the lack of standards for representing, processing, and storing information objects, requires significant resources. Data quality is one of the approaches that makes it possible to model data in a way that requires simpler algorithms for analysis. Analyzing data quality increases the accuracy of the data in various aspects. Enriching data semantics is a complex process of describing big data by ontological means. However, there is a problem with the speed of inference; the article proposes knowledge base materialization in the big data environment to optimize inference. Data quality plays a key role here, because it makes it possible to build appropriate graphs of schematic data on the basis of metadata.
Higher data quality levels can help produce better reasoning results and also improve data maintainability, reusability, and integration.

References

1. Amsler, R., 1972. Application of Citation-based Automatic Classification, Austin: s.n.
2. Ceravolo, P. et al., 2018. Big data semantics. Journal on Data Semantics, 7(2), pp. 65-85.
3. Harford, T., 2014. Big data: A big mistake? Significance, 11(5), pp. 14-19.
4. Intel IT Center, 2012. Big Data Analytics: Intel's IT Manager Survey on How Organizations Are Using Big Data, Santa Clara: s.n.
5. Laney, D., 2001. 3D data management: Controlling data volume, velocity and variety, s.l.: META Group.
6. Lutz, C., 2002. The Complexity of Description Logics with Concrete Domains, Hamburg: s.n.
7. Miller, J. J., 2013. Graph database applications and concepts with Neo4j. In Proceedings of the Southern Association for Information Systems Conference, 2324(36).
8. Novitsky, A., Reznychenko, V. & Romanov, E., 2016. Characteristics and quality metrics of electronic libraries in the semantic web. Software engineering, 1(25), pp. 17-36.
9. Novytskyi, O., Proskudina, G. & Ovdiy, O., 2014. Development of a digital library quality model. s.l., Lviv Polytechnic Publishing House, pp. 284-285.
10. Novytskyi, O., Proskudina, G. Y., Reznichenko, V. & Ovdiy, O., 2014. Evaluation of the quality of electronic libraries in the web environment. Software engineering, 20(4).
11. Novytskyi, O. V., 2010. Data integration in the Internet: linked data. Kyiv, Institute of Software Systems of the National Academy of Sciences of Ukraine, pp. 487-493.
12. Raphael, V., Staab, S. & Motik, B., 2005. Incrementally maintaining materializations of ontologies stored in logic databases. Journal on Data Semantics, pp. 1-34.
13. Schmidt-Schauß, M. & Smolka, G., 1991. Attributive concept descriptions with complements. Artif. Intell., 48(1), pp. 1-26.
14. Schroeck, M. et al., 2012. Analytics: The Real-World Use of Big Data, s.l.: IBM.
15. Shi, P., Fan, G., Li, S. & Kou, D., 2021. Big Data Storage Technology for Smart Distribution Grid Based on Neo4j Graph Database. IEEE 4th International Conference on Electronics Technology (ICET), pp. 441-445.
16. Spirin, O. M. et al., 2012. Collective monograph. Electronic library information systems of scientific and educational institutions. Kyiv: Pedagogical press.
17. Stuart Ward, J. & Barker, A., 2013. Undefined By Data: A Survey of Big Data Definitions.
18. Suthaharan, S., 2014. Big data classification: Problems and challenges in network intrusion prediction with machine learning. ACM SIGMETRICS Performance Evaluation Review, 41(4), pp. 70-73.
19. Thorsten, B., Nizar, A., Kreutler, G. & Gerhard, F., 2004. Product Configuration Systems: State of the Art, Conceptualization and Extensions. Munich, University Library of Munich, pp. 25-36.
20. Trentin, A., Perin, E. & Forza, C., 2012. Product configurator impact on product quality. International Journal of Production Economics, 135(2), pp. 850-859.
21. Wang, Y., Wenlong, Z. & Wayne, X. W., 2020. Needs-based product configurator design for mass customization using hierarchical attention network. IEEE Transactions on Automation Science and Engineering, 18(1), pp. 195-204.
22. Wilkinson, M. et al., 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific data, pp. 1-9.
23. Woods, W. A., 1975. What's in a link: Foundations for semantic networks. Representation and understanding, pp. 35-82.

Received 03.08.2022

About the author:
Novytskyi Oleksandr Vadumovuch, PhD, researcher, Kyiv, Hirsch index 9, number of publications 70, ORCID 0000-0002-9955-7882, tel. 067 44 53 173, alex.googl@gmail.com

Place of work:
Institute of Software Systems of NAS of Ukraine, 3187, Kyiv, ave. Akademika Glushkova, 40, building 5, tel. (044) 526-33-19, e-mail: iss@isofts.kiev.ua

Author's surname and initials and the title of the report in Ukrainian: Новицький О.В. Поняття якості та оцінювання якості великих даних в семантичному середовищі

Author's surname and initials and the title of the report in English: Novytskyi O.V. The concept and evaluating of big data quality in the semantic environment
id pp_isofts_kiev_ua-article-527
institution Problems in programming
keywords_txt_mv keywords
language English
last_indexed 2025-07-17T09:41:12Z
publishDate 2023
publisher PROBLEMS IN PROGRAMMING
record_format ojs
resource_txt_mv ppisoftskievua/a7/0e3d015f055b467859318072ecd958a7.pdf
spelling pp_isofts_kiev_ua-article-5272023-06-25T06:57:27Z The concept and evaluating of big data quality in the semantic environment Поняття якості та оцінювання якості великих даних в семантичному середовищі Novitsky, A.V. big data; complex data sets UDC 004.05 великі дані; складних наборів даних; теорія оцінювання якості УДК 004.05 Big data refers to large volumes, complex data sets with various autonomous sources, characterized by continuous growth. Data storage and data collection capabilities are now rapidly expanding in all fields of science and technology due to the rapid development of networks. Evaluating the quality of data is a difficult task in the context of big data, because the speed of semantic data reasoning directly depends on its quality. The appropriate strategies are necessary to evaluate and assess data quality according to the huge amount of data and its rapid generation. Managing a large volume of heterogeneous and distributed data requires defining and continuously updating metadata describing various aspects of data semantics and its quality, such as conformance to metadata schema, provenance, reliability, accuracy and other properties. The article examines the problem of evaluating the quality of big data in the semantic environment. The definition of big data and its semantics is given below and there is a short excursion on a theory of quality assessment. The model and its components which allow to form and specify metrics for quality have already been developed. This model includes such components as: quality characteristics; quality metric; quality system; quality policy. A quality model for big data that defines the main components and requirements for data evaluation has already been proposed. In particular, such evaluation components as: accessibility, relevance, popularity, compliance with the standard, consistency, etc. are highlighted. The problem of inference complexity is demonstrated in the article. 
Approaches to improving fast semantic inference through materialization and division of the knowledge base into two components, which are expressed by different dialects of descriptive logic, are also considered below. The materialization of big data makes it possible to significantly speed up the processing of requests for information extraction. It is demonstrated how the quality of metadata affects materialization. The proposed model of the knowledge base allows increasing the qualitative indicators of the reasoning speed. Problems in programming 2022; 3-4: 260-270 Big data refers to large volumes of complex data sets with various autonomous sources, characterized by continuous growth. With the rapid development of networks, data storage, and data collection capabilities, big data is rapidly expanding in all fields of science and technology. In the context of big data, evaluating data quality is a difficult task. For semantic data, the quality and speed of inference depend directly on the quality of the data. Given the huge volume of data and its rapid generation, appropriate strategies for evaluating data quality are required. Managing a large volume of heterogeneous and distributed data requires defining and continuously updating metadata that describe various aspects of data semantics and quality, such as conformance to the metadata schema, provenance, reliability, accuracy, and other properties. The article considers the problem of evaluating the quality of big data in the semantic environment. A definition of big data and its semantics is given, and a short excursion into the theory of quality assessment is made. A model and its components have been developed that make it possible to form and specify quality metrics. The model includes the following components: quality characteristic, quality metric, quality system, quality policy. A quality model for big data is proposed that defines the main components and requirements for data evaluation.
In particular, the following evaluation components are highlighted: accessibility, relevance, popularity, compliance with the standard, consistency, and others. The problem of inference complexity is demonstrated. Approaches to speeding up semantic inference through materialization and division of the knowledge base into two components, expressed by different dialects of description logic, are considered, since materialization of big data makes it possible to significantly speed up the processing of information extraction queries. It is demonstrated how the quality of metadata affects materialization. A knowledge base model is proposed that improves the qualitative indicators of inference speed. Problems in programming 2022; 3-4: 260-270 PROBLEMS IN PROGRAMMING ПРОБЛЕМЫ ПРОГРАММИРОВАНИЯ ПРОБЛЕМИ ПРОГРАМУВАННЯ 2023-01-23 Article Article application/pdf https://pp.isofts.kiev.ua/index.php/ojs1/article/view/527 10.15407/pp2022.03-04.260 PROBLEMS IN PROGRAMMING; No 3-4 (2022); 260-270 ПРОБЛЕМЫ ПРОГРАММИРОВАНИЯ; No 3-4 (2022); 260-270 ПРОБЛЕМИ ПРОГРАМУВАННЯ; No 3-4 (2022); 260-270 1727-4907 10.15407/pp2022.03-04 en https://pp.isofts.kiev.ua/index.php/ojs1/article/view/527/579 Copyright (c) 2023 PROBLEMS IN PROGRAMMING
spellingShingle big data
complex data sets
UDC 004.05
Novitsky, A.V.
The concept and evaluating of big data quality in the semantic environment
title The concept and evaluating of big data quality in the semantic environment
title_alt Поняття якості та оцінювання якості великих даних в семантичному середовищі
title_full The concept and evaluating of big data quality in the semantic environment
title_fullStr The concept and evaluating of big data quality in the semantic environment
title_full_unstemmed The concept and evaluating of big data quality in the semantic environment
title_short The concept and evaluating of big data quality in the semantic environment
title_sort concept and evaluating of big data quality in the semantic environment
topic big data
complex data sets
UDC 004.05
topic_facet big data
complex data sets
UDC 004.05
великі дані
складних наборів даних
теорія оцінювання якості
УДК 004.05
url https://pp.isofts.kiev.ua/index.php/ojs1/article/view/527
work_keys_str_mv AT novitskyav theconceptandevaluatingofbigdataqualityinthesemanticenvironment
AT novitskyav ponâttââkostítaocínûvannââkostívelikihdanihvsemantičnomuseredoviŝí
AT novitskyav conceptandevaluatingofbigdataqualityinthesemanticenvironment