An automated scenario generation system for analytical activities.

Software components that implement certain classes of analytical processes have recently tended to become more intelligent. This also means accumulating knowledge about how the analytical system functions, including knowledge of the analyst's actions. Accu...

Full description

Bibliographic details
Date: 2019
Authors: Dodonov, O. G., Koval, O. V., Senchenko, V. R., Shpurik, V. V.
Format: Article
Language: Ukrainian
Published: Інститут проблем реєстрації інформації НАН України, 2019
Online access: http://drsp.ipri.kiev.ua/article/view/179167
Journal title: Data Recording, Storage & Processing

Repositories

Data Recording, Storage & Processing
id drspiprikievua-article-179167
record_format ojs
institution Data Recording, Storage & Processing
collection OJS
language Ukrainian
topic аналітична діяльність
машинне навчання
дерева рішень
коефіцієнт Gini
ентропія
орграф
мова Python
analytical activity
machine learning
Classification and Regression Trees method
decision trees
Gini coefficient
entropy
graph
Python programming language
format Article
author Dodonov, O. G.
Koval, O. V.
Senchenko, V. R.
Shpurik, V. V.
title An automated scenario generation system for analytical activities.
title_alt Автоматизована система формування сценарію аналітичної діяльності
description Software components that implement certain classes of analytical processes have recently tended to become more intelligent. This also means accumulating knowledge about how the analytical system functions, including knowledge of the analyst's actions. The accumulated knowledge lets the software classify new data on its own and offer the user the most appropriate steps of an analytical activity scenario.
An analytical activity scenario is treated as a representation of knowledge that describes a sequence of related events in the form of a directed acyclic graph. The article proposes an approach to making the process of scenario formation intelligent, based on developing machine learning methods, namely Classification and Regression Trees (CART), and applies it with a combination of metrics for evaluating effectiveness.
The authors propose their own version of the software, which implements the CART method in Python. It differs from known versions in that different metrics can be used to analyze the quality of a partition and, through it, to choose the next probable action of an analytical scenario. Unlike existing approaches, the authors offer a choice of the most suitable metric for assessing how closely the desired learning result is approximated: the Gini coefficient or Shannon's method of calculating the entropy of information utility.
The first step in constructing a scenario is to describe the matrix of all possible states of the directed graph, which reflects the sequence of user actions needed to achieve the goal. Next, the heterogeneity of the input data, which contains the matrix of possible scenario actions, is computed. Heterogeneity is measured by Shannon's information entropy, and the Gini coefficient is used to improve the quality of the partition. The best «True» or «False» split criterion is then calculated to construct a decision tree. Based on the decision tree, the program offers the user the most appropriate next steps. The algorithm is supplemented with a more convenient mechanism for forming the semantic transition conditions as «if – then» operators.
The suggested approach reduces the number of erroneous user actions, especially by inexperienced users, when forming complex scenarios with a variety of conditions for applying data analysis operators.
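The split selection summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' program: the function names (`gini`, `entropy`, `best_split`), the boolean feature encoding, and the sample data are all assumptions made for illustration. It only shows how either impurity metric can be plugged into the search for the best «True»/«False» question.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_split(rows, labels, impurity=gini):
    """Return (feature_index, gain) for the True/False question that
    reduces impurity the most. Rows are tuples of booleans."""
    parent = impurity(labels)
    best = (None, 0.0)
    for j in range(len(rows[0])):
        true_side = [y for x, y in zip(rows, labels) if x[j]]
        false_side = [y for x, y in zip(rows, labels) if not x[j]]
        if not true_side or not false_side:
            continue  # this question does not separate anything
        w = len(true_side) / len(labels)
        gain = parent - w * impurity(true_side) - (1 - w) * impurity(false_side)
        if gain > best[1]:
            best = (j, gain)
    return best


# Hypothetical training data: each row says whether the analyst has
# already (loaded data, cleaned data); the label is the step chosen next.
rows = [(True, False), (True, False), (True, True), (True, True)]
labels = ["clean", "clean", "classify", "classify"]

print(best_split(rows, labels))                    # Gini-based choice
print(best_split(rows, labels, impurity=entropy))  # entropy-based choice
```

Passing the metric as a parameter mirrors the choice between the Gini coefficient and entropy that the description mentions; a full CART implementation would apply `best_split` recursively to build the tree.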
publisher Інститут проблем реєстрації інформації НАН України
publishDate 2019
url http://drsp.ipri.kiev.ua/article/view/179167
first_indexed 2024-04-21T19:34:03Z
last_indexed 2024-04-21T19:34:03Z
_version_ 1796974096903307264
spelling drspiprikievua-article-179167 2019-12-10T10:12:36Z
Abstract (translated from the Ukrainian): Trends in making software components of modern analytical systems intelligent are studied. It is shown that one of the main requirements for modern analytical systems is comfortable interaction with the system, achieved through its intellectualization, that is, the system's ability to offer the user the most probable scenario step based on an analysis of previous actions and accumulated knowledge. An approach is proposed to the problem of making the formation of an analytical activity scenario intelligent. It is based on developing machine learning methods, namely learning with Classification and Regression Trees, and uses a combination of metrics for evaluating the effectiveness of the proposed variant: the Gini coefficient or a calculation of the entropy of information utility. Based on this approach, the algorithm is implemented as a Python program that proposes a probable step of an analytical activity scenario by learning from the user's actions.
Інститут проблем реєстрації інформації НАН України 2019-11-05 Article application/pdf
http://drsp.ipri.kiev.ua/article/view/179167
10.35681/1560-9189.2019.1.1.179167
Data Recording, Storage & Processing; Vol. 21 No. 1 (2019); 11-22
Регистрация, хранение и обработка данных; Том 21 № 1 (2019); 11-22
Реєстрація, зберігання і обробка даних; Том 21 № 1 (2019); 11-22
1560-9189 uk
http://drsp.ipri.kiev.ua/article/view/179167/182645
Copyright (c) 2021 Реєстрація, зберігання і обробка даних
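The abstract treats a scenario as a directed acyclic graph described by a matrix of possible states. The sketch below shows one way such a matrix could be stored and queried in Python. The step names, the `adjacency` layout, and both helper functions are hypothetical and are not taken from the article.

```python
# Hypothetical scenario steps; the names are illustrative only.
steps = ["load", "clean", "classify", "report"]

# adjacency[i][j] == 1 means step j may directly follow step i.
adjacency = [
    [0, 1, 1, 0],  # load -> clean, load -> classify
    [0, 0, 1, 0],  # clean -> classify
    [0, 0, 0, 1],  # classify -> report
    [0, 0, 0, 0],  # report is a terminal step
]


def next_steps(current):
    """List the steps the graph allows directly after `current`."""
    i = steps.index(current)
    return [steps[j] for j, edge in enumerate(adjacency[i]) if edge]


def is_acyclic(matrix):
    """Kahn's algorithm: the graph is acyclic exactly when every node
    can be removed in topological order."""
    n = len(matrix)
    indegree = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    removed = 0
    while ready:
        i = ready.pop()
        removed += 1
        for j in range(n):
            if matrix[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    return removed == n


print(next_steps("load"))
print(is_acyclic(adjacency))
```

In a system like the one described, the candidates returned by `next_steps` would be the options among which a trained decision tree picks the most probable one; that ranking step is not shown here.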