A linguistic approach to time series forecasting

Methods for predicting dynamic time series (including non-stationary ones) based on a linguistic approach, namely, the study of occurrences and repetition of so-called N-grams, are proposed. This approach is used in computational linguistics to create statistical translators, detect plagiarism and d...

Bibliographic details
Date:	2022
Authors:	Ланде, Д. В., Юзефович, В. В.
Format:	Article
Language:	Ukrainian
Published:	Інститут проблем реєстрації інформації НАН України 2022
Subjects:	time series, forecasting, N-gram method, quantization, trend, model, similarity criterion, correlation, linear regression
Online access:	http://drsp.ipri.kiev.ua/article/view/262673
Journal title:	Data Recording, Storage & Processing

Repositories

Data Recording, Storage & Processing

id	drspiprikievua-article-262673
record_format	ojs
spelling	drspiprikievua-article-262673 2022-09-11T03:35:59Z A linguistic approach to time series forecasting Лінгвістичний підхід до прогнозування часових рядів Ланде, Д. В. Юзефович, В. В. time series, forecasting, N-gram method, quantization, trend, model, similarity criterion, correlation, linear regression Methods are proposed for forecasting dynamic time series (including non-stationary ones) based on a linguistic approach, namely the study of the occurrence and repetition of so-called N-grams. This approach is used in computational linguistics to build statistical translators and to detect plagiarism and duplicate documents. In contrast to its application in linguistics, the method can be extended by taking into account correlations between sequences of stable word combinations, as well as trends. The proposed methods require neither a preliminary study of the characteristics of the time series nor complex tuning of the input parameters of the forecasting model. They make it possible to produce, with a high level of automation, short-term and medium-term forecasts of time series characterized by trends and cyclicality, in particular series of publication dynamics in content monitoring systems. The proposed methods can also be used to predict the parameter values of a large complex system when monitoring its state, where the number of such parameters is large and a high level of automation of the forecasting process is therefore desirable. A significant advantage of the approach is that it does not require time series stationarity and has few tuning parameters. Further research may focus on studying various similarity criteria for time series fragments, using nonlinear similarity criteria, and finding ways to automatically determine a rational quantization step for the time series. Інститут проблем реєстрації інформації НАН України 2022-06-07 Article application/pdf http://drsp.ipri.kiev.ua/article/view/262673 10.35681/1560-9189.2022.24.1.262673 Data Recording, Storage & Processing; Vol. 24 No. 1 (2022); 13-22 1560-9189 uk http://drsp.ipri.kiev.ua/article/view/262673/259957 Copyright (c) 2022 Data Recording, Storage & Processing
institution	Data Recording, Storage & Processing
collection	OJS
language	Ukrainian
topic	time series, forecasting, N-gram method, quantization, trend, model, similarity criterion, correlation, linear regression
spellingShingle	time series, forecasting, N-gram method, quantization, trend, model, similarity criterion, correlation, linear regression Ланде, Д. В. Юзефович, В. В. A linguistic approach to time series forecasting
topic_facet	time series, forecasting, N-gram method, quantization, trend, model, similarity criterion, correlation, linear regression
format	Article
author	Ланде, Д. В. Юзефович, В. В.
author_facet	Ланде, Д. В. Юзефович, В. В.
author_sort	Ланде, Д. В.
title	A linguistic approach to time series forecasting
title_short	A linguistic approach to time series forecasting
title_full	A linguistic approach to time series forecasting
title_fullStr	A linguistic approach to time series forecasting
title_full_unstemmed	A linguistic approach to time series forecasting
title_sort	linguistic approach to time series forecasting
title_alt	Лінгвістичний підхід до прогнозування часових рядів
description	Methods are proposed for forecasting dynamic time series (including non-stationary ones) based on a linguistic approach, namely the study of the occurrence and repetition of so-called N-grams. This approach is used in computational linguistics to build statistical translators and to detect plagiarism and duplicate documents. In contrast to its application in linguistics, the method can be extended by taking into account correlations between sequences of stable word combinations, as well as trends. The proposed methods require neither a preliminary study of the characteristics of the time series nor complex tuning of the input parameters of the forecasting model. They make it possible to produce, with a high level of automation, short-term and medium-term forecasts of time series characterized by trends and cyclicality, in particular series of publication dynamics in content monitoring systems. The proposed methods can also be used to predict the parameter values of a large complex system when monitoring its state, where the number of such parameters is large and a high level of automation of the forecasting process is therefore desirable. A significant advantage of the approach is that it does not require time series stationarity and has few tuning parameters. Further research may focus on studying various similarity criteria for time series fragments, using nonlinear similarity criteria, and finding ways to automatically determine a rational quantization step for the time series.
publisher	Інститут проблем реєстрації інформації НАН України
publishDate	2022
url	http://drsp.ipri.kiev.ua/article/view/262673
work_keys_str_mv	AT landedv alinguisticapproachtotimeseriesforecasting AT ûzefovičvv alinguisticapproachtotimeseriesforecasting AT landedv língvístičnijpídhíddoprognozuvannâčasovihrâdív AT ûzefovičvv língvístičnijpídhíddoprognozuvannâčasovihrâdív AT landedv linguisticapproachtotimeseriesforecasting AT ûzefovičvv linguisticapproachtotimeseriesforecasting
first_indexed	2024-04-21T19:34:26Z
last_indexed	2024-04-21T19:34:26Z
_version_	1796974121629777920