A linguistic approach to time series forecasting

Bibliographic Details
Date: 2022
Main Authors: Ланде, Д. В., Юзефович, В. В.
Format: Article
Language: Ukrainian
Published: Інститут проблем реєстрації інформації НАН України 2022
Online Access: http://drsp.ipri.kiev.ua/article/view/262673
Journal Title: Data Recording, Storage & Processing

Institution

Data Recording, Storage & Processing
Description
Summary: This article proposes methods for forecasting dynamic time series, including non-stationary ones, based on a linguistic approach: analysing the occurrence and repetition of so-called N-grams. In computational linguistics this approach is used to build statistical translators and to detect plagiarism and duplicate documents. Unlike its use in linguistics, the method can be extended to take into account correlations between sequences of stable word combinations, as well as trends. The proposed methods require neither a preliminary study of the time series' characteristics nor complex tuning of the forecasting model's input parameters. They allow highly automated short-term and medium-term forecasting of time series that exhibit trends and cyclicality, in particular series of publication dynamics in content monitoring systems. They can also be used to predict parameter values of a large complex system when monitoring its state, where the number of such parameters is large and a high level of automation of the forecasting process is therefore desirable. A significant advantage of the approach is that it does not require the time series to be stationary and has only a few tuning parameters. Further research may focus on studying various criteria for the similarity of time series fragments, using nonlinear similarity criteria, and finding ways to automatically determine a rational quantization step for the time series.
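
The summary does not give the algorithm itself, so the following is only a minimal sketch of how an N-gram style forecast over a quantized series might look. It is not the authors' method. The function names (`quantize`, `build_ngram_table`, `forecast`), the uniform quantization with a fixed `step`, the exact-match lookup of the last N-1 symbols, and the persistence fallback are all assumptions made for illustration. The correlation and trend extensions mentioned in the summary are not modelled.

```python
from collections import Counter, defaultdict


def quantize(series, step):
    """Map each value to an integer level index (assumed uniform quantization)."""
    return [round(x / step) for x in series]


def build_ngram_table(symbols, n):
    """Count which symbol follows each (n-1)-symbol context in the history."""
    table = defaultdict(Counter)
    for i in range(len(symbols) - n + 1):
        context = tuple(symbols[i:i + n - 1])
        table[context][symbols[i + n - 1]] += 1
    return table


def forecast(series, step, n=3, horizon=1):
    """Predict `horizon` future values from repeated N-grams (illustrative only)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    symbols = quantize(series, step)
    table = build_ngram_table(symbols, n)
    predictions = []
    for _ in range(horizon):
        context = tuple(symbols[-(n - 1):])
        followers = table.get(context)
        if followers:
            # Most frequent continuation of the current context.
            next_symbol = followers.most_common(1)[0][0]
        else:
            # Assumed fallback: repeat the last observed level.
            next_symbol = symbols[-1]
        symbols.append(next_symbol)
        predictions.append(next_symbol * step)
    return predictions


if __name__ == "__main__":
    history = [1, 2, 3, 1, 2, 3, 1, 2]
    print(forecast(history, step=1, n=3, horizon=3))
```

In this sketch the only tuning parameters are the quantization step and the N-gram length, which loosely mirrors the summary's point about a small number of parameters. Choosing the step automatically is listed in the summary as an open research question, so here it is simply passed in by the caller.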