Investigation of the effectiveness of artificial neural networks (ANNs) of different generations in the task of forecasting in the financial sphere
This paper discusses ANNs of different generations. The efficiency of using computational intelligence in the task of short- and medium-term forecasting in the financial sphere is investigated. For the investigation, a fully connected feed-forward network (Back Propagation), a recurrent network (LST...
| Date: | 2025 |
|---|---|
| Main authors: | Bodyanskiy, Yevgeniy; Zaychenko, Yuriy; Zaichenko, Helen; Kuzmenko, Oleksii |
| Format: | Article |
| Language: | English |
| Published: | The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", 2025 |
| Subjects: | |
| Online access: | https://journal.iasa.kpi.ua/article/view/312420 |
| Journal title: | System research and information technologies |

Institution: System research and information technologies

| _version_ | 1867334445329022976 |
|---|---|
| author | Bodyanskiy, Yevgeniy Zaychenko, Yuriy Zaichenko, Helen Kuzmenko, Oleksii |
| author_facet | Bodyanskiy, Yevgeniy Zaychenko, Yuriy Zaichenko, Helen Kuzmenko, Oleksii |
| author_institution_txt_mv | [
{
"author": "Yevgeniy Bodyanskiy",
"institution": "Kharkiv National University of Radio Electronics, Kharkiv"
},
{
"author": "Yuriy Zaychenko",
"institution": "Educational and Research Institute for Applied System Analysis of the National Technical University of Ukraine \"Igor Sikorsky Kyiv Polytechnic Institute\", Kyiv"
},
{
"author": "Helen Zaichenko",
"institution": "Educational and Research Institute for Applied System Analysis of the National Technical University of Ukraine \"Igor Sikorsky Kyiv Polytechnic Institute\", Kyiv"
},
{
"author": "Oleksii Kuzmenko",
"institution": "Educational and Research Institute for Applied System Analysis of the National Technical University of Ukraine \"Igor Sikorsky Kyiv Polytechnic Institute\", Kyiv"
}
] |
| author_sort | Bodyanskiy, Yevgeniy |
| baseUrl_str | http://journal.iasa.kpi.ua/oai |
| collection | OJS |
| datestamp_date | 2025-05-20T17:56:07Z |
| description | This paper discusses ANNs of different generations. The efficiency of using computational intelligence in the task of short- and medium-term forecasting in the financial sphere is investigated. For the investigation, a fully connected feed-forward network (Back Propagation), a recurrent network (LSTM), a hybrid deep learning network based on self-organization (GMDH neo fuzzy), and a hybrid system of computational intelligence based on bagging and group method of data handling (HSCI bagging) were chosen. The experimental parameters chosen are the prediction interval, the number of inputs, the percentage of validation data in the training set, and the number of fuzzifiers (for GMDH neo-fuzzy). Experiments were conducted, and the best results for different prediction intervals were compared. The optimal parameters of the networks and the feasibility of their use in the task of forecasting at different intervals are determined. |
| doi_str_mv | 10.20535/SRIT.2308-8893.2025.1.09 |
| first_indexed | 2025-07-17T10:28:34Z |
| format | Article |
| fulltext |
Publisher IASA at the Igor Sikorsky Kyiv Polytechnic Institute, 2025
124 ISSN 1681–6048 System Research & Information Technologies, 2025, № 1
UDC 519.925.51
DOI: 10.20535/SRIT.2308-8893.2025.1.09
INVESTIGATION OF THE EFFECTIVENESS OF ARTIFICIAL
NEURAL NETWORKS OF DIFFERENT GENERATIONS IN THE
TASK OF FORECASTING IN THE FINANCIAL SPHERE
Ye. BODYANSKIY, Yu. ZAYCHENKO, He. ZAICHENKO, O. KUZMENKO
Abstract. This paper discusses ANNs of different generations. The efficiency of using
computational intelligence in the task of short- and medium-term forecasting in
the financial sphere is investigated. For the investigation, a fully connected
feed-forward network (Back Propagation), a recurrent network (LSTM), a hybrid deep
learning network based on self-organization (GMDH neo fuzzy), and a hybrid system
of computational intelligence based on bagging and group method of data handling
(HSCI bagging) were chosen. The experimental parameters chosen are the prediction
interval, the number of inputs, the percentage of validation data in the training
set, and the number of fuzzifiers (for GMDH neo-fuzzy). Experiments were conducted,
and the best results for different prediction intervals were compared. The optimal
parameters of the networks and the feasibility of their use in the task of forecasting
at different intervals are determined.
Keywords: generations of ANNs, Back Propagation, LSTM, GMDH neo fuzzy,
HSCI bagging.
INTRODUCTION
Generations of artificial neural networks represent different stages in the evolution
of artificial intelligence (AI) and machine learning technologies. Each generation
introduces new approaches, architectures, and improvements that make
neural networks more powerful and efficient.
The first generation of artificial neural networks encompasses early
developments in artificial intelligence and machine learning. The basic model of
this generation is the perceptron, developed in 1957 by Frank Rosenblatt [1]. It
was the simplest type of artificial neural network and consisted of three main
components. The S-System (Sensory System) was represented by a set of points
in a TV raster, or a set of photocells. The A-System (Association System) performed
the switching functions between input and output. The R-System (Response System)
typically consisted of a relatively small number of units.
Such a model allowed solving only linearly separable problems, i.e. problems
where a straight line can be drawn to separate data classes. However, it was not
possible to solve more complex problems, such as the XOR-problem, where
classes cannot be separated by a straight line. Although the perceptron was an
important step forward, its capabilities were limited. After researchers realized
that it could not solve nonlinear problems, the development of neural networks
slowed down for a while, but the first generation of neural networks laid the
foundation for future research. It demonstrated the ability of machines to learn
from experience, albeit with limited capabilities. This led to the further
development of more complex models in the following generations.
The second generation of artificial neural networks advanced significantly
compared to the first, introducing multilayer architectures and advanced
learning methods such as the back-propagation algorithm. This generation opened
up new possibilities for solving more complex problems that could not be solved
by simple first-generation models. The main feature of this generation was the
introduction of Multilayer Perceptrons (MLPs). Such networks consist of several
layers of neurons: an input layer, one or more hidden layers, and an output
layer. The second generation includes the Back Propagation neural network,
which was proposed by Rumelhart, Hinton, and Williams in 1986 [2]. The authors
of that paper were the first to show how to train such a network with an arbitrary
number of layers and proposed a recursive gradient-type algorithm for training a
network with an arbitrary structure. This network belongs to the class of
feed-forward neural networks. The Back Propagation Network has been widely used in
numerous tasks of function approximation, forecasting, and pattern recognition.
Its versatility is justified by the Universal Approximation Theorem [3]. With its
multilayer architecture and backpropagation algorithm, the second generation of
neural networks was able to solve problems that are not linearly separable, such
as the XOR-problem. This significantly expanded the scope of neural networks.
The third generation of artificial neural networks was marked by the
emergence of Deep Learning [4], an approach that has led to significant
breakthroughs in many areas of artificial intelligence. Deep neural networks
consist of a large number of layers (deep architecture), which allows modeling
more complex and abstract data representations. Deep Learning requires large
amounts of data for training and powerful computing resources, in particular
graphics processing units (GPUs). This has become possible due to the
development of data storage and processing technologies, as well as
improvements in hardware. It is worth noting that Jürgen Schmidhuber refers to
the Group Method of Data Handling (GMDH) [5; 6] as the earliest deep learning
method, noting that it was used to train a neural network consisting of eight layers
back in 1971 [7]. This method was proposed in the late 1960s and early 1970s by
Academician A.G. Ivakhnenko and his colleagues. It is based on the successive
selection of models, from which more complex models are built. The
modeling accuracy increases at each subsequent step as the model becomes more
complex. Solving complex problems has led to the emergence of new types of
neural networks. A special type of deep neural network, convolutional neural
networks (CNNs) [8], has been developed for working with images. They use
special layers (convolutional layers) that can detect various image features, such
as contours, textures, and objects, at different levels of abstraction. Recurrent
neural networks (RNNs) [9–11] are designed to work with sequences of data,
such as text or audio. They store information about previous processing steps,
which allows them to consider context in natural language processing and speech
recognition tasks. In addition to error backpropagation, the third generation uses
advanced optimization, regularization, and normalization techniques to train deep
networks more efficiently, reducing the likelihood of overfitting. The third
generation of neural networks has become a real breakthrough in the field of
artificial intelligence. In this generation, artificial neural networks have become
a key tool in the development of intelligent systems that are now used in
everyday life.
The fourth generation of artificial neural networks is characterized by the
emergence of new architectures and methods that have further expanded the
capabilities of artificial intelligence. The main feature of this generation is the
ability to perform several different tasks using a single architecture. Quite often,
new approaches to training are used, such as pre-training methods on large data
sets followed by fine-tuning on specific tasks, which allows creating models that
can easily adapt to different contexts and tasks. This hybrid approach to building
neural networks allows them to be used in such industries as medical diagnostics,
virtual assistants, business process automation, autonomous vehicles,
personalized advertising, and much more. Their versatility and adaptability are
also achieved through self-organization. The fourth generation is represented by
self-organizing deep learning networks [12–15]. The key feature of this type of
network is that it builds its structure in the process of learning. Transformers have
become an important innovation of the fourth generation. Transformers use a self-
attention mechanism that allows the model to efficiently process sequential data,
such as text, and consider the dependencies between sequence elements regardless
of their location. Large Language Models (LLMs), such as GPT (Generative Pre-trained
Transformer), BERT (Bidirectional Encoder Representations from Transformers),
and others, are among the most striking achievements of the fourth generation.
These models are capable of generating, analyzing, and understanding
text at a level that was previously considered unattainable. They can perform
translation, text generation, question answering, and other tasks. In addition to
language models, the fourth generation includes powerful generative models such
as DALL-E, Stable Diffusion, and others. These models can create images based
on textual descriptions, opening up new possibilities in creativity, design, and
many other industries. The fourth generation of artificial neural networks has
significantly expanded the boundaries of what is possible in the field of artificial
intelligence. The use of transformers and large language models has allowed for
new heights in understanding and generating natural language, which has become
the foundation for many innovations in various industries. This generation has
also emphasized the importance of multitasking and versatility, as models have
become capable of performing a variety of tasks using a single architecture. Such
advances continue to change the technological landscape and drive the further
development of artificial intelligence, making it even more powerful and useful in
various aspects of life.
What will the next generation of artificial neural networks look like? What
opportunities will we have? What tasks will we be able to solve? To answer these
questions, it is worth taking a closer look at the evolution of the structure and
training algorithms of artificial neural networks and exploring their capabilities in practice.
EVOLUTION OF ANNS
The perceptron [1] is the basis for more complex neural networks, where hidden
layers are added to solve nonlinear problems and work with more complex data.
The Back Propagation neural network has a multilayer fully connected architecture.
During training, neuron weights are adjusted to reduce the error between the
predicted result and the actual value [2]. This process ensures efficient training of
the neural network based on a large amount of data [3].
The third-generation LSTM neural network is more complex (Fig. 1).
A forward pass through a single LSTM block consists of several main steps
that interact with the internal network state [4; 10; 11]. The first
step is a forget gate unit that decides which information should be erased from the
internal state. The internal state stores information from all previous steps. The
process of “forgetting” information from previous steps can be expressed using
a sigmoid function and weight matrices:

$$f_i^{(t)} = \sigma\Big(b_i^f + \sum_j U_{i,j}^f x_j^{(t)} + \sum_j W_{i,j}^f h_j^{(t-1)}\Big), \qquad (1)$$

where $x^{(t)}$ is the current input vector at time step $t$; $h^{(t)}$ is the hidden unit vector at
the current time step, which contains information from the LSTM block outputs at the
previous time steps; $b^f$ is the forget gate bias vector; $U^f$ is the matrix of input
weights for the forget gate; $W^f$ is the matrix of recurrent weights for the forget gate.
The next step for LSTM block consists of several intermediate steps. First,
the input gate decides which information in the internal state should be updated
with new data. Then, the network creates a list of new elements that reflect new
information that should be added to the internal state. Finally, the network
combines all information from previous steps and updates the internal state $s_i^{(t)}$.
All these operations are described with the following equations:

$$s_i^{(t)} = f_i^{(t)} s_i^{(t-1)} + g_i^{(t)}\,\sigma\Big(b_i + \sum_j U_{i,j} x_j^{(t)} + \sum_j W_{i,j} h_j^{(t-1)}\Big), \qquad (2)$$

$$g_i^{(t)} = \sigma\Big(b_i^g + \sum_j U_{i,j}^g x_j^{(t)} + \sum_j W_{i,j}^g h_j^{(t-1)}\Big), \qquad (3)$$

where $b$ is the bias vector of the LSTM block; $U$ is the matrix of input weights of the LSTM
block; $W$ is the matrix of recurrent weights of the LSTM block; $g_i^{(t)}$ is the external input gate
function.
The last step of the LSTM block decides which information should be returned
as output. The output value is calculated using the output gate mechanism:

$$h_i^{(t)} = \tanh\big(s_i^{(t)}\big)\, q_i^{(t)}, \qquad (4)$$

$$q_i^{(t)} = \sigma\Big(b_i^o + \sum_j U_{i,j}^o x_j^{(t)} + \sum_j W_{i,j}^o h_j^{(t-1)}\Big), \qquad (5)$$

where $b^o$, $U^o$, $W^o$ are, respectively, the bias vector and the input and recurrent weight
matrices of the output gate.

Fig. 1. LSTM recurrent network architecture

To train an LSTM, the stochastic gradient method and its modern modifications
are used. The LSTM architecture has been successful on real-world tasks in different
domains and handles long-term dependencies much better than
plain RNNs.
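The gate computations described above can be traced in a few lines of code. The following is a minimal pure-Python sketch of one forward pass of an LSTM block, not the implementation used in the experiments. The function names, the `params` dictionary and its keys, and the use of a sigmoid for the candidate input (as in Goodfellow et al. [4]) are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function used by the gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(b, U, W, x, h_prev):
    """Return b[i] + sum_j U[i][j]*x[j] + sum_j W[i][j]*h_prev[j] for every cell i."""
    return [
        b[i]
        + sum(U[i][j] * x[j] for j in range(len(x)))
        + sum(W[i][j] * h_prev[j] for j in range(len(h_prev)))
        for i in range(len(b))
    ]


def lstm_step(x, h_prev, s_prev, params):
    """One forward pass of an LSTM block.

    params maps the (hypothetical) keys "f", "g", "o" (forget, input and
    output gates) and "c" (candidate input) to a (b, U, W) triple.
    """
    f = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]  # forget gate
    g = [sigmoid(z) for z in affine(*params["g"], x, h_prev)]  # external input gate
    c = [sigmoid(z) for z in affine(*params["c"], x, h_prev)]  # candidate content
    q = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]  # output gate
    # Keep part of the old state and add the gated new content.
    s = [f[i] * s_prev[i] + g[i] * c[i] for i in range(len(s_prev))]
    # The block output is the squashed state, filtered by the output gate.
    h = [math.tanh(s[i]) * q[i] for i in range(len(s))]
    return h, s
```

With all weights and biases set to zero, every gate outputs 0.5, so starting from a zero state the new internal state is 0.25 in every cell and the output is 0.5·tanh(0.25).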
A hybrid deep learning network based on self-organization, GMDH-neo-fuzzy,
is a fourth-generation network (Fig. 2) [12; 13].
An $(n \times 1)$-dimensional vector of input signals is fed to the system's input layer.
This signal is then transferred to the first hidden layer. This layer
contains $n_1 = C_n^2$ nodes, and each of these neurons has only two inputs.
At the outputs of the nodes $N^{[1]}$ of the first hidden layer, the output signals $\hat{y}_l^{[1]}$ are formed.
These signals are then fed to the selection block of the first hidden layer.
From the output signals $\hat{y}_l^{[1]}$ it selects the $n_1^*$ most precise signals ($n_1^* = F$ is the so-called
freedom of choice) by some chosen criterion (mostly by the mean
squared error $\sigma^2_{y_l^{[1]}}$). From these $n_1^*$ best outputs of the first hidden layer $\hat{y}_l^{[1]*}$,
$n_2 = C^2_{n_1^*}$ pairwise combinations $\hat{y}_l^{[1]*}$, $\hat{y}_p^{[1]*}$ are formed. These signals are fed to
the second hidden layer, which is formed by the neurons $N^{[2]}$. After these neurons are trained,
the output signals of this layer $\hat{y}_l^{[2]}$ are transferred to the selection block
$SB^{[2]}$, which chooses the $F$ best neurons by accuracy (e.g. by the value of $\sigma^2_{y_l^{[2]}}$), provided that
the best signal of the second layer is better than the best signal of the first hidden
layer $\hat{y}_1^{[1]*}$. The other hidden layers form their signals similarly. The system evolution
process continues until the best signal of the selection block $SB^{[s+1]}$ turns out to be
worse than the best signal of the previous, $s$-th, layer. Then we return to the
previous layer and choose its best node (neuron) $N^{[s]}$ with the output signal $\hat{y}^{[s]}$. Moving
from this neuron (node) backwards along its connections and sequentially
passing through all previous layers, we finally obtain the structure of the GMDH-neo-fuzzy
network.
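The bookkeeping of this layer-by-layer synthesis can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, and the training of the individual nodes, which would supply the error values, is left out.

```python
from itertools import combinations


def candidate_pairs(n_signals):
    """All two-input combinations that feed the nodes of the next hidden layer."""
    return list(combinations(range(n_signals), 2))


def select_best(node_errors, freedom_of_choice):
    """Indices of the F nodes with the smallest error, best first."""
    ranked = sorted(range(len(node_errors)), key=lambda i: node_errors[i])
    return ranked[:freedom_of_choice]


def keep_growing(best_error_previous, best_error_new):
    """Add another layer only while the best node of the new layer improves."""
    return best_error_new < best_error_previous
```

For example, five input signals give ten two-input nodes in the first hidden layer, and with a freedom of choice of 2 only the two most accurate of them pass their outputs on.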
Fig. 2. The process of synthesis of the GMDH-neo-fuzzy network structure
It should be noted that in this way not only the optimal structure of the
network but also a well-trained network is obtained, owing to the GMDH
algorithm [5; 6]. Besides, since the training is performed sequentially, layer by layer,
the problems of high dimensionality as well as vanishing or exploding gradients
are avoided.
Let us consider the architecture of the node that is used
as a neuron of the GMDH system. As a node of this structure, the neo-fuzzy
neuron (NFN) proposed by Takeshi Yamakawa and co-authors [14] is used. The
neo-fuzzy neuron is a nonlinear multi-input single-output system shown in Fig. 3. The
main difference between this node and the general neo-fuzzy neuron structure is that
each node uses only two inputs.
It realizes the following mapping:

$$\hat{y} = \sum_{i=1}^{2} f_i(x_i), \qquad (6)$$

where $x_i$ is the $i$-th input ($i = 1, 2, \ldots, n$) and $\hat{y}$ is the system output. The structural blocks of
the neo-fuzzy neuron are nonlinear synapses $NS_i$, which transform the
input signal in the form

$$f_i(x_i) = \sum_{j=1}^{h} w_{ji}\,\mu_{ji}(x_i) \qquad (7)$$

and realize the fuzzy inference: if $x_i$ is $x_{ji}$, then the output is $w_{ji}$, where $x_{ji}$ is a fuzzy
set whose membership function is $\mu_{ji}$, and $w_{ji}$ is a synaptic weight in the consequent.

Fig. 3. Neo-fuzzy neuron with two inputs
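To make the mapping concrete, here is a small sketch of a two-input neo-fuzzy neuron. The text does not specify the shape of the membership functions, so evenly spaced triangular functions on [0, 1] are an assumption, as are all function names.

```python
def triangular_memberships(x, h, lo=0.0, hi=1.0):
    """Membership levels of x in h evenly spaced triangular fuzzy sets on [lo, hi].

    Requires h >= 2. Inputs are clamped to [lo, hi] so the levels sum to 1.
    """
    step = (hi - lo) / (h - 1)
    x = min(max(x, lo), hi)
    return [max(0.0, 1.0 - abs(x - (lo + j * step)) / step) for j in range(h)]


def nonlinear_synapse(x, weights, lo=0.0, hi=1.0):
    """Weighted sum of the membership levels of one input."""
    mu = triangular_memberships(x, len(weights), lo, hi)
    return sum(w * m for w, m in zip(weights, mu))


def neo_fuzzy_neuron(x1, x2, w1, w2):
    """Output of a two-input node: the sum of its two nonlinear synapses."""
    return nonlinear_synapse(x1, w1) + nonlinear_synapse(x2, w2)
```

With three fuzzy sets per input, the value 0.25 belongs equally to the first two sets, so a synapse with weights 0, 1, 2 returns 0.5. Because at most two neighbouring sets are active for any input, the output stays linear in the weights, which is what allows least squares training.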
The learning criterion (goal function) is the standard local quadratic error
function:

$$E(k) = \frac{1}{2}\big(y(k) - \hat{y}(k)\big)^2 = \frac{1}{2}e^2(k) = \frac{1}{2}\Big(y(k) - \sum_{i=1}^{2}\sum_{j=1}^{h} w_{ji}\,\mu_{ji}(x_i(k))\Big)^2. \qquad (8)$$
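It is minimized via the conventional stochastic gradient descent algorithm.
If an a priori defined data set is available, the training process can be performed
in batch mode in one epoch using the conventional least squares method

$$w^{[1]}(N) = \Big(\sum_{k=1}^{N} \mu^{[1]}(k)\,\mu^{[1]T}(k)\Big)^{+} \sum_{k=1}^{N} \mu^{[1]}(k)\,y(k) = P^{[1]}(N) \sum_{k=1}^{N} \mu^{[1]}(k)\,y(k), \qquad (9)$$

where $(\bullet)^{+}$ denotes the Moore–Penrose pseudoinverse, $\mu^{[1]}(k)$ is the vector of membership
function values of a first-layer node at step $k$, and $y(k)$ denotes the external
reference signal (real value).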
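If training observations are fed sequentially in on-line mode, the recursive
form of the least squares method can be used:

$$w_l^{ij}(k) = w_l^{ij}(k-1) + \frac{P^{ij}(k-1)\big(y(k) - (w_l^{ij}(k-1))^{T}\mu^{ij}(x(k))\big)\,\mu^{ij}(x(k))}{1 + (\mu^{ij}(x(k)))^{T} P^{ij}(k-1)\,\mu^{ij}(x(k))},$$

$$P^{ij}(k) = P^{ij}(k-1) - \frac{P^{ij}(k-1)\,\mu^{ij}(x(k))\,(\mu^{ij}(x(k)))^{T} P^{ij}(k-1)}{1 + (\mu^{ij}(x(k)))^{T} P^{ij}(k-1)\,\mu^{ij}(x(k))}. \qquad (10)$$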
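A single step of such a recursive least squares update can be sketched in plain Python as follows. The sketch assumes a generic model that is linear in its weights; the name `rls_update`, the argument names, and the large initial matrix used in the example are illustrative choices rather than details from the paper.

```python
def rls_update(w, P, phi, y):
    """One recursive least squares step for the model y_hat = w^T phi.

    w   - current weight vector (list of floats)
    P   - current symmetric matrix P(k-1) (list of lists)
    phi - regressor vector, e.g. the membership levels of the current sample
    y   - reference signal for the current sample
    """
    n = len(w)
    # P(k-1) * phi, reused in both updates (P is symmetric).
    P_phi = [sum(P[r][c] * phi[c] for c in range(n)) for r in range(n)]
    denom = 1.0 + sum(phi[r] * P_phi[r] for r in range(n))
    # A priori error: reference minus the prediction made with the old weights.
    error = y - sum(w[r] * phi[r] for r in range(n))
    w_new = [w[r] + P_phi[r] * error / denom for r in range(n)]
    P_new = [[P[r][c] - P_phi[r] * P_phi[c] / denom for c in range(n)]
             for r in range(n)]
    return w_new, P_new
```

Starting from zero weights and a large diagonal matrix, a handful of samples generated by a fixed linear rule is enough for the weights to settle close to that rule's coefficients.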
Consider the HSCI-bagging network (Fig. 4). It is a hybrid system of computational
intelligence (HSCI) built on the basis of the ensemble approach and
bagging, which builds its architecture in the process of learning based on the
ideas of GMDH [15].
The architecture of the system contains $2S$ sequentially connected stacks:
the odd stacks are formed by ensembles of parallel-connected subsystems that
solve the same problem (recognition, prediction, etc.), and the even ones are
essentially learning metamodels that generalize the output signals of the ensembles
and form optimal results in the sense of the accepted criterion. The output signals
of the first metamodel are the generalized optimal signal $y^{*[1]}(k)$ and the $(n-1)$ output
signals $\hat{y}_i^{[1]}(k)$, $i = 1, 2, \ldots, n-1$, of the “best members of the ensemble”. At their core,
metamodels function as selection units in traditional GMDH systems, but they not only
select the best results from the previous stack, they also form the optimal solution
based on these results.

Fig. 4. Hybrid system of computational intelligence based on bagging and GMDH
Further, the output signals of the first metamodel are fed to the inputs of the
second ensemble, which is completely similar to the first. The outputs of the second
ensemble $\hat{y}_1^{[2]}(k), \hat{y}_2^{[2]}(k), \ldots, \hat{y}_q^{[2]}(k)$ go to the second metamodel, which
calculates the optimal signal $y^{*[2]}(k)$ and the $(n-1)$ signals $\hat{y}_i^{[2]}(k)$ “closest” to it. The last,
$S$-th, ensemble is similar to the first two, and the output of the last, $S$-th, metamodel
is $y^{*[s]}(k)$, which corresponds to the a priori established requirements for the
quality of solving the problem under consideration.
Each of the ensembles contains $q$ different computational intelligence systems
that solve the same problem. These may be simple neural networks such
as a single-layer perceptron, a radial basis function network (RBFN), or a
counterpropagation neural network, which do not use the error backpropagation procedure for
training; neuro-fuzzy systems of the ANFIS, Wang–Mendel, or Takagi–Sugeno–Kang
type; wavelet-neuro systems; neo-fuzzy neurons; and others whose output signal
depends linearly on the adapted parameters, which allows the use of
speed-optimal learning algorithms.
The input information, on the basis of which the system is configured, is a
training sample of input signals

$$x(1), x(2), \ldots, x(k), \ldots, x(N), \qquad x(k) = (x_1(k), \ldots, x_i(k), \ldots, x_n(k))^{T} \in R^n \qquad (11)$$

and the corresponding scalar reference signals $y(1), \ldots, y(k), \ldots, y(N)$. On the basis of
these observations, the elements of the first ensemble are tuned independently of
each other. At their outputs, $q$ scalar signals $\hat{y}_p^{[1]}(k)$, $p = 1, 2, \ldots, q$, are
formed, which are conveniently represented in the form of a vector
$\hat{y}^{[1]}(k) = (\hat{y}_1^{[1]}(k), \ldots, \hat{y}_p^{[1]}(k), \ldots, \hat{y}_q^{[1]}(k))^{T}$. These signals are sent to the inputs of
the first metamodel, at the outputs of which $n$ sequences $y^{*[1]}(k)$,
$\hat{y}_1^{[1]}(k), \ldots, \hat{y}_i^{[1]}(k), \ldots, \hat{y}_{n-1}^{[1]}(k)$ are formed, the main of which is $y^{*[1]}(k)$, while the others are
auxiliary. The main signal of the metamodel $y^{*[1]}(k)$ is the union of the outputs
of all members of the ensemble in the form

$$y^{*[1]}(k) = \sum_{p=1}^{q} w_p^{*[1]}\,\hat{y}_p^{[1]}(k) = \hat{y}^{[1]T}(k)\,w^{*[1]}, \qquad (12)$$

where $w^{*[1]} = (w_1^{*[1]}, \ldots, w_p^{*[1]}, \ldots, w_q^{*[1]})^{T}$ is the vector of adapted parameters (synaptic
weights), on which an additional unbiasedness constraint is imposed:

$$\sum_{p=1}^{q} w_p^{*[1]} = I_q^{T} w^{*[1]} = 1, \qquad (13)$$

where $I_q$ is the $(q \times 1)$ vector of ones.
The problem of training the first metamodel is reduced to minimizing the
standard quadratic criterion in the presence of additional constraints.
Thus, it can be solved using the
standard method of penalty functions, which in this case reduces to minimizing
the expression

$$J(w^{*[1]}, \rho) = \big(Y(N) - \hat{Y}^{[1]}(N)\,w^{*[1]}\big)^{T}\big(Y(N) - \hat{Y}^{[1]}(N)\,w^{*[1]}\big) + \rho^{-2}\big(1 - I_q^{T} w^{*[1]}\big)^2, \qquad (14)$$

where $Y(N) = (y(1), \ldots, y(k), \ldots, y(N))^{T}$ is an $(N \times 1)$ vector,
$\hat{Y}^{[1]}(N) = (\hat{y}^{[1]}(1), \ldots, \hat{y}^{[1]}(k), \ldots, \hat{y}^{[1]}(N))^{T}$ is an $(N \times q)$ matrix, and $\rho$ is the penalty coefficient.
As a result of training the first metamodel, the optimal signal $y^{*[1]}(k)$ is
formed at its output, as well as $q$ signals $\hat{y}_p^{[1]}(k)$, from which we choose
the $n-1$ signals (if $q > n$) with the highest levels of fuzzy membership $\mu_p^{[1]}$. These
are subsequently fed, in the form of an $(n \times 1)$ vector, to the input of the second
ensemble, whose outputs go to the inputs of the second metamodel, and so
on. The process of increasing the number of ensembles and metamodels continues
until the required accuracy of the last metamodel with the output $y^{*[s]}(k)$ is
achieved, or until the value of the criterion minimized for the bagging model begins to
increase, i.e. $\sigma^2\big(y^{*[s+1]}(k)\big) > \sigma^2\big(y^{*[s]}(k)\big)$.
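The penalty-function idea for combining ensemble outputs can be illustrated with a short sketch. It minimizes the squared error plus a penalty on the deviation of the weight sum from one and solves the resulting normal equations directly. The function names, the explicit multiplier `penalty`, and the small Gaussian elimination helper are assumptions made for this example and are not taken from the paper.

```python
def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z


def metamodel_weights(Y_hat, y, penalty=1e6):
    """Weights minimising ||y - Y_hat w||^2 + penalty * (1 - sum(w))^2.

    Y_hat - N rows, each holding the q ensemble outputs for one observation
    y     - N reference values
    """
    q = len(Y_hat[0])
    # Normal equations: (Y_hat^T Y_hat + penalty * 1 1^T) w = Y_hat^T y + penalty * 1
    A = [[sum(row[a] * row[b] for row in Y_hat) + penalty for b in range(q)]
         for a in range(q)]
    rhs = [sum(row[a] * t for row, t in zip(Y_hat, y)) + penalty for a in range(q)]
    return solve_linear(A, rhs)
```

A large `penalty` pushes the weights towards summing to one, and the more accurate ensemble member receives the larger weight.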
DATA SET
Data on the dynamics of the Dow Jones Industrial Average (DJIA)
index from August 2, 2022 to July 31, 2024 were used for forecasting [16]. The
dynamics of the DJIA Close values are shown in Fig. 5.
Fig. 5. Dynamics of the index Close DJIA
The correlogram of the DJIA Close values is presented in Fig. 6.
Analyzing the presented curve, one may conclude that there is a strong correlation
between preceding and succeeding values: even at a lag of 50 days, the
correlation is greater than 0.7.
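For reference, the values plotted in a correlogram are sample autocorrelation coefficients. A minimal sketch of how one such coefficient can be computed is given below; the function name is illustrative, and the commonly used biased estimator (normalized by the total variance) is assumed.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var
```

Evaluating this function for lags 0, 1, 2, and so on, and plotting the results against the lag, gives the correlogram. A slowly decaying curve indicates that past values carry information about future ones.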
EXPERIMENTAL INVESTIGATION AND DISCUSSION
Experimental studies of the accuracy of index forecasting were
conducted using the following networks: Back Propagation, LSTM, GMDH-neo-fuzzy, and HSCI-bagging.
During the experiments, the values of the parameters presented in Table 1 were
varied. The data set was split into three subsets: training, validation, and test.
The test subset for all experiments had a fixed size (the last 30 points of the
data set) and was not used for training or validation.
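One possible way to prepare such data is sketched below. The paper does not describe how the input windows were built, so this is an assumption: each sample pairs `n_inputs` consecutive values with the value `interval` steps after the window, and the split is chronological. All names are hypothetical.

```python
def make_samples(series, n_inputs, interval):
    """Pair each window of n_inputs values with the value `interval` steps after it."""
    samples = []
    for start in range(len(series) - n_inputs - interval + 1):
        window = series[start:start + n_inputs]
        target = series[start + n_inputs + interval - 1]
        samples.append((window, target))
    return samples


def split_samples(samples, test_size=30, validation_split=0.2):
    """Chronological split into training, validation and test subsets."""
    test = samples[-test_size:]          # the last points are held out for testing
    rest = samples[:-test_size]
    n_val = int(round(len(rest) * validation_split))
    train = rest[:len(rest) - n_val]     # oldest samples are used for training
    validation = rest[len(rest) - n_val:]
    return train, validation, test
```

For a series of 100 points with 3 inputs and an interval of 1, this produces 97 samples: 54 for training, 13 for validation, and 30 for testing.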
Table 1. Experimental parameters

| Parameter | Values |
|---|---|
| Interval | 1; 3; 5; 7; 15; 20; 30 |
| Number of inputs | 3; 4; 5 |
| Number of fuzzifiers | 2; 3; 4 |
| Validation split | 0.4; 0.3; 0.2 |
After training, the accuracy of the models was checked on a test sample. For
each network, the best results of the forecast accuracy according to the MAPE
criterion were determined.
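The two accuracy measures reported in the tables can be computed as follows. The paper does not spell out the formulas, so the standard definitions are assumed here, with MAPE expressed in percent.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

For instance, forecasts of 110 and 190 against actual values of 100 and 200 give an MSE of 100 and a MAPE of 7.5%. Unlike MSE, MAPE does not depend on the scale of the index, which makes it convenient for comparing results across intervals.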
The first set of experiments was conducted with a back propagation network
(2nd generation). It had three hidden layers: the first layer of 7 neurons, the second
layer of 5 neurons, and the third layer of 3 neurons. The output was a single
signal. The best prediction results of this network for all intervals are shown in
Table 2.
Fig. 6. Correlogram of the index Close DJIA
Table 2. The best results of the Back Propagation network

| Interval | Number of inputs | Validation split | MSE | MAPE, % |
|---|---|---|---|---|
| 1 | 3 | 0.2 | 1644597.0402 | 2.9792 |
| 3 | 3 | 0.2 | 1824141.7299 | 3.0973 |
| 5 | 3 | 0.2 | 1820227.1563 | 3.1028 |
| 7 | 3 | 0.2 | 1847014.9194 | 3.1448 |
| 15 | 4 | 0.2 | 1926517.248 | 3.175 |
| 20 | 4 | 0.2 | 4176396.3173 | 4.8037 |
| 30 | 3 | 0.2 | 24150324.9138 | 12.1998 |
The second set of experiments investigated the prediction accuracy of the
3rd generation network — LSTM. It had the following structure: input signals
determined by an experimental parameter, three hidden layers (32 neurons, 16
neurons, and 8 neurons), and one neuron with an output signal. For all forecasting
intervals, the best results were determined and are shown in Table 3.
For the Back Propagation and LSTM networks, the structure was selected
using Cross-Validation and Grid Search methods. As a result, optimal structures
were obtained and used that do not have too many hidden layers. This approach
made it possible to use minimal computational costs to obtain sufficiently high
accuracy of the results.
Table 3. The best results of the LSTM network

| Interval | Number of inputs | Validation split | MSE | MAPE, % |
|---|---|---|---|---|
| 1 | 5 | 0.2 | 96500.7026 | 0.582 |
| 3 | 4 | 0.2 | 258966.107 | 0.979 |
| 5 | 5 | 0.2 | 474929.5015 | 1.3487 |
| 7 | 5 | 0.2 | 537666.5732 | 1.4243 |
| 15 | 3 | 0.2 | 1159702.2855 | 2.0442 |
| 20 | 4 | 0.2 | 2847643.2154 | 3.7911 |
| 30 | 4 | 0.3 | 4233239.0843 | 4.6857 |
The third set of experiments was conducted to determine the forecasting
accuracy of the 4th generation network — GMDH-neo-fuzzy. The structure of the
network was synthesized during training and in most cases had two hidden layers
and one layer with the output signal. After comparing the obtained forecasting
results on the test subsample, Table 4 was created with the best results and
optimal network parameters for all intervals.
Table 4. The best results of the GMDH-neo-fuzzy network

| Interval | Number of inputs | Number of fuzzifiers | Validation split | MSE | MAPE, % |
|---|---|---|---|---|---|
| 1 | 3 | 2 | 0.3 | 155027.4886 | 0.7004 |
| 3 | 3 | 3 | 0.2 | 332615.1181 | 1.1576 |
| 5 | 5 | 3 | 0.3 | 417632.8307 | 1.198 |
| 7 | 5 | 4 | 0.3 | 394795.8022 | 1.2579 |
| 15 | 5 | 3 | 0.4 | 622501.8958 | 1.6785 |
| 20 | 4 | 3 | 0.3 | 1086960.4794 | 2.204 |
| 30 | 4 | 3 | 0.3 | 1138011.8375 | 2.353 |
The last series of experiments was conducted to evaluate the prediction accuracy
of the HSCI-bagging network, which also belongs to the 4th generation. The
previous networks were used for its implementation. The best forecasting results
are presented in Table 5.
Table 5. The best results of the HSCI-bagging network

| Interval | Number of inputs | Validation split | MSE | MAPE, % |
|---|---|---|---|---|
| 1 | 5 | 0.2 | 80602.0465 | 0.5054 |
| 3 | 4 | 0.2 | 248445.23 | 0.9685 |
| 5 | 4 | 0.2 | 384022.2476 | 1.1568 |
| 7 | 4 | 0.2 | 409905.6681 | 1.2263 |
| 15 | 5 | 0.2 | 595730.7206 | 1.6062 |
| 20 | 5 | 0.2 | 827577.2912 | 2.011 |
| 30 | 4 | 0.2 | 978388.1249 | 2.1217 |
Based on the data from Tables 2–5, Table 6 was created to compare the
forecasting results according to the MAPE criterion.
Table 6. Comparative table of the best forecasting results (MAPE, %)

| Interval | Back Propagation | LSTM | GMDH-neo-fuzzy | HSCI-bagging |
|---|---|---|---|---|
| 1 | 2.9792 | 0.582 | 0.7004 | 0.5054 |
| 3 | 3.0973 | 0.979 | 1.1576 | 0.9685 |
| 5 | 3.1028 | 1.3487 | 1.198 | 1.1568 |
| 7 | 3.1448 | 1.4243 | 1.2579 | 1.2263 |
| 15 | 3.175 | 2.0442 | 1.6785 | 1.6062 |
| 20 | 4.8037 | 3.7911 | 2.204 | 2.011 |
| 30 | 12.1998 | 4.6857 | 2.353 | 2.1217 |
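The ranking implied by Table 6 can be checked mechanically. The short script below copies the MAPE values from the table and determines the best and worst network for each interval. The variable names are illustrative.

```python
networks = ["Back Propagation", "LSTM", "GMDH-neo-fuzzy", "HSCI-bagging"]

# MAPE values copied from Table 6, one row per forecasting interval.
table6 = {
    1: [2.9792, 0.582, 0.7004, 0.5054],
    3: [3.0973, 0.979, 1.1576, 0.9685],
    5: [3.1028, 1.3487, 1.198, 1.1568],
    7: [3.1448, 1.4243, 1.2579, 1.2263],
    15: [3.175, 2.0442, 1.6785, 1.6062],
    20: [4.8037, 3.7911, 2.204, 2.011],
    30: [12.1998, 4.6857, 2.353, 2.1217],
}

# Network with the lowest / highest MAPE at each interval.
best = {k: networks[row.index(min(row))] for k, row in table6.items()}
worst = {k: networks[row.index(max(row))] for k, row in table6.items()}

# Intervals at which LSTM has a lower MAPE than GMDH-neo-fuzzy.
lstm_ahead = [k for k, row in table6.items() if row[1] < row[2]]
```

On these values HSCI-bagging has the lowest MAPE and Back Propagation the highest at every interval, while LSTM is ahead of GMDH-neo-fuzzy only at intervals 1 and 3.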
For the convenience of analyzing the results, a comparative graph of the average
forecast accuracy according to the MAPE criterion for all the investigated networks
at each of the intervals was also constructed. This graph is shown in Fig. 7.
Fig. 7. Comparative diagram of forecast accuracy according to the MAPE criterion (horizontal axis: Interval)
Thus, the results of the conducted studies show that the Back Propagation
network showed the worst results at all intervals. The best forecasting accuracy
according to the MAPE criterion was obtained using the 4th generation
HSCI-bagging network, which is slightly better than the hybrid GMDH-neo-fuzzy network.
The LSTM recurrent network showed good results on short-term intervals, but
starting from interval 5, it is inferior to the GMDH-neo-fuzzy network.
CONCLUSION
This article considers the problem of short- and medium-term forecasting in the
financial sector using the Dow Jones Industrial Average (DJIA) dataset.
Experimental investigations of the forecasting accuracy of neural networks
of different generations were conducted: Back Propagation (2nd generation),
LSTM (3rd generation), GMDH-neo-fuzzy (4th generation) and HSCI-bagging
(4th generation).
During the experiments, the optimal parameters of each network, that is, those
at which it demonstrated the best forecasting results, were determined for each
of the short- and medium-term intervals.
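A parameter search of this kind can be sketched as follows. This is only an illustration under stated assumptions: the series is synthetic, the forecaster is a stand-in (the mean of the input window) rather than any of the studied networks, and the validation split is used here simply to hold out the most recent windows for scoring. The helper names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Window = Tuple[List[float], float]


def make_windows(series: Sequence[float], n_inputs: int, interval: int) -> List[Window]:
    """Pair each window of n_inputs values with the value `interval` steps after it."""
    pairs = []
    for start in range(len(series) - n_inputs - interval + 1):
        window = list(series[start:start + n_inputs])
        target = series[start + n_inputs + interval - 1]
        pairs.append((window, target))
    return pairs


def validation_mape(series: Sequence[float], n_inputs: int, interval: int,
                    validation_split: float,
                    forecast: Callable[[List[float]], float]) -> float:
    """MAPE (in percent) of `forecast` on the last share of windows."""
    pairs = make_windows(series, n_inputs, interval)
    if not pairs:
        raise ValueError("series is too short for this n_inputs and interval")
    n_val = max(1, int(len(pairs) * validation_split))
    held_out = pairs[-n_val:]
    total = sum(abs((t - forecast(w)) / t) for w, t in held_out)
    return 100.0 * total / len(held_out)


def mean_forecast(window: List[float]) -> float:
    """Stand-in model (assumption): predict the mean of the input window."""
    return sum(window) / len(window)


# Synthetic trending series with an alternating disturbance, for illustration only.
series = [100.0 + 0.5 * t + (3.0 if t % 2 else -3.0) for t in range(60)]

# Score each candidate number of inputs and keep the one with the lowest MAPE.
scores = {n: validation_mape(series, n, interval=1, validation_split=0.2,
                             forecast=mean_forecast)
          for n in (3, 4, 5)}
best_n = min(scores, key=scores.get)
print(scores, "-> best number of inputs:", best_n)
```

In an actual experiment, `mean_forecast` would be replaced by a trained network, and the search would be repeated for every interval and every candidate validation split.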
The forecast accuracy of all networks by the MAPE criterion was compared at
short- and medium-term intervals. The best forecasting results were obtained
using HSCI-bagging. The GMDH-neo-fuzzy hybrid network showed slightly worse
results, but from interval 5 onward it outperformed the studied networks of
previous generations.
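The results of the investigation show that, in general, the forecasting
accuracy increases with the generation of neural networks. In addition, the
advantage of the latest generations of artificial neural networks is most
pronounced on medium-term intervals: at interval 30, MAPE is 12.1998 for Back
Propagation and 4.6857 for LSTM, versus 2.353 for GMDH-neo-fuzzy and 2.1217 for
HSCI-bagging.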
REFERENCES
1. Frank Rosenblatt, “The Perceptron: A Probabilistic Model for Information Storage and
Organization in the Brain,” Psychological Review, vol. 65, no. 6, pp. 386–408, 1958.
2. David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams, “Learning
representations by back-propagating errors,” Nature, vol. 323, pp. 533–536, 1986.
doi: 10.1038/323533a0
3. G.V. Cybenko, “Approximation by Superpositions of a Sigmoidal Function,”
Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303–314, 1989.
doi: 10.1007/BF02551274
4. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
Available: http://www.deeplearningbook.org
5. A.G. Ivakhnenko, V.G. Lapa, Cybernetic forecasting devices (in Ukrainian). K.:
Naukova Dumka, 1965, 216 p.
6. A.G. Ivakhnenko, G.A. Ivakhnenko, and J.A. Mueller, “Self-organization of the
neural networks with active neurons,” Pattern Recognition and Image Analysis,
vol. 4, no. 2, pp. 177–188, 1994.
7. Jürgen Schmidhuber, “Deep learning in neural networks: An overview,” Neural
Networks, pp. 85–117, 2015. doi: 10.1016/j.neunet.2014.09.003
8. Purwono Purwono et al., “Understanding of Convolutional Neural Network (CNN):
A Review,” International Journal of Robotics and Control Systems, vol. 2, no. 4,
pp. 739–748, 2023. doi: 10.31763/ijrcs.v2i4.888
9. B. Hammer, “On the approximation capability of recurrent neural networks,”
Neurocomputing, vol. 31, pp. 107–123, 1998. doi: 10.1016/S0925-2312(99)00174-5
10. S. Hochreiter, J. Schmidhuber, “Long short-term memory,” Neural Computation,
vol. 9, pp. 1735–1780, 1997. doi: 10.1162/neco.1997.9.8.1735
11. C. Olah, Understanding LSTM networks. 2020. Available:
https://colah.github.io/posts/2015-08-Understanding-LSTMs
12. Yu. Zaychenko, Galib Hamidov, “The Hybrid Deep Learning GMDH-neo-fuzzy
Neural Network and Its Applications,” Proceedings of 13-th IEEE International
Conference Application of Information and Communication Technologies-AICT2019.
23–25 October 2019, Baku, pp. 72–77. doi: 10.1109/AICT47866.2019.8981725
13. Evgeniy Bodyanskiy, Yuriy Zaychenko, Olena Boiko, Galib Hamidov, and Anna
Zelikman, “Structure Optimization and Investigations of Hybrid GMDH-Neo-fuzzy
Neural Networks in Forecasting Problems,” System Analysis & Intelligent
Computing; Eds. Michael Zgurovsky, Natalia Pankratova. Book Studies in
Computational Intelligence, SCI, vol. 1022. Springer, 2022, pp. 209–228.
14. T. Yamakawa, E. Uchino, T. Miki, and H. Kusanagi, “A neo-fuzzy neuron and its
applications to system identification and prediction of the system behavior,” Proc. 2nd
Intern. Conf. Fuzzy Logic and Neural Networks “IIZUKA-92”, Iizuka, 1992, pp. 477–483.
15. Ye. Bodyanskiy, O. Kuzmenko, He. Zaichenko, and Yu. Zaychenko, “Hybrid
System of Computational Intelligence based on Bagging and Group Method of Data
Handling,” System Research and Information Technologies, no. 1, pp. 75–85, 2024.
doi: 10.20535/SRIT.2308-8893.2024.1.06
16. “DJIA - Dow Jones Industrial Average Historical Prices,” WSJ. Accessed on: August
1, 2024. [Online]. Available:
https://www.wsj.com/market-data/quotes/index/DJIA/historical-prices
Received 04.09.2024
INFORMATION ON THE ARTICLE
Yevgeniy V. Bodyanskiy, ORCID: 0000-0001-5418-2143, Kharkiv National University of
Radio Electronics, Ukraine, e-mail: yevgeniy.bodyanskiy@nure.ua
Yuriy P. Zaychenko, ORCID: 0000-0001-9662-3269, Educational and Research Institute
for Applied System Analysis of the National Technical University of Ukraine “Igor
Sikorsky Kyiv Polytechnic Institute”, Ukraine, e-mail: zaychenkoyuri@ukr.net
Helen Yu. Zaichenko, ORCID: 0000-0002-4630-5155, Educational and Research
Institute for Applied System Analysis of the National Technical University of Ukraine
“Igor Sikorsky Kyiv Polytechnic Institute”, Ukraine, e-mail: zaichenko.helen@lll.kpi.ua
Oleksii V. Kuzmenko, ORCID: 0000-0003-1581-6224, Educational and Research
Institute for Applied System Analysis of the National Technical University of Ukraine
“Igor Sikorsky Kyiv Polytechnic Institute”, Ukraine, e-mail: oleksii.kuzmenko@ukr.net
INVESTIGATION OF THE EFFECTIVENESS OF ARTIFICIAL NEURAL NETWORKS
(ANNs) OF DIFFERENT GENERATIONS IN THE TASK OF FORECASTING IN THE FINANCIAL
SPHERE / Ye.V. Bodyanskiy, Yu.P. Zaychenko, He.Yu. Zaichenko, O.V. Kuzmenko
Abstract. ANNs of different generations are considered. The efficiency of using
computational intelligence in tasks of short- and medium-term forecasting in the
financial sphere is investigated. For the investigation, a fully connected
feed-forward network (Back Propagation), a recurrent network (LSTM), a hybrid
deep learning network based on self-organization (GMDH-neo-fuzzy), and a hybrid
system of computational intelligence based on bagging and the group method of
data handling (HSCI-bagging) were chosen. The experimental parameters chosen are
the prediction interval, the number of inputs, the percentage of validation data
in the training set, and the number of fuzzifiers (for GMDH-neo-fuzzy).
Experiments were conducted, and the best results obtained for different
prediction intervals were compared. The optimal parameters of the networks and
the feasibility of their use in the task of forecasting at different intervals
are determined.
Keywords: generations of ANNs, Back Propagation, LSTM, GMDH neo fuzzy,
HSCI bagging.
|
| id | journaliasakpiua-article-312420 |
| institution | System research and information technologies |
| keywords_txt_mv | keywords |
| language | English |
| last_indexed | 2025-09-17T09:26:02Z |
| publishDate | 2025 |
| publisher | The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" |
| record_format | ojs |
| resource_txt_mv | journaliasakpiua/fb/73f3a0d4444a0c30cf776ab012c339fb.pdf |
| spelling | journaliasakpiua-article-3124202025-05-20T17:56:07Z Investigation of the effectiveness of artificial neural networks of different generations in the task of forecasting in the financial sphere Дослідження ефективності штучних нейронних мереж (ШНМ) різних поколінь у задачі прогнозування у фінансовій сфері Bodyanskiy, Yevgeniy Zaychenko, Yuriy Zaichenko, Helen Kuzmenko, Oleksii покоління ШНМ Back Propagation LSTM GMDH neo fuzzy HSCI bagging generations of ANNs Back Propagation LSTM GMDH neo fuzzy HSCI bagging This paper discusses ANNs of different generations. The efficiency of using computational intelligence in the task of short- and medium-term forecasting in the financial sphere is investigated. For the investigation, a fully connected feed-forward network (Back Propagation), a recurrent network (LSTM), a hybrid deep learning network based on self-organization (GMDH neo fuzzy), and a hybrid system of computational intelligence based on bagging and group method of data handling (HSCI bagging) were chosen. The experimental parameters chosen are the prediction interval, the number of inputs, the percentage of validation data in the training set, and the number of fuzzifiers (for GMDH neo-fuzzy). Experiments were conducted, and the best results for different prediction intervals were compared. The optimal parameters of the networks and the feasibility of their use in the task of forecasting at different intervals are determined. Розглянуто ШНМ різних поколінь. Досліджено ефективність використання обчислювального інтелекту в задачах коротко- та середньострокового прогнозування у фінансовій сфері. Для дослідження обрано повнозв’язну мережу прямого поширення (Back Propagation), рекурентну мережу (LSTM), гібридну мережу глибокого навчання на основі самоорганізації (GMDH-neo-fuzzy) та гібридну систему обчислювального інтелекту на основі беггінгу та методу групового урахування аргументів (HSCI-bagging). 
Як експериментальні параметри обрано інтервал прогнозування, кількість входів, відсоток валідаційних даних у навчальній вибірці та кількість фазифікаторів (для GMDH-neo-fuzzy). Проведено експерименти та порівняно найкращі результати, отримані для різних інтервалів прогнозування. Визначено оптимальні параметри мереж та доцільність їх використання в задачі прогнозування на різних інтервалах. The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" 2025-03-28 Article Article Peer-reviewed Article application/pdf https://journal.iasa.kpi.ua/article/view/312420 10.20535/SRIT.2308-8893.2025.1.09 System research and information technologies; No. 1 (2025); 124-137 Системные исследования и информационные технологии; № 1 (2025); 124-137 Системні дослідження та інформаційні технології; № 1 (2025); 124-137 2308-8893 1681-6048 en https://journal.iasa.kpi.ua/article/view/312420/319612 |
| title | Дослідження ефективності штучних нейронних мереж (ШНМ) різних поколінь у задачі прогнозування у фінансовій сфері |
| title_alt | Investigation of the effectiveness of artificial neural networks of different generations in the task of forecasting in the financial sphere |
| topic | покоління ШНМ Back Propagation LSTM GMDH neo fuzzy HSCI bagging |
| topic_facet | покоління ШНМ Back Propagation LSTM GMDH neo fuzzy HSCI bagging generations of ANNs Back Propagation LSTM GMDH neo fuzzy HSCI bagging |
| url | https://journal.iasa.kpi.ua/article/view/312420 |