Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй
The problem of clustering data arrays described in both vector and matrix forms and based on the optimization of data distribution density functions in these arrays is considered. For the optimization of these functions, the algorithm that is a hybrid of Fish School Search, random search, and evolut...
Saved in:
| Date: | 2022 |
|---|---|
| Main Authors: | , , |
| Format: | Article |
| Language: | English |
| Published: |
The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"
2022
|
| Subjects: | |
| Online Access: | https://journal.iasa.kpi.ua/article/view/275083 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Journal Title: | System research and information technologies |
| Download file: | |
Institution
System research and information technologies| _version_ | 1867334432593018880 |
|---|---|
| author | Bodyanskiy, Yevgeniy Shafronenko, Alina Pliss, Iryna |
| author_facet | Bodyanskiy, Yevgeniy Shafronenko, Alina Pliss, Iryna |
| author_institution_txt_mv | [
{
"author": "Yevgeniy Bodyanskiy",
"institution": "Kharkiv National University of Radio Electronics, Kharkiv"
},
{
"author": "Alina Shafronenko",
"institution": "Kharkiv National University of Radio Electronics, Kharkiv"
},
{
"author": "Iryna Pliss",
"institution": "Kharkiv National University of Radio Electronics, Kharkiv"
}
] |
| author_sort | Bodyanskiy, Yevgeniy |
| baseUrl_str | http://journal.iasa.kpi.ua/oai |
| collection | OJS |
| datestamp_date | 2023-05-21T20:04:38Z |
| description | The problem of clustering data arrays described in both vector and matrix forms and based on the optimization of data distribution density functions in these arrays is considered. For the optimization of these functions, the algorithm that is a hybrid of Fish School Search, random search, and evolutionary optimization is proposed. This algorithm does not require calculating the optimized function’s derivatives and, in the general case, is designed to find optimums of multiextremal functions of the matrix argument (images). The proposed approach reduces the number of runs of the optimization procedure, finds extrema of complex functions with many extrema, and is simple in numerical implementation. |
| doi_str_mv | 10.20535/SRIT.2308-8893.2022.4.07 |
| first_indexed | 2025-07-17T10:28:06Z |
| format | Article |
| fulltext |
Ye. Bodyanskiy, А. Shafronenko, І. Pliss, 2022
Системні дослідження та інформаційні технології, 2022, № 4 79
TIДC
МЕТОДИ ОПТИМІЗАЦІЇ, ОПТИМАЛЬНЕ
УПРАВЛІННЯ І ТЕОРІЯ ІГОР
UDC 004.8:004.032.26
DOI: 10.20535/SRIT.2308-8893.2022.4.07
CLUSTERIZATION OF VECTOR AND MATRIX DATA ARRAYS
USING THE COMBINED EVOLUTIONARY METHOD
OF FISH SCHOOLS
Ye. BODYANSKIY, А. SHAFRONENKO, І. PLISS
Abstract. The problem of clustering data arrays described in both vector and matrix
forms and based on the optimization of data distribution density functions in these
arrays is considered. For the optimization of these functions, the algorithm that is a
hybrid of Fish School Search, random search, and evolutionary optimization is pro-
posed. This algorithm does not require calculating the optimized function’s deriva-
tives and, in the general case, is designed to find optimums of multiextremal func-
tions of the matrix argument (images). The proposed approach reduces the number
of runs of the optimization procedure, finds extrema of complex functions with
many extrema, and is simple in numerical implementation.
Keywords: combined optimization, fuzzy clustering, evolutionary algorithms, den-
sity functions, Fish School.
INTRODUCTION
The problem of clustering arrays of arbitrary nature observations is integral part
of Data Mining, and more generally Data Science. To solve this problem it was
proposed a lot of approaches that differ as a priori assumptions about the physical
nature of data and problems solved by their basis, and the mathematical apparatus
that was used [1–4]. From a computational point of view, the simplest are the so-
called hierarchical methods and algorithms based on partitions [3], among of that
we should mention the k -means procedure, that has become widespread to solve
a variety of problems. It should be noted here that the most adequate mathemati-
cal apparatus for solving clustering problems are methods of computational intel-
ligence [5–7] and, above all, artificial neural networks, fuzzy systems, evolutionary
optimization and so-called hybrid systems of computational intelligence that
connect these three areas. It is interesting to note that one of the most popular
neural networks — self-organizing Kohonen maps [8] actually implements the
k -means procedure, presented in recurrent form.
It should be noted that in the general case the solution of the clustering
problem is significantly complicated if the original vectors (in the general case
matrices) observations have a large variety are, distorted by perturbations and
noises, contain outliers and omissions, the original arrays themselves or too
large (Big Data) or too short, clusters can have a rather complex shape, and their
number is a priori unknown.
Ye. Bodyanskiy, А. Shafronenko, І. Pliss
ISSN 1681–6048 System Research & Information Technologies, 2022, № 4 80
In this case, the most effective (but also the most complex) are algorithms
based on the analysis of data distribution densities, among which as one of the
most “popular” are DENCLUE [9] and its modifications [10–12], which were
proposed to solve clustering problems of large arrays of high-dimensional vector
data, and the classes formed in the clustering process can have any complex
shape. At the heart of these algorithms is the search for extremes — maxima in
the data density functions in the analyzed array (multi-extremal optimization), and
this function is formed as a superposition of kernel (bell-shaped) functions
associated with each observation. In fact, this function is based on Parzen
windows [13] and Nadaraya–Watson estimates [14, 15].
From a computational point of view, the clustering problem becomes of
finding local extrema of the multiextrema function of the density vector argument
using gradient procedures that are repeatedly run from different points in the
original data set. It is clear that this takes a long time, because a priori it is not
even known how many extremes the formed density function.
The process of finding these extremes can be accelerated by using the ideas
of evolutionary optimization, that includes algorithms inspired by nature, swarms
algorithms, population algorithms, etc. [16–18]. In this case, the search is con-
ducted simultaneously by a group of agents acting either independently or in in-
teraction, which can significantly speed up the process of finding extremes, each
of which “corresponds” to one or another cluster that is being formed.
FORMATION OF THE DATA DISTRIBUTION DENSITY FUNCTION IN THE
CLUSTERING ARRAY
The initial information for solving the clustering problem is traditionally an array
of observation vectors )},(),...,(),...,2(),1({ NxkxxxX ,)}({)( n
i Rkxkx the
data are pre-centered on a hypercube so that .)}({)( 21
2,
nn
ii Rkxkx This situa-
tion can occur in the case of image array processing. The basic concepts on which
DENCLUE is based are the influence function, the density function and the den-
sity attractors, which are essentially the local extremes of the density function. In
the general case, the influence function for any vector observation )(x from the
original array Х is a kernel bell shaped function )()( xf x , in this case the most
popular is the traditional Gaussian one
2
2
2
2
)(
2
)(
exp
2
))(,(
exp)(
xxxxd
xf x
G , (1)
where ))(,(2 xxd — euclidean distance; 2 — parameter of the influence func-
tion width, due to the simplicity of calculating its derivatives.
In the matrix case, instead of the Euclidean one, we can use the Flobenius
metric, and the influence function takes the form
22
2
)(
2
))(())((
exp
2
))(,(
exp)(
T
x
G
xxxxTrxxd
xf , (2)
where )(Tr — matrix trace symbol.
It is easy to see that (2) is a generalization of (1).
Based on the influence functions, formed the data density distribution
function in the array Х in the form
Clusterization of vector and matrix data arrays using the combined evolutionary method …
Системні дослідження та інформаційні технології, 2022, № 4 81
N
k
X kxxfxf
1
)(,)( , (3)
which is essentially an estimate of Nadaraya–Watson. It’s easy to see what the
function )(xf x can take values in an interval ,)(1 Nxf x in this case the ex-
trema values from this interval are accepted when the sample contains only one
observation or all N observation observations coincide, i.e. there is only one
cluster — a degenerate situation.
To find 1m clusters it’s necessary to introduce some threshold 1 , that
allows to build really significant clusters by excluding anomalous observations
and classes that contain too small data.
Actually, the process of cluster formation is associated with finding all
extremes of the density function (3) using a gradient procedure
,,...,2,1,...;2,1,0),( ,
),(
),(
01
1
1 Nklkxx
xxf
xxf
xx
llX
llX
lll
(4)
i.e. the number of runs of algorithm (4) is determined by the size of the training
sample N . It is clear that with large N the process of clustering — finding local
extrema can take a lot of time. Therefore, the proposed modifications of
DENCLUE are associated with speeding up the process of finding local extrema
(3) by modifying the gradient procedure (4) [10–12].
In the case, when observations )(kx in dataset 21 nnX are matrices, it
is easy to consider the matrix version of the procedure (4):
)2/1(1111 )),(),((),( lxTlXlXlll xXxXTrxXxx ,
where .
),(
),( 21
21
1
1 nn
ii
lX
lX R
x
xXf
xX
The gradient optimization process ends with a search m local extrema of
function (3), with less value , than more clusters can be formed.
It is possible to speed up the process of finding local extrema by using
evolutionary optimization methods instead of gradient search, among which the
so-called Fish schools search can be noted as quite efficient, numerically simple
and fast [19–21], which should be modified to solve the clustering problem.
MODIFIED OPTIMIZATION METHOD BASED ON FISH SCHOOL
When using the methods of evolutionary optimization, which are essentially zero-
order optimization methods, i.e. do not use derivatives, it is assumed that when
finding the extrema of some function )(xf x the population of agents are used,
each of them acts either independently or in interaction with others, with the
movement of each q th agent ),...,2,1( Qq on l th search iteration can be written as:
QqDirxx l
q
l
q
l
q
l
q ,...,2,1 ,1 ,
where ;),...,,( T
21
l
qn
l
q
l
q
l
q xxxx l
qDir — vector that specifies the direction of
movement q th agent on l th search iteration.
Ye. Bodyanskiy, А. Shafronenko, І. Pliss
ISSN 1681–6048 System Research & Information Technologies, 2022, № 4 82
In a large family of such methods should be noted the method based on of
fish schools, where each agent of the population simulates the movement of an
individual fish in the school [19–21].
The main advantage of this method is the sufficient efficiency of finding the
global extrema of rather complex functions, which include the density function of
data distribution in clustering problems.
The authors of the method introduce iterations related to the movement of
the school: feeding and swimming.
The feeding operator is responsible for the weight of each fish as an element
of school — the agent. The heavier the fish, the closer it is to the extreme — the
maximum. The weight of each fish qw is tuned according to the expression
Qq
xfxf
xfxf
ww
p
l
q
xl
q
x
l
q
xl
q
x
l
q
l
q ,...,2,1
)}()({max
)()(
1
1
1
, (5)
where
.5,0 ,0 max
0
max wwww l
l
q
The swimming operator describes both the individual movement of each fish
and the collective movement of the school as a whole. Three types of movement
are considered here: individual, instinctively — collective and collective volitional.
Individual movement is described by the relation
,else
;)()( if },1,0{
1
1
l
q
l
q
xl
q
xl
q
l
qil
qi
x
xfxfRandx
x (6)
where }1,0{Rand — evenly distributed in the interval )1,0( random number. It
should be noted that (6) is essentially a local random search with return, intro-
duced by L. Rastrigin [22]. In fact, this is the procedure of “probing” the function
)(xf x around the point 1l
qx , in this case, in addition to (5), any other random
search algorithm can be used here.
On the basis of probing the density function with the help of individual
movement (5) the instinctive-collective movement in the direction of growth of
this function is realized as:
.
))()((
))()(()(
1
1
1
1
1
1
Q
p
l
p
xl
p
x
l
q
xl
q
x
Q
p
l
p
l
p
l
q
l
q
xfxf
xfxfxx
xx (7)
At this stage, there is a balanced averaging of individual movements, taking
into account the “success” of each of the fish-agents.
And finally, collectively-volitional movement, when all the fishes of the
school “pull” to the weighted center of gravity, if the cant goes to the extreme,
and “run away” if the population moves in the wrong direction.
Considering the weighted center of gravity of the fish school
Clusterization of vector and matrix data arrays using the combined evolutionary method …
Системні дослідження та інформаційні технології, 2022, № 4 83
,
1
1
Q
p
l
p
Q
p
l
p
l
p
l
w
wx
Bar (8)
we can record this movement as
Q
p
l
p
Q
p
l
pll
q
ll
ql
q
l
q
Q
p
l
p
Q
p
l
pll
q
ll
ql
q
l
q
l
q
ww
Barx
Barx
Randx
ww
Barx
Barx
Randx
x
1
1
1
11
11
1
1
1
11
11
. if ,1,0
; if ,1,0
(9)
To increase the efficiency of FSS, an additional breeding operator may be in-
troduced, which allows the creation of new fish-agents that have improved char-
acteristics compared to existing members of the school. To do this, we can use the
ideas of evolutionary operations [23], among which from a computational point of
view and efficiency — the credibility of finding the extrema can be noted sequen-
tial simplex method [24] and its modifications [25].
Let’s form the school that containing 1 nQ fish-agents, but this number
remains unchanged in the search process, i.e. the population 00
2
0
1 ,...,, Qxxx generated
randomly. In this population we find the “worst” fish 0
qworstx , which has the
lowest weight 0
minqw and the “best” fish 0
qbestx with the greatest weight 0
maxqw .
The main operation of the simplex movement is mapping 0
qworstx through the cen-
ter of gravity n fishes (without the worst), which can be written in the form
.)(
1
1
000
Q
q
qworstq xx
n
x
As a result of this operation, a new fish is created
),( 000*1
qworstq xxxx
which replaces the worst individual in the school 0
qworstx . Thus creates a new
population 11
2
1
1 ,...,, Qxxx . Here 25,0 — parameter that controls the shape of
the school-simplex in the optimization process. In the case, when 1 the map-
ping of the simplex through the center of gravity is realized 0x in the case if
)()( 0*1
best
x
q
x xfxf accepted 2 , that is, the school is “stretched” in a favor-
able direction, if )()( 0*1
worst
x
q
x xfxf 5,0 is accepted, that is, the simplex
faces a relatively unfortunate direction. Thus the motion of the school-simplex
can be described by relations
Ye. Bodyanskiy, А. Shafronenko, І. Pliss
ISSN 1681–6048 System Research & Information Technologies, 2022, № 4 84
),(
;)(
1
111
1
111
l
qwost
l
q
ll
q
Q
q
l
qwost
l
q
l
xxxx
xx
n
x
(10)
then in the general case it is essentially an Nelder–Mead optimization algorithm
[25]. Thus, in the process of finding the extreme, the worst fishes with the lowest
weight are removed from the cant and new agents with higher weight are created.
Thus, the process of combined optimization of the density function (5)–(10)
is essentially a combination of FSS, random search and evolutionary operating
based on the Nelder–Mead method.
Since the problem under consideration is essentially a problem of
multiextrema optimization, it is necessary to find a set of extremums, each of
which is a centroid of a cluster. Therefore, the optimization problem must be
solved repeatedly at different values 2 and . When finding any of the
extremes from the original sample Х observations located directly in its vicinity
are excluded. After this removal, the proposed procedure of combined
evolutionary optimization is repeated until all extrema centroids are found.
EXPERIMENTAL RESEARCH
The experimental research was conducted on two databases, such us Page blocks
and Spam base and two test multiextrema functions. The description of datasets
shown in Table 1 and test multiextrema functions in Table 2.
T a b l e 1 . Data set description
Dataset Instances Attributes Clusters
Page blocks 5472 10 5
Spam base 4601 57 2
T a b l e 2 . Test multiextreme functions
Name
of function Formulas Domain Step
Rastrigin )2cos()2cos(1020)( 22 yxyxxf [ 5.12;5.12] 0.01
Griewangk’s 1
2
cos
1
cos
4000
1
4000
1
)(
xx
yxxf [ 30; 30] 0.1
Due the fact, that Rastrigin’s and Griewangk’s functions has a lot if local extreme
points in its search area, as shown on Fig. 1, a and Fig. 2, a, we add 514 agents.
Fig. 1. Rastrigin’s function, that has a lot of extreme points (a); modified optimization
method based on fish School on Rastrigin’s function (b)
a b
Clusterization of vector and matrix data arrays using the combined evolutionary method …
Системні дослідження та інформаційні технології, 2022, № 4 85
In Page blocks dataset was presents classified blocks of the page layout in a
document that has been detected by a segmentation process. Spam base dataset
also extracted from the UCI Machine Learning Repository and describes e-mail
classified as spam or not spam.
The accuracy comparison of the well known optimization algorithms such as
Fish School (FSS) and Cat Swarm (CSO) and proposed Modified Optimization
Method Based on Fish School (OMFS).
T a b l e 3 . Accuracy comparison
Data Accuracy OMFS FSS CSO
Rastrigin Mean
Best
190.46
195.83
189.65
195.59
190.46
195.83
Griewangk’s Mean
Best
3.65
4.82
3.41
4.12
3.65
4.81
Page blocks Mean
Best
951.47
959.64
951.01
959.43
951.15
959.55
Spam base Mean
Best
291.77
299.84
291.17
299.48
291.77
299.64
From obtained result, that shown in Table 3 we see, that Modified Optimization
Method Based on Fish School generally perform better than the original
algorithm. The convergence process of hybrid algorithm demonstrate in Fig. 3.
To evaluate the performance of clustering method used the several validity
metrics: Dunn Index (DI) — high value indicates a better clustering; Davies-
Fig. 2. Griewangk’s function, that has a lot of extreme points (a); modified optimization
method based on fish School on Griewangk’s function (b)
a b
Iteration number
B
es
t f
un
ct
io
n
va
lu
e
OMFS optimizing the Rastrigin function
a Iteration number
B
es
t f
un
ct
io
n
v
al
ue
OMFS optimizing the Griewangk”s function
b
Fig. 3. Modified Optimization Method Based on Fish School on test function: Rastrigin’s
function (a) and Griewangk’s function (b)
Ye. Bodyanskiy, А. Shafronenko, І. Pliss
ISSN 1681–6048 System Research & Information Technologies, 2022, № 4 86
Bouldin Index (DBI) — the smallest value indicates the better clustering; Cluster
Accuracy (CA) — the high value indicates the best clustering quality.
For comparison proposed method classification of vector and matrix data
sets based on combined optimization of distribution functions (CODF) against
classical DENCLUE algorithm and DENCLUE-IM for big data clustering (Table 4).
T a b l e 4 . The comparison algorithms according to their validity metric
Data Measures Spam base Page blocks
CODF
DENCLUE
DENCLUE-IM
DI
0.835
0.789
0.831
0.721
0.721
0.693
CODF
DENCLUE
DENCLUE-IM
DBI
0.768
0.867
1.041
0.764
0.864
1.041
CODF
DENCLUE
DENCLUE-IM
CA
0.718
0.805
0.701
0.920
0.920
0.911
All these results conclude that proposed method classification of vector and
matrix data sets based on combined optimization of distribution functions has an
acceptable clustering performance.
CONCLUSION
The problem of clustering data arrays that are described in both vector and matrix
forms based on the optimization of data distribution density functions in these
arrays is considered. For optimization of these functions — local extrema search
we have proposed the hybrid of Fish School Search algorithm, random search and
evolutionary optimization. This algorithm does not require the calculation of de-
rivatives of the function, which is optimized and in the general case is designed to
find the maxima of multiextrema functions of the matrix argument (images).
The proposed approach allows to reduce the number of the optimization pro-
cedure runs, allows to find the extremes of complex shape functions and is easy in
numerical implementation.
REFERENCES
1. G. Gan, Ch. Ma, and J. Wu, Data Clustering: Theory, Algorithms and Applications.
Philadelphia, Pennsylvania: SIAM, 2007, 455 p.
2. J. Abonyi and D. Feil, Cluster Analisis for Data Mining and System Identification. Basel,
Birdhouses, 2007, 303 p.
3. R. Xu and D.C. Wusch, II - Clustering. Hoboken, N.J.: John Willey Sons, Inc., 2009, 341 p.
4. C.C. Aggarwal, Data Mining: Text Book. Springer, 2015.
5. A.P. Engelbrecht, Computational Intelligence An Introducion. John Willey& Sons, 2007, 597 p.
6. L. Rutkowski, Computational Intelligence Methods and Techniques. Berlin Heidelberg:
Springer-Verlag, 2008, 514 p.
7. A. Kroll, Computational Intelligence. Eine Einfürung in Problelme, Methoden and
Tchnische Anwendungen. München: Oldenbourg Verlag, 2013, 428 p.
8. T. Kohonen, Self-Organizing Maps. Berlin: Springer, 1995, 362 p. doi: 10.1007/978-3-
642-56927-2.
9. A. Hinneburg and D.A. Keim, ”An efficient approach to clustering in large multimedia
databases with noise,” Proc. 4th Int. Conf. in Knowkedge Discovery and Data Mining
(KDD 98), N.Y.: AAAI Press, 1998, pp. 58–65.
10. A. Hinneburg and H.-H. Gabriel, ”DENCLUE 2.0: Fast clustering based on kernel
density estimation,” Proceedings of the 7th International Symposium on Intelligent Data
Clusterization of vector and matrix data arrays using the combined evolutionary method …
Системні дослідження та інформаційні технології, 2022, № 4 87
Analysis, New York: Springer Science + Business Media, 2007, pp. 70–80.
doi:10.1007/978-3-540-74825-0_7.
11. A. Hinneburg and D.A. Keim, ”A general approach to clustering in large databases with
noise,” Knowledge and Information Systems, 5 (5), pp. 387–415, 2003.
12. H. Rehioni, A. Idrissi, M. Abourezq, and F. Zegrary, ”DENCLUE-IM: A new approach
for big data clustering,” Procedia Computer Science, 83, pp. 560–567, 2016.
13. E. Parzen, ”On estimation of a probably density function and mode,” The Annals of Math
Statistics., 33, no. 3, pp. 1065–1076, 1962.
14. E.A. Nadaraya, ”On nonparametric estimation of density function and regressiion
curves,” Theory of Probab. Appl., 10, pp. 186–190, 1965.
15. G.S. Watson, ”Smooth regression analysis,” Sankhya: The Indian Journal of Statistics,
Ser. A., 26, no. 4, pp. 359–372, 1964.
16. J. Kennedy and R. Eberhart, ”Particle swarm optimization,” Proc. IEEE Int. Conf. on
Neural Networks, Perth, Australia, 1995, pp. 1942–1948.
17. A. Eiben and J. Smith, Introduction to Evolutionary Computing. Heidelberg: Springer, 2003.
18. A.P. Karpenko, ”Population algorithms for global continious optimization,” Review of
new and little-known algorithms. Supplement to the journal ”Information Technologies”
№7/2012, 32 p.
19. C.J.A. Bastos-Filho, F.B. Lima Neto, A.J.C.C. Lins, A.I.S. Nascimento, and M.P. Lima,
”Fish School Search,” Nature-Insperiod Algorithms for Optimization. Berlin Hedelberg:
Springer Verlag, 2009, SCI 193, pp. 261–277.
20. Jr. G.M.Cavalcanti, C.J.A. Bastos-Filho, F.B. Lima Neto, and R.M.C.S. Castro, ”A hybrid
algorithm based on fish school search and particle swarm optimization for dynamic
problems,” Proc. Int. Conf. in Swarm Intelligence (ICSI), 2011, vol. 2, pp.543–552.
21. A. Janecek and Y. Tan, ”Feeding the fish-weight update strategies for the fish school
seach algorithm,” Lecture Notes in Computer Scince. Berlin Heidelberg: Springer-
Verlag, 2011, vol. 6729, Part II, pp. 553–562.
22. L.A. Rastrigin, Random search in adaptation processes. Riga: Zinatne, 1973, 132 p.
23. Y.E.P. Box, “Evolutionary operation: A method for increasing industrial productivity,”
Applied Statistics, 6, pp. 81–101, 1957.
24. W. Spendley, G.R. Hext, and F.R. Himswath, “Sequential application of simplex design
in optimization and evolutionary operation,” Tehnometrics, 4, pp. 441–461, 1962.
25. J.A. Nelder and R. Mead, “A simplex method for function minimization,” Computer J.,
7, pp. 308–313, 1965.
Received 30.06.2022
INFORMATION ON THE ARTICLE
Yevgeniy V. Bodyanskiy, ORCID: 0000-0001-5418-2143, Kharkiv National University
of Radio Electronics, Ukraine, e-mail: yevgeniy.bodyanskiy@nure.ua
Alina Yu. Shafronenko, ORCID: 0000-0002-8040-0279, Kharkiv National University of
Radio Electronics, Ukraine, e-mail: alina.shafronenko@nure.ua
Iryna P. Pliss, ORCID: 0000-0001-7918-7362, Kharkiv National University of Radio
Electronics, Ukraine, e-mail: iryna.pliss@nure.ua
КЛАСТЕРИЗАЦІЯ ВЕКТОРНИХ ТА МАТРИЧНИХ МАСИВІВ ДАНИХ ІЗ
ВИКОРИСТАННЯМ КОМБІНОВАНОГО ЕВОЛЮЦІЙНОГО МЕТОДУ
РИБНИХ ЗГРАЙ / Є.В. Бодянський, А.Ю. Шафроненко, І.П. Плісс
Анотація. Розглянуто задачу кластеризації масивів даних, що описано як у век-
торній, так і матричній формі на основі оптимізації функцій щільності розпо-
ділу даних у цих масивах. Для оптимізації цих функцій – пошуку локальних
екстремумів запропоновано алгоритм, що є гібридом Fish School Search,
випадкового пошуку та еволюційної оптимізації. Цей алгоритм не потребує
обчислення похідних функції, що оптимізується, і у загальному випадку при-
значений для відшукання максимумів багатоекстремальних функцій матрич-
ного аргумента (зображень). Запропонований підхід дозволяє скоротити кіль-
кість запусків процедури оптимізації, знаходити екстремуми функцій складної
форми та є простим у числовій реалізації.
Ключові слова: комбінована оптимізація, нечітка кластеризація, еволюційні
алгоритми, функція щільності, Fish School.
|
| id | journaliasakpiua-article-275083 |
| institution | System research and information technologies |
| keywords_txt_mv | keywords |
| language | English |
| last_indexed | 2025-07-17T10:28:06Z |
| publishDate | 2022 |
| publisher | The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" |
| record_format | ojs |
| resource_txt_mv | journaliasakpiua/10/87f2020c908b4e74aef7bcdbfc51e410.pdf |
| spelling | journaliasakpiua-article-2750832023-05-21T20:04:38Z Clusterization of vector and matrix data arrays using the combined evolutionary method of fish schools Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй Bodyanskiy, Yevgeniy Shafronenko, Alina Pliss, Iryna комбінована оптимізація нечітка кластеризація еволюційні алгоритми функція щільності Fish School combined optimization fuzzy clustering evolutionary algorithms density functions Fish School The problem of clustering data arrays described in both vector and matrix forms and based on the optimization of data distribution density functions in these arrays is considered. For the optimization of these functions, the algorithm that is a hybrid of Fish School Search, random search, and evolutionary optimization is proposed. This algorithm does not require calculating the optimized function’s derivatives and, in the general case, is designed to find optimums of multiextremal functions of the matrix argument (images). The proposed approach reduces the number of runs of the optimization procedure, finds extrema of complex functions with many extrema, and is simple in numerical implementation. Розглянуто задачу кластеризації масивів даних, що описано як у векторній, так і матричній формі на основі оптимізації функцій щільності розподілу даних у цих масивах. Для оптимізації цих функцій – пошуку локальних екстремумів запропоновано алгоритм, що є гібридом Fish School Search, випадкового пошуку та еволюційної оптимізації. Цей алгоритм не потребує обчислення похідних функції, що оптимізується, і у загальному випадку призначений для відшукання максимумів багатоекстремальних функцій матричного аргумента (зображень). Запропонований підхід дозволяє скоротити кількість запусків процедури оптимізації, знаходити екстремуми функцій складної форми та є простим у числовій реалізації. The National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" 2022-12-27 Article Article application/pdf https://journal.iasa.kpi.ua/article/view/275083 10.20535/SRIT.2308-8893.2022.4.07 System research and information technologies; No. 4 (2022); 79-87 Системные исследования и информационные технологии; № 4 (2022); 79-87 Системні дослідження та інформаційні технології; № 4 (2022); 79-87 2308-8893 1681-6048 en https://journal.iasa.kpi.ua/article/view/275083/270207 |
| spellingShingle | комбінована оптимізація нечітка кластеризація еволюційні алгоритми функція щільності Fish School Bodyanskiy, Yevgeniy Shafronenko, Alina Pliss, Iryna Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй |
| title | Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй |
| title_alt | Clusterization of vector and matrix data arrays using the combined evolutionary method of fish schools |
| title_full | Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй |
| title_fullStr | Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй |
| title_full_unstemmed | Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй |
| title_short | Кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй |
| title_sort | кластеризація векторних та матричних масивів даних із використанням комбінованого еволюційного методу рибних зграй |
| topic | комбінована оптимізація нечітка кластеризація еволюційні алгоритми функція щільності Fish School |
| topic_facet | комбінована оптимізація нечітка кластеризація еволюційні алгоритми функція щільності Fish School combined optimization fuzzy clustering evolutionary algorithms density functions Fish School |
| url | https://journal.iasa.kpi.ua/article/view/275083 |
| work_keys_str_mv | AT bodyanskiyyevgeniy clusterizationofvectorandmatrixdataarraysusingthecombinedevolutionarymethodoffishschools AT shafronenkoalina clusterizationofvectorandmatrixdataarraysusingthecombinedevolutionarymethodoffishschools AT plissiryna clusterizationofvectorandmatrixdataarraysusingthecombinedevolutionarymethodoffishschools AT bodyanskiyyevgeniy klasterizacíâvektornihtamatričnihmasivívdanihízvikoristannâmkombínovanogoevolûcíjnogometoduribnihzgraj AT shafronenkoalina klasterizacíâvektornihtamatričnihmasivívdanihízvikoristannâmkombínovanogoevolûcíjnogometoduribnihzgraj AT plissiryna klasterizacíâvektornihtamatričnihmasivívdanihízvikoristannâmkombínovanogoevolûcíjnogometoduribnihzgraj |