Multivariate convergence-targeted operator for the genetic algorithm
Optimization of complex particle transport simulation packages could be managed using genetic algorithms as a tuning instrument for learning statistics and behavior of multi-objective optimisation functions. Combination of genetic algorithm and unsupervised machine learning could significantly incre...
Збережено в:
| Опубліковано в: : | Системні дослідження та інформаційні технології |
|---|---|
| Дата: | 2017 |
| Автори: | , , |
| Формат: | Стаття |
| Мова: | English |
| Опубліковано: |
Навчально-науковий комплекс "Інститут прикладного системного аналізу" НТУУ "КПІ" МОН та НАН України
2017
|
| Теми: | |
| Онлайн доступ: | https://nasplib.isofts.kiev.ua/handle/123456789/151069 |
| Теги: |
Додати тег
Немає тегів, Будьте першим, хто поставить тег для цього запису!
|
| Назва журналу: | Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| Цитувати: | Multivariate convergence-targeted operator for the genetic algorithm / O. Shadura, A. Petrenko, S. Svistunov // Системні дослідження та інформаційні технології. — 2017. — № 1. — С. 126-140. — Бібліогр.: 17 назв. — англ. |
Репозитарії
Digital Library of Periodicals of National Academy of Sciences of Ukraine| id |
nasplib_isofts_kiev_ua-123456789-151069 |
|---|---|
| record_format |
dspace |
| spelling |
Shadura, O. Petrenko, A. Svistunov, S. 2019-04-23T19:42:51Z 2019-04-23T19:42:51Z 2017 Multivariate convergence-targeted operator for the genetic algorithm / O. Shadura, A. Petrenko, S. Svistunov // Системні дослідження та інформаційні технології. — 2017. — № 1. — С. 126-140. — Бібліогр.: 17 назв. — англ. 1681–6048 DOI: https://doi.org/10.20535/SRIT.2308-8893.2017.4.10 https://nasplib.isofts.kiev.ua/handle/123456789/151069 519.85, 539.3 Optimization of complex particle transport simulation packages could be managed using genetic algorithms as a tuning instrument for learning statistics and behavior of multi-objective optimisation functions. Combination of genetic algorithm and unsupervised machine learning could significantly increase convergence of algorithm to true Pareto Front (PF). We tried to apply specific multivariate analysis operator that can be used in case of expensive fitness function evaluations, in order to speed-up the convergence of the "black-box" optimization problem. The results delivered in the article shows that current approach could be used for any type of genetic algorithm and deployed as a separate genetic operator. Cкладні пакети моделювання транспорту частинок можна оптимізувати за допомогою генетичних алгоритмів, що дає змогу застосовувати для таких задач підходи статистичного навчання та методи оптимізації декількох цільових функцій. Поєднання генетичного алгоритму та неконтрольованого машинного навчання значно підвищує збіжність алгоритму до істинного парето-фронту. У межах багатофакторного аналізу запропоновано додатковий оператор, який може бути застосований для задач оптимізації багатоцільових функцій, що потребують великого обсягу ресурсів та часу, зокрема для пришвидшення збіжності задачі оптимізації "чорного ящика". Отримані результати показують, що запропонований підхід можна використовувати для генетичного алгоритму будь-якого типу, а додатковий оператор розглядати як окремий генетичний оператор. Сложные пакеты моделирования транспорта частиц можно оптимизировать с помощью генетических алгоритмов, что позволяет применять для таких задач подходы статистического обучения и методы оптимизации нескольких целевых функций. Сочетание генетического алгоритма и неконтролируемого машинного обучения может значительно повышает сходимость алгоритма к истинному парето-фронта. В рамках многофакторного анализа предложен дополнительный оператор, который может быть применен для задач оптимизации многоцелевых функций, требующих большого объема ресурсов и времени, в частности для ускорения сходимости задачи оптимизации "черного ящика". Полученные результаты показывают, что предложенный подход можно использовать для генетического алгоритма любого типа, а дополнительный оператор рассматривать как отдельный генетический оператор. en Навчально-науковий комплекс "Інститут прикладного системного аналізу" НТУУ "КПІ" МОН та НАН України Системні дослідження та інформаційні технології Методи аналізу та управління системами в умовах ризику і невизначеності Multivariate convergence-targeted operator for the genetic algorithm Багатофакторний конвергенційно-націлений оператор для генетичного алгоритму Многофакторный конвергенционо-нацеленный оператор для генетического алгоритма Article published earlier |
| institution |
Digital Library of Periodicals of National Academy of Sciences of Ukraine |
| collection |
DSpace DC |
| title |
Multivariate convergence-targeted operator for the genetic algorithm |
| spellingShingle |
Multivariate convergence-targeted operator for the genetic algorithm Shadura, O. Petrenko, A. Svistunov, S. Методи аналізу та управління системами в умовах ризику і невизначеності |
| title_short |
Multivariate convergence-targeted operator for the genetic algorithm |
| title_full |
Multivariate convergence-targeted operator for the genetic algorithm |
| title_fullStr |
Multivariate convergence-targeted operator for the genetic algorithm |
| title_full_unstemmed |
Multivariate convergence-targeted operator for the genetic algorithm |
| title_sort |
multivariate convergence-targeted operator for the genetic algorithm |
| author |
Shadura, O. Petrenko, A. Svistunov, S. |
| author_facet |
Shadura, O. Petrenko, A. Svistunov, S. |
| topic |
Методи аналізу та управління системами в умовах ризику і невизначеності |
| topic_facet |
Методи аналізу та управління системами в умовах ризику і невизначеності |
| publishDate |
2017 |
| language |
English |
| container_title |
Системні дослідження та інформаційні технології |
| publisher |
Навчально-науковий комплекс "Інститут прикладного системного аналізу" НТУУ "КПІ" МОН та НАН України |
| format |
Article |
| title_alt |
Багатофакторний конвергенційно-націлений оператор для генетичного алгоритму Многофакторный конвергенционо-нацеленный оператор для генетического алгоритма |
| description |
Optimization of complex particle transport simulation packages could be managed using genetic algorithms as a tuning instrument for learning statistics and behavior of multi-objective optimisation functions. Combination of genetic algorithm and unsupervised machine learning could significantly increase convergence of algorithm to true Pareto Front (PF). We tried to apply specific multivariate analysis operator that can be used in case of expensive fitness function evaluations, in order to speed-up the convergence of the "black-box" optimization problem. The results delivered in the article shows that current approach could be used for any type of genetic algorithm and deployed as a separate genetic operator.
Cкладні пакети моделювання транспорту частинок можна оптимізувати за допомогою генетичних алгоритмів, що дає змогу застосовувати для таких задач підходи статистичного навчання та методи оптимізації декількох цільових функцій. Поєднання генетичного алгоритму та неконтрольованого машинного навчання значно підвищує збіжність алгоритму до істинного парето-фронту. У межах багатофакторного аналізу запропоновано додатковий оператор, який може бути застосований для задач оптимізації багатоцільових функцій, що потребують великого обсягу ресурсів та часу, зокрема для пришвидшення збіжності задачі оптимізації "чорного ящика". Отримані результати показують, що запропонований підхід можна використовувати для генетичного алгоритму будь-якого типу, а додатковий оператор розглядати як окремий генетичний оператор.
Сложные пакеты моделирования транспорта частиц можно оптимизировать с помощью генетических алгоритмов, что позволяет применять для таких задач подходы статистического обучения и методы оптимизации нескольких целевых функций. Сочетание генетического алгоритма и неконтролируемого машинного обучения может значительно повышает сходимость алгоритма к истинному парето-фронта. В рамках многофакторного анализа предложен дополнительный оператор, который может быть применен для задач оптимизации многоцелевых функций, требующих большого объема ресурсов и времени, в частности для ускорения сходимости задачи оптимизации "черного ящика". Полученные результаты показывают, что предложенный подход можно использовать для генетического алгоритма любого типа, а дополнительный оператор рассматривать как отдельный генетический оператор.
|
| issn |
1681–6048 |
| url |
https://nasplib.isofts.kiev.ua/handle/123456789/151069 |
| citation_txt |
Multivariate convergence-targeted operator for the genetic algorithm / O. Shadura, A. Petrenko, S. Svistunov // Системні дослідження та інформаційні технології. — 2017. — № 1. — С. 126-140. — Бібліогр.: 17 назв. — англ. |
| work_keys_str_mv |
AT shadurao multivariateconvergencetargetedoperatorforthegeneticalgorithm AT petrenkoa multivariateconvergencetargetedoperatorforthegeneticalgorithm AT svistunovs multivariateconvergencetargetedoperatorforthegeneticalgorithm AT shadurao bagatofaktorniikonvergencíinonacíleniioperatordlâgenetičnogoalgoritmu AT petrenkoa bagatofaktorniikonvergencíinonacíleniioperatordlâgenetičnogoalgoritmu AT svistunovs bagatofaktorniikonvergencíinonacíleniioperatordlâgenetičnogoalgoritmu AT shadurao mnogofaktornyikonvergenciononacelennyioperatordlâgenetičeskogoalgoritma AT petrenkoa mnogofaktornyikonvergenciononacelennyioperatordlâgenetičeskogoalgoritma AT svistunovs mnogofaktornyikonvergenciononacelennyioperatordlâgenetičeskogoalgoritma |
| first_indexed |
2025-11-25T20:36:34Z |
| last_indexed |
2025-11-25T20:36:34Z |
| _version_ |
1850524148821917696 |
| fulltext |
© O. Shadura, A. Petrenko, S. Svistunov, 2017
126 ISSN 1681–6048 System Research & Information Technologies, 2017, № 1
TIДC
МЕТОДИ АНАЛІЗУ ТА УПРАВЛІННЯ
СИСТЕМАМИ В УМОВАХ
РИЗИКУ І НЕВИЗНАЧЕНОСТІ
УДК 519.85, 539.3
DOI: 10.20535/SRIT.2308-8893.2017.4.10
MULTIVARIATE CONVERGENCE-TARGETED OPERATOR
FOR THE GENETIC ALGORITHM
O. SHADURA, A. PETRENKO, S. SVISTUNOV
Abstract. Optimization of complex particle transport simulation packages could be
managed using genetic algorithms as a tuning instrument for learning statistics and
behavior of multi-objective optimisation functions. Combination of genetic
algorithm and unsupervised machine learning could significantly increase
convergence of algorithm to true Pareto Front (PF). We tried to apply specific
multivariate analysis operator that can be used in case of expensive fitness function
evaluations, in order to speed-up the convergence of the "black-box" optimization
problem. The results delivered in the article shows that current approach could be
used for any type of genetic algorithm and deployed as a separate genetic operator.
Keywords: machine learning, genetic algorithm, Pareto Front, principle component
analysis, transport particle simulations.
INTRODUCTION
A set of scientific researches that required data verification or generating big set
of datasets like the studies in cosmology, high energy physics (HEP), biology and
genetics, require the development of new approaches and methods for their
efficient analysis on modern computer platforms.
In the point of the work on analyzing and optimizing the performance of the
GeantV code [1], which is the prototype of the next-generation particle transport
simulation software intended to succeed to Geant4 [2], which is the current
golden standard in high energy physics (HEP) and beyond. Geant4 is a toolkit for
simulation of the passage of particles through different kinds of matter, with
application including high energy and nuclear physics, accelerator physics,
medicine and space science and it is widely used in HEP experiments at the Large
Hadron Collider (LHC) located at CERN (Geneva, Switzerland).
As a history, GeantV project had been started in 2013 with an R&D phase
focused on optimal exploitation of instruction level parallelism for particle
transport simulation both on CPU and on accelerators such as GPUs and Intel
Xeon Phi [3].
GeantV is based on a specially developed vectorized computational solid
geometry (CSG) modeler, which provides a set of optimized shape primitives and
highly parallel geometry navigator and necessary ray-tracing functionality for the
efficient propagation of particles through the target geometry [4].
Multivariate convergence-targeted operator for the genetic algorithm …
Системні дослідження та інформаційні технології, 2017, № 1 127
The goal of GeantV project is to optimize the simulation algorithms to get
maximum benefit from highly massive parallel SIMD/MIMD architectures [5]
while finding the optimal point for factors focused on computational performance
(floating-point performance, off-chip memory bandwidth, usage of cache and
memory hierarchy and etc.). As a consequence, a large number of parameters
have to be optimized and GeantV optimization task can be treated as a black-box
problem.
DTLZ [6] set of benchmarks is covering cases in convex and non-convex,
separable and non separable and multimodal functions with degenerate Pareto
optimal fronts or disconnected Pareto optimal fronts, and disconnected Pareto
optimal sets. These helps us to prototype of behavior of our algorithm in case of
different set of realistic functions. The objective of this work is to observe
whether, by using unsupervised machine learning, we can accelerate the process
of finding a Pareto front.
Also using genetic algorithm together with machine learning approach we
will try to analyze convergence and fixed points of evolutionary systems, trying to
accelerate convergence rate of algorithm for “black-box” optimization. Before
going to optimize Geant-V simulations, we will try to prototype algorithm’s
performance on a set of numerical DTLZ benchmarks [6] in order to accelerate
their convergence to the true Pareto front via the integration of multi-objective
search/optimisation (MOO) algorithms and unsupervised machine learning
Principal Component Analisis (PCA) and kernel PCA.
GENETIC ALGORITHMS AS A DYNAMIC SYSTEM
Genetic algorithm is one of the widely used evolutionary algorithms for studies of
various optimization problems, in the same time the theory of genetic algorithms
(GA) was a subject of research for the last decades. The easiest model for
studying GA is a simple model of genetic algorithm (SGA)[7], that could be used
as a prototype of evolutionary system. This model is describing genetic algorithm
(GA) as a dynamical system with accurate mathematical definitions and well
studied in a literature.
In the model for description of GA as a Markov chain is used next
definitions where states are populations and transition are operated by sets of
genetic operators: selection, crossover and mutation [8]. Mutation ensures that the
Markov chain is connected, therefore there is an unique equilibrium distribution
over populations, the probability to produce a particular population in one
generation depends only on the previous generation external influencing factors.
This randomized process is described by a Markov chain, characterized by a
transition matrix pq rr,Θ from the population pr to the population qr .
Dynamical systems describe the evolution of individuals in the finite space
of possible populations of fixed size m , where m is number of measurements
during the experiment. While rethinking the genetic algorithms as a discrete
dynamical system, many interesting mathematical objects like fixed points could
be found. These objects are apparently not only generic for simple genetic
algorithms, but also general for optimization problems. Let’s briefly recall the
results presented in [7] and establish the possible links with the task of optimizing
our parameters.
O. Shadura, A. Petrenko, S. Svistunov
ISSN 1681–6048 System Research & Information Technologies, 2017, № 1 128
We have a population of N different types of individuals in search sample
space Ω . Each element of Ω can be thought of as a “unique individual” with
a given fitness value defined by the cost function.
A population consists of m -subsets ( Nm = ) each of which contains
i
vα of
the iα -type individual where mi 1,...,= and defined by vector
t
m
bbbb ),...,,(=
21 ααα
r
,
where Ω∈αi . The size of the population is
i
m
i bm α∑ 1== .
We can redefine the population vector in the following form
t
Npppp ),...,,(= 21
r ,
where )/=(
1
mbpp
i ααα is the probability of occurrence α -th individual in the
population vector b
r
.
In the mentioned before representation the repeated application of the
genetic algorithm gives a sequence of vectors Λ∈pr where
1}.=1,0|),...,,{(=
1=
21 α
α
α ∑≤≤∈Λ ppRppp
N
Nt
N
Λ is a set of admissible states for the populations. We can consider Λ as a
1)( −N -dimensional simplex (a hyper-tetrahedron).
)( pG r
α is a certain probability of producing individual α in the next
generation if the previous population was pr and define map Λ→Λ:G , where
)(=)( pGpG rr
αΩ∈α∏ , and Λ∈)( pG r could be considered as heuristic function.
)( pG r is GA procedure on Λ∈pr and the map G is actually the composition of
three different maps: selection, mutation and crossover.
Let define genetic selection operator Λ→Λ:F , where )(=)( pFpF rr
αΩ∈α∏
and the α -th component, )( pF r
α , represents the probability of the appearance of
an individual of type α if the selection is applied to Λ∈pr . A selection operator
chooses individuals from the current population using the cost function vector,
NRff ∈α}{=
r
, where )(= αα ff , Ω∈α . This generic type of selection collects
elements with probability proportional to their fitness. This corresponds to a
heuristic function
,)(diag=)(
pf
pfpF t rr
r
r
⋅
⋅
where Λ∈pr is the population vector, and )(diag f
r
is the matrix with entries
from f
r
along the diagonal and zeros elsewhere.
The mutation operator Λ→Λ:U is an NN × real valued matrix with
),( βα -th entry 0>,βαu for all βα, , and βα,u represents the probability that
individual Ω∈β mutates into Ω∈α . Then α⋅ )( pU r is the appearance of an
individual of type α after applying a mutation to the population pr .
Multivariate convergence-targeted operator for the genetic algorithm …
Системні дослідження та інформаційні технології, 2017, № 1 129
Crossover operator is defined Λ→Λ:C as
)ˆ,...,ˆ(=)( 1 pCppCppC N
tt rrrrr
⋅⋅⋅⋅ ,
for Λ∈pr , where NCC ˆ,...,ˆ
1 is a sequence of symmetric non-negative NN × real-
valued matrices. Here )(ˆ pC r
α represents the probability that an individual α is
generated by applying the crossover to population pr .
Combining the selection, mutation and crossover maps we obtain the
complete operator Ĝ for the genetic algorithm (GA map)
).(ˆˆ=)(ˆ,:ˆ pFUCpGG r
oo
r
Λ→Λ
If we know the heuristic function G , we can write the transition matrix
which is stochastic and based on the probability of transforming the population pr
into the population qr :
( )
( ) ,
!
)(!=,
α
α
α
Ω∈α
∏Θ
qm
pGm
qm
pq
r
rr (1)
where )( pG r
α is probability of producing individual α in the next generation and
αqM is the number of copies of individuals α in the population qr , m is the size
of the population.
As a brief review, the convergence properties of the simple genetic
algorithm evolution schema was properly explored in the work [9]. While [10]
showed that the convergence rate of the genetic algorithm is determined by the
second largest eigenvalue of the transition matrix (1). The details of the proof was
performed for diagonalizable transition matrices and transferred to matrices in
Jordan normal form.
Another remarkable feature of the SGA is the presence of a rich structure of
fixed and metastable points (for a detailed discussion see [8]).
Describing GA model through Markov chain representation we try to
discover "hotspots" and find algorithmic or data patterns that could be used for
improvement of the GA.
For the optimization of the GeantV simulation, we identify a set of
optimization parameters important for the performance of particle transport
simulations (e.g. the size of vector of particles to be transported or other
significant design features) and build the data matrix }{=}){(=, ααα xxX ii
rr which
contains the values of these parameters. In this matrix index i enumerates the
tuning parameters ( ni 1,...,= ) and index α enumerates the number of
measurements of the parameters ( M1,...,=α for M measurements), while in
terms of GA index α enumerates M individuals and the population vector is
constituted by ),...,,( 21 Mxxx rrr
Recall the data and probabilistic sample representation. In the first case we
can associate the vector based on the measurements of the i -th parameter
})(,...,)(,){(=}){(= 21 M
'
i
'
i
'
i
'
i
'
i xxxxx rrrrr
α , where the component α)( '
ixr corresponds
to the value of the i -th parameter in the α -th measurement with the population
vector ),...,,( 21
'
n
'' xxx rrr .
O. Shadura, A. Petrenko, S. Svistunov
ISSN 1681–6048 System Research & Information Technologies, 2017, № 1 130
In the second case )(xPi be the probability distribution function of the
measurements of the i -th parameter, with normalization
1.=)(xdxPi∫
∞
∞−
Using the previous strategy we associate the population vector
),...,,(with),...,,( 2121
'
n
''
n xxxppp rrrrrr ,
where
},)(,...,)(,){(= 21 Miiii pppp rrrr
and the component α)( ipr is the probability to measure of the i -th parameter
value α)( '
ixr in the α -th measurement.
One of the challenges of a Markov chains is to determine the evolution of
the components along an appropriate direction for faster convergence to
equilibrium. Using Principal Component Analysis (PCA) allows to check the
genetic algorithm parameter sensitivity and the possible correlation between
parameters. For this we introduce a operator that will be based on PCA and
inverse PCA noise reduction operation for a genetic algorithm’s optimisation of
set of parameters.
We considered a possibility to improve the convergence rate by adding to a
standard set of GA operator’s (selection, mutation, crossing), a new operator P̂
performing uncentered PCA on the GA populations. We will analyze the result of
the implementation of the operator on the uncentered data matrix on standard GA
performance benchmarks. From the experimental output we see that as in the
SGA case [10], the convergence rate of genetic algorithm depends on the
eigenvalues following the highest one, and for this reason the proposed operator
P̂ was applied on them.
UNCENTERED PCA AS A SVD APPROXIMATION
FOR POPULATION DATA MATRIX
In PCA, we usually manipulate with centered data matrix in order to reduce a
complex data set (in our case performance measurements data) to a lower
dimensional set through analyzing the covariance matrix. Here is presented a way
that a “sort of PCA” could be implemented on an uncentered data matrix. This is
particularly convenient in the case of transformations of constrained data
measurements using genetic algorithms, which are in our case highly constrained
and multi-scaled performance parameters. As a basis of ideas about the
connection between the centered and uncentered data matrix was used ideas from
[11, 12].
PCA for centered data matrix and SVD
Let briefly recall PCA for the centered data matrix. The elements of the data
matrix X̂ of size nm× are described through m -samples of data from an n -
dimensional space. In our case m is the number of individuals in the generation
Multivariate convergence-targeted operator for the genetic algorithm …
Системні дослідження та інформаційні технології, 2017, № 1 131
and n is the size of the individual ( n is the dimension of vector of genes
)(1}{= nixx i ≤≤
r .
Let ),1}(1){(= nimxx i ≤≤≤α≤αα
rr is α -th individual of the population and
},){(=}{=ˆ
, ii xXX αα
r
be a uncentered data matrix, size nm× . Let us define the centered data matrix Ŷ :
},){(=}{=}{=ˆ
,, iiii yXYY ααα μ−
r
where iμ is mean over M -individuals of i -th component of the gene:
}.{=,1,1= ,
1=
ii
m
i niX
m
μμ≤≤μ α
α
∑ r
The centered data matrix defines the covariance matrix Σ̂ :
j
t
iji
t YY
m
YY
m ,,,
1=}{=ˆˆ1=ˆ
ααΣ⋅Σ
with the matrix multiplication repeated induces imply summation. Similarly for
the uncentered data matrix we obtain the matrix of non-central second moments,
j
t
iji
t XX
m
TXX
m
T ,,,
1=}{=ˆˆ1=ˆ
αα⋅ .
In standard PCA terms the first principal component (PC)
)1,...,=(}{= ,11 mvv αα
r is the linear combination
1,=,== 11,1,
1=
1,1 uuuYuyv t
ii
n
i
t rrrr
⋅⋅ ααα ∑
where the orthonormal n-dimensional vector t
n
t uuu ),...,(= ,11,11
r is defined from
condition that the first principal component has the largest variance
.1=
2
,1,
1=1=
2
1 ⎥
⎦
⎤
⎢
⎣
⎡
σ α
α
∑∑ ii
n
i
m
u uY
m
The second principal component is the linear combination with the second
largest variance and orthogonal to the first principal component, and so on.
To calculate PC, it is more comfortable to review the variational problem.
For }{=}{= , iiuYvv αα
r we have
uuuYYu
m
v ttt rrrrr
⋅Σ⋅⋅⋅⋅ ˆ=ˆˆ1=)(Var (2)
and the Lagrangian for the variational problem
1).(ˆ=L −λ+⋅Σ⋅ uuuu tt rrrr
The stationary condition is
.=ˆ0,=2ˆ2= uuuu
u
L rrrr
r λ⋅Σλ−⋅Σ
∂
∂
O. Shadura, A. Petrenko, S. Svistunov
ISSN 1681–6048 System Research & Information Technologies, 2017, № 1 132
This matrix equation has n solutions
,1,=ˆ )( njuu j
c
jj ≤≤λ⋅Σ
rr
where jur are eigenvectors of Σ̂ with the eigenvalue jλ and jur satisfy the
orthonormality condition
,,1,= , njiuu jij
t
i ≤≤δ⋅
rr (3)
and
.=ˆ
jj
t
j uu λ⋅Σ⋅
rr
(4)
Then the direction with maximum variance is the eigenvector with the
largest eigenvalue. This procedure can be iterated to get the second largest
variance projection (orthogonal to the first one), and so on.
From (2) it follows that the variance of the i -th centered principal
component
ii
t
ii uuv λ⋅Σ⋅ =ˆ=)(Var rrr
and the covariance of the i -th and j -th centered principal components
.0,=ˆ=),(Cov jiuuvv j
t
iji ≠⋅Σ⋅
rrrr
Defining the matrix as jijji uuU )(==,
r , which consists from the
eigenvectors of the covariance matrix Σ̂ . From (3) this matrix satisfies the
orthogonality condition
.= ,,, jiji
t
ii UU δ′′ (5)
Then from (4) we have
,=,ˆ=ˆˆˆ
,, jiiji
t UU δλΛΛ⋅Σ⋅ (6)
Let define the matrix }){(=}{=, jjj vvV αα
r , where jvr — j -th centered
principal compoent. Then
,1,= ,,, mUYV jiij ≤α≤αα (7)
and the first principal component 1vr
1,1,,1,1 === uyUYVv t
ii
rr
⋅αααα
if 1λ is the largest eigenvalue. From (5), (6) we have
.== ,,,, jiijij
t
i mmVV δλΛαα
It is convenient to define the new matrix jV ,
~
α
,=,~= ,
1/21/2
,
1/2
,,, jiijijiij VmV δλΛΛαα
which satify the condition
,=~~
,,,, jij
t
i VV δαα
Multivariate convergence-targeted operator for the genetic algorithm …
Системні дослідження та інформаційні технології, 2017, № 1 133
The matrix })~{(=~
, αα jj vV consists from eigenvectors α)~( jv of the matrix
tYYK ⋅ˆ=ˆ of the size mm×
.)~(=)~(=)~( ,,, αββαββα λ jjj
t
kkj vvYYvK
with the same eigenvalues as in (4). From (7) we have
,= ,,,
t
ijji UVY αα
and obtain the Singular Value Decomposition (SVD) [13] for the centered data
matrix
.~= ,
1/2
,,,
t
ijjiii UVmY Λαα (8)
We suppose that the covariance matrix Σ̂ has )( pn − smallest eigenvalues
njpj ≤≤+λ 11,= . Then we can apply the dimension reduction and after the
reverse PCA, we obtain the output data matrix iY ,α :
=~~= ,
1/2
,,,
t
ijjiii UVmY Λαα .)~...~( ,,
1/2
1,,1
1/2
1
t
ippp
t
i UVUVm αα λ++λ . (9)
The approximation of matrix iY ,α is the matrix iY ,α of reduced rank nm < .
This transformation is also known as the discrete Karhunen-Loéve or the
Hotelling transformation [16].
Using the SVD representation (8) and (9) for the centered data matrix we can
calculate the mean square error (the standard error)
=)(1= 2
,,
1=1=
ii
n
i
m
m YY
nm αα
α
−η ∑∑
.1=~1=
1=
2
,,
1=1=1=
k
n
pk
t
ikkk
n
pk
n
i
m
n
UVm
nm
λ⎟
⎟
⎠
⎞
⎜
⎜
⎝
⎛
λ ∑∑∑∑
+
α
+α
Thus the minimum error is obtained if the covariance matrix Σ̂ has )( pn −
smallest eigenvalues njpj ≤≤+λ 1, and the Hotelling transformation can be
considered as the “eigenvalue control parameter” approximation.
PCA for uncentered data matrix and SVD
Next step is to apply the PCA method for the uncentered data matrix X̂ .
Vectors )(1 njw j ≤≤
r are eigenvectors of the matrix of non-central second
moments
,ˆˆ1=ˆ XX
m
T t ⋅
with the corresponding eigenvalues jt
,1,=ˆ njwtwT jjj ≤≤⋅
rr (10)
and satisfy the orthonormality condition
O. Shadura, A. Petrenko, S. Svistunov
ISSN 1681–6048 System Research & Information Technologies, 2017, № 1 134
.,1,= , njiww jij
t
i ≤≤δ⋅
rr
Then
.=ˆ
jj
t
j twTw rr
⋅⋅ (11)
We define matrix jijji wwW )(=}{=,
r that satisfies the orthogonality
condition
.ˆ=ˆˆ IWW t ⋅ (12)
From (11) we have
.=,ˆ=ˆˆˆ
,, jiiji
t tWTW δΔΔ⋅⋅ (13)
Let )1,...,=(}{=}){(= mwx j
t
jj α⋅θθ αα
rrr
is j -th uncentered principal
component. By analogy with (7) we define the matrix }){(=}{=, αα θθΘ jjj
r
,1,= ,,, mWX jiij ≤α≤Θ αα (14)
which from (12) and (13) satisfy the condition
.== ,,,, jiijij
t
i mtm δΔΘΘ αα (15)
For the variance of j -th uncentered principal component we obtain
),,(cos=)(1==)(Var 22
2
,,
1=1=
2
, jjjiii
n
i
m
jj wtWX
m
rrrr
μμ−⎥
⎦
⎤
⎢
⎣
⎡
μ−σθ α
α
θ ∑∑
For case of uncentered matrix we do not have a simple relationship between
the eigenvalues jt and the variance j -th uncentered principal component
2
, )( jθσ as for the centered data matrix. However, this property is not essential for
the usage of the PCA method for the GA and in this case it is convenient to apply
the “eigenvalue control parameter” approximation. The idea is to use the PCA
method for the SVD representation of the uncentered data matrix.
We define the matrix j,
~
αΘ
.=,~= ,
1/21/2
,
1/2
,,, jiijijiij tm δΔΔΘΘ αα (16)
From (15) and (16) we obtain:
.=~~
,,, jij
t
i δΘΘ αα
Using (16), (15) and (11) it is not hard to show that }
~
{=~
, jj θΘα
r
is the
matrix of eigenvectors αθ )~( j of the matrix tXXK ˆˆ=~
⋅ of size mm×
.)~(=)~(=)~(~
,,, αββαββα θθθ jjj
t
kkj tXXK
From (14) we obtain the representation for the uncentered data matrix
,= ,,,
t
ijji WX αα Θ
Multivariate convergence-targeted operator for the genetic algorithm …
Системні дослідження та інформаційні технології, 2017, № 1 135
from which we get the SVD representation for the uncentered data matrix
.~= ,
1/2
,,,
t
ijjkki WmX ΔΘαα
If the matrix of non-central second moments T̂ has )( qn − smallest
eigenvalues njqt j ≤≤+11,= we can use the “eigenvalue control parameter”
approximation and get the output data matrix jX ,
~
α of rang q
=~~=~
,
1/2
,,,
t
ijjkki WmX ΔΘαα ,)~...~( ,,
1/2
1,,1
1/2
1
t
iqqq
t
i WtWtm αα Θ++Θ (17)
where the eigenvalue matrix jk ,
~
Δ has rang q 0)==...==( 21 nqq ttt ++ .
We approximate iX ,α with rank n by the matrix iX ,
~
α which has rank q .
This is the analog of the Hotelling transformation.
Using the SVD representation we can estimate the mean square error qη for
this approximation:
=)~(1= 2
,,
1=1=
ii
n
i
m
q XX
mn αα
α
−η ∑∑
.1=~1=
1=
2
,,
1=1=1=
k
n
qk
t
ikkk
n
qk
n
i
m
t
n
Wtm
mn ∑∑∑∑
+
α
+α
⎟
⎟
⎠
⎞
⎜
⎜
⎝
⎛
Θ
The minimum error is obtained in the case if the matrix of non-central
second moments T̂ has )( qn − smallest eigenvalues njqt j ≤≤+1, .
The second case we can get this approximation using a projector P̂ , which
projects the data matrix X̂ onto the subspace spanned by the principal axes with
largest eigenvalues ,jt qj ≤≤1 . Let define matrix ikkki wwW )(=}{=~
, ′′′
r
)(1 qk ≤′≤ of the size qn× . This matrix consists from the first q largest
eigenvectors kw ′
r (20).
The projector )(1,ˆ qP is defined the following way
.ˆ=ˆˆ,~~=ˆ )(1,)(1,)(1,
,,
)(1,
,
qqqt
jkki
q
ji PPPWWP ⋅′′
Then it is not hard to show that the approximation iX ,
~
α in (17) can be
written using the projector )(1,ˆ qP
....=ˆ=~
,,1,,11
)(1,
,,,
t
jqqq
t
j
q
jiij WtWtPXX αααα Θ++Θ
Analysis of eigenvalues in SVD representation of the uncentered input data
matrix iX ,α used as population in GA can significantly accelerate the processes
of finding the Pareto front for the MOP. We verified this hypothesis for the
standard GA test problems [6].
Eigenvectors with the largest eigenvalues likely determine the subspace of
solutions of the MOP in which lies the Pareto front. Using an iterative procedure
O. Shadura, A. Petrenko, S. Svistunov
ISSN 1681–6048 System Research & Information Technologies, 2017, № 1 136
for uncentered data matrix from MOP we can faster converge to the optimal
solution subspace.
PCA-based genetic operator )(ˆˆˆ=)( pFUCPpGP
r
ooo
r allows to check the
genetic algorithms parameter sensitivity and the possible correlation between
parameters. We introduced a new algorithmic step applied to generation
modification step that performs data transformation based on PCA and inverse
PCA noise reduction operation the set of parameters used for GA.
EVOLUTIONARY SCHEMA PERFORMANCE IMPROVEMENT FOR NSGA-II
In the article we propose to modify NSGA-II [14] as one of the most common
GAs with specific operator shown on figure 1 that can be regarded as a denoising
factor for faster approximation and convergence to the true Pareto front consisting
of ideal individuals, we can apply orthogonal transformation to be able to
discover strong patterns in data set. NSGA-II features fast non-dominance sorting
procedure of population and preservation of a good convergence rate to the
optimal Pareto set and it preserves a spread of best individuals is using a diversity
preservation operation called crowding distance and non-dominated ranking
procedure. In case of NSGA-III [15] as a evolution of NSGA-II has more specific
algorithm schema based on reference point’s selection procedure.
On a Fig. 1 is shown how was integrated operator performing UPCA in the
algorithm. It is particularly important to notice that authors tested more
combinations, and case of schema described on Fig. 1, we got maximum of
benefits in speedup and algorithm convergence caused by increased size of
population matrix used for tournament selection pool.
RESULTS OF RUNNING DTLZ BENCHMARKS
The DTLZ problems [6] are a set of numerical MOP benchmarks that are used for
comparing and validating results from different GA algorithms. We present
results of the DTLZ benchmarks [6] for NSGA-II and NSGA-II with PCA noise
cleanup operator. We recognized that currently NSGA-III is outperforming
NSGA-II but here results are provided as a proof of concept. On Figure 2, 5 are
Fig. 1. New schema of algorithm
2M×N
M×N
New
population
Rank – 12 Rank – 3
Rank – 3
Transition matrix
Transition matrix
Multivariate convergence-targeted operator for the genetic algorithm …
Системні дослідження та інформаційні технології, 2017, № 1 137
presented the parameter distribution (mean and standard deviation values) and
cost function values behavior depending on used algorithms.
Comparing Fig. 5, 6 and Fig. 7 where was applied noise-removing procedure
and Figure 2, 3 and Figure 4 where was not, we can observe faster convergence to
PopDist10
Entries 2400
Mean 0,4911
Std deviation 0,2422
Population distribution
TGenes/Bins
z
Fig. 2. Population distribution on 10th generation - NSGA-II - DTLZ2
DTLZ2
NSGA–2
Entries 200
Mean Y1 0,7881
Mean Y2 0,824
Mean Y3 0,754
Std deviation Y1 0,563
Std deviation Y2 0,5814
Std deviation Y3 0,4796
Fig. 3. Pareto Front on 10th generation of NSGA-II - DTLZ2
DTLZ4
NSGA–2
Entries 200
Mean Y1 0,6303
Mean Y2 0,4336
Mean Y3 0,6171
Std deviation Y1 0,5334
Std deviation Y2 0,5928
Std deviation Y3 0,7106
Fig. 4. Pareto Front on 40th generation of NSGA-II - DTLZ4
O. Shadura, A. Petrenko, S. Svistunov
ISSN 1681–6048 System Research & Information Technologies, 2017, № 1 138
the ideal values of the parameters in the first case. Fig. 7 and Fig. 6 show the first
approach to Pareto front in combination with correct set of parameters.
Fig. 6. Pareto Front on 10th generation of NSGA-II with preprocessing of data - DTLZ2
PopDist10
Entries 2400
Mean 0,487
Std deviation 0,4
Population distribution
TGenes/Bins
z
DTLZ2
NSGA2UPSA
Entries 200
Mean Y1 0,615
Mean Y2 0,6235
Mean Y3 0,5528
Std deviation Y1 0,384
Std deviation Y2 0,4275
Std deviation Y3 0,4366
Fig. 5. Population distribution on 10th generation — NSGA-II with preprocessing of
data -DTLZ2
DTLZ4
NSCA2UPCA
Entries 200
Mean Y1 0,6407
Mean Y2 0,5633
Mean Y3 0,6273
Std deviation Y1 0,3989
Std deviation Y2 0,4217
Std deviation Y3 0,4117
Fig.7. Pareto Front on 40th generation of NSGA-II with preprocessing of data - DTLZ4
Multivariate convergence-targeted operator for the genetic algorithm …
Системні дослідження та інформаційні технології, 2017, № 1 139
The next steps of our work will be to agree our concept with the existence of
fixed points in dynamical systems, to re-evaluate a possible speedup comparing to
other algorithms together with the “black-box” benchmarks [17] and port a new
algorithm as a part of the optimization framework for GeantV particle transport
simulations code.
CONCLUSIONS
In this work we tried to explore the possibility to combine genetic algorithms and
unsupervised machine learning (PCA/UPCA/SVD/KPCA) to obtain a powerful
combination that speedup existing GA algorithms. Usage of this algorithm for
performance optimization of simulation of particle physics with clearly give
benefit in finding optimal value with smaller number evolutions with costly
fitness function. Next step of work is to implement this algorithm as a part of
optimization routine in Geant-V project and test a benefits in full scale mode
system.
REFERENCES
1. Amadio G. GeantV: from CPU to accelerators / G. Amadio, A. Ananya, J. Aposto-
lakis, A. Arora, M. Bandieramonte, A. Bhattacharyya, C. Bianchini, R. Brun,
P. Canal, F. Carminati, L. Duhem, D. Elvira, A. Gheata, M. Gheata, I. Goulas,
R. Iope, S. Jun, G. Lima, A. Mohanty, T. Nikitina, M. Novak, W. Pokorski,
A. Ribon, R. Sehgal, O. Shadura, S. Vallecorsa, S. Wenzel, Y. Zhang // Journal
of Physics: Conference Series. — 2016. — Vol. 762, N 1. — P. 012019.
2. Allison J. Geant4 developments and applications / J. Allison, K. Amako, J. Aposto-
lakis, H. Araujo, P. Arce Dubois, M. Asai, G. Barrand, R. Capra and others //
IEEE Transactions on Nuclear Science. — 2006. — Vol. 53, N 1. — P. 270–278.
3. Amadio G. The GeantV project: preparing the future of simulation / G. Amadio,
J. Apostolakis, M.Bandieramonte, A. Bhattacharyya, C. Bianchini, R. Brun and
others // Journal of Physics: Conference Series. — 2015. — Vol. 664, N 7. —
P. 072006.
4. Apostolakis J. Towards a high performance geometry library for particle-detector
simulations / J.Apostolakis, M.Bandieramonte, G. Bitzes, R. Brun, P. Canal,
F. Carminati and others // Journal of Physics: Conference Series. — 2015. —
Vol. 608, N 1. — P. 012023.
5. Apostolakis J. Adaptive track scheduling to optimize concurrency and vectorization
in GeantV / J.Apostolakis, M.Bandieramonte, G. Bitzes, R. Brun, P. Canal,
F Carminati and others // Journal of Physics: Conference Series. — 2015. —
Vol. 608, N 1. — P. 012003.
6. Deb K. Scalable Test Problems for Evolutionary Multi-Objective Optimization, Evo-
lutionary Multiobjective Optimization: Theoretical Advances and Applications /
K. Deb, L. Thiele, M. Laumanns, E. Zitzler // Advanced Information and Knowl-
edge Processing - Evolutionary Multiobjective Optimization - Springer London.
— 2015. — P. 105–145.
7. Vose M. The Simple Genetic Algorithm: Foundations and Theory / M. Vose // MIT
Press. Cambridge. — 1999. — 251 p.
8. Rowe J.E. Genetic algorithm theory / J.E. Rowe // Proceedings of the 14th Annual
Conference Companion on Genetic and Evolutionary Computation
(GECCO2012). — New York, USA. — 2012. — P. 917–940.
O. Shadura, A. Petrenko, S. Svistunov
ISSN 1681–6048 System Research & Information Technologies, 2017, № 1 140
9. Rudolph G. Convergence properties of evolutionary algorithms / G. Rudolph //
Kovac, Hamburg. — 1997. — 286 p.
10. Schmitt F. On the Importance of the Second Largest Eigenvalue on the Convergence
Rate of Genetic Algorithms / F. Schmitt, F. Rothlauf // Proceedings of the 14th
Symposium on Reliable Distributed Systems. — 2001. — P. 559–564.
11. Cadima J. On Relationships between Uncentered and Column-centered Principal
Component Analysis / J. Cadima, I. Jolliffe // Pakistan Journal of Statistics. —
2009. — Vol. 25(4). — P. 473–503.
12. Honeine P. An eigenanalysis of data centering in machine learning / P. Honeine //
Preprint ArXiV ID: 1407.2904. — 2014. — 14 p.
13. Baker K. Singular value decomposition tutorial / K. Baker. — Available at:
https://datajobs.com/data-science-repo/SVD-Tutorial-%5BKirk-Baker%5D.pdf
14. Deb K. A fast and elitist multiobjective genetic algorithm: NSGA-II / K. Deb,
A. Pratap, S. Agarwal, T. Meyarivan // IEEE Transactions on Evolutionary
Computation. — 2002. — Vol. 6. — P.182–197.
15. Haitham S. U-NSGA-III: A unified evolutionary algorithm for single, multiple, and
many-objective optimization / S. Haitham, K. Deb // Lecture Notes in Computer
Science. — 2015. — Vol. 6019. — P. 34–49.
16. Annadurai S. Fundamentals of digital image processing / S. Annadurai // Pearson
Education India. — 2007. — 440 p.
17. Hansen N. COCO: A platform for Comparing Continuous Optimizers in a Black-
Box Setting / N. Hansen, A. Auger, O. Mersmann, T. Tuar, D. Brockhoff // Pre-
print ArXiV ID: 1603.08785. — 2016. — 10 p.
Received 19.01.2017
From the Editorial Board: the article corresponds completely to submitted manuscript.
|