Condensed Matter Physics, 2018, Vol. 21, No 3, 33602: 1–11
DOI: 10.5488/CMP.21.33602
http://www.icmp.lviv.ua/journal
A machine learning approach to the
Berezinskii-Kosterlitz-Thouless transition in classical
and quantum models
M. Richter-Laskowska1, H. Khan2, N. Trivedi2, M.M. Maśka1
1 Institute of Physics, University of Silesia, 75 Pułku Piechoty 1, 41-500 Chorzów, Poland
2 Department of Physics, The Ohio State University, 191 W. Woodruff Ave., Columbus, Ohio 43210, USA
Received May 31, 2018, in final form August 16, 2018
The Berezinskii-Kosterlitz-Thouless transition is a very specific phase transition where all thermodynamic quantities are smooth. Therefore, it is difficult to determine the critical temperature in a precise way. In this paper we
demonstrate how neural networks can be used to perform this task. In particular, we study how the accuracy
of the transition identification depends on the way the neural networks are trained. We apply our approach to
three different systems: (i) the classical XY model, (ii) the phase-fermion model, where classical and quantum
degrees of freedom are coupled and (iii) the quantum XY model.
Key words: phase transitions, topological defects, XY model, artificial neural networks, machine learning
PACS: 64.60.-i, 05.70.Fh, 07.05.Mh
1. Introduction
In many cases, thermodynamic phase transitions are clearly visible with well defined and easily
identifiable critical points. In the Landau picture, we define a macroscopic order parameter that is non-zero in the ordered phase and vanishes when we cross the critical temperature. Typically, the transition
is signaled by a discontinuity or divergence of some thermodynamic quantities such as specific heat or
magnetic susceptibility. There also exist unconventional phase transitions which are much more difficult to find. One example is the class of topological phase transitions, connected to the formation of topological defects,
such as dislocations in two-dimensional crystals, vortices in two-dimensional superconductors and so
on. The proliferation of these defects leads to the Berezinskii-Kosterlitz-Thouless (BKT) phase transition
[1–4] where thermodynamic quantities behave smoothly. Therefore, traditional detection methods based
on, e.g., the divergence of the specific heat or the spin susceptibility, cannot be used. In numerical
approaches, the main difficulty with the identification of the BKT transition stems from the fact that the
interaction between the topological charges depends logarithmically on the spatial separation. Therefore,
numerical results converge very slowly with the size of the system, and a precise determination of the
critical temperature is a computationally challenging task. The standard approach is based on the scaling
properties of, e.g., the spin stiffness or superfluid density. This is particularly difficult for quantum models,
where numerical methods are usually very involved and memory- and time-consuming.
Besides the traditional methods which rely on the idea of the order parameter or quantities accessible in experiments, such as the specific heat or magnetic susceptibility, attempts to use artificial neural networks to identify phases in condensed matter and transitions between them have recently been undertaken [5–
20]. These new methods have proven to be accurate and reliable for classical Ising-type models [5–7].
Less spectacular progress has been made for quantum systems [15, 16] and for systems with topological
phase transitions [17–20]. In the latter case, the main difficulty comes from the fact that the topologically
ordered phase is not described by a local order parameter [1–4]. Instead, its formation is connected with
suppression of non-local topological defects which are difficult to identify.
In this paper, we demonstrate the application of machine learning approaches to identify topological
transitions in a few different types of two-dimensional classical and quantum systems. In particular, we
study the classical XY (c-XY) model, the phase-fermion (PF) model [21], where the interaction is only between
quantum and classical degrees of freedom, and the fully quantum XY (q-XY) model.
The first of these models has already been thoroughly analyzed in [17]. The authors have shown there
that treating spin configurations as raw images in the case of a feed-forward network does not lead to
the correct value of the critical temperature. Instead, they propose to preprocess the spin configurations
into vorticity and then use the results to train two kinds of artificial neural networks (ANN): a one-layer
feed-forward network and a deep convolutional network. In both cases, the results are scaled with the
system size towards the correct value of the critical temperature, but the one-layer network performed
poorly for large systems. In the present approach, we do not have convolutional layers, but we use a deep
feed-forward network composed of four fully-connected layers (we learned that the choice of particular
meta-parameters is not crucial for the network performance). What is important is that, instead of using
raw spin configurations where each spin is represented by a number from 0 to 2π (which gives rather
inaccurate results, as demonstrated in [17]), we represent the configurations as vectors of sines and
cosines of the spin angles, which reflects the system’s symmetry.
2. Models
The first model describes classical spins of unit length on a square lattice with a nearest-neighbor interaction given by the Hamiltonian

H_{\mathrm{c-XY}} = -J \sum_{\langle i,j \rangle} \cos(\theta_i - \theta_j),   (2.1)
where J is the coupling between the spins, and θi describes the direction of spin i. It is known that this
model exhibits the BKT phase transition, and precise finite-size scaling gives the critical temperature
TBKT ≈ 0.8935 in units of J [22].
In the next model, classical spins θi interact with fermions. The Hamiltonian of the PF model is given
by
H_{\mathrm{PF}} = -t \sum_{\langle i,j \rangle, \sigma} \hat{c}^{\dagger}_{i\sigma} \hat{c}_{j\sigma} + g \sum_i \left( \mathrm{e}^{\mathrm{i}\theta_i} \hat{c}_{i\uparrow} \hat{c}_{i\downarrow} + \mathrm{h.c.} \right),   (2.2)

where \hat{c}^{\dagger}_{i\sigma} (\hat{c}_{i\sigma}) is an operator that creates (annihilates) a spin-σ electron at lattice site i, and g describes
the strength of the interaction between the classical and quantum degrees of freedom. We set the hopping
integral t as the energy unit (t = 1). In this model, the fermions mediate an effective interaction between
the classical spins θi which also leads to the BKT phase transition. The critical temperature is a function
of g, and for g = 2 Monte Carlo (MC) simulations give TBKT ≈ 0.12 [21]. The PF model can be treated
as an approximation of the boson-fermion model [23], valid when fluctuations of the number of bosons
at a lattice site can be neglected.
The Hamiltonian of the last model, the quantum XY model, is given by
H_{\mathrm{q-XY}} = \frac{E_c}{2} \sum_i \hat{n}_i^2 - J \sum_{\langle i,j \rangle} \cos\left(\hat{\theta}_i - \hat{\theta}_j\right),   (2.3)

where \hat{n}_i is the number operator that is canonically conjugate to the quantum phase operator \hat{\theta}_i, and E_c
is the charging energy.
To train neural networks and to classify phases, one needs extensive sets of spin configurations
generated at different temperatures. They were produced with the help of MC simulations. In the case
of the classical XY model, we directly used the Metropolis algorithm. For the PF model, we also used
the Metropolis algorithm, but in each MC step (i.e., for each generated spin configuration) we needed
to diagonalize the fermionic Hamiltonian (2.2) [24]. For the quantum XY model, we used a quantum
MC method in which the Hamiltonian was mapped to a classical action of spins on an effective (2+1)D
lattice. We then used a Wolff cluster algorithm to simulate these spins. In all cases, the simulations were
performed on 16 × 16 systems.
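For concreteness, below is a minimal sketch of Metropolis sampling for the c-XY model. It is an illustration, not the production code used here: the PF model additionally requires diagonalizing Hamiltonian (2.2) at every step, and the q-XY model was simulated with the Wolff cluster algorithm. The function name and default parameters are ours; we set k_B = 1.

```python
import numpy as np

def metropolis_xy(L=16, T=0.5, J=1.0, n_sweeps=2000, rng=None):
    """Sample one c-XY configuration on an L x L periodic lattice
    (Metropolis algorithm, k_B = 1, energy from equation (2.1))."""
    if rng is None:
        rng = np.random.default_rng()
    theta = rng.uniform(0.0, 2.0 * np.pi, size=(L, L))
    for _ in range(n_sweeps * L * L):   # one sweep = L*L single-spin updates
        i, j = rng.integers(0, L, size=2)
        new = rng.uniform(0.0, 2.0 * np.pi)
        nn = (theta[(i + 1) % L, j], theta[(i - 1) % L, j],
              theta[i, (j + 1) % L], theta[i, (j - 1) % L])
        # energy change of the proposed update theta[i, j] -> new
        dE = -J * sum(np.cos(new - t) - np.cos(theta[i, j] - t) for t in nn)
        if dE <= 0.0 or rng.random() < np.exp(-dE / T):
            theta[i, j] = new
    return theta
```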
In all these models, the helicity modulus Υ, defined at finite temperature as the second derivative of the free energy with respect to an externally imposed global twist across the sample [25], has a universal jump at the BKT transition. In the thermodynamic limit, it drops at TBKT from (2/π)TBKT to zero. However, in finite systems it evolves smoothly and converges only logarithmically to the thermodynamic limit, as shown in figures 1 (a), 1 (c), and 1 (e). A finite-size scaling of the temperature at which Υ(T) crosses (2/π)T can give a rough estimate of TBKT. This is demonstrated in figures 1 (b), 1 (d), and 1 (f). It is also possible to determine TBKT in a more precise way. At the critical temperature, the helicity modulus scales with the system size according to the Kosterlitz renormalization group (RG) equations [26],
Figure 1. (Colour online) Helicity modulus for the c-XY model (a), the PF model for g = 4t (c), and the q-XY model for Ec = 0.1J (e), shown for a series of system sizes (from 16 × 16 to 64 × 64, from 5 × 5 to 36 × 36, and from 16 × 16 to 48 × 48, respectively) together with the line (2/π)T. The corresponding critical temperatures extrapolated to the thermodynamic limit are presented in panels (b), (d), and (f); the extrapolations give TBKT = 0.893 (b), TBKT = 0.089 (d), and TBKT = 0.891 (f). Since T*(L) − TBKT ∝ log(L)^{−2}, where T*(L) is the temperature at which Υ drops the most rapidly in an L × L system [22, 27, 28], the temperature is presented in these panels as a function of log(L)^{−2}.
Figure 2. (Colour online) The root-mean-square error δ for fitting MC results for the c-XY model (a), the PF model (b), and the q-XY model (c) to the RG predictions given by equation (2.4). The sharp minimum of the fitting errors indicates TBKT. The model parameters are the same as in figure 1.
\Upsilon_L = \Upsilon_{L \to \infty} \left[ 1 + \frac{1}{2} \, \frac{1}{\ln(L) + C} \right],   (2.4)
where L is the linear size of the system and C is a constant. Therefore, if one fits ΥL(T) calculated in MC simulations for different system sizes L to the RG predictions, the fitting errors drop almost to zero exactly at TBKT [29]. This procedure is illustrated in figure 2.
3. Artificial neural network
Here, however, we study how the critical temperature can be determined with the help of an ANN trained
to identify the low- and high-temperature phases. When a system approaches the critical temperature,
thermal fluctuations increase drastically so that the spin configurations can be very different from con-
figurations generated at very low temperatures. However, as long as T is below TBKT, the system is still
in the same low-temperature phase. The problem with systems where the BKT transition takes place,
such as the c-XY, PF or q-XY models, is that in the low-temperature phase, MC simulations on finite
clusters show finite magnetization, which according to the Mermin-Wagner theorem cannot exist in the
thermodynamic limit. Since the MC results are used to train the ANN, it is possible that the network will
learn finite-size features of the spin configurations, such as magnetization. It was proposed in [17] that
this difficulty can be overcome by adding a convolutional layer designed to identify topological defects in
spin configurations. Then, the network learns configurations of vortices and antivortices instead of raw
spin configurations.
Our aim here is to show how the features that can be used to identify phases depend on the distance
from the critical temperature TBKT. The standard ANN-based approach to the problem of finding a
phase transition is to train the ANN to recognize features of the low- and high-temperature phases.
Within the scheme of supervised learning, one needs to feed the ANN with labelled spin configurations
generated at low and high temperatures. The network is supposed to extract characteristic patterns
and to learn how to classify configurations which were not used at the training stage. Usually, this
is an easy task for configurations generated at very low or at very high temperatures, since they are
clearly distinctive. However, to precisely determine the critical temperature, the ANN must distinguish
configurations generated slightly below TBKT and slightly above TBKT. Thermal fluctuations in this regime
make those configurations very different from the fully ordered low-temperature configurations and from
completely random high-temperature configurations. The question then is whether an ANN trained at extreme temperatures is capable of classifying the phases close to TBKT. In other words, do the
distinctive features learned by the ANN at the extremes balance each other out just right in the evaluation
process such that the correct TBKT is predicted? To answer this question we trained the ANN at different
distances from TBKT and checked how the distance |T − TBKT| affects the accuracy of finding TBKT. To be precise, we used MC simulations to generate sets of configurations {C1}, {C2}, . . . , {CN} at temperatures
T1, T2, . . . , TN (T1 < T2 < . . . < TN ) below and above TBKT. For the c-XY, PF, and q-XY models, the
ranges of temperatures were from 0.1J to 1.6J, from 0.02t to 0.2t, and from 0.1J to 1.5J, respectively.
Then, we generated sets of configurations representing different low- (Lm) and high-temperature (Hm)
ranges:
\mathcal{L}_m = \bigcup_{i=1}^{m} C_i \quad \text{and} \quad \mathcal{H}_m = \bigcup_{i=N-m+1}^{N} C_i,   (3.1)
where m < N/2 is the number of temperatures in each range. We randomly removed from Lm and Hm some number of configurations to keep the cardinality of these sets fixed. In this way, our study of how the critical temperature predicted by the ANN depends on the range of temperatures used to train it was not affected by a different number of configurations for different ranges. In order to connect m with the temperature range used in the training, we introduce τ as a measure of the relative temperature range:
\tau = \frac{T_{\mathrm{high}} - T_{\mathrm{low}}}{T_{\mathrm{BKT}} - T_{\mathrm{low}}},   (3.2)
where Tlow and Thigh are the lowest and the highest temperatures below TBKT that were used to train the ANN.
Configurations from Lm and Hm were mixed together and shuffled, and then they were used to teach
the ANN to identify the low- and high-temperature phases.
The spin configurations generated in MC are stored as numbers θi from 0 to 2π. However, in order to take into account the character of classical two-dimensional spins, the configurations were rewritten as an array composed of cosines and sines: [cos θ1, . . . , cos θN, sin θ1, . . . , sin θN], where N = L × L is the number of lattice sites. This is equivalent to a representation by complex numbers and has the advantage that almost parallel spins are represented by close numbers, which is not the case for the original representation by the angles θi.
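In code, this encoding amounts to the following trivial sketch (the function name is ours):

```python
import numpy as np

def encode_configuration(theta):
    """Map an L x L array of angles to the length-2N input vector
    [cos theta_1, ..., cos theta_N, sin theta_1, ..., sin theta_N]."""
    flat = theta.ravel()
    return np.concatenate([np.cos(flat), np.sin(flat)])
```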
Since the generation of extensive sets of spin configurations for different system sizes and at different
temperatures is a time-consuming task, especially for the PF and q-XY models, we used a technique
known from machine learning-based image recognition to increase the number of configurations without
additional MC simulations. Namely, we transformed the original configurations according to symmetries
of the system. We used periodic boundary conditions, which allowed us to apply translations to create
new configurations. Other transformations were reflections and rotations. Figure 3 schematically shows
how the ANN is used to classify spin configurations.
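A minimal sketch of such symmetry-based augmentation, assuming configurations are stored as numpy arrays of angles; because the XY spins are internal degrees of freedom, these lattice transformations permute the sites without changing the angles themselves:

```python
import numpy as np

def augment(theta, rng):
    """Return a symmetry-transformed copy of an L x L configuration:
    a random translation (valid under periodic boundary conditions),
    an optional reflection, and a random 90-degree lattice rotation."""
    L = theta.shape[0]
    out = np.roll(theta, shift=tuple(rng.integers(0, L, size=2)), axis=(0, 1))
    if rng.random() < 0.5:
        out = np.fliplr(out)               # reflection
    return np.rot90(out, k=int(rng.integers(0, 4)))
```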
The ANN was implemented using the KERAS package with TENSORFLOW as the computational backend. We used a deep feedforward network with four fully connected hidden layers; the layer sizes, including the 2N = 512 input layer for a 16 × 16 system, were 512, 192,
Figure 3. (Colour online) A scheme of a binary classification process in a feedforward ANN: a given spin configuration {θi}, i = 1, . . . , N, is rewritten as a length-2N vector [cos θ1, . . . , cos θN, sin θ1, . . . , sin θN] and is presented to the input layer of the ANN (green circles). Then, activations flow across fully connected hidden layers (red circles) to the output layer composed of only one neuron (violet circle).
Figure 4. (Colour online) Visualization of the weights in a deep ANN composed of layers of 512, 192, 64, 16, and 16 neurons. White colour corresponds to zero weight, blue to negative weight and red to positive weight. "Weights 1" connect the output of layer 1 to layer 2, so they are represented by a rectangle with 512 × 192 colour squares. Similarly, "weights 2" are represented by a 192 × 64 rectangle, etc. The network was initiated with all weights close to zero (we assumed a finite value |wi| < 10^{−3} to avoid the vanishing gradient problem [30]), so blue and red parts indicate neurons which were activated during training.
64, 16, and 16 neurons. Such a structure was a balance between the number of training epochs required
for convergence and the time needed for one epoch. It turns out, however, that the metaparameters are
not crucial for the ANN performance. As the activation function, a rectified linear unit (ReLU) was used
in the hidden layers and a sigmoid function in the output layer. The network was trained to minimize the
distance between the MC data and the model predictions defined by the binary cross entropy. The loss
function is given by
\mathcal{L} = -\sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right],   (3.3)
where yi are labels and pi are the corresponding predictions.
We found that with the Tikhonov regularization (L2) [31], the network performs better in the classification of the low- and high-temperature phases. Figure 4 shows an example of learned weights used to
classify phases of a 16 × 16 system. One can see that despite a rather large size of the network, most of
the neurons are activated.
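A minimal KERAS sketch consistent with this description is given below. The optimizer, regularization strength, and batch size are not specified in the paper and are illustrative assumptions; we read the listed sizes as a 2N = 512 input layer (for a 16 × 16 system) followed by four hidden layers, in line with figure 4.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_classifier(n_sites=256, l2_strength=1e-4):
    """Binary phase classifier: maps a length-2N [cos, sin] vector to
    P(high-temperature phase); layer widths follow the text above."""
    hidden = [layers.Dense(width, activation="relu",
                           kernel_regularizer=regularizers.l2(l2_strength))
              for width in (192, 64, 16, 16)]
    model = keras.Sequential(
        [keras.Input(shape=(2 * n_sites,))]         # 512 inputs for 16 x 16
        + hidden
        + [layers.Dense(1, activation="sigmoid")])  # single output neuron
    # binary cross entropy is the loss of equation (3.3)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# model = build_classifier(16 * 16)
# model.fit(x_train, y_train, epochs=50, batch_size=64, validation_split=0.1)
```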
4. Results
Figure 5 shows how the predictions for TBKT of the c-XY model calculated by the ANN depend
on the way the network was trained. In each case, we trained the neural network using a 10-fold cross
validation technique [32] and repeated this procedure 10 times. As a result, we obtained 100 possible
values of probability P that a given configuration belongs to the high-temperature phase (the probability
that it belongs to the low-temperature phase is 1 − P). Each time, the ANN was initialized with random
weights and biases, and a larger spread of the predictions indicates a greater difficulty in an unambiguous
classification of the phase. The same method of multiple trainings starting from different random weights
and biases was used in [13] to determine the standard error of the predicted critical temperature. One
can see in figure 5 that even if the ANN was trained only at the extreme temperatures (T = 0.1 and
T = 1.6, m = 1), corresponding to fully ordered and completely random configurations, the average
predicted TBKT is not far from the actual value. This could be an accidental coincidence because with
an increasing m the deviation slightly increases, but always remains below 10%, which can be seen in
figure 5 (d). For m = 16, the deviation is smaller than the line width. The problem, however, is that in
Figure 5. (Colour online) The probability, calculated by the ANN, that a given configuration belongs to the high-temperature phase of the c-XY model. The network was trained on labelled data generated according to equation (3.1) for m = 1 (a), m = 10 (b), and m = 16 (c). Then, the network determined the probabilities 100 times for configurations generated at different temperatures, each time starting from different weights and biases. The vertical error bars show the standard deviation. The solid red line is the best fit P(T) = 0.5 tanh[α(T − TBKT)] + 0.5, where α and TBKT are fitting parameters. The dashed red line shows 1 − P(T), which is the probability of being classified as the low-temperature phase. The black arrows indicate the temperatures (a) or the ranges of temperatures (b), (c) used to train the network. The black vertical line indicates TBKT determined from the RG equations. The dashed green line in panel (c) shows the magnetization. Comparing panels (a)–(c), one can see that for the c-XY model, the average critical temperature does not change significantly with an increasing range of training temperatures, but the spread of the results shrinks. The temperatures at which the ANN was trained and tested were (in units of J): from 0.10 to 0.70 and from 1.10 to 1.60 with stepsize 0.05, and from 0.750 to 1.050 with stepsize 0.025. Panel (d) shows the estimated TBKT as a function of τ. The horizontal red line shows the critical temperature determined from fitting the MC results to the RG equation (2.4). The right-hand vertical axis shows (TBKT − T⁰BKT)/T⁰BKT × 100%, where TBKT is the average ANN prediction and T⁰BKT is the actual critical temperature.
this case, the uncertainty is large. Thus, for a precise determination of TBKT, averaging over a large
number of configurations generated at different temperatures is necessary. For example, for m = 1, the
average prediction for 1000 statistically independent configurations is less than 1% off the actual TBKT,
but individual predictions are spread within the range of ±18% around this value. This means that if
one saves the computational time required to generate configurations for training, more configurations
should be generated for the evaluation stage. With an increasing width of the temperature range used
for training, the spread of the calculated probabilities decreases significantly. Figure 5 (d) shows how
the average TBKT depends on the number of different temperatures used in training. On the right-hand
vertical axis, showing the relative error, one can see that the error is always below 10%, and with an
increasing number m, it converges to the actual value of TBKT for the c-XY model.
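A sketch of this extraction step, fitting the tanh form quoted in the caption of figure 5 to the averaged network outputs; the array names are hypothetical:

```python
import numpy as np
from scipy.optimize import curve_fit

def p_model(T, alpha, T_bkt):
    # Sigmoid form fitted to the averaged ANN outputs (figure 5 caption)
    return 0.5 * np.tanh(alpha * (T - T_bkt)) + 0.5

# temperatures: grid of test temperatures; p_mean: network outputs averaged
# over the 100 independently initialized trainings at each temperature.
# (alpha_fit, T_bkt_fit), _ = curve_fit(p_model, temperatures, p_mean,
#                                       p0=(10.0, 0.9))
```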
In the case of the PF model, the results are different. As can be seen in figure 6, the spread of the
calculated probabilities is less dependent on m, but the average critical temperature strongly depends on m.
This means that for the PF model, an increase of the number of configurations used at the stage of phase
classification will not guarantee a more precise estimation of TBKT. Instead, for this model, a sufficiently
Figure 6. (Colour online) The same as in figure 5, but for the PF model; panels correspond to m = 1, τ = 0 (a), m = 4, τ = 0.3 (b), and m = 8, τ = 0.7 (c). In this case, the critical temperature is affected by the width of the range of training temperatures, and a relatively wide range is necessary to obtain a precise TBKT [cf. figure 6 (b)]. The temperatures at which the ANN was trained and tested were (in units of t): from 0.02 to 0.20 with stepsize 0.01.
wide range of temperatures at which configurations for training are generated is necessary. From the
physical point of view, this means that in the PF model, extremely low-temperature configurations and
extremely high-temperature configurations are more different from the configurations close to the critical
point than in the case of the c-XY model. It can be seen in figure 6 (d) that for the PF model, the relative
error for m = 1 extends to more than 20%.
The distribution of probabilities calculated for the q-XY model is similar to that for its classical
counterpart. It is presented in figure 7. Though for m = 1, the estimated critical temperature differs from
its real value by 14% [see figure 7 (d)], the difference decreases very quickly with an increasing m and
already for m > 3, the relative error is around 2%.
The results show that in the case of the PF model, a much richer set of configurations is required to
properly train the ANN than for the XY models. The reason may be connected to the different character
of this model. In both the classical and quantum XY models, the interaction range is limited to nearest
neighbors. On the other hand, fermions in the PF model mediate the effective interactions between
arbitrarily spaced lattice sites. This effect is seen in figure 1: at low temperature the helicity modulus in
the c-XY and q-XY models converges even for very small systems. This is not the case for the PF model,
where even at very low temperatures (i.e., in an almost fully ordered state) the energy per lattice site
depends on the system size. This results from the delocalization of fermions which in the ordered state
are similar to quantum particles in an infinite quantum well, with their energies strongly dependent on
the size of the well.
5. Summary
We have demonstrated how the accuracy of finding the BKT transition in three different models
depends on the range of temperatures at which the ANN was trained. We used a simple feedforward
network with densely connected hidden layers. We did not perform any feature engineering of the spin
Figure 7. (Colour online) The same as in figure 5, but for the q-XY model; panels correspond to m = 1, τ = 0 (a), m = 4, τ = 0.5 (b), and m = 8, τ = 0.88 (c). Similarly to the c-XY model, here too the critical temperature is rather insensitive to the width of the range of training temperatures, but the spread of the results decreases with an increasing m. The temperatures at which the ANN was trained and tested were (in units of J): from 0.1 to 0.8 and from 1.2 to 1.5 with stepsize 0.1, and from 0.85 to 1.15 with stepsize 0.05.
configurations generated in MC simulations and we did not use convolutional layers. Therefore, the
phase classification was based on raw spin configurations, not on the explicitly extracted vortices as in
[17]. Nevertheless, in figure 5 (c) we compare the calculated probabilities with the magnetization that results
from the finite size of the system. One can see there that the section of P(T) that indicates the BKT
transition is much steeper than the temperature dependence of the magnetization, even if the network was
trained at the extreme temperatures [figure 5 (a)]. Therefore, we believe that the ANN learns not only
the magnetization (which would vanish in the thermodynamic limit), but also some topological features
connected with the BKT transition. One also cannot exclude that the ANN is capable of learning the
character of the spin-spin correlations which change their behavior at the BKT transition. To confirm
this, however, at least a finite size analysis of the ML results would be necessary, which has not been
performed here. However, our aim was different: we wanted to demonstrate how the critical temperature determined by the ANN depends on the composition of the training set. As one can expect, the larger the variety of the configurations representing the low-temperature and high-temperature phases, the better the accuracy of the critical temperature. We found, however, that for the c-XY and q-XY models,
the average TBKT was close to the actual one even if the ANN was trained relatively far from the critical
point. Increasing the range of temperatures at which the network was trained only slightly improves
numerical accuracy (i.e., the difference between the average TBKT determined by the ANN and the value
found from the RG equations), but significantly reduces the uncertainty. The situation is different for
the PF model, where the numerical accuracy is strongly dependent on the temperature range used at the
training stage. We attribute this behavior to the long-range effective interaction present in the PF model,
which can lead to a longer range of the spin-spin correlations and their different temperature dependence.
Despite the difference in the results for the XY models and the PF model, in all cases it is important
to train the ANN not only at very low and at very high temperatures, but also as close as possible to the
critical temperature. The main problem in training at temperatures close to TBKT is that for supervised
learning, the configurations must be labelled, so one should know at least an approximate value of the
critical temperature. One of the ways to overcome this difficulty is to use the learning by confusion
approach [14] based on a combination of supervised and unsupervised techniques.
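A schematic sketch of that scheme, assuming the build_classifier of the earlier sketch and hypothetical data containers; the real method of [14] evaluates accuracy on held-out data, which we abbreviate here with the final training accuracy:

```python
import numpy as np

def confusion_curve(temps, X_by_T, trial_tcs, make_model, epochs=10):
    """Learning-by-confusion sketch [14]: for each trial critical
    temperature Tc', label configurations by T > Tc', retrain the
    classifier, and record its accuracy; the middle peak of the
    W-shaped accuracy curve indicates the actual transition."""
    accuracies = []
    X = np.concatenate([X_by_T[T] for T in temps])
    for tc in trial_tcs:
        y = np.concatenate([np.full(len(X_by_T[T]), float(T > tc))
                            for T in temps])
        model = make_model()                      # fresh network per trial label
        hist = model.fit(X, y, epochs=epochs, verbose=0)
        accuracies.append(hist.history["accuracy"][-1])
    return np.array(accuracies)

# acc = confusion_curve(temps, X_by_T, trial_tcs,
#                       lambda: build_classifier(16 * 16))
```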
Acknowledgements
M.M.M. acknowledges support by NCN (Poland) under grant 2016/23/B/ST3/00647. H.K. and N.T.
acknowledge funding from grant no. NSF DMR 1629382.
References
1. Berezinskii V.L., Sov. Phys. JETP, 1971, 32, 493.
2. Berezinskii V.L., Sov. Phys. JETP, 1972, 34, 610.
3. Kosterlitz J.M., Thouless D.J., J. Phys. C: Solid State Phys., 1972, 5, L124, doi:10.1088/0022-3719/5/11/002.
4. Kosterlitz J.M., Thouless D.J., J. Phys. C: Solid State Phys., 1973, 6, 1181, doi:10.1088/0022-3719/6/7/010.
5. Carrasquilla J., Melko R.G., Nat. Phys., 2017, 13, 431, doi:10.1038/nphys4035.
6. Morningstar A., Melko R., J. Mach. Learn. Res., 2018, 18, 163.
7. Ponte P., Melko R., Phys. Rev. B, 2017, 96, 205146, doi:10.1103/PhysRevB.96.205146.
8. Zhang Y., Kim E.-A., Phys. Rev. Lett., 2017, 118, 216401, doi:10.1103/PhysRevLett.118.216401.
9. Wang L., Phys. Rev. B, 2016, 94, 195105, doi:10.1103/PhysRevB.94.195105.
10. Hu W., Singh R.R.P., Scalettar R.T., Phys. Rev. E, 2017, 95, 062122, doi:10.1103/PhysRevE.95.062122.
11. Wetzel S.J., Phys. Rev. E, 2017, 96, 022140, doi:10.1103/PhysRevE.96.022140.
12. Broecker P., Carrasquilla J., Melko R.G., Trebst S., Sci. Rep., 2017, 7, 8823,
doi:10.1038/s41598-017-09098-0.
13. Ch’ng K., Carrasquilla J., Melko R.G., Khatami E., Phys. Rev. X, 2017, 7, 031038,
doi:10.1103/PhysRevX.7.031038.
14. Van Nieuwenburg E.P.L., Liu Y.-H., Huber S.D., Nat. Phys., 2017, 13, 435, doi:10.1038/nphys4037.
15. Torlai G., Mazzola G., Carrasquilla J., Troyer M., Melko R., Carleo G., Nat. Phys., 2018, 14, 447,
doi:10.1038/s41567-018-0048-5.
16. Carleo G., Troyer M., Science, 2017, 355, 602, doi:10.1126/science.aag2302.
17. Beach M.J.S., Golubeva A., Melko R.G., Phys. Rev. B, 2018, 97, 045207, doi:10.1103/PhysRevB.97.045207.
18. Deng D.-L., Li X., Sarma S.D., Phys. Rev. B, 2017, 96, 195145, doi:10.1103/PhysRevB.96.195145.
19. Zhang W., Liu J., Wei T.-C., Preprint arXiv:1804.02709, 2018.
20. Rodriguez-Nieva J.F., Scheurer M.S., Preprint arXiv:1805.05961, 2018.
21. Maśka M.M., Trivedi N., Preprint arXiv:1706.04197, 2017.
22. Hsieh Y.-D., Kao Y.-J., Sandvik A.W., J. Stat. Mech.: Theory Exp., 2013, 2013, P09001,
doi:10.1088/1742-5468/2013/09/P09001.
23. Micnas R., Ranninger J., Robaszkiewicz S., Rev. Mod. Phys., 1990, 62, 113, doi:10.1103/RevModPhys.62.113.
24. Maśka M.M., Czajka K., Phys. Rev. B, 2006, 74, 035109, doi:10.1103/PhysRevB.74.035109.
25. Fisher M.E., Barber M.N., Jasnow D., Phys. Rev. A, 1973, 8, 1111, doi:10.1103/PhysRevA.8.1111.
26. Kosterlitz J.M., J. Phys. C: Solid State Phys., 1974, 7, 1046, doi:10.1088/0022-3719/7/6/005.
27. Schultka N., Manousakis E., Phys. Rev. B, 1994, 49, 12071, doi:10.1103/PhysRevB.49.12071.
28. Tomita Y., Okabe Y., Phys. Rev. B, 2002, 65, 184405, doi:10.1103/PhysRevB.65.184405.
29. Weber H., Minnhagen P., Phys. Rev. B, 1988, 37, 5986(R), doi:10.1103/PhysRevB.37.5986.
30. Sussillo D., Abbott L.F., Preprint arXiv:1412.6558v3, 2015.
31. Ng A.Y., In: Proceedings of the Twenty-First International Conference on Machine Learning (Banff, Canada,
2004), ACM, New York, 2004, 78, doi:10.1145/1015330.1015435.
32. Gunasegaran T., Cheah Y.-N., In: Proceedings of the 8th International Conference on Information Technology
(Amman, Jordan, 2017), IEEE, 2017, 89–95, doi:10.1109/ICITECH.2017.8079960.