Condensed Matter Physics, 2018, Vol. 21, No 3, 33602: 1–11
DOI: 10.5488/CMP.21.33602
http://www.icmp.lviv.ua/journal

A machine learning approach to the Berezinskii-Kosterlitz-Thouless transition in classical and quantum models

M. Richter-Laskowska¹, H. Khan², N. Trivedi², M.M. Maśka¹

¹ Institute of Physics, University of Silesia, 75 Pułku Piechoty 1, 41-500 Chorzów, Poland
² Department of Physics, The Ohio State University, 191 W. Woodruff Ave., Columbus, Ohio 43210, USA

Received May 31, 2018, in final form August 16, 2018

The Berezinskii-Kosterlitz-Thouless transition is a very specific phase transition at which all thermodynamic quantities are smooth. It is therefore difficult to determine the critical temperature precisely. In this paper we demonstrate how neural networks can be used to perform this task. In particular, we study how the accuracy of the transition identification depends on the way the neural networks are trained. We apply our approach to three different systems: (i) the classical XY model, (ii) the phase-fermion model, in which classical and quantum degrees of freedom are coupled, and (iii) the quantum XY model.

Key words: phase transitions, topological defects, XY model, artificial neural networks, machine learning

PACS: 64.60.-i, 05.70.Fh, 07.05.Mh

1. Introduction

In many cases, thermodynamic phase transitions are clearly visible, with well defined and easily identifiable critical points. In the Landau picture, we define a macroscopic order parameter that is non-zero in the ordered phase and vanishes when we cross the critical temperature. Typically, the transition is signaled by a discontinuity or divergence of some thermodynamic quantity, such as the specific heat or the magnetic susceptibility.

There also exist unconventional phase transitions which are much more difficult to locate. One example is topological phase transitions, connected with the formation of topological defects such as dislocations in two-dimensional crystals or vortices in two-dimensional superconductors. The proliferation of these defects leads to the Berezinskii-Kosterlitz-Thouless (BKT) phase transition [1–4], at which thermodynamic quantities behave smoothly. Therefore, traditional detection methods based on, e.g., the divergence of the specific heat or the spin susceptibility cannot be used. In numerical approaches, the main difficulty with the identification of the BKT transition stems from the fact that the interaction between the topological charges depends logarithmically on the spatial separation. As a consequence, numerical results converge very slowly with the size of the system, and a precise determination of the critical temperature is a computationally challenging task. The standard approach is based on the scaling properties of, e.g., the spin stiffness or the superfluid density. This is particularly difficult for quantum models, where numerical methods are usually very involved and memory- and time-consuming.

Besides the traditional methods, which rely on the idea of the order parameter or on quantities accessible in experiments, such as the specific heat or the magnetic susceptibility, attempts to use artificial neural networks to identify phases in condensed matter and transitions between them have recently been undertaken [5–20]. These new methods have proven to be accurate and reliable for classical Ising-type models [5–7]. Less spectacular progress has been made for quantum systems [15, 16] and for systems with topological phase transitions [17–20]. In the latter case, the main difficulty comes from the fact that the topologically ordered phase is not described by a local order parameter [1–4]. Instead, its formation is connected with the suppression of non-local topological defects, which are difficult to identify.

In this paper, we demonstrate the application of machine learning approaches to identify topological transitions in a few different types of two-dimensional classical and quantum systems. In particular, we study the classical XY (c-XY) model, the phase-fermion (PF) model [21], where the interaction acts only between quantum and classical degrees of freedom, and the fully quantum XY (q-XY) model. The first of these models has already been thoroughly analyzed in [17]. The authors have shown there that treating spin configurations as raw images in the case of a feed-forward network does not lead to the correct value of the critical temperature. Instead, they propose to preprocess the spin configurations into vorticity and then use the results to train two kinds of artificial neural networks (ANN): a one-layer feed-forward network and a deep convolutional network. In both cases, the results scale with the system size towards the correct value of the critical temperature, but the one-layer network performed poorly for large systems. In the present approach, we do not use convolutional layers, but a deep feed-forward network composed of fully-connected layers (we found that the choice of particular meta-parameters is not crucial for the network performance). What is important is that, instead of using raw spin configurations in which each spin is represented by a number from 0 to $2\pi$ (which gives rather inaccurate results, as demonstrated in [17]), we represent the configurations as vectors of sines and cosines of the spin angles, which reflects the system's symmetry.
2. Models

The first model describes classical spins of unit length on a square lattice with a nearest-neighbor interaction given by the Hamiltonian

\[
H_{\text{c-XY}} = -J \sum_{\langle i,j \rangle} \cos\left(\theta_i - \theta_j\right), \qquad (2.1)
\]

where $J$ is the coupling between the spins and $\theta_i$ describes the direction of spin $i$. It is known that this model exhibits the BKT phase transition, and precise finite-size scaling gives the critical temperature $T_{\text{BKT}} \approx 0.8935$ in units of $J$ [22].
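For illustration, the following is a minimal sketch (our own, not the authors' code) of a single-site Metropolis update for the Hamiltonian (2.1) on an $L \times L$ lattice with periodic boundary conditions; all names are illustrative:

```python
import numpy as np

def local_energy(theta, i, j, J=1.0):
    """Bond energy of site (i, j) with its four neighbors, equation (2.1)."""
    L = theta.shape[0]
    nbrs = [((i + 1) % L, j), ((i - 1) % L, j),
            (i, (j + 1) % L), (i, (j - 1) % L)]
    return -J * sum(np.cos(theta[i, j] - theta[k, l]) for k, l in nbrs)

def metropolis_sweep(theta, T, rng, J=1.0):
    """One Metropolis sweep over the lattice at temperature T (units of J)."""
    L = theta.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L), rng.integers(L)
        old_angle = theta[i, j]
        e_old = local_energy(theta, i, j, J)
        theta[i, j] = rng.uniform(0.0, 2.0 * np.pi)  # propose a new direction
        dE = local_energy(theta, i, j, J) - e_old
        if dE > 0.0 and rng.random() >= np.exp(-dE / T):
            theta[i, j] = old_angle                   # reject the move

rng = np.random.default_rng(seed=1)
theta = rng.uniform(0.0, 2.0 * np.pi, size=(16, 16))  # 16 x 16, as in the paper
for sweep in range(2000):                             # equilibration sweeps
    metropolis_sweep(theta, T=0.5, rng=rng)
```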
In the next model, classical spins $\theta_i$ interact with fermions. The Hamiltonian of the PF model is given by

\[
H_{\text{PF}} = -t \sum_{\langle i,j \rangle, \sigma} \hat{c}^{\dagger}_{i\sigma} \hat{c}_{j\sigma} + g \sum_i \left( \mathrm{e}^{\mathrm{i}\theta_i} \hat{c}_{i\uparrow} \hat{c}_{i\downarrow} + \text{h.c.} \right), \qquad (2.2)
\]

where $\hat{c}^{\dagger}_{i\sigma}$ ($\hat{c}_{i\sigma}$) is the operator that creates (annihilates) a spin-$\sigma$ electron at lattice site $i$, and $g$ describes the strength of the interaction between the classical and quantum degrees of freedom. We set the hopping integral $t$ as the energy unit ($t = 1$). In this model, the fermions mediate an effective interaction between the classical spins $\theta_i$, which also leads to the BKT phase transition. The critical temperature is a function of $g$, and for $g = 2$ Monte Carlo (MC) simulations give $T_{\text{BKT}} \approx 0.12$ [21]. The PF model can be treated as an approximation of the boson-fermion model [23], valid when fluctuations of the number of bosons at a lattice site can be neglected.
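Because the classical phases enter equation (2.2) only through a quadratic fermionic form, each MC step for this model requires diagonalizing a Bogoliubov-de Gennes (BdG) matrix for the current configuration $\{\theta_i\}$ [24]. Below is a rough sketch of that step; the Nambu construction and the sign/gauge conventions are our own simplified assumptions, not a transcription of the authors' implementation:

```python
import numpy as np

def bdg_matrix(theta, t=1.0, g=2.0):
    """BdG matrix of equation (2.2) in the Nambu basis (c_up, c_down^dagger),
    on an L x L lattice with periodic boundary conditions."""
    L = theta.shape[0]
    N = L * L
    idx = lambda i, j: (i % L) * L + (j % L)
    K = np.zeros((N, N), dtype=complex)              # hopping block
    for i in range(L):
        for j in range(L):
            for k, l in ((i + 1, j), (i, j + 1)):    # count each bond once
                K[idx(i, j), idx(k, l)] = K[idx(k, l), idx(i, j)] = -t
    Delta = np.diag(g * np.exp(1j * theta.ravel()))  # local pairing field
    return np.block([[K, Delta], [Delta.conj().T, -K]])

def free_energy(theta, T, t=1.0, g=2.0):
    """Fermionic free energy F = -T * sum_n ln(1 + exp(-E_n / T))."""
    E = np.linalg.eigvalsh(bdg_matrix(theta, t, g))
    return -T * np.sum(np.logaddexp(0.0, -E / T))    # stable ln(1 + e^{-E/T})

# In the Metropolis weight, a proposed update theta -> theta' is accepted
# with probability min(1, exp(-(F(theta') - F(theta)) / T)).
```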
The Hamiltonian of the last model, the quantum XY model, is given by

\[
H_{\text{q-XY}} = \frac{E_c}{2} \sum_i \hat{n}_i^2 - J \sum_{\langle i,j \rangle} \cos\left(\hat{\theta}_i - \hat{\theta}_j\right), \qquad (2.3)
\]

where $\hat{n}_i$ is the number operator that is canonically conjugate to the quantum phase operator $\hat{\theta}_i$, and $E_c$ is the charging energy.

To train neural networks and to classify phases, one needs extensive sets of spin configurations generated at different temperatures. They were produced with the help of MC simulations. In the case of the classical XY model, we directly used the Metropolis algorithm. For the PF model, we also used the Metropolis algorithm, but in each MC step (i.e., for each generated spin configuration) we needed to diagonalize the fermionic Hamiltonian (2.2) [24]. For the quantum XY model, we used a quantum MC method in which the Hamiltonian was mapped onto a classical action of spins on an effective (2+1)D lattice. We then used a Wolff cluster algorithm to simulate these spins. In all cases, the simulations were performed on 16 × 16 systems.

In all these models, the helicity modulus $\Upsilon$, defined at finite temperature as the second derivative of the free energy with respect to an externally imposed global twist across the sample [25], has a universal jump at the BKT transition. In the thermodynamic limit, it drops at $T_{\text{BKT}}$ from $\frac{2}{\pi} T_{\text{BKT}}$ to zero. In finite systems, however, it evolves smoothly and converges only logarithmically to the thermodynamic limit, as shown in figures 1 (a), 1 (c), and 1 (e). A finite-size scaling of the temperature at which $\Upsilon(T)$ crosses $\frac{2}{\pi} T$ can give a rough estimate of $T_{\text{BKT}}$. This is demonstrated in figures 1 (b), 1 (d), and 1 (f).

Figure 1. (Colour online) Helicity modulus for the c-XY model (a), the PF model for $g = 4t$ (c), and the q-XY model for $E_c = 0.1J$ (e), for system sizes up to 64 × 64 (c-XY), 36 × 36 (PF), and 48 × 48 (q-XY). The corresponding critical temperatures extrapolated to the thermodynamic limit are presented in panels (b), (d), and (f); the extrapolated values are $T_{\text{BKT}} = 0.893$ (b), $0.089$ (d), and $0.891$ (f). Since $T^*(L) - T_{\text{BKT}} \propto \log(L)^{-2}$, where $T^*(L)$ is the temperature at which $\Upsilon$ drops most rapidly in an $L \times L$ system [22, 27, 28], the temperature is presented in these panels as a function of $\log(L)^{-2}$.

It is also possible to determine $T_{\text{BKT}}$ in a more precise way. At the critical temperature, the helicity modulus scales with the system size according to the Kosterlitz renormalization group (RG) equations [26],

\[
\Upsilon_L = \Upsilon_{L\to\infty} \left[ 1 + \frac{1}{2} \, \frac{1}{\ln L + C} \right], \qquad (2.4)
\]

where $L$ is the linear size of the system and $C$ is a constant. Therefore, if one fits $\Upsilon_L(T)$, calculated in MC simulations for different system sizes $L$, to the RG prediction, the fitting errors drop almost to zero exactly at $T_{\text{BKT}}$ [29]. The procedure is illustrated in figure 2.

Figure 2. (Colour online) The root-mean-square error $\delta$ of fitting the MC results for the c-XY model (a), the PF model (b), and the q-XY model (c) to the RG prediction given by equation (2.4). The sharp minimum of the fitting error indicates $T_{\text{BKT}}$. The model parameters are the same as in figure 1.
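A minimal sketch of this fitting procedure (our illustration; `upsilon` is a hypothetical array of helicity-modulus values taken from the MC runs):

```python
import numpy as np
from scipy.optimize import curve_fit

def rg_form(logL, ups_inf, C):
    """Equation (2.4): Upsilon_L = Upsilon_inf * [1 + 0.5 / (ln L + C)]."""
    return ups_inf * (1.0 + 0.5 / (logL + C))

def fit_error(sizes, ups_of_L):
    """Root-mean-square error delta of the RG fit at one temperature."""
    logL = np.log(np.asarray(sizes, dtype=float))
    popt, _ = curve_fit(rg_form, logL, ups_of_L, p0=(ups_of_L[-1], 1.0))
    return np.sqrt(np.mean((ups_of_L - rg_form(logL, *popt)) ** 2))

sizes = [16, 24, 32, 40, 48]            # linear sizes L entering the scaling
temps = np.linspace(0.80, 1.00, 21)     # temperatures to scan
# upsilon[iT, iL]: MC helicity modulus at temps[iT] for an L x L system
# delta = [fit_error(sizes, upsilon[iT]) for iT in range(len(temps))]
# T_BKT = temps[np.argmin(delta)]       # the sharp minimum marks T_BKT
```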
3. Artificial neural network

Here, however, we study how the critical temperature can be determined with the help of an ANN trained to identify the low- and high-temperature phases. When a system approaches the critical temperature, thermal fluctuations increase drastically, so the spin configurations can be very different from configurations generated at very low temperatures. However, as long as $T$ is below $T_{\text{BKT}}$, the system is still in the same low-temperature phase. The problem with systems in which the BKT transition takes place, such as the c-XY, PF or q-XY models, is that in the low-temperature phase MC simulations on finite clusters show a finite magnetization, which according to the Mermin-Wagner theorem cannot exist in the thermodynamic limit. Since the MC results are used to train the ANN, it is possible that the network will learn finite-size features of the spin configurations, such as the magnetization. It was proposed in [17] that this difficulty can be overcome by adding a convolutional layer designed to identify topological defects in spin configurations. The network then learns configurations of vortices and antivortices instead of raw spin configurations. Our aim here is to show how the features that can be used to identify phases depend on the distance from the critical temperature $T_{\text{BKT}}$.

The standard ANN-based approach to the problem of finding a phase transition is to train the ANN to recognize features of the low- and high-temperature phases. Within the scheme of supervised learning, one needs to feed the ANN with labelled spin configurations generated at low and high temperatures. The network is supposed to extract characteristic patterns and to learn how to classify configurations that were not used at the training stage. Usually, this is an easy task for configurations generated at very low or very high temperatures, since they are clearly distinctive. However, to determine the critical temperature precisely, the ANN must distinguish configurations generated slightly below $T_{\text{BKT}}$ from those generated slightly above $T_{\text{BKT}}$. Thermal fluctuations in this regime make those configurations very different both from the fully ordered low-temperature configurations and from the completely random high-temperature configurations. The question then is whether an ANN trained at extreme temperatures is capable of classifying the phases close to $T_{\text{BKT}}$. In other words, do the distinctive features learned by the ANN at the extremes balance out just right in the evaluation process, such that the correct $T_{\text{BKT}}$ is predicted?

To answer this question, we trained the ANN at different distances from $T_{\text{BKT}}$ and checked how the distance $|T - T_{\text{BKT}}|$ affects the accuracy of finding $T_{\text{BKT}}$. To be precise, we used MC simulations to generate sets of configurations $\{C_1\}, \{C_2\}, \ldots, \{C_N\}$ at temperatures $T_1 < T_2 < \ldots < T_N$ below and above $T_{\text{BKT}}$. For the c-XY, PF, and q-XY models, the temperature ranges were from $0.1J$ to $1.6J$, from $0.02t$ to $0.2t$, and from $0.1J$ to $1.5J$, respectively. Then, we generated sets of configurations representing different low-temperature ($\mathcal{L}_m$) and high-temperature ($\mathcal{H}_m$) ranges:

\[
\mathcal{L}_m = \bigcup_{i=1}^{m} C_i \quad \text{and} \quad \mathcal{H}_m = \bigcup_{i=N-m+1}^{N} C_i, \qquad (3.1)
\]

where $m < N/2$ is the number of temperatures in each range. We randomly removed some configurations from $\mathcal{L}_m$ and $\mathcal{H}_m$ to keep the cardinality of these sets fixed. In this way, our study of how the critical temperature predicted by the ANN depends on the range of training temperatures was not affected by different numbers of configurations for different ranges. In order to connect $m$ with the temperature range used in training, we introduce $\tau$ as a measure of the relative temperature range:

\[
\tau = \frac{T_{\text{high}} - T_{\text{low}}}{T_{\text{BKT}} - T_{\text{low}}}, \qquad (3.2)
\]

where $T_{\text{low}}$ and $T_{\text{high}}$ are the lowest and the highest temperatures below $T_{\text{BKT}}$ that were used to train the ANN.
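As an illustration, a sketch of the construction in equations (3.1) and (3.2); the code is ours, and `configs_by_T` is a hypothetical list of configuration batches ordered by temperature:

```python
import numpy as np

def build_training_sets(configs_by_T, m, n_keep, rng):
    """L_m / H_m of equation (3.1), thinned to n_keep configurations each
    so that different m are compared at equal cardinality."""
    low = np.concatenate(configs_by_T[:m])     # L_m: the m lowest temperatures
    high = np.concatenate(configs_by_T[-m:])   # H_m: the m highest temperatures
    low = low[rng.permutation(len(low))[:n_keep]]
    high = high[rng.permutation(len(high))[:n_keep]]
    X = np.concatenate([low, high])
    y = np.concatenate([np.zeros(n_keep), np.ones(n_keep)])  # 0: low-T, 1: high-T
    order = rng.permutation(len(X))            # mix and shuffle the two sets
    return X[order], y[order]

def tau(temps, m, T_bkt):
    """Equation (3.2); temps is sorted and temps[:m] lies below T_BKT."""
    return (temps[m - 1] - temps[0]) / (T_bkt - temps[0])
```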
Configurations from $\mathcal{L}_m$ and $\mathcal{H}_m$ were mixed together, shuffled, and then used to teach the ANN to identify the low- and high-temperature phases.

The spin configurations generated in MC are stored as numbers $\theta_i$ from 0 to $2\pi$. However, in order to take into account the character of classical two-dimensional spins, the configurations were rewritten as arrays composed of cosines and sines: $\cos\theta_1, \ldots, \cos\theta_N, \sin\theta_1, \ldots, \sin\theta_N$, where $N = L \times L$ is the number of lattice sites. This is equivalent to a representation by complex numbers and has the advantage that almost parallel spins are represented by close numbers, which is not the case for the original representation by the angles $\theta_i$.

Since the generation of extensive sets of spin configurations for different system sizes and at different temperatures is a time-consuming task, especially for the PF and q-XY models, we used a technique known from machine learning-based image recognition to increase the number of configurations without additional MC simulations. Namely, we transformed the original configurations according to the symmetries of the system. We used periodic boundary conditions, which allowed us to apply translations to create new configurations. Other transformations were reflections and rotations.
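Both the encoding and the symmetry-based augmentation are simple array operations; a sketch (ours) for square $L \times L$ angle arrays:

```python
import numpy as np

def encode(theta):
    """Rewrite an L x L angle array as the length-2N input vector
    [cos(theta_1), ..., cos(theta_N), sin(theta_1), ..., sin(theta_N)]."""
    flat = theta.ravel()
    return np.concatenate([np.cos(flat), np.sin(flat)])

def augment(theta, rng, n_translations=4):
    """New configurations from lattice symmetries (no extra MC needed):
    periodic translations, a reflection, and a 90-degree rotation."""
    variants = []
    for _ in range(n_translations):
        dx, dy = rng.integers(theta.shape[0], size=2)
        variants.append(np.roll(theta, shift=(dx, dy), axis=(0, 1)))
    variants.append(theta[::-1, :])      # reflection
    variants.append(np.rot90(theta))     # rotation
    return variants
```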
Figure 3 schematically shows how the ANN is used to classify spin configurations.

Figure 3. (Colour online) A scheme of the binary classification process in a feedforward ANN: a given spin configuration $\{\theta_i\}$, $i = 1, \ldots, N$, is rewritten as a length-$2N$ vector $[\cos\theta_1, \ldots, \cos\theta_N, \sin\theta_1, \ldots, \sin\theta_N]$ and presented to the input layer of the ANN (green circles). Activations then flow across fully connected hidden layers (red circles) to the output layer, composed of only one neuron (violet circle).

The ANN was implemented using the KERAS package with TENSORFLOW as the computational backend. We used a deep feedforward network with five fully connected hidden layers of 512, 192, 64, 16, and 16 neurons. Such a structure was a balance between the number of training epochs required for convergence and the time needed for one epoch. It turns out, however, that the metaparameters are not crucial for the ANN performance. As the activation function, a rectified linear unit (ReLU) was used in the hidden layers and a sigmoid function in the output layer. The network was trained to minimize the distance between the MC data and the model predictions defined by the binary cross entropy. The loss function is given by

\[
\mathcal{L} = -\sum_i \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right], \qquad (3.3)
\]

where $y_i$ are the labels and $p_i$ are the corresponding predictions. We found that with Tikhonov (L2) regularization [31], the network performs better in the classification of the low- and high-temperature phases.

Figure 4 shows an example of the learned weights used to classify phases of a 16 × 16 system. One can see that despite the rather large size of the network, most of the neurons are activated.

Figure 4. (Colour online) Visualization of the activations in a deep ANN composed of hidden layers of 512, 192, 64, 16, and 16 neurons. White corresponds to zero weight, blue to negative and red to positive weights. "Weights 1" connect the output of layer 1 to layer 2, so they are represented by a rectangle of 512 × 192 colour squares. Similarly, "weights 2" are represented by a 192 × 64 rectangle, etc. The network was initialized with all weights close to zero (we assumed a finite value $|w_i| < 10^{-3}$ to avoid the vanishing gradient problem [30]), so the blue and red parts indicate neurons that were activated during training.
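A minimal Keras sketch of such a network (our reconstruction from the description above; the regularization strength, optimizer, and training hyperparameters are illustrative assumptions, not values quoted in the paper):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_model(n_sites=16 * 16, l2=1e-4):
    """Binary classifier: 2*n_sites inputs (cosines and sines), dense ReLU
    hidden layers of 512/192/64/16/16, one sigmoid output neuron."""
    model = keras.Sequential([keras.Input(shape=(2 * n_sites,))])
    for width in (512, 192, 64, 16, 16):
        model.add(layers.Dense(width, activation="relu",
                               kernel_regularizer=regularizers.l2(l2)))
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam",             # optimizer: our assumption
                  loss="binary_crossentropy",   # equation (3.3)
                  metrics=["accuracy"])
    return model

model = build_model()
# model.fit(X_train, y_train, epochs=50, batch_size=64)  # hypothetical data
```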
4. Results

Figure 5 shows how the predictions for $T_{\text{BKT}}$ of the c-XY model calculated by the ANN depend on the way the network was trained. In each case, we trained the neural network using a 10-fold cross-validation technique [32] and repeated this procedure 10 times. As a result, we obtained 100 possible values of the probability $P$ that a given configuration belongs to the high-temperature phase (the probability that it belongs to the low-temperature phase is $1 - P$). Each time, the ANN was initialized with random weights and biases, and a larger spread of the predictions indicates a greater difficulty in an unambiguous classification of the phase. The same method of multiple trainings starting from different random weights and biases was used in [13] to determine the standard error of the predicted critical temperature.

Figure 5. (Colour online) The probability, calculated by the ANN, that a given configuration belongs to the high-temperature phase of the c-XY model. The network was trained on labelled data generated according to equation (3.1) for m = 1 (a), m = 10 (b), and m = 16 (c). The network then determined the probabilities 100 times for configurations generated at different temperatures, each time starting from different weights and biases. The vertical error bars show the standard deviation. The solid red line is the best fit $P(T) = 0.5 \tanh[\alpha (T - T_{\text{BKT}})] + 0.5$, where $\alpha$ and $T_{\text{BKT}}$ are fitting parameters. The dashed red line shows $1 - P(T)$, the probability of being classified as the low-temperature phase. The black arrows indicate the temperatures (a) or the ranges of temperatures (b), (c) used to train the network. The black vertical line indicates $T_{\text{BKT}}$ determined from the RG equations. The dashed green line in panel (c) shows the magnetization. Comparing panels (a)–(c), one can see that for the c-XY model the average critical temperature does not change significantly with an increasing range of training temperatures, but the spread of the results shrinks. The temperatures at which the ANN was trained and tested were (in units of $J$): from 0.10 to 0.70 and from 1.10 to 1.60 with stepsize 0.05, and from 0.750 to 1.050 with stepsize 0.025. Panel (d) shows the estimated $T_{\text{BKT}}$ as a function of $\tau$. The horizontal red line shows the critical temperature determined from fitting the MC results to the RG equation (2.4). The right-hand vertical axis shows $(T_{\text{BKT}} - T^0_{\text{BKT}})/T^0_{\text{BKT}} \times 100\%$, where $T_{\text{BKT}}$ is the average ANN prediction and $T^0_{\text{BKT}}$ is the actual critical temperature.

One can see in figure 5 that even if the ANN was trained only at the extreme temperatures ($T = 0.1$ and $T = 1.6$, i.e., $m = 1$), corresponding to fully ordered and completely random configurations, the average predicted $T_{\text{BKT}}$ is not far from the actual value. This could be an accidental coincidence, because with increasing $m$ the deviation slightly increases, but it always remains below 10%, as can be seen in figure 5 (d). For $m = 16$, the deviation is smaller than the line width. The problem, however, is that in this case the uncertainty is large. So, for a precise determination of $T_{\text{BKT}}$, averaging over a large number of configurations generated at different temperatures is necessary. For example, for $m = 1$, the average prediction for 1000 statistically independent configurations is less than 1% off the actual $T_{\text{BKT}}$, but individual predictions are spread within a range of ±18% around this value. This means that if one saves the computational time required to generate configurations for training, more configurations should be generated for the evaluation stage.

With an increasing width of the temperature range used for training, the spread of the calculated probabilities decreases significantly. Figure 5 (d) shows how the average $T_{\text{BKT}}$ depends on the number of different temperatures used in training. On the right-hand vertical axis, showing the relative error, one can see that the error is always below 10%, and with increasing $m$ it converges to the actual value of $T_{\text{BKT}}$ for the c-XY model.
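Reading off $T_{\text{BKT}}$ from the network output amounts to a two-parameter fit of the tanh form quoted in the caption of figure 5; a sketch (ours; `temps` and `P_mean` stand for the tested temperatures and the averaged ANN predictions):

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid_P(T, alpha, T_bkt):
    """Fit function from figure 5: P(T) = 0.5 tanh[alpha (T - T_bkt)] + 0.5."""
    return 0.5 * np.tanh(alpha * (T - T_bkt)) + 0.5

# temps: tested temperatures; P_mean: mean of the 100 ANN outputs at each T
# popt, pcov = curve_fit(sigmoid_P, temps, P_mean, p0=(10.0, 0.9))
# alpha, T_bkt = popt
# T_bkt_err = np.sqrt(pcov[1, 1])   # standard error of the fitted T_BKT
```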
In the case of the PF model, the results are different. As can be seen in figure 6, the spread of the calculated probabilities is less dependent on $m$, but the average critical temperature depends on $m$ strongly. This means that for the PF model, an increase in the number of configurations used at the stage of phase classification does not guarantee a more precise estimation of $T_{\text{BKT}}$. Instead, for this model, a sufficiently wide range of temperatures at which the training configurations are generated is necessary.

Figure 6. (Colour online) The same as in figure 5, but for the PF model. In this case, the critical temperature is affected by the width of the range of training temperatures, and a relatively wide range is necessary to obtain a precise $T_{\text{BKT}}$ [c.f. figure 6 (b)]. The temperatures at which the ANN was trained and tested were (in units of $t$): from 0.02 to 0.20 with stepsize 0.01.

From the physical point of view, this means that in the PF model, extremely low-temperature and extremely high-temperature configurations differ more from the configurations close to the critical point than in the case of the c-XY model. It can be seen in figure 6 (d) that for the PF model, the relative error for $m = 1$ extends to more than 20%.

The distribution of probabilities calculated for the q-XY model is similar to that for its classical counterpart. It is presented in figure 7. Though for $m = 1$ the estimated critical temperature differs from its real value by 14% [see figure 7 (d)], the difference decreases very quickly with increasing $m$, and already for $m > 3$ the relative error is around 2%.

Figure 7. (Colour online) The same as in figure 5, but for the q-XY model. Similarly to the c-XY model, the critical temperature is rather insensitive to the width of the range of training temperatures, but the spread of the results decreases with increasing $m$. The temperatures at which the ANN was trained and tested were (in units of $J$): from 0.1 to 0.8 and from 1.2 to 1.5 with stepsize 0.1, and from 0.85 to 1.15 with stepsize 0.05.

The results show that in the case of the PF model, a much richer set of configurations is required to properly train the ANN than for the XY models. The reason can be connected with the different character of this model. In both the classical and quantum XY models, the interaction range is limited to nearest neighbors. On the other hand, fermions in the PF model mediate effective interactions between arbitrarily spaced lattice sites. This effect is seen in figure 1: at low temperature, the helicity modulus in the c-XY and q-XY models converges even for very small systems. This is not the case for the PF model, where even at very low temperatures (i.e., in an almost fully ordered state) the energy per lattice site depends on the system size. This results from the delocalization of fermions, which in the ordered state are similar to quantum particles in an infinite quantum well, with their energies strongly dependent on the size of the well.

5. Summary

We have demonstrated how the accuracy of finding the BKT transition in three different models depends on the range of temperatures at which the ANN was trained. We used a simple feedforward network with densely connected hidden layers. We did not perform any feature engineering of the spin configurations generated in MC simulations, and we did not use convolutional layers. Therefore, the phase classification was based on raw spin configurations, not on explicitly extracted vortices as in [17]. Nevertheless, in figure 5 (c) we compare the calculated probabilities with the magnetization that results from the finite size of the system. One can see there that the section of $P(T)$ that indicates the BKT transition is much steeper than the temperature dependence of the magnetization, even if the network was trained at the extreme temperatures [figure 5 (a)]. Therefore, we believe that the ANN learns not only the magnetization (which would vanish in the thermodynamic limit), but also some topological features connected with the BKT transition. One also cannot exclude that the ANN is capable of learning the character of the spin-spin correlations, which change their behavior at the BKT transition. To confirm this, however, at least a finite-size analysis of the ML results would be necessary, which has not been performed here.

Our aim here, however, was different: we wanted to demonstrate how the critical temperature determined by the ANN depends on the composition of the training set. As one can expect, the larger the variety of the configurations representing the low- and high-temperature phases, the better the accuracy of the critical temperature. We found, however, that for the c-XY and q-XY models the average $T_{\text{BKT}}$ was close to the actual one even if the ANN was trained relatively far from the critical point. Increasing the range of temperatures at which the network was trained only slightly improves the numerical accuracy (i.e., the difference between the average $T_{\text{BKT}}$ determined by the ANN and the value found from the RG equations), but it significantly reduces the uncertainty. The situation is different for the PF model, where the numerical accuracy depends strongly on the temperature range used at the training stage. We attribute this behavior to the long-range effective interaction present in the PF model, which can lead to a longer range of the spin-spin correlations and to their different temperature dependence.

Despite the difference between the results for the XY models and the PF model, in all cases it is important to train the ANN not only at very low and very high temperatures, but also as close as possible to the critical temperature. The main problem with training at temperatures close to $T_{\text{BKT}}$ is that for supervised learning the configurations must be labelled, so one should know at least an approximate value of the critical temperature. One way to overcome this difficulty is to use the learning-by-confusion approach [14], based on a combination of supervised and unsupervised techniques.

Acknowledgements

M.M.M. acknowledges support by NCN (Poland) under grant 2016/23/B/ST3/00647. H.K. and N.T. acknowledge funding from grant No. NSF DMR 1629382.
References

1. Berezinskii V.L., Sov. Phys. JETP, 1971, 32, 493.
2. Berezinskii V.L., Sov. Phys. JETP, 1972, 34, 610.
3. Kosterlitz J.M., Thouless D.J., J. Phys. C: Solid State Phys., 1972, 5, L124, doi:10.1088/0022-3719/5/11/002.
4. Kosterlitz J.M., Thouless D.J., J. Phys. C: Solid State Phys., 1973, 6, 1181, doi:10.1088/0022-3719/6/7/010.
5. Carrasquilla J., Melko R.G., Nat. Phys., 2017, 13, 431, doi:10.1038/nphys4035.
6. Morningstar A., Melko R., J. Mach. Learn. Res., 2018, 18, 163.
7. Ponte P., Melko R., Phys. Rev. B, 2017, 96, 205146, doi:10.1103/PhysRevB.96.205146.
8. Zhang Y., Kim E.-A., Phys. Rev. Lett., 2017, 118, 216401, doi:10.1103/PhysRevLett.118.216401.
9. Wang L., Phys. Rev. B, 2016, 94, 195105, doi:10.1103/PhysRevB.94.195105.
10. Hu W., Singh R.R.P., Scalettar R.T., Phys. Rev. E, 2017, 95, 062122, doi:10.1103/PhysRevE.95.062122.
11. Wetzel S.J., Phys. Rev. E, 2017, 96, 022140, doi:10.1103/PhysRevE.96.022140.
12. Broecker P., Carrasquilla J., Melko R.G., Trebst S., Sci. Rep., 2017, 7, 8823, doi:10.1038/s41598-017-09098-0.
13. Ch'ng K., Carrasquilla J., Melko R.G., Khatami E., Phys. Rev. X, 2017, 7, 031038, doi:10.1103/PhysRevX.7.031038.
14. Van Nieuwenburg E.P.L., Liu Y.-H., Huber S.D., Nat. Phys., 2017, 13, 435, doi:10.1038/nphys4037.
15. Torlai G., Mazzola G., Carrasquilla J., Troyer M., Melko R., Carleo G., Nat. Phys., 2018, 14, 447, doi:10.1038/s41567-018-0048-5.
16. Carleo G., Troyer M., Science, 2017, 355, 602, doi:10.1126/science.aag2302.
17. Beach M.J.S., Golubeva A., Melko R.G., Phys. Rev. B, 2018, 97, 045207, doi:10.1103/PhysRevB.97.045207.
18. Deng D.-L., Li X., Sarma S.D., Phys. Rev. B, 2017, 96, 195145, doi:10.1103/PhysRevB.96.195145.
19. Zhang W., Liu J., Wei T.-C., Preprint arXiv:1804.02709, 2018.
20. Rodriguez-Nieva J.F., Scheurer M.S., Preprint arXiv:1805.05961, 2018.
21. Maśka M.M., Trivedi N., Preprint arXiv:1706.04197, 2017.
22. Hsieh Y.-D., Kao Y.-J., Sandvik A.W., J. Stat. Mech.: Theory Exp., 2013, 2013, P09001, doi:10.1088/1742-5468/2013/09/P09001.
23. Micnas R., Ranninger J., Robaszkiewicz S., Rev. Mod. Phys., 1990, 62, 113, doi:10.1103/RevModPhys.62.113.
24. Maśka M.M., Czajka K., Phys. Rev. B, 2006, 74, 035109, doi:10.1103/PhysRevB.74.035109.
25. Fisher M.E., Barber M.N., Jasnow D., Phys. Rev. A, 1973, 8, 1111, doi:10.1103/PhysRevA.8.1111.
26. Kosterlitz J.M., J. Phys. C: Solid State Phys., 1974, 7, 1046, doi:10.1088/0022-3719/7/6/005.
27. Schultka N., Manousakis E., Phys. Rev. B, 1994, 49, 12071, doi:10.1103/PhysRevB.49.12071.
28. Tomita Y., Okabe Y., Phys. Rev. B, 2002, 65, 184405, doi:10.1103/PhysRevB.65.184405.
29. Weber H., Minnhagen P., Phys. Rev. B, 1988, 37, 5986(R), doi:10.1103/PhysRevB.37.5986.
30. Sussillo D., Abbott L.F., Preprint arXiv:1412.6558v3, 2015.
31. Ng A.Y., In: Proceedings of the Twenty-First International Conference on Machine Learning (Banff, Canada, 2004), ACM, New York, 2004, 78, doi:10.1145/1015330.1015435.
32. Gunasegaran T., Cheah Y.-N., In: Proceedings of the 8th International Conference on Information Technology (Amman, Jordan, 2017), IEEE, 2017, 89–95, doi:10.1109/ICITECH.2017.8079960.