Skip to main content

Thank you for visiting You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

Variational neural annealing


Many important challenges in science and technology can be cast as optimization problems. When viewed in a statistical physics framework, these can be tackled by simulated annealing, where a gradual cooling procedure helps search for ground-state solutions of a target Hamiltonian. Although powerful, simulated annealing is known to have prohibitively slow sampling dynamics when the optimization landscape is rough or glassy. Here we show that, by generalizing the target distribution with a parameterized model, an analogous annealing framework based on the variational principle can be used to search for ground-state solutions. Modern autoregressive models such as recurrent neural networks provide ideal parameterizations because they can be sampled exactly without slow dynamics, even when the model encodes a rough landscape. We implement this procedure in the classical and quantum settings on several prototypical spin glass Hamiltonians and find that, on average, it substantially outperforms traditional simulated annealing in the asymptotic limit, illustrating the potential power of this yet unexplored route to optimization.

This is a preview of subscription content

Access options

Buy article

Get time limited or full article access on ReadCube.


All prices are NET prices.

Fig. 1: Schematic of the space of probability distributions visited during SA.
Fig. 2: Variational neural annealing protocols.
Fig. 3: Methods description and benchmarks for the disordered Ising chain.
Fig. 4: Benchmarking the 2D Edwards–Anderson spin glass.
Fig. 5: An illustration of a dilated RNN used for fully connected spin glasses.
Fig. 6: Benchmarking SA, SQA (P = 100 Trotter slices) and VCA on the SK model and the WPE.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

The SA code and the SQA code are publicly available at Our variational neural annealing implementation with RNNs is publicly available at The hyperparameters we use are provided in Supplementary Appendix D.


  1. Lucas, A. Ising formulations of many NP problems. Front. Phys. 2, 5 (2014).

    Article  Google Scholar 

  2. Barahona, F. On the computational complexity of Ising spin glass models. J. Phys. A Math. Gen. 15, 3241–3253 (1982).

    MathSciNet  Article  Google Scholar 

  3. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).

    MathSciNet  Article  Google Scholar 

  4. Koulamas, C., Antony, S. & Jaen, R. A survey of simulated annealing applications to operations research problems. Omega 22, 41–56 (1994).

    Article  Google Scholar 

  5. Hajek, B. A tutorial survey of theory and applications of simulated annealing. In 1985 24th IEEE Conference on Decision and Control 755–760 (IEEE, 1985).

  6. Svergun, D. Restoring low resolution structure of biological macromolecules from solution scattering using simulated annealing. Biophys. J. 76, 2879–2886 (1999).

    Article  Google Scholar 

  7. Johnson, D. S., Aragon, C. R., McGeoch, L. A. & Schevon, C. Optimization by simulated annealing: an experimental evaluation; Part II, graph coloring and number partitioning. Oper. Res. 39, 378–406 (1991).

    Article  Google Scholar 

  8. Abido, M. A. Robust design of multimachine power system stabilizers using simulated annealing. IEEE Trans. Energy Convers. 15, 297–304 (2000).

    Article  Google Scholar 

  9. Karzig, T., Rahmani, A., von Oppen, F. & Refael, G. Optimal control of Majorana zero modes. Phys. Rev. B 91, 201404 (2015).

    Article  Google Scholar 

  10. Gielen, G., Walscharts, H. & Sansen, W. Analog circuit design optimization based on symbolic simulation and simulated annealing. In Proc. 15th European Solid-State Circuits Conference (ESSCIRC ’89) 252–255 (1989).

  11. Santoro, G. E., Martoňák, R., Tosatti, E. & Car, R. Theory of quantum annealing of an Ising spin glass. Science 295, 2427–2430 (2002).

    Article  Google Scholar 

  12. Brooke, J., Bitko, D., Rosenbaum, T. F. & Aeppli, G. Quantum annealing of a disordered magnet. Science 284, 779–781 (1999).

    Article  Google Scholar 

  13. Mitra, D., Romeo, F. & Sangiovanni-Vincentelli, A. Convergence and finite-time behavior of simulated annealing. Adv. Appl. Probab. 18, 747–771 (1986).

    MathSciNet  Article  Google Scholar 

  14. Delahaye, D., Chaimatanan, S. & Mongeau, M. in Handbook of Heuristics (eds Gendreau, M. & Potvin, J. Y.) 1–35 (Springer, 2019);

  15. Sutskever, I., Martens, J. & Hinton, G. Generating text with recurrent neural networks. In Proc. 28th International Conference on International Conference on Machine Learning (ICML ’11) 1017–1024 (Omnipress, 2011).

  16. Larochelle, H. & Murray, I. The neural autoregressive distribution estimator. In Proc. Fourteenth International Conference on Artificial Intelligence and Statistics, Proceedings of Machine Learning Research Vol. 15 (eds Gordon, G. et al.) 29–37 (JMLR, 2011);

  17. Vaswani, A. et al. Attention is all you need. Preprint at (2017).

  18. Wu, D., Wang, L. & Zhang, P. Solving statistical mechanics using variational autoregressive networks. Phys. Rev. Lett. 122, 080602 (2019).

    Article  Google Scholar 

  19. Sharir, O., Levine, Y., Wies, N., Carleo, G. & Shashua, A. Deep autoregressive models for the efficient variational simulation of many-body quantum systems. Phys. Rev. Lett. 124, 020503 (2020).

    Article  Google Scholar 

  20. Hibat-Allah, M., Ganahl, M., Hayward, L. E., Melko, R. G. & Carrasquilla, J. Recurrent neural network wave functions. Phys. Rev. Res 2, 023358 (2020).

    Article  Google Scholar 

  21. Roth, C. Iterative retraining of quantum spin models using recurrent neural networks. Preprint at (2020).

  22. Feynman, R. Statistical Mechanics: a Set of Lectures (Avalon, 1998).

  23. Long, P. M. & Servedio, R. A. Restricted Boltzmann machines are hard to approximately evaluate or simulate. In Proc. 27th International Conference on Machine Learning (ICML ’10) 703–710 (Omnipress, 2010).

  24. Boixo, S. et al. Evidence for quantum annealing with more than one hundred qubits. Nat. Phys. 10, 218–224 (2014).

    Article  Google Scholar 

  25. Kadowaki, T. & Nishimori, H. Quantum annealing in the transverse Ising model. Phys. Rev. E 58, 5355–5363 (1998).

    Article  Google Scholar 

  26. Born, M. & Fock, V. Beweis des Adiabatensatzes. Z. Phys. 51, 165–180 (1928).

    Article  Google Scholar 

  27. Mbeng, G. B., Privitera, L., Arceci, L. & Santoro, G. E. Dynamics of simulated quantum annealing in random Ising chains. Phys. Rev. B 99, 064201 (2019).

    Article  Google Scholar 

  28. Zanca, T. & Santoro, G. E. Quantum annealing speedup over simulated annealing on random Ising chains. Phys. Rev. B 93, 224431 (2016).

    Article  Google Scholar 

  29. Spin Glass Server (Univ. Cologne);

  30. Dickson, N. G. et al. Thermally assisted quantum annealing of a 16-qubit problem. Nat. Commun. 4, 1903 (2013).

    MathSciNet  Article  Google Scholar 

  31. Gomes, J., McKiernan, K. A., Eastman, P. & Pande, V. S. Classical quantum optimization with neural network quantum states. Preprint at (2019).

  32. Sinchenko, S. & Bazhanov, D. The deep learning and statistical physics applications to the problems of combinatorial optimization. Preprint at (2019).

  33. Zhao, T., Carleo, G., Stokes, J. & Veerapaneni, S. Natural evolution strategies and quantum approximate optimization. Preprint at (2020).

  34. Martoňák, R., Santoro, G. E. & Tosatti, E. Quantum annealing by the path-integral Monte Carlo method: the two-dimensional random Ising model. Phys. Rev. B 66, 094203 (2002).

    Article  Google Scholar 

  35. Mezard, M., Parisi, G. & Virasoro, M. Spin Glass Theory and Beyond (World Scientific, 1986);

  36. Sherrington, D. & Kirkpatrick, S. Solvable model of a spin-glass. Phys. Rev. Lett. 35, 1792–1796 (1975).

    Article  Google Scholar 

  37. Hamze, F., Raymond, J., Pattison, C. A., Biswas, K. & Katzgraber, H. G. Wishart planted ensemble: a tunably rugged pairwise Ising model with a first-order phase transition. Phys. Rev. E 101, 052102 (2020).

    MathSciNet  Article  Google Scholar 

  38. Mills, K., Ronagh, P. & Tamblyn, I. Controlled online optimization learning (COOL): finding the ground state of spin Hamiltonians with reinforcement learning. Preprint at (2020).

  39. Bengio, Y., Lodi, A. & Prouvost, A. Machine learning for combinatorial optimization: a methodological tour d'horizon. Eur. J. Oper. Res. 290, 405–421 (2020).

    MathSciNet  Article  Google Scholar 

  40. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016);

  41. Kelley, R. Sequence modeling with recurrent tensor networks (2016);

  42. Chang, S. et al. Dilated recurrent neural networks. Preprint at (2017).

  43. Bengio, Y., Simard, P. & Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5, 157–166 (1994).

    Article  Google Scholar 

  44. Hihi, S. E. & Bengio, Y. Hierarchical recurrent neural networks for long-term dependencies. In Advances in Neural Information Processing Systems 8 (eds Touretzky, D. S. et al.) 493–499 (MIT Press, 1996);

  45. Vidal, G. Class of quantum many-body states that can be efficiently simulated. Phys. Rev. Lett. 101, 110501 (2008).

    Article  Google Scholar 

  46. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at (2014).

  47. Bravyi, S., Divincenzo, D. P., Oliveira, R. & Terhal, B. M. The complexity of stoquastic local Hamiltonian problems. Quantum Inf. Comput. 8, 361–385 (2008).

    MathSciNet  MATH  Google Scholar 

  48. Ozfidan, I. et al. Demonstration of a nonstoquastic Hamiltonian in coupled superconducting flux qubits. Phys. Rev. Appl. 13, 034037 (2020).

    Article  Google Scholar 

  49. Mohamed, S., Rosca, M., Figurnov, M. & Mnih, A. Monte Carlo gradient estimation in machine learning. Preprint at (2019).

  50. Zhang, S.-X., Wan, Z.-Q. & Yao, H. Automatic differentiable Monte Carlo: theory and application. Preprint at (2019).

  51. Norris, N. The standard errors of the geometric and harmonic means and their application to index numbers. Ann. Math. Stat. 11, 445–448 (1940).

    MathSciNet  Article  Google Scholar 

  52. Simulated Classical and Quantum Annealing (GitHub, 2021);

  53. Variational Neural Annealing (GitHub, 2021);

Download references


We acknowledge J. Raymond for suggesting to use the Wishart planted ensemble as a benchmark for our variational annealing set-up and for a careful reading of the manuscript. We also thank C. Roth, C. Zhou, M. Ganahl, S. Pilati and G. Santoro for fruitful discussions. We are also grateful to L. Hayward for providing her plotting code to produce our figures using the Matplotlib library. Our RNN implementation is based on Tensorflow and NumPy. We acknowledge support from the Natural Sciences and Engineering Research Council (NSERC), a Canada Research Chair, the Shared Hierarchical Academic Research Computing Network (SHARCNET), Compute Canada, Google Quantum Research Award and the Canadian Institute for Advanced Research (CIFAR) AI chair programme. Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute ( Research at Perimeter Institute is supported in part by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Economic Development, Job Creation and Trade.

Author information




M.H., E.M.I. and J.C. conceived and designed the research. M.H., E.M.I. and R.W. performed the numerical experiments. All authors contributed to the analysis of the results and writing of the manuscript.

Corresponding author

Correspondence to Mohamed Hibat-Allah.

Ethics declarations

Competing interests

Given the broad applicability of our strategies, we disclose that we have filed a United States provisional patent application protecting our discoveries (patent application no. 63/123,917).

Additional information

Peer review informationNature Machine Intelligence thanks Titus Neupert and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary discussion. Figs. 1–4 and Tables 1 and 2.

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Hibat-Allah, M., Inack, E.M., Wiersema, R. et al. Variational neural annealing. Nat Mach Intell 3, 952–961 (2021).

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI:

Further reading


Quick links

Nature Briefing

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

Get the most important science stories of the day, free in your inbox. Sign up for Nature Briefing