Parallel convolutional processing using an integrated photonic tensor core

A Publisher Correction to this article was published on 23 February 2021

This article has been updated

Abstract

With the proliferation of ultrahigh-speed mobile networks and internet-connected devices, along with the rise of artificial intelligence (AI) [1], the world is generating exponentially increasing amounts of data that need to be processed in a fast and efficient way. Highly parallelized, fast and scalable hardware is therefore becoming progressively more important [2]. Here we demonstrate a computationally specific integrated photonic hardware accelerator (tensor core) that is capable of operating at speeds of trillions of multiply-accumulate operations per second (10^12 MAC operations per second or tera-MACs per second). The tensor core can be considered as the optical analogue of an application-specific integrated circuit (ASIC). It achieves parallelized photonic in-memory computing using phase-change-material memory arrays and photonic chip-based optical frequency combs (soliton microcombs [3]). The computation is reduced to measuring the optical transmission of reconfigurable and non-resonant passive components and can operate at a bandwidth exceeding 14 gigahertz, limited only by the speed of the modulators and photodetectors. Given recent advances in hybrid integration of soliton microcombs at microwave line rates [3,4,5], ultralow-loss silicon nitride waveguides [6,7], and high-speed on-chip detectors and modulators, our approach provides a path towards full complementary metal–oxide–semiconductor (CMOS) wafer-scale integration of the photonic tensor core. Although we focus on convolutional processing, more generally our results indicate the potential of integrated photonics for parallel, fast, and efficient computational hardware in data-heavy AI applications such as autonomous driving, live video processing, and next-generation cloud computing services.
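
As a rough back-of-envelope check on the two figures quoted in the abstract, the sketch below estimates how many multiply-accumulate operations would have to run in parallel per modulation period to reach 10^12 MAC operations per second. It assumes, purely for illustration, that one input vector is applied per period and that the symbol rate equals the 14 GHz bandwidth; the function name and this rate-equals-bandwidth simplification are assumptions and are not taken from the article.

```python
import math


def parallel_macs_needed(target_macs_per_s: float, symbol_rate_hz: float) -> int:
    """Smallest number of MACs per modulation period that meets the target rate."""
    if target_macs_per_s <= 0 or symbol_rate_hz <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(target_macs_per_s / symbol_rate_hz)


TERA_MAC = 1e12      # 10^12 MAC operations per second, as quoted in the abstract
BANDWIDTH_HZ = 14e9  # 14 GHz, treated here as the symbol rate (assumption)

needed = parallel_macs_needed(TERA_MAC, BANDWIDTH_HZ)
print(needed)
```

Under these assumptions, a weight matrix with at least that many entries, evaluated once per modulation period, would sit in the tera-MAC regime; the actual matrix sizes and rates used in the experiments are given in the full article, not here.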

Fig. 1: Photonic in-memory computing using a photonic-chip-based microcomb and PCMs.
Fig. 2: Concept of photonic tensor cores for convolution operations.
Fig. 3: Convolution using sequential MVM operations.
Fig. 4: Convolution using parallel MVM operations.
Fig. 5: Digit recognition with a CNN and scalability.
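
The figure titles above describe convolution carried out as matrix-vector multiplication (MVM) operations. A minimal software sketch of that general idea follows, assuming stride 1, no padding and the usual CNN convention of not flipping the kernel. The helper names (`im2col`, `mvm`, `convolve_via_mvm`) and the toy data are illustrative assumptions; the code does not model the photonic hardware itself.

```python
from typing import List

Matrix = List[List[float]]


def im2col(image: Matrix, k: int) -> Matrix:
    """Flatten every k x k patch (stride 1, no padding) into one input vector."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(k) for j in range(k)]
        for r in range(rows - k + 1)
        for c in range(cols - k + 1)
    ]


def mvm(weights: Matrix, vector: List[float]) -> List[float]:
    """One matrix-vector multiplication: each output is a sum of products (MACs)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def convolve_via_mvm(image: Matrix, kernels: List[Matrix]) -> List[Matrix]:
    """Apply several square kernels of equal size using one MVM per image patch."""
    k = len(kernels[0])
    # Each flattened kernel becomes one row of the weight matrix.
    weights = [[w for row in kernel for w in row] for kernel in kernels]
    outputs = [mvm(weights, patch) for patch in im2col(image, k)]
    out_w = len(image[0]) - k + 1
    maps = []
    for n in range(len(kernels)):
        flat = [out[n] for out in outputs]
        # Regroup the per-patch results into a 2D feature map for kernel n.
        maps.append([flat[i:i + out_w] for i in range(0, len(flat), out_w)])
    return maps


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernels = [[[1, 0], [0, 0]], [[1, 1], [1, 1]]]
feature_maps = convolve_via_mvm(image, kernels)
print(feature_maps)
```

In this sketch the patches are processed one after another; because each patch is an independent MVM against the same weight matrix, several patches could in principle be evaluated at the same time.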

Data availability

All data used in this study are available from the corresponding author upon reasonable request.

References

  1.

    Batra, G., Jacobson, Z., Madhav, S., Queirolo, A. & Santhanam, N. Artificial-intelligence hardware: new opportunities for semiconductor companies. https://www.mckinsey.com/industries/semiconductors/our-insights/artificial-intelligence-hardware-new-opportunities-for-semiconductor-companies (McKinsey & Company, 2019).

  2.

    Ben-Nun, T. & Hoefler, T. Demystifying parallel and distributed deep learning: an in-depth concurrency analysis. ACM Comput. Surv. 52, https://doi.org/10.1145/3320060 (2019).

  3.

    Herr, T. et al. Temporal solitons in optical microresonators. Nat. Photon. 8, 145–152 (2014).

  4.

    Herr, T., Gorodetsky, M. L. & Kippenberg, T. J. Dissipative Kerr solitons in optical microresonators. In Nonlinear Optical Cavity Dynamics From Microresonators to Fiber Lasers (ed. Grelu, P.) Vol. 8083, Ch. 6, 129–162 (Wiley, 2015).

  5.

    Raja, A. S. et al. Electrically pumped photonic integrated soliton microcomb. Nat. Commun. 10, 680 (2019).

  6.

    Pfeiffer, M. H. P. et al. Photonic damascene process for integrated high-Q microresonator based nonlinear photonics. Optica 3, 20–25 (2016).

  7.

    Liu, J. et al. Ultralow-power chip-based soliton microcombs for photonic integration. Optica 5, 1347–1353 (2018).

  8.

    Machine Learning on AWS https://aws.amazon.com/machine-learning/ (accessed 12 October 2020).

  9.

    Google Cloud AI And Machine Learning Products https://cloud.google.com/products/machine-learning/ (accessed 12 October 2020).

  10.

    Zhang, C. et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks. In ACM/SIGDA Int. Symp. Field-Programmable Gate Arrays (FPGA ’15) https://doi.org/10.1145/2684746.2689060 (2015).

  11.

    Jouppi, N. P. et al. In-datacenter performance analysis of a tensor processing unit. Proc. ISCA ’17 https://doi.org/10.1145/3079856.3080246 (2017).

  12.

    Wang, P. S., Liu, Y., Guo, Y. X., Sun, C. Y. & Tong, X. O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans. Graph. 36, https://doi.org/10.1145/3072959.3073608 (2017).

  13.

    Miller, D. A. B. Attojoule optoelectronics for low-energy information processing and communications. J. Lightwave Technol. 35, 346–396 (2017).

  14.

    Agrawal, S. R. et al. A many-core architecture for in-memory data processing. In Proc. 50th Annu. IEEE/ACM Int. Symp. Microarchitecture (MICRO-50 ’17) 245–258, https://doi.org/10.1145/3123939.3123985 (IEEE/ACM, 2017).

  15.

    Miller, D. A. B. Are optical transistors the logical next step? Nat. Photon. 4, 3–5 (2010).

  16.

    Ielmini, D. & Wong, H. S. P. In-memory computing with resistive switching devices. Nat. Electron. 1, 333–343 (2018).

  17.

    Le Gallo, M. et al. Mixed-precision in-memory computing. Nat. Electron. 1, 246–253 (2018).

  18.

    Boybat, I. et al. Neuromorphic computing with multi-memristive synapses. Nat. Commun. 9, 2514 (2018).

  19.

    Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. & Eleftheriou, E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529–544 (2020).

  20.

    Hu, M. et al. Dot-product engine for neuromorphic computing: programming 1T1M crossbar to accelerate matrix-vector multiplication. In Proc. 53rd Annu. Design Automation Conf. (DAC ’16) https://doi.org/10.1145/2897937.2898010 (ACM Digital Library, 2016).

  21.

    Gong, N. et al. Signal and noise extraction from analog memory elements for neuromorphic computing. Nat. Commun. 9, 2102 (2018).

  22.

    Joshi, V. et al. Accurate deep neural network inference using computational phase-change memory. Nat. Commun. 11, 2473 (2020).

  23.

    Yang, T. Y., Park, I. M., Kim, B. J. & Joo, Y. C. Atomic migration in molten and crystalline Ge2Sb2Te5 under high electric field. Appl. Phys. Lett. 95, 032104 (2009).

  24.

    Koelmans, W. W. et al. Projected phase-change memory devices. Nat. Commun. 6, 8181 (2015).

  25.

    Kim, S. et al. A phase change memory cell with metallic surfactant layer as a resistance drift stabilizer. In 2013 IEEE Int. Electron Devices Meeting https://doi.org/10.1109/IEDM.2013.6724727 (IEEE, 2013).

  26.

    Bell, T. E. Optical computing: a field in flux: a worldwide race is on to develop machines that compute with photons instead of electrons but what is the best approach? IEEE Spectr. 23, 34–38 (1986).

  27.

    Hamerly, R., Bernstein, L., Sludds, A., Soljačić, M. & Englund, D. Large-scale optical neural networks based on photoelectric multiplication. Phys. Rev. X 9, 021032 (2019).

  28.

    Silva, A. et al. Performing mathematical operations with metamaterials. Science 343, 160–163 (2014).

  29.

    Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361, 1004–1008 (2018).

  30.

    Colburn, S., Chu, Y., Shlizerman, E. & Majumdar, A. Optical frontend for a convolutional neural network. Appl. Opt. 58, 3179–3186 (2019).

  31.

    Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photon. 11, 441–446 (2017).

  32.

    Tait, A. N. et al. Silicon photonic modulator neuron. Phys. Rev. Appl. 11, 064043 (2019).

  33.

    Pérez, D. et al. Multipurpose silicon photonics signal processor core. Nat. Commun. 8, 636 (2017).

  34.

    Galal, S. & Horowitz, M. Energy-efficient floating-point unit design. IEEE Trans. Comput. 60, 913–922 (2011).

  35.

    Bangari, V. et al. Digital electronics and analog photonics for convolutional neural networks (DEAP-CNNs). IEEE J. Sel. Top. Quantum Electron. 26, https://doi.org/10.1109/JSTQE.2019.2945540 (2020).

  36.

    LeCun, Y., Cortes, C. & Burges, C. J. C. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist.

  37.

    Stern, B., Ji, X., Okawachi, Y., Gaeta, A. L. & Lipson, M. Battery-operated integrated frequency comb generator. Nature 562, 401–405 (2018).

  38.

    Jones, R. et al. Heterogeneously integrated InP/silicon photonics: fabricating fully functional transceivers. IEEE Nanotechnol. Mag. 13, 17–26 (2019).

  39.

    Marin-Palomo, P. et al. Microresonator-based solitons for massively parallel coherent optical communications. Nature 546, 274–279 (2017).

  40.

    Spencer, D. T. et al. An optical-frequency synthesizer using integrated photonics. Nature 557, 81–85 (2018).

  41.

    Riemensberger, J. et al. Massively parallel coherent laser ranging using soliton microcombs. Nature 581, 164–170 (2020).

  42.

    Moss, D. J., Morandotti, R., Gaeta, A. L. & Lipson, M. New CMOS-compatible platforms based on silicon nitride and Hydex for nonlinear optics. Nat. Photon. 7, 597–607 (2013).

  43.

    He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In 2016 Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR) https://doi.org/10.1109/CVPR.2016.90 (IEEE, 2016).

  44.

    Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In 3rd Int. Conf. Learning Representations (ICLR 2015) (eds Bengio, Y. & LeCun, Y.) 4 (2015); https://arxiv.org/abs/1409.1556.

  45.

    Al-Ashrafy, M., Salem, A. & Anis, W. An efficient implementation of floating point multiplier. In 2011 Saudi Int. Electronics, Communications and Photonics Conf. (SIECPC) https://doi.org/10.1109/SIECPC.2011.5876905 (2011).

  46.

    Gao, L., Chen, P. Y. & Yu, S. Demonstration of convolution kernel operation on resistive cross-point array. IEEE Electron Device Lett. 37, 870–873 (2016).

  47.

    Shafiee, A. et al. ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars. In Proc. 2016 43rd Int. Symp. Computer Architecture (ISCA 2016) https://doi.org/10.1109/ISCA.2016.12 (2016).

  48.

    Li, X. et al. Fast and reliable storage using a 5 bit, nonvolatile photonic memory cell. Optica 6, 1–6 (2019).

  49.

    Ríos, C. et al. Integrated all-photonic non-volatile multi-level memory. Nat. Photon. 9, 725–732 (2015).

  50.

    Feldmann, J. et al. Calculating with light using a chip-scale all-optical abacus. Nat. Commun. 8, 1256 (2017).

  51.

    Gehring, H. et al. Low-loss fiber-to-chip couplers with ultrawide optical bandwidth. APL Photon. 4, 010801 (2019).

  52.

    Gehring, H., Eich, A., Schuck, C. & Pernice, W. H. P. Broadband out-of-plane coupling at visible wavelengths. Opt. Lett. 44, 5089 (2019).

  53.

    Nahmias, M. A. et al. Photonic multiply-accumulate operations for neural networks. IEEE J. Sel. Top. Quantum Electron. https://doi.org/10.1109/jstqe.2019.2941485 (2019).

  54.

    Gehring, H., Blaicher, M., Hartmann, W. & Pernice, W. H. P. Python based open source design framework for integrated nanophotonic and superconducting circuitry with 2D-3D-hybrid integration. OSA Continuum 2, 3091–3101 (2019).

  55.

    Guo, H. et al. Universal dynamics and deterministic switching of dissipative Kerr solitons in optical microresonators. Nat. Phys. 13, 94–102 (2017).

  56.

    Karpov, M. et al. Dynamics of soliton crystals in optical microresonators. Nat. Phys. 15, 1071–1077 (2019).

  57.

    Fialka, O. & Čadík, M. FFT and convolution performance in image filtering on GPU. In Proc. 10th Int. Conf. Information Visualisation (IV’06) https://doi.org/10.1109/IV.2006.53 (IEEE, 2006).

  58.

    Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, https://doi.org/10.1145/3065386 (2017).

  59.

    Szegedy, C. et al. Going deeper with convolutions. In Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR) https://doi.org/10.1109/CVPR.2015.7298594 (IEEE, 2015).

  60.

    Ríos, C. et al. In-memory computing on a photonic platform. Sci. Adv. 5, eaau5759 (2019).

  61.

    Gaeta, A. L., Lipson, M. & Kippenberg, T. J. Photonic-chip-based frequency combs. Nat. Photon. 13, 158–169 (2019).

  62.

    Ma, Y. et al. Ultralow loss single layer submicron silicon waveguide crossing for SOI optical interconnect. Opt. Express 21, 29374–29382 (2013).

  63.

    Lu, Z. et al. Broadband silicon photonic directional coupler using asymmetric-waveguide based phase control. Opt. Express 23, 3795–3808 (2015).

  64.

    Farmakidis, N. et al. Plasmonic nanogap enhanced phase change devices with dual electrical-optical functionality. Sci. Adv. 5, eaaw2687 (2019).

  65.

    Zhang, H. et al. Miniature multilevel optical memristive switch using phase change material. ACS Photon. 6, 2205–2212 (2019).

  66.

    Atabaki, A. H. et al. Integrating photonics with silicon nanoelectronics for the next generation of systems on a chip. Nature 556, 349–354 (2018).

  67.

    Wang, X. & Liu, J. Emerging technologies in Si active photonics. J. Semicond. 39, 061001 (2018).

  68.

    Sun, J., Timurdogan, E., Yaacobi, A., Hosseini, E. S. & Watts, M. R. Large-scale nanophotonic phased array. Nature 493, 195–199 (2013).

Acknowledgements

This research was supported by EPSRC via grants EP/J018694/1, EP/M015173/1 and EP/M015130/1 in the UK and Deutsche Forschungsgemeinschaft (DFG) grant PE 1832/5-1 in Germany. This material is based upon work supported by the Air Force Office of Scientific Research under award number FA9550-19-1-0250. W.H.P.P. gratefully acknowledges support by the European Research Council through grant 724707. We further acknowledge funding for this work from the European Union’s Horizon 2020 Research and Innovation Programme (Fun-COMP project number 780848). A.S. acknowledges support by the European Research Council through grant 682675. H.G. thanks the Studienstiftung des deutschen Volkes for financial support. We thank F. Brückerhoff-Plückelmann, S. Agarwal and W. Zhou for help with sample fabrication and discussions of the experimental results.

Author information

Contributions

W.H.P.P., H.B., A.S., T.J.K. and C.D.W. conceived the experiment. J.F. fabricated the devices with assistance from N.Y., H.G. and X.L. N.Y. performed the deposition of the Ge2Sb2Te5 material, together with X.L. J.F. implemented the measurement setup and carried out the measurements with help from N.Y., M.K., M.S. and H.G. M.K., X.F., A.L., A.S.R. and J.L. implemented the frequency comb source. All authors discussed the data and wrote the manuscript together.

Corresponding authors

Correspondence to A. Sebastian, T. J. Kippenberg, W. H. P. Pernice or H. Bhaskaran.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature thanks Huaqiang Wu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary methods and notes. The file contains Supplementary Tables 1–2, Supplementary Figures 1–22 and Supplementary References. It gives further methodological information on the experimental setups and provides additional data to validate and illustrate the main results of the manuscript.

About this article

Cite this article

Feldmann, J., Youngblood, N., Karpov, M. et al. Parallel convolutional processing using an integrated photonic tensor core. Nature 589, 52–58 (2021). https://doi.org/10.1038/s41586-020-03070-1
