Hierarchical deep reinforcement learning reveals a modular mechanism of cell movement


Time-lapse images of cells and tissues contain rich information about dynamic cell behaviours, which reflect the underlying processes of proliferation, differentiation and morphogenesis. However, we lack computational tools for effective inference. Here we exploit deep reinforcement learning (DRL) to infer cell–cell interactions and collective cell behaviours in tissue morphogenesis from three-dimensional (3D) time-lapse images. We use hierarchical DRL (HDRL), known for multiscale learning and data efficiency, to examine cell migrations based on images with a ubiquitous nuclear label and simple rules formulated from empirical statistics of the images. When applied to Caenorhabditis elegans embryogenesis, HDRL reveals a multiphase, modular organization of cell movement. Imaging with additional cellular markers confirms the modular organization as a novel migration mechanism, which we term sequential rosettes. Furthermore, HDRL forms a transferable model that successfully differentiates sequential rosettes-based migration from others. Our study demonstrates a powerful approach to infer the underlying biology from time-lapse imaging without prior knowledge.
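The two-level idea behind hierarchical reinforcement learning can be illustrated with a small sketch. This is not the authors' model: both policies below are hand-written placeholders standing in for learned networks, and the grid world, the function names (`high_level_policy`, `low_level_policy`, `run_episode`) and the `horizon` parameter are assumptions made only for illustration. A high-level policy proposes a nearby subgoal, and a low-level policy takes primitive steps towards it before control returns to the high level.

```python
from typing import List, Tuple

Position = Tuple[int, int, int]

# The six unit moves available to the low-level policy on a 3D grid (assumed action set).
MOVES: List[Position] = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def manhattan(a: Position, b: Position) -> int:
    """Grid distance between two positions."""
    return sum(abs(x - y) for x, y in zip(a, b))


def high_level_policy(position: Position, target: Position, horizon: int) -> Position:
    """Placeholder for a learned high-level policy.

    Picks a subgoal at most `horizon` grid steps from `position`, moving
    towards `target` one axis at a time.
    """
    subgoal = list(position)
    budget = horizon
    for axis in range(3):
        delta = target[axis] - position[axis]
        step = max(-budget, min(budget, delta))
        subgoal[axis] += step
        budget -= abs(step)
    return (subgoal[0], subgoal[1], subgoal[2])


def low_level_policy(position: Position, subgoal: Position) -> Position:
    """Placeholder for a learned low-level policy: greedy unit move towards the subgoal."""
    def distance_after(move: Position) -> int:
        moved = (position[0] + move[0], position[1] + move[1], position[2] + move[2])
        return manhattan(moved, subgoal)
    return min(MOVES, key=distance_after)


def run_episode(start: Position, target: Position, horizon: int = 3, max_steps: int = 100):
    """Alternate between subgoal selection and primitive moves until the target is reached."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    position = start
    trajectory = [position]
    subgoals = []
    while position != target and len(trajectory) <= max_steps:
        subgoal = high_level_policy(position, target, horizon)
        subgoals.append(subgoal)
        # The low level acts for at most `horizon` steps, then control returns upward.
        for _ in range(horizon):
            if position == subgoal:
                break
            move = low_level_policy(position, subgoal)
            position = (position[0] + move[0], position[1] + move[1], position[2] + move[2])
            trajectory.append(position)
    return trajectory, subgoals


trajectory, subgoals = run_episode((0, 0, 0), (4, 2, 1), horizon=3)
print(subgoals)        # the sequence of subgoals chosen by the high level
print(trajectory[-1])  # the final position
```

In a learned setting, each placeholder would be replaced by a trained policy with its own reward signal; the sketch only shows how decisions at two timescales interleave.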


Fig. 1: Concepts and design to model cell movement with HDRL.
Fig. 2: Modelling Cpaaa migration in C. elegans embryogenesis.
Fig. 3: Modular organization of Cpaaa movement in HDRL and 3D time-lapse imaging.
Fig. 4: Migration of Cpaaa upon genetic perturbations.
Fig. 5: Validation and characterization of the TMM.
Fig. 6: TMM classification and 3D time-lapse imaging of mu_int_R and CANL migration.

Data availability

The data that support the findings of this study have been deposited at A 50 wild-type C. elegans dataset, the embryonic data for Cpaaa training and the TMM evaluation, as well as the data for the mu_int_R case, are included; they are named WT50_release, Cpaaa_release, cpaaa_1(2,3) and mu_int_R_CANL_1(2), respectively.

Code availability

Source code with data information and several pre-trained models are available at (




We thank A. Santella for discussions and technical help and H. Shroff and Q. Morris for critiquing the manuscript. This study was partly supported by an NIH grant (R01GM097576) to Z.B. and D.W. Research in Z.B.’s laboratory is also supported by an NIH centre grant to MSKCC (P30CA008748). This research used resources of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the US Department of Energy under contract no. DE-AC05-00OR22725.

Author information


Z.W., Y.X., D.W. and Z.B. designed the experiments. Z.W., J.Y. and Y.X. performed the experiments and analysed the data. Z.W., Y.X., D.W., J.Y. and Z.B. wrote the manuscript. D.W. and Z.B. supervised the project.

Corresponding authors

Correspondence to Dali Wang or Zhirong Bao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review information

Nature Machine Intelligence thanks Nico Scherf and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Figs. 1–5, Table 1 and Videos 1–4.

Reporting Summary

Supplementary Video 1

The migration of Cpaaa.

Supplementary Video 2

The Cpaaa training process.

Supplementary Video 3

The migration of mu_int_R.

Supplementary Video 4

The migration of CANL.

About this article

Cite this article

Wang, Z., Xu, Y., Wang, D. et al. Hierarchical deep reinforcement learning reveals a modular mechanism of cell movement. Nat Mach Intell 4, 73–83 (2022).
