The mammalian cerebral cortex has an unparalleled diversity of cell types, which are generated during development through a series of temporally orchestrated events that are under tight evolutionary constraint and are critical for proper cortical assembly and function (refs. 1, 2). However, the molecular logic that governs the establishment and organization of cortical cell types remains unknown, largely due to the large number of cell classes that undergo dynamic cell-state transitions over extended developmental timelines. Here we generate a comprehensive atlas of the developing mouse neocortex, using single-cell RNA sequencing and single-cell assay for transposase-accessible chromatin using sequencing. We sampled the neocortex every day throughout embryonic corticogenesis and at early postnatal ages, and complemented the sequencing data with a spatial transcriptomics time course. We computationally reconstruct developmental trajectories across the diversity of cortical cell classes, and infer their spatial organization and the gene regulatory programs that accompany their lineage bifurcation decisions and differentiation trajectories. Finally, we demonstrate how this developmental map pinpoints the origin of lineage-specific developmental abnormalities that are linked to aberrant corticogenesis in mutant mice. The data provide a global picture of the regulatory mechanisms that govern cellular diversification in the neocortex.
The datasets generated during the current study are available in the Gene Expression Omnibus (GEO SuperSeries GSE153164) and at the Single Cell Portal: https://singlecell.broadinstitute.org/single_cell/study/SCP1290/molecular-logic-of-cellular-diversification-in-the-mammalian-cerebral-cortex.
R markdown scripts enabling the main steps of analysis are available on a GitHub repository https://github.com/ehsanhabibi/MolecularLogicMouseNeoCortex.
Lodato, S. & Arlotta, P. Generating neuronal diversity in the mammalian cerebral cortex. Annu. Rev. Cell Dev. Biol. 31, 699–720 (2015).
Greig, L. C., Woodworth, M. B., Galazo, M. J., Padmanabhan, H. & Macklis, J. D. Molecular logic of neocortical projection neuron specification, development and diversity. Nat. Rev. Neurosci. 14, 755–769 (2013).
Yuzwa, S. A. et al. Developmental emergence of adult neural stem cells as revealed by single-cell transcriptional profiling. Cell Rep. 21, 3970–3986 (2017).
Frazer, S. et al. Transcriptomic and anatomic parcellation of 5-HT3AR expressing cortical interneuron subtypes revealed by single-cell RNA sequencing. Nat. Commun. 8, 14219 (2017).
Mayer, C. et al. Developmental diversification of cortical inhibitory interneurons. Nature 555, 457–462 (2018).
Bielle, F. et al. Multiple origins of Cajal–Retzius cells at the borders of the developing pallium. Nat. Neurosci. 8, 1002–1012 (2005).
Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat. Biotechnol. 39, 313–319 (2021).
Biancalani, T. et al. Deep learning and alignment of spatially-resolved whole transcriptomes of single cells in the mouse brain with Tangram. Preprint at https://doi.org/10.1101/2020.08.29.272831 (2020).
Kim, E. J., Juavinett, A. L., Kyubwa, E. M., Jacobs, M. W. & Callaway, E. M. Three types of cortical layer 5 neurons that differ in brain-wide connectivity and function. Neuron 88, 1253–1267 (2015).
Tasic, B. et al. Shared and distinct transcriptomic cell types across neocortical areas. Nature 563, 72–78 (2018).
Allen Cell Types Database. http://celltypes.brain-map.org/rnaseq/mouse_ctx-hip_smart-seq (Allen Institute for Brain Science, 2015).
Farrell, J. A. et al. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science 360, eaar3131 (2018).
Arlotta, P. et al. Neuronal subtype-specific genes that control corticospinal motor neuron development in vivo. Neuron 45, 207–221 (2005).
Florio, M. & Huttner, W. B. Neural progenitors, neurogenesis and the evolution of the neocortex. Development 141, 2182–2194 (2014).
Jhas, S. et al. Hes6 inhibits astrocyte differentiation and promotes neurogenesis through different mechanisms. J. Neurosci. 26, 11061–11071 (2006).
Malatesta, P. & Götz, M. Radial glia – from boring cables to stem cell stars. Development 140, 483–486 (2013).
Telley, L. et al. Temporal patterning of apical progenitors and their daughter neurons in the developing neocortex. Science 364, eaav2522 (2019).
Llorca, A. et al. A stochastic framework of neurogenesis underlies the assembly of neocortical cytoarchitecture. eLife 8, e51381 (2019).
Guo, C. et al. Fezf2 expression identifies a multipotent progenitor for neocortical projection neurons, astrocytes, and oligodendrocytes. Neuron 80, 1167–1174 (2013).
Gao, P. et al. Deterministic progenitor behavior and unitary production of neurons in the neocortex. Cell 159, 775–788 (2014).
Franco, S. J. et al. Fate-restricted neural progenitors in the mammalian cerebral cortex. Science 337, 746–749 (2012).
Zahr, S. K. et al. A translational repression complex in developing mammalian neural stem cells that regulates neuronal specification. Neuron 97, 520–537.e6 (2018).
Thompson, C. L. et al. A high-resolution spatiotemporal atlas of gene expression of the developing mouse brain. Neuron 83, 309–323 (2014).
Molyneaux, B. J. et al. DeCoN: genome-wide analysis of in vivo transcriptional dynamics during pyramidal neuron fate selection in neocortex. Neuron 85, 275–288 (2015).
Dixit, A. et al. Perturb-seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167, 1853–1866 (2016).
Preissl, S. et al. Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation. Nat. Neurosci. 21, 432–439 (2018).
Ma, S. et al. Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 183, 1103–1116 (2020).
Urquhart, J. E. et al. DMRTA2 (DMRT5) is mutated in a novel cortical brain malformation. Clin. Genet. 89, 724–727 (2016).
Cubelos, B. et al. Cux1 and Cux2 regulate dendritic branching, spine morphology, and synapses of the upper layer neurons of the cortex. Neuron 66, 523–535 (2010).
Lodato, S. et al. Excitatory projection neuron subtypes control the distribution of local inhibitory interneurons in the cerebral cortex. Neuron 69, 763–779 (2011).
Molyneaux, B. J., Arlotta, P., Hirata, T., Hibi, M. & Macklis, J. D. Fezl is required for the birth and specification of corticospinal motor neurons. Neuron 47, 817–831 (2005).
Chen, B., Schaevitz, L. R. & McConnell, S. K. Fezl regulates the differentiation and axon targeting of layer 5 subcortical projection neurons in cerebral cortex. Proc. Natl Acad. Sci. USA 102, 17184–17189 (2005).
Lodato, S. et al. Gene co-regulation by Fezf2 selects neurotransmitter identity and connectivity of corticospinal neurons. Nat. Neurosci. 17, 1046–1054 (2014).
Hirata, T. et al. Zinc finger gene fez-like functions in the formation of subplate neurons and thalamocortical axons. Dev. Dyn. 230, 546–556 (2004).
Loo, L. et al. Single-cell transcriptomic analysis of mouse neocortical development. Nat. Commun. 10, 134 (2019).
10x Genomics. Nuclei Isolation for Single Cell ATAC Sequencing. Demonstrated Protocol CG000169 (10x Genomics, 2019).
Fleming, S. J., Marioni, J. C. & Babadi, M. CellBender remove-background: a deep generative model for unsupervised removal of background noise from scRNA-seq datasets. Preprint at https://doi.org/10.1101/791699 (2019).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
Kowalczyk, M. S. et al. Single-cell RNA-seq reveals changes in cell cycle and differentiation programs upon aging of hematopoietic stem cells. Genome Res. 25, 1860–1872 (2015).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. P10008 (2008).
McGinnis, C. S., Murrow, L. M. & Gartner, Z. J. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 8, 329–337 (2019).
Wolock, S. L., Lopez, R. & Klein, A. M. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291 (2019).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Angerer, P. et al. destiny: diffusion maps for large-scale single-cell data in R. Bioinformatics 32, 1241–1243 (2016).
Bergen, V., Lange, M., Peidli, S., Wolf, F. A. & Theis, F. J. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 38, 1408–1414 (2020).
Haghverdi, L., Büttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).
Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
Stuart, T., Srivastava, A., Lareau, C. & Satija, R. Multimodal single-cell chromatin analysis with Signac. Preprint at https://doi.org/10.1101/2020.11.09.373613 (2020).
Dong, M. et al. SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references. Brief. Bioinform. 22, 416–427 (2021).
Tan, Y. & Cahan, P. SingleCellNet: a computational tool to classify single cell RNA-seq data across platforms and across species. Cell Syst. 9, 207–213 (2019).
Yu, G., Wang, L. G., Han, Y. & He, Q. Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
DeCoN. http://decon.fas.harvard.edu/pyramidal/ (2014).
Allen Developing Mouse Brain Atlas. http://developingmouse.brain-map.org/ (Allen Institute for Brain Science, 2008).
We thank former and present members of the Arlotta and Regev laboratories for discussions and editing of the manuscript. This work was supported by grants from the Stanley Center for Psychiatric Research, the Broad Institute of MIT and Harvard to P.A., grants from the National Institutes of Health (P50MH094271, U19MH114821 and R01NS103758 to P.A., and DP5OD024583 to F.C.), and grants from The Klarman Cell Observatory, HHMI and NHGRI Center for Cell Circuits CEGS to A.R. D.J.D. was supported by the Pew Latin American Postdoctoral Fellowship.
P.A. is a scientific advisory board member for System 1 Biosciences and Foresite Labs and is a co-founder of Serqet Therapeutics. A.R. is a co-founder of and an equity holder in Celsius Therapeutics, an equity holder in Immunitas, and until 31 July 2020 was a scientific advisory board member of Thermo Fisher Scientific, Syros Pharmaceuticals, Asimov and Neogene Therapeutics. Since 1 August 2020, A.R. has been an employee of Genentech/Roche. Since 1 February 2021, T.B. has been an employee of Genentech. Since 1 January 2021, G.S. has been an employee of Roche.
Peer review information Nature thanks Debra Silver, Fabian Theis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1
Related to Fig. 1. a, Number of replicates, total number of embryos, sex of mice and number of cells analysed per time point. b, Number of genes, number of mRNA molecules (counts) and percentage of mitochondrial counts per cell in each time point. c, Proportion of cells corresponding to the different cell types present in each time point. 85–98% of cells were successfully identified for each time point. The earliest stages were primarily composed of apical and intermediate progenitors: AP + IP = 77% at E10.5, 80% at E11.5, 69% at E12.5, 66% at E13.5. d, Correlation between male (M, Xist expression < 1) and female (F, Xist expression > 1) cells at E12.5 and E18.5 in selected cell types. Pearson correlation coefficients are indicated. Distinct genes include X-chromosome genes Xist and Tsix and Y-chromosome genes Ddx3y and Eif2s3y. Some haemoglobin genes also appear distinct but, as shown in e, they are driven by a few outlier cells. e, Normalized expression levels of some of the distinct genes between male and female cells at E18.5. Only two cell types are shown for clarity. f, UMAP visualization of cells collected at each time point, showing expression levels (normalized) of marker genes for dorsal derivatives (Emx1), apical progenitors (Sox2), intermediate progenitors (Eomes), excitatory neurons (Neurod2, Fezf2, Satb2), inhibitory interneurons (Dlx2) and glial cells (Apoe).
Extended Data Fig. 2 Molecular signatures and interneuron heterogeneity in the developing cerebral cortex.
Related to Fig. 1. a, Selective expression (normalized) of marker genes per cell type in the combined scRNA-seq dataset. Cell types are grouped on the basis of their identity and shared marker genes. b, Gene signatures for all cell types identified in the combined time points. Top 20 differentially expressed genes for each cell type are presented. Cells were down-sampled to a maximum of 500 cells per cell type. c, Expression of canonical marker genes for selected cell types in the UMAP visualization of the combined scRNA-seq time course. d, Different subtypes of interneuron integrate into the developing cortex through time. Left, clustering of interneurons collected at all time points, visualized via UMAP. Middle, interneuron UMAP plots show the expression of the inhibitory markers Dlx2 and Gad2, as well as a marker of dorsally-derived cell types (Emx1), not expressed by interneurons. Right, proportion of cells corresponding to each cluster in each time point. e, Expression of genes characteristic of interneurons of different embryonic origins. Medial ganglionic eminence (MGE)-derived interneurons express Npy, Sst, Lhx6 and Nxph1. Interneurons originating in the CGE (caudal ganglionic eminence) are positive for Htr3a, Prox1, Cxcl14 and Sp8. A second population of Htr3a+ interneurons express Meis2, Etv1 and Sp8, putatively from the pallial–subpallial (P–SP) boundary.
Extended Data Fig. 3
Related to Fig. 2. a, Scheme summarizing the approach to map cell types from the scRNA-seq developmental atlas onto matching tissue spatial transcriptomes (Slide-seq v2) using Tangram (ref. 8). b, Mapping of extended cell types from the scRNA-seq data onto the matching Slide-seq section. Beads are coloured according to the probability of the cell type being mapped in that position. c, Gene expression of characteristic genes validating cell types matched for each time point. d, Cell-type assignment in the deep layers of P1 cortex shows accurate distribution. Beads are coloured according to the cell type with highest mapping probability. Right, summed probabilities across bins parallel to the ventricular surface (normalized to maximum value). e, Mapping probabilities for the deep layer cell types grouped by the cell type assigned (cell type with highest probability) corresponding to d. In box plots the middle line is the median, the lower and upper hinges correspond to the 25% and 75% quantiles, the upper whisker corresponds to the largest value no larger than 1.5 × IQR from the hinge (where IQR is the interquartile range) and the lower whisker corresponds to the smallest value at most 1.5 × IQR of the lower hinge. Total number of beads = 812. f, Gene expression in E15.5 scRNA-seq data of genes associated with the migrating neuron substates identified in Fig. 2b.
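The box plot convention described in panel e (median, 25% and 75% hinges, whiskers ending at the most extreme observations within 1.5 × IQR of each hinge) can be illustrated with a minimal sketch. The function name `box_stats` is hypothetical, and the linear-interpolation quantile rule used here is an assumption; the legend does not state which quantile definition the authors' plotting software applied.

```python
from statistics import quantiles


def box_stats(values):
    """Summarise values using the box plot convention in the legend."""
    data = sorted(values)
    # Assumed quantile rule: linear interpolation between order statistics.
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers stop at the most extreme data points inside the 1.5 x IQR fences.
    upper = max(v for v in data if v <= q3 + 1.5 * iqr)
    lower = min(v for v in data if v >= q1 - 1.5 * iqr)
    return {
        "lower_whisker": lower,
        "q1": q1,
        "median": med,
        "q3": q3,
        "upper_whisker": upper,
    }


# Toy values (not from the study); 100 falls outside the upper fence.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

Points beyond the whiskers, such as the value 100 in the toy input, would be drawn as individual outliers under this convention.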
Extended Data Fig. 4 Consistent ordering of cells in developmental trajectories and characterization of branching tree of cortical development.
Related to Fig. 3. a, UMAP visualizations of the scRNA-seq data from combined time points, with cells coloured by pseudotime inferred by different methods. Left to right: URD pseudotime, Monocle3 pseudotime (ref. 47), latent time from scVelo (ref. 45), diffusion pseudotime (DPT; ref. 46) and velocity pseudotime (ref. 45). Purple represents earlier cells in the trajectory, while yellow labels later cells. Grey indicates cells that were excluded from the trajectory. b, Correlation (red, low; white, high) for all cells between URD pseudotime values and pseudotime calculated by the specified method. The R coefficient and P value of the Pearson correlation are stated. c, UMAP visualization of the cells used for trajectory building (same as cells used for Fig. 3a and related figures) coloured by cell type (left) and pseudotime (right), on which a developmental trajectory was calculated using Monocle3. A similar branching structure was found. Although it did not allow for finer segregation of the terminal neuronal types, Monocle3 ascribed a unique trajectory going from progenitors to all classes of neurons, with a post-mitotic branching into CPN and CFuPN branches (arrows, similar to URD). d, Gene expression along trajectories calculated with URD (right) or Monocle3 (left). e, URD trajectory branching tree of the developing cortex. Cells are coloured according to their developmental time of collection. f, g, Normalized fraction of cells corresponding to each time point of collection (f) and to each cell type (g) across binned pseudotime, showing that pseudotime is aligned with age and cell type (compare with Fig. 1c).
Extended Data Fig. 5
Related to Fig. 3. a, Branching trees showing the expression of marker genes of apical progenitors (Sox2, Hes5), intermediate progenitors (Eomes) and excitatory neurons (Neurod2), as well as genes characteristic of the dorsally-derived cortical cell types, including callosal neurons (Satb2, Cux2), layer 4 stellate neurons (Rorb), corticofugal neurons (Fezf2, Tle4, Pcp4, Tcerg1l), putative near-projecting neurons (Tshz2), astrocytes (Slc1a3, Aqp4, Aldh1l1), and ependymocytes (Foxj1). There is a sequential progression of apical progenitors, intermediate progenitors and excitatory neurons, followed by neuronal subtypes, astrocytes and ependymocytes. b, c, Force-directed layout embedding representation of the developmental branching tree, showing the initial part of the tree. Cells are coloured according to their pseudotime value (left), age of collection (middle), or cell type (right). Differentially expressed genes between AP in each branch are highlighted and their expression levels are shown in c (see also Supplementary Table 2). AP corresponding to the astrocytic and neuronal branches form a continuum of cells. d, Tangram mapping probabilities of E13.5 AP from each branch onto matching Slide-seq section show that both states coexist in the ventricular zone. Arrowheads and arrows in the inset show probabilities in individual beads. e, Top, apical progenitors from different ages form a continuum of cells and do not segregate into distinct clusters. AP from all time points were sub-clustered separately, coloured by age (left) and clusters identified by Seurat (right). Bottom, a similar effect is observed when both apical and intermediate progenitors were sub-clustered, cells first separate mostly by cell type (right), and then continuously by time point (left). 
f, Expression of CPN markers (Satb2, Pou3f3 and Cux1, left), and CFuPN markers (Fezf2, Tle4 and Bcl11b, right) in both early (E12.5) and late (E15.5) AP, as well as in the combined AP populations (all time points), when AP were co-embedded using the top 100 differentially expressed genes between CFuPN and CPN as input for principal component analysis and downstream clustering and visualization. Cell-type marker genes are expressed in progenitors but do not drive clustering of the cells. g, Separation in different classes of neurons occurs post-mitotically. Branching tree and UMAP representation of the full developmental atlas coloured by cell-cycle phase, as predicted by gene expression. h, Tangram mapping of layer 5 and 6 CPN on P1 Slide-seq section. P1 cells allocated to each of the two terminal branches broadly labelled as layer 5&6 CPN were mapped onto the Slide-seq P1 section to find their distribution in the developing cortex. Mapping probabilities (top) indicated that cells from branch 1 were more likely to be mapped to layer 5, while cells from branch 2 mapped with enrichment to layer 6. Genes differentially expressed between both populations, layer 5- (Rorb, Fam19a2) and layer 6-CPN markers (Cdh13, Igsf21, Gnb4) show matching distribution (bottom).
Extended Data Fig. 6
Related to Fig. 3. a–d, New expression patterns emerging from the inferred tree. Expression levels overlaid on the tree (left), UMAP of full scRNA-seq developmental data (middle), and Slide-seq counts on an E15.5 or P1 section of cortex (right) for each gene. a, Rorb is expressed in developing CFuPN, astrocytes and layer 4 stellate neurons and present in the deep cortical plate (CP). b, Pcp4 is expressed in migrating and immature neurons that contribute to both CPN and CFuPN, as well as in SCPN, layer 6b, NP and Cajal–Retzius cells (CR), and is found in the intermediate zone (IZ) and CP. c, Npy is expressed in CFuPN and highly in CPN of layers 5 and 6. A positive Npy signal is evident in the deep CP through Slide-seq. d, Cck was also detected in CFuPN and at higher levels in CPN of layers 5 and 6. Low levels of expression in the CP were detected via Slide-seq. VZ, ventricular zone. e, Validation of expression of novel cell-type-specific genes emerging from the cascade analysis. Expression levels overlaid on the tree (left), time course expression on purified subtypes of PN from the DeCoN transcriptomic resource (refs. 24, 52) (middle), and in situ hybridization from the Allen Developing Mouse Brain Atlas (refs. 23, 53) (right, age indicated in figure). f, Complete set of gene programs of connected modules found by NMF. Each circular node represents a module. Modules are horizontally aligned to the developmental stage that the module was computed from, and coloured by the annotated function (see also Supplementary Table 3). g, Scaled expression overlaid on branching tree of modules corresponding to broad neuronal differentiation programs, coloured according to program identity. h, Selected NMF modules expression from scRNA-seq data mapped onto time-matched Slide-seq section using Tangram (Methods).
Extended Data Fig. 7
Related to Fig. 3. Gene cascades for differentiation of projection neuron subtypes, astrocytes and ependymocytes. The x axis represents pseudotime across the tree. Each row is a gene in which gene expression is scaled to the maximum observed expression and then smoothened. Genes are ordered by the pseudotime value at which they enter and then leave ‘peak’ expression (expression 50% higher than minimum value), and start and then leave ‘expression’ (expression 20% higher than minimum value), in that order. Smoothening of expression values was performed using spline fitting from URD for expression dynamics (Methods). Known marker genes for the cell type are labelled; see Supplementary Table 3 for the full list of genes.
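The gene ordering rule in this legend can be sketched as a four-part sort key. This is a minimal illustration, not the URD implementation. It assumes the expression curves are already smoothed along pseudotime bins, and it reads "50% (or 20%) higher than minimum value" as the minimum plus 0.5 (or 0.2) on the max-scaled curve, which is one possible interpretation. All function and gene names are hypothetical.

```python
def scale_to_max(curve):
    """Scale a curve so that its maximum observed value is 1."""
    top = max(curve)
    return [v / top for v in curve] if top > 0 else list(curve)


def window(curve, offset):
    """First and last pseudotime index where the curve is >= min + offset."""
    cut = min(curve) + offset
    hits = [i for i, v in enumerate(curve) if v >= cut]
    # Genes that never cross the cut-off are sent to the end of the ordering.
    return (hits[0], hits[-1]) if hits else (len(curve), len(curve))


def order_cascade(curves):
    """Order genes by peak onset, peak offset, expression onset and offset.

    curves: dict mapping gene name -> smoothed expression per pseudotime bin.
    """
    def key(gene):
        scaled = scale_to_max(curves[gene])
        peak_on, peak_off = window(scaled, 0.5)   # assumed 'peak' threshold
        expr_on, expr_off = window(scaled, 0.2)   # assumed 'expression' threshold
        return (peak_on, peak_off, expr_on, expr_off)

    return sorted(curves, key=key)


# Toy curves (not real data) for three hypothetical genes.
toy = {
    "late_gene": [0, 0, 1, 4, 8],
    "early_gene": [8, 4, 1, 0, 0],
    "mid_gene": [0, 3, 8, 3, 0],
}
ordered = order_cascade(toy)
```

Sorting on the tuple places genes with the earliest peak onset first and uses the later entries only to break ties, which matches the "in that order" wording of the legend.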
Extended Data Fig. 8
Related to Fig. 3. a, Feature importance (0.5 power transformed, dot size) and average expression of genes predicted to be involved in cell-type divergence (row-scaled, colour). Top 10 genes per branch, ranked by their Friedman MSE score (importance) for distinguishing between cells in one branch versus cells in sibling and parent branch. The colour bar at the top indicates branch-points marked on the tree to the left. Genes in red correspond to TFs. Expression in parent branch not shown. b, Gene Ontology analysis showing molecular function enrichment among genes involved in branch-points as determined in a. c, Simplified URD branching trees on which average gene expression within a segment and a pseudotime bin is overlaid on the tree structure, showing restricted expression patterns of genes identified in a.
Extended Data Fig. 9 Characterization of scATAC–seq atlas and developmental trajectories of accessible elements through cortical development.
Related to Fig. 4. a, scATAC–seq data per time point. UMAP visualization of the single cells coloured by their predicted identity from integration with scRNA-seq datasets (left). Gene accessibility of selected markers for main cell types present in each time point (middle). Maximum prediction score for each cell based on labels transferred from scRNA-seq data (right). b, URD chromatin accessibility trajectories during cortical development. Cells are coloured according to their age of collection. c, ATAC trees highlighting the accessibility of marker genes characteristic of the different cortical cell types, including apical and intermediate progenitors, astrocytes, callosal and corticofugal neurons. d, RNA-based tree generated from only the E13.5, E15.5 and E18.5 time points, corresponding to the scATAC–seq data. Trees are coloured by cell type (left) and time of collection (right). e, Chromatin accessibility and gene expression cascades for layers 2&3 CPN and SCPN. Same genes are plotted for both modalities, in the same order. f, Chromatin accessibility and gene expression across pseudotime for illustrative genes from the SCPN cascade, CPN markers, or general neuronal markers plotted on the SCPN cascade. In many cases accessibility rises before gene expression.
Extended Data Fig. 10
Related to Fig. 4. a, Total number of accessible sites identified per time point and fraction that is dynamic across cell types (that is, is enriched in at least one cell type). b, Left, schematic of the approach used to identify candidate cell-type-specific enhancers. Differential expression analysis identified cell-type-specific genes, for which we calculated co-accessibility (correlation higher than 25%) between distal elements (within a 100 kb region) and target gene promoters using Cicero, within each cell type. c, Distal elements co-accessible with the Pcp4 promoter region in E18.5 SCPN and migrating neurons. Cicero co-accessibility is shown in blue curves, detected peaks in each cell type are shown as coloured bars. Black bars correspond to promoter peak, blue bars are peaks selectively co-accessible in CFuPN, and purple bars are peaks only co-accessible in migrating neurons. Boxes indicate TFs for which motifs are present in indicated peaks. Peaks are aligned to coverage plots (bottom) showing combined ATAC reads for the indicated cell types. Chromosome coordinates and genes are indicated at bottom. d, TF binding site enrichment on accessible sites of cells in the CPN vs CFuPN branch-point (see Fig. 3d) shows significant enrichment of some of the TFs detected in Fig. 3d, suggesting an actual role in this step. e, Left, in situ hybridization against Eomes (IP marker), Ube2c (mitotic marker) and Dmrta2 showing expression of the latter in the dorsal ventricular zone (VZ) of E12.5 developing cortex. Right, in situ hybridization against Satb2 and Myt1l showing expression of the latter in newborn neurons, co-expressed with Satb2. Slide-seq gene expression at the indicated ages shows similar expression patterns. Scale bars, 30 μm. Representative images from in situ hybridizations repeated in 2 different embryos. ML and DV indicate medio-lateral and dorso-ventral orientations, respectively.
f, Slide-seq gene expression of several TFs whose binding sites were found to be enriched within the accessible regions of the indicated trajectories (or portion of). Confirmation of gene expression in target cell type supports TF activity.
Extended Data Fig. 11
Related to Fig. 5. a, Violin plots of number of genes (left), number of mRNA molecules (counts; middle), and percentage of mitochondrial counts (right) per cell in control (Het) and Fezf2 knockout, and UMAP visualizations of merged scRNA-seq datasets at E15.5 (top) and P1 (bottom). UMAP visualizations are coloured by genotype or assigned cell type. b, UMAP visualization of single-cell transcriptomes from the excitatory lineage of control and KO cortices at P1 (as shown in Fig. 5c for E15.5), coloured by genotype (left) and cell type (right). Proportion of cells of each cell type by genotype (bottom). c, Heat map showing the overlapping scores between NMF modules identified in the E15.5 Fezf2 datasets and the original E15.5 wild-type modules. All modules were identified with an overlapping score of 40% or higher. d, Left, scaled module expression of significant modules in all cells (two-sided Wilcoxon rank-sum test, Bonferroni correction). Right, average expression of the top 30 genes from selected modules, in apical and intermediate progenitors, and excitatory neurons, by genotype. Differential expression between control (Fezf2 Het) and knockout neurons, at the single-cell level (two-sided Wilcoxon rank-sum test, Bonferroni correction). e, Gene Ontology terms enriched in the Fezf2-knockout-specific module. f, Confusion matrix for random forest classifier calculated using 1,000 cells per cluster of the wild-type developmental atlas. The remaining held-out cells were used to test accuracy. g, Classification of control (Fezf2 Het) and Fezf2 knockout excitatory neurons by the classifier presented in f, for P1 (left) or E15.5 (right) datasets. Cells are grouped according to their manually assigned identity on the basis of the expression of marker genes.
Box plots to the right show the corresponding classification scores in which the middle line is the median, the lower and upper hinges correspond to the 25% and 75% quantiles, the upper whisker corresponds to the largest value no larger than 1.5 × IQR from the hinge and the lower whisker corresponds to the smallest value at most 1.5 × IQR of the lower hinge. Lines in magenta, cyan and green indicate 1, 0.5 and 0 values, respectively. Total number of cells: Fezf2 Het E15.5 = 6,092, Fezf2 knockout E15.5 = 6,110, Fezf2 Het P1 = 5,101, Fezf2 knockout P1 = 4,235.
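The training scheme in panel f (a fixed number of cells per cluster for training, the remainder held out for testing) can be sketched as a per-cluster split. This is only an illustration of the sampling step, not the authors' pipeline: the function name, the fixed seed and the handling of clusters with fewer cells than the training quota are all assumptions, and the classifier itself is not shown.

```python
import random
from collections import defaultdict


def split_per_cluster(labels, n_train=1000, seed=0):
    """Split cells into training and held-out sets, cluster by cluster.

    labels: dict mapping cell id -> cluster label.
    Returns (train_ids, test_ids).
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for cell, cluster in labels.items():
        by_cluster[cluster].append(cell)

    train, test = [], []
    for cluster in sorted(by_cluster):
        cells = sorted(by_cluster[cluster])  # fixed order before shuffling
        rng.shuffle(cells)
        # Assumption: clusters smaller than n_train go entirely to training.
        train.extend(cells[:n_train])
        test.extend(cells[n_train:])
    return train, test


# Toy labels (not real data): five cells in cluster "A", three in "B".
toy_labels = {f"cell{i}": ("A" if i < 5 else "B") for i in range(8)}
train_ids, test_ids = split_per_cluster(toy_labels, n_train=3)
```

Sampling the same number of cells from every cluster keeps abundant cell types from dominating the classifier, and the held-out cells give an accuracy estimate on data not used for training.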
Extended Data Fig. 12 CFuPN acquire CThPN-like and layers 5&6 CPN-like identities in the absence of Fezf2.
Related to Fig. 5. a–f, Two subtypes of deep-layers knockout cells were identified at E15.5. Sub-clustering of deep-layers knockout-exclusive cells alone at E15.5 (a) shows a Satb2-low, Bcl11b-high cluster (cluster 0), and a Satb2-high cluster expressing also CPN markers Cux1 and Pou3f2 (cluster 1), as indicated in the violin plots (b). Differential expression analysis between both subtypes indicates enrichment of CFuPN genes in cluster 0 and CPN genes in cluster 1 (c). d, Comparison to neurons in E15.5 wild-type data showing overlap between differentially expressed genes and markers from E15.5 neuronal subtypes. Bars indicate the number of overlapping genes and are coloured by the adjusted P value calculated by hypergeometric test for significant enrichment. e, Classification of cells from both E15.5 knockout-specific clusters according to random forest classifier shows good agreement between both annotations. f, NMF module expression (as in Extended Data Fig. 11d) in the knockout-specific cells, grouped according to the cell type assigned by the random forest classifier. g, h, Sub-clustering (g) and differential expression analysis (h) of deep-layers knockout-exclusive cells alone at P1 reveals two subpopulations that correspond to CThPN-like and layers 5&6 CPN-like populations. i, Classification of cells from both P1 knockout-specific clusters according to random forest classifier shows good agreement between both annotations. j, Differential expression analysis of the aberrant layer 5&6 CPN-like cells from the knockout-exclusive populations at P1 compared to layers 5&6 CPN (left) or SCPN (right) populations in the control. k, Differential expression analysis of the aberrant CThPN-like cells from the knockout-exclusive populations at P1 compared to CThPN in the control.
l, m, In situ hybridization against Bcl11b and Lpl (l) or Ptn (m), in P1 control (wild type) and Fezf2 knockout coronal sections, showing higher levels of expression of Lpl and Ptn in layers 5 and 6 and reduced Bcl11b in layer 5 (insets to the right correspond to boxes in left panels). Note cells expressing both Bcl11b and Lpl in magnification from layer 6, reflecting an aberrant CThPN identity. Number of positive speckles per 10^4 μm^2. Quantification was calculated with a modified pipeline from CellProfiler from an area of ~200 by 150 μm or ~200 by 100 μm centred in layers 6 or 5, respectively. Data correspond to mean ± s.e.m., from n = 3 mice, > 3 sections each. Unpaired t-test, exact P values indicated. Scale bars, 30 μm; except in higher magnification in l, 15 μm. n, Summary of phenotypes in the Fezf2 KO. CThPN acquire an aberrant identity and upregulate CPN genes, while SCPN convert to layers 5&6 CPN-like neurons, which project through the anterior commissure (ac) to the contralateral cortex. th: thalamus, sc: spinal cord, bs: brain stem. o, Violin plots of number of genes (left) and mRNA molecules (counts; middle), and percentage of mitochondrial counts (right) per cell in control and KO Fezf2 E13.5 single cell transcriptomes, and UMAP visualizations of combined control and KO complete datasets, coloured by genotype or assigned cell type. p, Dorsally-derived cells in Fezf2 control and KO E13.5 scRNA-seq, visualized via UMAP and coloured by genotype (left) or cell types (right). Proportion of cells in each cell type, according to their genotype. q, Differential expression analysis between control and KO migrating or immature neurons shows upregulation of a subset of CPN marker genes and downregulation of CFuPN-specific genes.
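The hypergeometric enrichment test mentioned in panel d asks how likely an overlap at least as large as the observed one would be if the two gene lists were unrelated. A minimal sketch of the unadjusted tail probability follows. The function name and the toy numbers are hypothetical, the choice of gene universe is an assumption that the legend does not specify, and the multiple-testing adjustment is not shown.

```python
from math import comb


def hypergeom_enrichment_p(overlap, n_markers, n_de_genes, universe):
    """Probability of seeing >= overlap marker genes among n_de_genes genes
    drawn without replacement from a universe containing n_markers markers."""
    total = comb(universe, n_de_genes)
    upper = min(n_markers, n_de_genes)
    tail = sum(
        comb(n_markers, k) * comb(universe - n_markers, n_de_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (not from the study): 10-gene universe, 5 markers,
# 5 differentially expressed genes, all 5 overlapping.
p_value = hypergeom_enrichment_p(overlap=5, n_markers=5, n_de_genes=5, universe=10)
```

With these toy numbers only one of the 252 possible draws of 5 genes gives a complete overlap, so the tail probability is 1/252.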
Marker genes used for mapping scRNA-seq data onto Slide-seq using Tangram.
Differentially expressed genes between apical progenitors of all ages in the neuronal vs. glial branch of the inferred tree.
Genes that constitute the cascades plotted in Extended Data Fig. 7, in the same order as plotted.
Genes and gene scores in NMF modules calculated at each time point of the developmental wild-type atlas.
About this article
Cite this article
Di Bella, D.J., Habibi, E., Stickels, R.R. et al. Molecular logic of cellular diversification in the mouse cerebral cortex. Nature 595, 554–559 (2021). https://doi.org/10.1038/s41586-021-03670-5