Inherited myeloproliferative neoplasm risk affects haematopoietic stem cells


Myeloproliferative neoplasms (MPNs) are blood cancers that are characterized by the excessive production of mature myeloid cells and arise from the acquisition of somatic driver mutations in haematopoietic stem cells (HSCs). Epidemiological studies indicate a substantial heritable component of MPNs that is among the highest known for cancers (ref. 1). However, only a limited number of genetic risk loci have been identified, and the underlying biological mechanisms that lead to the acquisition of MPNs remain unclear. Here, by conducting a large-scale genome-wide association study (3,797 cases and 1,152,977 controls), we identify 17 MPN risk loci (P < 5.0 × 10⁻⁸), 7 of which have not been previously reported. We find that there is a shared genetic architecture between MPN risk and several haematopoietic traits from distinct lineages; that there is an enrichment for MPN risk variants within accessible chromatin of HSCs; and that increased MPN risk is associated with longer telomere length in leukocytes and other clonal haematopoietic states. Collectively, these findings suggest that MPN risk is associated with the function and self-renewal of HSCs. We use gene mapping to identify modulators of HSC biology linked to MPN risk, and show through targeted variant-to-function assays that CHEK2 and GFI1B have roles in altering the function of HSCs to confer disease risk. Overall, our results reveal a previously unappreciated mechanism for inherited MPN risk through the modulation of HSC function.

Fig. 1: Genetic architecture of inherited MPN risk.
Fig. 2: Functional enrichments in MPN risk.
Fig. 3: Target genes for MPN risk.
Fig. 4: Characterizing the mechanisms of two MPN risk variants.

Data availability

Summary statistics for variants with fine-mapped PP > 0.1% from the full GWAS meta-analysis (UKBB, FinnGen and 23andMe) are available in Supplementary Table 5. Full summary statistics from 23andMe data cannot be reported owing to a clause in the 23andMe data transfer agreement, intended to protect the privacy of the 23andMe research participants. Thus, we provide full summary statistics for the MPN meta-analysis comprising UK Biobank and FinnGen cohorts on GWAS Catalog under the accession code GCST90000032. To fully recreate our meta-analysis results for MPN, researchers can (1) obtain MPN summary statistics from 23andMe; and (2) conduct a meta-analysis of our summary statistics with the 23andMe summary statistics. Downloads of FinnGen summary statistics, information on how to access individual-level FinnGen data by application to responsible agencies (FinBB and THL), and other collaborative access inquiries are handled through FinnGen. Individual genetic and phenotypic data for the following cohorts are available by application: UKBB and Million Veteran Program.
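As an illustration of step (2), the sketch below combines per-cohort summary statistics for a single variant with an inverse-variance-weighted fixed-effect meta-analysis. It is a minimal example under stated assumptions, not the authors' pipeline: the function name and the input values are hypothetical, and a real analysis would first harmonize effect alleles across cohorts.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis of one variant.

    betas: per-cohort log odds ratios for the same effect allele
    ses:   per-cohort standard errors of those log odds ratios
    """
    weights = [1.0 / se ** 2 for se in ses]           # precision of each cohort
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))                # pooled standard error
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))               # two-tailed P value
    return beta, se, z, p


# Hypothetical values for two cohorts (illustration only, not study data)
beta, se, z, p = fixed_effect_meta([0.20, 0.30], [0.10, 0.10])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

Cohorts with smaller standard errors receive larger weights, so large cohorts dominate the pooled estimate.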

Code availability

Code and source data required for reproducing results and figures discussed herein are available on GitHub.


References

  1. Sud, A. et al. Familial risks of acute myeloid leukemia, myelodysplastic syndromes, and myeloproliferative neoplasms. Blood 132, 973–976 (2018).
  2. Landgren, O. et al. Increased risks of polycythemia vera, essential thrombocythemia, and myelofibrosis among 24,577 first-degree relatives of 11,039 patients with myeloproliferative neoplasms in Sweden. Blood 112, 2199–2204 (2008).
  3. Brewer, H. R., Jones, M. E., Schoemaker, M. J., Ashworth, A. & Swerdlow, A. J. Family history and risk of breast cancer: an analysis accounting for family structure. Breast Cancer Res. Treat. 165, 193–200 (2017).
  4. Albright, F. et al. Prostate cancer risk prediction based on complete prostate cancer family history. Prostate 75, 390–398 (2015).
  5. Johns, L. E. & Houlston, R. S. A systematic review and meta-analysis of familial colorectal cancer risk. Am. J. Gastroenterol. 96, 2992–3003 (2001).
  6. Tapper, W. et al. Genetic variation at MECOM, TERT, JAK2 and HBS1L-MYB predisposes to myeloproliferative neoplasms. Nat. Commun. 6, 6691 (2015).
  7. Hinds, D. A. et al. Germ line variants predispose to both JAK2 V617F clonal hematopoiesis and myeloproliferative neoplasms. Blood 128, 1121–1128 (2016).
  8. Bulik-Sullivan, B. K. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).
  9. Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).
  10. Jones, A. V. et al. JAK2 haplotype is a major risk factor for the development of myeloproliferative neoplasms. Nat. Genet. 41, 446–449 (2009).
  11. Olcaydu, D. et al. A common JAK2 haplotype confers susceptibility to myeloproliferative neoplasms. Nat. Genet. 41, 450–454 (2009).
  12. Kilpivaara, O. et al. A germline JAK2 SNP is associated with predisposition to the development of JAK2 V617F-positive myeloproliferative neoplasms. Nat. Genet. 41, 455–459 (2009).
  13. Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).
  14. Ulirsch, J. C. et al. Interrogation of human hematopoiesis at single-cell and single-variant resolution. Nat. Genet. 51, 683–693 (2019).
  15. Kimura, M. et al. Synchrony of telomere length among hematopoietic cells. Exp. Hematol. 38, 854–859 (2010).
  16. Morrison, S. J., Prowse, K. R., Ho, P. & Weissman, I. L. Telomerase activity in hematopoietic cells is associated with self-renewal potential. Immunity 5, 207–216 (1996).
  17. Yamaguchi, H. et al. Mutations in TERT, the gene for telomerase reverse transcriptase, in aplastic anemia. N. Engl. J. Med. 352, 1413–1424 (2005).
  18. Li, C. et al. Genome-wide association analysis in humans links nucleotide metabolism to leukocyte telomere length. Am. J. Hum. Genet. 106, 389–404 (2020).
  19. Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
  20. Bick, A. G. et al. Inherited causes of clonal haematopoiesis in 97,691 whole genomes. Nature (2020).
  21. Szklarczyk, D. et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 43, D447–D452 (2015).
  22. Garrison, B. S. et al. ZFP521 regulates murine hematopoietic stem cell function and facilitates MLL-AF9 leukemogenesis in mouse and human cells. Blood 130, 619–624 (2017).
  23. Rodrigues, N. P. et al. Haploinsufficiency of GATA-2 perturbs adult hematopoietic stem-cell homeostasis. Blood 106, 477–484 (2005).
  24. Kataoka, K. et al. Evi1 is essential for hematopoietic stem cell self-renewal, and its expression marks hematopoietic cells with long-term multilineage repopulating activity. J. Exp. Med. 208, 2403–2416 (2011).
  25. Tober, J., Yzaguirre, A. D., Piwarzyk, E. & Speck, N. A. Distinct temporal requirements for Runx1 in hematopoietic progenitors and stem cells. Development 140, 3765–3776 (2013).
  26. Cabezas-Wallscheid, N. et al. Identification of regulatory networks in HSCs and their immediate progeny via integrated proteome, transcriptome, and DNA methylome analysis. Cell Stem Cell 15, 507–522 (2014).
  27. Ito, K. et al. Regulation of oxidative stress by ATM is required for self-renewal of haematopoietic stem cells. Nature 431, 997–1002 (2004).
  28. Tothova, Z. et al. FoxOs are critical mediators of hematopoietic stem cell resistance to physiologic oxidative stress. Cell 128, 325–339 (2007).
  29. Moran-Crusio, K. et al. Tet2 loss leads to increased hematopoietic stem cell self-renewal and myeloid transformation. Cancer Cell 20, 11–24 (2011).
  30. Akada, H. et al. Critical role of Jak2 in the maintenance and function of adult hematopoietic stem cells. Stem Cells 32, 1878–1889 (2014).
  31. Buza-Vidas, N. et al. Cytokines regulate postnatal hematopoietic stem cell expansion: opposing roles of thrombopoietin and LNK. Genes Dev. 20, 2018–2023 (2006).
  32. Seita, J. et al. Lnk negatively regulates self-renewal of hematopoietic stem cells by modifying thrombopoietin-mediated signal transduction. Proc. Natl Acad. Sci. USA 104, 2349–2354 (2007).
  33. Allsopp, R. C., Morin, G. B., DePinho, R., Harley, C. B. & Weissman, I. L. Telomerase is required to slow telomere shortening and extend replicative lifespan of HSCs during serial transplantation. Blood 102, 517–520 (2003).
  34. Cai, Z., Chehab, N. H. & Pavletich, N. P. Structure and activation mechanism of the CHK2 DNA damage checkpoint kinase. Mol. Cell 35, 818–829 (2009).
  35. Falck, J., Mailand, N., Syljuåsen, R. G., Bartek, J. & Lukas, J. The ATM–Chk2–Cdc25A checkpoint pathway guards against radioresistant DNA synthesis. Nature 410, 842–847 (2001).
  36. Zipin-Roitman, A. et al. SMYD2 lysine methyltransferase regulates leukemia cell growth and regeneration after genotoxic stress. Oncotarget 8, 16712–16727 (2017).
  37. Khandanpour, C. et al. Evidence that growth factor independence 1b regulates dormancy and peripheral blood mobilization of hematopoietic stem cells. Blood 116, 5149–5161 (2010).
  38. Polfus, L. M. et al. Whole-exome sequencing identifies loci associated with blood cell traits and reveals a role for alternative GFI1B splice variants in human hematopoiesis. Am. J. Hum. Genet. 99, 481–488 (2016).
  39. Vassen, L. et al. Growth factor independence 1b (Gfi1b) is important for the maturation of erythroid cells and the regulation of embryonic globin expression. PLoS One 9, e96636 (2014).
  40. Lundberg, P. et al. Myeloproliferative neoplasms can be initiated from a single hematopoietic stem cell expressing JAK2-V617F. J. Exp. Med. 211, 2213–2230 (2014).
  41. Mansier, O. et al. Description of a knock-in mouse model of JAK2V617F MPN emerging from a minority of mutated hematopoietic stem cells. Blood 134, 2383–2387 (2019).
  42. Musa, J. et al. Cooperation of cancer drivers with regulatory germline variants shapes clinical outcomes. Nat. Commun. 10, 4128 (2019).
  43. Thompson, D. J. et al. Genetic predisposition to mosaic Y chromosome loss in blood. Nature 575, 652–657 (2019).
  44. Loh, P.-R., Genovese, G. & McCarroll, S. A. Monogenic and polygenic inheritance become instruments for clonal selection. Nature 584, 136–141 (2020).
  45. Terao, C. et al. Chromosomal alterations among age-related haematopoietic clones in Japan. Nature 584, 130–135 (2020).
  46. Naucler, P. et al. Human papillomavirus and Papanicolaou tests to screen for cervical cancer. N. Engl. J. Med. 357, 1589–1597 (2007).
  47. Løberg, M. et al. Long-term colorectal-cancer mortality after adenoma removal. N. Engl. J. Med. 371, 799–807 (2014).
  48. Cimmino, L. et al. Restoration of TET2 function blocks aberrant self-renewal and leukemia progression. Cell 170, 1079–1095 (2017).
  49. Chen, J. et al. Myelodysplastic syndrome progression to acute myeloid leukemia at the stem cell level. Nat. Med. 25, 103–110 (2019).
  50. Agathocleous, M. et al. Ascorbate regulates haematopoietic stem cell function and leukaemogenesis. Nature 549, 476–481 (2017).
  51. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
  52. Loh, P.-R. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284–290 (2015).
  53. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
  54. Hunter-Zinck, H. et al. Measuring genetic variation in the multi-ethnic Million Veteran Program (MVP). Preprint at bioRxiv (2020).
  55. Nielsen, C., Birgens, H. S., Nordestgaard, B. G. & Bojesen, S. E. Diagnostic value of JAK2 V617F somatic mutation for myeloproliferative cancer in 49 488 individuals from the general population. Br. J. Haematol. 160, 70–79 (2013).
  56. Magosi, L. E., Goel, A., Hopewell, J. C. & Farrall, M. Identifying systematic heterogeneity patterns in genetic association meta-analysis studies. PLoS Genet. 13, e1006755 (2017).
  57. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
  58. Michailidou, K. et al. Association analysis identifies 65 new breast cancer risk loci. Nature 551, 92–94 (2017).
  59. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
  60. Roaldsnes, C., Holst, R., Frederiksen, H. & Ghanima, W. Myeloproliferative neoplasms: trends in incidence, prevalence and survival in Norway. Eur. J. Haematol. 98, 85–93 (2017).
  61. Höglund, M., Sandin, F. & Simonsson, B. Epidemiology of chronic myeloid leukaemia: an update. Ann. Hematol. 94, 241–247 (2015).
  62. Finucane, H. K. et al. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat. Genet. 50, 621–629 (2018).
  63. Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
  64. Walker, C. J. et al. Genome-wide association study identifies an acute myeloid leukemia susceptibility locus near BICRA. Leukemia 33, 771–775 (2019).
  65. Coetzee, S. G., Coetzee, G. A. & Hazelett, D. J. motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites. Bioinformatics 31, 3847–3849 (2015).
  66. Kulakovskiy, I. V. et al. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP–seq analysis. Nucleic Acids Res. 46, D252–D259 (2018).
  67. Hemani, G. et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).
  68. Verbanck, M., Chen, C.-Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).
  69. Sanna, S. et al. Causal relationships among the gut microbiome, short-chain fatty acids and metabolic diseases. Nat. Genet. 51, 600–605 (2019).
  70. Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
  71. Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).
  72. McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
  73. Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384 (2016).
  74. Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).
  75. de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLOS Comput. Biol. 11, e1004219 (2015).
  76. Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
  77. Grinfeld, J. et al. Classification and personalized prognosis in myeloproliferative neoplasms. N. Engl. J. Med. 379, 1416–1430 (2018).
  78. Raudvere, U. et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. 47, W191–W198 (2019).
  79. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
  80. Pellin, D. et al. A comprehensive single cell transcriptional landscape of human hematopoietic progenitors. Nat. Commun. 10, 2395 (2019).
  81. van Dijk, D. et al. Recovering gene interactions from single-cell data using data diffusion. Cell 174, 716–729 (2018).
  82. Delano, W. L. The PyMOL Molecular Graphics System. (2002).
  83. Milyavsky, M. et al. A distinctive DNA damage response in human hematopoietic stem cells reveals an apoptosis-independent role for p53 in self-renewal. Cell Stem Cell 7, 186–197 (2010).
  84. Piacibello, W. et al. Lentiviral gene transfer and ex vivo expansion of human primitive stem cells capable of primary, secondary, and tertiary multilineage repopulation in NOD/SCID mice. Blood 100, 4391–4400 (2002).
  85. Cohen, S. et al. Hematopoietic stem cell transplantation using single UM171-expanded cord blood: a single-arm, phase 1–2 safety and feasibility study. Lancet Haematol. 7, e134–e145 (2020).
  86. Fares, I. et al. Cord blood expansion. Pyrimidoindole derivatives are agonists of human hematopoietic stem cell self-renewal. Science 345, 1509–1512 (2014).
  87. Tomellini, E. et al. Integrin-α3 is a functional marker of ex vivo expanded human long-term hematopoietic stem cells. Cell Rep. 28, 1063–1073 (2019).

Acknowledgements

We thank members of the Sankaran laboratory for comments; W. Zhou for technical guidance on the implementation of SAIGE; and the research participants and employees of 23andMe, UKBB, FinnGen and the Million Veteran Program. This research has been conducted using the UKBB Resource under application 31063. E.L.B. received support from the Howard Hughes Medical Institute Medical Research Fellowship. S.K.N. received support through a Scholar Award from the American Society of Hematology. This work was supported by the Claudia Adams Barr Program for Innovative Cancer Research, the New York Stem Cell Foundation, the MPN Research Foundation, the Leukemia & Lymphoma Society, and National Institutes of Health grants (R01 DK103794 and R01 HL146500 to V.G.S.). V.G.S. is a New York Stem Cell Foundation-Robertson Investigator.

Author information





E.L.B. and V.G.S. conceived the study. E.L.B., S.K.N. and V.G.S. designed the study. S.K.N., X.L., O.I.G., D.E.K. and M.M. performed experiments. E.L.B., X.L., A.G.B., J.K., M.T., A.H., T.K., C.A.L., A.L.d.L.P., D.E.K., B.L. and C.E. performed computational and statistical analyses. C.C. and B.M.N. contributed to genetic analysis of UKBB. A.G.B., C.E., P.N., P.W.F.W., K.C., S.P., J.M.G., C.J.O. and S.K. contributed to genetic analysis of the Million Veteran Program. J.K., A.H., T.K., A.P. and M.J.D. contributed to genetic analysis of FinnGen. M.T., B.L. and A.R. contributed to analysis of the Human Cell Atlas. V.C., C.P.N. and N.J.S. contributed to genetic analysis of leukocyte telomere length. C.J.W. and A.d.l.C. contributed to genetic analysis of AML. A.L.d.L.P., B.N., J.E.D. and M.M. contributed ideas and insights. V.G.S. supervised all experimental and analytic aspects of this work. E.L.B., S.K.N., X.L. and V.G.S. wrote the manuscript with input from all authors. All authors read and approved the final version of the manuscript.

Corresponding author

Correspondence to Vijay G. Sankaran.

Ethics declarations

Competing interests

P.N. reports research grants from Amgen, Apple and Boston Scientific, and is a scientific advisor to Apple and Blackstone Life Sciences, all unrelated to the present work. A.R. is a cofounder of and equity holder in Celsius Therapeutics, and a member of the scientific advisory boards for Thermo Fisher Scientific, Neogene Therapeutics and Syros Pharmaceuticals. S.K. is an employee of Verve Therapeutics, and holds equity in Verve Therapeutics, Maze Therapeutics, Catabasis and San Therapeutics. He is a member of the scientific advisory boards for Regeneron Genetics Center and Corvidia Therapeutics, and he has served as a consultant for Acceleron, Eli Lilly, Novartis, Merck, Novo Nordisk, Novo Ventures, Ionis, Alnylam, Aegerion, Haug Partners, Noble Insights, Leerink Partners, Bayer Healthcare, Illumina, Color Genomics, MedGenome, Quest and Medscape. The remaining authors declare no competing interests.

Additional information

Peer review information Nature thanks Ross Levine, Stephen Chanock and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Flowchart of genetic association analyses.

Flowchart of the quality control steps and analysis methods for the three discovery-phase genome-wide association studies in the UK Biobank, 23andMe and FinnGen, followed by replication in the Million Veteran Program.

Extended Data Fig. 2 MPN GWAS cohort-specific effect sizes.

a, Forest plot displaying cohort-specific odds ratios for lead variants of the 17 loci reaching genome-wide significance after replication. Sample sizes are: UKBB, n = 1,086 cases and 407,155 controls; 23andMe, n = 1,223 cases and 252,140 controls; FinnGen, n = 640 cases and 176,259 controls; MVP, n = 848 cases and 317,423 controls. Data represent odds ratios and 95% CI. b, Overall correlation of effect sizes between MVP cohort and combined discovery cohort (UKBB + 23andMe + FinnGen) for all 24 variants reaching suggestive significance (P < 1 × 10⁻⁶) which underwent replication (P = 3.76 × 10⁻⁵, two-tailed Pearson correlation). c, Forest plot displaying cohort-specific odds ratios for lead variants of the three most-significant loci in the meta-analysis: the JAK2 46/1 haplotype and two independent signals at the TERT locus. MVP_jak2 = JAK2V617F phenotype in MVP, MVP_jak2_or_mpn = JAK2V617F or ICD-based MPN definition in MVP. Data are odds ratios and 95% CI. Sample sizes are: UKBB, n = 1,086 cases and 407,155 controls; 23andMe, n = 1,223 cases and 252,140 controls; FinnGen, n = 640 cases and 176,259 controls; MVP_jak2, n = 848 cases and 317,423 controls; MVP_jak2_or_mpn, n = 2,203 cases and 218,607 controls.
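Forest plots of this kind show odds ratios with 95% confidence intervals, which are typically derived from a log odds ratio and its standard error. The following is a minimal sketch of that conversion, assuming a normal approximation on the log scale; the function name and the input values are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(beta, se, z_crit=1.96):
    """Convert a log odds ratio and its standard error to (OR, lower, upper).

    z_crit=1.96 gives an approximate 95% interval under a normal
    approximation on the log-odds scale.
    """
    return (
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
    )


# Hypothetical effect estimate for one variant in one cohort
or_, lower, upper = odds_ratio_ci(beta=0.25, se=0.07)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval is symmetric on the log scale, it appears asymmetric around the odds ratio itself.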

Extended Data Fig. 3 Assessing the distribution and prevalence of the MPN polygenic risk score in UK Biobank.

a, Density distribution of the MPN PRS within the UK Biobank. b, Receiver operating characteristic curves for MPN predictions (n = 1,086 cases and 407,155 controls), using information from age, sex, genotyping array and ancestry-informed principal components (AUC1, blue) alone, or with the addition of PRS (AUC2, orange). c, Odds ratio (mean and 95% CI) for MPN acquisition according to deciles of the PRS (n = 1,086 cases and 407,155 controls), with decile 1 (10% of individuals with lowest PRS) as the reference group. d, Prevalence of MPN within each decile of the PRS in the UK Biobank population (n = 1,086 MPN cases, 407,155 controls). e, MPN cases and controls in the UK Biobank were stratified into three groups according to their PRS – low, intermediate, and high defined as the lowest quintile, the middle three quintiles, and the highest quintile of the PRS distribution respectively. For carriers and non-carriers of the JAK2 46/1 haplotype, the odds ratio for MPN was calculated in a logistic regression model with PRS group, age, sex and the top ten principal components of ancestry as covariates. Non-carriers with intermediate PRS served as the reference group. Data are odds ratios and 95% CI. f, Fine-mapped 95% credible sets for all 25 MPN risk loci reaching suggestive significance, stratified by the number of variants comprising each credible set. g, The fine-mapped posterior probability of causality for the highest fine-mapped variant in each locus credible set. h, Variants within the 95% credible sets and PP > 0.001 across all regions, grouped by genomic annotation.
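The decile comparison in panels c and d can be sketched as follows, assuming the PRS is a weighted sum of risk-allele dosages and that each decile is compared with decile 1 through an unadjusted odds ratio. This is an illustration only: the caption describes a covariate-adjusted logistic regression for panel e, which this sketch does not reproduce, and all function names and numbers below are hypothetical.

```python
import math


def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0 to 2) using per-variant log odds ratios."""
    return sum(d * w for d, w in zip(dosages, weights))


def decile_of(score, sorted_scores):
    """Return the decile (1 to 10) of a score within a sorted reference distribution."""
    rank = sum(1 for s in sorted_scores if s <= score)   # scores at or below this one
    return min(10, max(1, math.ceil(10 * rank / len(sorted_scores))))


def odds_ratio_vs_reference(cases, controls, ref_cases, ref_controls):
    """Unadjusted odds ratio with a Woolf (log-scale) 95% CI against a reference group."""
    or_ = (cases * ref_controls) / (controls * ref_cases)
    se = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
    return or_, or_ * math.exp(-1.96 * se), or_ * math.exp(1.96 * se)


# Toy inputs (not study data)
prs = polygenic_score([0, 1, 2], [0.1, 0.2, 0.3])
reference = list(range(1, 101))                  # stand-in for a sorted PRS distribution
top = decile_of(95, reference)
or_, lower, upper = odds_ratio_vs_reference(20, 980, 10, 990)
print(prs, top, round(or_, 2))
```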

Extended Data Fig. 4 Shared genetic associations between MPN risk and other phenotypes.

a, Schematic depicting the trajectory of undifferentiated haematopoietic stem and progenitor cells (HSPCs) into various committed cell types: lymphocytes (LYMPH), monocytes (MONO), neutrophils (NEUT), basophils (BASO), eosinophils (EO), red blood cells (RBC) and platelets (PLT). b, Regional association plots at the TERT locus (±50 kb from lead variant), showing the associations of variants with leukocyte telomere length and MPN. The colours of the points depict pairwise LD (r²) to sentinel variant rs7705526. The two conditionally independent lead variants for both traits, rs7705526 and rs2853677, are labelled. c, Individual single-nucleotide polymorphisms (SNPs) associated with telomere length and their effect sizes on MPN risk (n = 2,949 cases and 835,554 controls), calculated using the fixed effects meta-analysis method. Aggregate Mendelian randomization (MR) effects, calculated from three different methods (weighted median, inverse-variance weighted and Egger regression), are shown at the bottom. Data are MR effect sizes and standard errors. Red colour indicates significance. d, MR leave-one-out sensitivity analysis, showing MR effect estimates using the inverse variance weighted approach after excluding each individual SNP from the analysis (n = 2,949 cases and 835,554 controls). Data are MR effect sizes and standard errors. e, PheWAS of MPN risk variants. We tested fine-mapped MPN risk variants (PP > 0.10 or lead variant) for associations with 1,130 well-represented case–control phenotypes from the UKBB, calculated by two-tailed logistic mixed model association test. Shown in this heat map are the top MPN-associated variants at each locus with one or more associations reaching Bonferroni-corrected significance (P = 0.05/1,130 phenotypes = 4.4 × 10⁻⁵, or abs(z-score) = 4.08). Heat map colour indicates association z-score. All variant effects are oriented with respect to the risk-increasing MPN allele. Phenotypes are divided into major clinical categories, as listed in the annotations above the heat map.
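The Bonferroni threshold quoted for the PheWAS in panel e can be checked with a few lines of arithmetic. The sketch below assumes a two-tailed test against a standard normal distribution; the variable names are illustrative.

```python
from statistics import NormalDist

n_tests = 1130      # phenotypes tested, as stated in the caption
alpha = 0.05        # family-wise significance level

# Bonferroni correction: divide alpha by the number of tests
p_threshold = alpha / n_tests

# Two-tailed test: half of the P value sits in each tail of the normal distribution
z_threshold = NormalDist().inv_cdf(1 - p_threshold / 2)

print(f"P threshold = {p_threshold:.2e}, |z| threshold = {z_threshold:.2f}")
```

The two numbers correspond to the P value and absolute z-score reported in the caption.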

Extended Data Fig. 5 Characterizing MPN target genes.

a, Target genes prioritized on the basis of noncoding criteria (red boxes) and coding consequences (blue boxes) and scored based on the number of criteria met. Only the highest-scoring gene per locus is reported, and for noncoding loci, only genes with a score of 2 or more are reported. b, Average expression (log2-transformed counts per million (CPM)) of MPN target genes (n = 15) across 16 primary haematopoietic cell types. Black diamonds indicate the mean expression of all non-zero expressed protein-coding genes in each cell type. Box plots show the median at the centre, with the top and bottom of the box indicating the interquartile range. Whiskers extend either to the maximum and minimum value or to 1.5 × the interquartile range. c, Protein–protein interaction network showing known and predicted associations between the protein products of MPN target genes, generated with the STRING database. d, Top-enriched biological annotations for MPN target genes identify key pathways associated with haematopoiesis and oncogenesis.
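For readers unfamiliar with the quantities in panel b, the sketch below shows one way to compute log2-transformed counts per million and the summary statistics behind a box plot. It makes several assumptions that the caption does not state: a pseudocount of 1 is added before the log transform, quartiles use the inclusive method, and whiskers follow the common Tukey convention of stopping at the most extreme data point within 1.5 × the interquartile range. The function names and values are hypothetical.

```python
import math
from statistics import quantiles


def log2_cpm(count, library_size, pseudocount=1.0):
    """log2 counts per million; the pseudocount (an assumption) avoids log2(0)."""
    return math.log2(count * 1e6 / library_size + pseudocount)


def box_stats(values):
    """Median, quartiles and whiskers limited to 1.5 x IQR beyond the box."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences
    lower = min(v for v in values if v >= low_fence)
    upper = max(v for v in values if v <= high_fence)
    return {"q1": q1, "median": median, "q3": q3, "lower": lower, "upper": upper}


# Toy expression values with one outlier (not study data)
stats = box_stats([1, 2, 3, 4, 5, 100])
print(stats)
print(log2_cpm(1, 1_000_000))
```

With this convention, the outlying value of 100 falls beyond the upper fence and would be drawn as an individual point rather than extending the whisker.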

Extended Data Fig. 6 Structural basis for CHEK2 homodimer disruption by mutation of Ile157.

a, The crystal structure of the CHEK2 (forkhead-associated (FHA) kinase domain) homodimer (PDB: 3I6U). The FHA domain of molecule A (mol A) is shown in cyan and the kinase domain is coloured green. A second CHEK2 (mol B) has both domains coloured white. The two CHEK2 molecules are nearly symmetric, coiling around the central axis (black rod). The location of each Ile157 residue is marked with an asterisk. b, A magnified window showing details of the interactions. Ile157 links the FHA of one CHEK2 molecule (white) to the kinase domain of a second (green). The side chain of I157 mediates an FHA–kinase hydrophobic interface, interacting with Phe238 and Leu236 on the kinase domain. c, The second interface of the CHEK2 dimer (180° rotation from b) is nearly identical. A threonine residue at position 157 would diminish these hydrophobic interfaces and destabilize the CHEK2 dimer, as has been previously reported (ref. 34).

Extended Data Fig. 7 CHEK2 is required for apoptosis of cycling HSPCs, but not for lineage commitment.

a, Assessment of IR-induced cell death of cycling HSPCs and myeloid progenitors following sublethal irradiation, after treatment with CHEK2 inhibitor (n = 3) or DMSO control (n = 3) (two-sided paired t-test). n is the number of biologically independent experiments. Data are mean ± s.e.m. b, Numbers (left) and percent (right) of HSPC colonies formed after CHEK2 inhibition (CHEK2 inhibitor II, Sigma 220486) (n = 4) versus DMSO control (n = 4). n is the number of biologically independent experiments. Data are mean ± s.e.m.

Extended Data Fig. 8 Supplementary data for variant-to-function studies at the GFI1B locus.

a, Map of the lentiviral constructs designed to assess enhancer activity at rs524137. b, Histogram displays GFP mean fluorescence intensity (MFI) of haematopoietic K562 cells infected with promoter-only versus promoter-and-enhancer lentiviral constructs. Compared to mock uninfected control cells, cells infected with the construct carrying both GFI1B promoter and enhancer show greater GFP intensity. c, FACS gating for sorting and identifying the primitive CD34+CD45RA−CD90+CD133+EPCR+ITGA3+ LT-HSC population in the day-7 CD34+ HSPCs presented in Fig. 4g–i. d, Schematic of colony-replating assays using human HSPCs edited with GFI1B coding (CDS) or enhancer guides (ENH). e, Representative western blot measuring GFI1B protein expression 5 days after CRISPR–Cas9 targeting with non-targeting control (NT), or coding regions of GFI1B (g1, g2). Lamin B expression was used as a loading control. Lamin B controls were probed on the same blot as the GFI1B. Similar results were obtained in three independent experiments. For gel source data, see Supplementary Fig. 3.

Extended Data Fig. 9 Schematics of the variant-to-function arcs for MPN risk loci.

a, CHEK2; b, GFI1B.

Extended Data Table 1 Genome-wide-significant loci from the MPN GWAS

Supplementary information

Supplementary Information

This file contains Supplementary Figs 1-3, a Supplementary Note, a list of contributors of the Million Veteran Program, and Supplementary References.

Reporting Summary

Supplementary Tables

Cite this article

Bao, E.L., Nandakumar, S.K., Liao, X. et al. Inherited myeloproliferative neoplasm risk affects haematopoietic stem cells. Nature 586, 769–775 (2020).
