Inherited causes of clonal haematopoiesis in 97,691 whole genomes

An Author Correction to this article was published on 11 March 2021

This article has been updated


Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown [1]. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer [2–4] and coronary heart disease [5]; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP) [6]. Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.

Fig. 1: Identifying CHIP in TOPMed genomes.
Fig. 2: Genetic determinants of CHIP.
Fig. 3: A TET2 locus risk variant specific to donors with African ancestry disrupts the haematopoietic stem cell TET2 enhancer, decreasing TET2 expression and increasing self-renewal.

Data availability

Individual WGS data for TOPMed whole genomes, individual-level harmonized phenotypes, harmonized germline variant call sets, the CHIP somatic variant call sets, RNA-seq and peripheral blood methylation data used in this analysis are available through restricted access via dbGaP. Accession numbers for these datasets are provided in Supplementary Table 1. Summary-level genotype data are available through the BRAVO browser. Full GWAS summary statistics are available for general research use through controlled access at dbGaP accession phs001974: NHLBI TOPMed: Genomic Summary Results for the Trans-Omics for Precision Medicine programme. A subset of the TOPMed cohorts analysed here is based on sensitive populations, precluding public sharing of full genomic summary results.

Change history


References

1. Kennedy, B. K. et al. Geroscience: linking aging to chronic disease. Cell 159, 709–713 (2014).
2. Jaiswal, S. et al. Age-related clonal hematopoiesis associated with adverse outcomes. N. Engl. J. Med. 371, 2488–2498 (2014).
3. Genovese, G. et al. Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence. N. Engl. J. Med. 371, 2477–2487 (2014).
4. Xie, M. et al. Age-related mutations associated with clonal hematopoietic expansion and malignancies. Nat. Med. 20, 1472–1478 (2014).
5. Jaiswal, S. et al. Clonal hematopoiesis and risk of atherosclerotic cardiovascular disease. N. Engl. J. Med. 377, 111–121 (2017).
6. Steensma, D. P. et al. Clonal hematopoiesis of indeterminate potential and its distinction from myelodysplastic syndromes. Blood 126, 9–16 (2015).
7. Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Preprint at (2019).
8. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
9. Loh, P. R. et al. Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature 559, 350–355 (2018).
10. Patel, K. V. et al. Red cell distribution width and mortality in older adults: a meta-analysis. J. Gerontol. A 65, 258–265 (2010).
11. Bick, A. G. et al. Genetic interleukin 6 signaling deficiency attenuates cardiovascular risk in clonal hematopoiesis. Circulation 141, 124–131 (2020).
12. Alexandrov, L. B. et al. Clock-like mutational processes in human somatic cells. Nat. Genet. 47, 1402–1407 (2015).
13. Zink, F. et al. Clonal hematopoiesis, with and without candidate driver mutations, is common in the elderly. Blood 130, 742–752 (2017).
14. Bowman, R. L., Busque, L. & Levine, R. L. Clonal hematopoiesis and evolution to hematopoietic malignancies. Cell Stem Cell 22, 157–170 (2018).
15. Desai, P. et al. Somatic mutations precede acute myeloid leukemia years before diagnosis. Nat. Med. 24, 1015–1023 (2018).
16. Bojesen, S. E. et al. Multiple independent variants at the TERT locus are associated with telomere length and risks of breast and ovarian cancer. Nat. Genet. 45, 371–384 (2013).
17. Bao, E. L. et al. Inherited myeloproliferative neoplasm risk affects haematopoietic stem cells. Nature (2020).
18. Zhou, W. et al. Mosaic loss of chromosome Y is associated with common variation near TCL1A. Nat. Genet. 48, 563–568 (2016).
19. Hinds, D. A. et al. Germ line variants predispose to both JAK2 V617F clonal hematopoiesis and myeloproliferative neoplasms. Blood 128, 1121–1128 (2016).
20. Hu, Y. et al. A statistical framework for cross-tissue transcriptome-wide association analysis. Nat. Genet. 51, 568–576 (2019).
21. Smith, B. W. et al. The aryl hydrocarbon receptor directs hematopoietic progenitor cell expansion and differentiation. Blood 122, 376–385 (2013).
22. Cybulski, C. et al. CHEK2 is a multiorgan cancer susceptibility gene. Am. J. Hum. Genet. 75, 1131–1135 (2004).
23. Rudd, M. F., Sellick, G. S., Webb, E. L., Catovsky, D. & Houlston, R. S. Variants in the ATM–BRCA2–CHEK2 axis predispose to chronic lymphocytic leukemia. Blood 108, 638–644 (2006).
24. Huynh, M. et al. Hyaluronan and proteoglycan link protein 1 (HAPLN1) activates bortezomib-resistant NF-κB activity and increases drug resistance in multiple myeloma. J. Biol. Chem. 293, 2452–2465 (2018).
25. Moran-Crusio, K. et al. Tet2 loss leads to increased hematopoietic stem cell self-renewal and myeloid transformation. Cancer Cell 20, 11–24 (2011).
26. Kilpivaara, O. et al. A germline JAK2 SNP is associated with predisposition to the development of JAK2 V617F-positive myeloproliferative neoplasms. Nat. Genet. 41, 455–459 (2009).
27. Jones, A. V. et al. JAK2 haplotype is a major risk factor for the development of myeloproliferative neoplasms. Nat. Genet. 41, 446–449 (2009).
28. Olcaydu, D. et al. A common JAK2 haplotype confers susceptibility to myeloproliferative neoplasms. Nat. Genet. 41, 450–454 (2009).
29. Young, A. L., Challen, G. A., Birmann, B. M. & Druley, T. E. Clonal haematopoiesis harbouring AML-associated mutations is ubiquitous in healthy adults. Nat. Commun. 7, 12484 (2016).
30. Regier, A. A. et al. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat. Commun. 9, 4038 (2018).
31. Jun, G., Wing, M. K., Abecasis, G. R. & Kang, H. M. An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Res. 25, 918–925 (2015).
32. Karczewski, K. J. et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. Nature 581, 434–443 (2020).
33. Gibson, C. J. et al. Clonal hematopoiesis associated with adverse outcomes after autologous stem-cell transplantation for lymphoma. J. Clin. Oncol. 35, 1598–1605 (2017).
34. Hiatt, J. B., Pritchard, C. C., Salipante, S. J., O’Roak, B. J. & Shendure, J. Single molecule molecular inversion probes for targeted, high-accuracy detection of low-frequency variation. Genome Res. 23, 843–854 (2013).
35. Pérez Millán, M. I. et al. Next generation sequencing panel based on single molecule molecular inversion probes for detecting genetic variants in children with hypopituitarism. Mol. Genet. Genomic Med. 6, 514–525 (2018).
36. Li, Y., Willer, C. J., Ding, J., Scheet, P. & Abecasis, G. R. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 34, 816–834 (2010).
37. Vattathil, S. & Scheet, P. Haplotype-based profiling of subtle allelic imbalance with SNP arrays. Genome Res. 23, 152–158 (2013).
38. Fowler, J., San Lucas, F. A. & Scheet, P. System for quality-assured data analysis: flexible, reproducible scientific workflows. Genet. Epidemiol. 43, 227–237 (2019).
39. Natarajan, P. et al. Deep-coverage whole genome sequences and blood lipids among 16,324 individuals. Nat. Commun. 9, 3391 (2018).
40. Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016).
41. Blokzijl, F., Janssen, R., van Boxtel, R. & Cuppen, E. MutationalPatterns: comprehensive genome-wide analysis of mutational processes. Genome Med. 10, 33 (2018).
42. Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
43. Bates, D., Mächler, M., Bolker, B. & Walker, S. Fitting linear mixed-effects models using lme4. J. Stat. Softw. (2015).
44. Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
45. Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).
46. Fulco, C. P. et al. Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).
47. Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8, 118–127 (2007).
48. Nasser, J. et al. Genome-wide maps of enhancer regulation connect risk variants to disease genes. Preprint at (2020).
49. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
50. DeLuca, D. S. et al. RNA-SeQC: RNA-seq metrics for quality control and process optimization. Bioinformatics 28, 1530–1532 (2012).
51. eGTEx Project. Enhancing GTEx by bridging the gaps between genotype, gene expression, and disease. Nat. Genet. 49, 1664–1670 (2017).
52. Houseman, E. A. et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics 13, 86 (2012).
53. Horvath, S. & Levine, A. J. HIV-1 infection accelerates age according to the epigenetic clock. J. Infect. Dis. 212, 1563–1573 (2015).
54. Barfield, R. T., Kilaru, V., Smith, A. K. & Conneely, K. N. CpGassoc: an R function for analysis of DNA methylation microarray data. Bioinformatics 28, 1280–1281 (2012).



Investigators who conducted this research report individual research support from R35 HL135818 (S.R.), P01 HL132825 (S.T.W.), R01 HL091357 and R01 HL055673 (D.K.A.), W81 XWH-17-1-0597 (D.A.S.), K01 HL135405 (B.E.C.), P01 HL132825 (J.L.-S.), K01HL136700 (S.A.), R01 HL113323 (J.E.C.), R01HL1333040 (D.E.W.), R01 HL138737 (D.D.), P01 HL132825 (P.K.), T32 HL129982 (L.M.R.), R01 HL113323 (J.B.), HHS-N268201800002I (T.W.B. and A.V.S.), U54 GM115428 (J.G.W.), R01 HL148565 and R01 HL148050 (P.N.), F30 HL149180 (S.M.Z.), R01 HL139731 and AHA-18SFRN34250007 (S.L.), DP5 OD029586 (A.G.B.), Claudia Adams Barr Program for Innovative Cancer Research (V.G.S.), R01 142711, MGH Hassenfeld Scholar Award (P.N.), Fondation Leducq TNE-18CVD04 (A.G.B., B.L.E., S.J., P.N. and S.K.), Burroughs Wellcome Foundation (A.G.B. and S.J.), Ludwig Cancer Center (S.J.) and UM1-HG008895 (S.K.). WGS for the Trans-Omics in Precision Medicine (TOPMed) programme was supported by the National Heart, Lung, and Blood Institute (NHLBI). Centralized read mapping and genotype calling, along with variant quality metrics and filtering, were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1; contract HHSN268201800002I). Phenotype harmonization, data management, sample-identity QC and general study coordination were provided by the TOPMed Data Coordinating Center (3R01HL-120393-02S1; contract HHSN268201800001I). We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. The views expressed in this manuscript are those of the authors and do not necessarily represent the views of the National Heart, Lung, and Blood Institute, the National Institutes of Health or the U.S. Department of Health and Human Services.

Author information





A.G.B., P.N. and S.K. conceived the study. A.G.B. and J.S.W. performed the germline and somatic WGS analyses. C.P.F., E.L.B., S.M.Z., M.D.S., M.J.L., J.N., K.C., C.J.G., A.E.L., B.B.B., P.S., J.O.K., J.M.E., A.P.R., B.L.E. and S.J. performed additional bioinformatic analyses. S.K.N., X.L. and V.G.S. experimentally characterized the TET2 locus. M.A.T., F.A., K.A., B.D.M., K.C.B., A.M., M.F., S.R., B.M.P., E.K.S., S.T.W., N.D.P., R.S.V., E.G.B., S.L.R.K., J.H., R.C.K., N.L.S., D.K.A., D.A.S., A.C., M.d.A., X.G., B.A.K., B.C., J.M.P., H.G., D.A.M., S.T.M., I.Y.-D.C., M.B.S., P.A.P., J.G.B., S.M.G., F.F.W., Q.W., M.E.M., M.D., E.E.K., K.E.N., L.J.L., B.E.C., J.C.B., M.H.C., J.L.-S., D.W.B., L.A.C., A.C.M., L.C.B., J.A.S., T.N.K., S.A., S.R.H., H.K.T., I.V.Y., J.A.H., S.L., J.M.J., J.E.C., S.E.W., D.E.W., D.C.R., D.D., J.-Y.M., R.P.T., E.J.B., C.L., N.R., R.J.F.L., P. Durda, Y.L., L.H., J.L., P.K., B.I.F., D.L., L.F.B., J.E.H., J.S.F., E.A.W., P.T.E., M.R.I., T.E.F., L.M.R., S.M.A., M.M.W., E.C.S., J.B., L.K.W., B.D.L., W.H.-H.S., D.M.R., E.B., J.E.M., R.A.M., P. Desai, K.D.T., A.D.J., P.L.A., C.K., C.C.L., T.W.B., A.V.S., H.Z., E.L., L.L., S.S.R., J.I.R., J.G.W., P.S., J.O.K., E.S.L., J.M.E. and G.A. contributed to sample acquisition, DNA sequencing and phenotypic curation for the NHLBI TOPMed constituent cohorts analysed here. A.G.B., J.S.W., S.K. and P.N. wrote the manuscript with input from all authors.

Corresponding authors

Correspondence to Sekar Kathiresan or Pradeep Natarajan.

Ethics declarations

Competing interests

B.M.P. serves on the Steering Committee of the Yale Open Data Access Project funded by Johnson & Johnson. E.K.S. and M.H.C. received grant support from GlaxoSmithKline and Bayer. S.T.W. received royalties from UpToDate. S.A. reports employment and equity in 23andMe. B.I.F. is a consultant for RenalytixAI and AstraZeneca Pharmaceuticals. M.E.M. reports funding from Regeneron Pharmaceuticals, unrelated to this project. M.H.C. has received grant support from GlaxoSmithKline and Bayer and consulting or speaking fees from AstraZeneca and Illumina. J.S.F. has consulted for Shionogi. B.D.L. is a co-founder of Nocion Therapeutics; receives grant support from Pieris Pharmaceuticals, Sanofi and Samsung Research America; and has served as a consultant for Bayer, Entrinsic Health, Gossamer Bio, NControl, Novartis, Teva and Thetis Pharmaceuticals. E.S.L. serves on the board of directors for Codiak BioSciences and serves on the scientific advisory board of F-Prime Capital Partners and Third Rock Ventures. B.L.E. reports grant support from Celgene and Deerfield. P.T.E. has received grant support from Bayer AG and has served on advisory boards or consulted for Bayer AG, Quest Diagnostics, MyoKardia and Novartis. G.A. is an employee of Regeneron Pharmaceuticals and owns stock and stock options for Regeneron Pharmaceuticals. S.J. is a scientific advisor to Grail. S.L. receives sponsored research support from Bristol Myers Squibb, Pfizer, Bayer AG, Boehringer Ingelheim and Fitbit; has consulted for Bristol Myers Squibb, Pfizer and Bayer AG; and participates in a research collaboration with IBM. P.N. reports grant support from Amgen, Apple and Boston Scientific, and is a scientific advisor to Apple. S.K. is an employee of Verve Therapeutics, and holds equity in Verve Therapeutics, Maze Therapeutics, Catabasis and San Therapeutics; is a member of the scientific advisory boards for Regeneron Genetics Center and Corvidia Therapeutics; and has served as a consultant for Acceleron, Eli Lilly, Novartis, Merck, Novo Nordisk, Novo Ventures, Ionis, Alnylam, Aegerion, Haug Partners, Noble Insights, Leerink Partners, Bayer Healthcare, Illumina, Color Genomics, MedGenome, Quest and Medscape. The other authors declare no competing interests.

Additional information

Peer review information Nature thanks Stephen Chanock, Ross Levine and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Characterizing TOPMed CHIP.

a, There was marked heterogeneity of CHIP clone size, as measured by variant allele fraction, across CHIP driver genes. Violin plot spanning minimum and maximum values calculated on the full data set (Supplementary Table 3). Sample size for each element in the violin plot is displayed in Fig. 1. b, 90% of individuals with CHIP had only one somatic CHIP driver mutation identified. c, CHIP prevalence with age was highly concordant across sequenced cohorts. CHIP prevalence was estimated from a logistic mixed model with spline-transformed age, sex and cohort included as predictors. The cohort was included as a random intercept. Sample size for each cohort is listed in Supplementary Table 1. d, CHIP prevalence with age in this study (blue triangles, n = 82,807) was highly consistent with previously observed CHIP prevalence (dots represent mean point prevalence and the shaded area represents the 95% confidence interval; n_Genovese = 12,380; n_Jaiswal = 17,182; n_Xie = 2,728).

Extended Data Fig. 2 CHIP age association by mutational mechanism, gene and overlap with somatic chromosomal mosaicism.

a, Cumulative density plot of CHIP incidence with age stratified by single nucleotide variant (SNV) versus frameshift mutations. SNVs were observed in younger individuals than frameshift mutations (n = 4,939; two-sided Wilcoxon rank sum test P = 0.01). b, Cumulative density plot of CHIP incidence with age stratified by driver gene. c, 855 elderly WHI individuals (mean age 70 years) with both whole-genome and array genotyping data available were interrogated for large-scale somatic mosaic chromosomal rearrangements. CHIP and mosaic chromosomal rearrangements did not co-occur more often than would be expected by chance (hypergeometric P = 0.25).
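As a rough illustration of the co-occurrence test in panel c, the sketch below computes a one-sided hypergeometric tail probability using only the Python standard library. The function name and the carrier counts in the usage lines are hypothetical: the legend reports only the cohort size (855) and the P value, so this is not a reproduction of the authors' analysis.

```python
from math import comb


def hypergeom_upper_tail(overlap: int, population: int, n_a: int, n_b: int) -> float:
    """P(X >= overlap), where X counts individuals carrying both events when
    n_a of `population` carry event A and n_b carry event B independently."""
    lowest = max(overlap, 0)
    highest = min(n_a, n_b)
    # Sum the hypergeometric probability mass from `lowest` up to the maximum overlap.
    favourable = sum(
        comb(n_a, k) * comb(population - n_a, n_b - k)
        for k in range(lowest, highest + 1)
    )
    return favourable / comb(population, n_b)


# Hypothetical counts for illustration only (not taken from the paper):
# 855 individuals, 60 with CHIP, 40 with a mosaic chromosomal alteration, 4 with both.
p_value = hypergeom_upper_tail(overlap=4, population=855, n_a=60, n_b=40)
print(f"one-sided hypergeometric P = {p_value:.3f}")
```

A large P value from such a test means the observed overlap is compatible with the two events arising independently, which is the reading given in the legend.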

Extended Data Fig. 3 CHIP associates with blood, lipid and inflammatory traits.

a, CHIP was consistently associated with increased RDW. JAK2, SF3B1 and SRSF2 showed driver gene-specific effects on blood traits (see Supplementary Table 5). b, CHIP status was not consistently associated with lipid traits, other than JAK2 CHIP, which was associated with decreased total cholesterol and a trend towards decreased LDL (see Supplementary Table 6). c, CHIP status was associated with inflammatory markers; however, notable heterogeneity existed across CHIP mutations (see Supplementary Table 7). Associations used a two-sided t-test from a multivariate general linear model including age, smoking, race, gender and study centre and were not adjusted for multiple comparisons. Sample sizes and exact P values for each phenotype are listed in Supplementary Tables 5–7.

Extended Data Fig. 4 CHIP passenger somatic mutation spectrum.

a, Singleton mutation counts by nucleotide context in CHIP cases and controls. b, Signature contribution in CHIP cases and controls identified differential enrichment.

Extended Data Fig. 5 CHIP single variant association regional association plots.

a, TERT locus. b, TRIM59–KPNA4 locus. c, TET2 locus. Two-sided association testing performed using SAIGE (n = 65,405 individuals, see Methods).

Extended Data Fig. 6 CHIP transcriptome-wide association study (TWAS) results across 48 tissues identified 7 significant loci.

UTMOST algorithm applied to CHIP genome-wide association study results from n = 65,405 individuals (see Methods). Genomic coordinates are listed on the x-axis. P value from the generalized Berk-Jones test on the y-axis. The multiple hypothesis-corrected threshold, P < 2.9 × 10^−6, is displayed as a dotted red line.

Extended Data Fig. 7 Tissue-specific results from the top 9 overall UTMOST-significant genes.

UTMOST algorithm applied to CHIP genome-wide association study results from n = 65,405 individuals. P value from the generalized Berk-Jones test. eQTL z-scores for associations with P < 0.05 are displayed in each bar. GTEx eQTL tissue is listed on the y-axis.

Extended Data Fig. 8 CRISPR–Cas9 editing efficiency of TET2 enhancer deletion in primary CD34+ HSPCs.

a, Schematic showing the position of the two sgRNAs used to delete the TET2 enhancer (512 bp) containing rs79901204. b, Gel electrophoresis image of PCR products from genomic DNA of edited HSPCs indicating unedited (WT) and deletion bands at the sgRNA target site. The percentage of deletion alleles was determined by band intensity and is shown below each lane. The experiment contains 3 biological replicates and was performed once.

Extended Data Fig. 9 rs79901204 associated with genome wide differential methylation signal.

Methylation quantitative trait association results of the rs79901204 variant with CpG methylation probes identify an altered peripheral leukocyte methylation profile genome wide in n = 1,747 individuals. The strongest signal is at the chr4 TET2 locus. P values on the y-axis are derived from a two-sided linear mixed effects model (see Methods). To account for multiple hypothesis testing, a Bonferroni threshold of P < 5.8 × 10^−8 was used to establish statistical significance.
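A Bonferroni threshold is the family-wise error rate divided by the number of tests. The sketch below inverts that relation for the threshold quoted in this legend. The family-wise error rate of 0.05 is an assumption (the legend does not state it), and both function names are illustrative, so the implied number of probes is only indicative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a family-wise error rate `alpha`."""
    return alpha / n_tests


def implied_number_of_tests(threshold: float, alpha: float = 0.05) -> float:
    """Number of tests that would yield `threshold` under Bonferroni correction.

    alpha = 0.05 is an assumed family-wise error rate, not a value from the legend.
    """
    return alpha / threshold


# Threshold quoted in the legend for the methylation analysis.
n_probes = implied_number_of_tests(5.8e-8)
print(f"implied number of CpG probes tested: about {n_probes:,.0f}")
```

Under the same assumption, the function can be applied to any other corrected threshold to recover the approximate size of the test family.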

Extended Data Fig. 10 Sensitivity of CHIP detection at various VAFs across sequencing depths.

A set of 30 samples from a previously published CHIP cohort [33] were computationally downsampled to 30x, 40x, 50x, 100x and 400x sequencing depth. TOPMed WGS data were typically in the 40x depth range across CHIP genes. WGS data have excellent sensitivity to detect CHIP clones with VAF > 10% and ~50% sensitivity to detect CHIP clones with VAF 5–10%, with minimal ability to detect CHIP clones with VAF < 5%.
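The dependence of sensitivity on VAF and depth can be sketched with a simple binomial read-sampling model: at a given depth, each read is assumed to carry the variant independently with probability equal to the VAF, and a clone counts as detected when at least a minimum number of variant reads is seen. This is an idealized toy model, not the authors' downsampling procedure; the function name and the minimum of 3 supporting reads are assumptions, and sequencing error, mapping artefacts and caller filters are ignored.

```python
from math import comb


def detection_probability(vaf: float, depth: int, min_alt_reads: int) -> float:
    """P(at least `min_alt_reads` variant reads) when each of `depth` reads
    independently carries the variant with probability `vaf`."""
    # Probability of observing fewer than min_alt_reads variant reads.
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_below


# Depths from the legend; VAF values and the read threshold are illustrative.
MIN_ALT_READS = 3  # hypothetical calling threshold
for depth in (30, 40, 50, 100, 400):
    row = ", ".join(
        f"VAF {vaf:.0%}: {detection_probability(vaf, depth, MIN_ALT_READS):.2f}"
        for vaf in (0.02, 0.05, 0.10)
    )
    print(f"{depth}x -> {row}")
```

Even this simplified model reproduces the qualitative pattern in the legend: detection probability rises with both VAF and depth, and small clones are rarely supported by enough reads at around 40x.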

Supplementary information

Supplementary Tables

This file contains Supplementary Tables 1-15.

Reporting Summary


Cite this article

Bick, A.G., Weinstock, J.S., Nandakumar, S.K. et al. Inherited causes of clonal haematopoiesis in 97,691 whole genomes. Nature 586, 763–768 (2020).
