Skip to main content

Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

Deficient H2A.Z deposition is associated with genesis of uterine leiomyoma

Abstract

One in four women suffers from uterine leiomyomas (ULs)—benign tumours of the uterine wall, also known as uterine fibroids—at some point in premenopausal life. ULs can cause excessive bleeding, pain and infertility1, and are a common cause of hysterectomy2. They emerge through at least three distinct genetic drivers: mutations in MED12 or FH, or genomic rearrangement of HMGA23. Here we created genome-wide datasets, using DNA, RNA, assay for transposase-accessible chromatin (ATAC), chromatin immunoprecipitation (ChIP) and HiC chromatin immunoprecipitation (HiChIP) sequencing of primary tissues to profoundly understand the genesis of UL. We identified somatic mutations in genes encoding six members of the SRCAP histone-loading complex4, and found that germline mutations in the SRCAP members YEATS4 and ZNHIT1 predispose women to UL. Tumours bearing these mutations showed defective deposition of the histone variant H2A.Z. In ULs, H2A.Z occupancy correlated positively with chromatin accessibility and gene expression, and negatively with DNA methylation, but these correlations were weak in tumours bearing SRCAP complex mutations. In these tumours, open chromatin emerged at transcription start sites where H2A.Z was lost, which was associated with upregulation of genes. Furthermore, YEATS4 defects were associated with abnormal upregulation of bivalent embryonic stem cell genes, as previously shown in mice5. Our work describes a potential mechanism of tumorigenesis—epigenetic instability caused by deficient H2A.Z deposition—and suggests that ULs arise through an aberrant differentiation program driven by deranged chromatin, emanating from a small number of mutually exclusive driver mutations.

This is a preview of subscription content

Access options

Rent or Buy article

Get time limited or full article access on ReadCube.

from$8.99

All prices are NET prices.

Fig. 1: Mutations in SRCAP complex genes identified as a driver of UL.
Fig. 2: H2A.Z binding on chromatin is changed in UL.
Fig. 3: Clues to the pathogenesis of UL derived from expression data.
Fig. 4: Chromatin changes emerge at the locus containing CBX2, CBX4 and CBX8 in ULs.

Data availability

The peak level data used in the study are available for research use through Zenodo (https://doi.org/10.5281/zenodo.4745433). Genetic data presented in this manuscript have been deposited at the European Genome–phenome Archive (EGA; https://www.ebi.ac.uk/ega/) under accession number EGAS00001004499. A data access committee (DAC) has been established from two University of Helsinki representatives that are independent of the authors of the current study. See Supplementary Table 26 for EGA dataset accession numbers. Requests for the data should be sent to the DAC via email (dac-finlandmyomastudy@helsinki.fi). The DAC ensures that the intended use of data as detailed in the request is compatible with the requirements of the European General Data Protection Regulation (GDPR), consistent with the consents given and otherwise ensures the protection of data subjects’ rights as required by the GDPR. The DAC will always grant access to the data if the University is legally allowed to do so without infringing the rights and freedoms of data subjects. Subject to the requirements of the GDPR, the DAC grants access to the genetic data to non-commercial academic research on neoplasia and chromatin. Roadmap Epigenomics ChIP−seq and DNase–seq data (https://egg2.wustl.edu/roadmap/data/byFileType/peaks/consolidated/narrowPeak/) provided in the LOLA extended and core databases were downloaded from http://cloud.databio.org/regiondb/. Chromatin states provided in mnemonics bed files by the Roadmap Epigenomics project were downloaded from https://egg2.wustl.edu/roadmap/data/byFileType/chromhmmSegmentations/ChmmModels/coreMarks/jointModel/final/. GWAS cohort material was downloaded from http://jenger.riken.jp/en/ (accessed on 2 September 2020) and https://finngen.gitbook.io/documentation/v/r3/ (accessed on 2 September 2020). UK Biobank material access can be applied for at https://www.ukbiobank.ac.uk/. For RNA-seq, the reference annotation for GRCh37 was downloaded from Ensembl (ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/) and for GRCh38 from GenBank (accession: GCA_000001405.15). For RNA-seq variant calling, the SNP and indel resources were downloaded from Broad institute (https://console.cloud.google.com/storage/browser/gcp-public-data–broad-references/hg19/v0). DAR annotation data are available from http://homer.ucsd.edu/homer/data/genomes/hg19.v6.4.zipSource data are provided with this paper.

Code availability

Code for performing the analyses are available from Zenodo (https://doi.org/10.5281/zenodo.4745433).

References

  1. 1.

    Wallach, E. E., Buttram, V. C. Jr & Reiter, R. C. Uterine leiomyomata: etiology, symptomatology, and management. Fertil. Steril. 36, 433–445 (1981).

    Article  Google Scholar 

  2. 2.

    Gurusamy, K. S., Vaughan, J., Fraser, I. S., Best, L. M. J. & Richards, T. Medical therapies for uterine fibroids — a systematic review and network meta-analysis of randomised controlled trials. PLoS ONE 11, e0149631 (2016).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  3. 3.

    Mehine, M., Mäkinen, N., Heinonen, H.-R., Aaltonen, L. A. & Vahteristo, P. Genomics of uterine leiomyomas: insights from high-throughput sequencing. Fertil. Steril. 102, 621–629 (2014).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  4. 4.

    Giaimo, B. D., Ferrante, F., Herchenröther, A., Hake, S. B. & Borggrefe, T. The histone variant H2A.Z in gene regulation. Epigenetics Chromatin 12, 37 (2019).

    PubMed  PubMed Central  Article  Google Scholar 

  5. 5.

    Hsu, C.-C. et al. Gas41 links histone acetylation to H2A.Z deposition and maintenance of embryonic stem cell identity. Cell Discov. 4, 28 (2018).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  6. 6.

    Baird, D. D., Dunson, D. B., Hill, M. C., Cousins, D. & Schectman, J. M. High cumulative incidence of uterine leiomyoma in black and white women: ultrasound evidence. Am. J. Obstet. Gynecol. 188, 100–107 (2003).

    PubMed  Article  PubMed Central  Google Scholar 

  7. 7.

    Okolo, S. Incidence, aetiology and epidemiology of uterine fibroids. Best Pract. Res. Clin. Obstet. Gynaecol. 22, 571–588 (2008).

    PubMed  Article  PubMed Central  Google Scholar 

  8. 8.

    Bertsch, E. et al. MED12 and HMGA2 mutations: two independent genetic events in uterine leiomyoma and leiomyosarcoma. Mod. Pathol. 27, 1144–1153 (2014).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  9. 9.

    Lehtonen, R. et al. Biallelic inactivation of fumarate hydratase (FH) occurs in nonsyndromic uterine leiomyomas but is rare in other tumors. Am. J. Pathol. 164, 17–22 (2004).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  10. 10.

    Hua, S. et al. Genomic analysis of estrogen cascade reveals histone variant H2A.Z associated with breast cancer progression. Mol. Syst. Biol. 4, 188 (2008).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  11. 11.

    Dunican, D. S., McWilliam, P., Tighe, O., Parle-McDermott, A. & Croke, D. T. Gene expression differences between the microsatellite instability (MIN) and chromosomal instability (CIN) phenotypes in colorectal cancer revealed by high-density cDNA array hybridization. Oncogene 21, 3253–3257 (2002).

    CAS  PubMed  Article  Google Scholar 

  12. 12.

    Yang, B. et al. H2A.Z regulates tumorigenesis, metastasis and sensitivity to cisplatin in intrahepatic cholangiocarcinoma. Int. J. Oncol. 52, 1235–1245 (2018).

    CAS  PubMed  PubMed Central  Google Scholar 

  13. 13.

    Vardabasso, C. et al. Histone variant H2A.Z.2 mediates proliferation and drug sensitivity of malignant melanoma. Mol. Cell 59, 75–88 (2015).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  14. 14.

    Hsu, C.-C. et al. Recognition of histone acetylation by the GAS41 YEATS domain promotes H2A.Z deposition in non-small cell lung cancer. Genes Dev. 32, 58–69 (2018).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  15. 15.

    Bellucci, L., Dalvai, M., Kocanova, S., Moutahir, F. & Bystricky, K. Activation of p21 by HDAC inhibitors requires acetylation of H2A.Z. PLoS ONE 8, e54102 (2013).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  16. 16.

    Ku, M. et al. H2A.Z landscapes and dual modifications in pluripotent and multipotent stem cells underlie complex genome regulatory functions. Genome Biol. 13, R85 (2012).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  17. 17.

    Tomlinson, I. P. M. et al. Germline mutations in FH predispose to dominantly inherited uterine fibroids, skin leiomyomata and papillary renal cell cancer. Nat. Genet. 30, 406–410 (2002).

    CAS  PubMed  Article  Google Scholar 

  18. 18.

    Takahashi, D. et al. Quantitative regulation of histone variant H2A.Z during cell cycle by ubiquitin proteasome system and SUMO-targeted ubiquitin ligases. Biosci. Biotechnol. Biochem. 81, 1557–1560 (2017).

    CAS  PubMed  Article  Google Scholar 

  19. 19.

    Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).

    Article  CAS  Google Scholar 

  20. 20.

    Ye, B. et al. Suppression of SRCAP chromatin remodelling complex and restriction of lymphoid lineage commitment by Pcid2. Nat. Commun. 8, 1518 (2017).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  21. 21.

    Bowman, T. A., Wong, M. M., Cox, L. K., Baldassare, J. J. & Chrivia, J. C. Loss of H2A.Z is not sufficient to determine transcriptional activity of Snf2-related CBP activator protein or p400 complexes. Int. J. Cell Biol. 2011, 715642 (2011).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  22. 22.

    Slupianek, A., Yerrum, S., Safadi, F. F. & Monroy, M. A. The chromatin remodeling factor SRCAP modulates expression of prostate specific antigen and cellular proliferation in prostate cancer cells. J. Cell. Physiol. 224, 369–375 (2010).

    CAS  PubMed  Article  Google Scholar 

  23. 23.

    Murphy, K. E., Meng, F. W., Makowski, C. E. & Murphy, P. J. Genome-wide chromatin accessibility is restricted by ANP32E. Nat. Commun. 11, 5063 (2020).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  24. 24.

    Brunelle, M. et al. The histone variant H2A.Z is an important regulator of enhancer activity. Nucleic Acids Res. 43, 9742–9756 (2015).

    CAS  PubMed  PubMed Central  Google Scholar 

  25. 25.

    Andersson, S., Berman, D. M., Jenkins, E. P. & Russell, D. W. Deletion of steroid 5 α-reductase 2 gene in male pseudohermaphroditism. Nature 354, 159–161 (1991).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  26. 26.

    Bauman, D. R., Steckelbroeck, S., Williams, M. V., Peehl, D. M. & Penning, T. M. Identification of the major oxidative 3α-hydroxysteroid dehydrogenase in human prostate that converts 5α-androstane-3α,17β-diol to 5α-dihydrotestosterone: a potential therapeutic target for androgen-dependent disease. Mol. Endocrinol. 20, 444–458 (2006).

    CAS  PubMed  Article  Google Scholar 

  27. 27.

    Weihua, Z., Lathe, R., Warner, M. & Gustafsson, J.-A. An endocrine pathway in the prostate, ERβ, AR, 5α-androstane-3β,17β-diol, and CYP7B1, regulates prostate growth. Proc. Natl Acad. Sci. USA 99, 13589–13594 (2002).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  28. 28.

    Solomon, M. J., Strauss, F. & Varshavsky, A. A mammalian high mobility group protein recognizes any stretch of six A.T base pairs in duplex DNA. Proc. Natl Acad. Sci. USA 83, 1276–1280 (1986).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  29. 29.

    Fowler, C. B., Evers, D. L., O’Leary, T. J. & Mason, J. T. Antigen retrieval causes protein unfolding: evidence for a linear epitope model of recovered immunoreactivity. J. Histochem. Cytochem. 59, 366–381 (2011).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  30. 30.

    Kidder, B. L., Hu, G. & Zhao, K. ChIP-seq: technical considerations for obtaining high-quality data. Nat. Immunol. 12, 918–922 (2011).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  31. 31.

    Kim, J. & Kingston, R. E. The CBX family of proteins in transcriptional repression and memory. J. Biosci. 45, 16 (2020).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  32. 32.

    Klauke, K. et al. Polycomb Cbx family members mediate the balance between haematopoietic stem cell self-renewal and differentiation. Nat. Cell Biol. 15, 353–362 (2013).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  33. 33.

    George, J. W. et al. Integrated epigenome, exome, and transcriptome analyses reveal molecular subtypes and homeotic transformation in uterine fibroids. Cell Rep. 29, 4069–4085.e6 (2019).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  34. 34.

    Sato, S. et al. SATB2 and NGR1: potential upstream regulatory factors in uterine leiomyomas. J. Assist. Reprod. Genet. 36, 2385–2397 (2019).

    PubMed  PubMed Central  Article  Google Scholar 

  35. 35.

    Papadopoulou, T., Kaymak, A., Sayols, S. & Richly, H. Dual role of Med12 in PRC1-dependent gene repression and ncRNA-mediated transcriptional activation. Cell Cycle 15, 1479–1493 (2016).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  36. 36.

    Tyagi, M., Cheema, M. S., Dryhurst, D., Eskiw, C. H. & Ausió, J. Metformin alters H2A.Z dynamics and regulates androgen dependent prostate cancer progression. Oncotarget 9, 37054–37068 (2018).

    PubMed  PubMed Central  Article  Google Scholar 

  37. 37.

    Tseng, C.-H. Metformin use is associated with a lower risk of uterine leiomyoma in female type 2 diabetes patients. Ther. Adv. Endocrinol. Metab. https://doi.org/10.1177/2042018819895159 (2019).

  38. 38.

    Gormley, G. J. et al. The effect of finasteride in men with benign prostatic hyperplasia. N. Engl. J. Med. 327, 1185–1191 (1992).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  39. 39.

    Tate, J. G. et al. COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res. 47, D941–D947 (2019).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  40. 40.

    Ikeda, H., Sone, M., Yamanaka, S. & Yamamoto, T. Structural and spatial chromatin features at developmental gene loci in human pluripotent stem cells. Nat. Commun. 8, 1616 (2017).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  41. 41.

    Ku, M. et al. Genomewide analysis of PRC1 and PRC2 occupancy identifies two classes of bivalent domains. PLoS Genet. 4, e1000242 (2008).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  42. 42.

    Surface, L. E. et al. H2A.Z.1 monoubiquitylation antagonizes BRD2 to maintain poised chromatin in ESCs. Cell Rep. 14, 1142–1155 (2016).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  43. 43.

    Wang, K. et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 17, 1665–1674 (2007).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  44. 44.

    Mäkinen, N. et al. MED12, the mediator complex subunit 12 gene, is mutated at high frequency in uterine leiomyomas. Science 334, 252–255 (2011).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  45. 45.

    McGuire, M. M. et al. Whole exome sequencing in a random sample of North American women with leiomyomas identifies MED12 mutations in majority of uterine leiomyomas. PLoS ONE 7, e33251 (2012).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  46. 46.

    Mehine, M. et al. Integrated data analysis reveals uterine leiomyoma subtypes with distinct driver pathways and biomarkers. Proc. Natl Acad. Sci. USA 113, 1315–1320 (2016).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  47. 47.

    Mäkinen, N., Kämpjärvi, K., Frizzell, N., Bützow, R. & Vahteristo, P. Characterization of MED12, HMGA2, and FH alterations reveals molecular variability in uterine smooth muscle tumors. Mol. Cancer 16, 101 (2017).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  48. 48.

    Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).

    CAS  PubMed  PubMed Central  Google Scholar 

  49. 49.

    Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).

    CAS  PubMed  PubMed Central  Google Scholar 

  50. 50.

    Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinformatics 43, 11.10.1–11.10.33 (2013).

    Article  Google Scholar 

  51. 51.

    Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

    PubMed  PubMed Central  Google Scholar 

  52. 52.

    Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  53. 53.

    Witten, D. M. & Tibshirani, R. Penalized classification using Fisher’s linear discriminant. J. R. Stat. Soc. Series B Stat. Methodol. 73, 753–772 (2011).

    MathSciNet  PubMed  PubMed Central  MATH  Article  Google Scholar 

  54. 54.

    Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26, 1572–1573 (2010).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  55. 55.

    Monti, S., Tamayo, P., Mesirov, J. & Golub, T. Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Mach. Learn. 52, 91–118 (2003).

    MATH  Article  Google Scholar 

  56. 56.

    Van Hout, C. V. et al. Whole exome sequencing and characterization of coding variation in 49,960 individuals in the UK Biobank. Preprint at https://doi.org/10.1101/572347 (2019).

  57. 57.

    Zhou, W. et al. Scalable generalized linear mixed model for region-based association tests in large biobanks and cohorts. Nat. Genet. 52, 634–639 (2020).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  58. 58.

    Castel, S. E., Mohammadi, P., Chung, W. K., Shen, Y. & Lappalainen, T. Rare variant phasing and haplotypic expression from RNA sequencing with phASER. Nat. Commun. 7, 12817 (2016).

    ADS  PubMed  PubMed Central  Article  Google Scholar 

  59. 59.

    Krämer, A., Green, J., Pollard, J., Jr & Tugendreich, S. Causal analysis approaches in Ingenuity Pathway Analysis. Bioinformatics 30, 523–530 (2014).

    PubMed  Article  CAS  Google Scholar 

  60. 60.

    Mehine, M. et al. Characterization of uterine leiomyomas by whole-genome sequencing. N. Engl. J. Med. 369, 43–53 (2013).

    CAS  PubMed  Article  Google Scholar 

  61. 61.

    Drmanac, R. et al. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science 327, 78–81 (2010).

    ADS  CAS  PubMed  Article  Google Scholar 

  62. 62.

    Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE Blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  63. 63.

    Sheffield, N. C. & Bock, C. LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor. Bioinformatics 32, 587–589 (2016).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  64. 64.

    Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  65. 65.

    Ross-Innes, C. S. et al. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer. Nature 481, 389–393 (2012).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  66. 66.

    Corces, M. R. et al. An improved ATAC–seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  67. 67.

    Orchard, P., Kyono, Y., Hensley, J., Kitzman, J. O. & Parker, S. C. J. Quantification, dynamic visualization, and validation of bias in ATAC-seq data with ataqv. Cell Syst. 10, 298–306.e4 (2020).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  68. 68.

    Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  69. 69.

    Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  70. 70.

    Mumbach, M. R. et al. HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat. Methods 13, 919–922 (2016).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  71. 71.

    Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  72. 72.

    Bhattacharyya, S., Chandra, V., Vijayanand, P. & Ay, F. Identification of significant chromatin contacts from HiChIP data by FitHiChIP. Nat. Commun. 10, 4221 (2019).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  73. 73.

    Akdemir, K. C. & Chin, L. HiCPlotter integrates genomic data with interaction matrices. Genome Biol. 16, 198 (2015).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  74. 74.

    Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  75. 75.

    De Coster, W., D’Hert, S., Schultz, D. T., Cruts, M. & Van Broeckhoven, C. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 34, 2666–2669 (2018).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  76. 76.

    Martin, M. et al. WhatsHap: fast and accurate read-based phasing. Preprint at https://doi.org/10.1101/085050 (2016).

  77. 77.

    Simpson, J. T. et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods 14, 407–410 (2017).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  78. 78.

    Haeussler, M. et al. The UCSC Genome Browser database: 2019 update. Nucleic Acids Res. 47, D853–D858 (2019).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  79. 79.

    Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  80. 80.

    Park, Y. & Wu, H. Differential methylation analysis for BS-seq data under general experimental design. Bioinformatics 32, 1446–1453 (2016).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  81. 81.

    Shabalin, A. A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  82. 82.

    Ishigaki, K. et al. Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nat. Genet. 52, 669–679 (2020).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  83. 83.

    Välimäki, N. et al. Genetic predisposition to uterine leiomyoma is determined by loci for genitourinary development and genome stability. eLife 7, e37110 (2018).

    PubMed  PubMed Central  Article  Google Scholar 

Download references

Acknowledgements

The Finland Myoma Study has been made possible by professionals from operating theatre nurses to data scientists, and in particular the study participants, and we are deeply grateful for their vital support during the years. We thank S. Marttinen, S. Soisalo, M. Rajalaakso, I.-L. Åberg, I. Vuoristo, A. London, J. Kolakowska, H. Metsola and K. Pylvänäinen for technical support; and P. Ikonen and the rest of the staff of the Kätilöopisto Maternity Hospital, and the staff of the Department of Pathology, University of Helsinki for technical assistance. Capillary sequencing was performed by the sequencing unit of FIMM Technology Centre, University of Helsinki supported by Biocenter Finland. This research was conducted using the UK Biobank Resource under Application Number 32506. We acknowledge the participants and investigators of the FinnGen study; the Biobank Japan Project for the ‘uterine fibroids’ GWAS summary statistics; and the computational resources provided by the ELIXIR node, hosted at the CSC–IT Center for Science, Finland. The study was supported by grants from Academy of Finland (Finnish Center of Excellence Program 2012–2017, 2018–2025, No. 250345 and 312041, Academy Professor grants No. 319083 and 320149), European Research Council (ERC, 695727), iCAN Digital Precision Cancer Medicine Flagship (320185), Cancer Society of Finland, Sigrid Juselius Foundation and Jane and Aatos Erkko Foundation.

Author information

Affiliations

Authors

Contributions

D.G.B., H.K., N.V., A.K., K.P., E.K. and L.A.A. conceived the ideas, planned the experiments and wrote the manuscript. D.G.B., M.R. and M.J. performed the ATAC and ChIP–seq experiments. D.G.B. performed HiChIP and western blot experiments. M.R., E.K., K.P and H.K. analysed ATAC, ChIP and HiChIP data. H.K., N.V. and T.C. performed the mutation analysis. H.K. and N.V. performed the gene expression analysis. R.-M.P. and N.V. performed the allele-specific expression analysis. A.T., K.P. and E.K. performed the methylation analysis. A.K., S.N. and T.C. performed the IHC experiments. R.B., A.K. and H.K. analysed the IHC data. N.V. analysed the germline predisposition data. J.K., S.A., P.V., M.M. and N.M. performed the mutation screening and managed clinical data. A.P., J.J., O.H. and R.B. collected the tissue samples. J.R. and R.L. provided support with analysis and management of data. J.T., K.R. and B.S. provided support with the experimental design and interpretation.

Corresponding authors

Correspondence to Eevi Kaasinen or Lauri A. Aaltonen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Zehra Ordulu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Overview of study material and somatic allelic imbalance.

a, The sequential stages of UL subclass detection. Denominators are numbers of tumours screened at each stage; see Methods for a detailed description of data used at each stage. Percentages refer to proportions among all 2,263 tumours. HMGA1 subclass comprised 67 tumours without any other known driver change. Three tumours with an SRCAP complex gene mutation are shown here as one MED12 and two HMGA2 tumours as they also showed these driver changes. b, Germline LoF variants (blue; UK Biobank WES) compiled with somatic LoF (red) and missense (black) variants. Green boxes represent protein domains from Pfam. ce, Overview of genome-wide somatic allelic imbalance in 2,186 SNP-arrayed ULs. c, y-axis gives the total length of the genome affected by somatic loss and gain aberrations per tumour, stratified by subclass (logarithmic and truncated to 105 bp). Dashed lines show the overall and subclass-specific mean values. Percentage units refer to the proportion of chromosomally stable tumours within each subclass. d, Estimated numbers of somatic DSBs. e, Subclass-specific enrichment of allelic loss: x-axis gives log-transformed, one-sided tests of loss-event enrichment in each subclass compared to the rest of the tumours (truncated to 1.0 × 10−20). y-axis indicates the genomic position (autosomes and X). See Supplementary Table 6 for detailed statistics.

Source data

Extended Data Fig. 2 H2AZ staining in myometrium, MED12, HMGA, FH, OM and YEATS4 tumours.

a, Representative immunostaining of non-acetylated H2A.Z in normal myometrium and MED12, HMGA2, HMGA1, FH and YEATS4 ULs. The intensity of immunoreaction is given in parentheses: 0 = negative or weak, 1 = moderate, 2 = strong. YEATS4 mutated myoma displays negative or weak H2A.Z staining in neoplastic cells, but preserved staining in endothelial, perivascular and scattered inflammatory cells. 40× magnification. The scores for all the stained samples are given in b (left). MED12 (n = 95), HMGA2 (n = 68), HMGA1 (n = 58), FH (n = 10), OM (n = 14), YEATS4 (n = 19). b, Distribution of staining scores on non-acetylated H2A.Z antibody (left) and acetylated H2A.Z (H2A.Zac) antibody (right). c, H2A.Z (top) and H2A.Zac (bottom) staining intensity in HMGA2, HMGA1, HMGA2 and HMGA1 combined (HMGA2&1), YEATS4 and OM tumours compared to MED12 tumours (GEE-model). Nominal two-sided P values are shown. d, Western blot analysis of H2A.Z (left), H2A.Zac (right) and TBP (for both, as loading control) in chromatin fraction from MED12 and YEATS4 tumours and related normal myometrium, all from the same individual. Molecular weights of the protein ladder are indicated. Owing to the limited amount of the respective myometrium tissue as control, it was possible to perform the western blots only once. Uncropped blot available in Supplementary Fig. 30.

Source data

Extended Data Fig. 3 Reduction in acetylated H2A.Z in MED12, HMGA, FH and YEATS4 tumours.

Representative immunostaining of H2A.Zac in normal myometrium and MED12, HMGA, FH and YEATS4 ULs. The intensity of immunoreaction is shown in parentheses: 0 = negative or weak, 1 = moderate, 2 = strong. 40× magnification. The scores for all the stained samples are presented in Extended Data Fig. 2b (right). MED12 (n = 96), HMGA2 (n = 68), HMGA1 (n = 58), FH (n = 9), OM (n = 14), YEATS4 (n = 19).

Extended Data Fig. 4 Differences in H2A.Z binding between UL subclasses and normal samples.

log2 FC and average binding strength (normalized read concentration) were calculated on each H2A.Z peak region and are represented by a dot. ad, MED12 (n = 3) (a), HMGA2 (n = 4) (b), FH (n = 4) (c) and OM (n = 3) (d) tumours were separately compared to all normal samples (n = 11). H2A.Z peak regions were stratified by five-state genome annotations from myometrium. States were annotated as bivalent TSS regions (TssBiv marked by both H3K4me3 and H3K27me3), active TSS regions (TssA marked by both H3K4me3 and H3K27ac), active chromatin outside TSS regions (OtherA marked by H3K27ac), repressed chromatin (Repr marked by H3K27me3) and other chromatin (Other).

Extended Data Fig. 5 Differential H2A.Z binding between tumours and myometria.

Volcano plots displaying differences in H2A.Z binding for MED12 (n = 2), YEATS4 (n = 2), HMGA2 (n = 2), HMGA1 (n = 4), OM (n = 2) and FH (n = 2) tumours against normal samples (n = 4) from the spike-in ChIP–seq experiments. FDR (y-axis) and log2FC (x-axis) from DESeq2 analysis as implemented in the DiffBind R package. Violet dots represent differential H2A.Z binding sites (FDR < 0.05, |log2FC| > 1). The highlighted peaks (pink dots) are located close to CBX8 and named as S with distance in kilobases from the CBX8 TSS.

Extended Data Fig. 6 H2A.Z occupancy associates with chromatin opening and DNA methylation.

a, Chromatin accessibility as a function of H2A.Z occupancy. Chromatin accessibility log2FC (y-axis) was measured by DESeq2-normalized ATAC–seq Tn5 insertion counts in MED12 (n = 4), HMGA2 (n = 4), FH (n = 4) and YEATS4 (n = 4) tumours compared with normal samples (n = 15) at H2A.Z peaks stratified into increased, no change and decreased binding. b, c, Pileup of H2A.Z ChIP fragments from pooled myometrium data at DARs shown by composite plots and heatmaps. DARs in each UL subclass (b) or in YEATS4 tumours stratified by overlap with other UL subclasses (c) are represented by heatmap rows over which mean H2A.Z fragment coverage is calculated. d, Differences in H2A.Z binding at DARs in UL subclasses as compared to normal samples. Violet dots represent differential H2A.Z binding sites (FDR < 0.05, |log2FC| > 1). The highlighted peaks (pink dots) are located close to CBX8 and named as S with distance in kilobases from the CBX8 TSS. e, Mean sample-wise DNA methylation differences in MED12 (n = 11), HMGA2 (n = 26), FH (n = 6), YEATS4 (n = 14) and OM (n = 8) tumours compared with respective normal samples at H2A.Z peaks stratified into increased, no change and decreased binding. H2A.Z binding differences in a, d, e are from the spike-in ChIP–seq experiments comparing two tumours from each subclass to four normal samples. Increased, no change and decreased binding were defined using FDR < 0.05 and |log2FC| > 1 cutoffs. Boxplots show the median and the first and third quartiles. Error bars extend up to 1.5 IQR beyond the quartiles.

Extended Data Fig. 7 DNA methylation displays distinct patterns in UL subclasses.

a, Overall genome-wide DNA methylation in ULs and normal myometrium. Each dot represents a sample. b, Enrichment of hyper- and hypomethylated loci in different tumour subclasses on five chromatin states from normal myometrium. ORs and P values (log_pvalue stands for –log10(P value)) from one-sided Fisher’s exact test implemented in the LOLA R package. cg, Overall DNA methylation on active (c) and bivalent (d) TSSs, other active chromatin (e), repressed/poised chromatin (f) and other, quiescent, chromatin regions (g). Significance of methylation difference against normals evaluated by ordinary least squares regression: ***P < 0.0001, **P < 0.001, *P < 0.01. Test controlled by global methylation levels. Sample sizes: normal 96, MED12 13, HMGA1 21, HMGA2 28, YEATS4 11, OM 9, FH 6, Unknown 14. Box covers the central two quartiles of the distribution. Median is highlighted. Whiskers extend to the minimum and maximum of the distribution or at most 1.5× IQR from the edge of the box, whichever is closer. Multiple testing correction was not performed. h, Mean sample-wise DNA methylation in ULs and normal myometrium at H2A.Z binding sites derived from pooled normal myometrium tissue samples. Asterisks, box-and-whiskers plots and sample sizes are as in cg.

Extended Data Fig. 8 Clustering of RNA-seq samples.

a, log2FC of genes for which decreased (FDR < 0.05, FC < −1), increased (FDR < 0.05, FC > 1) or no-change H2A.Z peaks are located at the TSS. log2FC measured by differential expression analysis of tumours (MED12 (n = 38), HMGA2 (n = 44), HMGA1 (n = 62), FH (n = 15), OM (n = 15) and YEATS4 (n = 16)) against normal myometrium (n = 162). H2AZ binding differences are from the spike-in ChIP experiments comparing MED12 (n = 2), YEATS4 (n = 2), HMGA2 (n = 2), HMGA1 (n = 4), OM (n = 2) and FH (n = 2) tumours against normal samples (n = 4). Boxplots show the median and the first and third quartiles. Error bars extend up to 1.5× IQR beyond the quartiles. b, Heatmap presentation of 426 genes that separate myoma subclasses, selected on the basis of linear discriminant analysis. The ordering of samples and genes is based on an unsupervised hierarchical clustering of the 5% (n = 1,355) most variable genes. Two genes per gene cluster are highlighted on the basis of the highest absolute value in discriminant vectors. Patients from whom more than one tumour entered the analysis are highlighted in separate colours. All 426 genes are presented in Supplementary Table 20. c, Consensus clustering of RNA-seq samples. x-axis is sorted by subclass (from left to right: FH, HMGA1, HMGA2, MED12, normal myometrium, OM, YEATS4, and unknown; subclass labels are shown for reference). Item consensus is the mean consensus of an item with all the other items in the same cluster. For each sample, the item consensus value corresponding to each cluster (k = 26) is represented by a colour. For example, all FH samples have the largest item consensus on cluster 2, represented by dark green. Both YEATS4 and OM samples cluster predominantly to cluster 12, represented by blue. Unknown samples form several small clusters. The item consensus value (ei) of each cluster (k) is presented on the y-axis. It is defined as: \({m}_{i}(k)\,=\,\frac{1}{{N}_{k}-1\{{e}_{i}\in {I}_{k}\}}\sum _{j\in {I}_{k},j\ne i}M(i.j)\), where M is distance matrix and Nk is the number of items in the cluster. See Monti et al.55 for further details.

Source data

Extended Data Fig. 9 Canonical IPA pathway comparison analyses.

Pathways are relevant to multiple UL subclasses. Pathways with the highest total score (P values; right-tailed Fisher’s exact test) across the set of subclasses are sorted to the top. Heat map cells with insignificant P values (−log101.3) are marked with a dot.

Extended Data Fig. 10 H2A.Z ChIP–seq fragment pileup for individual samples at the locus harbouring CBX2, CBX4 and CBX8.

Pink squares depict differential H2A.Z binding sites close to CBX8 and the name of each of these sites refers to distance from the CBX8 TSS in kilobases. UL subclasses are colour-coded as in the main Figures. Coordinates in GRCh38.

Supplementary information

Supplementary Information

This file contains a Supplementary Discussion, Supplementary Figures 1-30, Supplementary Tables (with descriptions) 3-7, 9-12, 15-18, 23 and Supplementary References.

Reporting Summary

Supplementary Table 1

Anonymized patient identifiers and background information. First eight columns give the patient identifier, sample series and denote (YES/NO) which of the myometrium tissue data types were generated per patient. Next two columns give patients’ ancestry based on SNP chip genotypes (YES/NO; outliers identified in a principal components analysis, PLINK v1.90) and self-reported ancestry. The rest of the columns denote age at hysterectomy (years), height (cm), weight (kg), BMI (kg/m2), menopause status (pre/post/current hormone replacement therapy), parity, family history of ULs (YES/NO) and previous use of oral contraceptives (YES/NO). Note that all values were not available for all patients (empty fields denote missing data).

Supplementary Table 2

Tumour specimens utilized in the study. First three columns give the tumour identifier and subclass and denote which of the tumours were selected (YES/NO) for statistical analyses that rely on independent observations (one tumour per subclass per patient; see details in Methods). Next fourteen columns denote (YES/NO) which of the data types were generated per tumour and per paired-normal myometrium tissue. Patient identifiers (column J) refer to Supplementary Table 1. Next seven columns give the mutation statuses, and the following thirteen columns specify the mutation statuses used in the mutual exclusivity analysis. Next two columns specify, if available, tumour size (mm) and type (submucous/intramural/subserous). The rest of the columns summarize the WGS material (batch, sequencing platform, library size, sequencing coverage at SRCAP complex genes, software versions).

Supplementary Table 8

Summary statistics of the gene-based UL association results. The columns give the p-values (SKAT-O test; combined burden and SKAT test, sidedness is not applicable), MAC distribution, and a list of LoF variants (at MAF≤ 1%) identified in each gene. The results are based on the discovery set (50k UKBiobank). All genes with sum(MAC)≥3 among discovery set females of European ancestry were included. No adjustment for multiple comparisons.

Supplementary Table 13

Promoter-proximal open chromatin regions in YEATS4 tumours. Differentially more accessible regions annotated to genes (RefSeq TSS + 1kb flank). Coordinates are in GRCh37.

Supplementary Table 14

Differentially expressed genes (FDR < 0.05) where significantly decreased (FDR < 0.05, Fold < -1) H2A.Z binding peak is located at TSS in YEATS4 tumours.

Supplementary Table 19

Per subclass differential expression of genes with differential HiCHIP links bridging (see Supplementary Figure 22b) a H2A.z peak and a TSS of the gene. Links (DiffLink) were derived from FitHiChIP differential analysis using FDR < 0.05 and fold change (FC) >= log2(2) thresholds for EdgeR v3.26.5 statistics. The differential expression (DE) values were calculated with DESeq2 v1.22.2 utilizing Wald statistics.

Supplementary Table 20

The 426 genes that separate myoma subclasses (MED12, HMGA1, HMGA2, FH, YEATS4 + OM, and unknown) from each other based on linear discriminant analysis.

Supplementary Table 21

List of the differentially expressed (DE) genes in MED12, HMGA1, HMGA2, YEATS4, OM and FH UL subclasses as well as in all UL subclasses combined against myometrium. The values were calculated with DESeq2 v.1.22.2 utilizing Wald statistics.

Supplementary Table 22

Supplementary Tables 22.1 – 22.6 list the enriched pathways in MED12, HMGA1, HMGA2, YEATS4, OM and FH uterine leiomyoma subclasses. P-values were calculated with right-tailed Fisher’s exact test and presented for each pathway on logarithmic scale.

Supplementary Table 24

Summary of allele-specific gene expression, including subclass-specific numbers. Table is sorted by genes with enrichment of ASE among tumours (log2 fold-change with a pseudocount 1). The RNA-seq observations of tumours and normals were divided into five categories: i) allele-specific gene expression (ASE), ii) balanced gene expression (balancedE), iii) low gene expression (<16 coverage, lowE), iv) somatic allelic imbalance (AI), and v) no heterozygous markers to measure (homozygous), where the category iv) is applicable only among the tumour samples. Additionally, the number of tumors with ASE in each subclass and their proportions (prop; proportion of ASE per subclass after excluding chromosomal instability, low expression and homozygous samples) were listed. Genes with ASE in at least ten tumours were included in the table.

Supplementary Table 25

Meta-analysis results (P<1e-6) from three independent cohorts: FINNGEN (R3), UK Biobank and Biobank Japan. Each SNP was annotated for the nearest DAR (among the common 156 DARs) and nearest entry in the GWAS catalog (v1.0 “uterine fibroid”; accessed on October 1, 2020) if any were found within a 3Mbp distance. Allele frequency (AF) and regression coefficient (BETA) base on the alternative allele. Coordinates are in GRCh37.

Supplementary Table 26

Summary of EGA study and dataset accession numbers. First spreadsheet gives the total numbers of samples. For the rest of the spreadsheets, rows follow the patient and tumour listing in Supplementary Table 1 and 2, respectively, and columns match the data type columns in Supplementary Table 1 and 2.

Source Data

This file contains Source Data for Supplementary Fig. 1.

Source data

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Berta, D.G., Kuisma, H., Välimäki, N. et al. Deficient H2A.Z deposition is associated with genesis of uterine leiomyoma. Nature 596, 398–403 (2021). https://doi.org/10.1038/s41586-021-03747-1

Download citation

Further reading

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Search

Quick links