Genomic insights into population history and biological adaptation in Oceania

Abstract

The Pacific region is of major importance for addressing questions regarding human dispersals, interactions with archaic hominins and natural selection processes¹. However, the demographic and adaptive history of Oceanian populations remains largely uncharacterized. Here we report high-coverage genomes of 317 individuals from 20 populations from the Pacific region. We find that the ancestors of Papuan-related (‘Near Oceanian’) groups underwent a strong bottleneck before the settlement of the region, and separated around 20,000–40,000 years ago. We infer that the East Asian ancestors of Pacific populations may have diverged from Taiwanese Indigenous peoples before the Neolithic expansion, which is thought to have started from Taiwan around 5,000 years ago²,³,⁴. Additionally, this dispersal was not followed by an immediate, single admixture event with Near Oceanian populations, but involved recurrent episodes of genetic interactions. Our analyses reveal marked differences in the proportion and nature of Denisovan heritage among Pacific groups, suggesting that independent interbreeding with highly structured archaic populations occurred. Furthermore, whereas introgression of Neanderthal genetic information facilitated the adaptation of modern humans related to multiple phenotypes (for example, metabolism, pigmentation and neuronal development), Denisovan introgression was primarily beneficial for immune-related functions. Finally, we report evidence of selective sweeps and polygenic adaptation associated with pathogen exposure and lipid metabolism in the Pacific region, increasing our understanding of the mechanisms of biological adaptation to island environments.

Fig. 1: Whole-genome variation in Pacific Islanders.
Fig. 2: Demographic models of the human settlement of the Pacific.
Fig. 3: Neanderthal and Denisovan introgression across the Pacific.
Fig. 4: Mechanisms of genetic adaptation to Pacific environments.

Data availability

The whole-genome sequencing dataset generated and analysed in this study is available from the European Genome-Phenome Archive (EGA; https://www.ebi.ac.uk/ega/), under accession code EGAS00001004540. Data access and use are restricted to academic research in population genetics, including research on population origins, ancestry and history. The SGDP genome data were retrieved from the EBI European Nucleotide Archive (accession codes PRJEB9586 and ERP010710). The genome data from Malaspinas et al.¹⁸ were retrieved from the EGA (accession code EGAS00001001247). The genome data from Vernot et al.¹⁶ were retrieved from dbGAP (accession code phs001085.v1.p1).

Code availability

Neutrality statistics were computed with the optimized, window-based algorithms implemented in selink (https://github.com/h-e-g/selink). All other custom-generated computer codes or algorithms used in this study are available on GitHub (https://github.com/h-e-g/evoceania).

References

1. Gosling, A. L. & Matisoo-Smith, E. A. The evolutionary history and human settlement of Australia and the Pacific. Curr. Opin. Genet. Dev. 53, 53–59 (2018).
2. Hung, H.-C. & Carson, M. T. Foragers, fishers and farmers: origins of the Taiwanese Neolithic. Antiquity 88, 1115–1131 (2014).
3. Gray, R. D., Drummond, A. J. & Greenhill, S. J. Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323, 479–483 (2009).
4. Bellwood, P. First Farmers: the Origins of Agricultural Societies (Blackwell, 2005).
5. O’Connell, J. F. et al. When did Homo sapiens first reach Southeast Asia and Sahul? Proc. Natl Acad. Sci. USA 115, 8482–8490 (2018).
6. Kirch, P. V. On the Road of the Winds: An Archeological History of the Pacific Islands before European Contact (Univ. California Press, 2017).
7. Wollstein, A. et al. Demographic history of Oceania inferred from genome-wide data. Curr. Biol. 20, 1983–1992 (2010).
8. Lipson, M. et al. Population turnover in Remote Oceania shortly after initial settlement. Curr. Biol. 28, 1157–1165 (2018).
9. Skoglund, P. et al. Genomic insights into the peopling of the Southwest Pacific. Nature 538, 510–513 (2016).
10. Posth, C. et al. Language continuity despite population replacement in Remote Oceania. Nat. Ecol. Evol. 2, 731–740 (2018).
11. Pugach, I. et al. The gateway from Near into Remote Oceania: new insights from genome-wide data. Mol. Biol. Evol. 35, 871–886 (2018).
12. Bergström, A. et al. A Neolithic expansion, but strong genetic structure, in the independent history of New Guinea. Science 357, 1160–1163 (2017).
13. Ioannidis, A. G. et al. Native American gene flow into Polynesia predating Easter Island settlement. Nature 583, 572–577 (2020).
14. Qin, P. & Stoneking, M. Denisovan ancestry in East Eurasian and Native American populations. Mol. Biol. Evol. 32, 2665–2674 (2015).
15. Reich, D. et al. Denisova admixture and the first modern human dispersals into Southeast Asia and Oceania. Am. J. Hum. Genet. 89, 516–528 (2011).
16. Vernot, B. et al. Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science 352, 235–239 (2016).
17. Sankararaman, S., Mallick, S., Patterson, N. & Reich, D. The combined landscape of Denisovan and Neanderthal ancestry in present-day humans. Curr. Biol. 26, 1241–1247 (2016).
18. Malaspinas, A. S. et al. A genomic history of Aboriginal Australia. Nature 538, 207–214 (2016).
19. Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
20. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014).
21. Prüfer, K. et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science 358, 655–658 (2017).
22. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012).
23. Lipson, M. et al. Three phases of ancient migration shaped the ancestry of human populations in Vanuatu. Curr. Biol. 30, 4846–4856 (2020).
24. Excoffier, L., Dupanloup, I., Huerta-Sánchez, E., Sousa, V. C. & Foll, M. Robust demographic inference from genomic and SNP data. PLoS Genet. 9, e1003905 (2013).
25. Larena, M. et al. Multiple migrations to the Philippines during the last 50,000 years. Proc. Natl Acad. Sci. USA, https://doi.org/10.1073/pnas.2026132118 (2021).
26. Yang, M. A. et al. Ancient DNA indicates human population shifts and admixture in northern and southern China. Science 369, 282–288 (2020).
27. Rieth, T. M. & Athens, J. S. Late Holocene human expansion into Near and Remote Oceania: a Bayesian model of the chronologies of the Mariana Islands and Bismarck Archipelago. J. Island Coast. Archaeol. 14, 5–16 (2019).
28. Browning, S. R., Browning, B. L., Zhou, Y., Tucci, S. & Akey, J. M. Analysis of human sequence data reveals two pulses of archaic Denisovan admixture. Cell 173, 53–61 (2018).
29. Jacobs, G. S. et al. Multiple deeply divergent Denisovan ancestries in Papuans. Cell 177, 1010–1021 (2019).
30. Détroit, F. et al. A new species of Homo from the Late Pleistocene of the Philippines. Nature 568, 181–186 (2019).
31. Gittelman, R. M. et al. Archaic hominin admixture facilitated adaptation to out-of-Africa environments. Curr. Biol. 26, 3375–3382 (2016).
32. Racimo, F., Marnetto, D. & Huerta-Sánchez, E. Signatures of archaic adaptive introgression in present-day human populations. Mol. Biol. Evol. 34, 296–317 (2017).
33. Simonti, C. N. et al. The phenotypic legacy of admixture between modern humans and Neandertals. Science 351, 737–741 (2016).
34. Vitale, C. et al. Surface expression and function of p75/AIRM-1 or CD33 in acute myeloid leukemias: engagement of CD33 induces apoptosis of leukemic cells. Proc. Natl Acad. Sci. USA 98, 5764–5769 (2001).
35. Negishi, H. et al. Negative regulation of Toll-like-receptor signaling by IRF-4. Proc. Natl Acad. Sci. USA 102, 15989–15994 (2005).
36. Hedblom, E. & Kirkness, E. F. A novel class of GABAA receptor subunit in tissues of the reproductive system. J. Biol. Chem. 272, 15346–15350 (1997).
37. Hoffmann, T. J. et al. A large multiethnic genome-wide association study of adult body mass index identifies novel loci. Genetics 210, 499–515 (2018).
38. Lee, I. H. et al. Atg7 modulates p53 activity to regulate cell cycle and survival during metabolic stress. Science 336, 225–228 (2012).
39. Giri, A. et al. Trans-ethnic association study of blood pressure determinants in over 750,000 individuals. Nat. Genet. 51, 51–62 (2019).
40. Sakaue, S. et al. Functional variants in ADH1B and ALDH2 are non-additively associated with all-cause mortality in Japanese population. Eur. J. Hum. Genet. 28, 378–382 (2020).
41. Perttilä, J. et al. OSBPL10, a novel candidate gene for high triglyceride trait in dyslipidemic Finnish subjects, regulates cellular lipid metabolism. J. Mol. Med. 87, 825–835 (2009).
42. Sierra, B. et al. OSBPL10, RXRA and lipid metabolism confer African-ancestry protection against dengue haemorrhagic fever in admixed Cubans. PLoS Pathog. 13, e1006220 (2017).
43. Gao, X. R., Huang, H. & Kim, H. Genome-wide association analyses identify 139 loci associated with macular thickness in the UK Biobank cohort. Hum. Mol. Genet. 28, 1162–1172 (2019).
44. Sella, G. & Barton, N. H. Thinking about the evolution of complex traits in the era of genome-wide association studies. Annu. Rev. Genomics Hum. Genet. 20, 461–493 (2019).
45. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
46. Field, Y. et al. Detection of human adaptation during the past 2000 years. Science 354, 760–764 (2016).
47. Berg, J. J. et al. Reduced signal for polygenic adaptation of height in UK Biobank. eLife 8, e39725 (2019).
48. Brown, P. et al. A new small-bodied hominin from the Late Pleistocene of Flores, Indonesia. Nature 431, 1055–1061 (2004).
49. Gouy, A. & Excoffier, L. Polygenic patterns of adaptive introgression in modern humans are mainly shaped by response to pathogens. Mol. Biol. Evol. 37, 1420–1433 (2020).
50. Gosling, A. L., Buckley, H. R., Matisoo-Smith, E. & Merriman, T. R. Pacific populations, metabolic disease and ‘just-so stories’: a critique of the ‘thrifty genotype’ hypothesis in Oceania. Ann. Hum. Genet. 79, 470–480 (2015).
51. R Core Team. R: A language and environment for statistical computing. http://www.R-project.org/ (R Foundation for Statistical Computing, 2013).
52. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
53. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
54. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
55. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
56. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
57. Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
58. Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).
59. Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
60. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
61. Barrett, J. C., Fry, B., Maller, J. & Daly, M. J. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265 (2005).
62. 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
63. Excoffier, L., Smouse, P. E. & Quattro, J. M. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131, 479–491 (1992).
64. Meyer, L. R. et al. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 41, D64–D69 (2013).
65. de Manuel, M. et al. Chimpanzee genomic diversity reveals ancient admixture with bonobos. Science 354, 477–481 (2016).
66. Sikora, M. et al. The population history of northeastern Siberia since the Pleistocene. Nature 570, 182–188 (2019).
67. Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).
68. Fenner, J. N. Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. Am. J. Phys. Anthropol. 128, 415–423 (2005).
69. Fu, Q. et al. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature 514, 445–449 (2014).
70. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
71. Excoffier, L. & Lischer, H. E. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol. Ecol. Resour. 10, 564–567 (2010).
72. Beaumont, M. A., Zhang, W. & Balding, D. J. Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035 (2002).
73. Tavaré, S., Balding, D. J., Griffiths, R. C. & Donnelly, P. Inferring coalescence times from DNA sequence data. Genetics 145, 505–518 (1997).
74. Fortes-Lima, C. A., Laurent, L., Thouzeau, V., Toupance, B. & Verdu, P. Complex genetic admixture histories reconstructed with approximate Bayesian computations. Mol. Ecol. Resour. https://doi.org/10.1111/1755-0998.13325 (2021).
75. Verdu, P. & Rosenberg, N. A. A general mechanistic model for admixture histories of hybrid populations. Genetics 189, 1413–1426 (2011).
76. Gravel, S. Population genetics models of local ancestry. Genetics 191, 607–619 (2012).
77. Liang, M. & Nielsen, R. The lengths of admixture tracts. Genetics 197, 953–967 (2014).
78. Csilléry, K., François, O. & Blum, M. G. B. abc: an R package for approximate Bayesian computation (ABC). Methods Ecol. Evol. 3, 475–479 (2012).
79. Pudlo, P. et al. Reliable ABC model choice via random forests. Bioinformatics 32, 859–866 (2016).
80. Maples, B. K., Gravel, S., Kenny, E. E. & Bustamante, C. D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 93, 278–288 (2013).
81. Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
82. Sankararaman, S. et al. The genomic landscape of Neanderthal ancestry in present-day humans. Nature 507, 354–357 (2014).
83. Delaneau, O., Marchini, J. & Zagury, J. F. A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179–181 (2012).
84. Delaneau, O., Zagury, J. F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).
85. Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
86. Kutmon, M. et al. WikiPathways: capturing the full diversity of pathway knowledge. Nucleic Acids Res. 44, D488–D494 (2016).
87. Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
88. The Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nat. Genet. 25, 25–29 (2000).
89. Deschamps, M. et al. Genomic signatures of selective pressures and introgression from archaic hominins at human innate immunity genes. Am. J. Hum. Genet. 98, 5–21 (2016).
90. Enard, D. & Petrov, D. A. Evidence that RNA viruses drove adaptive introgression between Neanderthals and modern humans. Cell 175, 360–371 (2018).
91. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
92. Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
93. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
94. Shriver, M. D. et al. The genomic distribution of population substructure in four populations using 8,525 autosomal SNPs. Hum. Genomics 1, 274–286 (2004).
95. Sabeti, P. C. et al. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918 (2007).
96. Speidel, L., Forest, M., Shi, S. & Myers, S. R. A method for genome-wide genealogy estimation for thousands of samples. Nat. Genet. 51, 1321–1329 (2019).
97. GenomeAsia100K Consortium. The GenomeAsia 100K Project enables genetic discoveries across Asia. Nature 576, 106–111 (2019).

Acknowledgements

We thank all volunteers and Indigenous communities participating in this research; S. Créno and the HPC Core Facility of Institut Pasteur (Paris) for the management of computational resources; F. Mendoza de Leon Jr, NCCA chairperson 2010–2016, for his support; C. Ebeo, O. Casel, K. Pullupul Hagada, D. Guilay, A. Manera and R. Quilang of Cagayan State University, Lahaina Sue Azarcon and Samuel Benigno of Quirino State University, and the regional and provincial offices of the National Commission for Indigenous Peoples (NCIP)–Cagayan Valley for their support and assistance. J.C. is supported by the INCEPTION programme ANR-16-CONV-0005 and the Ecole Doctorale FIRE-CRI-Programme Bettencourt and L.R.A. by a Pasteur-Roux-Cantarini fellowship. The CNRGH sequencing platform was supported by the France Génomique National infrastructure, funded as part of the « Investissements d’Avenir » programme managed by the Agence Nationale pour la Recherche (ANR-10-INBS-09). M.J. is supported by the Knut and Alice Wallenberg foundation. M.S. is supported by the Max Planck Society. The laboratory of L.Q.-M. is supported by the Institut Pasteur, the Collège de France, the CNRS, the Fondation Allianz-Institut de France and the French Government’s Investissement d’Avenir programme, Laboratoires d’Excellence ‘Integrative Biology of Emerging Infectious Diseases’ (ANR-10-LABX-62-IBEID) and ‘Milieu Intérieur’ (ANR-10-LABX-69-01).

Author information

Contributions

E.P. and L.Q.-M. conceived and supervised the project; J.C. led and performed the processing of the genetic data as well as the analyses of population structure and demographic inference; J.M.-R. led and performed the analyses of archaic and adaptive introgression; L.R.A. led and performed the analyses of genetic adaptation; S.C.-E., R.L. and P.V. performed the analyses of admixture models; E.P. coordinated all genetic analyses; O.C., M.L., A.M.-S.K., Y.-C.K., M.J., A.G. and M.S. collaborated with local groups to collect population samples; C.H., A.B., R.O. and J.-F.D. coordinated and performed sample preparation and sequencing; F.V. provided the archaeological and anthropological context; G.L. and L.E. provided the theoretical and methodological context; J.C., J.M.-R., L.R.A., E.P. and L.Q.-M. wrote the manuscript, with critical input from all authors.

Corresponding authors

Correspondence to Etienne Patin or Lluis Quintana-Murci.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Patrick Kirch, Cosimo Posth and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Extended data figures and tables

Extended Data Fig. 1 Genetic structure of Pacific populations.

ADMIXTURE ancestry components are shown from K = 2 (top) to K = 10 (bottom) for the 462 unrelated individuals. The lowest cross-validation error was obtained at K = 6 (Supplementary Fig. 5). Populations are delimited by black borders. Population width is not proportional to population sample size, which is indicated in parentheses.

Extended Data Fig. 2 Demographic models for Pacific populations.

a, Maximum-likelihood demographic model for baseline populations. Point estimates of parameters and 95% confidence intervals are shown in Supplementary Table 2. b, Maximum-likelihood demographic models for western Remote Oceanian individuals (VAN). The likelihoods of the two models are not considered to be different. Point estimates of parameters and 95% confidence intervals are shown in Supplementary Table 5. The (VAN, PNG) model (left) assumes that the ni-Vanuatu diverged from Papua New Guinean Highlanders and then received gene flow from Solomon Islanders, Bismarck Islanders and Austronesian-speaking Taiwanese Indigenous peoples. The (VAN, SLI) model (right) assumes that the ni-Vanuatu diverged from the Solomon Islanders and then received gene flow from the other three groups. For the sake of clarity, only Taiwanese Indigenous, Near Oceanian and western Remote Oceanian populations are shown. c, Maximum-likelihood model for Austronesian-speaking populations, represented by Taiwanese Indigenous, Philippine Kankanaey and Tikopia Polynesian individuals. BKA, Bismarck Islanders; HAN, Han Chinese individuals (China); NOC GST, a meta-population of Near Oceanian individuals; OoA GST, an unsampled population to represent the Out-of-Africa exodus; PHP, Philippine individuals; PNG, Papua New Guinean Highlanders; POL, Polynesian individuals from the Solomon Islands; SAR, Sardinian individuals (Italy); SLI, Solomon Islanders; TWN, Taiwanese Indigenous peoples; VAN, ni-Vanuatu; YRB, Yoruba individuals (Nigeria). We assumed a mutation rate of 1.25 × 10⁻⁸ mutations per generation per site and a generation time of 29 years. Single-pulse introgression rates are reported as a percentage. The 95% confidence intervals are shown in square brackets. The larger the rectangle width, the larger the estimated effective population size (Ne), except for b. Bottlenecks are indicated by black rectangles. Grey and black arrows represent continuous and single pulse gene flow, respectively. One- and two-directional arrows indicate asymmetric and symmetric gene flow, respectively. We limited the number of parameter estimations by making simplifying assumptions regarding the recent demography of East-Asian-related and Near Oceanian populations in a and c, respectively (Supplementary Note 4). Sample sizes are described in Supplementary Note 4.
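As a rough illustration of how the two scaling constants stated in this caption are typically used, the Python sketch below converts times between generations and years and computes the expected number of mutations per site along a single lineage. Only the generation time (29 years) and the mutation rate (1.25 × 10⁻⁸ per generation per site) come from the caption; the function names are hypothetical and are not taken from the authors' code.

```python
# Scaling constants stated in the caption.
GENERATION_TIME_YEARS = 29
MUTATION_RATE = 1.25e-8  # mutations per generation per site


def generations_to_years(generations: float) -> float:
    """Convert a time expressed in generations to years."""
    return generations * GENERATION_TIME_YEARS


def years_to_generations(years: float) -> float:
    """Convert a time expressed in years to generations."""
    return years / GENERATION_TIME_YEARS


def expected_mutations_per_site(years: float) -> float:
    """Expected mutations per site accumulated along one lineage over `years`."""
    return MUTATION_RATE * years_to_generations(years)


if __name__ == "__main__":
    # A time of 1,000 generations corresponds to 29,000 years.
    print(generations_to_years(1000))
    print(expected_mutations_per_site(29_000))
```

Under these assumptions, a parameter estimated in generations by a demographic inference tool is simply multiplied by 29 to be reported in years.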

Extended Data Fig. 3 Match rate of introgressed S′ haplotypes in Pacific populations to the Vindija Neanderthal and Altai Denisovan genomes.

The match rate is the proportion of putative archaic alleles matching a given archaic genome, excluding sites at masked positions. Only S′ haplotypes with more than 40 sites outside archaic genome masks were included in the analysis. The numbers indicate the height of the density corresponding to each contour line. Contour lines are shown for multiples of 1 (solid lines) and multiples of 0.1 between 0.3 and 0.9 (dashed lines).
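The match rate defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the archaic genome is represented here as a set of observed alleles per site, masks as booleans, and the names `match_rate` and `MIN_UNMASKED_SITES` are hypothetical. Only the definition (proportion of putative archaic alleles matching the archaic genome, masked sites excluded) and the threshold of more than 40 unmasked sites come from the caption.

```python
from typing import Optional, Sequence, Set

# From the caption: only haplotypes with MORE than 40 unmasked sites are analysed.
MIN_UNMASKED_SITES = 40


def match_rate(
    putative_alleles: Sequence[str],
    archaic_alleles: Sequence[Set[str]],
    masked: Sequence[bool],
) -> Optional[float]:
    """Proportion of putative archaic alleles found in the archaic genome.

    Sites flagged in `masked` are skipped. Returns None when the haplotype
    has too few unmasked sites to be included in the analysis.
    """
    unmasked = 0
    matches = 0
    for allele, archaic, is_masked in zip(putative_alleles, archaic_alleles, masked):
        if is_masked:
            continue
        unmasked += 1
        if allele in archaic:
            matches += 1
    if unmasked <= MIN_UNMASKED_SITES:
        return None
    return matches / unmasked
```

Returning `None` rather than a number for short haplotypes keeps excluded haplotypes from being mistaken for haplotypes with a match rate of zero.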

Extended Data Fig. 4 Detection of introgressed haplotypes from an unknown archaic hominin.

a, Cumulative length of S′ haplotypes retrieved among modern human populations (S′), after removing Neanderthal CRF haplotypes (S′NoNeanderthal) or Denisovan CRF haplotypes (S′NoDenisova) or both (S′NoArchaic), and removing from the S′NoArchaic haplotypes those with a match rate higher than 1% to either the Vindija Neanderthal or Altai Denisovan genomes (S′NoArchaicLowMatch). These S′ haplotypes are, therefore, putatively introgressed haplotypes from hominins outside of the Neanderthal and Denisovan branch (Supplementary Note 13). b, Proportion of S′NoArchaicLowMatch haplotypes common or private (that is, unique) to populations. Total numbers of S′NoArchaicLowMatch haplotypes are shown above the population labels.
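The successive filters described in panel a can be pictured as the following Python sketch. It assumes, for illustration only, that each S′ haplotype already carries precomputed flags for overlap with Neanderthal or Denisovan CRF haplotypes and precomputed match rates; how the overlap is determined is not described in this caption. The class and field names are hypothetical; the 1% match-rate cutoff is taken from the caption.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

LOW_MATCH_CUTOFF = 0.01  # from the caption: drop match rates higher than 1%


@dataclass
class SPrimeHaplotype:
    length_bp: int
    in_neanderthal_crf: bool   # assumed precomputed overlap flag
    in_denisovan_crf: bool     # assumed precomputed overlap flag
    match_neanderthal: float   # match rate to the Vindija Neanderthal genome
    match_denisovan: float     # match rate to the Altai Denisovan genome


def cumulative_lengths(haplotypes: Iterable[SPrimeHaplotype]) -> Dict[str, int]:
    """Cumulative haplotype length retained after each filter in panel a."""
    totals = {"all": 0, "no_neanderthal": 0, "no_denisova": 0,
              "no_archaic": 0, "no_archaic_low_match": 0}
    for h in haplotypes:
        totals["all"] += h.length_bp
        if not h.in_neanderthal_crf:
            totals["no_neanderthal"] += h.length_bp
        if not h.in_denisovan_crf:
            totals["no_denisova"] += h.length_bp
        if not h.in_neanderthal_crf and not h.in_denisovan_crf:
            totals["no_archaic"] += h.length_bp
            # Keep only haplotypes whose match rate is at most 1% to both genomes.
            if (h.match_neanderthal <= LOW_MATCH_CUTOFF
                    and h.match_denisovan <= LOW_MATCH_CUTOFF):
                totals["no_archaic_low_match"] += h.length_bp
    return totals
```

The last category corresponds to the S′NoArchaicLowMatch set, that is, the haplotypes treated as putatively introgressed from hominins outside the Neanderthal and Denisovan branch.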

Extended Data Fig. 5 Examples of candidate loci for adaptive introgression in Pacific populations.

a, Adaptive introgression of Denisovan origin at the CD33 locus. b, Adaptive introgression of Denisovan origin at the IRF4 locus. c, Adaptive introgression of Neanderthal origin at the KRT80 locus. d, Adaptive introgression of Neanderthal origin at the TBC1D1 locus. e, Adaptive introgression of Denisovan origin at the JAK1 locus. f, Adaptive introgression of Denisovan origin at the BANK1 locus. a–f, Left, local Manhattan plot showing the derived allele frequency of archaic SNPs (aSNPs), the proportion of high-confidence introgressed haplotypes (HC CRF) and the gene isoforms at the locus (in Mb, based on hg19 coordinates). Middle, derived allele frequencies of the top archaic SNP in 1000 Genomes Project phase 3 populations (excluding recently admixed populations). Right, derived allele frequencies of the top archaic SNP in populations from this study. Colours in the left panels indicate populations as in Fig. 1. Pie charts indicate the derived allele frequency in purple, and are centred on the approximate geographical location of each population. Maps were generated using the maps R package⁵¹.

Extended Data Fig. 6 Classic sweep signals detected in Papuan-related populations.

a–d, Manhattan plots of classic sweep signals in Papua New Guinean Highlanders (a), Solomon Islanders (b), ni-Vanuatu (c) and Philippine Agta (d). a–d, The y axis shows the −log10(P value) for the number of outlier SNPs per window. Each point is a 100-kb window. The names of genes associated with windows with significant sweep signals are shown.
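To make the plotted quantity more concrete, the Python sketch below bins SNPs into 100-kb windows, counts outlier SNPs per window and converts a tail probability into −log10(P). The caption does not state how the P value is derived; the binomial upper tail used here is an assumption chosen purely for illustration, and all function names are hypothetical. Only the 100-kb window size and the −log10 transform come from the caption.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

WINDOW_SIZE = 100_000  # 100-kb windows, as in the caption


def window_counts(snps: Iterable[Tuple[int, bool]]) -> Dict[int, Tuple[int, int]]:
    """Map window index -> (number of SNPs, number of outlier SNPs).

    `snps` yields (position, is_outlier) pairs for one chromosome.
    """
    totals: Counter = Counter()
    outliers: Counter = Counter()
    for position, is_outlier in snps:
        window = position // WINDOW_SIZE
        totals[window] += 1
        if is_outlier:
            outliers[window] += 1
    return {w: (totals[w], outliers[w]) for w in totals}


def binomial_upper_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p). ASSUMPTION: not the paper's stated test."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def neg_log10_p(n: int, k: int, p: float) -> float:
    """-log10 of the binomial upper-tail probability for one window."""
    return -math.log10(binomial_upper_tail(n, k, p))
```

Here `p` would be the genome-wide fraction of SNPs classified as outliers, so that windows enriched in outliers receive larger values on the y axis.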

Extended Data Fig. 7 Classic sweeps and adaptive archaic introgression.

a, b, Coloured squares indicate genomic regions displaying signals of both a selective sweep and adaptive introgression from Neanderthals (a) or Denisovans (b). Yellow and blue frames indicate genomic regions identified in East-Asian- and Papuan-related populations, respectively. AGT, Philippine Agta; PHP, Philippine individuals.

Extended Data Fig. 8 Examples of candidate loci for classic sweeps in Pacific populations.

a, c, Sweep signals detected in Papuan-related populations at the GABRP locus (a) and in Polynesian populations at the LHFPL2 locus (c). Manhattan plots show the −log10(P value) of the Fisher’s scores for each SNP (Supplementary Note 16). b, d, Maps showing the population allele frequencies for candidate SNPs rs79997355 (GABRP) (b) and rs117421341 (LHFPL2) (d). Pie charts indicate the derived allele frequency in purple, in which the radius is proportional to the sample size (Supplementary Table 1). The pie charts for the populations of Santa Cruz and Vanuatu were moved from their sampling locations for convenience. Maps were generated using the maps R package51.

Extended Data Fig. 9 Classic sweep signals detected in East-Asian-related populations.

Manhattan plots of classic sweep signals in East Asian individuals (a), Taiwanese Indigenous peoples (b), Philippine Cebuano (c) and Polynesian individuals (d). a–d, The y axis shows the −log10(P value) for the number of outlier SNPs per window. Each point is a 100-kb window. The names of genes associated with windows with significant sweep signals are shown.

Extended Data Fig. 10 Schematic model of the history of archaic introgression in modern humans.

The phylogenetic tree depicts relationships among archaic and modern humans. Estimates for the splits between archaic, introgressing populations and for introgression episodes are shown. Five introgression events are consistent with our data: a Neanderthal introgression event into the common ancestors of non-African individuals around 61 ka; a Denisovan introgression event into the ancestors of Papuan individuals approximately 46 ka, which is shared with the ancestral Indigenous Australian individuals and Philippine Agta populations14,15,17,97; a Denisovan introgression event that occurred only in the ancestors of Papuan individuals around 25 ka; a Denisovan introgression event in the ancestors of East Asian individuals around 21 ka, the legacy of which is also observed in Philippine Agta and western Eurasian individuals due to subsequent gene flow (solid purple arrows); and a Denisovan introgression event into the ancestors of the Philippine Agta at an unknown date.

Supplementary information

Supplementary Information

This file contains Supplementary Notes 1-18, which provide additional information on the methodology, results and discussion of the main text; Supplementary Figs. 1-82, which present additional results and quality checks; and Supplementary References.

Reporting Summary

Supplementary Tables

This file contains Supplementary Tables 1-25 (referred to in the main Supplementary Information file).

Peer Review File

About this article

Cite this article

Choin, J., Mendoza-Revilla, J., Arauna, L.R. et al. Genomic insights into population history and biological adaptation in Oceania. Nature 592, 583–589 (2021). https://doi.org/10.1038/s41586-021-03236-5
