Population Genetics: Global clues to the nature of genomic mutations in humans

An analysis of worldwide human genetic variation reveals the footprints of ancient changes in genomic mutation processes.
Aylwyn Scally (corresponding author)
University of Cambridge, United Kingdom

All children inherit a mixture of chromosomal sequences from their parents, and although the copying process involved is extremely accurate, some errors occur. We refer to such errors as de novo (new) germline mutations, and when they are passed on to subsequent generations, these mutations are the raw material on which natural selection works and the source of all genetic differences between populations and species. Most have negligible or minor effects, but on very rare occasions some are responsible for serious genetic disease (Veltman and Brunner, 2012). The accumulation of genetic differences through mutation is also a primary source of information about human evolution. It is therefore important to understand the nature of genomic mutation, the rate at which it occurs and the factors causing it.

A key question is to what extent mutation processes differ between individuals, either in the total number of de novo mutations bequeathed to offspring or in the places in the genome where they occur. Such differences could be genetic in origin, given that the proteins involved in DNA replication are themselves encoded in the genome, and there could also be environmental effects associated with where and how an individual lives. In both cases these factors may reflect recent evolutionary events, particularly the divergence of human populations and their global dispersal within the last 100,000 years.

One way to address this question would be to collect genomic and other data for thousands of families, identify de novo mutations in each of the offspring, and then analyse the factors contributing to them. However, this approach is extremely demanding in terms of the resources needed. Now, in eLife, Kelley Harris and Jonathan Pritchard of Stanford University report an alternative approach (Harris and Pritchard, 2017) that uses a dataset of whole-genome sequences for 2,504 individuals from 26 different populations around the world (Auton et al., 2015). The key observation is that any genetic variant in this dataset, even if it is found today in many individuals, was once a de novo mutation in a single ancestor. Thus if different genetic or environmental factors have affected mutation processes in different human populations, we might expect to find evidence in the distribution of variants in these populations today.

Harris and Pritchard categorised single-nucleotide variants in this dataset by their ancestral and derived alleles (i.e. the version before and after mutation) and their sequence context as represented by the two flanking nucleotides. For example, the category AGC → ATC represents a mutation from G to T with flanking nucleotides A and C. After counting the number of variants in each category within every individual, the researchers found that the distribution of counts, termed the mutation spectrum, differs between populations to the extent that it is possible to identify an individual’s continent of origin based solely on the spectrum of mutations they carry. In general, the differences between spectra comprise a multitude of small discrepancies, rather than large discrepancies in a few categories. However, some categories do stand out, most notably an increased abundance of TCC → TTC mutations in European and South Asian populations – a signal also seen in other recent studies based on similar data (Mathieson and Reich, 2017; Narasimhan et al., 2016).
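The categorisation described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `->` label format and the toy variant list are assumptions made for illustration, and the sketch does not merge categories that are equivalent on the opposite DNA strand.

```python
from collections import Counter


def mutation_category(ancestral_context: str, derived_allele: str) -> str:
    """Build a label such as 'AGC->ATC' from a three-base ancestral
    context and the derived (mutated) middle base."""
    if len(ancestral_context) != 3 or len(derived_allele) != 1:
        raise ValueError("expected a 3-base context and a 1-base derived allele")
    if ancestral_context[1] == derived_allele:
        raise ValueError("derived allele must differ from the ancestral allele")
    derived_context = ancestral_context[0] + derived_allele + ancestral_context[2]
    return f"{ancestral_context}->{derived_context}"


def mutation_spectrum(variants):
    """Return the fraction of an individual's variants in each category.

    `variants` is an iterable of (ancestral_context, derived_allele) pairs
    for the derived variants carried by one individual (hypothetical input).
    """
    counts = Counter(mutation_category(ctx, alt) for ctx, alt in variants)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# Toy data, invented for illustration only (not taken from the study).
toy_variants = [("AGC", "T"), ("TCC", "T"), ("TCC", "T"), ("ACG", "T")]
spectrum = mutation_spectrum(toy_variants)
print(spectrum["TCC->TTC"])  # fraction of this individual's variants that are TCC->TTC
```

Normalising counts to fractions makes spectra comparable between individuals who carry different total numbers of derived variants.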

Harris and Pritchard then looked at how this signal changes with time. Mutations themselves have no timestamp, but on average a mutation that is rare in the population is likely to have arisen more recently than a mutation that is common. Using frequency as a proxy for age, Harris and Pritchard found that the enrichment for TCC → TTC was evident mainly in variants of intermediate age, and not in recent or very old variants. The data fit a model in which there was a pulse of mutation between about 2,000 and 15,000 years ago in the ancestors of present-day Europeans and South Asians, during which these mutations were 30–40% more likely.
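The idea of using frequency as a proxy for age can be sketched in Python by grouping variants into allele-frequency bins and asking what fraction of each bin falls in a category of interest. This is a simplified illustration and not the model fitted in the study: the bin edges, the function name and the toy data are all assumptions.

```python
def category_fraction_by_frequency(variants, category, bin_edges):
    """Fraction of variants belonging to `category` within each frequency bin.

    `variants` is an iterable of (derived_allele_frequency, category_label)
    pairs (hypothetical input). Bin i covers
    bin_edges[i] <= frequency < bin_edges[i + 1].
    Empty bins yield None.
    """
    n_bins = len(bin_edges) - 1
    in_category = [0] * n_bins
    totals = [0] * n_bins
    for freq, label in variants:
        for i in range(n_bins):
            if bin_edges[i] <= freq < bin_edges[i + 1]:
                totals[i] += 1
                if label == category:
                    in_category[i] += 1
                break
    return [c / t if t else None for c, t in zip(in_category, totals)]


# Toy data, invented for illustration only: rare, intermediate and common variants.
toy_variants = [
    (0.005, "TCC->TTC"), (0.005, "AGC->ATC"), (0.005, "ACG->ATG"), (0.005, "AAA->ACA"),
    (0.05, "TCC->TTC"), (0.10, "TCC->TTC"), (0.10, "AGC->ATC"), (0.15, "ACG->ATG"),
    (0.50, "TCC->TTC"), (0.60, "AGC->ATC"), (0.70, "ACG->ATG"), (0.90, "AAA->ACA"),
]
# Assumed bin edges: rare (< 1%), intermediate (1% to 20%), common (>= 20%).
bin_edges = [0.0, 0.01, 0.2, 1.01]
fractions = category_fraction_by_frequency(toy_variants, "TCC->TTC", bin_edges)
print(fractions)
```

In this invented example the category is most abundant in the intermediate-frequency bin, which is the qualitative pattern a past pulse of mutation would be expected to leave. Frequency is only a rough proxy for age, so a real analysis would need a population-genetic model to convert frequencies into times.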

One possible explanation is that this is the legacy of a mutator allele (that is, a mutation that increases the rate of de novo mutation in some or all categories) that appeared and survived in the population for several thousand years before going extinct (Figure 1). It is not known how frequently such alleles arise, or to what extent they drive differences in the mutation spectrum; however, it does seem that under certain conditions they can survive for a long time and have lasting effects (Seoighe and Scally, 2017). In addition to a mutator allele that would have increased the relative rate of the TCC → TTC mutation, other mutator alleles that had smaller effects and/or survived for shorter times in different populations may be responsible for less prominent differences in the mutation spectrum.

Figure 1. Changes in the genetic mutation spectrum in humans.

Modern humans originated in Africa (top left) and spread into Eurasia in a series of migrations 50,000–80,000 years ago. Subsequent migrations within Eurasia (such as the migration between West Eurasia and South Asia around 3,000 years ago that is shown here; Reich et al., 2009) led to secondary genetic contact and admixture. The mutation spectrum (represented abstractly here) evolved separately in each population after divergence, perhaps due to the effect of mutator alleles: Harris and Pritchard propose that such a mutator allele could have been responsible for an increase in TCC → TTC mutations between 2,000 and 15,000 years ago (red shading), which influenced the mutation spectra of present-day Europeans and South Asians.

These findings are also relevant to the question of whether or not the overall rate of genomic mutation varies between populations and over time. There is evidence that the mean rate has probably not changed in at least the last 50,000 years (Fu et al., 2014), but we also know that the genome-wide mutation rate has slowed down when measured over timescales of millions of years (Moorjani et al., 2016). The results of Harris and Pritchard involve the relative rates of mutation in different sequence contexts, and so strictly speaking they do not tell us about variation in the overall rate, but they are suggestive of differences perhaps on the order of a few percent.

The next few years will see a substantial increase in the amount of available de novo mutation data, particularly from large-scale sequencing projects aimed at understanding the causes of rare genetic disease. These data will enable direct exploration of the factors determining mutation and its evolution within human populations, and shed further light on the questions addressed and raised by Harris and Pritchard.

Article and author information

Author details

  1. Aylwyn Scally

    Department of Genetics, University of Cambridge, Cambridge, United Kingdom
    For correspondence
    a.scally@gen.cam.ac.uk
    Competing interests
    The author declares that no competing interests exist.
    ORCID: 0000-0002-0807-1167

Publication history

  1. Version of Record published: May 17, 2017 (version 1)

Copyright

© 2017, Scally

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

Aylwyn Scally (2017) Population Genetics: Global clues to the nature of genomic mutations in humans. eLife 6:e27605. https://doi.org/10.7554/eLife.27605