
Differential DNA mismatch repair underlies mutation rate variation across the human genome

ABSTRACT

Cancer genome sequencing has revealed considerable variation in somatic mutation rates across the human genome, with mutation rates elevated in heterochromatic late replicating
regions and reduced in early replicating euchromatin (refs 1–5). Multiple mechanisms have been suggested to underlie this (refs 2, 6–10), but the actual cause is unknown. Here we identify
variable DNA mismatch repair (MMR) as the basis of this variation. Analysing ∼17 million single-nucleotide variants from the genomes of 652 tumours, we show that regional autosomal mutation
rates at megabase resolution are largely stable across cancer types, with differences related to changes in replication timing and gene expression. However, mutations arising after the
inactivation of MMR are no longer enriched in late replicating heterochromatin relative to early replicating euchromatin. Thus, differential DNA repair and not differential mutation supply
is the primary cause of the large-scale regional mutation rate variation across the human genome.

CHANGE HISTORY
* 26 February 2015: A sentence was edited in the abstract to match the authors’ findings.

REFERENCES
* Hodgkinson, A., Chen, Y. & Eyre-Walker, A.
The large-scale distribution of somatic mutations in cancer genomes. _Hum. Mutat._ 33, 136–143 (2012)
* Schuster-Böckler, B. & Lehner, B. Chromatin organization is a major influence on regional mutation rates in human cancer cells. _Nature_ 488, 504–507 (2012)
* Woo, Y. H. & Li, W.-H. DNA replication timing and selection shape the landscape of nucleotide variation in cancer genomes. _Nature Commun._ 3, 1004 (2012)
* Pleasance, E. D. et al. A comprehensive catalogue of somatic mutations from a human cancer genome. _Nature_ 463, 191–196 (2010)
* Liu, L., De, S. & Michor, F. DNA replication timing and higher-order nuclear organization determine single-nucleotide substitution patterns in cancer genomes. _Nature Commun._ 4, 1502 (2013)
* Stamatoyannopoulos, J. A. et al. Human mutation rate associated with DNA replication timing. _Nature Genet._ 41, 393–395 (2009)
* Waters, L. S. & Walker, G. C. The critical mutagenic translesion DNA polymerase Rev1 is highly expressed during G2/M phase rather than S phase. _Proc. Natl Acad. Sci. USA_ 103, 8971–8976 (2006)
* Hsu, T. C. A possible function of constitutive heterochromatin: the bodyguard hypothesis. _Genetics_ 79 (suppl.). 137–150 (1975)
* Sima, J. & Gilbert, D. M. Complex correlations: replication timing and mutational landscapes during cancer and genome evolution. _Curr. Opin. Genet. Dev._ 25, 93–100 (2014)
* Chen, C.-L. et al. Impact of replication timing on non-CpG and CpG substitution rates in mammalian genomes. _Genome Res._ 20, 447–457 (2010)
* Lawrence, M. S. et al. Mutational heterogeneity in cancer and the search for new cancer-associated genes. _Nature_ 499, 214–218 (2013)
* Jäger, N. et al. Hypermutation of the inactive X chromosome is a frequent event in cancer. _Cell_ 155, 567–581 (2013)
* The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of human colon and rectal cancer. _Nature_ 487, 330–337 (2012)
* The Cancer Genome Atlas Research Network. Integrated genomic characterization of endometrial carcinoma. _Nature_ 497, 67–73 (2013)
* The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. _Nature_ 513, 202–209 (2014)
* Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. _Nature_ 500, 415–421 (2013)
* Helleday, T., Eshtad, S. & Nik-Zainal, S. Mechanisms underlying mutational signatures in human cancers. _Nature Rev. Genet._ 15, 585–598 (2014)
* Hombauer, H., Srivatsan, A., Putnam, C. D. & Kolodner, R. D. Mismatch repair, but not heteroduplex rejection, is temporally coupled to DNA replication. _Science_ 334, 1713–1716 (2011)
* Edelbrock, M. A., Kaliyaperumal, S. & Williams, K. J. DNA mismatch repair efficiency and fidelity are elevated during DNA synthesis in human cells. _Mutat. Res._ 662, 59–66 (2009)
* Amouroux, R., Campalans, A., Epe, B. & Radicella, J. P. Oxidative stress triggers the preferential assembly of base excision repair complexes on open chromatin regions. _Nucleic Acids Res._ 38, 2878–2890 (2010)
* Chaudhuri, S., Wyrick, J. J. & Smerdon, M. J. Histone H3 Lys79 methylation is required for efficient nucleotide excision repair in a silenced locus of _Saccharomyces cerevisiae_. _Nucleic Acids Res._ 37, 1690–1700 (2009)
* Murga, M. et al. Global chromatin compaction limits the strength of the DNA damage response. _J. Cell Biol._ 178, 1101–1108 (2007)
* Hiratani, I. et al. Genome-wide dynamics of replication timing revealed by _in vitro_ models of mouse embryogenesis. _Genome Res._ 20, 155–169 (2010)
* Lubelsky, Y. et al. DNA replication and transcription programs respond to the same chromatin cues. _Genome Res._ 24, 1102–1114 (2014)
* Hiratani, I. et al. Global reorganization of replication domains during embryonic stem cell differentiation. _PLoS Biol._ 6, e245 (2008)
* Saunders, C. T. et al. Strelka: accurate somatic small-variant calling from sequenced tumor–normal sample pairs. _Bioinformatics_ 28, 1811–1817 (2012)
* Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. _Nature Biotechnol._ 31, 213–219 (2013)
* Roberts, N. D. et al. A comparative analysis of algorithms for somatic SNV detection in cancer. _Bioinformatics_ 29, 2223–2230 (2013)
* The Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult _de novo_ acute myeloid leukemia. _N. Engl. J. Med._ 368, 2059–2074 (2013)
* Derrien, T. et al. Fast computation and applications of genome mappability. _PLoS ONE_ 7, e30377 (2012)
* Costello, M. et al. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation. _Nucleic Acids Res._ 41, e67 (2013)
* Kim, T.-M., Laird, P. W. & Park, P. J. The landscape of microsatellite instability in colorectal and endometrial cancer genomes. _Cell_ 155, 858–868 (2013)
* Pawlik, T. M., Raut, C. P. & Rodriguez-Bigas, M. A. Colorectal carcinogenesis: MSI-H versus MSI-L. _Dis. Markers_ 20, 199–206 (2004)
* Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. _BMC Bioinformatics_ 12, 323 (2011)
* Wagner, G. P., Kin, K. & Lynch, V. J. Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. _Theory Biosci._ 131, 281–285 (2012)
* Hansen, R. S. et al. Sequencing newly replicated DNA reveals widespread plasticity in human replication timing. _Proc. Natl Acad. Sci. USA_ 107, 139–144 (2010)
* Thurman, R. E., Day, N., Noble, W. S. & Stamatoyannopoulos, J. A. Identification of higher-order functional domains in the human ENCODE regions. _Genome Res._ 17, 917–927 (2007)
* Barski, A. et al. High-resolution profiling of histone methylations in the human genome. _Cell_ 129, 823–837 (2007)
* Jackson, D. A. Stopping rules in principal components analysis: a comparison of heuristical and statistical approaches. _Ecology_ 74, 2204–2214 (1993)
* Mebane, W. R. & Sekhon, J. S. Genetic optimization using derivatives: the rgenoud package for R. _J. Stat. Softw._ 42, 473–487 (2010)

ACKNOWLEDGEMENTS
This work was supported by grants from the Spanish Ministry of Economy and Competitiveness (BFU2011-26206 and ‘Centro de Excelencia Severo Ochoa 2013-2017’
SEV-2012-0208), a European Research Council Consolidator grant IR-DC (616434), Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR), the EMBO Young Investigator Program, the EMBL-CRG
Systems Biology Program, FP7 project 4DCellFate (277899), FP7 project MAESTRA (ICT-2013-612944) and by Marie Curie Actions.

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS
* EMBL-CRG Systems Biology Unit, Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain: Fran Supek & Ben Lehner
* Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain: Fran Supek & Ben Lehner
* Division of Electronics, Rudjer Boskovic Institute, 10000 Zagreb, Croatia: Fran Supek
* Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain: Ben Lehner

CONTRIBUTIONS
F.S. performed all analyses. F.S. and B.L. designed analyses, interpreted the data and wrote the manuscript.

CORRESPONDING AUTHOR
Correspondence to Ben Lehner.

ETHICS DECLARATIONS

COMPETING INTERESTS
The authors declare no competing financial interests.

EXTENDED DATA FIGURES AND TABLES

EXTENDED DATA FIGURE 1 OVERALL MUTATIONAL BURDEN AND MEGABASE-SCALE REGIONAL RATE VARIABILITY IN TUMOUR SAMPLES OF MSI-PRONE CANCER TYPES.
A, B, Correlations of tissue specificity (TS; see Methods) in regional mutation rates of diffuse large B-cell lymphoma
(DLBC) with TS of gene expression in DLBC (A), or with TS of replication timing in the Gm12878 lymphoblastoid cell line (B). C, Overall mutational load, as SNVs per Mb of alignable genomic
DNA (Methods) for MSI-H, MSS (includes MSI-L), PolE mutant tumours, or otherwise hypermutated tumour samples. D, PC plot with PCs 3 and 4, as in Fig. 1e, but showing only tumour samples for
colorectal (CRAD), uterine (UCEC) and stomach (STAD) cancers for visual emphasis. E, F, Relative SNV frequencies across 1 Mb windows of chromosome 1p in UCEC and STAD. Unbroken and dotted
lines are the median across tumour samples and its 95% confidence interval, respectively. For each tumour sample, relative mutation frequencies are always obtained by dividing by the mean of
all 1 Mb windows. MSI/PolE samples are in the MSI-H group; hyper/ultramutators are not in the MSS group.

EXTENDED DATA FIGURE 2 REDUCED CORRELATION OF REGIONAL MUTATION RATES TO GENE EXPRESSION, HETEROCHROMATIN AND REPLICATION TIMING IN GENOMES AND EXOMES OF MSI TUMOURS.
A–C, The 1 Mb windows in the genome were pooled into five equal-frequency bins by the average gene
expression levels (log₂ transcripts per million (TPM)) in each window. The median and interquartile range of relative mutation rates across 1 Mb windows are shown for each bin. _R_² values
were always determined on original (not binned) data. _P_ < 0.01 for difference of _R_ after Fisher _Z_-transform. Gene expression levels are medians over TPM across 15 cancer types.
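
As a rough illustration of the Fisher _Z_-transform comparison mentioned above, the following Python sketch tests whether two Pearson correlations differ. It assumes the two correlations are treated as coming from independent samples and uses the standard large-sample normal approximation; the legend does not state which variant of the test was applied, and the function name and input values here are hypothetical.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """Compare two Pearson correlations via the Fisher Z-transform.

    Assumption (not from the source): the two correlations come from
    independent samples of sizes n1 and n2.
    Returns the z statistic and a two-sided p-value.
    """
    # Fisher transform: atanh(r) is approximately normal with variance 1/(n - 3)
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-sided p-value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical inputs: correlations of mutation rate with expression in two
# tumour groups, each computed over 2,000 windows
z_stat, p_value = fisher_z_difference(-0.8, 2000, -0.3, 2000)
```

A small `p_value` here would indicate that the two correlation coefficients differ by more than expected from sampling noise alone.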
Relative SNV frequencies of each tumour sample were obtained by normalizing by the average SNV density of all genomic 1 Mb windows of that sample. Prior to binning the windows, cancer
samples in a group were combined by taking the median of the relative mutation frequencies for each 1 Mb window, as illustrated for CRAD in Fig. 2d. PolE/MSI samples are in the MSI group;
ultramutators are not in the MSS group. MSI-L samples are pooled with MSS. D–F, Same as in A–C but for five heterochromatin bins (median H3K9me3 signal over eight tissues and cell lines).
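
The normalization and binning steps described in this legend can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, inputs are plain lists of per-window SNV counts, and bins are formed by sorting windows on the covariate (for example expression or H3K9me3 signal) and cutting them into groups of near-equal size.

```python
import statistics

def relative_rates(counts):
    """Divide one sample's per-window SNV counts by that sample's mean count."""
    mean_count = statistics.fmean(counts)
    return [c / mean_count for c in counts]

def pooled_median(samples):
    """Per-window median of relative rates across the samples of one group."""
    per_sample = [relative_rates(s) for s in samples]
    return [statistics.median(window) for window in zip(*per_sample)]

def equal_frequency_bin_medians(covariate, values, n_bins=5):
    """Sort windows by covariate, split them into n_bins groups of near-equal
    size, and return the median of `values` within each group."""
    order = sorted(range(len(covariate)), key=lambda i: covariate[i])
    medians = []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        medians.append(statistics.median([values[i] for i in order[lo:hi]]))
    return medians

# Toy data (hypothetical): two samples, ten windows, one covariate per window
samples = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
           [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]]
covariate = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
bin_medians = equal_frequency_bin_medians(covariate, pooled_median(samples))
```

Normalizing each sample by its own mean before pooling keeps heavily mutated samples from dominating the group median, which is the apparent purpose of the per-sample normalization described in the legend.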
G–I, Regional mutation rates in exome sequences of a broader set of 195 MSI-H tumour samples. The 1,709 genomic 1 Mb windows with at least 5 kb alignable protein-coding DNA each were grouped
into five equal-frequency bins by the median Repli-Seq signal over 11 cell lines (Methods). Mutations were pooled across all samples in one cancer type with a known MSI-H or MSS status
(Methods). _a_ is the slope of the regression line fit to binned data. J, Slopes _a_ determined for individual cancer exomes with a sufficient number of mutations (≥50 SNVs). Number of
samples _n_ shown below each group. For all cancer types, MSI-H samples have significantly less negative slopes than MSS (_P_ < 0.01, Mann–Whitney test, one tailed). MSI-H also includes
the MSI-H/PolE mutant samples, and MSS includes the MSI-L samples. In the exome analyses, ultramutators were not considered separately.

EXTENDED DATA FIGURE 3 ASSOCIATION OF MUTATIONAL SIGNATURES TO MICROSATELLITE INSTABILITY AND TO REPLICATION TIMING.
A, Relative frequencies of the 96 mutation contexts (strand symmetric) in MSI versus MSS cancers; the MSS group includes
MSI-L samples but not MSS/PolE ultramutators. Mutations were pooled across samples of MSI-prone tissues (CRAD, UCEC and STAD). B, C, Similar to Fig. 3a, b, showing two additional examples of
mutational contexts with different MSI propensities and their relative mutation rates across five genomic replication timing bins. D, Lack of correlation between the MSI propensity of a
mutational context with its replication timing slope in MSS tumour samples (compare to Fig. 3c, which shows slopes in MSI samples). Ts, transition; Tv, transversion. E, F, Association of per
cent MSI-specific signatures (CCN > CAN + GCN > GTN + [C/T]AN > [C/T]GN) across cancer samples and the binned replication timing slopes for two non-MSI transition signatures in the
same samples. Slopes averaged over contexts are displayed in each plot. In all panels except A, mutation rates were normalized to number of nucleotides at risk in a 1 Mb window before
determining the replication timing slopes.

EXTENDED DATA FIGURE 4 THE DECONVOLUTION OF MSI MUTATIONAL SPECTRA ROBUSTLY CONVERGES ONTO TWO EQUIVALENT SOLUTIONS.
A, Agreement of the observed
relative frequencies of mutational contexts in each tumour sample with the predictions of model 1 (having median _a_, _b_ and _z_ coefficients across all solutions in cluster 1). B, Sets of
best-fit solutions determined in a hundred optimization runs initialized with different starting conditions. The solutions cluster into two homogeneous clusters (Pearson _R_ > 0.9 between
>90% of the solutions within a cluster, in UPGMA hierarchical clustering). C, D, Solutions within both clusters have similar fit to observed data (C) and make extremely similar
predictions for mutation spectra in tumour samples (D). E–H, Similar to Fig. 4a, b. Example mutation accumulation diagrams for two mutation contexts typical of MSI tumours, shown for an
example MSI tumour TCGA-BR-4280 (E, G) and for an MSS tumour TCGA-CD-8529 (F, H). I, J, Values of the parameters in two solution clusters, with medians and interquartile ranges (shown as
whiskers). Each solution encompasses 104 parameters: relative mutation rates _a_ and _b_ for each of 28 mutational contexts (I), and the relative pre-MMR failure time _z_ for each tumour
sample of the 24 MSI and 24 MSS samples (J).

SUPPLEMENTARY INFORMATION
SUPPLEMENTARY TABLE 1 This table contains genome data sources. (XLSX 53 kb)

ABOUT THIS ARTICLE

CITE THIS ARTICLE
Supek, F., Lehner, B. Differential DNA mismatch repair underlies mutation rate variation across the human genome. _Nature_ 521, 81–84 (2015). https://doi.org/10.1038/nature14173

* Received: 30 September 2014
* Accepted: 19 December 2014
* Published: 23 February 2015
* Issue Date: 07 May 2015
* DOI: https://doi.org/10.1038/nature14173