Population variability in x-chromosome inactivation across 10 mammalian species

Nature

Select a language for the TTS:
UK English Female
UK English Male
US English Female
US English Male
Australian Female
Australian Male
Language selected: (auto detect) - EN

Play all audios:

ABSTRACT One of the two X-chromosomes in female mammals is epigenetically silenced in embryonic stem cells by X-chromosome inactivation. This creates a mosaic of cells expressing either the

maternal or the paternal X allele. The X-chromosome inactivation ratio, the proportion of inactivated parental alleles, varies widely among individuals, representing the largest instance of

epigenetic variability within mammalian populations. While various contributing factors to X-chromosome inactivation variability are recognized, namely stochastic and/or genetic effects,

their relative contributions are poorly understood. This is due in part to limited cross-species analysis, making it difficult to distinguish between generalizable or species-specific

mechanisms for X-chromosome inactivation ratio variability. To address this gap, we measure X-chromosome inactivation ratios in ten mammalian species (9531 individual samples), ranging from

rodents to primates, and compare the strength of stochastic models or genetic factors for explaining X-chromosome inactivation variability. Our results demonstrate the embryonic

stochasticity of X-chromosome inactivation is a general explanatory model for population X-chromosome inactivation variability in mammals, while genetic factors play a minor role. SIMILAR

CONTENT BEING VIEWED BY OTHERS INTEGRATED ANALYSIS OF XIST UPREGULATION AND X-CHROMOSOME INACTIVATION WITH SINGLE-CELL AND SINGLE-ALLELE RESOLUTION Article Open access 15 June 2021 GENE

REGULATION IN TIME AND SPACE DURING X-CHROMOSOME INACTIVATION Article 10 January 2022 X-LINKED COMPETITION — IMPLICATIONS FOR HUMAN DEVELOPMENT AND DISEASE Article 12 May 2025 INTRODUCTION

Every female mammalian embryo undergoes X-chromosome inactivation (XCI) as an essential step for successful development1,2,3. XCI evolved to balance the gene dosage between females with two

X-chromosomes and males with one X-chromosome4. While the exact timing can vary across species5, XCI usually occurs during preimplantation embryonic development6. During this process, one of

the two X-alleles in each female cell is independently, randomly, and permanently chosen for transcriptional silencing to match the single X-allele in male embryos1,7,8,9. The choice of

silenced X-allele is inherited through cell divisions, propagating the random choice of allelic inactivation down each cell’s subsequent lineage. This produces whole-body mosaicism for

allelic X-chromosome expression in each adult mammalian female, originating from very early embryonic development10. In humans, both X-alleles are equally likely to be inactivated, but XCI

ratios vary widely among adult females, from balanced to highly skewed. XCI ratios affect the phenotypes of X-linked diseases, as they can either protect or expose individuals to disease

variants10,11,12,13. The factors that influence XCI variability are mostly studied in mice and humans, and include stochasticity13 and genetics14,15,16, but their relative roles are

debated17. Cross-species comparisons of XCI variability stand to reveal general or species-specific mechanisms of XCI. For instance, genetic determinants of XCI are well-established in lab

mice18,19,20, but not in humans17,21,22, where they are more difficult to identify and measure. Exploring XCI variability in other mammals presents the opportunity to test models of

stochasticity or genetics in the context of evolution. Considering first a stochastic model for XCI variability, each cell within an embryo at the time of XCI independently selects an

X-allele to inactivate, resulting in ratios of allelic-inactivation varying across embryos purely by chance (Fig. 1A). Closely following Mary Lyon’s discovery of XCI in 19611, it was

recognized that the inherent embryonic stochasticity and permanence of XCI is the simplest explanation for the observed variability in XCI among adults and positions this adult variability

as a window into embryonic events23,24,25,26,27. For example, flipping 10 coins is more likely to result in 8 heads than flipping 100 coins is likely to result in 80 heads, meaning that the

variability in heads-to-tails ratios depends on the number of coins flipped. Similarly, the variability of XCI ratios in a population of female mammalian embryos is determined by the number

of cells at the time of XCI (Fig. 1A). Since each cell inherits its allelic-inactivation from its ancestor, measuring XCI variability in adults can approximate embryonic XCI variability and

help infer cell counts at the time of XCI or other early lineage decisions25,28 (Fig. 1D). Stochastic models have been used to estimate cell counts during embryonic events in human and mice

populations for decades20,23,25,27,28,29—but their applicability has not been tested in other mammalian species. In addition to stochasticity, genetic effects can influence the choice of

allelic inactivation and contribute to population variability in XCI ratios. Allelic inactivation during XCI is mediated by the cis-acting long non-coding RNA _XIST_30, which silences its

corresponding X-allele through epigenetic modifications31,32. Heterozygous variants affecting _XIST_ expression can bias allelic inactivation15. For example, inbred mice show preferential

inactivation of specific X-alleles depending on the parental strains and their corresponding X-chromosome controlling element (XCE) haplotypes18,20,33. In humans, genetic influence on XCI is

mostly observed in small family studies or disease cases, with no strong evidence for the broad allelic effects seen in mice21,22. Another genetic influence on XCI is allelic selection,

where disease-associated variants impart a selective effect across the two X-alleles that often results in extreme XCI skew14,16,34,35,36,37,38. However, the role of allelic selection

outside of a disease context and its relationship to population XCI variability remains to be thoroughly investigated across species. Thus, the relative contributions of stochasticity and

genetics to population XCI variability in mammals remain unclear with currently limited data from mouse and human studies. In this study, we assess population scale XCI variability and its

determinants across ten mammalian species. We source female annotated bulk RNA-sequencing samples from the Sequencing Read Archive (SRA), resulting in a total of 19,784 initial samples

derived from 562 individual studies (Fig. 1C), including human samples from the GTEx39 dataset. Our approach leverages natural genetic variation to sample X-linked heterozygosity and

eliminates the requirement for costly phased or strain-specific genetic information to assess XCI ratios across diverse mammals at population scale. We start by establishing the

population-level XCI ratio distributions for all ten mammalian species and use models of embryonic stochasticity to predict the number of cells fated for embryonic lineages (Figs. 1D and

2). We then investigate how broad genetic diversity, as indicated by measures of inbreeding (Fig. 3), as well as specific individual variants (Fig. 4), may impact population XCI variability.

Overall, our analyses explore how both models of stochasticity and genetic factors can explain population XCI variability across diverse mammalian species. RESULTS REFERENCE ALIGNED

RNA-SEQUENCING DATA ENABLES SCALABLE MODELING OF XCI RATIOS We use bulk RNA-sequencing (RNA-seq) data to measure the X-linked allelic expression of a sampled tissue by computing

allele-specific expression ratios of heterozygous single nucleotide polymorphisms (SNPs). The parental proportion of X-linked allelic reads are expected to follow a binomial distribution

dependent on the number of sampled reads and the XCI ratio of the tissue (see methods). The binomial distribution is an appropriate model when the parental identity of sequencing reads is

known, which is not the case when aligning to a reference genome. A reference genome will contain SNPs from both parents, making the parental identity of aligned reads ambiguous and

producing reference allelic expression ratios that represent expression of both parental X-alleles (Fig. 1B). Analogous to folding a book on its side closed, we fold the distribution of

reference allelic-expression ratios around 0.50 so that values an equal amount above and below 0.5 are in the same bin. This allows us to aggregate data across both alleles and enable a

robust estimate of the XCI ratio magnitude for the bulk RNA-seq sample (Fig. 1B). We fit folded-normal distributions to the reference allelic expression ratios of multiple SNPs per sample,

which serves as a continuous approximation of the underlying sequencing depth-dependent mixture of folded-binomial distributions per SNP. The mean of the fitted distribution is the estimate

of the XCI ratio for the sample (Fig. 1B). We also incorporate specific steps to address confounding factors that can impact measured X-linked allelic expression, namely excluding SNPs with

persistent reference bias across samples and chromosomal bins that exhibit probable escape from XCI40,41 (Supplementary Figs. 1, 2, see methods). Of note, the rat population exhibits a large

collection of reference biased SNPs when compared to the other species, likely due to the highly inbred nature of laboratory rat strains. We circumvent this expected issue in the mouse

population by leveraging two studies42,43 that sampled Diversity Outbred (DO)44 mice, evidenced by the lack of reference-biased SNPs in the mouse population compared to the other species.

Additionally, it is important to note our approach detects SNPs present only within RNA molecules, so we will miss variants in non-transcribed proximal regulatory elements, such as the

well-described XCE-interval in mice33. With regards to escape from XCI, we find the strongest signals of escape near chromosomal ends across all species (Supplementary Fig. 2), suggesting

escape within pseudo-autosomal regions is conserved across mammals40,45. Previously28, we validated our SNP filtering and XCI modeling approach using phased RNA-seq data (where haplotype

information is known for each variant) from the EN-TEx consortium46, achieving nearly perfect agreement in XCI ratio estimates for samples with folded XCI ratios of 0.60 or higher,

demonstrating the accuracy of our approach. By calling SNPs from RNA-seq reads and employing folded distributions to model reference-aligned allelic expression, we can estimate the magnitude

of XCI in any female mammalian bulk RNA-seq sample. We source female annotated bulk RNA-seq samples of 9 non-human mammalian species from the SRA database (Fig. 1C), additionally including

cross-tissue human samples from the GTEx dataset. As sex annotations were not available on SRA for the two DO mouse studies, we annotate the sex of the mouse samples by thresholding on the

total number of reads aligned to the Y-chromosome (Supplementary Fig. 3). After processing, the number of samples with a minimum of 10 well-powered SNPs for estimating XCI ratios are 130

macaca (mean of 28 SNPs ±17 SD), 275 horse (mean of 54 SNPs ±36 SD), 291 dog (mean of 29 SNPs ±13 SD), 369 rat (mean of 28 SNPs ±16 SD), 388 mouse (mean of 87 SNPs ±46 SD), 399 goat (mean of

34 SNPs ±14 SD), 654 pig (mean of 50 SNPs ±28 SD), 784 sheep (mean of 81 SNPs ±43 SD), 1364 cow (mean of 33 SNPs ±19 SD), and 4877 human (mean of 56 SNPs ±23 SD, 314 total individuals)

samples (Fig. 1C, Supplementary Fig. 1). Aggregating reference SNP allelic expression ratios for samples with similar estimated XCI ratios (0.05 bins) clearly reveals the expected haplotype

expression distributions, demonstrating the applicability of folded models (Supplementary Fig. 4). Following sample-level XCI ratio modeling, we then generate population-level distributions

by unfolding the distribution of folded XCI ratio estimates around 0.50, analogous to opening a closed book (Fig. 1D). As an additional control to ensure the allelic variability we report

from X-linked SNPs is specific to XCI, we estimate autosomal allelic imbalances for all samples using the same pipeline and approach as for the X-chromosome analysis (Supplementary Fig. 5,

see methods). Comparing allelic imbalances across the two autosomes closest in size to the X-chromosome reveals the vast majority of samples across all species are biallelically balanced for

autosomal expression, as expected (Supplementary Fig. 5). Several species (pig, cow, goat, rat, sheep, and dog) exhibit small subsets of samples that are consistently imbalanced across the

two autosomes and the X-chromosome, indicative of a global influence on allelic-expression independent of XCI (Supplementary Fig. 5). These samples with global allelic imbalances are

excluded from all downstream analysis, ensuring the population distributions of XCI ratios reflect variability specific to XCI. MODELS OF EMBRYONIC STOCHASTICITY EXPLAIN ADULT POPULATION XCI

VARIABILITY After generating population distributions of XCI ratios for the 10 mammalian species, we next explore how well models of embryonic stochasticity explain the observed adult XCI

ratio variability. The initial variability in XCI ratios among mammalian embryos is dependent on the number of cells present during XCI (Fig. 1A), where adult variability can be modeled to

infer embryonic cell counts. When estimating embryonic cell counts from XCI variability in adult tissues, it is important to note that adult tissues represent only the embryonic lineage of

the blastocyst, not the extra-embryonic lineages. This positions XCI variability of adult tissue samples as informative for the number of cells present within the last common lineage

decision for all adult cells, i.e. the number of cells present within the epiblast of the mammalian blastocyst. If XCI occurs after epiblast specification, XCI ratio variability is

determined by the number of epiblast cells at the time of XCI. If XCI occurs before epiblast specification, the variability is influenced by both the initial stochasticity of XCI and the

stochasticity of cell sampling during epiblast lineage specification. Without cross-tissue sampling of both extra-embryonic and embryonic tissues, the temporal ordering of XCI among these

lineage events cannot be resolved. Therefore, estimating cell counts based solely on XCI variability in adult tissues provides an estimate of the number of cells present in the epiblast at

the time of XCI. Figure 2A presents the unfolded population distributions of XCI ratios in the 10 mammalian species we sampled, ranging from the least variable (macaca) to most variable

(dog). We fit normal distributions as continuous approximations to the underlying binomial distribution that defines the relationship between cell counts and XCI ratio variability (Fig. 1A,

D, see methods). We focus on the tails of the distributions for our model fitting (colored in portions of the distributions, unfolded estimates ≤0.40 and ≥0.60, Fig. 2A), for two reasons.

Our analysis of autosomal allelic ratios (Supplementary Fig. 5) highlights that samples with no expected allelic imbalance produce folded skew estimates that vary between 0.5 and 0.6 and our

previous work28 using phased data indicated model misspecification around the point of folding (0.50). Fitting to the tails of the empirical distribution is therefore a more accurate

representation of variability specific to XCI. At a broad level, population XCI ratio variability varies substantially across the sampled mammalian species. Our estimates for the number of

epiblast cells present at the time of XCI include 65 (macaca), 31 (rat), 23 (pig), 16 (goat), 15 (horse), 14 (sheep), 14 (cow), 13 (human), 12 (mouse), and 8 (dog) cells, with associated 95%

confidence intervals presented in Fig. 2B. Importantly, species with similar numbers of detected SNPs per sample and total sample size exhibit variable cell number estimates, indicating it

is unlikely our estimates are driven by technical effects across the species (Supplementary Fig. 6). We additionally down sample all species to the smallest sample size present (130 macaca

samples) and achieve virtually identical cell number estimates, demonstrating variation in cell number estimates across species is not driven by sample size differences (Supplementary Fig.

6). The error between the empirical XCI ratio distributions and the normal fitted distributions is strikingly small, with a mean of 0.00588 sum-squared error (±0.00965 SD) across the species

(Supplementary Fig. 6). This shows models of embryonic stochasticity can explain observed XCI ratio variability in adult populations exceptionally well. For the least and most variable

species (macaca and dog), the estimated autosomal imbalances offer additional context for the reported XCI population variability. The reported X-linked variability in macaca is in excess to

the reported autosomal allelic variability, which itself is highly consistent across species (Supplementary Fig. 5). This demonstrates the X-linked population variability for macaca, while

strikingly small, still varies beyond the extremely consistent autosomal variability present across species and is specific to the X-chromosome, representing informative variability for

estimating cell counts. On the other hand, the dog population is the only one that contains samples with strong allelic imbalances on only one autosome, where autosomal imbalances in all

other species are global (Supplementary Fig. 5). This is suggestive of broader genomic incompatibilities within the dog population. The reported X-linked population variability in dog is

likely a combination of XCI and broader allelic incompatibilities, positioning our estimate of 8 cells as a likely underestimate due to excess variability outside of XCI. Modeling XCI ratio

variability across numerous species allows comparisons in light of evolution for determining generalizable or species-specific characteristics in XCI. Broadly, we demonstrate XCI ratios are

variable in each species we assess, revealing variability in XCI ratios itself as a conserved characteristic of XCI. The exact variance in XCI ratios varies across the species, with

differences in the timing of XCI and/or differences in cell counts for embryonic/extra-embryonic lineage specification as two putative explanations. We compare our estimated cell counts to

the evolutionary relationships among the species we assess (Fig. 2B), suggesting that variability in these early embryonic events can be recent evolutionary adaptations. This is highlighted

by the large differences in cell counts between macaca and humans, as well as between rats and mice. When viewed through the lens of cell divisions (log2 of the estimated cell counts, Fig.

2B), the differences in XCI ratio variability among the species can be explained by differences in a range of only 3 cell divisions, a narrow developmental window. This demonstrates even

slight changes in the timing of XCI or cell counts for embryonic/extra-embryonic lineage specification across mammalian species can produce large differences in population XCI ratio

variability, as explained through the inherent stochasticity of XCI. XCI RATIOS ARE NOT ASSOCIATED WITH X-LINKED HETEROZYGOSITY After determining stochastic models can explain population XCI

ratio variability across mammalian species, we turn to testing whether we can identify any genetic correlates with XCI ratios. Our approach leveraging natural genetic variation to quantify

XCI ratios enables us to assess a large catalog of genetic variants for associations with XCI ratios across mammalian species (10,735 macaca SNPs, 12,024 rat SNPs, 28,339 mouse SNPS, 23,603

pig SNPs, 16,123 goat SNPs, 10,281 horse SNPs, 53,505 sheep SNPs, 18,509 cow SNPs, 16,168 human SNPs, and 10,050 dog SNPs). One putative genetic contribution to XCI ratio variability is

allelic selection during development, where increased X-linked heterozygosity (i.e., genetic distance), is more likely to produce selective pressures between the two X-alleles. It follows

that samples with higher X-linked heterozygosity would be expected to exhibit more extreme XCI ratios. We score X-linked heterozygosity per sample as the ratio of the detected SNPs within a

sample to the number of unique SNPs identified across all samples, relative for each species (Fig. 3A). This quantification also serves as a measure of inbreeding, with decreased

heterozygosity associated with a higher degree of inbreeding47. The trend in heterozygosity across species is as expected, with rats (likely laboratory strains) as the most inbred (Fig. 3A).

Next, we examine the correlations between sample heterozygosity and the estimated XCI ratio, as well as the estimated allelic variability across SNPs in each sample (mean and standard

deviation of the fitted folded-normal distribution per sample, Fig. 3B). Across all species, X-linked heterozygosity shows a near-zero correlation with the estimated XCI ratio, indicating a

lack of association between X-linked genetic heterozygosity and XCI ratio variability (Fig. 3B). However, we observe moderate correlations between sample heterozygosity and the estimated

variability in SNP allelic ratios in three species: rat (corr: 0.576), macaca (corr: 0.459), and cow (corr: 0.364), notably the most inbred species (Fig. 3A, Supplementary Fig. 7). The

increased variability in allelic expression present only within the most inbred species could potentially reflect gene-specific regulatory events between parental haplotypes48 rather than a

direct genetic effect on XCI. LOW FREQUENCY VARIANTS EXHIBIT MODERATE ASSOCIATIONS WITH XCI RATIOS After investigating relationships between genetic variation and XCI ratios at a broad level

across the whole X-chromosome, we next asked if individual variants might be associated with extreme XCI ratios. Variants that affect the expression and/or function of the genetic elements

that control XCI can result in highly skewed XCI ratios, as documented in human studies15. This can also occur in other X-linked genes, if the resulting differential in gene activity exerts

a selective pressure across the X-alleles, as documented in disease cases14,16. We test the association between XCI ratios and individual variants for all variants detected in each species

with a minimum of 10 samples, quantified through the area-under-the-receiver-operating-curve statistic (AUROC). For each species, we rank the samples based on their estimated XCI ratio and

score the placement of samples carrying a given variant within the ranked list (Fig. 4A). If all the samples with that variant are at the top of the ranked list, the XCI ratio can be said to

have perfectly predicted the presence of that variant, quantified with an AUROC of exactly 1. An AUROC of 0.50 indicates the XCI ratio performs no better than random chance for predicting

the presence of the variant. The distribution of AUROCs for each species show striking similarities to a null comparison (Fig. 4B, see methods), indicating a pervasive lack of association

between XCI ratios and individual variants. However, a small subset of variants in each species exhibits moderate associations (AUROCs ≥0.75 and FDR-corrected _p_-value ≤0.05). By comparing

each variant’s AUROC with its frequency in the species, we find that the variants with moderate associations occur at low frequencies within the sampled populations (Fig. 4C, Supplementary

Fig. 8). We investigate whether this relationship is simply due to a lack in power with bootstrap simulations, demonstrating moderate AUROCs (≥0.75) are robust to their small sample sizes

(Supplementary Fig. 8). Figure 4D displays these variants along with their gene annotations for each species. Notably, we observe no statistically significant variant-XCI ratio associations

in the GTEx human population when performing either a tissue-specific or donor-specific analysis, as well as only considering the sample per donor with the highest sequencing depth

(Supplementary Fig. 9). While the GTEx dataset is comprised of thousands of tissue samples, only 314 female individuals are present in our final dataset. We test the effect of a small

population size by down sampling the cow data to 300 samples and scoring variant-XCI ratio associations (Supplementary Fig. 9). All of the cow variants that we originally identified as

significantly associated with XCI ratios are no longer detected in the down sampled data, in line with the observation that variants with associations to XCI ratios occur at low frequencies

within mammalian populations. Increased population sampling is likely required to identify further genetic associations with XCI ratios. Several genes with moderate AUROCs have prior

evidence for escaping XCI in humans49, bringing into question their associations with extreme XCI ratios in our analysis. To explore further, we compare the estimated XCI ratios of samples

to the allelic ratios of all detected variants for genes with at least one variant significantly associated with XCI ratios. We report several examples across species where the allelic

expression of individual variants from these putative escape genes does in fact exhibit the expected balanced biallelic expression of escape from XCI while also being enriched in samples

with increased XCI ratios (Supplementary Fig. 9). A gene that escapes XCI will be biallelically expressed; this suggests the variant-specific association we detect within these XCI escapers

likely reflects a haplotype-effect, where the variant is linked to a haplotype influencing XCI ratios, rather than an effect from the gene/variant itself. Further analysis with phased data

to assess potential haplotype effects may help identify genetic associations with XCI ratios. Overall, our assessments of chromosome-wide genetic variability and individual variants do not

reveal genetic associations robust enough to explain population XCI ratio variability across all 10 mammalian species. PUTATIVE MOUSE XCE-_XIST_-HAPLOTYPES EXHIBIT HIGHLY VARIABLE XCI RATIOS

One of the most well-documented instances of a genetic association with XCI ratios is the XCE-haplotypes in laboratory mouse strains, where a preferential ordering of allelic inactivation

exists across haplotypes33. The DO mice we utilize are expected to be genetically diverse combinations of various lab strains and it is highly likely a mix of XCE-haplotypes are present

within this population and may have an impact on XCI ratios. Such a haplotype-specific effect would be missed in our previous AUROC variant-specific analysis of XCI ratios. Since we only

sample variants present within RNA molecules and the XCE-interval is a proximal non-transcribed regulatory element of _Xist_33, we reason variants present within _Xist_ are likely linked to

XCE-haplotypes and may be informative for identifying putative XCE-_Xist_-haplotypes. We identify 4 putative XCE-_Xist_-haplotypes as determined by groups of samples with shared _Xist_

variants (Supplementary Fig. 10). As a general observation across haplotypes and the two studies we sample from, XCI ratios of samples with the same haplotype are highly variable

(Supplementary Fig. 10), suggesting XCI ratios are not definitively determined by _Xist_ genotypes within the DO mice population. The haplotype with the seemingly largest effect, evidenced

by 2 samples with highly skewed XCI ratios (0.799 and 0.782) in the one study that collected striatal tissue, conversely exhibits XCI ratios ranging from balanced to moderately skewed in the

second study, which collected pancreatic tissue (Supplementary Fig. 10). While this may be indicative of a tissue-specific effect of a particular XCE-_Xist_-haplotype, far greater sample

sizes with higher genetic resolution to confirm haplotypes are needed for validation. In general, the variability in XCI ratios within putative XCE-_Xist_-haplotypes suggests non-genetic

contributions to XCI variability of DO mice. DISCUSSION We modeled tissue XCI ratios from bulk RNA-seq samples across 10 mammalian species and revealed population-level variation in XCI

ratios that likely reflects differences in developmental events such as XCI timing or lineage specification. We showed that models of embryonic stochasticity fit the XCI data exceptionally

well and estimated epiblast cell counts at the time of XCI across species. We also searched for genetic factors influencing XCI ratios and found a pervasive lack of strong genetic

associations with XCI ratios, indicating that population XCI variability is better explained by the inherent stochastic nature of XCI rather than through genetic mechanisms. The lack of

cross-mammalian comparisons of population XCI variability has previously limited our understanding on the sources of XCI variability in mammals. The existence of XCE-haplotypes in laboratory

mice18,19,20,33 has supported the hypothesis that a similar genetic mechanism can exist in humans and drive population XCI variability21, though evidence for XCE-haplotypes in human

populations remains inconclusive22 and data from other mammalian species is historically absent. Although genetic influences on XCI, particularly variants affecting _XIST_15 or

disease-associated variants34,35,36,37, have been identified, they do not constitute a general mechanism that can fully account for observed population-level XCI variability across species.

In particular, allelic selection via genetic variability across the X-alleles has been put forth as an explanatory mechanism for XCI ratio variability14,16, but is almost exclusively studied

in a disease context and typically associated with extreme XCI ratios14,16,17,34,35,36,37,38, which is conflicting with the continuous population variability we report across species. Our

measures of X-linked heterozygosity have near-zero association with XCI ratios in all 10 mammals we assessed, a strong indication that genetic variability on the X-chromosome has little

influence on XCI ratios outside of disease. This is supported by the observation of depleted X-chromosome genetic variability via strong rates of purifying selection50,51,52,53, rendering

both parental alleles as largely equivalent at population scales. Our approach for extracting heterozygous variants from RNA-seq data28, while providing a sample of genetic variability, is

still able to assess hundreds of X-linked genes and chromosome-wide heterozygosity per species for associations with XCI and culminated in only weak evidence of limited genetic influence on

XCI ratios. In contrast, we demonstrated models of embryonic stochasticity can explain population XCI variability with exceedingly small amounts of error consistently across mammalian

species, providing a much more general explanation for population XCI variability. Besides X-linked disorders and _XIST_-variants, other factors that may affect XCI ratio variability are

genomic incompatibilities48 and stochastic allelic drift during development20 and/or aging as in the well-reported case of increased skewing of blood samples with age29,54. We found an

association between the variance in X-linked allelic expression and the degree of inbreeding for some species (Fig. 2B), as well as autosome-specific allelic imbalances in dog samples

(Supplementary Fig. 5). This implies that X-linked allelic expression variability may result from both the bulk tissue XCI ratio and the genomic incompatibilities between the parental

genomes48, depending on the species. We controlled for global allelic imbalances by excluding samples that showed consistent autosomal imbalances (Supplementary Fig. 5), which confirms the

allelic-expression variability on the X-chromosome is specific to XCI. Turning to allelic drift, developmental allelic drift may introduce XCI ratio variability beyond the initial random

choice of allelic inactivation20. While our previous cross-tissue analysis of XCI ratios in humans28 showed consistent XCI ratios across tissues, suggesting allelic drift is not a major

factor in XCI ratio variability, similar data for non-human mammals is missing. In general, we cannot account for tissue-specific or age-related effects in this dataset as these sample

annotations are almost universally absent for the SRA sourced data. These factors indicate that our epiblast cell count estimates are lower bound estimates for the number of cells needed to

produce the observed XCI ratio variability as purely derived from embryonic stochasticity. Our statistical modeling approach here would be greatly complemented by future experimental

validation of the timing and cell counts present during XCI across species. Regarding the timing of XCI and our epiblast cell count estimates, exploring known temporal variability in XCI

across species provides additional context. In mice, random XCI occurs within the epiblast soon after its specification and is readily identified by monoallelic expression of _XIST_55. In

macacas and humans, inactivation appears more continuous; data show progressive chromosome-wide silencing over several days, with a shift from biallelic to monoallelic _XIST_

expression56,57,58. This lengthy continuous inactivation obscures the exact timing of XCI in these species and highlights that XCI hallmarks in mice, namely rapid mono-allelic _XIST_

expression, are not readily applicable to other species; indeed, many species initially exhibit biallelic _XIST_ expression59. In context, our epiblast cell counts estimate the number of

cells involved in XCI, not the exact timing. For instance, our macaca and human cell counts indicate approximately four times as many cells are present within the epiblast at the time of XCI

in macacas compared to humans. This difference could result from delayed XCI or a greater number of cells fated for the epiblast in macacas compared to humans. In general, our cell count

estimates reflect population sizes at the time of XCI and we attribute variability in cell counts as most likely due to differences in XCI timing or lineage specification dynamics across

species. An important caveat to our analyses of genetic influences on XCI ratios is that we are limited to assessing variants present within RNA molecules, which are necessary for

quantifying allele-specific expression. Consequently, we likely miss many non-transcribed regulatory variants that may significantly influence XCI ratios. The XCE-interval in mice is one

example, a proximal regulatory element to _Xist_ known to influence XCI ratios in heterozygous lab crosses33. We identified putative XCE-haplotypes in the DO mice population using _Xist_

variants as proxies and showed XCI ratios are highly variable among samples sharing a haplotype, indicating other factors outside of _Xist_ genotypes/putative XCE-haplotypes influence XCI

ratios in DO mice. While this suggests the effects of XCE-haplotypes in DO mice are minor and more pronounced in inbred lab crosses, we cannot exclude the possibility of similar genetic

influences on XCI ratios in non-transcribed regions among our sampled species. Our approach trades comprehensive genetic screening for scalability, enabling the assessment of XCI ratios

across thousands of samples from different species. While we demonstrate models of embryonic stochasticity explain observed population XCI ratio variability far better than genetic

associations, we are limited in the type of genetic variability we can assess. Importantly, we do not discount the well-documented role of genetics in XCI ratios for rare and exceptional

cases (e.g., disease); instead, we advance embryonic stochasticity as the parsimonious explanation of XCI variability in normal populations. METHODS SNAKEMAKE PIPELINE FOR RNA-SEQ ALIGNMENT

AND VARIANT IDENTIFICATION All non-human mammalian fastq data was downloaded from the Sequencing Read Archive (SRA, https://www.ncbi.nlm.nih.gov/sra), where only samples annotated as female

were selected, using the metadata provided through SRA. We sourced Diversity Outbred mice data from two studies42,43 where sex annotations were not available on SRA and identified female

samples as those with less than 200 CPM counts aligned to the Y-chromosome (Supplementary Figure 3). Details for download and processing of the GTEx39 data can be found here28. The entire

sample processing pipeline uses a standard collection of bioinformatics software tools, all available for installation via Conda (STAR60 v2.7.9a, GATK61 v4.2.2.0, samtools62 v1.13,

igvtools63 v2.5.3, and sra-tools 2.11.0). All Snakemake workflow rules, environment setup procedure, analysis commands and options, and underlying libraries are available on Github at

https://github.com/gillislab/cross_mammal_xci, and https://github.com/gillislab/xskew. Briefly, a.fastq file acts as input, for either single- or pair-end sequencing experiments, and a.vcf

and.wig file are produced as outputs for subsequent compiling of allele-specific read counts in R v4.3.0. The R script used for combining the.vcf and.wig information is also made available

at https://github.com/gillislab/cross_mammal_xci/tree/main/R. Genome generation and alignment was performed with STAR, with the addition of the WASP64 algorithm for identifying and excluding

reference biased reads. We extract chromosome-specific alignments from the.bam file (X chromosome or specific autosomes) and use GATK tools to identify heterozygous SNPs from that

chromosome. The suite of GATK tools for identifying heterozygous variants from RNA-sequencing data was used following the GATK Best Practices recommendations. Specifically, the tools

utilized include AddOrReplaceReadGroups –> MarkDuplicates –> SplitNCigarReads –> HaplotypeCaller –> SelectVariants –> VariantFiltration. Reference genomes and gene annotations

(.gtf files) for each species were sourced from the NCBI Refseq database (https://www.ncbi.nlm.nih.gov/refseq/). In each case the latest assembly version path was used, and the genomic.fna

and genomic.gtf was downloaded. Annotated and indexed genomes were generated with STAR using –runMode genomeGenerate with default parameters. SNP FILTERING Only SNPs with exactly two

identified genotypes were included for analysis and indels were excluded. We required each SNP to have a minimum of 10 reads mapped to both alleles for a minimum read depth of 20 reads per

SNP. Gene annotations for all SNPs were extracted from the species-specific.gtf files. For XCI ratio modeling, we only used SNPs found within annotated genes. For any sample with multiple

SNPs identified in a gene, we took the SNP with the highest read count to be the max-powered representative of that gene, so each individual SNP is representative of a single gene. In

addition to implementing the WASP algorithm for excluding reference biased reads, we filter out SNPs within each species whose mean expression ratios across samples deviate strongly from

0.50 (mean allelic ratio <0.40 and >0.60, Supplementary Fig. 1). This SNP filtering also excludes potential eQTL effects that may impact allelic-expression outside of the underlying

XCI ratio. IDENTIFYING AND EXCLUDING CHROMOSOMAL REGIONS THAT ESCAPE XCI We reasoned robust escape from XCI would produce more balanced biallelic expression in samples with skewed XCI. We

performed an initial pass at XCI ratio modeling including all well-powered SNPs in a sample to identify samples with skewed XCI ratios (XCI ratios ≥0.70 for all species except rat and

macaca, where a threshold of 0.60 was used due to a reduced incidence of skewed XCI in these species). Using the subset of skewed samples for each species, we averaged the folded

allelic-expression ratios for all SNPs present in 1 mega-base (MB) bins across the X-chromosome (Supplementary Fig. 2). Chromosomal-bins that displayed balanced allelic expression in

opposition to the clearly skewed allelic expression of the rest of the chromosome were excluded from analysis. Specifically, chromosomal bins with an average allelic-expression <0.65 for

pig, goat, horse, sheep, mouse, and cow, <0.60 in rat and macaca, and <0.675 in dog were excluded (Supplementary Fig. 2) The ends of the X-chromosome in all species, except rat and

mouse, demonstrated strong balanced biallelic expression, indicative of escape within putative pseudo-autosomal regions. We excluded any bin within these putative pseudo-autosomal regions

regardless of average allelic expression. The escape threshold for dog was increased to exclude all bins within the dog putative pseudo-autosomal region. MODELING XCI RATIOS WITH THE

FOLDED-NORMAL DISTRIBUTION Starting with a single parental allele, the sampled maternal allelic-expression of a heterozygous X-linked SNP can be modeled with a binomial distribution,

dependent on the ratio of active maternal X-alleles in the sample and the read depth of the SNP. $$\frac{{X}_{{mat}}}{{n}_{{reads}}} \sim

\frac{{Bin}\left({n}_{{reads}},\,{p}_{{mat}}\right)}{{n}_{{reads}}}{;}\, E\left[\frac{{X}_{{mat}}}{{n}_{{reads}}}\right]={p}_{{mat}}{;} \,

{Var}\left(\frac{{X}_{{mat}}}{{n}_{{reads}}}\right)=\frac{{p}_{{mat}}(1-{p}_{{mat}})}{{n}_{{reads}}},$$ (1) where ${X}_{{mat}}$ is the number of maternal allelic reads, ${n}_{{reads}}$

is the read depth of the SNP, and ${p}_{{mat}}$ is the ratio of active maternal X-alleles. When aligned to a reference genome, the parental phasing information is lost and the

allelic-expression of X-linked SNPs can instead be modeled with the folded-binomial model65,66. Since SNPs vary in read-depth, we use a folded-normal model as an approximation of the

underlying mixture of sequencing depth-dependent folded-binomial distributions. The probability of allelic-expression under the folded-normal model is defined as: $$\Pr

\left({x}_{{ratio}}{;}\,{{{\rm{\mu }}}},{\sigma }^{2}\right)=\frac{1}{\sqrt{2\pi }\sigma }{e}^{-\frac{{({x}_{{ratio}}-\mu )}^{2}}{2{\sigma }^{2}}}+\frac{1}{\sqrt{2\pi }\sigma

}{e}^{-\frac{{({x}_{{ratio}}+\mu -1)}^{2}}{2{\sigma }^{2}}},{{{\rm{for}}}}\,\mu \in \left[0.50,1\right],$$ (2) where ${x}_{{ratio}}$ is the folded allelic-expression ratio of a SNP,

$\mu$ is the folded XCI ratio of the sample, and $\sigma$ is the standard deviation of the folded-normal distribution. We utilize a maximum-likelihood approach (negative log-likelihood

minimization of Eq. (2)) to fit folded-normal distributions to the observed folded allelic-expression ratios of at least 10 filtered SNPs per sample, taking the $\mu$ parameter of the

maximum-likelihood folded-normal distribution as the folded XCI ratio estimate of the sample. MODELING AUTOSOMAL IMBALANCES The folded-normal model can also be applied to autosomal data to

estimate allelic-imbalances. For each species, we extract chromosome-specific alignments from the.bam file for the two autosomes closest in size to the X-chromosome (Supplementary Fig. 5).

We employ the exact same processing pipeline and thresholds as used for the X-chromosome. Any sample that displayed an autosomal imbalance greater than or equal to a folded estimate of 0.60

(dotted lines in Supplementary Fig. 5A) on either autosome was excluded from downstream analysis. MODELING POPULATION XCI VARIABILITY WITH MODELS OF EMBRYONIC STOCHASTICITY XCI is a binomial

sampling event, where the number of cells choosing to inactivate the same X-allele follows a binomial distribution defined as: $$X \sim {Bin}\left({n}_{{cells}},{p}_{{inact}}\right),$$ (3)

where $X$ is the number of cells inactivating the same X-allele, ${n}_{{cells}}$ is the number of cells present at the time of XCI, and ${p}_{{inact}}$ is the probability of

inactivation (0.50). Embryonic XCI ratios can be modeled as: $$\frac{X}{{n}_{{cells}}} \sim \frac{{Bin}({n}_{{cells}},{p}_{{inact}})}{{n}_{{cells}}}$$ (4) We estimate ${n}_{{cells}}$ by

fitting normal distributions to the unfolded population XCI ratio distributions of each species, as a continuous approximation for the underlying binomial distribution. The variance of the

normal distribution is defined as:

$${{{\mathrm{var}}}}_{{normal}}={Var}\left(\frac{{Bin}\left({n}_{{cells}},{p}_{{inact}}\right)}{{n}_{{cells}}}\right)=\frac{{p}_{{inact}}(1-{p}_{{inact}})}{{n}_{{cells}}}=\frac{.5(1-.5)}{{n}_{{cells}}}$$

(5) We model population XCI ratios as: $$\frac{X}{{n}_{{cells}}} \sim {Norm}\left(\mu,\sqrt{{{{\mathrm{var}}}}_{{normal}}}\,\right),$$ (6) where $\mu$ = ${p}_{{inact}}$ = 0.50 and

${{{\mathrm{var}}}}_{{normal}}$ is computed for ${n}_{{cells}}\in [2,\,200]$. We identify the normal distribution with minimum sum-squared error between its CDF and the empirical

population XCI ratio CDF, minimizing error over the tails of the distributions with percentiles ≤0.40 or ≥0.60 (Supplementary Fig. 6). We compute 95% confidence intervals about the cell

number estimate ${n}_{{cells}}$ through bootstrap simulations. We sample with replacement from the empirical population XCI ratio distribution, matching the sample size of the original

empirical population distribution, and fit a normal model to derive a bootstrap estimate of ${n}_{{cells}}$. We repeat this for 2000 simulations to generate a bootstrapped distribution of

${n}_{{cells}}$, from which we derive the 95% confidence intervals, defined as the interval where 2.5% of the bootstrapped distribution lies outside either end. We down sample the

population XCI ratio distribution to 130 for each species to match the sample size of macaca, the species with the smallest sample size. We sample with replacement and then estimate cell

numbers as previously described, repeating for 2000 simulations. The mean cell number estimate and 95% confidence intervals for each down sampled species is reported in Supplementary Fig.

6D. MEASURING SAMPLE X-LINKED HETEROZYGOSITY We compute sample heterozygosity as the ratio of SNPs detected in a sample (20 read minimum) to the total number of unique SNPs identified across

all samples for a given species. We quantify associations between X-linked heterozygosity and XCI ratios as the spearman correlation coefficient between the sample X-linked heterozygosity

ratio and the fitted mean and variance of the maximum-likelihood folded-normal distribution of the sample (Fig. 3B, C, Supplementary Fig. 7). We only consider samples with at least 10

detected SNPs. QUANTIFYING VARIANT ASSOCIATIONS WITH EXTREME XCI RATIOS We quantify the strength of XCI ratios as a predictor for the presence of a given variant through the AUROC metric.

Given a ranked list of data (XCI ratios) and an indicator of true positives (samples with a given variant), the AUROC quantifies the probability a true positive is ranked above a true

negative. An AUROC of 1 indicates all true positive samples were ranked above all true negative samples, demonstrating XCI ratios were a perfect predictor for the presence of that variant.

An AUROC of 0.50 indicates random placement of true positives and negatives in the ranked list, demonstrating XCI ratios performed no better than random chance for predicting the presence of

that variant. We compute the AUROC through the Mann–Whitney U-test, defined as: $${AUROC}=\frac{U}{{n}_{{pos}}+\,{n}_{{neg}}},$$ (7) where $U$ is the Mann–Whitney U-test test statistic,

computed in R with wilcox.test(alternative = ‘two.sided’), ${n}_{{pos}}$ is the number of true positive samples and ${n}_{{neg}}$ is the number of true negative samples. We generate a

null AUROC per variant by randomly shuffling the true positive and negative labels. The variant frequency is defined as the number of samples that carry a given variant over the total number

of samples for a given species. The _p_-value for a given AUROC is the _p_-value associated with the Mann–Whitney U-test test statistic ($U$), where we determine significance as an

FDR-corrected _p_-value ≤0.05. We perform FDR correction for all _p_-values computed for all variants across the 10 species through the Benjamini–Hochberg method, implemented in R via

p.adjust(method =‘BH’). We estimate the power of each variant through bootstrap simulations. We randomly sample with replacement the XCI ratios of the true positive and true negative

samples, those that either carry or do not carry a given variant. We match the sample size of the original true positive and negative labels. We compute a bootstrapped AUROC and _p_-value

from the simulated data, repeating for 2000 simulations to compute a bootstrapped distribution of AUROCs. The AUROC power (Supplementary Fig. 8B) is defined as the fraction of bootstrapped

AUROCs that are significant, using a significance threshold of FDR-corrected _p_-value ≤0.05. The AUROC effect size power (Supplementary Fig. 8C) is defined as the fraction of bootstrapped

AUROCs that are ≥0.75. We also report the variance of the bootstrapped AUROC distribution per variant in Supplementary Fig. 8D. We exclude all variants classified as reference biased from

Supplementary Fig. 1, with the distributions of AUROCs for the reference biased and non-reference biased SNPs presented in Supplementary Fig. 8E. We assess variants for associations with XCI

ratios within the human data in several slightly different ways to accommodate the cross-tissue sampling structure of the GTEx data (Supplementary Fig. 9A). For the tissue-specific

analysis, we rank samples of a given tissue by their XCI ratios and score XCI ratio associations for all variants present within that tissue’s samples as previously described using our AUROC

metric. We only consider tissues with at least 50 donors. For the donor-specific analysis, we average the tissue XCI ratios of all samples for a given donor and then rank all donors by

their average XCI ratio. We also average the allelic-expression ratio of variants present in multiple tissue samples for a given donor and then score XCI ratio associations for all variants

as previously described. We additionally perform this experiment using the single tissue sample with the highest sequencing-depth per donor. We down sample the cow sample population to 300

and then score XCI ratio associations for variants as a comparison to the human sample population (314). Sampling is done without replacement 10 times and we compute the average AUROC per

variant across the 10 samples (Supplementary Fig. 9). PUTATIVE MOUSE XCE-_XIST_-HAPLOTYPES We hierarchically clustered mouse samples by their _Xist_ variants considering samples with at

least 20 detected _Xist_ variants using the ComplexHeatmap67 R package with the following function options: Heatmap(clustering_distance_columns = function(m dist(m, method = ‘binary’)),

clustering_method_columns = ‘ward.D2’, column_split = 4). This performs ward.D2 clustering using the Jaccard distance between samples and cuts the column dendrogram using cuttree() into four

clusters, which we chose to capture the clear sample groupings present within the data (Supplementary Fig. 10). SOFTWARE All analysis was performed in R68 v4.3.3. All plots were generated

using ggplot269 v3.4.2 functions. The phylogenetic tree in Fig. 2B was generated from TimeTree http://www.timetree.org/. REPORTING SUMMARY Further information on research design is available

in the Nature Portfolio Reporting Summary linked to this article. DATA AVAILABILITY The source data for all figure panels can be found at70

https://github.com/gillislab/cross_mammal_xci/tree/main/R/data_for_plots. Where applicable, exact _p_-values are provided in the source data files. The SRA accession numbers for all

non-human mammalian samples processed can be found at https://github.com/gillislab/cross_mammal_xci/blob/main/R/data_for_plots/all_keep_species_meta.Rdata. Details for accessing the GTEx

samples can be found here https://gtexportal.org/home/protectedDataAccess. CODE AVAILABILITY All associated code can be found at70 https://github.com/gillislab/cross_mammal_xci/tree/main/R.

Code for generating all figure panels using associated source data can be found at https://github.com/gillislab/cross_mammal_xci/blob/main/R/figure_plots_with_data_code.md. The snakemake

pipeline used for processing the non-human mammalian data can be found at https://github.com/gillislab/cross_mammal_xci/tree/main. https://doi.org/10.5281/zenodo.13774726. Details and code

for processing the human GTEx samples can be found here28. REFERENCES * Lyon, M. F. Gene Action in the X -chromosome of the Mouse (Mus musculus L.). _Nature_ 190, 372–373 (1961). Article

ADS PubMed CAS Google Scholar * Migeon, B. R. An overview of X inactivation based on species differences. _Semin. Cell Dev. Biol._ 56, 111–116 (2016). Article PubMed CAS Google

Scholar * Okamoto, I. et al. Eutherian mammals use diverse strategies to initiate X-chromosome inactivation during development. _Nature_ 472, 370–374 (2011). Article ADS PubMed CAS

Google Scholar * Ohno, S. _Sex Chromosomes and Sex Linked Genes_. (Springer Berlin, 1966). * Lyon, M. F. X-chromosome inactivation and developmental patterns in mammals. _Biol. Rev. Camb.

Philos. Soc._ 47, 1–35 (1972). Article PubMed CAS Google Scholar * van den Berg, I. M. et al. X Chromosome Inactivation Is Initiated in Human Preimplantation Embryos. _Am. J. Hum.

Genet._ 84, 771–779 (2009). Article PubMed PubMed Central Google Scholar * Evans, H. J., Ford, C. E., Lyon, M. F. & Gray, J. DNA Replication and Genetic Expression in Female Mice

with Morphologically Distinguishable X Chromosomes. _Nature_ 206, 900–903 (1965). Article ADS PubMed CAS Google Scholar * Wu, H. et al. Cellular resolution maps of X-chromosome

inactivation: implications for neural development, function, and disease. _Neuron_ 81, 103–119 (2014). Article PubMed PubMed Central CAS Google Scholar * Mutzel, V. et al. A symmetric

toggle switch explains the onset of random X inactivation in different mammals. _Nat. Struct. Mol. Biol._ 26, 350–360 (2019). Article PubMed PubMed Central CAS Google Scholar * Migeon,

B. _Females Are Mosaics: X Inactivation and Sex Differences in Disease_. _Females Are Mosaics_ (Oxford University Press, 2013). * Fang, H., Deng, X. & Disteche, C. M. X-factors in human

disease: impact of gene content and dosage regulation. _Hum. Mol. Genet._ 30, R285–R295 (2021). Article PubMed PubMed Central CAS Google Scholar * Amos-Landgraf, J. M. et al. X

Chromosome–Inactivation Patterns of 1005 Phenotypically Unaffected Females. _Am. J. Hum. Genet._ 79, 493–499 (2006). Article PubMed PubMed Central CAS Google Scholar * Shvetsova, E. et

al. Skewed X-inactivation is common in the general female population. _Eur. J. Hum. Genet._ 27, 455–465 (2019). Article PubMed CAS Google Scholar * Migeon, B. R. Non-random X chromosome

inactivation in mammalian cells. _Cytogenet. Cell Genet._ 80, 142–148 (1998). Article PubMed CAS Google Scholar * Plenge, R. M. et al. A promoter mutation in the XIST gene in two

unrelated families with skewed X-chromosome inactivation. _Nat. Genet._ 17, 353–356 (1997). Article PubMed CAS Google Scholar * Belmont, J. W. Genetic control of X inactivation and

processes leading to X-inactivation skewing. _Am. J. Hum. Genet._ 58, 1101–1108 (1996). PubMed PubMed Central CAS Google Scholar * Brown, C. & Robinson, W. The causes and

consequences of random and non-random X chromosome inactivation in humans: X chromosome inactivation in humans. _Clin. Genet._ 58, 353–363 (2000). Article PubMed CAS Google Scholar *

Cattanach, B. M. & Isaacson, J. H. Genetic control over the inactivation of autosomal genes attached to the X-chromosome. _Z. Vererbungsl_ 96, 313–323 (1965). PubMed CAS Google Scholar

* Simmler, M. C., Cattanach, B. M., Rasberry, C., Rougeulle, C. & Avner, P. Mapping the murine Xce locus with (CA)n repeats. _Mamm. Genome J. Int. Mamm. Genome Soc._ 4, 523–530 (1993).

Article CAS Google Scholar * Sun, K. Y. et al. Bayesian modeling of skewed X inactivation in genetically diverse mice identifies a novel Xce allele associated with copy number changes.

_Genetics_ 218, iyab034 (2021). Article PubMed PubMed Central Google Scholar * Peeters, S. B., Yang, C. & Brown, C. J. Have humans lost control: The elusive X-controlling element.

_Semin. Cell Dev. Biol._ 56, 71–77 (2016). Article PubMed CAS Google Scholar * Bolduc, V. et al. No evidence that skewing of X chromosome inactivation patterns is transmitted to

offspring in humans. https://www.jci.org/articles/view/33166/pdf (2008). * Gandini, E., Gartler, S. M., Angioni, G., Argiolas, N. & Dell’Acqua, G. Developmental implications of multiple

tissue studies in glucose-6-phosphate dehydrogenase-deficient heterozygotes. _Proc. Natl Acad. Sci._ 61, 945–948 (1968). Article ADS PubMed PubMed Central CAS Google Scholar * Gandini,

E. & Gartler, S. M. Glucose-6-phosphate Dehydrogenase Mosaicism for studying the Development of Blood Cell Precursors. _Nature_ 224, 599–600 (1969). Article ADS PubMed CAS Google

Scholar * Nesbitt, M. N. X chromosome inactivation mosaicism in the mouse. _Dev. Biol._ 26, 252–263 (1971). Article Google Scholar * Fialkow, P. J. Primordial cell pool size and lineage

relationships of five human cell types. _Ann. Hum. Genet._ 37, 39–48 (1973). Article PubMed CAS Google Scholar * McMahon, A., Fosten, M. & Monk, M. X-chromosome inactivation

mosaicism in the three germ layers and the germ line of the mouse embryo. _J. Embryol. Exp. Morphol._ 74, 207–220 (1983). PubMed CAS Google Scholar * Werner, J. M., Ballouz, S., Hover, J.

& Gillis, J. Variability of cross-tissue X-chromosome inactivation characterizes timing of human embryonic lineage specification events. _Dev. Cell_ 57, 1995–2008.e5 (2022). Article

PubMed PubMed Central CAS Google Scholar * Bittel, D. C. et al. Comparison of X-chromosome inactivation patterns in multiple tissues from human females. _J. Med. Genet._ 45, 309–313

(2008). Article PubMed CAS Google Scholar * Brown, C. J. et al. The human XIST gene: Analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized

within the nucleus. _Cell_ 71, 527–542 (1992). Article PubMed CAS Google Scholar * Dossin, F. & Heard, E. The Molecular and Nuclear Dynamics of X-Chromosome Inactivation. _Cold

Spring Harb. Perspect. Biol_. a040196, https://doi.org/10.1101/cshperspect.a040196 (2021). * Dixon-McDougall, T. & Brown, C. J. Multiple distinct domains of human XIST are required to

coordinate gene silencing and subsequent heterochromatin formation. _Epigenet. Chromatin_ 15, 6 (2022). Article CAS Google Scholar * Calaway, J. D. et al. Genetic Architecture of Skewed X

Inactivation in the Laboratory Mouse. _PLOS Genet._ 9, e1003853 (2013). Article PubMed PubMed Central Google Scholar * Migeon, B. R. Studies of skin fibroblasts from 10 families with

HGPRT deficiency, with reference in X-chromosomal inactivation. _Am. J. Hum. Genet._ 23, 199–210 (1971). PubMed PubMed Central CAS Google Scholar * Migeon, B. R. et al.

Adrenoleukodystrophy: evidence for X linkage, inactivation, and selection favoring the mutant allele in heterozygous cells. _Proc. Natl Acad. Sci. Usa._ 78, 5066–5070 (1981). Article ADS

PubMed PubMed Central CAS Google Scholar * Devriendt, K. et al. Skewed X-chromosome inactivation in female carriers of dyskeratosis congenita. _Am. J. Hum. Genet._ 60, 581–587 (1997).

PubMed PubMed Central CAS Google Scholar * Plenge, R. M., Stevenson, R. A., Lubs, H. A., Schwartz, C. E. & Willard, H. F. Skewed X-chromosome inactivation is a common feature of

X-linked mental retardation disorders. _Am. J. Hum. Genet._ 71, 168–173 (2002). Article PubMed PubMed Central CAS Google Scholar * Schmidt, M. & Du Sart, D. Functional disomies of

the X chromosome influence the cell selection and hence the X inactivation pattern in females with balanced X-autosome translocations: a review of 122 cases. _Am. J. Med. Genet._ 42, 161–169

(1992). Article PubMed CAS Google Scholar * Lonsdale, J. et al. The Genotype-Tissue Expression (GTEx) project. _Nat. Genet._ 45, 580–585 (2013). Article CAS Google Scholar * Bonora,

G. & Disteche, C. M. Structural aspects of the inactive X chromosome. _Philos. Trans. R. Soc. B Biol. Sci._ 372, 20160357 (2017). Article Google Scholar * Fang, H., Disteche, C. M.

& Berletch, J. B. X Inactivation and Escape: Epigenetic and Structural Features. _Front. Cell Dev. Biol._ 7, 219 (2019). Article PubMed PubMed Central Google Scholar * Philip, V. M.

et al. Gene expression genetics of the striatum of Diversity Outbred mice. _Sci. Data_ 10, 522 (2023). Article PubMed PubMed Central CAS Google Scholar * Keller, M. P. et al. Genetic

Drivers of Pancreatic Islet Function. _Genetics_ 209, 335–356 (2018). Article PubMed PubMed Central CAS Google Scholar * Churchill, G. A., Gatti, D. M., Munger, S. C. & Svenson, K.

L. The Diversity Outbred Mouse Population. _Mamm. Genome. J. Int. Mamm. Genome Soc._ 23, 713–718 (2012). Article Google Scholar * Posynick, B. J. & Brown, C. J. Escape From

X-Chromosome Inactivation: An Evolutionary Perspective. _Front. Cell Dev. Biol._ 7, 241 (2019). Article PubMed PubMed Central Google Scholar * Rozowsky, J. et al. The EN-TEx resource of

multi-tissue personal epigenomes & variant-impact models. _Cell_ 186, 1493–1511.e40 (2023). Article PubMed PubMed Central CAS Google Scholar * Miller, J. M. et al. Estimating

genome-wide heterozygosity: effects of demographic history and marker type. _Heredity_ 112, 240–247 (2014). Article PubMed CAS Google Scholar * Shorter, J. R. et al. Male Infertility Is

Responsible for Nearly Half of the Extinction Observed in the Mouse Collaborative Cross. _Genetics_ 206, 557–572 (2017). Article PubMed PubMed Central CAS Google Scholar * GTEx

Consortium et al. Landscape of X chromosome inactivation across human tissues. _Nature_ 550, 244–248 (2017). Article PubMed Central Google Scholar * Payseur, B. A., Cutter, A. D. &

Nachman, M. W. Searching for Evidence of Positive Selection in the Human Genome Using Patterns of Microsatellite Variability. _Mol. Biol. Evol._ 19, 1143–1153 (2002). Article PubMed CAS

Google Scholar * Avery, P. J. The population genetics of haplo-diploids and X-linked genes. _Genet. Res._ 44, 321–341 (1984). Article Google Scholar * Casto, A. M. et al. Characterization

of X-Linked SNP genotypic variation in globally distributed human populations. _Genome Biol._ 11, R10 (2010). Article PubMed PubMed Central Google Scholar * Veeramah, K. R., Gutenkunst,

R. N., Woerner, A. E., Watkins, J. C. & Hammer, M. F. Evidence for Increased Levels of Positive and Negative Selection on the X Chromosome versus Autosomes in Humans. _Mol. Biol. Evol._

31, 2267–2282 (2014). Article PubMed PubMed Central CAS Google Scholar * Hatakeyama, C. et al. The dynamics of X-inactivation skewing as women age. _Clin. Genet._ 66, 327–332 (2004).

Article PubMed CAS Google Scholar * Mak, W. et al. Reactivation of the Paternal X Chromosome in Early Mouse Embryos. _Science_ 303, 666–669 (2004). Article ADS PubMed CAS Google

Scholar * Petropoulos, S. et al. Single-Cell RNA-Seq Reveals Lineage and X Chromosome Dynamics in Human Preimplantation Embryos. _Cell_ 165, 1012–1026 (2016). Article PubMed PubMed

Central CAS Google Scholar * Moreira de Mello, J. C., Fernandes, G. R., Vibranovski, M. D. & Pereira, L. V. Early X chromosome inactivation during human preimplantation development

revealed by single-cell RNA-sequencing. _Sci. Rep._ 7, 10794 (2017). Article ADS PubMed PubMed Central Google Scholar * Patrat, C., Ouimette, J.-F. & Rougeulle, C. X chromosome

inactivation in human development. _Development_ 147, dev183095 (2020). Article PubMed CAS Google Scholar * Dupont, C. & Gribnau, J. Different flavors of X-chromosome inactivation in

mammals. _Curr. Opin. Cell Biol._ 25, 314–321 (2013). Article PubMed CAS Google Scholar * Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. _Bioinformatics_ 29, 15–21 (2013).

Article CAS Google Scholar * McKenna, A. et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. _Genome Res._ 20, 1297–1303 (2010).

Article PubMed Central CAS Google Scholar * Li, H. et al. The Sequence Alignment/Map format and SAMtools. _Bioinforma. Oxf. Engl._ 25, 2078–2079 (2009). Article Google Scholar *

Robinson, J. T. et al. Integrative genomics viewer. _Nat. Biotechnol._ 29, 24–26 (2011). Article PubMed Central CAS Google Scholar * van de Geijn, B., McVicker, G., Gilad, Y. &

Pritchard, J. K. WASP: allele-specific software for robust molecular quantitative trait locus discovery. _Nat. Methods_ 12, 1061–1063 (2015). Article PubMed PubMed Central Google Scholar

* Urbakh, V. Y. Statistical Testing of Differences in Causal Behaviour of Two Morphologically Indistinguishable Objects. _Biometrics_ 23, 137–143 (1967). Article PubMed CAS Google

Scholar * Gart, J. J. A Locally Most Powerful Test for the Symmetric Folded Binomial Distribution. _Biometrics_ 26, 129–138 (1970). Article MathSciNet PubMed CAS Google Scholar * Gu,

Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. _Bioinformatics_ 32, 2847–2849 (2016). Article CAS Google Scholar * R

Core Team. _R: A Language and Environment for Statistical Computing_ (R Foundation for Statistical Computing, 2023). * Wickham, H. _ggplot2: Elegant Graphics for Data Analysis_

(Springer-Verl, 2016). * Werner, J. M., Hover, J. & Gillis, J. Population variability in X-chromosome inactivation across 10 mammalian species.

Werner_Hover_Gillis_cross_species_XCI_2024. https://doi.org/10.5281/zenodo.13774726 (2024). Download references ACKNOWLEDGEMENTS J.G., J.M.W., and J.H. were supported by NIH grants

R01MH113005. We thank all members of the Gillis lab and particularly John Lee for assisting in some of the initial data downloading. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Stanley

Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA Jonathan M. Werner, John Hover & Jesse Gillis * Physiology Department and Donnelly

Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada Jonathan M. Werner & Jesse Gillis Authors * Jonathan M. Werner View author publications You can

also search for this author inPubMed Google Scholar * John Hover View author publications You can also search for this author inPubMed Google Scholar * Jesse Gillis View author publications

You can also search for this author inPubMed Google Scholar CONTRIBUTIONS J.G. conceived the project. J.M.W. and J.G. designed the experiments and wrote the manuscript. J.M.W. performed the

experiments. J.H. and J.M.W performed data management and data processing. CORRESPONDING AUTHOR Correspondence to Jesse Gillis. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no

competing interests. PEER REVIEW PEER REVIEW INFORMATION _Nature Communications_ thanks Samual Collombet and the other, anonymous, reviewer(s) for their contribution to the peer review of

this work. A peer review file is available. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional

affiliations. SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION TRANSPARENT PEER REVIEW FILE REPORTING SUMMARY RIGHTS AND PERMISSIONS OPEN ACCESS This article is licensed under a Creative

Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the

original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in

the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended

use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

http://creativecommons.org/licenses/by/4.0/. Reprints and permissions ABOUT THIS ARTICLE CITE THIS ARTICLE Werner, J.M., Hover, J. & Gillis, J. Population variability in X-chromosome

inactivation across 10 mammalian species. _Nat Commun_ 15, 8991 (2024). https://doi.org/10.1038/s41467-024-53449-1 Download citation * Received: 09 November 2023 * Accepted: 08 October 2024

* Published: 18 October 2024 * DOI: https://doi.org/10.1038/s41467-024-53449-1 SHARE THIS ARTICLE Anyone you share the following link with will be able to read this content: Get shareable

link Sorry, a shareable link is not currently available for this article. Copy to clipboard Provided by the Springer Nature SharedIt content-sharing initiative

Abc news – breaking news, latest news and videos

1 Eyewitness describes being on the scene of Boulder attack: ‘It is incredibly scary’ 4:32Playing 2 Sunny, one of the cl...

BP's Toll on Texas - The Texas Observer

FORGET THE GULF OIL SPILL FOR A MOMENT (if that’s possible). If you want to understand the depth of BP’s venality—and th...

With scotus decision on sb 1070 lawsuits, harrasment will continue

The U.S. Supreme Court struck down three provisions of Arizona’s controversial SB 1070 as unconstitutional Monday, but u...

Uk court orders gsk to pay astrazeneca royalties on total sales of zejula

UK court orders GSK to pay AstraZeneca royalties on total sales of Zejula | WTVB | 1590 AM · 95.5 FM | The Voice of Bran...

Assignment: maine | goranson farm, maine maple sunday

Special | 9m 14s Goranson Farm opens their doors on Maine Maple Sunday. Goranson Farm opens their doors on Maine Maple S...

When points of order fail, democrats turn to amendments to fight voter id bill

After six failed points of order, the House Democrats decided to take a different approach to the debate over the voter ...