
Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics

ABSTRACT Many experimental and bioinformatics approaches have been developed to characterize the human T cell receptor (TCR) repertoire. However, the unknown functional relevance of TCR
profiling hinders unbiased interpretation of the biology of T cells. To address this inadequacy, we developed tessa, a tool to integrate TCRs with gene expression of T cells to estimate the
effect that TCRs confer on the phenotypes of T cells. Tessa leveraged techniques combining single-cell RNA-sequencing with TCR sequencing. We validated tessa and showed its superiority over
existing approaches that investigate only the TCR sequences. With tessa, we demonstrated that TCR similarity constrains the phenotypes of T cells to be similar and dictates a gradient in
antigen targeting efficiency of T cell clonotypes with convergent TCRs. We showed this constraint could predict a functional dichotomization of T cells postimmunotherapy treatment and is
weakened in tumor contexts.

DATA AVAILABILITY The bulk RNA-seq datasets used for deriving
TCRs and then for the auto-encoder training are publicly available at https://gdc.cancer.gov/about-data/publications/panimmune (TCGA; ref. 23), https://www.iedb.org/database_export_v3.php (IEDB)
and http://friedmanlab.weizmann.ac.il/McPAS-TCR/ (McPAS; ref. 25). We made the Kidney-bulkRNA (ref. 24) dataset available in CSV format at
https://github.com/jcao89757/TESSA/tree/master/Tessa_released_data. All scRNA-seq/TCR-seq datasets are publicly available. The NSCLC-1 and healthy PBMC-1 datasets are available on the 10x
Genomics website https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0. The healthy-CD8 1–4 datasets are available on
https://www.10xgenomics.com/resources/application-notes/a-new-way-of-exploring-immunity-linking-highly-multiplexed-antigen-recognition-to-immune-repertoire-and-phenotype/. The healthy PBMC-2
dataset is also available on the 10x Genomics website https://support.10xgenomics.com/single-cell-vdj/datasets/3.0.0. The NSCLC-2 (ref. 26), CRC (ref. 27) and HCC (ref. 28) datasets were downloaded from the
European Genome-Phenome Archive (EGA) under accession numbers EGAS00001002430, EGAS00001002791 and EGAS00001002072, respectively. The Breast-1–5 (ref. 29) datasets are available on the Gene
Expression Omnibus (GEO) under accession numbers GSE114727 and GSE114724. The Melanoma (ref. 30), BCC (ref. 31) and ECCITE-Seq (ref. 16) datasets are also on the GEO database under study numbers GSE123139,
GSE113590 and GSE126310. The Glanville (ref. 10) dataset was downloaded from https://doi.org/10.1038/nature22976. The Dash (ref. 11) dataset is available in the National Center for Biotechnology Information
Sequence Read Archive under accession number SRP101659. The details of the data used, including sample size, role in the analysis and references, are shown in Supplementary Table 1. All
scRNA-seq data were involved in Fig. 2 (directly or indirectly mentioned), the BCC scRNA-seq data were used in Fig. 3 and all scRNA-seq data were used in Fig. 4. Source data are provided
with this paper.

CODE AVAILABILITY The tessa model is available at https://github.com/jcao89757/tessa (https://doi.org/10.5281/zenodo.4161819; ref. 46). The SCINA model is available at
https://github.com/jcao89757/SCINA (https://doi.org/10.3390/genes10070531; ref. 45).

REFERENCES

1. Oettinger, M. A. V(D)J recombination: on the cutting edge. _Curr. Opin. Cell Biol._ 11, 325–329 (1999).
2. Jung, D. & Alt, F. W. Unraveling V(D)J recombination: insights into gene regulation. _Cell_ 116, 299–311 (2004).
3. Kappler, J. et al. The major histocompatibility complex-restricted antigen receptor on T cells in mouse and man: identification of constant and variable peptides. _Cell_ 35, 295–302 (1983).
4. Haskins, K. et al. The major histocompatibility complex-restricted antigen receptor on T cells. I. Isolation with a monoclonal antibody. _J. Exp. Med._ 157, 1149–1169 (1983).
5. Staveley-O’Carroll, K. et al. Induction of antigen-specific T cell anergy: an early event in the course of tumor progression. _Proc. Natl Acad. Sci. USA_ 95, 1178–1183 (1998).
6. Skapenko, A., Leipe, J., Lipsky, P. E. & Schulze-Koops, H. The role of the T cell in autoimmune inflammation. _Arthritis Res. Ther._ 7, S4–S14 (2005).
7. Stubbington, M. J. T. et al. T cell fate and clonality inference from single-cell transcriptomes. _Nat. Methods_ 13, 329–332 (2016).
8. Bolotin, D. A. et al. Antigen receptor repertoire profiling from RNA-seq data. _Nat. Biotechnol._ 35, 908–911 (2017).
9. Eltahla, A. A. et al. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. _Immunol. Cell Biol._ 94, 604–611 (2016).
10. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. _Nature_ 547, 94–98 (2017).
11. Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. _Nature_ 547, 89–93 (2017).
12. Tubo, N. J. et al. Single naive CD4+ T cells from a diverse repertoire produce different effector cell types during infection. _Cell_ 153, 785–796 (2013).
13. Buchholz, V. R. et al. Disparate individual fates compose robust CD8+ T cell immunity. _Science_ 340, 630–635 (2013).
14. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. _Nat. Protoc._ 9, 171–181 (2014).
15. Sheng, K., Cao, W., Niu, Y., Deng, Q. & Zong, C. Effective detection of variation in single-cell transcriptomes using MATQ-seq. _Nat. Methods_ 14, 267–270 (2017).
16. Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. _Nat. Methods_ 16, 409–412 (2019).
17. Atchley, W. R., Zhao, J., Fernandes, A. D. & Drüke, T. Solving the protein sequence metric problem. _Proc. Natl Acad. Sci. USA_ 102, 6395–6400 (2005).
18. Ballard, D. Modular learning in neural networks. In _Proc. Sixth National Conference on Artificial Intelligence_ Vol. 1, 279–284 (ACM, 1987).
19. Ostmeyer, J. et al. Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis. _BMC Bioinf._ 18, 401 (2017).
20. Ostmeyer, J., Christley, S., Toby, I. T. & Cowell, L. G. Biophysicochemical motifs in T-cell receptor sequences distinguish repertoires from tumor-infiltrating lymphocyte and adjacent healthy tissue. _Cancer Res._ 79, 1671–1680 (2019).
21. Thomas, N. et al. Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence. _Bioinformatics_ 30, 3181–3188 (2014).
22. Zhang, A. W. et al. Interfaces of malignant and immunologic clonal dynamics in ovarian cancer. _Cell_ 173, 1755–1769.e22 (2018).
23. Thorsson, V. et al. The immune landscape of cancer. _Immunity_ 48, 812–830.e14 (2018).
24. Wang, T. et al. An empirical approach leveraging tumorgrafts to dissect the tumor microenvironment in renal cell carcinoma identifies missing link to prognostic inflammatory factors. _Cancer Discov._ 8, 1142–1155 (2018).
25. Tickotsky, N., Sagiv, T., Prilusky, J., Shifrut, E. & Friedman, N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. _Bioinformatics_ 33, 2924–2929 (2017).
26. Guo, X. et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. _Nat. Med._ 24, 978–985 (2018).
27. Zhang, L. et al. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. _Nature_ 564, 268–272 (2018).
28. Zheng, C. et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. _Cell_ 169, 1342–1356.e16 (2017).
29. Azizi, E. et al. Single-cell map of diverse immune phenotypes in the breast tumor microenvironment. _Cell_ 174, 1293–1308.e36 (2018).
30. Li, H. et al. Dysfunctional CD8 T cells form a proliferative, dynamically regulated compartment within human melanoma. _Cell_ 176, 775–789.e18 (2019).
31. Yost, K. E. et al. Clonal replacement of tumor-specific T cells following PD-1 blockade. _Nat. Med._ 25, 1251–1259 (2019).
32. Eduati, F. et al. Prediction of human population responses to toxic compounds by a collaborative competition. _Nat. Biotechnol._ 33, 933–940 (2015).
33. Bansal, M. et al. A community computational challenge to predict the activity of pairs of compounds. _Nat. Biotechnol._ 32, 1213–1222 (2014).
34. Costello, J. C. & Stolovitzky, G. Seeking the wisdom of crowds through challenge-based competitions in biomedical research. _Clin. Pharmacol. Ther._ 93, 396–398 (2013).
35. Waugh, K. A. et al. Molecular profile of tumor-specific CD8+ T cell hypofunction in a transplantable murine cancer model. _J. Immunol._ 197, 1477–1488 (2016).
36. Wu, A. A., Drake, V., Huang, H.-S., Chiu, S. & Zheng, L. Reprogramming the tumor microenvironment: tumor-induced immunosuppressive factors paralyze T cells. _Oncoimmunology_ 4, e1016700 (2015).
37. Burkholder, B. et al. Tumor-induced perturbations of cytokines and immune cell networks. _Biochim. Biophys. Acta_ 1845, 182–201 (2014).
38. Conley, J. M., Gallagher, M. P. & Berg, L. J. T cells and gene regulation: the switching on and turning up of genes after T cell receptor stimulation in CD8 T cells. _Front. Immunol._ https://doi.org/10.3389/fimmu.2016.00076 (2016).
39. Cho, J.-H. et al. Unique features of naive CD8+ T cell activation by IL-2. _J. Immunol._ 191, 5559–5573 (2013).
40. Iezzi, G., Karjalainen, K. & Lanzavecchia, A. The duration of antigenic stimulation determines the fate of naive and effector T cells. _Immunity_ 8, 89–95 (1998).
41. Moskophidis, D., Lechner, F., Pircher, H. & Zinkernagel, R. M. Virus persistence in acutely infected immunocompetent mice by exhaustion of antiviral cytotoxic effector T cells. _Nature_ 362, 758–761 (1993).
42. Kalergis, A. M. et al. Efficient T cell activation requires an optimal dwell-time of interaction between the TCR and the pMHC complex. _Nat. Immunol._ 2, 229–234 (2001).
43. Corse, E., Gottschalk, R. A., Krogsgaard, M. & Allison, J. P. Attenuated T cell responses to a high-potency ligand in vivo. _PLoS Biol._ https://doi.org/10.1371/journal.pbio.1000481 (2010).
44. Mikolov, T., Chen, K., Corrado, G. S. & Dean, J. Efficient estimation of word representations in vector space. _CoRR_ abs/1301.3781 (2013).
45. Zhang, Z. et al. SCINA: a semi-supervised subtyping algorithm of single cells and bulk samples. _Genes_ https://doi.org/10.3390/genes10070531 (2019).
46. Zhang, Z. jcao89757/TESSA: mapping the functional landscape of T cell receptor repertoire by single T cell transcriptomics. _Zenodo_ https://doi.org/10.5281/zenodo.4161819 (2020).

ACKNOWLEDGEMENTS We thank L.H.R. Xu for his valuable input on the manuscript writing. This
study was supported by the National Institutes of Health (NIH) (grant nos. CCSG 5P30CA142543 to T.W. and R15GM131390 to X.W.) and Cancer Prevention Research Institute of Texas (grant no.
CPRIT RP190208 to T.W.).

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS
* Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA: Ze Zhang, Hongyu Liu & Tao Wang
* Department of Statistical Science, Southern Methodist University, Dallas, TX, USA: Danyi Xiong & Xinlei Wang
* Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, USA: Tao Wang

CONTRIBUTIONS Z.Z. contributed to the computational analyses and manuscript writing. D.X. and X.W. contributed to the design and write-up of the statistical methodologies. H.L. provided valuable suggestions on the direction of the project, and contributed to manuscript writing. T.W. contributed to the overall supervision of the project, study design and manuscript writing.

CORRESPONDING AUTHOR Correspondence to Tao Wang.

ETHICS DECLARATIONS

COMPETING INTERESTS The authors declare no competing interests.

ADDITIONAL INFORMATION

PEER REVIEW INFORMATION Madhura Mukhopadhyay was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the
editorial team. PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

EXTENDED DATA

EXTENDED DATA FIG. 1 DETAILS OF THE STACKED AUTO-ENCODER FOR TCR EMBEDDING. A, The structure of the auto-encoder, with the configuration of each layer shown. B, Typical examples of TCR CDR3β sequences, heatmaps of the initially embedded ‘Atchley’ matrices of TCRs, and heatmaps of the auto-encoder-reconstructed ‘Atchley’ matrices. The TCR sequence examples were not used in the training step of the auto-encoder. C, Scatterplots showing the consistency between the ‘Atchley factor’ values of the original and reconstructed TCRs. Green points represent tiles in the heatmaps in B.

EXTENDED DATA FIG. 2 SCATTERPLOTS SHOWING THE RELATIONSHIPS BETWEEN THE DISTANCES OF TCRS AND THE DISTANCES OF RNA EXPRESSION LEVELS FOR SEVERAL MORE DATASETS. Both distances are calculated in a pairwise manner between all the T cell clonotypes of each dataset. Four example datasets are shown: Healthy-CD8-3 (A), Healthy-CD8-4 (B), Breast-1 (C) and Breast-2 (D) (Supplementary Table 1). The P values indicate the significance of the Pearson correlation coefficients. The shaded areas denote the 95% confidence intervals for linear regressions.

EXTENDED DATA FIG. 3 THE WEIGHTS OF THE TCR EMBEDDINGS LEARNED FROM TESSA. The X axis shows the digits of the 30-dimensional embeddings, and the Y axis shows the weights learned for all datasets. Each bar represents one digit of the weights and shows the values of that digit obtained from all 19 scRNA-seq datasets in Supplementary Table 1.

EXTENDED DATA FIG. 4 BENCHMARKING RESULTS USING GLIPH. A, Clustering rates of the four Healthy-CD8 datasets from 10x Genomics, the Glanville dataset and the Dash dataset under different global convergence distance cutoff (‘_gccutoff_’) values (Supplementary Table 1). The dashed lines represent the tessa clustering rates of the corresponding datasets. B, Clustering purities of GLIPH when ‘_gccutoff_’ equals 3. This cutoff value was selected so that the GLIPH clusters achieved clustering rates most similar to those of the tessa networks. The clustering purities were calculated with the same method as in Fig. 2. C, D, The GLIPH network purities (C) and numbers of networks (D) with different ‘_gccutoff_’ values, compared with the tessa network purities and numbers of networks.

EXTENDED DATA FIG. 5 CLUSTERING OF TCR CLONOTYPES INFORMED BY TESSA IS REFLECTIVE OF ANTIGEN BINDING SPECIFICITY. The antigen-binding specificities of 207 human TCRβ chains from 704 T cells were profiled against two epitopes in the Dash dataset, and those of 276 TCRs from 415 T cells against three epitopes in the Glanville dataset. A, B, t-SNE plots showing the TCR clonotypes in the space of the TCR embeddings, with the embeddings adjusted by the tessa-inferred weights. The hierarchical clustering tree cutoff used in the two plots is marked with green dashed lines in C–F. Each point in the plots represents one TCR clonotype, and the size of the point reflects the clone size. Points are colored by the true antigens that the corresponding TCRs target according to the original report. Points are connected if they are clustered into the same network based on hierarchical clustering of the TCR embeddings. T cell clones with only one cell were deemed low-confidence and were excluded from visualization, as were unclustered clones; this exclusion does not affect the calculation of the purities. C, D, The numbers of TCR networks and the clustering rates with different hierarchical tree cutoffs in the Dash dataset (C) and in the Glanville dataset (D). Clustering rates were calculated as the number of TCR clonotypes that are clustered with at least one other TCR clonotype, divided by the total number of TCR clonotypes. E, F, The network purities and P values testing the significance of the purities with different hierarchical tree cutoffs in the Dash dataset (E) and the Glanville dataset (F). The network purity and P value calculations are described in the Methods section.

EXTENDED DATA FIG. 6 T CELL PATHWAY ACTIVITY SCORES OF THE DIFFERENT T CELL SUBSETS IN THE BCC DATASET. The naive and activated pathways are shown, to be compared against the inhibition, memory and exhausted pathways shown in Fig. 3. The T cell subsets are the same as those in Fig. 3e–g.

EXTENDED DATA FIG. 7 PSEUDOTIME ANALYSIS OF THE DIFFERENT T CELL SUBSETS IN THE BCC DATASET. The T cell subsets are the same as those in Fig. 3e–g.

EXTENDED DATA FIG. 8 A CARTOON SKETCH SHOWING HOW THE UNEXPLAINED VARIANCE IN GENE EXPRESSION OF THE TCR NETWORKS WAS DETERMINED. Details are described in the Methods section.

SUPPLEMENTARY INFORMATION Supplementary Tables 1 and 2 and Notes 1 and 2.
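
The clustering rate defined in the Extended Data Fig. 5 caption (the number of TCR clonotypes clustered with at least one other clonotype, divided by the total number of clonotypes) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not code from the tessa repository: the function name `clustering_rate`, the dictionary input format, and the toy clonotype and network labels are all hypothetical.

```python
from collections import Counter
from typing import Hashable, Mapping, Optional


def clustering_rate(network_of: Mapping[str, Optional[Hashable]]) -> float:
    """Fraction of clonotypes that share a network with at least one other clonotype.

    `network_of` (hypothetical input format) maps each clonotype ID to its
    network label, or to None when the clonotype was not assigned to a network.
    """
    if not network_of:
        raise ValueError("need at least one clonotype")
    # Count how many clonotypes fall into each network.
    sizes = Counter(label for label in network_of.values() if label is not None)
    # A clonotype counts as clustered only if its network has two or more members.
    clustered = sum(
        1
        for label in network_of.values()
        if label is not None and sizes[label] >= 2
    )
    return clustered / len(network_of)


# Toy example (made-up labels): three clonotypes share network "n1",
# one sits alone in "n2", and one is unassigned.
toy_networks = {"c1": "n1", "c2": "n1", "c3": "n2", "c4": None, "c5": "n1"}
toy_rate = clustering_rate(toy_networks)
print(f"clustering rate = {toy_rate:.2f}")
```

In the toy input, three of the five clonotypes are clustered with at least one other clonotype, so the rate is 3/5. A singleton network is treated the same as an unassigned clonotype, which follows the "at least one other clonotype" wording of the caption.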
ABOUT THIS ARTICLE

CITE THIS ARTICLE Zhang, Z., Xiong, D., Wang, X. _et al._ Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics. _Nat Methods_ 18, 92–99 (2021). https://doi.org/10.1038/s41592-020-01020-3

* Received: 08 April 2020
* Accepted: 12 November 2020
* Published: 06 January 2021
* Issue Date: January 2021
* DOI: https://doi.org/10.1038/s41592-020-01020-3