Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics



ABSTRACT Many experimental and bioinformatics approaches have been developed to characterize the human T cell receptor (TCR) repertoire. However, the unknown functional relevance of TCR


profiling hinders unbiased interpretation of the biology of T cells. To address this inadequacy, we developed tessa, a tool to integrate TCRs with gene expression of T cells to estimate the


effect that TCRs confer on the phenotypes of T cells. Tessa leveraged techniques combining single-cell RNA-sequencing with TCR sequencing. We validated tessa and showed its superiority over


existing approaches that investigate only the TCR sequences. With tessa, we demonstrated that TCR similarity constrains the phenotypes of T cells to be similar and dictates a gradient in


antigen targeting efficiency of T cell clonotypes with convergent TCRs. We showed this constraint could predict a functional dichotomization of T cells postimmunotherapy treatment and is


weakened in tumor contexts.

DATA AVAILABILITY The bulk RNA-seq datasets used for deriving


TCRs and then for the auto-encoder training are publicly available at https://gdc.cancer.gov/about-data/publications/panimmune (TCGA; ref. 23), https://www.iedb.org/database_export_v3.php (IEDB) and http://friedmanlab.weizmann.ac.il/McPAS-TCR/ (McPAS; ref. 25). We made the Kidney-bulkRNA (ref. 24) dataset available in csv format at https://github.com/jcao89757/TESSA/tree/master/Tessa_released_data. All scRNA-seq/TCR-seq datasets are publicly available. The NSCLC-1 and healthy PBMC-1 datasets are available on the 10x Genomics website https://support.10xgenomics.com/single-cell-vdj/datasets/2.2.0. The healthy-CD8 1–4 datasets are available on https://www.10xgenomics.com/resources/application-notes/a-new-way-of-exploring-immunity-linking-highly-multiplexed-antigen-recognition-to-immune-repertoire-and-phenotype/. The healthy PBMC-2 dataset is also available on the 10x Genomics website https://support.10xgenomics.com/single-cell-vdj/datasets/3.0.0. The NSCLC-2 (ref. 26), CRC (ref. 27) and HCC (ref. 28) datasets were downloaded from the European Genome-Phenome Archive (EGA) under accession numbers EGAS00001002430, EGAS00001002791 and EGAS00001002072, respectively. The Breast-1–5 (ref. 29) datasets are available on the Gene Expression Omnibus (GEO) under accession numbers GSE114727 and GSE114724. The Melanoma (ref. 30), BCC (ref. 31) and ECCITE-Seq (ref. 16) datasets are also on the GEO database under study numbers GSE123139, GSE113590 and GSE126310. The Glanville (ref. 10) dataset was downloaded from https://doi.org/10.1038/nature22976. The Dash (ref. 11) dataset is available in the National Center for Biotechnology Information Sequence Read Archive under accession number SRP101659. The details of the data used, including sample size, role in the analysis and references, are shown in Supplementary Table 1. All scRNA-seq data were involved in Fig. 2 (directly or indirectly mentioned), the BCC scRNA-seq data were used in Fig. 3 and all scRNA-seq data were used in Fig. 4. Source data are provided with this paper.

CODE AVAILABILITY The tessa model is available at https://github.com/jcao89757/tessa (https://doi.org/10.5281/zenodo.4161819; ref. 46). The SCINA model is available at


https://github.com/jcao89757/SCINA (https://doi.org/10.3390/genes10070531; ref. 45).

REFERENCES

1. Oettinger, M. A. V(D)J recombination: on the cutting edge. _Curr. Opin. Cell Biol._ 11, 325–329 (1999).
2. Jung, D. & Alt, F. W. Unraveling V(D)J recombination: insights into gene regulation. _Cell_ 116, 299–311 (2004).
3. Kappler, J. et al. The major histocompatibility complex-restricted antigen receptor on T cells in mouse and man: identification of constant and variable peptides. _Cell_ 35, 295–302 (1983).
4. Haskins, K. et al. The major histocompatibility complex-restricted antigen receptor on T cells. I. Isolation with a monoclonal antibody. _J. Exp. Med._ 157, 1149–1169 (1983).
5. Staveley-O’Carroll, K. et al. Induction of antigen-specific T cell anergy: an early event in the course of tumor progression. _Proc. Natl Acad. Sci. USA_ 95, 1178–1183 (1998).
6. Skapenko, A., Leipe, J., Lipsky, P. E. & Schulze-Koops, H. The role of the T cell in autoimmune inflammation. _Arthritis Res. Ther._ 7, S4–S14 (2005).
7. Stubbington, M. J. T. et al. T cell fate and clonality inference from single-cell transcriptomes. _Nat. Methods_ 13, 329–332 (2016).
8. Bolotin, D. A. et al. Antigen receptor repertoire profiling from RNA-seq data. _Nat. Biotechnol._ 35, 908–911 (2017).
9. Eltahla, A. A. et al. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. _Immunol. Cell Biol._ 94, 604–611 (2016).
10. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. _Nature_ 547, 94–98 (2017).
11. Dash, P. et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. _Nature_ 547, 89–93 (2017).
12. Tubo, N. J. et al. Single naive CD4+ T cells from a diverse repertoire produce different effector cell types during infection. _Cell_ 153, 785–796 (2013).
13. Buchholz, V. R. et al. Disparate individual fates compose robust CD8+ T cell immunity. _Science_ 340, 630–635 (2013).
14. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. _Nat. Protoc._ 9, 171–181 (2014).
15. Sheng, K., Cao, W., Niu, Y., Deng, Q. & Zong, C. Effective detection of variation in single-cell transcriptomes using MATQ-seq. _Nat. Methods_ 14, 267–270 (2017).
16. Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. _Nat. Methods_ 16, 409–412 (2019).
17. Atchley, W. R., Zhao, J., Fernandes, A. D. & Drüke, T. Solving the protein sequence metric problem. _Proc. Natl Acad. Sci. USA_ 102, 6395–6400 (2005).
18. Ballard, D. Modular learning in neural networks. In _Proc. Sixth National Conference on Artificial Intelligence_ Vol. 1, 279–284 (ACM, 1987).
19. Ostmeyer, J. et al. Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis. _BMC Bioinf._ 18, 401 (2017).
20. Ostmeyer, J., Christley, S., Toby, I. T. & Cowell, L. G. Biophysicochemical motifs in T-cell receptor sequences distinguish repertoires from tumor-infiltrating lymphocyte and adjacent healthy tissue. _Cancer Res._ 79, 1671–1680 (2019).
21. Thomas, N. et al. Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence. _Bioinformatics_ 30, 3181–3188 (2014).
22. Zhang, A. W. et al. Interfaces of malignant and immunologic clonal dynamics in ovarian cancer. _Cell_ 173, 1755–1769.e22 (2018).
23. Thorsson, V. et al. The immune landscape of cancer. _Immunity_ 48, 812–830.e14 (2018).
24. Wang, T. et al. An empirical approach leveraging tumorgrafts to dissect the tumor microenvironment in renal cell carcinoma identifies missing link to prognostic inflammatory factors. _Cancer Discov._ 8, 1142–1155 (2018).
25. Tickotsky, N., Sagiv, T., Prilusky, J., Shifrut, E. & Friedman, N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. _Bioinformatics_ 33, 2924–2929 (2017).
26. Guo, X. et al. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. _Nat. Med._ 24, 978–985 (2018).
27. Zhang, L. et al. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. _Nature_ 564, 268–272 (2018).
28. Zheng, C. et al. Landscape of infiltrating T cells in liver cancer revealed by single-cell sequencing. _Cell_ 169, 1342–1356.e16 (2017).
29. Azizi, E. et al. Single-cell map of diverse immune phenotypes in the breast tumor microenvironment. _Cell_ 174, 1293–1308.e36 (2018).
30. Li, H. et al. Dysfunctional CD8 T cells form a proliferative, dynamically regulated compartment within human melanoma. _Cell_ 176, 775–789.e18 (2019).
31. Yost, K. E. et al. Clonal replacement of tumor-specific T cells following PD-1 blockade. _Nat. Med._ 25, 1251–1259 (2019).
32. Eduati, F. et al. Prediction of human population responses to toxic compounds by a collaborative competition. _Nat. Biotechnol._ 33, 933–940 (2015).
33. Bansal, M. et al. A community computational challenge to predict the activity of pairs of compounds. _Nat. Biotechnol._ 32, 1213–1222 (2014).
34. Costello, J. C. & Stolovitzky, G. Seeking the wisdom of crowds through challenge-based competitions in biomedical research. _Clin. Pharmacol. Ther._ 93, 396–398 (2013).
35. Waugh, K. A. et al. Molecular profile of tumor-specific CD8+ T cell hypofunction in a transplantable murine cancer model. _J. Immunol._ 197, 1477–1488 (2016).
36. Wu, A. A., Drake, V., Huang, H.-S., Chiu, S. & Zheng, L. Reprogramming the tumor microenvironment: tumor-induced immunosuppressive factors paralyze T cells. _Oncoimmunology_ 4, e1016700 (2015).
37. Burkholder, B. et al. Tumor-induced perturbations of cytokines and immune cell networks. _Biochim. Biophys. Acta_ 1845, 182–201 (2014).
38. Conley, J. M., Gallagher, M. P. & Berg, L. J. T cells and gene regulation: The switching on and turning up of genes after T cell receptor stimulation in CD8 T cells. _Front. Immunol._ https://doi.org/10.3389/fimmu.2016.00076 (2016).
39. Cho, J.-H. et al. Unique features of naive CD8+ T cell activation by IL-2. _J. Immunol._ 191, 5559–5573 (2013).
40. Iezzi, G., Karjalainen, K. & Lanzavecchia, A. The duration of antigenic stimulation determines the fate of naive and effector T cells. _Immunity_ 8, 89–95 (1998).
41. Moskophidis, D., Lechner, F., Pircher, H. & Zinkernagel, R. M. Virus persistence in acutely infected immunocompetent mice by exhaustion of antiviral cytotoxic effector T cells. _Nature_ 362, 758–761 (1993).
42. Kalergis, A. M. et al. Efficient T cell activation requires an optimal dwell-time of interaction between the TCR and the pMHC complex. _Nat. Immunol._ 2, 229–234 (2001).
43. Corse, E., Gottschalk, R. A., Krogsgaard, M. & Allison, J. P. Attenuated T cell responses to a high-potency ligand in vivo. _PLoS Biol._ https://doi.org/10.1371/journal.pbio.1000481 (2010).
44. Mikolov, T., Chen, K., Corrado, G. S. & Dean, J. Efficient estimation of word representations in vector space. _CoRR_ abs/1301.3781 (2013).
45. Zhang, Z. et al. SCINA: a semi-supervised subtyping algorithm of single cells and bulk samples. _Genes_ https://doi.org/10.3390/genes10070531 (2019).
46. Zhang, Z. jcao89757/TESSA: mapping the functional landscape of T cell receptor repertoire by single T cell transcriptomics. _Zenodo_ https://doi.org/10.5281/zenodo.4161819 (2020).

ACKNOWLEDGEMENTS We thank L.H.R. Xu for his valuable input on the manuscript writing. This


study was supported by the National Institutes of Health (NIH) (grant nos. CCSG 5P30CA142543 to T.W. and R15GM131390 to X.W.) and Cancer Prevention Research Institute of Texas (grant no.


CPRIT RP190208 to T.W.). AUTHOR INFORMATION AUTHORS AND AFFILIATIONS * Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern


Medical Center, Dallas, TX, USA Ze Zhang, Hongyu Liu & Tao Wang * Department of Statistical Science, Southern Methodist University, Dallas, TX, USA Danyi Xiong & Xinlei Wang * Center


for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, USA Tao Wang

CONTRIBUTIONS Z.Z. contributed to the computational analyses and manuscript writing. D.X. and X.W. contributed to the design and write-up of the


statistical methodologies. H.L. provided valuable suggestions on the direction of the project, and contributed to manuscript writing. T.W. contributed to the overall supervision of the


project, study design and manuscript writing. CORRESPONDING AUTHOR Correspondence to Tao Wang. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL


INFORMATION PEER REVIEW INFORMATION Madhura Mukhopadhyay was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the


editorial team. PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

EXTENDED DATA

EXTENDED DATA FIG. 1 DETAILS OF THE STACKED AUTO-ENCODER FOR TCR EMBEDDING. A, The structure of the auto-encoder, with the configurations of each layer shown. B, Typical examples of TCR CDR3β sequences, heatmaps of the initially embedded ‘Atchley’ matrices of TCRs, and heatmaps of the auto-encoder-reconstructed ‘Atchley’ matrices. The TCR sequence examples were not used in the training step of the auto-encoder. C, Scatterplots showing the consistency between the ‘Atchley factor’ values of the original and reconstructed TCRs. Green points represent tiles in the heatmaps in (B).

EXTENDED DATA FIG. 2 SCATTERPLOTS SHOWING THE RELATIONSHIPS BETWEEN THE DISTANCES OF TCRS AND THE DISTANCES OF RNA EXPRESSION LEVELS FOR SEVERAL MORE DATASETS. Both distances are calculated in a pair-wise manner between all the T cell clonotypes of each dataset. Four example datasets are shown: Healthy-CD8-3 (A), Healthy-CD8-4 (B), Breast-1 (C), and Breast-2 (D) (Supplementary Table 1). The P values indicate the significance of the Pearson correlation coefficients. The shaded areas denote the 95% confidence intervals for linear regressions.

EXTENDED DATA FIG. 3 THE WEIGHTS OF THE TCR EMBEDDINGS LEARNED FROM TESSA. The X axis shows the digits of the 30-dimensional embeddings, and the Y axis shows the weights learned for all datasets. Each bar represents one digit of the weights and shows the values of that digit obtained from all 19 scRNA-seq datasets in Supplementary Table 1.

EXTENDED DATA FIG. 4 BENCHMARKING RESULTS USING GLIPH. A, Clustering rates of the four Healthy-CD8 datasets from 10x Genomics, the Glanville dataset, and the Dash dataset under different global convergence distance cutoff (‘_gccutoff_’) values (Supplementary Table 1). The dashed lines represent the tessa clustering rates of the corresponding datasets. B, Clustering purities of GLIPH when ‘_gccutoff_’ equals 3. The cutoff value was selected so that the GLIPH clusters achieved clustering rates most similar to those of the tessa networks. The clustering purities were calculated with the same method as in Fig. 2. C, D, The GLIPH network purities (C) and numbers of networks (D) with different ‘_gccutoff_’ values, compared with the tessa network purities and numbers of networks.

EXTENDED DATA FIG. 5 CLUSTERING OF TCR CLONOTYPES INFORMED BY TESSA IS REFLECTIVE OF ANTIGEN BINDING SPECIFICITY. The antigen binding specificity of 207 human TCRβ chains from 704 T cells was profiled against two epitopes in the Dash dataset, and that of 276 TCRs from 415 T cells against three epitopes in the Glanville dataset. A, B, t-SNE plots showing the TCR clonotypes in the space of the TCR embeddings, with the embeddings adjusted by the tessa-inferred weights. The hierarchical clustering tree cutoff used in the two plots is shown with green dashed lines in C–F. Each point in the plots represents one TCR clonotype, and the size of the point refers to the clone size. Points are colored by the true antigens that the corresponding TCRs target according to the original report. Points are connected if they are clustered into the same network based on hierarchical clustering of the TCR embeddings. T cell clones with only one cell, which were deemed to have low confidence, and unclustered clones were excluded from visualization; this exclusion does not affect the calculation of the purities. C, D, The numbers of TCR networks and the clustering rates with different hierarchical tree cutoffs in the Dash dataset (C) and in the Glanville dataset (D). Clustering rates were calculated as the number of TCR clonotypes that are clustered with at least one other TCR clonotype, divided by the total number of TCR clonotypes. E, F, The network purities and P values testing the significance of the purities with different hierarchical tree cutoffs in the Dash dataset (E) and the Glanville dataset (F). The network purity and P value calculations are described in the Methods section.

EXTENDED DATA FIG. 6 T CELL PATHWAY ACTIVITY SCORES OF THE DIFFERENT T CELL SUBSETS IN THE BCC DATASET. The naive and activated pathways are shown, to be compared against the inhibition, memory and exhausted pathways shown in Fig. 3. The T cell subsets are the same as those in Fig. 3e–g.

EXTENDED DATA FIG. 7 PSEUDOTIME ANALYSIS OF THE DIFFERENT T CELL SUBSETS IN THE BCC DATASET. The T cell subsets are the same as those in Fig. 3e–g.

EXTENDED DATA FIG. 8 A CARTOON SKETCH SHOWING HOW THE UNEXPLAINED VARIANCE IN GENE EXPRESSION OF THE TCR NETWORKS WAS DETERMINED. Details are described in the MATERIALS AND METHODS section.

SUPPLEMENTARY INFORMATION


SUPPLEMENTARY INFORMATION Supplementary Tables 1 and 2 and Notes 1 and 2.

REPORTING SUMMARY

SOURCE DATA Statistical source data are provided for Figs. 1–4 and Extended Data Figs. 1–8.
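The caption of Extended Data Fig. 5 defines the clustering rate as the number of TCR clonotypes clustered with at least one other clonotype, divided by the total number of clonotypes. The Python sketch below illustrates that ratio only; it is not taken from the tessa code. The function name `clustering_rate`, the dictionary input and the use of `None` for an unclustered clonotype are assumptions made for illustration.

```python
from collections import Counter


def clustering_rate(network_of_clonotype):
    """Fraction of clonotypes that share a network with at least one other clonotype.

    network_of_clonotype: dict mapping a clonotype ID to a network label.
    A label of None marks an unclustered clonotype (illustrative convention).
    """
    if not network_of_clonotype:
        raise ValueError("at least one clonotype is required")
    # Count how many clonotypes fall into each network.
    sizes = Counter(
        label for label in network_of_clonotype.values() if label is not None
    )
    # A clonotype counts as clustered only if its network has two or more members.
    clustered = sum(
        1
        for label in network_of_clonotype.values()
        if label is not None and sizes[label] >= 2
    )
    return clustered / len(network_of_clonotype)


# Hypothetical toy assignment: two networks of size 2, one singleton network,
# and one unclustered clonotype.
example = {"c1": "n1", "c2": "n1", "c3": "n2", "c4": None, "c5": "n3", "c6": "n3"}
rate = clustering_rate(example)
```

In this toy input four of the six clonotypes share a network with another clonotype, so the rate is 4/6. A clonotype that sits alone in its network is counted as unclustered, which matches the "at least one other TCR clonotype" wording of the caption.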


ABOUT THIS ARTICLE

CITE THIS ARTICLE Zhang, Z., Xiong, D., Wang, X. _et al._ Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics. _Nat Methods_ 18, 92–99 (2021). https://doi.org/10.1038/s41592-020-01020-3

* Received: 08 April 2020
* Accepted: 12 November 2020
* Published: 06 January 2021
* Issue Date: January 2021
* DOI: https://doi.org/10.1038/s41592-020-01020-3