A machine learning model for ranking candidate HLA class I neoantigens based on known neoepitopes from multiple human tumor types



ABSTRACT Tumor neoepitopes presented by major histocompatibility complex (MHC) class I are recognized by tumor-infiltrating lymphocytes (TIL) and are targeted by adoptive T-cell therapies.


Identifying which mutant neoepitopes from tumor cells are capable of recognition by T cells can assist in the development of tumor-specific, cell-based therapies and can shed light on


antitumor responses. Here, we generate a ranking algorithm for class I candidate neoepitopes by using next-generation sequencing data and a dataset of 185 neoepitopes that are recognized by


HLA class I–restricted TIL from individuals with metastatic cancer. Random forest model analysis showed that the inclusion of multiple factors impacting epitope presentation and recognition


increased output sensitivity and specificity compared to the use of predicted HLA binding alone. The ranking score output provides a set of class I candidate neoantigens that may serve as


therapeutic targets and provides a tool to facilitate in vitro and in vivo studies aimed at the development of more effective immunotherapies.
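As an illustration of how a ranking score can be evaluated in terms of sensitivity and specificity, the following is a minimal sketch. It does not reproduce the random forest model described in the abstract: the `Candidate` class, the function names, the peptide labels and all scores are hypothetical toy values, and the only thing shown is how candidates might be sorted by a score and assessed at a chosen cutoff.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate neoepitope (toy data only)."""
    peptide: str       # placeholder label, not a real sequence
    score: float       # assumed model output; higher means ranked earlier
    recognized: bool   # toy label: was T-cell recognition observed?


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates sorted from highest to lowest score."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)


def sensitivity_specificity(ranked: List[Candidate], top_k: int) -> Tuple[float, float]:
    """Treat the top_k ranked candidates as predicted positives."""
    selected, rest = ranked[:top_k], ranked[top_k:]
    tp = sum(c.recognized for c in selected)   # recognized and selected
    fp = len(selected) - tp                    # selected but not recognized
    fn = sum(c.recognized for c in rest)       # recognized but missed
    tn = len(rest) - fn                        # correctly left out
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy example: six candidates, two of which are labeled as recognized.
toy = [
    Candidate("pep1", 0.91, True),
    Candidate("pep2", 0.80, False),
    Candidate("pep3", 0.55, True),
    Candidate("pep4", 0.40, False),
    Candidate("pep5", 0.22, False),
    Candidate("pep6", 0.05, False),
]
ranked = rank_candidates(toy)
sens, spec = sensitivity_specificity(ranked, top_k=2)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Comparing two scoring schemes (for example, a multi-feature score versus predicted binding alone) would amount to running the same evaluation on each set of scores over the same labeled candidates.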


DATA AVAILABILITY All


next-generation sequencing data are available on dbGaP under accession number phs001003.v1.p1. Source data are available from the NIH figshare repository at


https://doi.org/10.35092/yhjc.c.4792338.v2 (ref. 56). CODE AVAILABILITY The models developed and presented in this paper are available at


https://github.com/JaredJGartner/SB_neoantigen_Models.

REFERENCES

* Huang, J. et al. T cells associated with tumor regression recognize frameshifted products of the _CDKN2A_ tumor suppressor gene locus and a mutated HLA class I gene product. _J. Immunol._ 172, 6057–6064 (2004).
* Zhou, J., Dudley, M. E., Rosenberg, S. A. & Robbins, P. F. Persistence of multiple tumor-specific T-cell clones is associated with complete tumor regression in a melanoma patient receiving adoptive cell transfer therapy. _J. Immunother._ 28, 53–62 (2005).
* Robbins, P. F. et al. Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. _Nat. Med._ 19, 747–752 (2013).
* Lu, Y. C. et al. Mutated PPP1R3B is recognized by T cells used to treat a melanoma patient who experienced a durable complete tumor regression. _J. Immunol._ 190, 6034–6042 (2013).
* Lu, Y. C. et al. Efficient identification of mutated cancer antigens recognized by T cells associated with durable tumor regressions. _Clin. Cancer Res._ 20, 3401–3410 (2014).
* Prickett, T. D. et al. Durable complete response from metastatic melanoma after transfer of autologous T cells recognizing 10 mutated tumor antigens. _Cancer Immunol. Res._ 4, 669–678 (2016).
* Tran, E. et al. Cancer immunotherapy based on mutation-specific CD4+ T cells in a patient with epithelial cancer. _Science_ 344, 641–645 (2014).
* Tran, E. et al. T-cell transfer therapy targeting mutant KRAS in cancer. _N. Engl. J. Med._ 375, 2255–2262 (2016).
* Zacharakis, N. et al. Immune recognition of somatic mutations leading to complete durable regression in metastatic breast cancer. _Nat. Med._ 24, 724–730 (2018).
* Rizvi, N. A. et al. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. _Science_ 348, 124–128 (2015).
* McGranahan, N. et al. Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade. _Science_ 351, 1463–1469 (2016).
* Hellmann, M. D. et al. Genomic features of response to combination immunotherapy in patients with advanced non-small-cell lung cancer. _Cancer Cell_ 33, 843–852 (2018).
* Le, D. T. et al. Mismatch repair deficiency predicts response of solid tumors to PD-1 blockade. _Science_ 357, 409–413 (2017).
* Le, D. T. et al. PD-1 blockade in tumors with mismatch-repair deficiency. _N. Engl. J. Med._ 372, 2509–2520 (2015).
* Peltomaki, P. DNA mismatch repair and cancer. _Mutat. Res._ 488, 77–85 (2001).
* Peters, B. & Sette, A. Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method. _BMC Bioinf._ 6, 132 (2005).
* Alvarez, B. et al. NNAlign_MA; MHC peptidome deconvolution for accurate MHC binding motif characterization and improved T-cell epitope predictions. _Mol. Cell. Proteomics_ 18, 2459–2477 (2019).
* O’Donnell, T. J. et al. MHCflurry: open-source class I MHC binding affinity prediction. _Cell Syst._ 7, 129–132 (2018).
* Duan, F. et al. Genomic and bioinformatic profiling of mutational neoepitopes reveals new rules to predict anticancer immunogenicity. _J. Exp. Med._ 211, 2231–2248 (2014).
* Bulik-Sullivan, B. et al. Deep learning using tumor HLA peptide mass spectrometry datasets improves neoantigen identification. _Nat. Biotechnol._ 37, 55–63 (2019).
* Hundal, J. et al. pVACtools: a computational toolkit to identify and visualize cancer neoantigens. _Cancer Immunol. Res._ 8, 409–420 (2020).
* Bjerregaard, A. M., Nielsen, M., Hadrup, S. R., Szallasi, Z. & Eklund, A. C. MuPeXI: prediction of neo-epitopes from tumor sequencing data. _Cancer Immunol. Immunother._ 66, 1123–1130 (2017).
* Kim, S. et al. Neopepsee: accurate genome-level prediction of neoantigens by harnessing sequence and amino acid immunogenicity information. _Ann. Oncol._ 29, 1030–1036 (2018).
* Kosaloglu-Yalcin, Z. et al. Predicting T cell recognition of MHC class I restricted neoepitopes. _Oncoimmunology_ 7, e1492508 (2018).
* Brown, S. D. et al. Neo-antigens predicted by tumor genome meta-analysis correlate with increased patient survival. _Genome Res._ 24, 743–750 (2014).
* Balachandran, V. P. et al. Identification of unique neoantigen qualities in long-term survivors of pancreatic cancer. _Nature_ 551, 512–516 (2017).
* Parkhurst, M. R. et al. Unique neoantigens arise from somatic mutations in patients with gastrointestinal cancers. _Cancer Discov._ 9, 1022–1035 (2019).
* Tran, E. et al. Immunogenicity of somatic mutations in human gastrointestinal cancers. _Science_ 350, 1387–1390 (2015).
* Lo, W. et al. Immunologic recognition of a shared p53 mutated neoantigen in a patient with metastatic colorectal cancer. _Cancer Immunol. Res._ 7, 534–543 (2019).
* Jurtz, V. et al. NetMHCpan-4.0: improved peptide–MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data. _J. Immunol._ 199, 3360–3368 (2017).
* Gfeller, D. et al. The length distribution and multiple specificity of naturally presented HLA-I ligands. _J. Immunol._ 201, 3705–3716 (2018).
* Sarkizova, S. et al. A large peptidome dataset improves HLA class I epitope prediction across most of the human population. _Nat. Biotechnol._ 38, 199–209 (2020).
* Paul, S. et al. HLA class I alleles are associated with peptide-binding repertoires of different size, affinity, and immunogenicity. _J. Immunol._ 191, 5831–5839 (2013).
* Chen, W., Yewdell, J. W., Levine, R. L. & Bennink, J. R. Modification of cysteine residues in vitro and in vivo affects the immunogenicity and antigenicity of major histocompatibility complex class I-restricted viral determinants. _J. Exp. Med._ 189, 1757–1764 (1999).
* Chen, J. L. et al. Structural and kinetic basis for heightened immunogenicity of T cell vaccines. _J. Exp. Med._ 201, 1243–1255 (2005).
* Sachs, A. et al. Impact of cysteine residues on MHC binding predictions and recognition by tumor-reactive T cells. _J. Immunol._ 205, 539–549 (2020).
* Sahin, U. et al. Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer. _Nature_ 547, 222–226 (2017).
* Horton, P. et al. WoLF PSORT: protein localization predictor. _Nucleic Acids Res._ 35, W585–W587 (2007).
* Abelin, J. G. et al. Mass spectrometry profiling of HLA-associated peptidomes in mono-allelic cells enables more accurate epitope prediction. _Immunity_ 46, 315–326 (2017).
* Rasmussen, M. et al. Pan-specific prediction of peptide–MHC class I complex stability, a correlate of T cell immunogenicity. _J. Immunol._ 197, 1517–1524 (2016).
* Jorgensen, K. W., Rasmussen, M., Buus, S. & Nielsen, M. NetMHCstab—predicting stability of peptide–MHC-I complexes; impacts for cytotoxic T lymphocyte epitope discovery. _Immunology_ 141, 18–26 (2014).
* Groettrup, M., Kirk, C. J. & Basler, M. Proteasomes in immune cells: more than peptide producers? _Nat. Rev. Immunol._ 10, 73–78 (2010).
* Larsen, M. V. et al. An integrative approach to CTL epitope prediction: a combined algorithm integrating MHC class I binding, TAP transport efficiency, and proteasomal cleavage predictions. _Eur. J. Immunol._ 35, 2295–2303 (2005).
* Capietto, A. H. et al. Mutation position is an important determinant for predicting cancer neoantigens. _J. Exp. Med._ 217, e20190179 (2020).
* Calis, J. J. et al. Properties of MHC class I presented peptides that enhance immunogenicity. _PLoS Comput. Biol._ 9, e1003266 (2013).
* Chowell, D. et al. TCR contact residue hydrophobicity is a hallmark of immunogenic CD8+ T cell epitopes. _Proc. Natl Acad. Sci. USA_ 112, E1754–E1762 (2015).
* Cohen, C. J. et al. Isolation of neoantigen-specific T cells from tumor and peripheral lymphocytes. _J. Clin. Invest._ 125, 3981–3991 (2015).
* Gros, A. et al. PD-1 identifies the patient-specific CD8+ tumor-reactive repertoire infiltrating human tumors. _J. Clin. Invest._ 124, 2246–2259 (2014).
* Gros, A. et al. Prospective identification of neoantigen-specific lymphocytes in the peripheral blood of melanoma patients. _Nat. Med._ 22, 433–438 (2016).
* Parkhurst, M. et al. Isolation of T-cell receptors specifically reactive with mutated tumor-associated antigens from tumor-infiltrating lymphocytes based on CD137 expression. _Clin. Cancer Res._ 23, 2491–2505 (2017).
* Stevanovic, S. et al. Landscape of immunogenic tumor antigens in successful immunotherapy of virally induced epithelial cancer. _Science_ 356, 200–205 (2017).
* Deniger, D. C. et al. T-cell responses to TP53 “Hotspot” mutations and unique neoantigens expressed by human ovarian cancers. _Clin. Cancer Res._ 24, 5562–5573 (2018).
* Yossef, R. et al. Enhanced detection of neoantigen-reactive T cells targeting unique and shared oncogenes for personalized cancer immunotherapy. _JCI Insight_ 3, e122467 (2018).
* Gros, A. et al. Recognition of human gastrointestinal cancer neoantigens by circulating PD-1+ lymphocytes. _J. Clin. Invest._ 129, 4992–5004 (2019).
* Larsen, M. V. et al. Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction. _BMC Bioinf._ 8, 424 (2007).
* Gartner, J. Datasets for ‘Development of a model for ranking candidate HLA class I neoantigens based upon datasets of known neoepitopes’. figshare https://doi.org/10.35092/yhjc.c.4792338.v2 (2020).

ACKNOWLEDGEMENTS We


thank members of the NIH High Performance Computing (HPC) group for all of their support, assistance and technical advice. This work utilized the computational resources of the NIH HPC


Biowulf cluster (http://hpc.nih.gov). We also thank all members of the tissue procurement team for all of their efforts in acquiring and maintaining the specimens used in this study.

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* Surgery Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA: Jared J. Gartner, Maria R. Parkhurst, Amy Copeland, Ken-Ichi Hanada, Nikolaos Zacharakis, Almin Lalani, Sri Krishna, Abraham Sachs, Todd D. Prickett, Yong F. Li, Maria Florentin, Scott Kivitz, Samuel C. Chatmon, Steven A. Rosenberg & Paul F. Robbins
* Vall d’Hebron Institute of Oncology (VHIO), Cellex Center, Barcelona, Spain: Alena Gros
* Earle A. Chiles Research Institute, Providence Cancer Institute, Portland, OR, USA: Eric Tran
* Department of Surgery, Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA: Mohammad S. Jafferji

CONTRIBUTIONS J.J.G., P.F.R. and S.A.R. designed the study and drafted the manuscript. J.J.G. trained models and


evaluated all nmers and mmps. T.D.P. and S.C.C. generated exomes and RNA-seq libraries. N.Z., K.H., Y.F.L. and P.F.R. designed minigene constructs encoding candidate neoantigens and


generated in vitro-transcribed RNA used to perform screening assays. M.R.P., M.F. and S. Kivitz synthesized peptides used for T-cell screening assays. M.R.P., A.G., E.T., M.S.J., A.C., K.H.,


N.Z., A.L., S. Krishna and A.S. evaluated T cells for their ability to recognize nmers/mmps in the context of the appropriate HLA class I restriction elements.

CORRESPONDING AUTHOR Correspondence to Paul F. Robbins.

ETHICS DECLARATIONS

COMPETING INTERESTS The authors declare no competing interests.

ADDITIONAL INFORMATION

PEER REVIEW INFORMATION _Nature Cancer_ thanks the anonymous reviewers for their contribution to the peer review of this work.

PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

EXTENDED DATA

EXTENDED DATA FIG. 1 PERCENTILE RANK COMPARISONS BETWEEN NETMHCPAN4.0 EL AND MHCFLURRY1.6 PERCENTILE RANK. Percentile ranks of positive mmps were plotted with their MHCflurry1.6 rank on the x-axis and their NetMHCpan4.0 EL model rank on the y-axis. Red triangles correspond to mmps containing cysteine residues at position 2, position 3 or the C-terminus (n=12), while orange dots correspond to peptides containing cysteine residues at position 1 or between position 3 and the C-terminus (n=107).

EXTENDED DATA FIG. 2 NMER LOCALIZATION PREDICTIONS. The WoLF PSORT algorithm was used on all nmer proteins (n=9541) to predict localization. Blue bars are CD8+ positive nmers; orange bars are negative nmers. The y-axis represents the frequency of each group predicted to localize to each compartment. The x-axis shows the WoLF PSORT prediction abbreviations: chlo = chloroplast, cyto = cytosol, cysk = cytoskeleton, E.R. = endoplasmic reticulum, extr = extracellular, golg = Golgi apparatus, lyso = lysosome, mito = mitochondria, nucl = nuclear, pero = peroxisome, plas = plasma membrane, vacu = vacuolar membrane. Individual totals for the positive and negative groups can be found in Supplementary Table 12. Hyphenated values denote compound predictions. P-values comparing positive to negative nmers are displayed over each prediction. P-values were calculated using a two-sided Fisher’s exact test and corrected for multiple comparisons using the Bonferroni correction.

EXTENDED DATA FIG. 3 GENE EXPRESSION DECILE OF MMPS. Gene expression deciles of positive (n=119) and negative (n=2681162) mmps. The box indicates quartiles 2 and 3 and the interquartile range (IQR), the median is indicated by the line in the box, and whiskers represent quartiles 1 and 4 ± 1.5× IQR, or the minimum/maximum value if within the whisker values. Significance was calculated with the Mann-Whitney U test.

EXTENDED DATA FIG. 4 IEDB IMMUNOGENICITY SCORES OF MMPS. IEDB immunogenicity scores were generated for each mmp using the IEDB immunogenicity tool. The panels are split into all mmps (positive n=119, negative n=2681162), only those with a mutation in an anchor position (position 2, position 3 or the C-terminus; positive n=55, negative n=1167363) and those without a mutation in position 2, position 3 or the C-terminus (positive n=64, negative n=1513799). The box indicates quartiles 2 and 3 and the interquartile range (IQR), the median is indicated by the line in the box, and whiskers represent quartiles 1 and 4 ± 1.5× IQR, or the minimum/maximum value if within the whisker values. Significance was calculated with the Mann-Whitney U test.

EXTENDED DATA FIG. 5 HYDROPHOBICITY SCORES OF T-CELL CONTACT REGIONS. Hydrophobicity scores were calculated by summing the Kyte-Doolittle hydrophobicity scores of positions 4 through n-1. The panels are split into all mmps (positive n=119, negative n=2681162), only those with a mutation in an anchor position (position 2, position 3 or the C-terminus; positive n=55, negative n=1167363) and those without a mutation in position 2, position 3 or the C-terminus (positive n=64, negative n=1513799). The box indicates quartiles 2 and 3 and the interquartile range (IQR), the median is indicated by the line in the box, and whiskers represent quartiles 1 and 4 ± 1.5× IQR, or the minimum/maximum value if within the whisker values. Significance was calculated with the Mann-Whitney U test.

EXTENDED DATA FIG. 6 TOP NMER MODELS USING EITHER MMP SCORE OR MHCFLURRY SCORE AS INPUT. ROC curves showing the mean performance of the top models using either MMP model scores or MHCflurry scores as input. The solid line represents the mean for each model across n=5 folds; the shaded area is the standard deviation at each point along the x-axis.
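The hydrophobicity score described in the caption of Extended Data Fig. 5 (the sum of Kyte-Doolittle values over positions 4 through n-1 of a peptide) can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the function name `contact_region_hydrophobicity` is hypothetical, positions are assumed to be 1-based and inclusive, and the table holds the standard Kyte-Doolittle hydropathy values.

```python
# Standard Kyte-Doolittle hydropathy values for the 20 amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def contact_region_hydrophobicity(peptide: str) -> float:
    """Sum Kyte-Doolittle values over positions 4..n-1 (1-based, inclusive).

    Hypothetical helper: assumes the peptide uses the 20 standard
    one-letter codes; any other character raises KeyError.
    """
    peptide = peptide.upper()
    # 1-based positions 4..n-1 correspond to the 0-based slice [3 : n-1].
    contact_region = peptide[3:len(peptide) - 1]
    return sum(KYTE_DOOLITTLE[aa] for aa in contact_region)


# Example with the 8-mer SIINFEKL: positions 4..7 are N, F, E, K,
# so the score is -3.5 + 2.8 - 3.5 - 3.9.
print(f"{contact_region_hydrophobicity('SIINFEKL'):.1f}")  # prints -8.1
```

Peptides of four residues or fewer have no positions in this range, so the sketch returns 0 for them; whether such peptides need special handling is left open here.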


SUPPLEMENTARY INFORMATION

* Reporting Summary
* Supplementary Tables 1–23

ABOUT THIS ARTICLE

CITE THIS ARTICLE Gartner, J.J., Parkhurst, M.R., Gros, A. _et al._ A machine learning model for ranking candidate HLA class I neoantigens based on known neoepitopes from multiple human tumor types. _Nat Cancer_ 2, 563–574 (2021). https://doi.org/10.1038/s43018-021-00197-6

* Received: 06 December 2019
* Accepted: 11 March 2021
* Published: 03 May 2021
* Issue Date: May 2021
* DOI: https://doi.org/10.1038/s43018-021-00197-6