Overall objectives of the project
Personalised cancer immunotherapy has tremendous potential to benefit society. Successful cancer immunotherapies are expected not only to reduce health care costs but also to improve the quality of life of affected patients. Clinical trials have demonstrated profound tumor regression, including complete cure, in patients with metastatic cancer after treatment with immune checkpoint blockers. Importantly, technological advances such as next-generation sequencing (NGS) allow, for the first time, the development of personalised cancer immunotherapies that target patient-specific mutations.
However, clinical application is currently hampered by specific bottlenecks in bioinformatics. The project APERIM aimed to accelerate the clinical translation and maximize the accessibility and utility of biomedical data in research and medicine.
The overall objective of APERIM was to develop an advanced bioinformatics platform for personalised cancer immunotherapy. Towards this goal, a transdisciplinary network of leading experts in bioinformatics and cancer immunology worked on methods development, methods validation, software implementation, and software testing. The major tools developed and assembled for this platform are:
- A database for the integration of NGS data, images of whole tissue slides of tumour sections, and clinical data;
- A tool for the quantification of tumor-infiltrating lymphocytes using RNA-seq and imaging data;
- An analytical pipeline for NGS-guided personalised cancer vaccines;
- Novel tools for the characterization of T-cell receptor (TCR) sequences including software that identifies epitopes that are likely to elicit a cytotoxic T-cell immune response;
- A database of antigen-specific TCR sequences.
The bioinformatics methods we developed are an important prerequisite for, and will enhance, personalised health care in the context of cancer immunotherapy. The methods and the easy-to-use software tools enable comprehensive characterization of patient samples, provide the basis for developing efficient therapeutic strategies, and are ultimately expected to benefit society.
During the project period, APERIM partners set up architectures and tools for the development of novel analytical software pipelines:
- An advanced bioinformatics database (The Cancer Immunome Atlas, http://tcia.at ) was developed by the Medical University of Innsbruck, Austria (I-Med). It contains the immunogenomic characterization of 20 solid cancers with >8000 tumor samples from The Cancer Genome Atlas. This database is publicly available and provides, for the first time, a comprehensive view of the cellular composition of the intratumoral immune infiltrates. The database also enables the integration of images from whole-tissue slides (digital pathology) and thereby integrative analyses of NGS data and imaging data.
- Efremova M, et al. Targeting immune checkpoints potentiates immunoediting and changes the dynamics of tumor evolution. Nature Communications 9, 2018
- Charoentong P, et al. Pan-cancer Immunogenomic Analyses Reveal Genotype-Immunophenotype Relationships and Predictors of Response to Checkpoint Blockade. Cell Rep. 2017
- Tappeiner E, et al. TIminer: NGS data mining pipeline for cancer immunology and immunotherapy. 2017
- Finotello F, Trajanoski Z. New strategies for cancer immunotherapy: targeting regulatory T cells. Genome Med. 2017
- Hackl H, Charoentong P, Finotello F, Trajanoski Z. Computational genomics tools for dissecting tumour-immune cell interactions. Nat Rev Genet. 2016
- To develop tools for the automated quantification of tumor-infiltrating lymphocytes (TILs), partner Definiens (Germany) set up a database for TIL quantification by cell segmentation and classification (stromal TILs, epithelial TILs), and performed region classifications (epithelium/stroma/necrosis) and tumor core annotations for slides with clinical data. INSERM (France) worked on a fully automatic image analysis solution to detect TILs in tissue slides. The resulting tool, the CRC classifier, is based on the Immunoscore software and can be used to stratify patients into short- and long-term survivors and to further develop targets for novel immunotherapies.
The digital TILSorter was developed by the Spanish partner CNIC in collaboration with I-Med to enumerate and quantify the immune infiltration in colorectal and breast cancer RNA-Seq samples, starting from single-cell RNA-Seq (scRNASeq) data. The method is based on a deep learning strategy: a deep neural network (DNN) model quantifies not only lymphocytes as a general population but also the amounts of specific CD8+, CD4Tmem, CD4Th and CD4Treg subpopulations, as well as B cells and stromal content. In addition to this subpopulation specificity, the signatures were built from scRNASeq data from the tumour, preserving the specific characteristics of the tumour microenvironment, as opposed to other approaches in which cells were isolated from blood.
- Mlecnik B, et al. Integrative analyses of colorectal cancer show Immunoscore is a stronger predictor of patient survival than microsatellite instability. Immunity. 2016
- Mlecnik B, et al. The tumor microenvironment and Immunoscore are critical determinants of dissemination to distant metastasis. Sci Transl Med. 2016
- Mlecnik B, et al. Comprehensive Intrametastatic Immune Quantification and Major Impact of Immunoscore on Survival. J Natl Cancer Inst. 2018
- To provide an analytical pipeline for NGS-guided personalised cancer vaccines, various models were developed to be compatible with each other, so that they can be combined into a complete pipeline. Partner TRON (Germany) established the iCaM2.0 NGS data analyser, a standardized prototype pipeline for fusion gene detection and validation, which analyzes the data for a single patient. The pipeline manages the interdependent tasks, including short sequence read alignment, somatic mutation detection, determination of mutation effects and prediction of antigenicity. TRON will complete this objective with a performance report that includes key metrics such as run-time per patient, the number of detected mutations and neo-antigens, and the ability to detect known neo-antigens in a set of well-characterized patient data sets. Recently, the group extended the space of accessible neo-antigens through in-depth testing and characterization of NGS-based fusion-transcript detection methods and the subsequent prediction of neo-antigen candidates.
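The dependency handling and one of the performance metrics described above can be sketched in a few lines of Python. This is only an illustration and not the iCaM2.0 implementation: the task names, the dependency table and the two helper functions are hypothetical, chosen to mirror the four steps and the "ability to detect known neo-antigens" metric named in the text.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names mirroring the four steps named in the report.
# Each key maps to the set of tasks that must finish before it can start.
DEPENDENCIES = {
    "read_alignment": set(),
    "somatic_mutation_detection": {"read_alignment"},
    "mutation_effect_determination": {"somatic_mutation_detection"},
    "antigenicity_prediction": {"mutation_effect_determination"},
}


def execution_order(dependencies):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def known_neoantigen_recall(predicted, known):
    """Fraction of known neo-antigens that appear among the predictions."""
    known = set(known)
    if not known:
        raise ValueError("need at least one known neo-antigen")
    return len(known & set(predicted)) / len(known)


# Order in which a per-patient run would execute the tasks.
ORDER = execution_order(DEPENDENCIES)
```

A scheduler of this kind makes it explicit which step can start once its inputs exist, and the recall function shows how detection of known neo-antigens could be scored on a well-characterized data set.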
The software Immunopredictor was developed by the University of Utrecht (UU), Netherlands, and is a pipeline that provides predictors for antigen presentation on the cell surface. It outputs a set of scores that are relevant for predicting the potential of query epitopes (peptide-MHC complexes) to elicit effective cytotoxic T-cell responses. Immunopredictor was released on GitHub (https://github.com/APERIM-EU/WP3-ImmunePredictor).
For the selection of sets of potential neo-epitope targets, the University of Tuebingen (UT), Germany, implemented multiple strategies that depend on the level of information available. The models have been implemented in a software product that is disseminated through the established infrastructure (https://github.com/APERIM-EU/WP3-EpitopeSelector, https://hub.docker.com/r/aperim/epitopeselector/ ) and uses the output of NGSanalyser and Immunopredictor. The models integrate neo-antigen and HLA allele expression, self-similarity, and binding strength/immunogenicity of the neo-epitopes, and optimize overall immunogenicity and antigen coverage while minimizing the “vaccine failure” risk. UT further extended the framework with options to include or exclude specific peptides and to use rank-based immunogenicity estimates. To improve accessibility and usability, UT implemented a web-based graphical user interface for the software product, which has been deployed as part of UT’s platform for biomedical research, qPortal. In addition, UT developed an immunoinformatics toolbox (ImmunoNodes) that is fully integrated into the visual workflow environment KNIME and facilitates the use of tools including (neo)epitope prediction and HLA typing (https://github.com/qbicsoftware/vaccine-designer-portlet).
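To illustrate the trade-off between overall immunogenicity and antigen coverage described above, the following Python sketch selects a fixed number of epitopes with a simple greedy heuristic. It is an assumption-laden stand-in and not the UT model or the EpitopeSelector API: the Candidate fields, the coverage_bonus parameter, the scoring rule and the demo values are all hypothetical, and the "vaccine failure" risk term is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    peptide: str
    antigen: str           # source neo-antigen the peptide derives from
    immunogenicity: float  # hypothetical predicted score, higher is better
    expression: float      # hypothetical expression weight of the antigen


def select_epitopes(candidates, k, include=(), exclude=(), coverage_bonus=1.0):
    """Greedily pick up to k epitopes (illustrative heuristic only).

    Each step takes the candidate with the highest
    immunogenicity * expression, plus a bonus if its antigen is not
    yet covered. Peptides in `include` are taken first, peptides in
    `exclude` are never taken.
    """
    excluded, forced = set(exclude), set(include)
    pool = [c for c in candidates if c.peptide not in excluded]
    chosen = [c for c in pool if c.peptide in forced][:k]
    covered = {c.antigen for c in chosen}
    remaining = [c for c in pool if c.peptide not in forced]

    def gain(c):
        bonus = coverage_bonus if c.antigen not in covered else 0.0
        return c.immunogenicity * c.expression + bonus

    while len(chosen) < k and remaining:
        best = max(remaining, key=gain)
        chosen.append(best)
        covered.add(best.antigen)
        remaining.remove(best)
    return chosen


# Invented demo candidates; peptide strings and scores are placeholders.
DEMO = [
    Candidate("AAA", "mut1", 0.9, 1.0),
    Candidate("BBB", "mut1", 0.8, 1.0),
    Candidate("CCC", "mut2", 0.5, 1.0),
    Candidate("DDD", "mut3", 0.4, 0.5),
]
```

With the coverage bonus switched on, the second pick moves from the second-best peptide of an already covered antigen to a peptide from a new antigen, which is the coverage effect the text refers to. An exact optimization would typically replace the greedy loop.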
The complete analytical pipeline for cancer vaccines has been evaluated with other software tools and clinical data at TRON and NKI. SOPs have been developed for use of the software at TRON.
Derivatives of the implemented software are used in clinical neoantigen vaccination trials of collaborating pharmaceutical companies. TRON uses the established SOPs and software for internal research and development and for collaborations with international partners from academia and industry.
- Schubert B, Kohlbacher O. Designing string-of-beads vaccines with optimal spacers. Genome Med. 2016
- Schubert B, et al. FRED 2 – An Immunoinformatics Framework for Python. Bioinformatics. 2016
- Schubert B, et al. ImmunoNodes – graphical development of complex immunoinformatics workflows. BMC Bioinformatics. 2017
- Strønen E, et al. Targeting of cancer neoantigens with donor-derived T cell receptor repertoires. Science. 2016
- To predict TCR specificity, the German company AptaIT developed the TCR Analyser, a software tool that analyses next-generation sequencing (NGS) data derived from T cells to provide T-cell receptor (TCR) repertoire results. The TCR repertoire, which reflects the tumour status of an individual, can be digitized by NGS and thereby made accessible for bioinformatic analysis. NGS produces large data files, which are the input for the TCR Analyser software. The output is a SQLite database of the TCR repertoire. The database comprises the CDR3 sequences of alpha and beta TCR chains, their V and J genes annotated according to the ImMunoGeneTics (IMGT) reference database, the counts of the genes, and the entire TCR sequences (i.e. the connectivity of the individual V and J genes and CDR3 sequences).
To support various kinds of experiments, the TCR Analyser software can assemble the results from paired-end sequencing runs before parsing the TCR repertoire. A further aim was the processing of paired alpha and beta chain data, which was achieved in the second period of the project. Paired-chain data sets from both multiwell and emulsion PCR approaches can now be processed. In addition, a dynamic graphical user interface (GUI) was implemented to ease the use of the TCR Analyser (aperimTCRKit.pl).
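A repertoire stored in SQLite, as described above, can be queried with standard tooling. The sketch below uses Python's built-in sqlite3 module to show what such queries might look like. The table layout, column names and demo rows are assumptions made for illustration; the actual schema written by the TCR Analyser is not documented here, and the CDR3 strings are invented placeholders.

```python
import sqlite3

# Hypothetical schema: one row per clonotype with the fields named in
# the report (chain, V gene, J gene, CDR3 sequence, count).
SCHEMA = """
CREATE TABLE clonotype (
    id         INTEGER PRIMARY KEY,
    chain      TEXT NOT NULL CHECK (chain IN ('alpha', 'beta')),
    v_gene     TEXT NOT NULL,
    j_gene     TEXT NOT NULL,
    cdr3       TEXT NOT NULL,
    read_count INTEGER NOT NULL
);
"""

# Invented demo rows; gene names follow IMGT nomenclature.
DEMO_ROWS = [
    ("beta", "TRBV7-9", "TRBJ2-1", "CASSAAAEQFF", 120),
    ("beta", "TRBV7-9", "TRBJ2-7", "CASSGGGYEQYF", 30),
    ("beta", "TRBV20-1", "TRBJ1-1", "CSARTTTEAFF", 50),
    ("alpha", "TRAV12-2", "TRAJ42", "CAVNGGSQGNLIF", 80),
]


def build_demo_repertoire():
    """Create an in-memory database filled with the demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO clonotype (chain, v_gene, j_gene, cdr3, read_count) "
        "VALUES (?, ?, ?, ?, ?)",
        DEMO_ROWS,
    )
    conn.commit()
    return conn


def top_clonotypes(conn, chain, limit=10):
    """Most abundant clonotypes of one chain, largest count first."""
    cur = conn.execute(
        "SELECT cdr3, v_gene, j_gene, read_count FROM clonotype "
        "WHERE chain = ? ORDER BY read_count DESC LIMIT ?",
        (chain, limit),
    )
    return cur.fetchall()


def v_gene_usage(conn, chain):
    """Total read count per V gene for one chain."""
    cur = conn.execute(
        "SELECT v_gene, SUM(read_count) FROM clonotype "
        "WHERE chain = ? GROUP BY v_gene",
        (chain,),
    )
    return dict(cur.fetchall())
```

Because the output is an ordinary SQLite file, the same kind of queries could be run from any language with an SQLite driver once the real table and column names are known.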
The most challenging part of the project was the development of a TCR2Epitope package. The aim within APERIM was to evaluate the feasibility of predicting epitope characteristics from the primary sequence information of TCRs. Three years of work by partners University of Utrecht (UU) and Masarykova University (MU), Brno, Czech Republic, demonstrated that predicting CDR3 beta chain characteristics associated with epitope binding is feasible.
The software TCR2Epitope is a Python module that was developed by partner UU and is available on GitHub, including an extensive explanation of its usage (https://github.com/ewaldvandyk/TCR2Epitope). The present version of the software can rank TCR sequences based on their epitope specificity on a validation set from the VDJdb database. Although a set of patient-specific neo-epitopes can already be ranked by their probability of binding, the ranking is not yet satisfactory, and further work and data are needed to develop a model that can reliably predict TCR binding to neoantigens. The team is currently implementing an automated training module for TCR2Epitope, which allows the models to be retrained automatically every time new data are added to the VDJdb database.
The platform VDJdb (https://vdjdb.cdr3.net/) was developed by partner MU and is a comprehensive database of antigen-specific T-cell receptor (TCR) sequences acquired by manual processing of published studies that report the ligand specificities of defined T-cell clonotypes. The primary goal of VDJdb is to facilitate access to existing information on TCR antigen specificities, i.e. the ability to recognize known epitopes presented by known major histocompatibility complex (MHC) class I and II molecules. The mission was to aggregate TCR specificity information on a continuous basis and to establish a curated repository that stores these data in the public domain. In 2017-2018 VDJdb grew substantially owing to constant efforts in aggregating previously published results and input from our collaborators and the community. A milestone of 20,000 records was reached in November 2017, and a batch of 12 studies describing a variety of CMV-associated epitopes was added in May 2018, covering a majority of the CMV-specific clonotypes known to date.
The Netherlands Cancer Institute (NKI) developed a data set plus matched TCR vectors that can be used to validate TCR2Epitope and other TCR specificity prediction algorithms.
- Shugay M, et al. VDJdb: a curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res. 2017
- Shagin DA, et al. Application of nonsense-mediated primer exclusion (NOPE) for preparation of unique molecular barcoded libraries. BMC Genomics. 2017
- Bolotin DA, et al. Antigen receptor repertoire profiling from RNA-seq data. Nat Biotechnol. 2017
- Shugay M, et al. MAGERI: Computational pipeline for molecular-barcoded targeted resequencing. PLoS Comput Biol. 2017
The socio-economic impact
As cancer immunotherapy is developing at a rapid pace, with several approved drugs and application to an increasing number of common malignancies, we strongly believe that the scientific findings from the project and the developed methods and software tools will have a substantial impact on cancer diagnostics and therapy. For example, our pan-cancer analysis of the immunophenotypes and antigenomes indicates that both mutational profiles and immunological profiles are highly diverse and tissue-dependent. Thus, successful cancer therapy leading to long-term benefit will likely require a precision immuno-oncology approach in which drugs and/or drug combinations are selected rationally.
The impact of the project work on translation into clinical application is unique: the development of the integrated NGSanalyser, Immunopredictor, and EpitopeSelector solution will enable the rapid, rational design of cancer vaccines for the treatment of solid cancers. We expect that in the future the software tools will be generally useful for all cancer vaccine developers and will be specifically and immediately used in our ongoing, regulatory-approved, and planned individualized cancer vaccine clinical trials. Derivatives of the implemented NGSanalyser software are already used in clinical neoantigen vaccination trials of pharmaceutical companies collaborating with TRON (ClinicalTrials.gov identifiers: NCT02316457, NCT02035956 and NCT03289962).
In parallel with the translation of the results into clinical research there are ongoing efforts to apply regulatory principles to the manufacture and quality control of vaccines (Britten et al., 2013). Despite the fact that these therapeutic approaches pose unique regulatory challenges, the development of cancer vaccines may be pursued within the existing regulatory framework of the EU.