Instructions

The TCRex web tool provides a user-friendly interface to predict the recognition of epitopes by TCRs.

Please check whether your epitopes of interest are present in our database before making any predictions. If they are, follow the steps in section 'Predict TCR–epitope binding using prediction models provided by TCRex' to predict epitope–TCR binding. If your epitopes of interest are not available in our database, you can still make predictions for them by training new prediction models. More information is available in section 'Predict TCR–epitope binding for new epitopes by training custom prediction models'.

Predict TCR–epitope binding using prediction models provided by TCRex

Step 1: Upload TCR data file

Upload the file containing all TCR beta sequences for which you want to obtain predictions. This file must contain at least the CDR3 amino acid sequence and corresponding V/J genes for every TCR beta sequence. In case V/J gene information is not available, add V/J family information instead.

The following file formats are supported by TCRex:

immunoSEQ ANALYZER format (version 1)
File format obtained by using the 'Export Sample(s)' option when exporting your results from the immunoSEQ Analyzer platform.
immunoSEQ ANALYZER format (version 2)
File format obtained by using the 'Export Sample(s) (v2)' option when exporting your results from the immunoSEQ Analyzer platform.
MiXCR format
The MiXCR text file format for clones, obtained by exporting clones from a .clns file to a .txt file. Make sure that the exported file contains at least the following columns: 'bestJGene', 'bestVGene', and 'aaSeqCDR3'. Suitable MiXCR files can be generated using the following command line parameters when exporting clones: -jGene, -vGene, and -aaFeature CDR3. More information about the MiXCR format is available in the MiXCR documentation.
TCRex tab-delimited format

The TCRex format is a simple and general tab-delimited file format. TCRex files should contain the following columns with their corresponding headers:

  • 'CDR3_beta': containing the CDR3 amino acid sequence of the TCR beta chain
  • 'TRBJ_gene': containing the J gene of the TCR beta chain following IMGT notation
  • 'TRBV_gene': containing the V gene of the TCR beta chain following IMGT notation
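
As a minimal sketch, a table with these headers could be written with Python's standard csv module. The helper name write_tcrex, the column order, and the example record are illustrative assumptions and are not taken from TCRex or from the example file.

```python
import csv
import io

# Headers named in the TCRex tab-delimited format description.
TCREX_COLUMNS = ["CDR3_beta", "TRBJ_gene", "TRBV_gene"]


def write_tcrex(records, handle):
    """Write TCR beta records (dicts keyed by the TCRex headers) as tab-delimited text."""
    writer = csv.DictWriter(
        handle, fieldnames=TCREX_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        # Keep only the three expected columns, in a fixed order.
        writer.writerow({column: record[column] for column in TCREX_COLUMNS})


# Hypothetical record, for illustration only.
example_records = [
    {"CDR3_beta": "CASSLGQAYEQYF", "TRBJ_gene": "TRBJ2-7", "TRBV_gene": "TRBV7-9"},
]

buffer = io.StringIO()
write_tcrex(example_records, buffer)
print(buffer.getvalue(), end="")
```

To produce a file for upload, the same function can be given a handle opened with open(path, "w", newline="") instead of the in-memory buffer.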

See the following example TCRex input file. This file contains 10 human TCR beta sequences that are known to bind with EAAGIGILTV, downloaded from McPAS-TCR.

Download TCRex input example file

Attention: Make sure your CDR3 beta protein sequences are canonical CDR3 sequences (i.e. TCR beta sequences starting with a Cysteine and ending with a Phenylalanine). Predictions for non-canonical CDR3 sequences are not supported.
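
The canonical-sequence rule can be checked before upload. The sketch below tests only the two conditions stated here (first residue C, last residue F); the function names are hypothetical, and TCRex itself may apply further checks that are not described on this page.

```python
def is_canonical_cdr3(cdr3):
    """Return True if the sequence starts with C (cysteine) and ends with F (phenylalanine)."""
    sequence = cdr3.strip().upper()
    return len(sequence) >= 2 and sequence.startswith("C") and sequence.endswith("F")


def split_by_canonical(cdr3_sequences):
    """Partition sequences into (canonical, non_canonical) lists, preserving input order."""
    canonical, non_canonical = [], []
    for cdr3 in cdr3_sequences:
        if is_canonical_cdr3(cdr3):
            canonical.append(cdr3)
        else:
            non_canonical.append(cdr3)
    return canonical, non_canonical
```

Reviewing the non-canonical list before submission shows which sequences will not receive predictions.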

Attention: Make sure that only one V and one J gene is provided for every TCR sequence. TCR sequences with multiple V/J genes might be split into separate entries, each having the same CDR3 beta sequence and one of the V/J genes. This can bias the enrichment testing results.

Attention: TCRex only supports prediction files with at most 50 000 TCR sequences.
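
A quick pre-upload check of the file size could look like the sketch below. It assumes a tab-delimited file with a single header line and one TCR per row; the function names are hypothetical, and the limit is a parameter so that a different limit can be checked the same way.

```python
import csv
import io

MAX_PREDICTION_TCRS = 50_000  # limit stated above for prediction files


def count_tcr_rows(handle):
    """Count data rows in a tab-delimited file, skipping the header and blank lines."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line, if any
    return sum(1 for row in reader if any(cell.strip() for cell in row))


def within_limit(handle, limit=MAX_PREDICTION_TCRS):
    """Return True if the number of TCR rows does not exceed the limit."""
    return count_tcr_rows(handle) <= limit
```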

Step 2: Select epitope(s) of interest

Use the checkboxes for your epitopes of interest that are present in our database. The toggle function can be used to select several epitopes in the same category at once.

Step 3: Check the advanced settings

IMGT parsing
By default the TCR sequences in the input file are corrected if they contain non-IMGT genes (i.e. genes that are not listed in the IMGT database), or removed if they contain non-IMGT families (i.e. families that are not listed in the IMGT database) or orphon genes.
Enrichment threshold
TCRex performs enrichment analyses to identify the epitopes for which significantly more specific TCRs are found in the uploaded dataset than expected in a background repertoire (i.e. a representation of a normal healthy TCR repertoire). For this, an enrichment threshold (in the range of 0.01%–5%) must be chosen. This threshold represents the percentage of identified epitope-specific TCRs in the background repertoire at a certain BPR threshold. Enrichment analyses are performed for each epitope separately, i.e. for each epitope TCRex tests whether the abundance of specific TCRs is significantly higher than expected in a background repertoire. TCRex performs enrichment analyses for all epitopes for which at least 2 different TCRs are identified. By default, an enrichment threshold of 0.01% is chosen.
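
This page does not state which statistical test is used. To illustrate the idea only, the sketch below assumes a one-sided binomial test: given n uploaded TCRs, of which k are predicted to be epitope-specific, and an expected background fraction p (the enrichment threshold expressed as a fraction, e.g. 0.01% = 0.0001), it computes the probability of observing k or more specific TCRs by chance. The function name and the choice of test are assumptions.

```python
import math


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing terms computed in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    log_p = math.log(p)
    log_q = math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


# Illustrative numbers only: 5 specific TCRs among 10 000 uploaded sequences,
# tested against an assumed background fraction of 0.01%.
p_value = binomial_upper_tail(k=5, n=10_000, p=0.0001)
```

A small p value under this assumed test would indicate more epitope-specific TCRs than the background fraction alone would explain.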

Step 4: Submit the task

Please read the terms and conditions carefully before clicking the 'Submit' button. You will be automatically redirected to a new page with a unique URL showing your task ID and the status of your submission. In case of long-running tasks you can always return to this page using the unique URL containing the task ID. This page will refresh itself automatically every 10 seconds while the predictions have not been completed yet. As soon as your task is completed your predictions will be visible in a table at the bottom of the page. Your results will be kept available for at least 7 days.

Step 5: Get the results

The output page gives an overview of all submission details. This includes your task ID, the epitope(s) you selected, the file you uploaded, the time of submission, and a log overview.

Underneath, two tables are given. The first table gives an overview of the p values for each enrichment analysis. If your epitope of interest is not present in this table, the threshold of two different specific TCR sequences was not fulfilled and therefore no enrichment analysis was carried out for this epitope. The p values given by TCRex are not corrected for multiple testing.
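
Because the reported p values are uncorrected, you may want to adjust them yourself when testing several epitopes at once. The sketch below applies the Benjamini-Hochberg procedure, which is one common choice and not something TCRex prescribes; the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the same order as the input."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

The input would be the list of p values from the first table, one per epitope; the adjusted values can then be compared with your chosen false discovery rate.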

The second table contains the prediction score and the BPR (Baseline Prediction Rate) for binding TCR–epitope pairs. The prediction score reflects the confidence with which the prediction model predicts a TCR to bind the epitope of interest. The BPR is used to filter the prediction results afterwards. It gives an estimate of the fraction of background TCR sequences that are predicted to bind the epitope of interest. For example, when filtering all TCR–epitope pairs with a BPR value of 1%, you can expect that 1% of your background TCR sequences are classified as epitope-binding. Since we expect the number of true positives in a background repertoire to be very low, this BPR value approximates the false positive rate and can therefore be used to control the number of false positives. By default the BPR threshold is set to 0.01%. This can easily be changed to any user-defined value on the result page. TCR sequences with a BPR value below or equal to the chosen threshold are considered to bind the corresponding epitope. These TCR sequences are shown in the table (which is limited to 5000 TCR–epitope pairs) and can be downloaded as a tab-separated file by clicking on the 'Download results' button at the bottom of the page. We recommend downloading these filtered results, as the table can give a slightly different view of the results due to rounding of the values.
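
The same BPR filter can be applied offline to a downloaded tab-separated file. In the sketch below the column names ('CDR3_beta', 'epitope', 'score', 'bpr'), the values, and the assumption that the BPR is stored as a fraction rather than a percentage are all hypothetical; check the header of your own download before reusing it.

```python
import csv
import io


def filter_by_bpr(handle, threshold, bpr_column="bpr"):
    """Yield rows of a tab-separated table whose BPR is below or equal to the threshold."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if float(row[bpr_column]) <= threshold:
            yield row


# Hypothetical table for illustration; not real TCRex output.
example_table = (
    "CDR3_beta\tepitope\tscore\tbpr\n"
    "CASSLGQAYEQYF\tEAAGIGILTV\t0.97\t0.00005\n"
    "CASSPGTGNYGYTF\tEAAGIGILTV\t0.61\t0.002\n"
)

# Keep pairs at or below a 0.01% threshold, written as the fraction 0.0001.
kept = list(filter_by_bpr(io.StringIO(example_table), threshold=0.0001))
```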

Prediction results for all TCR sequences in your input file can be downloaded as well at the bottom of the output page. This file gives the probability of TCR–epitope recognition for every TCR sequence and selected epitope.

See the following example TCRex output file. This file was obtained using the example input file, the default BPR threshold, and IMGT parsing, with the following cancer epitopes selected: AMFWSVPTV, EAAGIGILTV, ELAGIGILTV, FLYNLLTRV, LLLGIGILV.

Download TCRex output example file

Predict TCR–epitope binding for new epitopes by training custom prediction models

Step 1: Upload the training data set

To train a new prediction model you need a data set containing TCR beta sequences that are known to bind with your epitope of interest. Please make sure that this data set contains epitope-specific TCR beta sequences for only one epitope. If this is not the case the predictions made by the prediction model will be unreliable.

The same file formats are supported as in step 1 of 'Predict TCR–epitope binding using prediction models provided by TCRex'.

Attention: Make sure your CDR3 beta protein sequences are canonical CDR3 sequences (i.e. TCR beta sequences starting with a Cysteine and ending with a Phenylalanine). Non-canonical TCR beta sequences will be removed and will not be used for training.

Attention: TCRex only supports training files with at most 500 TCR sequences.

Step 2: Upload the test data set

In addition to the training data set used to train the TCR–epitope prediction model, you can also provide a test data set. This file should contain all TCR beta sequences for which you want to obtain predictions using your newly trained prediction model. Again, the same file formats are supported as in step 1 of 'Predict TCR–epitope binding using prediction models provided by TCRex'.

Attention: Make sure your CDR3 beta protein sequences are canonical CDR3 sequences (i.e. TCR beta sequences starting with a Cysteine and ending with a Phenylalanine). Predictions for non-canonical CDR3 sequences are not supported.

Attention: Make sure that only one V and one J gene is provided for every TCR sequence. TCR sequences with multiple V/J genes might be split into separate entries, each having the same CDR3 beta sequence and one of the V/J genes. This can bias the enrichment testing results.

Attention: TCRex only supports prediction files with at most 50 000 TCR sequences.

Step 3: Check the advanced settings

IMGT parsing
By default the TCR sequences in the input file are corrected if they contain non-IMGT genes (i.e. genes that are not listed in the IMGT database), or removed if they contain non-IMGT families (i.e. families that are not listed in the IMGT database) or orphon genes.
Enrichment threshold
TCRex performs enrichment analyses to identify the epitopes for which significantly more specific TCRs are found in the uploaded dataset than expected in a background repertoire (i.e. a representation of a normal healthy TCR repertoire). For this, an enrichment threshold must be chosen (in the range of 0.01%–5%). This threshold represents the percentage of identified epitope-specific TCRs in the background repertoire at a certain BPR threshold. Enrichment analyses are performed for each epitope separately, i.e. for each epitope TCRex tests whether the abundance of specific TCRs is significantly higher than expected in a background repertoire. TCRex performs enrichment analyses for all epitopes for which at least 2 different TCRs are identified. By default, an enrichment threshold of 0.01% is chosen.

Step 4: Submit the task

Please read the terms and conditions carefully before clicking the 'Submit' button. You will be automatically redirected to a new page with a unique URL showing your task ID and the status of your submission. In case of long-running tasks you can always return to this page using the unique URL containing the task ID. This page will refresh itself automatically every 10 seconds while the predictions have not been completed yet. As soon as your task is completed your predictions will be visible in a table at the bottom of the page. Your results will be kept available for at least 7 days.

Step 5: Get the results

The output page gives an overview of all submission details. This includes your task ID, the epitope(s) you selected, the file you uploaded, the time of submission, and a log overview.

Underneath, two tables are given. The first table gives an overview of the p values for each enrichment analysis. If your epitope of interest is not present in this table, the threshold of two different specific TCR sequences was not fulfilled and therefore no enrichment analysis was carried out for this epitope. The p values given by TCRex are not corrected for multiple testing.

The second table contains the prediction score and the BPR (Baseline Prediction Rate) for binding TCR–epitope pairs. The prediction score reflects the confidence with which the prediction model predicts a TCR to bind the epitope of interest. The BPR is used to filter the prediction results afterwards. It gives an estimate of the fraction of background TCR sequences that are predicted to bind the epitope of interest. For example, when filtering all TCR–epitope pairs with a BPR value of 1%, you can expect that 1% of your background TCR sequences are classified as epitope-binding. Since we expect the number of true positives in a background repertoire to be very low, this BPR value approximates the false positive rate and can therefore be used to control the number of false positives. By default the BPR threshold is set to 0.01%. This can easily be changed to any user-defined value on the result page. TCR sequences with a BPR value below or equal to the chosen threshold are considered to bind the corresponding epitope. These TCR sequences are shown in the table (which is limited to 5000 TCR–epitope pairs) and can be downloaded as a tab-separated file by clicking on the 'Download results' button at the bottom of the page. We recommend downloading these filtered results, as the table can give a slightly different view of the results due to rounding of the values.

Prediction results for all TCR sequences in your input file can be downloaded as well at the bottom of the output page. This file gives the probability of TCR–epitope recognition for every TCR sequence and selected epitope.

Finally, the results page shows a summary of the classifier statistics, which can be used to evaluate the performance of your new prediction model. This includes the accuracy, the area under the receiver operating characteristic curve (AUROC), and the average precision, along with the ROC curve, the precision–recall curve, and an overview of the most important features.
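
For readers less familiar with these statistics, the sketch below shows their textbook definitions for binary labels (1 = binds, 0 = does not bind) and model scores. How TCRex itself computes them (for example, which data split is used) is not described here, so the function names, the 0.5 accuracy cutoff, and the tie handling are assumptions.

```python
def accuracy(labels, scores, cutoff=0.5):
    """Fraction of TCRs whose thresholded score matches the true label."""
    predictions = [1 if score >= cutoff else 0 for score in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits
```

An AUROC of 0.5 corresponds to random ranking, whereas the baseline for average precision is the fraction of positives in the evaluated data, so the two numbers are not directly comparable.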