CEN-tools is an integrative tool that identifies the underlying context that is associated with the essentiality of a gene from large-scale genome-scale CRISPR screens.
Citation: If you find CEN-tools useful please cite:
Sharma S*, Dincer C*, Weidemüller P, Wright GJ, Petsalaki E., CEN-tools: An integrative platform to identify the contexts of essential genes. MSB; doi: 10.15252/msb.20209698
Select a gene of interest and navigate through the essentiality contexts.
For examples of how to use CEN-tools, please refer to the Documentation -> Examples tab.
Essentiality scores for all projects are processed from the fold change values downloaded from the Project Score pipeline.
Contact us here for any problems or feedback on CEN-tools.
Follow @cen_tools
The Python package of CEN-tools is available here.
Sumana Sharma, Cansu Dincer, Paula Weidemüller, and Evangelia Petsalaki
This section features the core gene analysis.
Project: Four options are available - 'BROAD', 'SANGER', 'BROAD and SANGER INTEGRATED' and 'Show-all'. Integrated analysis is in beta-version and will be updated once the preprint is accepted for publication.
Start typing to select gene: A total of 16522 genes are available to choose from. Note that some genes may have no information in a given dataset, since the number of available genes differs slightly between datasets.
The displayed figure is an essentiality probability distribution. These distributions are generated in the following manner:
Identified core-essential genes were further classified into the following categories:
Drug response data is available for only a subset of cell lines with essentiality data. Therefore you might see the note 'Not enough samples for analysis' for several tissues.
Manual selection of cell lines. Clickable selection is ONLY available amongst cell lines that you have not already excluded on the side bar. Excluded (non-clickable) cell lines are shown as faint grey dots. Clickable but not selected cell lines are shown as dark grey dots.
Either drag a window over a number of cell lines and then press one of the buttons below to confirm the selection, or click on the dots to manually select or deselect them.
The nodes from the CEN are shown in the dropdown menu on the left panel. Either explore networks of individual proteins (default setting) or select 'Map CEN' to map all genes from the CEN onto the STRING PPI network and perform enrichment analysis. The enrichment plot will be displayed below the integrated PPI network if selected.
To identify the contexts that were associated with essentiality, we calculated the scaled log2 fold change (henceforth referred to as 'scaled essentiality score') from the CRISPRcleanR-corrected gene-level fold changes provided by Project Score (Behan et al., 2019). For this, gene-level fold changes were first quantile normalised per sample and then median scaled using the lists of essential and non-essential genes previously defined for BAGEL (Hart & Moffat, 2016), as applied in Gonçalves et al. (2020). The scaled values were then multiplied by -1 such that essential genes have a median log2 fold change of 1 and non-essential genes a median log2 fold change of 0.
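The median-scaling step described above can be illustrated with a minimal Python sketch. It assumes the fold changes of one sample have already been quantile normalised, and it folds the multiplication by -1 into a single expression. The function name `scaled_essentiality` and the toy gene names are hypothetical; this is an illustration of the idea, not the code used by CEN-tools.

```python
from statistics import median


def scaled_essentiality(fold_changes, essential, non_essential):
    """Median-scale the gene-level log2 fold changes of one sample.

    fold_changes: dict mapping gene -> log2 fold change
                  (assumed to be quantile normalised already).
    essential / non_essential: reference gene lists (e.g. the BAGEL lists).
    Returns a dict gene -> scaled score in which the essential reference
    genes have a median of 1 and the non-essential ones a median of 0.
    """
    med_ess = median(fold_changes[g] for g in essential if g in fold_changes)
    med_non = median(fold_changes[g] for g in non_essential if g in fold_changes)
    span = med_non - med_ess
    if span == 0:
        raise ValueError("reference gene sets have identical medians")
    # (x - med_non) / span puts the essential median at -1 and the
    # non-essential median at 0; multiplying by -1 flips the sign.
    return {g: (med_non - x) / span for g, x in fold_changes.items()}


# Toy sample with hypothetical genes E1-E3 (essential) and N1-N3 (non-essential).
sample = {"E1": -2.0, "E2": -1.0, "E3": -3.0,
          "N1": 0.5, "N2": 0.0, "N3": -0.5, "X": -1.0}
scores = scaled_essentiality(sample, ["E1", "E2", "E3"], ["N1", "N2", "N3"])
print(scores["X"])
```

With this scaling, a score near 1 indicates that a gene behaves like a typical essential gene in that sample, and a score near 0 that it behaves like a typical non-essential gene.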
Project: Three options are available: 'BROAD', 'SANGER' and 'INTEGRATED' (BROAD and SANGER combined). Analysis for each project is performed separately. BROAD refers to the DepMap project and SANGER refers to Project Score. Users can use the INTEGRATED dataset to investigate the essentialities of genes from a dataset that integrates BROAD and SANGER data (beta version, based on the preprint Pacini et al. 2020).
The number of cell lines used for computations is:
Query genes: For each selected project there is a number of genes to choose from:
Tissue of Origin: Tissue annotations are available for all cell lines. Default is 'All'. Using the default option, visualisations for the data will be performed on all available cell lines from the chosen project.
Selecting cell lines:
Expression: Expression values displayed in CEN-tools are normalised FPKM values, log(FPKM+1).
Mutation: There are three options available to subset cell lines based on their mutational status:
Drug response: GDSC1 and GDSC2 are two drug screening datasets obtained from CancerRxGene. The following options are available; further information on the different datasets can be found in the documentation of CancerRxGene:
This tab provides the gene-specific essentiality profiles for each gene included in the project, together with the corresponding UniProt link, a link to the DepMap portal, and a link to the Project Score portal.
In this tab, there are options to perform group-wise statistical comparison or correlation studies.
TISSUE/COESSENTIALITY: Use this tab to check the essentiality profile for a given gene in a given project. If 'All' tissues are chosen, a multi-group comparison is performed using the Kruskal-Wallis test, with a post-hoc two-sample Wilcoxon test. If a specific tissue type is selected, a two-group comparison is performed using the two-sample Wilcoxon test. If the advanced option is chosen and specific cell lines are selected, all chosen cell lines are annotated as 'Chosen', all other cell lines are annotated as 'Pancancer', and a two-group comparison is performed using the two-sample Wilcoxon test.
Also in this tab is the option to correlate essentiality between two genes. Pearson correlation is used for all correlation tests.
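As an illustration of the 'Chosen' versus 'Pancancer' comparison, the following is a minimal pure-Python sketch of a two-sample Wilcoxon (Mann-Whitney U) test. It uses the normal approximation without tie or continuity correction, so its p-values will differ slightly from those of a statistics package. The function names, cell line names and scores are hypothetical, and this is not the implementation used by CEN-tools.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U statistic of group_a, two-sided p-value).

    The p-value comes from the normal approximation, without tie or
    continuity correction.
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    return u_a, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical scaled essentiality scores of one gene across cell lines.
scores = {"line1": 0.9, "line2": 1.1, "line3": 0.8, "line4": 0.1,
          "line5": 0.2, "line6": -0.1, "line7": 0.3}
chosen = {"line1", "line2", "line3"}  # cell lines picked by the user

# Annotate every cell line as 'Chosen' or 'Pancancer', then compare.
groups = {"Chosen": [], "Pancancer": []}
for line, score in scores.items():
    groups["Chosen" if line in chosen else "Pancancer"].append(score)

u, p = mann_whitney_u(groups["Chosen"], groups["Pancancer"])
print(u, p)
```

In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred, since it handles ties and small samples more carefully.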
CO-EXPRESSION LEVEL: Use this tab to check the expression profile for a given gene in a given project. If 'All' tissues are chosen, a multi-group comparison is performed using the Kruskal-Wallis test, with a post-hoc Mann-Whitney U (two-sample Wilcoxon) test. If a specific tissue type is selected, a two-group comparison is performed using the Mann-Whitney U (two-sample Wilcoxon) test. If 'Advanced selection' is chosen and specific cell lines are selected, all chosen cell lines are annotated as 'Chosen', all other cell lines are annotated as 'Pancancer', and a two-group comparison is performed using the Mann-Whitney U (two-sample Wilcoxon) test.
Also in this tab is the option to correlate expression between two genes or expression of the query gene with the essentiality of any other gene. Pearson correlation is used for all correlation tests.
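The correlation of two profiles, for example the expression of one gene against the essentiality of another, can be sketched as follows. The sketch assumes each profile is a mapping from cell line to value and that only cell lines present in both profiles are used; the function names and the toy values are hypothetical and not taken from CEN-tools.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def correlate_profiles(profile_a, profile_b):
    """Correlate two cell-line profiles over the cell lines they share."""
    shared = sorted(set(profile_a) & set(profile_b))
    return pearson_r([profile_a[c] for c in shared],
                     [profile_b[c] for c in shared])


# Hypothetical values: expression of gene A and essentiality of gene B.
expression_a = {"line1": 1.0, "line2": 2.0, "line3": 3.0, "line4": 9.0}
essentiality_b = {"line1": 6.0, "line2": 4.0, "line3": 2.0}
print(correlate_profiles(expression_a, essentiality_b))
```

Restricting the calculation to shared cell lines matters here because, as noted elsewhere in this documentation, not every cell line has data in every dataset.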
Expression values displayed in CEN-tools are normalised FPKM values, log(FPKM+1).
MUTATION LEVEL: Use this tab to compare the essentiality of a given gene in the context of a mutation of the same or a different gene. Three types of mutational annotations are used.
When using the last two options, users should bear in mind that not all mutations will have the same effect on essentiality.
DRUG RESPONSE LEVEL: Use this tab to correlate the response of selected cell lines to a drug and the essentiality of a given gene. The drugs are grouped according to their targets as annotated from Genomics of Drug Sensitivity in Cancer (CancerRxGene). Drug sensitivity is represented as normalised Z-scores. A negative Z-score implies a higher sensitivity of the cell line towards that drug.
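To make the sign convention concrete, here is a minimal sketch of how Z-scores can be derived for one drug across cell lines. It assumes that the raw response is a measure where lower means more sensitive (for example a log-transformed IC50), which is an assumption and not stated above; the function name and all values are hypothetical. CEN-tools itself displays Z-scores that are already normalised.

```python
from statistics import mean, stdev


def z_scores(response_by_line):
    """Standardise one drug's response values across cell lines.

    response_by_line: dict mapping cell line -> raw response, where a
    lower value is assumed to mean higher sensitivity.
    """
    values = list(response_by_line.values())
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        raise ValueError("all cell lines have the same response")
    return {line: (v - mu) / sigma for line, v in response_by_line.items()}


# Hypothetical raw responses of three cell lines to one drug.
raw = {"line1": 1.0, "line2": 2.0, "line3": 3.0}
for line, z in z_scores(raw).items():
    label = "more sensitive than average" if z < 0 else "not more sensitive"
    print(line, round(z, 2), label)
```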
This tab contains all the pre-calculated mutation, tissue/cancer, and expression associations. The different sidebar options enable users to subset the network.
All calculations for the different projects are done separately. Choose between the calculations made from SANGER or BROAD projects.
Three types of context-specific essentiality networks (CENs) are available to generate:
Three options are available here: 'Tissue/Cancer', 'Expression', and 'Mutation'. These are the contexts that are tested and displayed in different colours in the resulting CEN.
These advanced options allow users to optimise the way the networks are visualised. By default the advanced options are disabled - use the toggle button to enable the options.
For each context that has been selected in the basic option 'Display effector edges corresponding to', there are options to select the type of edges to visualise. Users can choose to display edges that correspond to an increase or a decrease in essentiality or expression, or to display both.
The essentiality score was scaled such that essential genes have a median log2 fold change of 1 and non-essential genes a median log2 fold change of 0.
When testing for a significant increase in essentiality, sliders are available to adjust the median of the essentiality group to ensure that the tested group has high essentiality. This is only relevant for group-wise comparisons, and there are two separate options for adjusting it, one for the 'Tissue/Cancer' comparison and one for the 'Mutation' comparison.
This option is only activated when Expression and Mutation options are selected in the 'Choose effector edges to display' option.
Use the Within a tissue of origin/cancer type option to perform all statistical tests within the tissue type. For example: Essentiality of BRAF in BRAF mutant cells of Skin. Use the Pancancer option to perform statistical tests using all available cell lines with a given context. For example: Essentiality of BRAF in all BRAF mutant cells.
Select confidence thresholds. The levels are defined as follows:
Users have an additional option to select confidence based on the number of samples that were in each group when performing the test. Group A calculations contain more than 5 samples per group, whereas selecting 'All' will also include less restrictive tests with only 3 or more samples per group. Tests with a low number of samples per group are less reliable and should be interpreted with caution.
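The sample-size filter described above amounts to a threshold on the smaller of the two groups. A minimal sketch follows, assuming each pre-calculated test is a record carrying the sizes of its two groups. The field names `n_group1` and `n_group2`, the function name, and the example numbers are hypothetical.

```python
# 'A': more than 5 samples per group; 'All': 3 or more samples per group.
MIN_SAMPLES = {"A": 6, "All": 3}


def filter_by_group_size(tests, level="A"):
    """Keep the tests whose smaller group meets the chosen sample-size level."""
    if level not in MIN_SAMPLES:
        raise ValueError("level must be 'A' or 'All'")
    minimum = MIN_SAMPLES[level]
    return [t for t in tests
            if min(t["n_group1"], t["n_group2"]) >= minimum]


# Hypothetical test records; the group sizes are made up for illustration.
tests = [
    {"gene": "BRAF", "context": "BRAF mutant", "n_group1": 12, "n_group2": 40},
    {"gene": "WRN", "context": "MSI", "n_group1": 4, "n_group2": 30},
    {"gene": "ERBB2", "context": "ERBB2 amplified", "n_group1": 2, "n_group2": 25},
]
print([t["gene"] for t in filter_by_group_size(tests, "A")])
print([t["gene"] for t in filter_by_group_size(tests, "All")])
```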
The interactive cell line selector allows users to select cell lines of their choice. The option to select this tab is only active if the 'Advanced selection' option is chosen in the CONTEXT ANALYSIS tab. Users will be prompted to 'Launch the interactive cell selector'. Upon clicking the button, a new tab with the following options will open.
BRAF, KRAS
The integration with the PPI networks allows users to map the CENs onto a protein-protein interaction network from STRING. The option to select this tab only becomes active once the 'Map the current nodes to a PPI network' button in the Network Analysis tab is pressed. Upon pressing the button, a new tab with the following options will open.
This is the interaction score from STRING. Users can choose a number between 0 and 1000 (default used in CEN-tools is 400).
Users can choose to hide the disconnected nodes from STRING using this toggle button. Hiding disconnected nodes is the default option.
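The effect of the two options above (interaction score threshold and hiding disconnected nodes) can be sketched on a plain edge list. The sketch assumes that edges scoring at or above the threshold are kept, which is how a minimum score is usually applied but is an assumption here; the function name and the example scores are hypothetical and do not come from STRING.

```python
def filter_ppi(nodes, edges, min_score=400, hide_disconnected=True):
    """Filter a PPI network by interaction score.

    nodes: iterable of gene names.
    edges: iterable of (gene_a, gene_b, score) with score in 0..1000.
    Returns (kept_nodes, kept_edges).
    """
    if not 0 <= min_score <= 1000:
        raise ValueError("min_score must be between 0 and 1000")
    kept_edges = [(a, b, s) for a, b, s in edges if s >= min_score]
    connected = {n for a, b, _ in kept_edges for n in (a, b)}
    if hide_disconnected:
        kept_nodes = [n for n in nodes if n in connected]
    else:
        kept_nodes = list(nodes)
    return kept_nodes, kept_edges


# Hypothetical network; the scores are invented for illustration.
nodes = ["BRAF", "KRAS", "WRN"]
edges = [("BRAF", "KRAS", 900), ("KRAS", "WRN", 150)]
print(filter_ppi(nodes, edges))
print(filter_ppi(nodes, edges, hide_disconnected=False))
```

Raising the threshold trades coverage for confidence: fewer interactions remain, and with disconnected nodes hidden, fewer CEN genes appear in the mapped network.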
The examples below explain the usage of the cell line selector and the generation of context essentiality networks (CENs). These examples can be used to reproduce the figures in the paper Sharma et al. 2020.
Lineage-specific CENs were extracted by subsetting the BROAD project using 'Expression' as the effector. The entire network was then subsetted to contain only the interactions of this gene set. To download this network and the node attributes from the CEN-tools website the following steps were taken:
Skin-specific CENs were extracted from the ‘BROAD’ project. To download this network and the node attributes from the CEN-tools website the following steps were taken:
The association of RPL22 mutation and the essentiality of its paralog RPL22L1 was investigated in the 'BROAD' project. The following steps were taken:
The association of ERBB2 amplification with its essentiality in breast and esophagus cell lines was investigated in the 'BROAD' project. The following steps were taken:
The association of microsatellite instability (MSI) with the essentiality of the WRN helicase in colorectal cell lines was investigated in the 'SANGER' project. The following steps were taken:
CEN-tools uses data from multiple publicly available sources. Refer to the references below to download the source data; click the links to be redirected to the respective resources. Additional files curated by us are deposited in BioStudies (S-BSST479). Data for recreating the plots and networks generated by users on cen-tools.com can be downloaded directly in the respective tabs using the download buttons.
Data files generated in this study are deposited in BioStudies (S-BSST479). The following files are available:
CEN-tools uses data from multiple publicly available sources. Refer to the references below to download the source data; click the links to download the files.
Annotations for cell lines, such as tissue or MSI status, were curated from Cell Model Passports (model_list_20020610.csv) and the Cancer Cell Line Encyclopedia.
## Terms and conditions
CEN-tools is an integrative tool that identifies the underlying context that is associated with the essentiality of a gene from large-scale genome-scale CRISPR screens.
Copyright © 2020 EMBL-European Bioinformatics Institute
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Neither the institution name nor the name CEN-tools can be used to endorse or promote products derived from this software without prior written permission. For written permission, please contact petsalaki@ebi.ac.uk.
Products derived from this software may not be called CEN-tools nor may CEN-tools appear in their names without prior written permission of the developers.
You should have received a copy of the GNU General Public License along with this program. If not, see gnu.org/licenses/.
You can also view the licence here.
For policies regarding the underlying data, please also refer to: