
Welcome to CEN-tools

CEN-tools is an integrative tool that identifies the underlying context associated with the essentiality of a gene from large-scale, genome-wide CRISPR screens.

Citation: If you find CEN-tools useful please cite:
Sharma S*, Dincer C*, Weidemüller P, Wright GJ, Petsalaki E., CEN-tools: An integrative platform to identify the contexts of essential genes. MSB; doi: 10.15252/msb.20209698

Overview

Select a gene of interest and navigate through the essentiality contexts.

For examples on how to use CEN-tools please refer to the Documentation -> Examples tab.

Flow CEN-tools

Available datasets

  • SANGER: Essentiality screens on 323 cell lines from Project Score.
  • BROAD: Essentiality screens on 489 cell lines from the DepMap project.
  • SANGER and BROAD INTEGRATED: Integrated analysis of essentiality screens on 786 cell lines from both SANGER and BROAD, analysed by Pacini et al., 2020.

    Essentiality scores for all projects are processed from the fold change values downloaded from the Project Score pipeline.

  • Mutation data were obtained from CCLE and from mutations annotated by the Cancer Genome Interpreter.
  • Gene expression datasets for cell lines with matched essentiality screens were obtained from DepMap and Cell Model Passports.
  • Drug response data for a number of cell lines with matched essentiality screens were obtained from CancerRxGene.

Contact us here for any problems or feedback on CEN-tools.

The python package of CEN-tools is available here.


Developed by

Sumana Sharma, Cansu Dincer,
Paula Weidemüller, and Evangelia Petsalaki


This section features the core gene analysis.

  • Project: Four options are available - 'BROAD', 'SANGER', 'BROAD and SANGER INTEGRATED' and 'Show-all'. Integrated analysis is in beta-version and will be updated once the preprint is accepted for publication.

  • Start typing to select gene: A total of 16522 genes are available to choose from. Note that some genes may not have information in a given dataset, since the number of available genes differs slightly between datasets.

The displayed figure is an essentiality probability distribution. These distributions are generated in the following manner:

  1. Essentiality scores in the form of logFC values from all three datasets were downloaded from Project Score.
  2. A logistic regression (LR) function that separates essential from non-essential genes within each cell line was used. Pre-annotated BAGEL essential and non-essential genes were used for training.
  3. The logFC table was then converted into a probability table. The probability of a gene being essential was calculated using the LR model (sigmoid function).
  4. Probability values were binned into 20 bins and the binned probability frequency table was used to perform k-means clustering.
  5. The cluster that best represented core-essential genes displayed a probability distribution with a peak skewed towards 1. The genes within this cluster were defined as core-essential genes.
  6. Genes identified as core-essential genes from both studies were assigned as CEN-core essential genes.
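Steps 3 and 4 of the procedure above can be sketched as follows. This is a minimal illustration and not the CEN-tools implementation: the coefficients `WEIGHT` and `BIAS` are hypothetical stand-ins for a fitted one-feature logistic regression, the logFC values are made up, and the k-means step that follows is not shown.

```python
import math


def essentiality_probability(log_fc, weight, bias):
    """Sigmoid of a one-feature logistic regression: P(essential | logFC)."""
    return 1.0 / (1.0 + math.exp(-(weight * log_fc + bias)))


def binned_frequencies(probabilities, n_bins=20):
    """Fraction of values falling into each of n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for p in probabilities:
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        counts[index] += 1
    total = len(probabilities)
    return [count / total for count in counts]


# Hypothetical coefficients: a negative weight maps strongly negative logFC
# values (depleted genes) to probabilities close to 1.
WEIGHT, BIAS = -4.0, -2.0

# Made-up logFC values of one gene in four cell lines.
gene_log_fc = [-2.1, -1.8, -2.5, -0.1]

probs = [essentiality_probability(x, WEIGHT, BIAS) for x in gene_log_fc]
profile = binned_frequencies(probs)  # one row of the binned frequency table
```

In this toy case three of the four cell lines fall into the last bin, which is the kind of profile (a peak near 1) that the clustering step is described as grouping into the core-essential cluster.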

Further classification of core-essential genes (Core classification)

Identified core-essential genes were further classified into the following categories:

  • ADaM Core gene: Identified previously by ADaM pipeline.
  • Pluripotency gene: Identified as an essential gene for human pluripotent stem cells by Ihry et al. 2019.
  • Known essential processes: Genes belonging to known essential processes (ribosome, spliceosome, RNA processing…) that were not part of the BAGEL training gene set used to fit the logistic regression model that converts the essentiality matrix (logFC) into the essentiality probability matrix.
  • New core gene: Identified as a core gene from this study.
  • Note: the new set of core-genes will be updated once the Integrated preprint (Pacini et al, 2020) is published.

'Essentiality score' refers to the scaled gene level fold changes. Scaled FC values are multiplied by -1 such that essential genes have a median log2 fold change of 1 and non‐essential genes a median log2 fold change of 0.





Drug response data is available for only a subset of cell lines with essentiality data. Therefore you might see the note 'Not enough samples for analysis' for several tissues.





Download table of selected cell lines

Manual selection of cell lines. Clickable selection is ONLY available amongst cell lines that you have not already excluded in the sidebar. Excluded (not clickable) cell lines are shown as faint grey dots. Clickable but not selected cell lines are shown as dark grey dots.

Either drag a window over a number of cell lines and then press one of the buttons below to confirm the selection, or click on individual dots to select or deselect them.

Advanced options



Download Network
Download node attributes
Download Image html


Displaying CEN-mapped PPI network from STRING may take up to 1-2 minutes once you hit the 'Retrieve' button. You might see a previous network before it updates.



The nodes from the CEN are shown in the dropdown menu on the left panel. Either explore networks of individual proteins (default setting) or select 'Map CEN' to map all genes from the CEN onto the STRING PPI network and perform enrichment analysis. The enrichment plot will be displayed below the integrated PPI network if selected.

Information on terminology and datasets:

Calculation of the essentiality score

To identify the contexts that were associated with essentiality, we calculated the scaled log2 fold change (henceforth referred to as 'scaled essentiality score') from the CRISPRcleanR-corrected gene-level fold changes provided by Project Score (Behan et al., 2019). For this, gene-level fold changes were first quantile normalised per sample and then median scaled according to the previously defined BAGEL lists of essential and non-essential genes (Hart & Moffat, 2016), as applied in Gonçalves et al. (2020). The scaled values were then multiplied by -1 such that essential genes have a median log2 fold change of 1 and non‐essential genes a median log2 fold change of 0.
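The median scaling and sign flip described above can be illustrated with a short sketch. It assumes the fold changes of one sample are already quantile normalised (that step is not shown), and the gene names and values are hypothetical placeholders rather than BAGEL genes or real data.

```python
from statistics import median


def scale_essentiality(fold_changes, essential, non_essential):
    """Return gene -> scaled essentiality score for one sample.

    fold_changes maps gene -> log2 fold change. After scaling, the median of
    the essential reference genes is 1 and that of the non-essential ones is 0.
    """
    med_ess = median(fold_changes[g] for g in essential)
    med_non = median(fold_changes[g] for g in non_essential)
    spread = med_non - med_ess  # positive when essential genes are depleted
    # (fc - med_non) / spread puts the essential median at -1;
    # multiplying by -1 gives (med_non - fc) / spread.
    return {g: (med_non - fc) / spread for g, fc in fold_changes.items()}


# Hypothetical sample: E* are essential references, N* non-essential references.
sample = {"E1": -2.0, "E2": -3.0, "E3": -4.0,
          "N1": 0.5, "N2": 0.0, "N3": -0.5, "X": -1.5}
scores = scale_essentiality(sample, ["E1", "E2", "E3"], ["N1", "N2", "N3"])
```

Here the query gene `X` lies halfway between the two reference medians and therefore receives a scaled essentiality score of 0.5.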

User inputs:

  • Project: Three options are available - 'BROAD', 'SANGER' and 'BROAD and SANGER INTEGRATED'. Analysis for each project is performed separately. BROAD refers to the DepMap project and SANGER refers to Project Score. Users can use the INTEGRATED dataset to investigate the essentialities of genes from a dataset that integrates BROAD and SANGER data (beta version, based on the preprint by Pacini et al. 2020).

    The number of cell lines used for computations is:

    • in the ESSENTIALITY PROFILE and NETWORK ANALYSIS tab:
      • BROAD: 489 cell lines
      • SANGER: 324 cell lines
      • SANGER and BROAD INTEGRATED: 786 cell lines
    • in the CONTEXT ANALYSIS only the subset of cell lines for which combined essentiality, expression and mutation information was available is presented:
      • BROAD: 484 cell lines
      • SANGER: 307 cell lines
      • SANGER and BROAD INTEGRATED: 708 cell lines
  • Query genes: For each selected project there is a number of genes to choose from:

    • in the ESSENTIALITY PROFILE and NETWORK ANALYSIS tab:
      • BROAD: 15,546 genes (genes targeted in both BROAD and SANGER)
      • SANGER: 15,546 genes (genes targeted in both BROAD and SANGER)
      • SANGER and BROAD INTEGRATED: 15,599 genes
    • in the CONTEXT ANALYSIS tab:
      • BROAD: 17,343 genes (genes targeted in both BROAD and SANGER)
      • SANGER: 17,995 genes (genes targeted in both BROAD and SANGER)
      • SANGER and BROAD INTEGRATED: 16,827 genes
  • Tissue of Origin: Tissue annotations are available for all cell lines. Default is 'All'. Using the default option, visualisations for the data will be performed on all available cell lines from the chosen project.

  • Selecting cell lines:

    • Default option is 'General selection' - Users will still be able to select cell lines but only based on tissue of origin.
    • Advanced selection offers the possibility to launch the 'INTERACTIVE CELL LINE SELECTOR' in a new tab. Select the cell lines in the new tab and press 'CONFIRM SELECTION'. This will automatically update your analysis to the chosen cell lines. All cell lines that are not chosen will be labelled as 'Pancancer'. Users can also opt to subset the Pancancer cell lines to cell lines from specific tissue of origin.
  • Expression: Expression values displayed in CEN-tools are normalised FPKM values.

  • Mutation: There are three options available to subset cell lines based on their mutational status:

    • Hotspot mutation: This includes known oncogenic driver genes that contain commonly recurring hotspot mutations. The choice of genes is provided under Select hotspot mutation gene option. The mutation annotation was obtained from the CCLE database.
    • Oncogenic mutation: This includes known oncogenic driver genes that contain any mutations. The choice of genes is provided under Select oncogenic mutation gene option. While using this option, note that different types of mutations in the same oncogenic gene can have different effects on vulnerabilities. The mutation annotation was obtained from the Cancer Genome Interpreter.
    • Non-silent mutations: This includes genes that bear any non-silent mutation. The choice of genes is provided under Select gene with non-silent mutation option. The mutation annotation was obtained from the CCLE database.
  • Drug response: GDSC1 and GDSC2 are two drug screening datasets obtained from CancerRxGene. The following options are available; further information on the datasets can be found in the CancerRxGene documentation:

    • GDSC1 only: “GDSC1 is an expansion of the original dataset available from this website and published by Iorio et al. (Cell 2016).” (taken from CancerRxGene documentation)
    • GDSC2 only: This is an improved and expanded screen compared to GDSC1. “GDSC2 has been screened using improved equipment and procedures […]” (taken from CancerRxGene documentation).
    • GDSC1 & GDSC2: If duplicate data exist in both screening datasets, only the values from the GDSC2 dataset are shown, because “many experiments from GDSC1 have been repeated in GDSC2 and we would recommend, where duplicate IC50s exist, using the result from GDSC2” (taken from CancerRxGene documentation).
    • Default option is 'GDSC1 & GDSC2' - The drug response information is combined from both the GDSC2 and GDSC1 datasets. Data for all cell lines are first taken from GDSC2; if no information exists for a given cell line and drug in GDSC2, it is added from GDSC1. Z-scores are used in all cases. To restrict the analysis to only one of the two datasets, select 'GDSC1' or 'GDSC2'.
    • Note that drug information is not available for all cell lines.
      • BROAD: GDSC1 and GDSC2 datasets contain information for 260/484 and 229/484 cell lines, respectively.
      • SANGER: GDSC1 and GDSC2 datasets contain information for 306/307 and 269/307 cell lines, respectively.
      • SANGER and BROAD INTEGRATED: GDSC1 and GDSC2 datasets contain information for 379/708 and 379/708 cell lines, respectively.
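The 'GDSC1 & GDSC2' combination rule described above (GDSC2 first, GDSC1 only as a fallback) can be sketched as a dictionary merge. The cell line and drug names and the Z-scores below are made up for illustration; the data structure is an assumption, not the format used by CEN-tools.

```python
def combine_gdsc(gdsc1, gdsc2):
    """Merge two mappings of (cell_line, drug) -> Z-score.

    Values from GDSC2 take precedence; GDSC1 only fills pairs missing in GDSC2.
    """
    combined = dict(gdsc1)   # start from GDSC1
    combined.update(gdsc2)   # overwrite duplicates with GDSC2
    return combined


# Hypothetical Z-scores.
gdsc1 = {("LINE_A", "drug_x"): 0.8, ("LINE_B", "drug_x"): -1.2}
gdsc2 = {("LINE_A", "drug_x"): 0.5, ("LINE_A", "drug_y"): -0.3}

combined = combine_gdsc(gdsc1, gdsc2)
```

For the duplicated pair (`LINE_A`, `drug_x`) the GDSC2 value is kept, while (`LINE_B`, `drug_x`) is only present in GDSC1 and is carried over unchanged.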

TAB information:

  1. ESSENTIALITY PROFILE
  2. CONTEXT ANALYSIS
  3. NETWORK ANALYSIS
  4. INTERACTIVE CELL LINE SELECTOR
  5. INTEGRATION WITH PPI NETWORK

ESSENTIALITY PROFILE:

This tab provides the gene-specific essentiality profile for each gene included in the project, together with links to the corresponding UniProt entry, the DepMap portal, and the Project Score portal.

CONTEXT ANALYSIS:

In this tab, there are options to perform group-wise statistical comparison or correlation studies.

  1. TISSUE/COESSENTIALITY: Use this tab to check the essentiality profile for a given gene in a given project. If 'All' tissues are chosen, a multi-group comparison is performed using the Kruskal-Wallis test, with a post-hoc two-sample Wilcoxon test. If a specific tissue type is selected, a two-group comparison is performed using the two-sample Wilcoxon test. If 'Advanced selection' is chosen and specific cell lines are selected, all chosen cell lines are annotated as 'Chosen', all other cell lines are annotated as 'Pancancer', and a two-group comparison is performed using the two-sample Wilcoxon test.

    Also in this tab is the option to correlate essentiality between two genes. Pearson correlation is used for all correlation tests.

  2. CO-EXPRESSION LEVEL: Use this tab to check the expression profile for a given gene in a given project. If 'All' tissues are chosen, a multi-group comparison is performed using the Kruskal-Wallis test, with a post-hoc Mann-Whitney U (two-sample Wilcoxon) test. If a specific tissue type is selected, a two-group comparison is performed using the Mann-Whitney U (two-sample Wilcoxon) test. If 'Advanced selection' is chosen and specific cell lines are selected, all chosen cell lines are annotated as 'Chosen', all other cell lines are annotated as 'Pancancer', and a two-group comparison is performed using the Mann-Whitney U (two-sample Wilcoxon) test.

    Also in this tab is the option to correlate expression between two genes or expression of the query gene with the essentiality of any other gene. Pearson correlation is used for all correlation tests.

    Expression values displayed in CEN-tools are normalised FPKM values.

  3. MUTATION LEVEL: Use this tab to compare the essentiality of a given gene in the context of a mutation of the same or a different gene. Three types of mutational annotations are used.

    • Hotspot mutation uses the annotation from the CCLE database in which commonly occurring hotspot mutation in cancer driver genes are annotated.
    • Oncogenic mutation uses annotation from the Cancer Genome Interpreter in which all mutations on a given oncogenic driver gene are considered irrespective of if the mutation is a hotspot mutation or not.
    • Non-silent mutations uses annotation from the CCLE database in which all mutations on a given gene irrespective of if the mutation is a hotspot mutation or not or the gene is a known driver are considered.

    While using the last two options, users must consider that not all mutations will have the same effect on essentiality.

  4. DRUG RESPONSE LEVEL: Use this tab to correlate the response of selected cell lines to a drug and the essentiality of a given gene. The drugs are grouped according to their targets as annotated from Genomics of Drug Sensitivity in Cancer (CancerRxGene). Drug sensitivity is represented as normalised Z-scores. A negative Z-score implies a higher sensitivity of the cell line towards that drug.
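The correlation tests mentioned in the tabs above use Pearson correlation. As a rough illustration of what is being computed in the drug response case, the sketch below correlates made-up essentiality scores with made-up drug Z-scores across four cell lines. It is a plain-Python stand-in, not the CEN-tools code, and it does not compute a p-value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for four cell lines.
essentiality = [0.1, 0.4, 0.7, 1.0]   # scaled essentiality score of the gene
drug_z = [1.5, 0.5, -0.5, -1.5]       # drug response Z-score (negative = more sensitive)

r = pearson_r(essentiality, drug_z)
```

Because a negative Z-score implies higher sensitivity, a negative `r` in this setting means that cell lines depending more strongly on the gene tend to be more sensitive to the drug.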

NETWORK ANALYSIS:

This tab contains all the pre-calculated associations for mutation, tissue/cancer, and expression related associations. The different sidebar options enable users to subset the network.

Basic Options

Project:

All calculations for the different projects are done separately. Choose between the calculations made from SANGER or BROAD projects.

Generate CEN centered around:

Three types of context-specific essentiality networks (CENs) are available to generate:

  1. Tissue - Use this option to generate a tissue-centric network. The tissue of choice will be the central node. All the edges of this network correspond to the chosen tissue. If 'All' is chosen, the network will be very large, so users have the option to view only the expression/tissue network. To activate this function, users must choose the 'Expression' option from Show networks.
  2. Cancer type - Use this option to view a cancer type-centric network. This option is very similar to Tissue: CEN as Cancer types are sub-divisions of tissues. This is especially relevant for cell lines of Nervous system and Haematopoietic/Lymphoid lineage.
  3. Gene - Use this option to view a gene-centric network. The edges in this network are not specific to a particular tissue or cancer type. However, an additional option on the main panel, 'Load current network edges groups:', allows users to choose edges corresponding only to a particular tissue/cancer type. Users can also use the toggle button to switch between visualising tissue or cancer type connections.

Display effector edges corresponding to:

Three options are available here 'Tissue/Cancer', 'Expression', and 'Mutation'. These are the contexts that are being tested and displayed in different colors in the results CEN.

Advanced Options

These advanced options allow users to optimise the way the networks are visualised. By default the advanced options are disabled - use the toggle button to enable the options.

show edges corresponding to options:

For each context that has been selected in the basic option of Display effector edges corresponding to, there are options to select the type of edges to visualise. Users can choose to either display edges that correspond to increase or decrease in essentiality or expression or display both.

Adjusting median of the essential group options:

The essentiality score was scaled such that essential genes have a median log2 fold change of 1 and non‐essential genes a median log2 fold change of 0.

When testing for a significant increase in essentiality, sliders are available to adjust the median of the essential group to ensure that the tested group has high essentiality. This is only relevant for group-wise comparisons, and there are two separate options for adjusting it: one for the 'Tissue/Cancer' comparison and one for the 'Mutation' comparison.

Perform context comparisons:

This option is only activated when Expression and Mutation options are selected in the 'Choose effector edges to display' option.

Use the Within a tissue of origin/cancer type option to perform all statistical tests within the tissue type. For example: Essentiality of BRAF in BRAF mutant cells of Skin. Use the Pancancer option to perform statistical tests using all available cell lines with a given context. For example: Essentiality of BRAF in all BRAF mutant cells.

Confidence level:

Select confidence thresholds. The levels are defined as follows:

  • For categorical comparison (tissue, cancer type, mutation):
    • Level 1: 0.01 < p-value < 0.05
    • Level 2: 0.001 < p-value < 0.01
    • Level 3: 0.0001 < p-value < 0.001
    • Level 4: p-value < 0.0001
  • For Pearson correlation comparison (expression), with p < 0.05 and:
    • Level 1: r < 0.5
    • Level 2: 0.5 < r < 0.6
    • Level 3: 0.6 < r < 0.65
    • Level 4: r > 0.7
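The confidence levels for categorical comparisons listed above can be written as a small lookup function. This is only a sketch: the listed ranges use strict inequalities, so the handling of p-values that fall exactly on a boundary (assigned here to the lower level) is an assumption, and the function name is hypothetical.

```python
def categorical_confidence_level(p_value):
    """Map a p-value from a categorical comparison to confidence level 1-4.

    Returns None when the p-value is not below 0.05.
    Assumption: a value exactly on a boundary gets the lower level.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.0001:
        return 4
    if p_value < 0.001:
        return 3
    if p_value < 0.01:
        return 2
    if p_value < 0.05:
        return 1
    return None


levels = [categorical_confidence_level(p) for p in (0.03, 0.005, 0.0005, 0.00001, 0.2)]
```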

Group selection:

Users have an additional option to select confidence based on the number of samples that were in each group when performing the test. Group A calculations contain more than 5 samples per group, whereas selecting 'All' will also select less-restrictive tests with only 3 or more samples per group. Tests with low number of samples per group should be interpreted with caution as such tests are less reliable.
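The group selection rule above amounts to a filter on the number of samples in each tested group. A minimal sketch, with a hypothetical function name and the thresholds taken from the description (more than 5 samples for 'Group A', 3 or more for 'All'):

```python
def passes_group_selection(group_sizes, selection="Group A"):
    """Check whether a test with the given per-group sample counts is shown.

    'Group A' keeps tests with more than 5 samples in every group;
    'All' also keeps tests with 3 or more samples in every group.
    """
    if selection == "Group A":
        return all(n > 5 for n in group_sizes)
    if selection == "All":
        return all(n >= 3 for n in group_sizes)
    raise ValueError("selection must be 'Group A' or 'All'")


shown_strict = passes_group_selection([4, 10], "Group A")  # one group is too small
shown_all = passes_group_selection([4, 10], "All")         # kept, but less reliable
```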

INTERACTIVE CELL LINE SELECTOR:

The interactive cell line selector allows users to select cell lines according to their choice. The option to select this tab will only be active if Advanced selection option is chosen in the CONTEXT ANALYSIS tab. Users will be prompted to 'Launch the interactive cell selector'. Upon clicking the button a new tab with the following options will open.

  • Show cell lines with mutations from: Cell lines can be selected based on their mutational status. Four options are available:
    • Hotspot mutation
    • Oncogenic mutation
    • Input genes: This option can be used to select cell lines based on any gene of choice which contains any non-silent mutations. Multiple gene inputs are allowed, separated by a ', '. For example BRAF, KRAS.
    • No selection based on mutation: Use this option to view all cell lines regardless of the mutational status. This is the default option. If genes with mutations are selected, users can further restrict the displayed cell lines, by whether they should contain a mutation in any ('OR') or all ('AND') selected genes.
  • Selection based on Copy Number Variation (CNV)?: Users have the choice to select cell lines depending on the copy number variations of selected genes. When 'Yes' is selected, a slider will appear and users can adjust the slider to restrict the choice of genes that have a CNV within the selected range in the available cell lines.
  • Cell culture growth properties: Users can use this option to select the cell culture conditions of the cells. This information was obtained from Cell model passports.
  • Genome stability: This option allows users to select the microsatellite instability (MSI)/ microsatellite stable (MSS) status of the cell lines. This information was obtained from Cell model passports.
  • Tissue and Cancer type: Users may use this option to restrict the selected cell lines so far to a particular tissue type or cancer type.
  • Choose output: This option allows users to view the data as a table or as a t-SNE plot. The t-SNE plot is pre-built based on the gene expression datasets. Within the t-SNE plot, individual points denote cell lines, and users can select or deselect any point of interest by clicking on it or by dragging a window around a number of cell lines.
  • Choose a name for your selection: Once selections are made, users MUST provide a name for their selected group (e.g. 'BRAF_mut_Skin') to reflect their selection. This name will appear in the plots in the CONTEXT ANALYSIS tab. Once the name is given, press 'Click here to confirm your selection', which will automatically direct users to the CONTEXT ANALYSIS tab.
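The 'Input genes' option with its 'OR'/'AND' switch, as described in the list above, can be illustrated with set operations. The cell line names and mutation sets below are hypothetical, and the functions are a sketch of the selection logic rather than the code behind the selector.

```python
def parse_gene_input(text):
    """Split a comma-separated gene list such as 'BRAF, KRAS' into gene names."""
    return [gene.strip() for gene in text.split(",") if gene.strip()]


def select_cell_lines(mutations, genes, mode="OR"):
    """Return the sorted cell lines mutated in any ('OR') or all ('AND') genes.

    mutations maps cell line -> set of genes carrying a non-silent mutation.
    """
    wanted = set(genes)
    if not wanted:
        raise ValueError("at least one gene is required")
    if mode == "OR":
        def keep(mutated):
            return bool(wanted & mutated)
    elif mode == "AND":
        def keep(mutated):
            return wanted <= mutated
    else:
        raise ValueError("mode must be 'AND' or 'OR'")
    return sorted(line for line, mutated in mutations.items() if keep(mutated))


# Hypothetical mutation annotations.
mutations = {
    "LINE_A": {"BRAF"},
    "LINE_B": {"BRAF", "KRAS"},
    "LINE_C": {"TP53"},
}
genes = parse_gene_input("BRAF, KRAS")
any_mutated = select_cell_lines(mutations, genes, "OR")
all_mutated = select_cell_lines(mutations, genes, "AND")
```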

INTEGRATION WITH PPI NETWORK:

The integration with the PPI network allows users to map the CENs onto a protein-protein interaction network from STRING. The option to select this tab will only be active once the 'Map the current nodes to a PPI network' option in the NETWORK ANALYSIS tab is pressed. Upon pressing the button, a new tab with the following options will open.

Display Network from STRING: There are two choices for this option:

  • Protein-Protein interactions for individual proteins imported from the CEN: Upon selecting this option, the nodes from the CEN shown in the NETWORK ANALYSIS page will be listed in the dropdown menu under 'Explore PPI network of'. Select the protein of interest from this list, after which a purple button labelled 'Retrieve PPI partners of the selected protein from STRING' will appear. Press this button to visualise the PPI network of the selected protein.
  • Map CEN onto STRING network: Using this option, users can map all the genes from the CEN onto the STRING PPI network and perform enrichment analysis. Two further options are enabled upon selecting this option.
    • Show CEN-mapped STRING network: After selecting this option, a purple button labelled 'Retrieve PPI network of CEN from STRING' will appear. Press this button to display the mapped network. This network will only contain nodes that are represented in the CEN.
    • Perform enrichment analysis: This option allows users to perform different types of enrichment analysis. Two further options are available.
      • Choose the enrichment category to display: Three enrichment options are available: Function (Protein function), KEGG (KEGG pathways), and Process (GO-process).
      • Choose enrichments FDR cut-off: Three options (FDR 0.1, 0.05, 0.01) are available.

Adjust STRING interaction score:

This is the interaction score from STRING. Users can choose a number between 0 and 1000 (default used in CEN-tools is 400).

Hide disconnected nodes:

Users can choose to hide the disconnected nodes from STRING using this toggle button. Hiding disconnected nodes is the default option.
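The combined effect of the interaction score threshold and the 'Hide disconnected nodes' toggle can be sketched as a filter over an edge list. The protein names and scores are placeholders, and treating the threshold as inclusive (score >= threshold) is an assumption.

```python
def filter_ppi(nodes, edges, min_score=400, hide_disconnected=True):
    """Filter a PPI network by interaction score.

    edges is a list of (protein_a, protein_b, score) with scores from 0 to 1000.
    Returns (kept_nodes, kept_edges).
    """
    if not 0 <= min_score <= 1000:
        raise ValueError("min_score must lie between 0 and 1000")
    kept_edges = [(a, b, s) for a, b, s in edges if s >= min_score]
    if hide_disconnected:
        connected = {p for a, b, _ in kept_edges for p in (a, b)}
        kept_nodes = [n for n in nodes if n in connected]
    else:
        kept_nodes = list(nodes)
    return kept_nodes, kept_edges


# Placeholder network with made-up scores.
nodes = ["P1", "P2", "P3"]
edges = [("P1", "P2", 900), ("P2", "P3", 350)]

default_nodes, default_edges = filter_ppi(nodes, edges)
all_nodes, _ = filter_ppi(nodes, edges, hide_disconnected=False)
```

With the default threshold of 400, the P2-P3 edge is dropped and P3 becomes disconnected, so it is hidden unless the toggle is switched off.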

Worked examples

The examples below explain the usage of the cell line selector and the generation of context-specific essentiality networks (CENs). These examples can be used to reproduce the figures in Sharma et al. 2020.

  1. Generation of CENs
  2. Using the cell line selector

Generation of CENs

Example 1: Extraction of lineage CENs (Figure 2B)

Lineage-specific CENs were extracted by subsetting the BROAD project using 'Expression' as the effector. The entire network was then subsetted to contain only the interactions of this gene set. To download this network and the node attributes from the CEN-tools website the following steps were taken:

  1. Navigate to the 'NETWORK ANALYSIS' tab.
  2. Choose the following parameters:
    • Basic parameters:
      1. Project: BROAD
      2. Generate CEN centered around: Cancer type
      3. Cancer type: All
      4. Toggle Expression-specific
      5. Display effector edges corresponding to: Select all (Mutation, Expression, Tissue/Cancer)
    • Advanced edge filter options: First toggle to enable advanced options. Then make the following selections:
      1. For Expression context correlations show edges corresponding to: Positive correlations
      2. For Tissue/Cancer context comparisons show edges corresponding to: Increase in essentiality/expression
      3. For Mutation context comparisons show edges corresponding to: Both Increase and decrease in essentiality/expression
      4. Select mutation annotations: Hotspot mutation
      5. Only show tissue/cancer edges in which the median essentiality score of the essential context is higher than: 0.3
      6. Only show mutation edges in which the median essentiality score of the essential context is higher than: 0.4
      7. Perform context comparisons: Within a tissue of origin/cancer type
      8. Confidence level of association (Tissue/Cancer type): 1:Low
      9. Confidence level of association (Mutation): 2:Medium
      10. Select: Group A
  3. If this is the first network being made, press 'Initialise network' on the top left corner.
  4. Download the network and the node attributes by clicking the download buttons in the left sidebar. Open the network file in Cytoscape.

Example 2: Extraction of skin-specific CENs (Figure 2C)

Skin-specific CENs were extracted from the ‘BROAD’ project. To download this network and the node attributes from the CEN-tools website the following steps were taken:

  1. Navigate to the 'NETWORK ANALYSIS' tab.
  2. Choose the following parameters:
    • Basic parameters:
      1. Project: BROAD
      2. Generate CEN centered around: Tissue
      3. Tissue of Origin: Skin
      4. Display effector edges corresponding to: Select all (Mutation, Expression, Tissue/Cancer)
    • Advanced edge filter options: First toggle to enable advanced options. Then make the following selections:
      1. For Expression context correlations show edges corresponding to: Positive correlations
      2. For Tissue/Cancer context comparisons show edges corresponding to: Increase in essentiality/expression
      3. For Mutation context comparisons show edges corresponding to: Increase in essentiality/expression
      4. Select mutation annotations: Hotspot mutation
      5. Only show tissue/cancer edges in which the median essentiality score of the essential context is higher than: 0.2 (can be adjusted depending on the threshold required).
      6. Only show mutation edges in which the median essentiality score of the essential context is higher than: 0.4 (can be adjusted depending on the threshold required).
      7. Perform context comparisons: Within a tissue of origin/cancer type
      8. Confidence level of association (Tissue/Cancer type): 1:Low
      9. Confidence level of association (Mutation): 2:Medium
      10. Select: Group A
  3. Press Initialise network. The network will have too many nodes to be displayed but will still be generated (you will see a warning).
  4. Download the network and the node attributes by clicking the download buttons in the left sidebar. Open the network file in Cytoscape.
  5. For visualization, only display nodes with 'TF' attribute and any other nodes directly associated with these 'TF' nodes.
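The final visualisation step above (keeping 'TF' nodes and their direct neighbours) is carried out in Cytoscape in the worked example. For readers who prefer to script it, the same subsetting can be sketched in Python. The in-memory structures below are hypothetical; the layout and column names of the downloaded network and node attribute files are not specified here and would need to be checked before adapting this.

```python
def tf_neighbourhood(edges, node_attributes):
    """Keep 'TF' nodes plus every node directly connected to a 'TF' node.

    edges is a list of (source, target) pairs; node_attributes maps
    node -> attribute label. Returns (kept_nodes, kept_edges).
    """
    tf_nodes = {node for node, label in node_attributes.items() if label == "TF"}
    kept_edges = [(a, b) for a, b in edges if a in tf_nodes or b in tf_nodes]
    kept_nodes = tf_nodes | {node for edge in kept_edges for node in edge}
    return kept_nodes, kept_edges


# Hypothetical network: one TF, three other genes.
node_attributes = {"TF1": "TF", "G1": "gene", "G2": "gene", "G3": "gene"}
edges = [("TF1", "G1"), ("G1", "G2"), ("G3", "TF1")]

kept_nodes, kept_edges = tf_neighbourhood(edges, node_attributes)
```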

Using the cell line selector

Example 1: Investigating paralog dependency (Appendix Figure S8A)

The association of RPL22 mutation and the essentiality of its paralog RPL22L1 was investigated in the 'BROAD' project. The following steps were taken:

  1. Navigate to the 'Context analysis' tab and 'Tissue/Coessentiality' subtab.
  2. On the left menubar select:
    1. Project: BROAD
    2. 'Advanced selection' and then 'Interactive Cell Line Selector'
  3. Hit the 'Launch the interactive Cell Line Selector' button. You will be redirected to a new tab. Wait a little while until the interface is fully loaded.
  4. On the left menubar choose:
    1. Show cell lines with mutations from: Input genes
      1. Type RPL22 in the appearing text box
      2. Hit the 'Submit' button
      3. Should cell lines contain a mutation in all ('AND') or at least one ('OR') of the above chosen genes?: OR
    2. Selection based on Copy Number Variation (CNV)?: No selection based on CNV
    3. Cell culture growth properties: All
    4. Genome stability: All
    5. Tissue of origin (multiple selection allowed): All
    6. Cancer type (multiple selection allowed): All
    7. Colour cell lines by: Tissue
    8. Choose a name for your selection: e.g. 'RPL22 non-silent mutation'
  5. Hit the 'Click here to confirm your selection' button. You will be redirected to the 'Context analysis' tab.
  6. On the left menubar select:
    1. Start typing to select gene: RPL22L1
    2. Cells not chosen will be labelled as 'Pancancer', do you wish to subset Pancancer list by tissue type?: No

Example 2: Investigating essentiality based on CNV status (Appendix Figure S8B)

The association of ERBB2 amplification with its essentiality in breast and esophagus cell lines was investigated in the 'BROAD' project. The following steps were taken:

  1. Navigate to the 'Context analysis' tab and 'Tissue/Coessentiality' subtab.
  2. On the left menubar select:
    1. Project: BROAD
    2. 'Advanced selection' and then 'Interactive Cell Line Selector'
  3. Hit the 'Launch the interactive Cell Line Selector' button. You will be redirected to a new tab. Wait a little while until the interface is fully loaded.
  4. On the left menubar choose:
    1. Show cell lines with mutations from: No selection based on mutation
    2. Selection based on Copy Number Variation (CNV)?: Yes
      1. Wait until you see a black box below the slider. This might take a while.
      2. Choose a range of relative copy numbers to restrict the choice of genes with CNV: 1.1-7.22
      3. Pick 1 or more genes, whose relative CN lies within the selected range: ERBB2
      4. Should cell lines contain a CNV in all ('AND') or at least one ('OR') of the above chosen genes?: OR
    3. Cell culture growth properties: All
    4. Genome stability: All
    5. Tissue of origin (multiple selection allowed): Breast, Esophagus
    6. Cancer type (multiple selection allowed): All
    7. Colour cell lines by: Tissue
    8. Choose a name for your selection: e.g. 'ERBB2 amplification'
  5. Hit the 'Click here to confirm your selection' button. You will be redirected to the 'Context analysis' tab.
  6. On the left menubar select:
    1. Start typing to select gene: ERBB2
    2. Cells not chosen will be labelled as 'Pancancer', do you wish to subset Pancancer list by tissue type?: Yes
      • Subset Pancancer to a specific tissue of origin: Breast, Esophagus
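The CNV-based selection used in this example (relative copy number of ERBB2 within 1.1-7.22) can be sketched as a range filter. The cell line names and copy number values are made up, the data structure is hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
def select_by_cnv(relative_cn, genes, low, high, mode="OR"):
    """Return the sorted cell lines whose relative copy number lies in [low, high].

    relative_cn maps cell line -> {gene: relative copy number}. With 'OR' one
    gene in range is enough; with 'AND' every listed gene must be in range.
    """
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    combine = all if mode == "AND" else any
    selected = []
    for line, copy_numbers in relative_cn.items():
        in_range = [
            gene in copy_numbers and low <= copy_numbers[gene] <= high
            for gene in genes
        ]
        if in_range and combine(in_range):
            selected.append(line)
    return sorted(selected)


# Made-up relative copy numbers.
relative_cn = {
    "LINE_A": {"ERBB2": 5.3},
    "LINE_B": {"ERBB2": 1.0},
    "LINE_C": {"ERBB2": 1.1},
}
amplified = select_by_cnv(relative_cn, ["ERBB2"], 1.1, 7.22)
```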

Example 3: Investigating essentiality based on microsatellite instability status (MSI) (Appendix Figure S8C)

The association of microsatellite instability (MSI) with the essentiality of the WRN helicase in colorectal cell lines was investigated in the 'SANGER' project. The following steps were taken:

  1. Navigate to the 'Context analysis' tab and 'Tissue/Coessentiality' subtab.
  2. On the left menubar select:
    1. Project: SANGER
    2. 'Advanced selection' and then 'Interactive Cell Line Selector'
  3. Hit the 'Launch the interactive Cell Line Selector' button. You will be redirected to a new tab. Wait a little while until the interface is fully loaded.
  4. On the left menubar choose:
    1. Show cell lines with mutations from: No selection based on mutation
    2. Selection based on Copy Number Variation (CNV)?: No selection based on CNV
    3. Cell culture growth properties: All
    4. Genome stability: MSI
    5. Tissue of origin (multiple selection allowed): Colon/Rectum
    6. Cancer type (multiple selection allowed): All
    7. Colour cell lines by: Tissue
    8. Choose a name for your selection: e.g. 'MSI in colorectal'
  5. Hit the 'Click here to confirm your selection' button. You will be redirected to the 'Context analysis' tab.
  6. On the left menubar select:
    1. Start typing to select gene: WRN
    2. Cells not chosen will be labelled as 'Pancancer', do you wish to subset Pancancer list by tissue type?: Yes
      • Subset Pancancer to a specific tissue of origin: Colon/Rectum

CEN-tools uses data from multiple publicly available sources. Refer to the references below to download the source data; click the links to be redirected to the respective resources. Additional files curated by us are deposited in BioStudies (S-BSST479). Data for recreating the plots and networks generated by users on cen-tools.com can be downloaded directly in the respective tabs using the download buttons.

Curated data

Data files generated in this study are deposited in BioStudies (S-BSST479). Following files are available:

  • Node file used for generating the CENs in CEN-tools
  • Edge file used for generating the CENs in CEN-tools
  • Cluster annotations after Core-gene analysis of CEN-tools
  • Cell line annotations for all cell lines used in CEN-tools
  • High quality core genes identified from CEN-tools pipeline with annotations.
  • List of oncogenic and hotspot mutations used as 'contexts' in CEN-tools.
  • Top 10 co-essential gene tables

Data resources

CEN-tools uses data from multiple publicly available sources. Refer to the references below to download the source data: Click the links to download the files.

Cell line annotations

Annotations for cell lines like tissue or MSI status were curated from Cell model passports (model_list_20020610.csv) and Cancer Cell Line Encyclopedia.

SANGER project

BROAD project

INTEGRATED project

Terms and conditions

CEN-tools is an integrative tool that identifies the underlying context associated with the essentiality of a gene from large-scale, genome-wide CRISPR screens.

Copyright © 2020 EMBL-European Bioinformatics Institute

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Neither the institution name nor the name CEN-tools can be used to endorse or promote products derived from this software without prior written permission. For written permission, please contact petsalaki@ebi.ac.uk.

Products derived from this software may not be called CEN-tools nor may CEN-tools appear in their names without prior written permission of the developers.

You should have received a copy of the GNU General Public License along with this program. If not, see gnu.org/licenses/.

You can also view the licence here.

Further Disclaimer

For policies regarding the underlying data, please also refer to: