Information on terminology and datasets:

Calculation of the essentiality score

To identify the contexts associated with essentiality, we calculated a scaled log2 fold change (henceforth the 'scaled essentiality score') from the CRISPRcleanR-corrected gene-level fold changes provided by Project Score (Behan et al., 2019). Gene-level fold changes were first quantile normalised per sample and then median scaled using the lists of essential and non-essential genes previously defined for BAGEL (Hart & Moffat, 2016), as applied in Gonçalves et al. (2020). The scaled values were then multiplied by -1 so that essential genes have a median log2 fold change of 1 and non-essential genes a median log2 fold change of 0.
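The median-scaling and sign-flip steps can be illustrated with a minimal sketch. It assumes the fold changes of one sample have already been quantile normalised; the function name `scale_essentiality`, the gene names and the values are hypothetical, and this is not the CEN-tools implementation.

```python
from statistics import median

def scale_essentiality(fold_changes, essential, non_essential):
    """Scale one sample's gene-level log2 fold changes.

    fold_changes: dict mapping gene -> (already quantile-normalised) log2 fold change.
    essential / non_essential: reference gene sets (assumed to overlap fold_changes).
    """
    med_ess = median(fold_changes[g] for g in essential if g in fold_changes)
    med_non = median(fold_changes[g] for g in non_essential if g in fold_changes)
    span = med_non - med_ess  # distance between the two reference medians
    if span == 0:
        raise ValueError("reference medians coincide; cannot scale")
    # Median scaling puts non-essential genes at 0 and essential genes at -1;
    # multiplying by -1 then flips the sign so essential genes sit at +1.
    return {g: -1 * (fc - med_non) / span for g, fc in fold_changes.items()}

# Hypothetical values for illustration only.
sample = {"E1": -2.0, "E2": -1.5, "E3": -2.5,
          "N1": 0.1, "N2": -0.1, "N3": 0.0, "X": -1.0}
scores = scale_essentiality(sample, {"E1", "E2", "E3"}, {"N1", "N2", "N3"})
```

After scaling, a gene such as `X` that lies halfway between the two reference medians receives a score halfway between 0 and 1.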

User inputs:

Project: Three options are available: 'BROAD', 'SANGER', and 'INTEGRATED' (BROAD and SANGER combined). Analysis for each project is performed separately. BROAD refers to the DepMap project and SANGER refers to Project Score. Users can use the INTEGRATED dataset to investigate gene essentialities in a dataset that integrates BROAD and SANGER data (beta version, based on the preprint Pacini et al. 2020).

The number of cell lines used for computations is:
- in the ESSENTIALITY PROFILE and NETWORK ANALYSIS tab:
  - BROAD: 489 cell lines
  - SANGER: 324 cell lines
  - SANGER and BROAD INTEGRATED: 786 cell lines
- in the CONTEXT ANALYSIS tab, only the subset of cell lines for which combined essentiality, expression and mutation information was available is presented:
  - BROAD: 484 cell lines
  - SANGER: 307 cell lines
  - SANGER and BROAD INTEGRATED: 708 cell lines
Query genes: For each selected project there is a number of genes to choose from:
- in the ESSENTIALITY PROFILE and NETWORK ANALYSIS tab:
  - BROAD: 15,546 genes (genes targeted in both BROAD and SANGER)
  - SANGER: 15,546 genes (genes targeted in both BROAD and SANGER)
  - SANGER and BROAD INTEGRATED: 15,599 genes
- in the CONTEXT ANALYSIS tab:
  - BROAD: 17,343 genes (genes targeted in both BROAD and SANGER)
  - SANGER: 17,995 genes (genes targeted in both BROAD and SANGER)
  - SANGER and BROAD INTEGRATED: 16,827 genes
Tissue of Origin: Tissue annotations are available for all cell lines. Default is 'All'. Using the default option, visualisations for the data will be performed on all available cell lines from the chosen project.
Selecting cell lines:
- Default option is 'General selection' - Users will still be able to select cell lines but only based on tissue of origin.
- Advanced selection offers the possibility to launch the 'INTERACTIVE CELL LINE SELECTOR' in a new tab. Select the cell lines in the new tab and press 'CONFIRM SELECTION'. This will automatically update your analysis to the chosen cell lines. All cell lines that are not chosen will be labelled as 'Pancancer'. Users can also opt to subset the Pancancer cell lines to cell lines from specific tissue of origin.
Expression: Expression values displayed in CEN-tools are normalised FPKM values.
- For the BROAD and INTEGRATED project: Expression values were downloaded as normalised FPKM values from the Cancer Cell Line Encyclopedia (CCLE).
- For the SANGER project: Expression values were obtained from the cell model passports database. Downloaded FPKM were further transformed via log(FPKM+1).
Mutation: There are three options available to subset cell lines based on their mutational status:
- Hotspot mutation: This includes known oncogenic driver genes that contain commonly recurring hotspot mutations. The choice of genes is provided under Select hotspot mutation gene option. The mutation annotation was obtained from the CCLE database.
- Oncogenic mutation: This includes known oncogenic driver genes that contain any mutations. The choice of genes is provided under Select oncogenic mutation gene option. While using this option, note that different types of mutations in the same oncogenic gene can have different effects on vulnerabilities. The mutation annotation was obtained from the Cancer Genome Interpreter.
- Non-silent mutations: This includes genes that bear any non-silent mutation. The choice of genes is provided under Select gene with non-silent mutation option. The mutation annotation was obtained from the CCLE database.
Drug response: GDSC1 and GDSC2 are two drug screening datasets obtained from CancerRxGene. The following options are available; further information on the datasets can be found in the CancerRxGene documentation:
- GDSC1 only: “GDSC1 is an expansion of the original dataset available from this website and published by Iorio et al. (Cell 2016).” (taken from CancerRxGene documentation)
- GDSC2 only: This is an improved and expanded screen compared to GDSC1. “GDSC2 has been screened using improved equipment and procedures […]” (taken from CancerRxGene documentation).
- GDSC1 & GDSC2: If duplicate data exist in both screening datasets, only the values from the GDSC2 dataset are shown, because “many experiments from GDSC1 have been repeated in GDSC2 and we would recommend, where duplicate IC50s exist, using the result from GDSC2” (taken from CancerRxGene documentation).
- Default option is 'GDSC1 & GDSC2' - The drug response information is combined from both the GDSC2 and GDSC1 datasets. Data for all cell lines are first taken from GDSC2; if GDSC2 has no information for a given cell line and drug, the information is added from GDSC1. Z-scores were used in all cases. To restrict the analysis to only one of the two datasets, select 'GDSC1' or 'GDSC2'.
- Note that drug information is not available for all cell lines.
  - BROAD: GDSC1 and GDSC2 datasets contain information for 260/484 and 229/484 cell lines, respectively.
  - SANGER: GDSC1 and GDSC2 datasets contain information for 306/307 and 269/307 cell lines, respectively.
  - SANGER and BROAD INTEGRATED: GDSC1 and GDSC2 datasets contain information for 379/708 and 379/708 cell lines, respectively.
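The 'GDSC1 & GDSC2' rule described above (take GDSC2 first, fall back to GDSC1 only where GDSC2 has no value) can be sketched as follows. The function name `combine_gdsc`, the dictionary layout, and the cell line, drug and Z-score values are hypothetical and serve only to illustrate the precedence rule.

```python
def combine_gdsc(gdsc1, gdsc2):
    """Combine two screens, preferring GDSC2 where both have a value.

    Each input maps (cell_line, drug) -> Z-score.
    """
    combined = dict(gdsc1)   # start from GDSC1 ...
    combined.update(gdsc2)   # ... and let GDSC2 overwrite any duplicates
    return combined

# Hypothetical Z-scores for illustration only.
gdsc1 = {("line_A", "drug_1"): -0.4, ("line_B", "drug_1"): 1.2}
gdsc2 = {("line_A", "drug_1"): -0.9, ("line_A", "drug_2"): 0.3}
merged = combine_gdsc(gdsc1, gdsc2)
```

Here `("line_A", "drug_1")` exists in both screens, so the GDSC2 value is kept, while `("line_B", "drug_1")` is only in GDSC1 and is carried over unchanged.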

TAB information:

ESSENTIALITY PROFILE
CONTEXT ANALYSIS
NETWORK ANALYSIS
- Basic options
- Advanced options
INTERACTIVE CELL LINE SELECTOR
INTEGRATION WITH PPI NETWORK

ESSENTIALITY PROFILE:

This tab provides the gene-specific essentiality profile for each gene included in the project, together with the corresponding UniProt link and links to the DepMap portal and the Project Score portal.

CONTEXT ANALYSIS:

In this tab, there are options to perform group-wise statistical comparison or correlation studies.

TISSUE/COESSENTIALITY: Use this tab to check the essentiality profile for a given gene in a given project. If 'All' tissues are chosen, a multi-group comparison is performed using the Kruskal-Wallis test, with a post-hoc two-sample Wilcoxon test. If a specific tissue type is selected, a two-group comparison is performed using the two-sample Wilcoxon test. If 'Advanced selection' is chosen and specific cell lines are selected, the chosen cell lines are annotated as 'Chosen', all other cell lines are annotated as 'Pancancer', and a two-group comparison is performed using the two-sample Wilcoxon test.
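As a rough illustration of the 'Chosen' versus 'Pancancer' two-group comparison, the sketch below labels cell lines and computes the Mann-Whitney U statistic (equivalent to the two-sample Wilcoxon rank-sum statistic) in plain Python. The function names, cell line names and scores are hypothetical; no p-value is computed here, and in practice a statistics library (for example `scipy.stats.mannwhitneyu`) would be used for that.

```python
def label_cell_lines(all_lines, chosen):
    """Annotate every cell line as 'Chosen' or 'Pancancer'."""
    chosen = set(chosen)
    return {name: ("Chosen" if name in chosen else "Pancancer")
            for name in all_lines}

def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs (a, b) with a > b, ties counted as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in group_a for b in group_b)

# Hypothetical scaled essentiality scores of one gene across six cell lines.
scores = {"line_1": 0.9, "line_2": 0.8, "line_3": 0.7,
          "line_4": 0.1, "line_5": 0.2, "line_6": 0.75}
labels = label_cell_lines(scores, chosen=["line_1", "line_2", "line_3"])

chosen_scores = [s for name, s in scores.items() if labels[name] == "Chosen"]
pancancer_scores = [s for name, s in scores.items() if labels[name] == "Pancancer"]
u_stat = mann_whitney_u(chosen_scores, pancancer_scores)
```

A U statistic close to the number of possible pairs (here 3 x 3 = 9) indicates that the chosen group tends to have higher scores than the remaining cell lines.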

Also in this tab is the option to correlate essentiality between two genes. Pearson correlation is used for all correlation tests.
CO-EXPRESSION LEVEL: Use this tab to check the expression profile for a given gene in a given project. If 'All' tissues are chosen, a multi-group comparison is performed using the Kruskal-Wallis test, with a post-hoc Mann-Whitney U (two-sample Wilcoxon) test. If a specific tissue type is selected, a two-group comparison is performed using the Mann-Whitney U (two-sample Wilcoxon) test. If 'Advanced selection' is chosen and specific cell lines are selected, the chosen cell lines are annotated as 'Chosen', all other cell lines are annotated as 'Pancancer', and a two-group comparison is performed using the Mann-Whitney U (two-sample Wilcoxon) test.

Also in this tab is the option to correlate expression between two genes or expression of the query gene with the essentiality of any other gene. Pearson correlation is used for all correlation tests.
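For readers who want to reproduce a correlation outside the tool, a minimal sketch of the Pearson coefficient is given below. The function name `pearson_r` and the example values are hypothetical; the two input lists are assumed to hold one value per cell line, in the same cell line order (for example, expression of one gene and essentiality of another).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical values: expression of one gene vs essentiality of another.
expression = [1.0, 2.0, 3.0, 4.0]
essentiality = [0.2, 0.4, 0.6, 0.8]
r = pearson_r(expression, essentiality)
```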

Expression values displayed in CEN-tools are normalised FPKM values.
- For the BROAD and INTEGRATED project: Expression values were downloaded as normalised FPKM values from the Cancer Cell Line Encyclopedia (CCLE).
- For the SANGER project: Expression values were obtained from the cell model passports database. Downloaded FPKM were further transformed via log(FPKM+1).
MUTATION LEVEL: Use this tab to compare the essentiality of a given gene in the context of a mutation of the same or a different gene. Three types of mutational annotations are used.
- Hotspot mutation uses the annotation from the CCLE database, in which commonly occurring hotspot mutations in cancer driver genes are annotated.
- Oncogenic mutation uses the annotation from the Cancer Genome Interpreter, in which all mutations in a given oncogenic driver gene are considered, whether or not the mutation is a hotspot mutation.
- Non-silent mutations uses the annotation from the CCLE database, in which all mutations in a given gene are considered, whether or not the mutation is a hotspot mutation or the gene is a known driver.
When using the last two options, users must consider that not all mutations will have the same effect on essentiality.
DRUG RESPONSE LEVEL: Use this tab to correlate the response of selected cell lines to a drug and the essentiality of a given gene. The drugs are grouped according to their targets as annotated from Genomics of Drug Sensitivity in Cancer (CancerRxGene). Drug sensitivity is represented as normalised Z-scores. A negative Z-score implies a higher sensitivity of the cell line towards that drug.

NETWORK ANALYSIS:

This tab contains all the pre-calculated associations for mutation, tissue/cancer, and expression related associations. The different sidebar options enable users to subset the network.

Basic Options

Project:

All calculations for the different projects are done separately. Choose between the calculations made from SANGER or BROAD projects.

Generate CEN centered around:

Three types of context-specific essentiality networks (CENs) are available to generate:

Tissue - Use this option to generate a tissue-centric network. The chosen tissue forms the central node, and all the edges of this network correspond to that tissue. If 'All' is chosen, the network will be very large, so users have the option to view only the expression/tissue network. To activate this function, users must choose the 'Expression' option from Show networks.
Cancer type - Use this option to view a cancer type-centric network. This option is very similar to the Tissue CEN, as cancer types are sub-divisions of tissues. This is especially relevant for cell lines of the Nervous system and Haematopoietic/Lymphoid lineages.
Gene - Use this option to view a gene-centric network. The edges in this network are not specific to a particular tissue or cancer type. However, an additional option on the main panel, 'Load current network edges groups:', allows users to choose edges corresponding only to a particular tissue/cancer type. Users can also use the toggle button to switch between visualising tissue or cancer type connections.

Display effector edges corresponding to:

Three options are available here 'Tissue/Cancer', 'Expression', and 'Mutation'. These are the contexts that are being tested and displayed in different colors in the results CEN.

Advanced Options

These advanced options allow users to optimise the way the networks are visualised. By default the advanced options are disabled - use the toggle button to enable the options.

show edges corresponding to options:

For each context that has been selected in the basic option of Display effector edges corresponding to, there are options to select the type of edges to visualise. Users can choose to either display edges that correspond to increase or decrease in essentiality or expression or display both.

Adjusting median of the essential group options:

The essentiality score was scaled such that essential genes have a median log2 fold change of 1 and non-essential genes a median log2 fold change of 0.

When testing for a significant increase in essentiality, sliders are available to adjust the minimum median of the essential group, ensuring that the tested group has high essentiality. This is only relevant for group-wise comparisons, and there are two separate options: one for the 'Tissue/Cancer' comparison and one for the 'Mutation' comparison.
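The effect of such a slider can be pictured as a simple threshold filter on the median score of the essential group. The sketch below is an illustration under assumed names: the edge records, the key `median_essentiality` and the function `filter_by_median` are hypothetical and do not reflect the internal data format of CEN-tools.

```python
def filter_by_median(edges, min_median):
    """Keep edges whose essential group has a median score above min_median."""
    return [e for e in edges if e["median_essentiality"] > min_median]

# Hypothetical edges for illustration only.
edges = [
    {"gene": "gene_1", "context": "tissue_A", "median_essentiality": 0.45},
    {"gene": "gene_2", "context": "tissue_A", "median_essentiality": 0.15},
    {"gene": "gene_3", "context": "tissue_B", "median_essentiality": 0.30},
]
kept = filter_by_median(edges, min_median=0.3)
```

Because the comparison is strict ("higher than"), an edge whose median equals the slider value is not kept in this sketch.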

Perform context comparisons:

This option is only activated when Expression and Mutation options are selected in the 'Choose effector edges to display' option.

Use the Within a tissue of origin/cancer type option to perform all statistical tests within the tissue type. For example: Essentiality of BRAF in BRAF mutant cells of Skin. Use the Pancancer option to perform statistical tests using all available cell lines with a given context. For example: Essentiality of BRAF in all BRAF mutant cells.

Confidence level:

Select confidence thresholds. The levels are defined as follows:

For categorical comparison (tissue, cancer type, mutation):
- Level 1: 0.01 < p-value < 0.05
- Level 2: 0.001 < p-value < 0.01
- Level 3: 0.0001 < p-value < 0.001
- Level 4: p-value < 0.0001
For Pearson correlation comparison (expression), with p < 0.05 required at every level:
- Level 1: r < 0.5
- Level 2: 0.5 < r < 0.6
- Level 3: 0.6 < r < 0.65
- Level 4: r > 0.7
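The level definitions above can be expressed as two small lookup functions. This is a sketch under stated assumptions, not the tool's code: the function names are hypothetical, values that fall exactly on a boundary are assigned to the lower-confidence level, the absolute value of r is used so that negative correlations are graded by strength, and values of r between 0.65 and 0.7, which the list does not assign to any level, return `None`.

```python
def categorical_level(p):
    """Confidence level (1-4) for a group-wise test, or None if p >= 0.05."""
    if p < 0.0001:
        return 4
    if p < 0.001:
        return 3
    if p < 0.01:
        return 2
    if p < 0.05:
        return 1
    return None

def correlation_level(r, p):
    """Confidence level (1-4) for a Pearson correlation, or None."""
    if p >= 0.05:
        return None          # every level additionally requires p < 0.05
    strength = abs(r)        # assumption: grade by strength, ignore the sign
    if strength > 0.7:
        return 4
    if strength >= 0.65:
        return None          # range not assigned to a level in the list
    if strength >= 0.6:
        return 3
    if strength >= 0.5:
        return 2
    return 1
```

For example, `categorical_level(0.005)` gives level 2 and `correlation_level(0.75, 0.01)` gives level 4.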

Group selection:

Users have an additional option to select confidence based on the number of samples in each group when the test was performed. Group A calculations contain more than 5 samples per group, whereas selecting 'All' also includes less restrictive tests with only 3 or more samples per group. Tests with a low number of samples per group are less reliable and should be interpreted with caution.

INTERACTIVE CELL LINE SELECTOR:

The interactive cell line selector allows users to select cell lines according to their choice. The option to select this tab will only be active if Advanced selection option is chosen in the CONTEXT ANALYSIS tab. Users will be prompted to 'Launch the interactive cell selector'. Upon clicking the button a new tab with the following options will open.

Show cell lines with mutations from: Cell lines can be selected based on their mutational status. Four options are available:
- Hotspot mutation
- Oncogenic mutation
- Input genes: This option can be used to select cell lines based on any gene of choice which contains any non-silent mutations. Multiple gene inputs are allowed, separated by a ', '. For example BRAF, KRAS.
- No selection based on mutation: Use this option to view all cell lines regardless of the mutational status. This is the default option. If genes with mutations are selected, users can further restrict the displayed cell lines, by whether they should contain a mutation in any ('OR') or all ('AND') selected genes.
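The 'Input genes' format and the 'OR'/'AND' restriction can be sketched as follows. The function names, the cell line names and the mutation table are hypothetical; the sketch only illustrates splitting a comma-separated gene list and selecting cell lines that carry a mutation in any or all of the listed genes.

```python
def parse_gene_input(text):
    """Split a comma-separated gene list such as 'BRAF, KRAS'."""
    return [gene.strip() for gene in text.split(",") if gene.strip()]

def select_cell_lines(mutations, genes, mode="OR"):
    """Return cell lines mutated in any ('OR') or all ('AND') of the genes.

    mutations: dict mapping cell line -> set of genes with a non-silent mutation.
    """
    wanted = set(genes)
    if mode == "OR":
        keep = lambda mutated: bool(wanted & mutated)
    elif mode == "AND":
        keep = lambda mutated: wanted <= mutated
    else:
        raise ValueError("mode must be 'OR' or 'AND'")
    return sorted(name for name, mutated in mutations.items() if keep(mutated))

# Hypothetical mutation table for illustration only.
mutations = {
    "line_1": {"BRAF"},
    "line_2": {"BRAF", "KRAS"},
    "line_3": {"TP53"},
}
genes = parse_gene_input("BRAF, KRAS")
any_match = select_cell_lines(mutations, genes, mode="OR")
all_match = select_cell_lines(mutations, genes, mode="AND")
```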
Selection based on Copy Number Variation (CNV)?: Users have the choice to select cell lines depending on the copy number variations of selected genes. When 'Yes' is selected, a slider will appear and users can adjust the slider to restrict the choice of genes that have a CNV within the selected range in the available cell lines.
Cell culture growth properties: Users can use this option to select the cell culture conditions of the cells. This information was obtained from Cell model passports.
Genome stability: This option allows users to select the microsatellite instability (MSI)/ microsatellite stable (MSS) status of the cell lines. This information was obtained from Cell model passports.
Tissue and Cancer type: Users may use this option to restrict the selected cell lines so far to a particular tissue type or cancer type.
Choose output: This option allows users to view the data as a table or as a t-SNE plot. The t-SNE plot is pre-built from the gene-expression datasets. Within the t-SNE plot, individual points denote cell lines, and users can select or deselect any point of interest by clicking on it, or drag a window around a number of cell lines.
Choose a name for your selection: Once selections are made, users MUST provide a name for their selected group (e.g. "BRAF_mut_Skin") to reflect their selection. This name will appear in the plots in the CONTEXT ANALYSIS tab. Once the name is given, press 'Click here to confirm your selection', which will automatically direct users to the CONTEXT ANALYSIS tab.

INTEGRATION WITH PPI NETWORK:

The integration with the PPI network allows users to map the CENs onto a protein-protein interaction network from STRING. The option to select this tab will only be active once the 'Map the current nodes to a PPI network' option in the NETWORK ANALYSIS tab is pressed. Upon pressing the button, a new tab with the following options will open.

Display Network from STRING: There are two choices for this option:

Protein-Protein interactions for individual proteins imported from the CEN: Upon selecting this option, the nodes from the CEN shown on the Network analysis page will be listed in the dropdown menu under 'Explore PPI network of'. Select the protein of interest from this list, after which a purple button labelled 'Retrieve PPI partners of the selected protein from STRING' will appear. Press this button to visualise the PPI network of the selected individual protein.
Map CEN onto STRING network: Using this option, users can map all the genes in the CEN onto the STRING PPI network and perform enrichment analysis. Two further options will be enabled upon selecting this option.
- Show CEN-mapped STRING network. After selecting this option a purple button with 'Retrieve PPI network of CEN from STRING' will appear. Press this button to display the mapped network. This network will only contain nodes that are represented in CEN.
- Perform enrichment analysis: This option allows users to perform different types of enrichment analysis. Two further options are available.
  - Choose the enrichment category to display: Three enrichment options are available: Function (Protein function), KEGG (KEGG pathways), and Process (GO-process).
  - Choose enrichments FDR cut-off: Three options (FDR 0.1, 0.05, 0.01) are available.

Adjust STRING interaction score:

This is the interaction score from STRING. Users can choose a number between 0 and 1000 (default used in CEN-tools is 400).

Hide disconnected nodes:

Users can choose to hide the disconnected nodes from STRING using this toggle button. Hiding disconnected nodes is the default option.
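The combined effect of the interaction score cut-off and the 'Hide disconnected nodes' toggle can be sketched as below. The function name, the protein names and the scores are hypothetical, and the sketch assumes that an edge is kept when its score is at or above the chosen cut-off.

```python
def filter_network(nodes, edges, min_score=400, hide_disconnected=True):
    """Apply a score cut-off and optionally drop nodes left without edges.

    edges: list of (protein_a, protein_b, score) with scores from 0 to 1000.
    """
    kept_edges = [(a, b, s) for a, b, s in edges if s >= min_score]
    connected = {n for a, b, _ in kept_edges for n in (a, b)}
    if hide_disconnected:
        kept_nodes = [n for n in nodes if n in connected]
    else:
        kept_nodes = list(nodes)
    return kept_nodes, kept_edges

# Hypothetical network for illustration only.
nodes = ["P1", "P2", "P3", "P4"]
edges = [("P1", "P2", 900), ("P2", "P3", 350)]
shown_nodes, shown_edges = filter_network(nodes, edges)
all_nodes, _ = filter_network(nodes, edges, hide_disconnected=False)
```

With the default cut-off of 400, only the P1-P2 edge survives, so P3 and P4 are hidden unless the toggle is switched off.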

Worked examples

The examples below explain the usage of the cell line selector and the generation of context-specific essentiality networks (CENs). These examples can be used to reproduce the figures in the paper Sharma et al. 2020.

Generation of CENs
- Example 1: Extraction of lineage CENs (Figure 2B)
- Example 2: Extraction of skin-specific CENs (Figure 2C)
Using the cell line selector

Generation of CENs

Example 1: Extraction of lineage CENs (Figure 2B)

Lineage-specific CENs were extracted by subsetting the BROAD project using 'Expression' as the effector. The entire network was then subsetted to contain only the interactions of this gene set. To download this network and the node attributes from the CEN-tools website the following steps were taken:

Navigate to the 'NETWORK ANALYSIS' tab.
Choose the following parameters:
- Basic parameters:
  1. Project: BROAD
  2. Generate CEN centered around: Cancer type
  3. Cancer type: All
  4. Toggle Expression-specific
  5. Display effector edges corresponding to: Select all (Mutation, Expression, Tissue/Cancer)
- Advanced edge filter options: First toggle to enable advanced options. Then make the following selections:
  1. For Expression context correlations show edges corresponding to: Positive correlations
  2. For Tissue/Cancer context comparisons show edges corresponding to: Increase in essentiality/expression
  3. For Mutation context comparisons show edges corresponding to: Both Increase and decrease in essentiality/expression
  4. Select mutation annotations: Hotspot mutation
  5. Only show tissue/cancer edges in which the median essentiality score of the essential context is higher than: 0.3
  6. Only show mutation edges in which the median essentiality score of the essential context is higher than: 0.4
  7. Perform context comparisons: Within a tissue of origin/cancer type
  8. Confidence level of association (Tissue/Cancer type): 1:Low
  9. Confidence level of association (Mutation): 2:Medium
  10. Select: Group A
If this is the first network being made, press 'Initialise network' on the top left corner.
Download the network and the node attributes by clicking the download buttons in the left sidebar. Open the network file in Cytoscape.

Example 2: Extraction of skin-specific CENs (Figure 2C)

Skin-specific CENs were extracted from the ‘BROAD’ project. To download this network and the node attributes from the CEN-tools website the following steps were taken:

Navigate to the 'NETWORK ANALYSIS' tab.
Choose the following parameters:
- Basic parameters:
  1. Project: BROAD
  2. Generate CEN centered around: Tissue
  3. Tissue of Origin: Skin
  4. Display effector edges corresponding to: Select all (Mutation, Expression, Tissue/Cancer)
- Advanced edge filter options: First toggle to enable advanced options. Then make the following selections:
  1. For Expression context correlations show edges corresponding to: Positive correlations
  2. For Tissue/Cancer context comparisons show edges corresponding to: Increase in essentiality/expression
  3. For Mutation context comparisons show edges corresponding to: Increase in essentiality/expression
  4. Select mutation annotations: Hotspot mutation
  5. Only show tissue/cancer edges in which the median essentiality score of the essential context is higher than: 0.2 (can be adjusted depending on the threshold required).
  6. Only show mutation edges in which the median essentiality score of the essential context is higher than: 0.4 (can be adjusted depending on the threshold required).
  7. Perform context comparisons: Within a tissue of origin/cancer type
  8. Confidence level of association (Tissue/Cancer type): 1:Low
  9. Confidence level of association (Mutation): 2:Medium
  10. Select: Group A
Press Initialise network. The network will have too many nodes to be displayed but will still be generated (you will see a warning).
Download the network and the node attributes by clicking the download buttons in the left sidebar. Open the network file in Cytoscape.
For visualization, only display nodes with 'TF' attribute and any other nodes directly associated with these 'TF' nodes.
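The final visualisation step (keeping 'TF' nodes and their direct neighbours) is done in Cytoscape, but the same subsetting can be sketched in code. The layout of the downloaded network and node attribute files is not described here, so the sketch assumes the data have already been read into a dictionary of node attributes and a list of edges; all names and values are hypothetical.

```python
def tf_neighbourhood(node_attrs, edges):
    """Keep 'TF' nodes, their direct neighbours, and the edges joining them.

    node_attrs: dict mapping node -> attribute label (e.g. 'TF').
    edges: list of (node_a, node_b) pairs.
    """
    tf_nodes = {n for n, attr in node_attrs.items() if attr == "TF"}
    # An edge is kept only if at least one of its ends is a TF node.
    kept_edges = [(a, b) for a, b in edges if a in tf_nodes or b in tf_nodes]
    kept_nodes = set(tf_nodes)
    for a, b in kept_edges:
        kept_nodes.update((a, b))
    return kept_nodes, kept_edges

# Hypothetical network for illustration only.
node_attrs = {"tf_1": "TF", "gene_a": "other", "gene_b": "other", "gene_c": "other"}
edges = [("tf_1", "gene_a"), ("gene_a", "gene_b"), ("gene_b", "gene_c")]
kept_nodes, kept_edges = tf_neighbourhood(node_attrs, edges)
```

In this sketch, edges between two non-TF nodes are dropped even when both nodes are neighbours of a TF; whether to retain them is a display choice.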

Using the cell line selector

Example 1: Investigating paralog dependency (Appendix Figure S8A)

The association of RPL22 mutation and the essentiality of its paralog RPL22L1 was investigated in the 'BROAD' project. The following steps were taken:

Navigate to the 'Context analysis' tab and 'Tissue/Coessentiality' subtab.
On the left menubar select:
1. Project: BROAD
2. 'Advanced selection' and then 'Interactive Cell Line Selector'
Hit the 'Launch the interactive Cell Line Selector' button. You will be redirected to a new tab. Wait a little while until the interface is fully loaded.
On the left menubar choose:
1. Show cell lines with mutations from: Input genes
  1. Type RPL22 in the appearing text box
  2. Hit the 'Submit' button
  3. Should cell lines contain a mutation in all ('AND') or at least one ('OR') of the above chosen genes?: OR
2. Selection based on Copy Number Variation (CNV)?: No selection based on CNV
3. Cell culture growth properties: All
4. Genome stability: All
5. Tissue of origin (multiple selection allowed): All
6. Cancer type (multiple selection allowed): All
7. Colour cell lines by: Tissue
8. Choose a name for your selection: e.g. 'RPL22 non-silent mutation'
Hit the 'Click here to confirm your selection' button. You will be redirected to the 'Context analysis' tab.
On the left menubar select:
1. Start typing to select gene: RPL22L1
2. Cells not chosen will be labelled as 'Pancancer', do you wish to subset Pancancer list by tissue type?: No

Example 2: Investigating essentiality based on CNV status (Appendix Figure S8B)

The association of ERBB2 amplification with its essentiality in breast and esophagus cell lines was investigated in the 'BROAD' project. The following steps were taken:

Navigate to the 'Context analysis' tab and 'Tissue/Coessentiality' subtab.
On the left menubar select:
1. Project: BROAD
2. 'Advanced selection' and then 'Interactive Cell Line Selector'
Hit the 'Launch the interactive Cell Line Selector' button. You will be redirected to a new tab. Wait a little while until the interface is fully loaded.
On the left menubar choose:
1. Show cell lines with mutations from: No selection based on mutation
2. Selection based on Copy Number Variation (CNV)?: Yes
  1. Wait until you see a black box below the slider. This might take a while.
  2. Choose a range of relative copy numbers to restrict the choice of genes with CNV: 1.1-7.22
  3. Pick 1 or more genes, whose relative CN lies within the selected range: ERBB2
  4. Should cell lines contain a CNV in all ('AND') or at least one ('OR') of the above chosen genes?: OR
3. Cell culture growth properties: All
4. Genome stability: All
5. Tissue of origin (multiple selection allowed): Breast, Esophagus
6. Cancer type (multiple selection allowed): All
7. Colour cell lines by: Tissue
8. Choose a name for your selection: e.g. 'ERBB2 amplification'
Hit the 'Click here to confirm your selection' button. You will be redirected to the 'Context analysis' tab.
On the left menubar select:
1. Start typing to select gene: ERBB2
2. Cells not chosen will be labelled as 'Pancancer', do you wish to subset Pancancer list by tissue type?: Yes
  - Subset Pancancer to a specific tissue of origin: Breast, Esophagus

Example 3: Investigating essentiality based on microsatellite instability status (MSI) (Appendix Figure S8C)

The association of microsatellite instability (MSI) with the essentiality of the WRN helicase in colorectal cell lines was investigated in the 'SANGER' project. The following steps were taken:

Navigate to the 'Context analysis' tab and 'Tissue/Coessentiality' subtab.
On the left menubar select:
1. Project: SANGER
2. 'Advanced selection' and then 'Interactive Cell Line Selector'
Hit the 'Launch the interactive Cell Line Selector' button. You will be redirected to a new tab. Wait a little while until the interface is fully loaded.
On the left menubar choose:
1. Show cell lines with mutations from: No selection based on mutation
2. Selection based on Copy Number Variation (CNV)?: No selection based on CNV
3. Cell culture growth properties: All
4. Genome stability: MSI
5. Tissue of origin (multiple selection allowed): Colon/Rectum
6. Cancer type (multiple selection allowed): All
7. Colour cell lines by: Tissue
8. Choose a name for your selection: e.g. 'MSI in colorectal'
Hit the 'Click here to confirm your selection' button. You will be redirected to the 'Context analysis' tab.
On the left menubar select:
1. Start typing to select gene: WRN
2. Cells not chosen will be labelled as 'Pancancer', do you wish to subset Pancancer list by tissue type?: Yes
  - Subset Pancancer to a specific tissue of origin: Colon/Rectum
