The Bioinformatics Core facility offers advice on experimental design and statistics and provides training and support for data processing and analysis, working collaboratively with colleagues in CRUK CI research groups and other core facilities.
An important role of the Bioinformatics Core is to provide training to CRUK CI scientists. Working in partnership with the University’s Bioinformatics Training Facility, we offer a number of classroom-based training courses with an emphasis on hands-on, practical learning.
Bioinformatic and statistical data analysis
The team has considerable experience in analysing datasets generated by high-throughput technologies and supports research projects that employ a range of experimental approaches:
RNA-seq for differential gene expression between sample groups, e.g. treated vs. untreated cell lines, and single-cell RNA-seq for identification and characterisation of sub-populations of cells
ChIP-seq to investigate how transcriptional regulation is altered in cancers by locating binding sites of regulatory proteins and determining which of these are differentially bound under different conditions
Whole genome, exome and amplicon sequencing to explore variation in cancer genomes, identifying single nucleotide variants, small insertions/deletions, copy number aberrations and genomic rearrangements
Quantitative proteomics, including the analysis of tandem mass tag (TMT) mass spectrometry data to look at changes in protein abundance following a treatment or perturbation
Statistical analysis of a wide range of data types, including the application of mixed models to tumour growth curve data and survival analysis comparing treatment groups or disease subtypes
Interpretation of CRUK CI datasets and research findings in the context of publicly available data, including genomic feature annotations, sequence motifs, pathways and survival data
The Bioinformatics Core has developed a number of software packages and tools for analysing and visualising the datasets we work with. These include R and Bioconductor packages, analysis pipelines developed using Nextflow and interactive web applications written using the R Shiny framework.
Data processing and analysis infrastructure
The Bioinformatics Core processes increasingly large volumes of genomics data and develops analysis pipelines that run efficiently on the CRUK CI high-performance compute cluster.
We work closely with the Genomics Core and support the Illumina sequencing operation with automated data processing and QC, LIMS deployment and extension for bespoke laboratory workflows, and delivery of sequence data to 12 partner institutes and University departments.
Supporting other core facilities
We support several of the other core facilities at CRUK CI:
Genomics – managing and delivering data from the Illumina sequencers since early 2008, from the Genome Analyzer I through to the current NovaSeq 6000
Proteomics – mass spectrometry results database and query interface
Pre-Clinical Genome Editing – CRISPR guide design and clone selection tools
Flow Cytometry – mass cytometry image extraction and reconstruction