Functional genomics of breast cancer

We have redefined breast cancer as a constellation of 10 genomic driver-based subtypes. This new molecular taxonomy of breast cancer will now be translated into the clinic in stratification, tumour monitoring and therapy studies. In parallel we will continue to develop models to characterize the biology of these subtypes.

Translational breast cancer genomics: applications of molecular profiling in prognosis, prediction and novel therapeutics

The genomic landscapes of breast cancer are dominated by somatic copy number alterations (CNAs), which further supports the significance of the novel molecular taxonomy of breast cancer. We have completed targeted sequencing of 180 genes at 250-500x depth in all 2,000 cases used for the discovery of the 10 integrative clustering subtypes, plus an additional 500 cases. This reveals distinct patterns of single nucleotide variants (SNVs) across the 10 integrative clusters (Figure 1).

Figure 1: Word clouds illustrating the distributions of mutations in 178 genes in four Integrative Clusters (IntClusts) enriched for ER positive tumours. In each panel, the size of each word corresponds to the relative frequency of mutations observed in a given gene for that IntClust.

We are now integrating the CNA, SNV and expression data to identify all the drivers across the subtypes. We will also define the patterns of pathway disruption, clonal architecture and clinical correlation. Our aim is to define a minimal set of features that can be used to generate a simple molecular test, ideally performed using routinely collected paraffin-embedded tumour blocks, that can be used prospectively to assign any new tumour to one of the subtypes and to identify the driver mutations of that tumour. Such a test could be used for stratification, to design patient-specific tumour monitoring, and ultimately to assign the best therapy to each patient.
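As a purely illustrative sketch of how a small feature set could assign a new tumour to a subtype, the example below uses a nearest-centroid rule. This is an assumption made for illustration, not the method used for the test described above. The feature values, the subtype labels ("IntClust A", "IntClust B") and the function name nearest_centroid are all hypothetical.

```python
import math


def nearest_centroid(profile, centroids):
    """Return (label, distance) for the centroid closest to the profile.

    profile   : list of feature values for one tumour (hypothetical features)
    centroids : dict mapping subtype label -> list of mean feature values
    Distance is plain Euclidean distance; a real test might use another metric.
    """
    best_label, best_dist = None, math.inf
    for label, centroid in centroids.items():
        dist = math.dist(profile, centroid)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical two-feature centroids (for example, a scaled expression value
# and a scaled copy-number value); the numbers are invented for illustration.
centroids = {
    "IntClust A": [1.0, 0.0],
    "IntClust B": [0.0, 1.0],
}

label, dist = nearest_centroid([0.9, 0.2], centroids)
print(label, round(dist, 3))
```

A nearest-centroid rule is attractive for a minimal test because it needs only one reference vector per subtype, but any practical classifier would have to be trained and validated on real cohort data.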

We have also conducted seminal studies in metastatic breast cancer showing that cell-free circulating tumour DNA (ctDNA) in plasma is a better tumour burden biomarker than circulating tumour cells (CTCs). Rises in ctDNA often precede radiological tumour progression by a few months, the dynamic range of ctDNA is several-fold better than that of CTCs, and the ability to quantify different mutations affords the possibility of non-invasive clonal tracking. In patients with high tumour burden, cancer exomes can be sequenced directly from ctDNA in plasma, providing a true liquid biopsy to identify mutations associated with resistance. We are now studying several cases for which we have ctDNA together with primary and metastatic tumour biopsies, to characterize the clonal architecture in these unique samples. We also have pilot data that show promise for ctDNA as an early response biomarker in neoadjuvant therapy, and we plan to start studies looking at its value as an early relapse biomarker.

We have continued our collaborative efforts to develop image analysis algorithms adapted from astronomy to robustly classify tumour cells, lymphocytes and stromal cells in histological sections of primary human breast cancers. Once this is done, individual cells can be treated as objects, and both homotypic and heterotypic spatial correlations can be characterized and integrated with other pathological, biological or clinical features (Figure 2). Our aim is to use these spatial correlations as potential classifiers with prognostic and predictive value. We are also exploring image-processing algorithms to analyse higher-level architectural features as new descriptors of different breast cancer subtypes.

Figure 2: Astronomical image-processing deciphers morphological complexity. Diagram illustrating the process by which spatial correlations are estimated between classified cells. Panel A: A standard histopathological image (left) is subjected to image-processing including detection and classification of cell nuclei utilising a pathologist-trained dataset and k-nearest-neighbour classifier (right). The positions of cells classified as cancer cells (red), stromal cells (beige) and lymphocytes (green) are used to estimate auto-correlation (relationship of each cell type to cells of the same cell type) and cross-correlation (relationship of different cell types to each other). Panel B: Line plots depict the pattern of these correlations according to the pixel-distance from each detected cell. Statistics for calculating correlations are borrowed directly from astronomical applications.
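The auto- and cross-correlations described in Figure 2 can be sketched with the "natural" two-point correlation estimator that is widely used in astronomy, xi(r) = DD(r)/RR(r) - 1, where DD counts observed cell pairs in a distance bin and RR counts pairs in a uniformly random point set. Whether this particular estimator is the one used in the study is an assumption. The function names, bin edges, image size and simulated cell positions below are hypothetical, and no edge correction is applied.

```python
import math
import random


def pair_counts(a, b, edges, same):
    """Count point pairs (one from a, one from b) in each distance bin."""
    counts = [0] * (len(edges) - 1)
    for i, (x1, y1) in enumerate(a):
        # For an auto-correlation, visit each unordered pair once, no self-pairs.
        others = b[i + 1:] if same else b
        for x2, y2 in others:
            d = math.hypot(x1 - x2, y1 - y2)
            for k in range(len(counts)):
                if edges[k] <= d < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts


def n_pairs(n_a, n_b, same):
    """Total number of pairs used to normalise the bin counts."""
    return n_a * (n_a - 1) / 2 if same else n_a * n_b


def correlation(a, b, edges, width, height, n_random=300, seed=0):
    """Natural estimator xi(r) = DD(r)/RR(r) - 1 for each distance bin.

    Pass the same list twice (a is b) for an auto-correlation, or two
    different lists for a cross-correlation. Bins with no random pairs
    return NaN. Each point set needs at least two points.
    """
    same = a is b
    rng = random.Random(seed)

    def uniform_points():
        return [(rng.uniform(0, width), rng.uniform(0, height))
                for _ in range(n_random)]

    ra = uniform_points()
    rb = ra if same else uniform_points()
    dd = pair_counts(a, b, edges, same)
    rr = pair_counts(ra, rb, edges, same)
    dd_total = n_pairs(len(a), len(b), same)
    rr_total = n_pairs(n_random, n_random, same)
    return [(d / dd_total) / (r / rr_total) - 1 if r else float("nan")
            for d, r in zip(dd, rr)]


# Simulated positions (pixels): clustered "cancer cells", scattered "lymphocytes".
rng = random.Random(1)
tumour = [(rng.uniform(40, 50), rng.uniform(40, 50)) for _ in range(50)]
lymph = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(50)]
edges = [0, 5, 10, 20]  # pixel-distance bins

auto = correlation(tumour, tumour, edges, 100, 100)   # homotypic
cross = correlation(tumour, lymph, edges, 100, 100)   # heterotypic
print(auto, cross)
```

Values well above zero indicate that cells sit closer together than expected by chance at that distance, and values near zero indicate no spatial association. Plotting xi against the bin distance gives curves analogous to the line plots in Panel B.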

We have used our unique tissue microarray (TMA) resource to look at the association between T-cell infiltration and breast cancer survival in 12,439 patients. This has shown that the presence of CD8+ T-cells in breast cancer is associated with a significant reduction in the relative risk of death from the disease in both the ER-negative and the ER-positive HER2-positive subtypes. We have also conducted the largest study of the androgen receptor (AR) as a prognostic marker in breast cancer.

Collaborators: Sam Aparicio (University of British Columbia), Simon Tavaré, Jason Carroll and Florian Markowetz (CRUK CI), Paul Pharoah (Strangeways Research Laboratory), Mike Irwin and Nick Walton (Institute of Astronomy), Richard Baird and Helena Earl (Department of Oncology).

Functional breast cancer genomics: characterising tumour initiating/cancer stem cells in breast cancer subtypes

We have generated a collection of patient-derived tumour xenografts (PDTXs), and have imported others from collaborators, with the aim of representing all 10 breast cancer subtypes. All of these PDTXs are being extensively characterized with whole genome/whole exome sequencing, expression profiling, miRNA profiling and whole genome/reduced-representation bisulfite sequencing. Importantly, for all the models generated we have matched normal DNA and the originating primary tumour or metastasis, on which similar molecular profiling is performed. We have now optimised protocols to derive viable single-cell suspensions from each xenograft, which we designate as patient-derived tumour cells (PDTCs). These PDTCs can be used 24 hours after collection for in vitro perturbation (TGFβ exposure, miRNA over-expression, shRNA or RNAi), but crucially also for high-throughput in vitro drug screening. In collaboration with the Sanger Institute we have now conducted pilot experiments screening around 100 compounds in 17 distinct PDTCs, establishing proof of principle that we will eventually be able to completely replace existing cell lines for these experiments. Our aim is to correlate drug response and resistance with the molecular profiles of the PDTCs/PDTXs and hence identify novel predictive biomarkers for translational use.
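One simple way to relate drug response to a molecular feature across models is a rank correlation. The sketch below computes a Spearman coefficient between a hypothetical drug sensitivity measure and a hypothetical expression value across a handful of PDTCs. The choice of Spearman correlation, the variable names and all numbers are assumptions for illustration and do not describe the analysis used in the screen.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented values for five hypothetical PDTC models.
drug_sensitivity = [0.2, 0.5, 0.1, 0.9, 0.7]   # e.g. a scaled response score
gene_expression = [1.1, 2.3, 0.8, 4.0, 3.5]    # e.g. a scaled expression level

print(spearman(drug_sensitivity, gene_expression))
```

With around 100 compounds and many candidate features, a screen of this kind involves many such comparisons, so any real analysis would also need correction for multiple testing.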

Collaborators: Sam Aparicio (University of British Columbia), Mathew Garnett and Ultan McDermott (Sanger Institute).