Visit the Balasubramanian lab website

Chemical biology of nucleic acids

Recent advances in understanding nucleic acid function have shown that alternative secondary structures and the chemical modification of nucleotide bases have key regulatory roles in diverse cellular processes, from transcription and translation to cell division and genome stability.

Genetic information is carried not only by the sequence of nucleic acids, but also by their secondary structures and chemical modifications. For example, guanine-rich sequences can form stable four‑stranded structures called G-quadruplexes (G4s), while certain cytosine bases in DNA can become methylated. We hypothesise that such alternative structures, or chemical modifications, have critical functions in normal cells and cancer. By identifying where base modifications and G4 structures are located in the cancer cell genome, and through the application of synthetic small molecules that selectively target G4 structures, we aim to understand the oncogenic process and develop novel approaches for potential use in treatment and diagnosis of cancer. We are also exploring new strategies to target the DNA binding activity of FOXM1, a key cancer-related transcription factor involved in cell cycle control.


In guanine (G)-rich regions, G bases can adopt stable arrangements to form four-stranded G4 structures comprising stacked G-tetrads (Figure 1).

Figure 1: G4 formation mediated by Hoogsteen hydrogen bonding between four guanines and co-ordinated by a central metal cation (left). Stacked G-tetrads in an intramolecular G4 (right) (see Balasubramanian et al., Nat Rev Drug Discovery 2011; 10: 261).

Sequences with potential to fold into a G4 structure are common in the human genome and many are found in cancer-related genes such as KIT, RAS and SRC. G4s are implicated in biological processes ranging from chromosome stability to the regulation of gene transcription, and we seek to understand their biological function and validity as drug targets.
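As a rough illustration of how sequences with potential to fold into a G4 might be located computationally, the sketch below scans one DNA strand with a simple pattern: four runs of at least three guanines separated by loops of one to seven bases. This pattern is a widely used heuristic and is an assumption here, not a definition taken from the text above; the function name find_pqs and the example sequence are likewise hypothetical. A match indicates only that a sequence could fold into a G4, not that it does so in cells.

```python
import re

# Assumed heuristic motif: four G-runs (>= 3 guanines each) separated by
# three loops of 1-7 bases. Only the strand given is scanned; a G4 on the
# opposite strand would appear here as a C-rich sequence and is not reported.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_pqs(sequence):
    """Return (start, matched_text) for each non-overlapping candidate."""
    seq = sequence.upper()
    return [(m.start(), m.group(0)) for m in PQS_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical example: human telomeric repeats flanked by other bases.
    example = "ACGT" + "GGGTTAGGGTTAGGGTTAGGG" + "ACGT"
    for start, hit in find_pqs(example):
        print(start, hit)
```

Because the search is non-overlapping, nested or overlapping candidates within a long G-rich region are reported as a single hit; a more thorough scan would need to enumerate those separately.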

We are identifying where in the genome G4s form and how they are regulated in cancer phenotypes. By synthesising small chemical probes and engineering antibodies to recognise G4 structures with high specificity and affinity, we have visualised G4 formation in the nuclei of cancer cells (Figure 2).

Figure 2: The G4 stabilising ligand, pyridostatin, leads to DNA damage at oncogenes including SRC. SRC gene structure is shown below (black).  The sites of DNA damage, indicated by the γH2AX marker, before (Unt) and after pyridostatin treatment (1) are shown above.  Predicted G4 sequences (PQS) in the SRC gene are indicated in purple (see Rodriguez et al., Nature Chem Biol. 2012; 8: 301).

Using chromatin immunoprecipitation and next‑generation sequencing (ChIP-Seq), we have localised G4 structures genome-wide in human DNA and shown that expression of G4-containing genes is modulated by small molecule ligands. Indeed, our G4-binding small molecule, pyridostatin (PDS), induces growth arrest in human cancer cells by activating a DNA damage response in which sites of DNA damage localise to several oncogenes, including SRC (Figure 2). PDS causes down-regulation of SRC expression and inhibition of SRC-mediated cellular motility. Furthermore, we have found that G4 DNA is a molecular target for synthetic lethality in cancer cells, since PDS acts synergistically when DNA repair pathways are inhibited or mutated. This work provides a novel framework for defining functional drug-DNA interactions and their potential use in cancer therapies.

During cell division, chromosome ends (telomeres) are protected from damage by recruitment of a protective protein complex called shelterin. Telomeres can form stable G4 structures in vitro and we have now used our antibody probe to demonstrate their presence at telomeres and across the genome in human cancer chromosomes (Figure 3).

Figure 3: Detection of G4 structures (red) in the nuclei (circles) of breast cancer cells using an engineered structure-specific antibody before (A) and after (B) stabilisation with pyridostatin. (C) Detection of G4 structures (red) in breast cancer metaphase chromosomes (blue); the arrow indicates localisation to the telomere.

As 85% of primary tumours show increased expression of the telomere maintenance enzyme telomerase, targeting telomeres may lead to cancer cell death. We have found that PDS treatment releases shelterin from telomeres leading to DNA damage and cell death. Telomeres are also known to be actively transcribed into a G4-containing telomeric RNA called TERRA. While TERRA forms stable G4s in vitro, it is not known whether this holds true in vivo. We have now demonstrated that shelterin proteins bind to TERRA via the G4 structure, and we are aiming to understand how this influences telomere function.

Predicted G4 structures are common in RNA, and their position suggests they have key roles in RNA biology. We have shown in vitro that a conserved G4 motif in the NRAS oncogene 5’-UTR modulates protein translation, and can be targeted by small molecule ligands. Despite this, the functional relevance of G4 structures in vivo is not known, nor is it known whether G4 structures normally form in RNA. Recently, using our antibody probes, we have provided evidence for the presence of RNA G4s in the cell cytoplasm. We have also used a ‘click-chemistry’ procedure to synthesise small molecules that are selective for RNA over DNA G4s and used these in cells to selectively stabilise cytoplasmic RNA G4s. We are now using such chemical biology tools, together with genome-wide approaches, to identify and map RNA G4s within the transcriptome and to understand their roles in RNA biology in cancer cells.

Epigenetics and Modified Bases

We are interested in understanding chemical modifications to DNA and the effect of such changes on the structure and function of DNA. DNA is made up of four bases – cytosine, guanine, adenine and thymine. However, these bases can naturally undergo chemical modification, leading to new bases. Changing one of the bases in a strand of DNA in this way alters its properties and function by controlling how the sequence is interpreted. This can affect how genes are switched on and off in different cell types, tissues and organs.

The modified base 5-methylcytosine (5mC) is a well-known epigenetic mark that can regulate transcription of the genome. Since 2009, three further modified bases have been detected in the mammalian genome. These are the TET-enzyme generated bases: 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). The presence of these modifications opens up questions as to their function in normal cellular biology and disease states.

We are developing chemical tools and genomic methods to map and elucidate the function of these modified bases (Figure 4).

Figure 4: Oxidative bisulfite sequencing of 5mC and 5hmC. By oxidising 5hmC to 5fC followed by bisulfite treatment, C and 5hmC are read as T while 5mC is read as C. Without oxidation, 5mC and 5hmC are both read as C following bisulfite treatment; thus 5hmC can be determined by subtraction (see Booth et al., Nat Protoc. 2013; 8: 1841).
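The read-out logic summarised in Figure 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the published analysis pipeline: the function names are hypothetical, the inputs are idealised per-site fractions of reads called as C, and clamping negative differences to zero is a simplification introduced here.

```python
def bisulfite_readout(base, oxidised):
    """Expected base call after bisulfite treatment for one cytosine state.

    base: "C", "5mC" or "5hmC"; oxidised: True if the sample was
    chemically oxidised (5hmC -> 5fC) before bisulfite treatment.
    """
    if base == "5mC":
        return "C"  # 5mC is read as C with or without oxidation
    if base == "5hmC":
        return "T" if oxidised else "C"  # read as T only after oxidation
    if base == "C":
        return "T"  # unmodified C is always read as T
    raise ValueError(f"unsupported base: {base}")


def estimate_levels(bs_c_fraction, oxbs_c_fraction):
    """Estimate (5mC, 5hmC) at one site from the fraction of reads called C.

    bs_c_fraction:   bisulfite only, reports 5mC + 5hmC
    oxbs_c_fraction: oxidation then bisulfite, reports 5mC only
    """
    five_mc = oxbs_c_fraction
    # Assumption: negative differences (sampling noise) are clamped to zero.
    five_hmc = max(0.0, bs_c_fraction - oxbs_c_fraction)
    return five_mc, five_hmc


if __name__ == "__main__":
    for base in ("C", "5mC", "5hmC"):
        print(base, bisulfite_readout(base, False), bisulfite_readout(base, True))
    print(estimate_levels(0.75, 0.5))
```

In practice the subtraction is applied to read counts with finite coverage, so a real analysis would also need a statistical treatment of sampling error, which this sketch omits.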

We are also exploring the molecular basis for their involvement in biological mechanisms. Part of this work exploits state-of-the-art genomics technologies. We have already created methods to quantitatively sequence 5mC, 5hmC and 5fC at single-base resolution. Such tools allow much more accurate study of these epigenetic marks.

The scope of our work will also include the identification, mapping and elucidation of the biological function of other base modifications in the DNA and RNA of various organisms.