The control and evolution of tissue-specific gene expression

The proteins that control DNA, known as transcription factors (TFs), bind to it in a combinatorial manner in yeast and bacteria, and our earliest work showed that this combinatorial binding occurs in mammalian tissues as well. Master regulators in primary human hepatocytes form a highly interconnected core circuitry and frequently bind promoter regions in clusters, particularly at highly regulated and transcribed genes (Odom et al., Mol Syst Biol 2006; 2: 2006.0017). We have recently found that transcriptional regulation diverges very rapidly in mammals (Schmidt et al., Science 2010; 328: 1036; Odom et al., Nat Genet 2007; 39: 730). Despite this rapid divergence, we found specific genetic architectures that appear to preserve a handful of transcription factor binding events across large evolutionary timescales (>300 million years) (Schmidt et al., Science 2010; 328: 1036). Even across short evolutionary distances, such as between closely related inbred strains of mice, TF binding diverges surprisingly faster than the underlying genetic sequences (Figure 1) (Stefflova et al., Cell 2013; 154: 530).

Figure 1: Evolution of transcriptional regulation.

In asking why most transcription factor binding events vary so rapidly, we realised that a number of factors could contribute. Possible causes include variability in genetic sequences, the types and number of marks left on the histone proteins that package DNA (commonly thought of as an epigenetic code), or even differences in diet or environment between species. To isolate a single one of these variables, we used a previously created mouse model of Down’s syndrome that carries a virtually complete copy of a human chromosome (O’Doherty et al., Science 2005; 309: 2033). By exploiting this aneuploid mouse strain, a unique and powerful genetic tool designed for an entirely different purpose, we determined that genetic sequence dominates other factors in directing transcription (Wilson et al., Science 2008; 322: 434). More recently, we have used this mouse to investigate how human-specific repetitive elements contain latent regulatory potential that is unmasked in the heterologous mouse environment (Ward et al., Mol Cell 2013; 49: 262).

Evolution of transcriptional regulation

In the last five years, our lab has shown that the relationships between transcription factor functionality and the conservation of transcription factor binding in vertebrates are complex and difficult to predict (Ballester et al., 2014; Schmidt et al., 2010b; Stefflova et al., 2013). We previously showed that DNA sequence variation is the ultimate driver of regulatory evolution. To do so, we used an existing mouse model of Down’s syndrome carrying human chromosome 21, which exposes human genetic sequence to mouse diet, lifestyle, epigenetic machineries, developmental processes, and nuclear concentrations of transcription factors (Wilson et al., 2008). Our more recent exploration using the Tc1 mouse has afforded insight into the co-evolution of species-specific repeats and the control mechanisms that restrain them (Ward et al., 2013). Our laboratory has recently reported the first large-scale analysis of mammalian enhancer evolution, using functional genomics approaches to profile regulatory activity in tissues from twenty species of mammals (Villar et al., 2015).

Non-coding RNAs – tRNAs and lncRNAs

Our work has broadened our understanding of the evolution and control of the noncoding genome. For example, by investigating how RNA polymerase III controls tRNAs in multiple mammals, we have discovered that the polymerases responsible for gene expression appear to be under constraint at the level of their transcripts (Kutter et al., 2011), the mechanisms for which we are continuing to investigate (Schmitt et al., 2014). In addition, our work has shown how the rapid birth and death of lncRNAs strongly influence transcription of nearby genes (Kutter et al., 2012). Alongside this, we have recently shown that the nuclear lncRNA GNG12-AS1 regulates the tumour suppressor DIRAS3 and also independently affects MET signalling and cell migration (Stojic et al., 2015).

Figure 2: Non-coding RNAs – tRNAs and lncRNAs.

CTCF and Cohesin

We have also explored how repetitive element expansions have been actively remodelling the genomes of most mammalian lineages for hundreds of millions of years by carrying CTCF binding into tens of thousands of new locations (Schmidt et al., 2012), and which genomic features dictate the conservation of CTCF binding in primates (Schwalie et al., 2013). In addition, we have explored how TF binding evolution and gene expression appear to be evolutionarily decoupled (Wong et al., 2015), and the roles that cohesin can play in connecting TF binding at enhancers with the target proximal promoters (Faure et al., 2012; Merkenschlager and Odom, 2013; Schmidt et al., 2010a).

Mechanisms connecting transcription and cancer

Our lab is currently undertaking a systematic, large-scale project to dissect how tissue-specific regulatory networks and epigenetics help guide the evolution of liver cancer.

Figure 3: Mechanisms connecting transcription and cancer.