AP Morgan, JM Holt, RC McMullan, TA Bell, AM-F Clayshulte, JP Didion, L Yadgary, D Thybert, DT Odom, P Flicek, L McMillan, FP-M de Villena
Gene duplication and loss are major sources of genetic polymorphism in populations, and are important forces shaping the evolution of genome content and organization. We have reconstructed the origin and history of a 127-kbp segmental duplication, R2d, in the house mouse (Mus musculus). R2d contains a single protein-coding gene, Cwc22. De novo assembly of both the ancestral (R2d1) and the derived (R2d2) copies reveals that they have been subject to nonallelic gene conversion events spanning tens of kilobases. R2d2 is also a hotspot for structural variation: its diploid copy number ranges from zero in the mouse reference genome to >80 in wild mice sampled from around the globe. Hemizygosity for high copy-number alleles of R2d2 is associated in cis with meiotic drive; suppression of meiotic crossovers; and copy-number instability, with a mutation rate in excess of 1 per 100 transmissions in some laboratory populations. Our results provide a striking example of allelic diversity generated by duplication and demonstrate the value of de novo assembly in a phylogenetic context for understanding the mutational processes affecting duplicate genes.