LE Palmer, AL O'Shaughnessy, RR Preston, L Santos, VS Balija, LU Nascimento, TL Zutavern, PS Henthorn, GJ Hannon, WR McCombie
We have initially sequenced approximately 8,000 canine expressed sequence tags (ESTs) from several complementary DNA (cDNA) libraries: testes, whole brain, and Madin-Darby canine kidney (MDCK) cells. Analysis of these sequences shows that they provide partial sequence information for about 5%-10% of canine genes. An analysis pipeline has been created to cluster the ESTs and to map both individual and clustered ESTs to the human genome and the human proteome. Gene ontology (GO) terms have been assigned to the ESTs and clusters based on their top matches to the International Protein Index (IPI) set of human proteins. The resulting data are stored in a MySQL relational database for analysis and display. A Web-based Perl script has been written to display the analyzed data to the scientific community.
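The GO assignment step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that similarity-search hits are already available as (query, subject, score) tuples and that a protein-to-GO lookup table exists. The function names, the identifiers (`EST001`, `CLUSTER7`, `IPI_A`, `IPI_B`), and the scores are hypothetical placeholders chosen for illustration.

```python
def top_hits(hits):
    """Return the best-scoring subject for each query.

    hits: iterable of (query_id, subject_id, score) tuples (assumed format).
    """
    best = {}
    for query, subject, score in hits:
        # Keep a hit only if it is the first for this query or beats the current best.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def assign_go_terms(hits, protein_to_go):
    """Give each EST or cluster the GO terms of its top-matching protein."""
    return {
        query: sorted(protein_to_go.get(subject, ()))
        for query, subject in top_hits(hits).items()
    }


# Toy input: placeholder identifiers and scores, not real data.
hits = [
    ("EST001", "IPI_A", 210.0),
    ("EST001", "IPI_B", 95.5),
    ("CLUSTER7", "IPI_B", 340.0),
]
protein_to_go = {
    "IPI_A": {"GO:0005524"},
    "IPI_B": {"GO:0003677", "GO:0006355"},
}

annotations = assign_go_terms(hits, protein_to_go)
```

In this sketch a query whose top match has no GO annotation receives an empty list rather than being dropped, so unannotated ESTs remain visible in downstream tables.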