Site identification in high-throughput RNA-protein interaction data.

Abstract:: MOTIVATION: Post-transcriptional and co-transcriptional regulation is a crucial link between genotype and phenotype. The central players are the RNA-binding proteins, and experimental technologies [such as cross-linking with immunoprecipitation- (CLIP-) and RIP-seq] for probing their activities have advanced rapidly over the past decade. Statistically robust, flexible computational methods for binding site identification from high-throughput immunoprecipitation assays are, however, largely lacking. RESULTS: We introduce a method for site identification which provides four key advantages over previous methods: (i) it can be applied to all variations of CLIP and RIP-seq technologies, (ii) it accurately models the underlying read-count distributions, (iii) it allows external covariates, such as transcript abundance (which we demonstrate is highly correlated with read count), to inform the site identification process and (iv) it allows for direct comparison of site usage across cell types or conditions. AVAILABILITY AND IMPLEMENTATION: We have implemented our method in a software tool called Piranha. Source code and binaries, licensed under the GNU General Public License (version 3), are freely available for download from http://smithlab.usc.edu. CONTACT: andrewds@usc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Authors:: PJ Uren, E Bahrami-Samani, SC Burns, M Qiao, FV Karginov, E Hodges, GJ Hannon, JR Sanford, LOF Penalva, AD Smith

Themes