ATL Lun, S Riesenfeld, T Andrews, TP Dao, T Gomes, participants in the 1st Human Cell Atlas Jamboree, JC Marioni
Droplet-based single-cell RNA sequencing protocols have dramatically increased the throughput of single-cell transcriptomics studies. A key computational challenge when processing these data is to distinguish libraries derived from real cells from those derived from empty droplets. Here, we describe EmptyDrops, a new statistical method for calling cells from droplet-based data, based on detecting significant deviations from the expression profile of the ambient solution. Using simulations, we demonstrate that EmptyDrops has greater power than existing approaches while controlling the false discovery rate among detected cells. Our method also retains distinct cell types that would have been discarded by existing methods in several real data sets.
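
To make the idea of "detecting significant deviations from the ambient profile" concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a plain multinomial model for ambient counts, add-one smoothing of the ambient proportions, a Monte Carlo p-value based on the log-likelihood, and a Benjamini-Hochberg correction; all function names and the toy counts are hypothetical.

```python
import math
import random


def ambient_profile(low_count_rows, pseudo=1.0):
    """Estimate ambient gene proportions by pooling low-count barcodes.

    The add-one pseudo-count is an assumption of this sketch; it keeps
    every proportion strictly positive so that log() is defined.
    """
    sums = [pseudo] * len(low_count_rows[0])
    for row in low_count_rows:
        for g, c in enumerate(row):
            sums[g] += c
    total = sum(sums)
    return [s / total for s in sums]


def multinomial_loglik(counts, probs):
    """Log-probability of a count vector under a multinomial model."""
    ll = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        ll += c * math.log(p) - math.lgamma(c + 1)
    return ll


def monte_carlo_pvalue(counts, probs, n_iter, rng):
    """Fraction of ambient-simulated profiles at least as unlikely as counts."""
    observed = multinomial_loglik(counts, probs)
    genes = range(len(probs))
    hits = 0
    for _ in range(n_iter):
        sim = [0] * len(probs)
        for g in rng.choices(genes, weights=probs, k=sum(counts)):
            sim[g] += 1
        if multinomial_loglik(sim, probs) <= observed:
            hits += 1
    # Adding 1 to numerator and denominator avoids p-values of exactly zero.
    return (hits + 1) / (n_iter + 1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data (hypothetical): 40 low-count barcodes define the ambient profile.
low_rows = [[3, 2, 1, 0], [2, 2, 0, 1], [3, 1, 0, 1], [2, 1, 1, 0]] * 10
probs = ambient_profile(low_rows)

candidates = [
    [5, 5, 40, 50],    # profile very different from the ambient solution
    [50, 30, 10, 10],  # profile close to the ambient solution
]
rng = random.Random(0)
pvals = [monte_carlo_pvalue(c, probs, 200, rng) for c in candidates]
adjusted = benjamini_hochberg(pvals)
is_cell = [q <= 0.05 for q in adjusted]  # 0.05 is an arbitrary threshold here
```

In this toy setting, the first barcode is called as a cell because no simulated ambient profile is as unlikely as the observed one, whereas the second barcode is indistinguishable from the ambient solution. The number of Monte Carlo iterations bounds the smallest attainable p-value, so a real analysis would need far more iterations than used here.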