New tool allows for full-length reconstruction of RNA molecules in individual cells


By Sarah Small

UNIVERSITY PARK, Pa. — A new computational tool known as Scallop2 allows for accurate assembly of full-length RNA molecules at single-cell resolution, according to Penn State researchers. The results of this work were recently published in Nature Computational Science.

Researchers measure the expressed RNA molecules, which are products of genes, to investigate which genes might be involved in different biological processes in health or disease. Existing single-cell RNA-sequencing (scRNA-seq) technologies can provide fragments of the RNA molecules in individual cells, but not their full-length sequences, according to corresponding author Mingfu Shao, Charles K. Etner Early Career Assistant Professor of Computer Science and Engineering at Penn State. Because of this, computational methods are needed to reconstruct the full-length RNA molecules from the fragments produced by an scRNA-seq protocol.

Scallop2 improves assembly accuracy by using the barcode information that Smart-seq3 attaches to fragments. Smart-seq3 is an scRNA-seq protocol that generates fragments and assigns the same barcode to a subset of fragments originating from the same RNA molecule. Scallop2 models this barcode information and deploys a new algorithm to assemble the full-length sequences from their fragments. According to Shao, this is the first time that transcript assembly for barcode-enabled RNA-seq data has been abstracted and that novel algorithms, which are a key part of the new computational tool, have been designed for this purpose. He said that by allowing for the identification and quantification of RNA molecules at single-cell resolution, Scallop2 could facilitate a variety of fundamental biological and biomedical research, from studying gene functions to determining cell development and cancer progression to identifying biomarkers for disease diagnosis and treatment.
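As a toy illustration of the barcode idea described above, and not of Scallop2's actual algorithm, which this article does not detail, the short Python sketch below groups made-up fragments by a hypothetical barcode field. Fragments that share a barcode can then be treated as evidence from the same RNA molecule. The coordinates, barcode strings and function name are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical fragments on one gene: (barcode, start, end).
# A barcode of None stands for a fragment that carries no barcode.
fragments = [
    ("AAC", 100, 180),
    ("AAC", 400, 470),
    (None, 150, 230),
    ("GTT", 100, 160),
    ("GTT", 900, 980),
]

def group_by_barcode(frags):
    """Collect, per barcode, the sorted intervals of fragments sharing it."""
    groups = defaultdict(list)
    for barcode, start, end in frags:
        if barcode is not None:  # unbarcoded fragments are left out of the groups
            groups[barcode].append((start, end))
    return {bc: sorted(intervals) for bc, intervals in groups.items()}

groups = group_by_barcode(fragments)
for bc, intervals in groups.items():
    print(bc, intervals)
```

In this sketch, the two "AAC" fragments sit far apart on the gene, yet the shared barcode suggests they belong to one molecule. That is the kind of long-range link an assembler could use when choosing between candidate full-length sequences.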

Scallop2 has been released as open-source software.

In addition to Shao, who is also affiliated with the Penn State Huck Institutes of the Life Sciences, the other authors on the paper were Penn State computer science and engineering graduate students Qimin Zhang and Qian Shi. The National Science Foundation, the National Institutes of Health and the Charles K. Etner Career Development Professorship funded this work. Initial algorithmic exploration of Scallop2 advancements was conducted with Carl Kingsford and was funded by the Gordon and Betty Moore Foundation and the NIH.




College of Engineering Media Relations