Publications
A host subtraction database for virus discovery in human cell line sequencing data
Miller JR, Dilley KA, Harkins DM, Stockwell TB, Shabman RS, Sutton GG
PMID: 31231504
Abstract
The human cell lines HepG2, HuH-7, and Jurkat are commonly used for amplification of the RNA viruses present in environmental samples. To assist with assays by RNAseq, we sequenced these cell lines and developed a subtraction database that contains sequences expected in sequence data from uninfected cells. RNAseq data from cell lines infected with Sendai virus were analyzed to test host subtraction. Mapping RNAseq reads to our subtraction database vastly reduced the number of non-viral reads in the dataset, allowing for efficient secondary analyses.
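
The host subtraction idea described in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes an aligner has already reported which read IDs map to the subtraction database, and the function name `subtract_host`, the read IDs, and the sequences are all hypothetical.

```python
def subtract_host(reads, host_mapped_ids):
    """Keep only reads whose IDs did not map to the host subtraction database.

    reads: dict of read ID -> sequence (hypothetical toy input)
    host_mapped_ids: set of read IDs an aligner reported as mapping to the host
    """
    return {rid: seq for rid, seq in reads.items() if rid not in host_mapped_ids}


# Toy data: three reads, two of which are assumed to have mapped to the host database.
reads = {"r1": "ACGT", "r2": "GGCC", "r3": "TTAA"}
host_mapped = {"r1", "r3"}

remaining = subtract_host(reads, host_mapped)
print(remaining)
```

Only the reads left in `remaining` would be passed to secondary analyses such as viral classification or assembly.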