FastqPuri: high-performance preprocessing of RNA-seq data

Pérez-Rubio, P.; Lottaz, C.; Engelmann, J.C.

doi:10.1186/s12859-019-2799-0

Pérez-Rubio, P.; Lottaz, C.; Engelmann, J.C. (2019). FastqPuri: high-performance preprocessing of RNA-seq data. BMC Bioinformatics 20(1): 226. https://dx.doi.org/10.1186/s12859-019-2799-0

Associated data:

https://doi.org/10.4121/uuid:9d88ee8d-ceda-4d7e-8109-1cfcd2892632
https://doi.org/10.4121/uuid:b1c4ee4f-9b88-493f-81d8-4040f0d1af25

In: BMC Bioinformatics. BioMed Central: London. e-ISSN 1471-2105

Author keywords

fastq; RNA-seq; Quality control; Preprocessing; Sequence data

Authors

Pérez-Rubio, P.; Lottaz, C.; Engelmann, J.C.

Abstract

Pérez-Rubio et al. BMC Bioinformatics (2019) 20:226, https://doi.org/10.1186/s12859-019-2799-0. Software, Open Access. Authors: Paula Pérez-Rubio, Claudio Lottaz and Julia C. Engelmann.

Background: RNA sequencing (RNA-seq) has become the standard means of analyzing gene and transcript expression in high-throughput. While previously sequence alignment was a time demanding step, fast alignment methods and even more so transcript counting methods which avoid mapping and quantify gene and transcript expression by evaluating whether a read is compatible with a transcript, have led to significant speed-ups in data analysis. Now, the most time demanding step in the analysis of RNA-seq data is preprocessing the raw sequence data, such as running quality control and adapter, contamination and quality filtering before transcript or gene quantification. To do so, many researchers chain different tools, but a comprehensive, flexible and fast software that covers all preprocessing steps is currently missing.

Results: We here present FastqPuri, a light-weight and highly efficient preprocessing tool for fastq data. FastqPuri provides sequence quality reports on the sample and dataset level with new plots which facilitate decision making for subsequent quality filtering. Moreover, FastqPuri efficiently removes adapter sequences and sequences from biological contamination from the data. It accepts both single- and paired-end data in uncompressed or compressed fastq files. FastqPuri can be run stand-alone and is suitable to be run within pipelines. We benchmarked FastqPuri against existing tools and found that FastqPuri is superior in terms of speed, memory usage, versatility and comprehensiveness.

Conclusions: FastqPuri is a new tool which covers all aspects of short read sequence data preprocessing. It was designed for RNA-seq data to meet the needs for fast preprocessing of fastq data to allow transcript and gene counting, but it is suitable to process any short read sequencing data of which high sequence quality is needed, such as for genome assembly or SNV (single nucleotide variant) detection. FastqPuri is most flexible in filtering undesired biological sequences by offering two approaches to optimize speed and memory usage dependent on the total size of the potential contaminating sequences. FastqPuri is available at https://github.com/jengelmann/FastqPuri. It is implemented in C and R and licensed under GPL v3.
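
To make the quality-filtering step mentioned in the abstract concrete, here is a minimal Python sketch of reading fastq records and keeping only reads above a mean quality threshold. It is not FastqPuri's code or algorithm (FastqPuri is implemented in C and R). The function names, the Phred+33 quality encoding and the threshold of 20 are all assumptions chosen for illustration.

```python
import gzip
import io
from typing import Iterable, Iterator, TextIO, Tuple

# Hypothetical record type: (header, sequence, quality string)
Record = Tuple[str, str, str]


def open_fastq(path: str) -> TextIO:
    """Open an uncompressed or gzip-compressed fastq file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a fastq stream that uses four lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed fastq record: {header!r}")
        yield header, seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_by_quality(records: Iterable[Record],
                      min_mean: float = 20.0) -> Iterator[Record]:
    """Keep records whose mean Phred score is at least min_mean."""
    for rec in records:
        if mean_phred(rec[2]) >= min_mean:
            yield rec


# Two in-memory reads: 'I' encodes Q40 and '!' encodes Q0 under Phred+33.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n")
kept = [header for header, _, _ in filter_by_quality(read_fastq(demo))]
print(kept)  # ['@r1']
```

A mean-quality cutoff is only one possible filtering criterion, used here because it is the simplest to show. To run the same filter on a file, pass `open_fastq(path)` to `read_fastq` in place of the in-memory stream.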
