The large volumes of data generated by modern sequencing experiments present significant challenges for manipulation and analysis, and traditional approaches often prove difficult to scale. We describe our ongoing work on SeqPig, a tool that facilitates the use of the Pig Latin distributed scripting language to manipulate, analyze and query sequencing data, applying the advances motivated by the “big data revolution” in data-intensive activities. SeqPig provides access to popular data formats and implements a number of custom sequencing-specific functions. Most importantly, it grants users access to the scalable Hadoop platform from a high-level scripting language.
Scaling up production in medium and large high-throughput sequencing facilities presents a number of...
Background: While next-generation sequencing (NGS) technologies are rapidly advancing, an area that ...
Genome sequencing technology has been improved intensely, but the number of bases generated by moder...
Analysis of high volumes of data has always been performed with distributed computing on computer cl...
The dramatic progress in DNA sequencing technology over the last decade, with the revolutionary int...
Contributed presentation to the 14th Bioinformatics Open Source Conference. 2013-07, Berlin. 14th Bioinfo...
BACKGROUND: Whole-genome sequencing (WGS) and whole-exome sequencing (WES) technologies are increasi...
The continuous increase in sequencing throughput imposes a new generation of tools for data processi...
Background: Since the introduction of next-generation DNA sequencers the rapid increase in s...
Large-scale sequencing techniques to chart genomes are entirely consolidated. Stable computational m...
Over the past decade, the evolution of next-generation sequencing technology has considerably advanc...
Many time-consuming analyses of next-generation sequencing data can be addressed with modern clou...
Next-generation sequencing technologies are increasing our ability to study genome function. A new a...
Collana seminari interni 2012, Number 20120418. In this seminar, we explore the Hadoop MapReduce fram...