Abstract. Background: A basic task in bioinformatics is the counting of k-mers in genome sequences. Existing k-mer counting tools are most often optimized for small k < 32 and suffer from excessive memory resource consumption or degrading performance for large k. However, given the technology trend towards long reads of next-generation sequencers, support for large k becomes increasingly important. Results: We present the open source k-mer counting software Gerbil that has been designed for the efficient counting of k-mers for k ≥ 32. Our software is the result of an intensive process of algorithm engineering. It implements a two-step approach. In the first step, genome reads are loaded from disk and redistributed to temporary files. In a seco...
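The two-step idea described above can be illustrated with a minimal sketch. It is not Gerbil's implementation: the number of bins, the CRC32-based assignment of k-mers to bins, and the function names `partition_kmers`, `count_partitions` and `count_kmers_two_step` are assumptions made for illustration. Because the description of the second step is cut off in the abstract, the sketch simply assumes that each temporary file is counted with an in-memory hash table.

```python
import os
import tempfile
import zlib
from collections import Counter


def partition_kmers(reads, k, num_bins, tmp_dir):
    """Step 1: write every k-mer of every read to one of num_bins temporary files."""
    paths = [os.path.join(tmp_dir, f"bin_{i}.txt") for i in range(num_bins)]
    files = [open(path, "w") for path in paths]
    try:
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                # Assumption: CRC32 picks the bin, so equal k-mers share a file.
                bin_id = zlib.crc32(kmer.encode("ascii")) % num_bins
                files[bin_id].write(kmer + "\n")
    finally:
        for handle in files:
            handle.close()
    return paths


def count_partitions(paths):
    """Step 2 (assumed): count each temporary file on its own, then merge."""
    total = Counter()
    for path in paths:
        with open(path) as handle:
            total.update(line.strip() for line in handle)
    return total


def count_kmers_two_step(reads, k, num_bins=4):
    """Run both steps, using a throwaway directory for the temporary files."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = partition_kmers(reads, k, num_bins, tmp_dir)
        return count_partitions(paths)
```

Since all copies of a k-mer land in the same file, each file can be counted without consulting the others. This sketch merges the per-file counts in memory for convenience; a tool aiming at low memory use would more plausibly write each file's counts out separately.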
The impending advent of population-scaled sequencing cohorts involving tens of millions of individua...
Genome analysis benefits precise medical care, wildlife conservation, pandemic treatment, e.g., COVI...
Summary: Counting all the k-mers (substrings of length k) in DNA/RNA sequencing reads is the prelimi...
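As a concrete illustration of the definition above, the following is a minimal in-memory sketch of k-mer counting with a sliding window. The function name `count_kmers` is hypothetical, and the sketch makes simplifying assumptions: reads are plain strings, reverse complements are not merged, and ambiguous bases are not filtered.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every substring of length k across all reads."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 k-mers (none if n < k).
        counts.update(read[i:i + k] for i in range(len(read) - k + 1))
    return counts
```

For example, `count_kmers(["AAAA"], 2)` sees the window `AA` at three positions and so reports a count of 3 for it.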
k-mer counting is a popular pre-processing step in many bioinformatic algorithms. KMC2 is one of the...
Motivation: Building the histogram of occurrences of every k-symbol long substring of nucleotide dat...
The emergence of Next Generation Sequencing (NGS) platforms has increased the throughput of genomic ...
k-mer counting is an essential algorithm found in many genomic related processes. It may seem like a...
A fundamental step in many bioinformatics computations is to count the frequency of fixed-length seq...
Bioinformatics journal requires that we post only the pre-print, which does not include modification...
Additional file 1. The additional file contains links to all test data sets used in the experiments ...
Motivation: A major challenge in next-generation genome sequencing (NGS) is to assemble massive ove...
Over the past few years, DNA sequencing technology has been advancing at such a fast pace that compu...
Fast and robust algorithms and aligners have been developed to help the researchers in the analysis ...
Abstract: Genomics data analysis requires efficient tools to address the vast amount of data generate...