Abstract. Evaluation of the expected frequency of occurrences of a given set of patterns in a DNA sequence has numerous applications and has been extensively studied recently. We provide a unified framework for this evaluation that adapts to various constraints and allows previous results to be extended. We assume successively that the patterns may, then may not, overlap. We derive exact formulae for the moments in a Markovian model, which are linear functions of the size of the sequence. We show that our formulae, which occasionally simplify previous results, are computable at low cost, which makes them useful for practical applications.
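As an illustration of why such an expectation is linear in the sequence size, here is a minimal sketch for the simplest case only: a single word, overlapping occurrences counted, and a stationary first-order Markov chain. It is not the paper's formulae; it uses the standard identity E[N] = (n - m + 1) · π(w₁) · ∏ P(wᵢ, wᵢ₊₁). The function names, the dictionary layout of the transition matrix `P`, and the power-iteration approximation of the stationary distribution are assumptions made for this example, and the chain is assumed to be ergodic.

```python
def stationary_distribution(P, alphabet, iterations=1000):
    """Approximate the stationary distribution of an ergodic chain by power iteration.

    P is a nested dict (an assumed layout): P[a][b] is the probability of
    moving from letter a to letter b.
    """
    pi = {a: 1.0 / len(alphabet) for a in alphabet}
    for _ in range(iterations):
        pi = {b: sum(pi[a] * P[a][b] for a in alphabet) for b in alphabet}
    return pi


def expected_overlapping_count(word, n, P, pi):
    """Expected number of (possibly overlapping) occurrences of `word`
    in a stationary first-order Markov sequence of length n."""
    m = len(word)
    if m == 0 or n < m:
        return 0.0
    # Probability that the word starts at any fixed position.
    prob = pi[word[0]]
    for a, b in zip(word, word[1:]):
        prob *= P[a][b]
    # There are n - m + 1 possible start positions, hence linearity in n.
    return (n - m + 1) * prob


if __name__ == "__main__":
    alphabet = "ACGT"
    # Uniform transitions: this reduces to an i.i.d. model with equal letter frequencies.
    P = {a: {b: 0.25 for b in alphabet} for a in alphabet}
    pi = stationary_distribution(P, alphabet)
    print(expected_overlapping_count("ACG", 100, P, pi))  # 98 / 64 = 1.53125
```

Higher moments, sets of patterns, and non-overlapping counts need the correlation structure between occurrences and are not covered by this sketch.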
A problem in biology arises in the evaluation of statistical significance of the observed frequency ...
To establish lists of words with unexpected frequencies in random sequences, for instance in a molec...
Motivation: Next Generation Sequencing (NGS) technologies generate large amounts of short read data ...
Consider a given pattern H and a random text T generated by a Markovian source. We study the frequen...
We present algorithms for the exact computation of the prob...
In the following, an overview is given on statistical and probabilistic propert...
This work investigates frequency distributions of strings within a text. The mathematical derivation...
Probability models are employed in the analysis of data emerging from DNA sequence studies. To seque...
In this paper, we give an overview of the different results existing on the...
Consider a given pattern H and a random text T generated by a Markovian source of any order. We stud...
BACKGROUND: In bioinformatics it is common to search for a pattern of interest...
Using recent results on the occurrence times of a string of symbols in a stochastic process with mix...
Identifying a word (pattern) in a long sequence of letters is not an easy task...
The distribution of the distance between two (or more) successive occurrences of a specific word in ...