Motivated by mass-spectrometry protein sequencing, we consider the problem of reconstructing a string from the multiset of its substring compositions. We show that all strings whose length is 7, one less than a prime, or one less than twice a prime can be reconstructed uniquely up to reversal. For all other lengths, we show that unique reconstruction is not always possible, and we provide bounds, tight in some cases, on the largest number of strings that share the same substring compositions. The lower bounds are derived by combinatorial arguments, while the upper bounds follow from algebraic approaches that lead to precise characterizations of the sets of strings with the same substring compositions in terms of the factorization properties of bivariate polynomial...
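The reconstruction claim can be checked by brute force for small lengths. The sketch below is only an illustration, not the algebraic method of the paper: it assumes a binary alphabet (the abstract does not state the alphabet), takes the composition of a substring to be its character counts, and uses the hypothetical helper names `composition_multiset` and `confusable_classes`.

```python
from collections import Counter
from itertools import product


def composition_multiset(s):
    """Return the multiset of compositions of all contiguous substrings of s.

    A composition is taken here to be the character counts of a substring,
    stored as a frozenset of (character, count) pairs so it can be hashed.
    """
    multiset = Counter()
    for start in range(len(s)):
        counts = Counter()
        for end in range(start, len(s)):
            counts[s[end]] += 1
            multiset[frozenset(counts.items())] += 1
    return multiset


def confusable_classes(n, alphabet="01"):
    """Group all strings of length n by composition multiset.

    Each string is identified with its reversal, and only the groups that
    still contain more than one string are returned. The alphabet default
    is an assumption made for this sketch.
    """
    classes = {}
    for chars in product(alphabet, repeat=n):
        s = "".join(chars)
        key = frozenset(composition_multiset(s).items())
        # A string and its reversal always share a key, so keep one
        # canonical representative of the pair.
        classes.setdefault(key, set()).add(min(s, s[::-1]))
    return [sorted(group) for group in classes.values() if len(group) > 1]


if __name__ == "__main__":
    # Length 7 is covered by the uniqueness result; length 8 is not.
    print(confusable_classes(7))
    print(len(confusable_classes(8)))
    # A pair of length-8 strings that are not reversals of each other.
    print(composition_multiset("01001101") == composition_multiset("01101001"))
```

Under these assumptions, length 8 is the smallest length not covered by the uniqueness result, since 9 is not prime and not twice a prime. The pair `01001101` and `01101001` can be checked by hand to have identical window sums at every substring length, although neither string is the reversal of the other.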
Abstract. Given a finite set of strings X, the Longest Common Subsequence problem (LCS) consists in fi...
We introduce a new concept of randomness for binary strings of finite length, reconstructive randomn...
Abstract. Various versions of the shortest common superstring problem play important roles in data com...
Abstract. Motivated by protein sequencing, we consider the problem of reconstructing a string from t...
Abstract. Motivated by the problem of deducing the structure of proteins using mass-spectrometry, we...
We consider the following problem: given a collection of strings s_1, ..., s_m, find the shortest stri...
In the reverse complement equivalence model, it is not possible to distinguish a string from its rev...
Abstract. In the reverse complement equivalence model, it is not possible to distinguish a string from...
This paper introduces a new family of reconstruction codes which is motivated by applications in DNA...
The problem of reconstructing strings from substring information has found many applications due to ...
We propose new frequent substring pattern mining which can enumerate all substrings with statistical...
A natural problem in extremal combinatorics is to maximize the number of distinct subsequences for a...
Abstract. Finding similar substrings/substructures is a central task in analyzing huge amounts of st...
Abstract. Given a finite set of strings X, the longest common subsequence problem (LCS) consists in ...
Given a string S over a finite alphabet Σ, the character set (also called the fingerprint) of a subs...