Because of its robustness to outliers, many pattern recognition algorithms use the median as the representative of a set of points. A special case arises in syntactic pattern recognition, where the points (prototypes) are represented by strings. Under the edit distance, however, finding the median string is an NP-hard problem, so either the search is restricted to the strings in the data (the set median) or some heuristic approach is applied. In this work we use the (conditional) stochastic edit distance instead of the plain edit distance. It is not yet known whether the problem is also NP-hard in this case, so an approximation algorithm is described. The algorithm is based on the extension of the string structure to multistrings (strings of stochastic...
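To make the set-median restriction mentioned in the abstract concrete, here is a minimal sketch in Python. It uses the plain (unit-cost Levenshtein) edit distance, not the stochastic edit distance or the multistring algorithm of the paper, and the function names `edit_distance` and `set_median` are illustrative assumptions rather than anything defined in the source.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def set_median(strings: Sequence[str]) -> str:
    """Return the string of the set with the smallest summed distance to all others.

    The search is restricted to the given strings, so the cost is quadratic
    in the number of strings instead of requiring a search over all strings.
    """
    if not strings:
        raise ValueError("set_median requires at least one string")
    return min(strings, key=lambda c: sum(edit_distance(c, s) for s in strings))
```

For example, `set_median(["abc", "abd", "abx", "abd"])` selects `"abd"`, whose summed distance to the set is 2, against 3 for the other candidates. The result is always a member of the input set, which is exactly the limitation that motivates searching for a better approximation of the true median.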
In this paper we present a foundational basis for optimal and information theoretic syntactic patter...
We consider a string edit problem in a probabilistic framework. This problem is of considerable inte...
We compare different statistical characterizations of a set of strings, for th...
In a number of fields, it is necessary to compare a witness string with a distribution. One possibil...
This paper presents a new algorithm that can be used to compute an approximation to the median of a ...
Given a set of strings, the problem of finding a string that minimises its distance to the s...
Many pattern recognition algorithms are based on the nearest neighbour search and use the we...
In many applications, it is necessary to determine the similarity of two strings. A widely-used noti...
Many real-world applications such as spell-checking or DNA analys...
We consider a string edit problem in a probabilistic framework. This problem is of considerable inte...
In a number of fields one is to compare a witness string with a distribution. One possibility is to ...
Given a finite set of strings, the Median String problem consists in finding a string that m...
We consider a string editing problem in a probabilistic framework. This problem is of considerable i...
In many applications, it is necessary to determine the similarity of two strings. A widely-...
The distance of a string from a set of strings is defined by the sum of distances to the strings of ...