String similarity is most often measured by weighted or unweighted edit distance d(x, y). Ristad and Yianilos (1998) defined stochastic edit distance, a probability distribution p(y | x) whose parameters can be trained from data. We generalize this so that the probability of choosing each edit operation can depend on contextual features. We show how to construct and train a probabilistic finite-state transducer that computes our stochastic contextual edit distance. To illustrate the improvement from conditioning on context, we model typos found in social media text.
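To make the notion of a distribution p(y | x) over edited strings concrete, the following is a minimal sketch of a memoryless (context-independent) conditional edit model, scored with a forward dynamic program. It is not the contextual transducer described above: the function names, the uniform choice of inserted and substituted symbols, and the parameter values `p_ins`, `p_del`, and `p_sub` are all assumptions made for illustration.

```python
from itertools import product


def stochastic_edit_prob(x, y, alphabet, p_ins=0.1, p_del=0.1, p_sub=0.1):
    """Return p(y | x) under a memoryless edit model (illustrative assumption).

    At each step the model inserts a uniformly chosen symbol with probability
    p_ins; otherwise it consumes the next symbol of x (delete, substitute, or
    copy), or stops if x is exhausted.
    """
    symbols = set(alphabet)
    k = len(symbols)
    if k < 2 or not set(x) <= symbols:
        raise ValueError("need at least 2 symbols, and x must use the alphabet")
    if not set(y) <= symbols:
        return 0.0  # y contains a symbol the model can never emit
    p_copy = 1.0 - p_del - p_sub
    n, m = len(x), len(y)
    # alpha[i][j]: total probability of all edit sequences that have
    # consumed x[:i] and emitted y[:j].
    alpha = [[0.0] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if j > 0:  # insertion of y[j-1]
                total += alpha[i][j - 1] * p_ins / k
            if i > 0:  # deletion of x[i-1]
                total += alpha[i - 1][j] * (1 - p_ins) * p_del
            if i > 0 and j > 0:  # copy or substitution
                edit = p_copy if x[i - 1] == y[j - 1] else p_sub / (k - 1)
                total += alpha[i - 1][j - 1] * (1 - p_ins) * edit
            alpha[i][j] = total
    return alpha[n][m] * (1 - p_ins)  # final factor: choose to stop


def total_mass(x, alphabet, max_len):
    """Sum p(y | x) over every y of length <= max_len (sanity check)."""
    return sum(
        stochastic_edit_prob(x, "".join(t), alphabet)
        for length in range(max_len + 1)
        for t in product(alphabet, repeat=length)
    )
```

Because every step of the generative process is locally normalized, the probabilities of all outputs y sum to one, which `total_mass` checks approximately by enumerating short strings. A contextual variant would replace the fixed constants with probabilities that depend on features of the surrounding input and output symbols.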
Abstract. Trees provide a suited structural representation to deal with complex tasks such as web in...
The need to measure sequence similarity arises in information extraction, object identity, data mini...
Many real-world applications such as spell-checking or DNA analys...
In many applications, it is necessary to determine the similarity of two strings. A widely-used noti...
Many pattern recognition algorithms are based on the nearest neighbour search and use the we...
In a number of fields, it is necessary to compare a witness string with a distribution. One possibil...
In a number of fields one is to compare a witness string with a distribution. One possibility is to ...
We consider a string edit problem in a probabilistic framework. This problem is of considerable inte...
We consider a string editing problem in a probabilistic framework. This problem is of considerable i...
During the past few years, several works have been done to derive string kernels from probability di...