The paper discusses the application of a similarity metric based on compression to the measurement of the distance among Bulgarian dialects. The similarity metric is defined on the basis of the notion of Kolmogorov complexity of a file (or binary string). The application of Kolmogorov complexity in practice is not possible because its calculation over a file is an undecidable problem. Thus, the actual similarity metric is based on a real-life compressor which only approximates the Kolmogorov complexity. To use the metric for distance measurement of Bulgarian dialects we first represent the dialectological data in such a way that the metric is applicable. We propose two such representations which are compared to a baseline distance between d...
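To make the idea of a compressor standing in for Kolmogorov complexity concrete, here is a minimal sketch. It assumes the widely used normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size; the abstract does not state which formula or which compressor the paper uses, so both the formula and the choice of `zlib` are assumptions. The word lists are invented toy strings, not real dialect data, and the names `compressed_size` and `ncd` are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (a stand-in for Kolmogorov complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Assumed formula: (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Invented toy "word lists" for three hypothetical sites (not real data).
    site_a = "mleko hleb voda zemya".encode("utf-8")
    site_b = "mlyako hlyab voda zemya".encode("utf-8")
    site_c = "quartz jigsaw pyx fjord".encode("utf-8")

    print("a vs a:", ncd(site_a, site_a))
    print("a vs b:", ncd(site_a, site_b))
    print("a vs c:", ncd(site_a, site_c))
```

On very short inputs the fixed overhead of the compressor dominates the result, so a real experiment would presumably concatenate many word forms per site before compressing; how the data should be laid out is exactly the representation question the abstract raises.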
Information distance is a parameter-free similarity measure based on compression, used in pattern re...
We examine various string distance measures for suitability in modeling dialect distance, especially...
This paper proposes a simple metric of dialect distance, based on the ratio between identical word p...
We present a new similarity measure based on information theoretic measures which is superior to N...
In this paper a range of methods for measuring the phonetic distance between dialectal variants are ...
Dialect classification is a classical problem in traditional dialectology. In the course of the last...
Normalized information distance (NID) uses the theoretical notion of Kolmogorov complexity, ...
First we consider pair-wise distances for literal objects consisting of finite binary files. These f...
The normalized information distance is a universal distance measure for objects of all kinds. It is ...
Dialectometry is a multidisciplinary field that uses quantitative methods in the analysis of dialect...
The Levenshtein distance is an established metric to represent phonological distances between dialec...