We have applied the generalised and universal distance measure NCD (Normalised Compression Distance) to the problem of determining the type of file fragments. To enable later comparison of the results, the algorithm was applied to fragments of a publicly available corpus of files. NCD, combined with k-nearest-neighbour classification (k ranging from one to ten), was applied to a random selection of circa 3000 512-byte file fragments from 28 different file types. This procedure was then repeated ten times. While the overall accuracy of the n-valued classification only improved on the prior probability of approximately 3.5% to reach circa 32%–36%, the classifier reached accuracies of circa 70% for the most successful fil...
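The combination of NCD and k-nearest-neighbour described in the abstract can be illustrated with a minimal sketch. It uses the standard NCD definition, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The abstract does not say which compressor was used, so `zlib` is an assumption here, and the function names `compressed_size`, `ncd`, and `knn_classify` are hypothetical, chosen only for illustration.

```python
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    # Length of the zlib-compressed data (zlib is an assumed compressor,
    # not necessarily the one used in the study).
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    # Standard Normalised Compression Distance between two byte strings.
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def knn_classify(fragment: bytes, labelled: list[tuple[bytes, str]], k: int = 1) -> str:
    # Rank the labelled fragments by NCD to the query and take a
    # majority vote among the k nearest ones.
    nearest = sorted(labelled, key=lambda item: ncd(fragment, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In this sketch, `labelled` would hold (fragment, file type) pairs for 512-byte fragments of known type, and `k` would be varied from one to ten as in the abstract. Real compressors add fixed overhead on inputs this short, so the computed distances are only approximations of the idealised measure.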
Normalized information distance (NID) uses the theoretical notion of Kolmogorov complexity, which fo...
Genomic sequences are usually compared using evolutionary distance, a procedure that implies the al...
Part 4: FILESYSTEM FORENSICS. International audience. The first step when recovering deleted files using...
The huge amount of information stored in text form makes methods that deal with texts really interes...
We present a new method for clustering based on compression. The method doesn't use subject-spe...
The normalized information distance is a universal distance measure for objects of all kinds. It is ...
Normalized compression distance (NCD) is a parameter-free, feature-free, alignment-free, similarity ...
The normalised compression distance measures the mutual compressibility of two signals. We show that...
We present a new similarity measure based on information theoretic measures which is superior to N...
The paper discusses the application of a similarity metric based on compression to the measurement o...
A local distance measure for the nearest neighbor classification rule is shown to achieve high comp...