Over the past few years, the popularity of approximate matching algorithms (a.k.a. fuzzy hashing) has increased. Especially within the area of bytewise approximate matching, several algorithms have been published, tested, and improved. These algorithms have been shown to be powerful; however, they are sometimes too precise for real-world investigations. That is, even very small commonalities (e.g., in the header of a file) can cause a match. While this is a desired property, it may also lead to unwanted results. In this paper, we show that simple pre-processing can significantly influence the outcome. Although our test set is based on text-based file types (because they are easy to process), this technique can be used for other, well...
Forensic investigations are often comparable to finding the needle in the haystack – the agents are ove...
Digital forensic investigators frequently have to search for relevant files in massive digital corpo...
Fuzzy matching in translation memories (TM) is mostly string-based in current CAT tools. These tools...
Hash functions are established and well-known in digital forensics, where they are commonly used for...
Handling hundreds of thousands of files is a major challenge in today’s digital forensics. In order ...
Bytewise approximate matching is a relatively new area within digital forensics, but its importance ...
Approximate Hash Based Matching (AHBM), also known as Fuzzy Hashing, is used to identify com...
Fuzzy hashing or similarity hashing (a.k.a. bytewise approximate matching) converts digital artifact...
Investigating seized devices within digital forensics gets more and more difficult due to the increa...
A challenge for digital forensic investigations is dealing with large amounts of data that need to b...
Bytewise approximate matching algorithms have in recent years shown significant promise in detecting...
The technical aspects of digital forensics are often dependent upon the progress made in other scien...