Abstract—Many techniques for text processing are based on efficient data storage and retrieval. Careful selection of the data structures and retrieval techniques plays a significant role in the efficiency of the whole data processing system. Hashing is a very frequently used technique with O(1) run-time complexity for data storage and retrieval. This paper presents a comparison of a new technique for hash function construction that does not need a division operation. The proposed technique is especially convenient for processing large textual data sets. The state of the art in hashing of textual data is given (perfect hashing techniques are not included). The proposed hash function construction and hashing ...
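As a rough illustration of hashing text without a division operation, the sketch below uses a table whose size is a power of two, so the usual modulo reduction becomes a bitwise AND. This is not the construction proposed in the paper: the polynomial accumulation, the multiplier 31, the 32-bit truncation, and all names (`make_hasher`, `build_table`, `contains`) are assumptions chosen only for the example.

```python
def make_hasher(table_bits: int, multiplier: int = 31):
    """Return a string hash function for a table with 2**table_bits slots.

    Illustrative assumption: a simple polynomial hash, not the paper's method.
    """
    mask = (1 << table_bits) - 1  # valid because the table size is a power of two

    def hash_text(text: str) -> int:
        h = 0
        for byte in text.encode("utf-8"):
            # Accumulate the bytes and truncate to 32 bits with a mask,
            # so no division or modulo instruction is needed.
            h = (h * multiplier + byte) & 0xFFFFFFFF
        # A bitwise AND replaces the usual `h % table_size`.
        return h & mask

    return hash_text


def build_table(words, table_bits: int):
    """Store distinct words in a chained hash table (one list per slot)."""
    hash_text = make_hasher(table_bits)
    buckets = [[] for _ in range(1 << table_bits)]
    for word in words:
        bucket = buckets[hash_text(word)]
        if word not in bucket:  # skip words that are already stored
            bucket.append(word)
    return buckets, hash_text


def contains(buckets, hash_text, word: str) -> bool:
    """Look a word up by scanning only the bucket it hashes to."""
    return word in buckets[hash_text(word)]
```

A lookup inspects a single bucket, which is where the expected O(1) cost comes from as long as the buckets stay short. Restricting the table size to a power of two is the design choice that makes the mask equivalent to a modulo here.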
Using only a few simple instructions, the algorithm maps variable-length text strings onto small in...
Data hashing has been widely used to approximate large-scale similarity searches. Original text data...
The administrative process, carried out continuously, produces large amounts of data. So the search process takes...
Techniques based on hashing are heavily used in many applications, e.g. information retrieval, geom...
Textual and geometrical algorithms have been considered as two separate fields. This was caused by t...
© 2017 ACM. With the rapid development of information storage and networking technologies, quintilli...
Abstract. Fast elimination of duplicate data is needed in many areas, especially in the textual data...
Hashing is a well-known and widely used technique for providing O(1) access to large files on second...
One of the crucial points in the text mining studies is the feature hashing step. Most of the text m...
Signature files are extremely compressed versions of text files which can be used as access or index...
This paper deals with the construction of digital lexicons within the scope of Natural Language Proc...
Fast elimination of duplicate data is needed in many areas, especially in the textual data context....
Abstract—This work presents an innovative method of comparing sets of textual documents with an aim...
This study was conducted to compare two minimal perfect hashing methods, Chang's method and Jaeschke's...