We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text, though not necessarily contiguously. The subsequences are weighted by an exponentially decaying factor of their full length in the text, hence emphasising those occurrences that are close to contiguous. A direct computation of this feature vector would involve a prohibitive amount of computation even for modest values of k, since the dimension of the feature space grows exponentially with k. The paper describes how, despite this fact, the inner product can be efficiently evaluate...
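To make the feature space concrete, the following is a minimal brute-force sketch of the kernel as defined above. It is not the efficient evaluation the paper refers to: it enumerates every index tuple explicitly, so it is only usable on very short strings. The function names, the parameter name `decay`, and the example strings are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(text, k, decay):
    """Explicit feature vector: one coordinate per length-k subsequence.

    Each occurrence of a subsequence contributes decay ** span, where span
    is the full length the occurrence covers in the text (last index minus
    first index plus one). Contiguous occurrences therefore weigh the most
    when 0 < decay < 1.
    """
    features = defaultdict(float)
    # combinations() yields strictly increasing index tuples, i.e. every
    # ordered, possibly non-contiguous choice of k character positions.
    for idx in combinations(range(len(text)), k):
        span = idx[-1] - idx[0] + 1
        subsequence = "".join(text[i] for i in idx)
        features[subsequence] += decay ** span
    return features


def subsequence_kernel(s, t, k, decay):
    """Inner product of the two explicit feature vectors (brute force)."""
    fs = subsequence_features(s, k, decay)
    ft = subsequence_features(t, k, decay)
    return sum(weight * ft[u] for u, weight in fs.items() if u in ft)


if __name__ == "__main__":
    # "cat" and "car" share only the subsequence "ca", which is contiguous
    # (span 2) in both strings, so the kernel equals decay**2 * decay**2.
    print(subsequence_kernel("cat", "car", k=2, decay=0.5))
```

The number of index tuples grows combinatorially with the text length and k, which illustrates why the direct computation described above becomes prohibitive.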
In many text classification applications, it is appealing to take every document as a string of char...
Problems of analysis and modeling of sequential data arise in many practical applications. In this w...
Ganiz, Murat Can (Dogus Author) -- Conference full title: 2013 10th International Conference on Elec...
The expanding popularity of the Internet in recent years has led to a corresponding increase in the...
University of Technology, Sydney. Faculty of Engineering and Information Technology. NO FULL TEXT AVA...
This paper proposes a class of string kernels that can handle a variety of subsequence-based feature...
In this paper we propose a novel kernel for text categorization. This kernel is an inner product def...
In this thesis text categorization is investigated in four dimensions of analysis: theoretically as ...
The traditional bag-of-words model and the recent word-sequence kernel are two well-known techniques in the ...
We present a package which provides a general framework, including tools and algorithms, for text mi...
Recently, the use of string kernels that compare documents as a string of letters has been shown to ...
This paper introduces a convolutional sentence kernel based on word embeddings. Our kernel overcome...
We propose a semantic kernel for Support Vector Machines (SVM) that takes advantage of higher-order ...
Abstract: This work presents kernel functions that can be used in conjunction with the Support Vecto...