In this thesis, text categorization is investigated along four dimensions of analysis: theoretically as well as empirically, and as a manual as well as a machine-based process. The first four chapters examine the theoretical foundation of subject classification of text documents, with particular focus on classification as a procedure for organizing documents in libraries. A working hypothesis of the theoretical analysis is that classifying documents involves translations between statements in different languages, both natural and artificial. We further investigate the close relationships between structures in classification languages and the order relations and topological structures that arise from classificat...
We propose a novel approach for categorizing text documents based on the use of a special kernel. Th...
The document similarity measure is a key point in textual data processing. It ...
Web-mediated access to distributed information is a complex problem. Before any learning can start...
We propose a semantic kernel for Support Vector Machines (SVM) that takes advantage of higher-order ...
Text categorization plays a crucial role in both academic and commercial platforms due to the growin...
Ganiz, Murat Can (Dogus Author) -- Conference full title: 2013 10th International Conference on Elec...
In text categorization, a document is usually represented by a vector space model which can accompli...
The bag of words (BOW) representation of documents is very common in text classification systems. Ho...
In this study, we propose a novel methodology to build a semantic smoothing kernel to use with Suppo...
Ganiz, Murat Can (Dogus Author) -- Conference full title: 2014 IEEE International Symposium on Innov...
For the past decade, text categorization has been an active field of research in ...
Most text classification systems use bag-of-words representation of documents to find the classifi...
The expanding popularity of the Internet in recent years has led to a corresponding increase in the...
Previous work on Natural Language Processing for Information Retrieval has shown the inadequacy ...
University of Technology, Sydney. Faculty of Engineering and Information Technology.