This dissertation proposes new machine learning methods for learning problems characterized by a large number of features, an unbalanced class distribution, and asymmetric misclassification costs. The input is given as a set of text documents or their Web addresses (URLs). The induced target concept is suitable for classifying new documents, including shortened documents that describe individual hyperlinks. The proposed methods build on several new solutions. Among them is a new, enriched document representation that extends the bag-of-words representation by adding word sequences and document topic categories. Features that represent word sequences are generated using a new efficient procedure. Features giving t...
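To make the idea of extending a bag-of-words representation with word sequences concrete, the following is a minimal sketch, not the dissertation's actual procedure. The function names, the whitespace tokenizer, the maximum sequence length of 3, and the document-frequency threshold are all assumptions introduced here for illustration.

```python
from collections import Counter


def ngram_counts(tokens, max_len=3):
    """Count every contiguous word sequence of length 1..max_len.

    Length-1 sequences are the ordinary bag-of-words features;
    longer sequences are the added word-sequence features.
    """
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def enriched_representation(documents, max_len=3, min_doc_freq=2):
    """Build sparse feature vectors over words and frequent word sequences.

    Hypothetical pruning rule: keep a feature only if it occurs in at
    least `min_doc_freq` documents, which limits feature-space growth.
    """
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    per_doc = [ngram_counts(doc.lower().split(), max_len) for doc in documents]

    # Document frequency: in how many documents each feature appears.
    doc_freq = Counter()
    for counts in per_doc:
        doc_freq.update(counts.keys())

    vocabulary = sorted(f for f, df in doc_freq.items() if df >= min_doc_freq)
    vectors = [{f: counts[f] for f in vocabulary if f in counts}
               for counts in per_doc]
    return vocabulary, vectors


if __name__ == "__main__":
    docs = ["machine learning for text", "machine learning on the web"]
    vocab, vecs = enriched_representation(docs)
    print(vocab)
    print(vecs[0])
```

In this sketch the frequency threshold is what keeps the number of sequence features manageable; a different pruning rule or feature-scoring measure could be substituted without changing the overall structure.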
In this paper a Web mining tool for content-based classification of Web pages is presented. The tool...
Exponential growth rates of learning materials and rapid distribution of those resources among e-lea...
This thesis deals with advanced machine-learning methods for text classification. At first, these me...
This paper describes the usage of machine learning techniques to assign keywords to documents. The l...
This paper proposes an efficient algorithm for the generation of new features that enrich the known ...
Classification systems that utilize machine learning develop classification models through learnin...
Due to massive information overload on the web it's hard to index and reuse existing learning resour...
Automatic text classification is the process of automatically classifying text documents into pre-de...
This paper describes automatic document categorization based on a large text hierarchy. We handle the...
This work aims to use machine learning techniques for the classification of specific parts of web pa...
Because of the explosion of digital and online text information, automatic organization of documents...
In the modern era, tons of information are published on the web every day. Thus, it will take ...
The article addresses the problem of document classification. A technology for automatic topic extra...
Most of the research on text categorization has focused on classifying text documents into a set of ...