Abstract. Authorship attribution is one of the research areas in the data mining domain, and various methods can be employed to perform the task. This paper presents the results of research on the influence of data discretization on the efficiency of the Naive Bayes classifier. The analysis was carried out on datasets based on texts by two male and two female authors, using the WEKA data mining software framework. Binary classification was performed separately for both datasets over a wide range of discretization parameters in order to investigate the dependency between the method of discretization and the quality of classification with the Naive Bayes method. The numerical results of the tests have been compared and discussed and some observations and conclusions f...
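As a rough illustration of the pipeline the abstract describes (discretize continuous features, then classify with Naive Bayes), the following minimal sketch may help. It is not the WEKA setup used in the paper: it assumes equal-width binning, Laplace smoothing, and a toy dataset of two hypothetical stylometric features for two hypothetical authors "A" and "B". All function names and values are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def equal_width_bins(values, n_bins):
    """Return a function mapping a number to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features

    def to_bin(x):
        idx = int((x - lo) / width)
        return min(max(idx, 0), n_bins - 1)  # clamp values outside the training range

    return to_bin


def train_naive_bayes(rows, labels, n_bins):
    """Discretize each feature with equal-width bins, then count bins per class."""
    n_features = len(rows[0])
    binners = [equal_width_bins([r[j] for r in rows], n_bins) for j in range(n_features)]
    class_counts = Counter(labels)
    bin_counts = defaultdict(int)  # (class, feature index, bin index) -> count
    for row, y in zip(rows, labels):
        for j, x in enumerate(row):
            bin_counts[(y, j, binners[j](x))] += 1
    return binners, class_counts, bin_counts, n_bins


def predict(model, row):
    """Return the class with the highest log-posterior under the naive assumption."""
    binners, class_counts, bin_counts, n_bins = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for y, n_y in class_counts.items():
        score = math.log(n_y / total)  # log prior
        for j, x in enumerate(row):
            count = bin_counts.get((y, j, binners[j](x)), 0)
            score += math.log((count + 1) / (n_y + n_bins))  # Laplace smoothing
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Toy data (hypothetical): two feature values per text, e.g. relative word frequencies.
rows = [[0.10, 0.30], [0.12, 0.28], [0.11, 0.33],
        [0.40, 0.05], [0.38, 0.07], [0.42, 0.04]]
labels = ["A", "A", "A", "B", "B", "B"]

model = train_naive_bayes(rows, labels, n_bins=3)
print(predict(model, [0.13, 0.29]))
print(predict(model, [0.39, 0.06]))
```

Changing `n_bins` is the simplest way to mimic, in this sketch, the kind of discretization-parameter sweep the abstract mentions; other schemes (equal-frequency or supervised binning) would replace `equal_width_bins`.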
Authorship attribution (AA) is a subfield of linguistics analysis, aiming to identify the original a...
The naïve Bayes classifier is a simple form of Bayesian classifiers which assumes all the features a...
We consider several statistical approaches to binary classification and multiple hypothesis testing ...
Authorship attribution (AA) is the task of identifying authors of disputed or anonymous texts. It ca...
When patterns to be recognised are described by features of continuous type, discretisation becomes ...
Abstract. We investigate why discretization can be effective in naive-Bayes learning. We prove a the...
This paper considers estimation of success probabilities of categorical binary data subject to miscl...
Classification and clustering techniques in data mining are useful for a wide variety of real time ...
During recent years, the amounts of data, collected and stored by organizations on a daily basis, ha...
Nowadays data mining has become one of the technologies that have a major effect on business intelligence....
Abstract. Authorship attribution is the process of assigning an author to an anonymous text based on w...
Discretization is the process of converting numerical values into categorical values. Contemporary l...
In this thesis, a simulation study was performed to investigate the effects of normalization and un...
The performance of many machine learning algorithms can be substantially improved with a proper disc...