In this paper, we develop two automated authorship attribution schemes, one based on Multiple Discriminant Analysis (MDA) and the other on a Support Vector Machine (SVM). The classification features are derived from word frequencies in the text. We preprocess each text by stripping all characters except a-z and space, which makes the software more portable across different types of text. We test the methodology on a corpus of undisputed English texts, using leave-one-out cross-validation to demonstrate classification accuracies in excess of 90%. We further test our methods on the Federalist Papers, which have a partly disputed authorship and a fair degree of scholarly consensus...
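The preprocessing and word-frequency steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `preprocess` and `word_frequencies` are hypothetical, and several details are assumptions that the abstract does not state, namely that uppercase letters are folded to lowercase, that newlines and tabs are converted to spaces before stripping, and that features are relative frequencies over a fixed vocabulary.

```python
import re
from collections import Counter


def preprocess(text):
    """Reduce a text to the characters a-z and space.

    Assumptions (not stated in the abstract): letters are lowercased
    first, and all whitespace is mapped to a plain space before the
    remaining characters are stripped.
    """
    lowered = text.lower()
    spaced = re.sub(r"\s", " ", lowered)        # newlines, tabs -> space
    stripped = re.sub(r"[^a-z ]", "", spaced)   # drop everything else
    return re.sub(r" +", " ", stripped).strip()  # collapse repeated spaces


def word_frequencies(text, vocabulary):
    """Return the relative frequency of each vocabulary word in the text.

    The vocabulary is a hypothetical, caller-supplied list of words;
    how the feature words are chosen is not covered by this sketch.
    """
    words = preprocess(text).split()
    total = len(words)
    if total == 0:
        return [0.0] * len(vocabulary)
    counts = Counter(words)
    return [counts[word] / total for word in vocabulary]


if __name__ == "__main__":
    sample = "The cat and the dog."
    print(preprocess(sample))
    print(word_frequencies(sample, ["the", "and", "of"]))
```

Feature vectors of this kind, one per text, could then be passed to a classifier such as MDA or an SVM and scored with leave-one-out cross-validation; those stages are omitted here.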
The process of establishing the most likely author of a collection of texts or documents whose autho...
Authorship attribution (AA) is the process of identifying the author of a given text and from the ma...
Attributing authorship of documents with unknown creators has been studied extensively for natural l...
© 2006 COPYRIGHT SPIE--The International Society for Optical Engineering. Authorship attribution has a...
Automatic authorship attribution is an umbrella term for methods trying to derive authorship from te...
In authorship attribution, one assigns texts from an unknown author to either one of two or more can...
This paper uses text mining algorithms, especially classification procedures, to learn the specific ...
In order to evaluate authorship attribution techniques, the Federalist Papers have been used as a testing-...
Techniques for identifying the author of an unattributed document can be applied to problems in info...
Authorship attribution (AA) is the task of identifying authors of disputed or anonymous texts. It ca...
In recent years, methods of computational authorship attribution have offered promising results for ...
This paper covers a text classification problem: the identification of the author of a text. It is n...