The PAN large-scale authorship dataset is one of the main drivers of recent advances in authorship verification. Although it has generated significant progress in the field, inconsistent performance differences between the closed and open test sets have been reported. To address this, we improve the experimental setup by proposing five new public splits over the PAN dataset, specifically designed to isolate and identify biases related to the text topic and to the author's writing style. We evaluate several BERT-like baselines on these splits, showing that such models are competitive with state-of-the-art authorship verification methods. Furthermore, using explainable AI, we find that these baselines are biased towards named entities. We show that ...
Many approaches have been proposed recently to identify the author of a given document. Thereby, one...
Automatically disentangling an author's style from the content of their writing is a longstanding an...
Digital communication makes it easy for anyone to write and publish texts all over the world. In som...
Authorship verification is the task of deciding whether two texts have been written by the sam...
Enhancing information retrieval systems with the ability to take the writing style of people into ac...
The authorship verification task at PAN 2022 follows the experimental setup of similar shared tasks ...
Authorship attribution is a problem in information retrieval and computational linguistics that invo...
Authorship verification is the task of deciding whether two texts have been written by the sam...
Applications of authorship attribution ‘in the wild’ [Koppel, M., Schler, J., and Argamon, S. (2010...
Existing research on Authorship Attribution (AA) focuses on texts for which a lot of data is availab...
Authorship verification is one of the most challenging tasks in style-based text categoriza...
The author identification task at PAN-2014 focuses on author verification. Similar to PAN-2013 we ar...
Authorship attribution is an important problem in information retrieval and computational linguistic...