Lost in Translation: Authorship Attribution using Frame Semantics

Steffen Hedegaard and Jakob Grue Simonsen
University of Copenhagen


Abstract

We investigate authorship attribution using classifiers based on frame semantics. The purpose is to discover whether adding semantic information to lexical and syntactic methods for authorship attribution will improve them, specifically to address the difficult problem of authorship attribution of translated texts. Our results suggest (i) that frame-based classifiers are usable for author attribution of both translated and untranslated texts; (ii) that frame-based classifiers generally perform worse than the baseline classifiers for untranslated texts, but (iii) perform as well as, or better than, the baseline classifiers on translated texts; (iv) that, contrary to current belief, naïve classifiers based on lexical markers may perform tolerably on translated texts if the combination of author and translator is present in the training set of a classifier.

Full paper: http://www.aclweb.org/anthology/P/P11/P11-2012.pdf