Variational Autoencoder with Embedded Student-t Mixture Model for Authorship Attribution

Benedikt Boenninghoff; Steffen Zeiler; Robert Nickel; Dorothea Kolossa

COLING 2020

Abstract

Traditional computational authorship attribution is a classification task in a closed-set scenario: given a finite set of candidate authors and corresponding labeled texts, the objective is to determine which of the authors has written another set of anonymous or disputed texts. In this work, we propose a probabilistic autoencoding framework for this supervised classification task. Variational autoencoders (VAEs) have had tremendous success in learning latent representations. However, existing VAEs are still limited by the assumed Gaussianity of the underlying probability distributions in the latent space. We extend a VAE with an embedded Gaussian mixture model to a Student-t mixture model, which allows independent control of the “heaviness” of the tails of each implied probability density. Experiments on an Amazon review dataset indicate superior performance of the proposed method.
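
To make the "heaviness" of the tails concrete, the following is a minimal sketch comparing a univariate Student-t log-density with a Gaussian one. It is not the authors' model, which the abstract describes only as a mixture embedded in a VAE latent space. The function names and the choice of a one-dimensional density with location `mu`, scale `sigma`, and degrees of freedom `nu` are assumptions made for illustration.

```python
import math


def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate Gaussian N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of a univariate Student-t with nu degrees of freedom.

    Smaller nu gives heavier tails; as nu grows, the density
    approaches the Gaussian with the same mu and sigma.
    """
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)
    )


def tail_report(x=5.0, nus=(1.0, 5.0, 50.0)):
    """Log-density gap (Student-t minus Gaussian) at a point far in the tail."""
    return {nu: student_t_logpdf(x, nu) - gaussian_logpdf(x) for nu in nus}


if __name__ == "__main__":
    for nu, gap in tail_report().items():
        print(f"nu={nu:>5}: log p_t(5) - log p_N(5) = {gap:.3f}")
```

In a mixture, each component could carry its own `nu`, which is one way to read the abstract's "independent control" of tail heaviness per density.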

Topics

Machine Learning > Optimization & Theory > Bayesian Inference
Deep Learning > Models > Generative Models
Deep Learning > Models > Variational Inference

Keywords

probabilistic modeling, authorship attribution, latent representation, variational autoencoder, Student-t mixture model
