JMLR 2015

Combination of Feature Engineering and Ranking Models for Paper-Author Identification in KDD Cup 2013

Abstract

This paper describes the winning solution of the National Taiwan University team for Track 1 of KDD Cup 2013. Track 1 addresses the paper-author identification problem: deciding whether a paper was truly written by a given author. First, we conduct feature engineering to transform the various types of provided text information into 97 features. Second, we train classification and ranking models on these features. Last, we combine the individual models to boost performance, using results on the internal validation set and the official Valid set. We also propose several effective post-processing techniques. Our solution achieves a MAP score of 0.98259 and ranks first on the private leaderboard of the Test set. © JMLR 2015.
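The MAP score quoted above can be illustrated with a minimal sketch of mean average precision as it is commonly defined for ranking tasks. The abstract does not spell out the competition's exact evaluation rules, so the per-author grouping, the function names, and the toy paper IDs below are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list.

    Sums precision@k over every rank k that holds a relevant item,
    then divides by the number of relevant items.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], truth: Dict[str, Set[str]]
) -> float:
    """Mean of the per-author average precision values."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranked, truth.get(author, set()))
        for author, ranked in rankings.items()
    )
    return total / len(rankings)


# Hypothetical toy data: each author maps to papers ranked by model score.
rankings = {"author_a": ["p1", "p2", "p3"], "author_b": ["p4", "p5"]}
# Hypothetical ground truth: papers each author actually wrote.
truth = {"author_a": {"p1", "p3"}, "author_b": {"p5"}}

score = mean_average_precision(rankings, truth)
```

In this toy case author_a's confirmed papers sit at ranks 1 and 3, and author_b's single confirmed paper sits at rank 2, so any ranking that moves confirmed papers toward the top raises the score.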

👥 Mega-Team — 24 authors