EACL 2023

Probabilistic Robustness for Data Filtering

Abstract

We introduce probabilistic robustness rewarded data optimization (PRoDO), a framework that enhances a model's generalization power by selecting training data that optimizes our probabilistic robustness metrics. We use proximal policy optimization (PPO) reinforcement learning to approximately solve the computationally intractable training subset selection problem. The PPO reward is defined as our (α, ε, γ)-Robustness, which measures performance consistency over multiple domains by simulating unknown test sets in real-world scenarios using a leave-one-out strategy. We demonstrate that PRoDO effectively filters data, leading to significantly higher prediction accuracy and robustness on unknown-domain test sets. Our experiments achieve up to a +17.2% increase in accuracy (+25.5% relative) in sentiment analysis and a 28.05 decrease in perplexity (32.1% relative) in language modeling. In addition, our probabilistic (α, ε, γ)-Robustness definition serves as an evaluation metric with higher levels of agreement with human annotations than typical performance-based metrics.
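The leave-one-out strategy mentioned above can be illustrated with a minimal Python sketch. The abstract does not state the formal definition of (α, ε, γ)-Robustness, so `consistency_score` below is a hypothetical stand-in (the fraction of held-out-domain scores within `epsilon` of their mean), not the paper's metric. The toy data, the `(text, label)` example format, and the majority-label baseline are likewise assumptions made only for illustration, and the PPO-based subset selection is not shown.

```python
from collections import Counter
from statistics import mean
from typing import Dict, Iterator, List, Sequence, Tuple

Example = Tuple[str, int]  # hypothetical (text, label) format


def leave_one_domain_out(
    domains: Dict[str, List[Example]],
) -> Iterator[Tuple[str, List[Example], List[Example]]]:
    """Yield (held_out_name, train, test), holding out one domain at a time."""
    for held_out, test in domains.items():
        train = [ex for name, exs in domains.items() if name != held_out for ex in exs]
        yield held_out, train, test


def majority_baseline_accuracy(train: List[Example], test: List[Example]) -> float:
    """Toy stand-in for a real model: predict the most common training label."""
    majority_label = Counter(label for _, label in train).most_common(1)[0][0]
    return sum(label == majority_label for _, label in test) / len(test)


def consistency_score(scores: Sequence[float], epsilon: float) -> float:
    """Hypothetical proxy, NOT the paper's metric: the fraction of
    held-out scores that lie within epsilon of the mean score."""
    center = mean(scores)
    return sum(abs(s - center) <= epsilon for s in scores) / len(scores)


# Toy multi-domain sentiment data (invented for illustration only).
toy_domains: Dict[str, List[Example]] = {
    "books": [("great read", 1), ("dull plot", 0), ("loved it", 1)],
    "movies": [("fun film", 1), ("too long", 0), ("superb cast", 1)],
    "gadgets": [("broke fast", 0), ("poor battery", 0), ("works fine", 1), ("overheats", 0)],
}

# Each held-out domain plays the role of an "unknown" test set.
held_out_scores = [
    majority_baseline_accuracy(train, test)
    for _, train, test in leave_one_domain_out(toy_domains)
]
robustness_proxy = consistency_score(held_out_scores, epsilon=0.05)
print(held_out_scores, robustness_proxy)
```

In a setup like the one the abstract describes, a score of this kind would be computed for each candidate training subset and returned to the selection policy as its reward; the proxy here only shows where such a signal could come from.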
