Refining App Reviews: Dataset, Methodology, and Evaluation

Amrita Singh; Chirag Jain; Mohit Chaudhary; Preethu Rose Anish

2024 EMNLP EMNLP 2024

Refining App Reviews: Dataset, Methodology, and Evaluation

Abstract

AbstractWith the growing number of mobile users, app development has become increasingly lucrative. Reviews on platforms such as Google Play and Apple App Store provide valuable insights to developers, highlighting bugs, suggesting new features, and offering feedback. However, many reviews contain typos, spelling errors, grammar mistakes, and complex sentences, hindering efficient interpretation and slowing down app improvement processes. To tackle this, we introduce RARE (Repository for App review REfinement), a benchmark dataset of 10,000 annotated pairs of original and refined reviews from 10 mobile applications. These reviews were collaboratively refined by humans and large language models (LLMs). We also conducted an evaluation of eight state-of-the-art LLMs for automated review refinement. The top-performing model (Flan-T5) was further used to refine an additional 10,000 reviews, contributing to RARE as a silver corpus.

🌉 Interdisciplinary Bridge — Deep Learning and Natural Language Processing

🧭 Keyword Pioneer — automated refinement

🐣 Hot Topic Early Bird — automated evaluation

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Amrita Singh , Chirag Jain , Mohit Chaudhary , Preethu Rose Anish

Topics

Natural Language Processing > Generation > Text Generation Natural Language Processing > Applications > Text Processing Deep Learning > Learning Types > Text Generation

Keywords

text editing automated evaluation large language model text refinement text improvement app review automated refinement review processing

Download PDF

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024