Papers
3,922 papers found
Better Generalizing to Unseen Concepts: An Evaluation Framework and An LLM-Based Auto-Labeled Pipeline for Biomedical Concept Recognition
Shanshan Liu, Noriki Nishida, Fei Cheng et al.
Beyond Accuracy: Alignment and Error Detection across Languages in the Bi-GSM8K Math-Teaching Benchmark
Jieun Park, KyungTae Lim, Joon-ho Lim
Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pre-training
Jeffrey Li, Joshua P Gardner, Doug Kang et al.
Beyond Bias Scores: Unmasking Vacuous Neutrality in Small Language Models
Sumanth Manduru, Carlotta Domeniconi
Beyond Blind Following: Evaluating Robustness of LLM Agents under Imperfect Guidance
Yao Fu, Ran Qiu, Xinhe Wang et al.
Beyond Coherence: Improving Temporal Consistency and Interpretability in Dynamic Topic Models
Thanh Vinh Nguyen, Ngo Van Dong, Minh Chu Xuan et al.
Beyond Divergent Creativity: A Human-Based Evaluation of Creativity in Large Language Models
Kumiko Nakajima, Jan Zuiderveld, Sandro Pezzelle
Beyond Grid Search: Leveraging Bayesian Optimization for Accelerating RAG Pipeline Optimization
Anum Afzal, Xueru Zheng, Florian Matthes
Beyond IVR: Benchmarking Customer Support LLM Agents for Business-Adherence
Sumanth Balaji, Piyush Mishra, Aashraya Sachdeva et al.
Beyond Length: Context-Aware Expansion and Independence as Developmentally Sensitive Evaluation in Child Utterances
Jiyun Chun, Eric Fosler-Lussier, Michael White et al.
Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation
Luis Frentzen Salim, Esteban Carlin, Alexandre Morinvil et al.
Beyond Math: Stories as a Testbed for Memorization-Constrained Reasoning in LLMs
Yuxuan Jiang, Francis Ferraro
Beyond Memorization: A Rigorous Evaluation Framework for Medical Knowledge Editing
Shigeng Chen, Linhao Luo, Zhangchi Qiu et al.
Beyond Multiple Choice: Evaluating Steering Vectors for Summarization
Joschka Braun, Carsten Eickhoff, Seyed Ali Bahrainian
Beyond Musical Descriptors: Extracting Preference-Bearing Intent in Music Queries
Marion Baranes, Romain Hennequin, Elena V. Epure
Beyond Names: How Grammatical Gender Markers Bias LLM-based Educational Recommendations
Luca Benedetto, Antonia Donvito, Alberto Lucchetti et al.
Beyond "Not Novel Enough": Enriching Scholarly Critique with LLM-Assisted Feedback
Osama Mohammed Afzal, Preslav Nakov, Tom Hope et al.
Beyond One-Step Distillation: Bridging the Capacity Gap in Small Language Models via Multi-Step Knowledge Transfer
Gaeun Yim, Nayoung Ko, Manasa Bharadwaj
Beyond Passive Viewing: A Pilot Study of a Hybrid Learning Platform Augmenting Video Lectures with Conversational AI
Mohammed Abraar, Raj Dandekar, Rajat Dandekar et al.
Beyond Random Sampling: Efficient Language Model Pretraining via Curriculum Learning
Yang Zhang, Amr Mohamed, Hadi Abdine et al.
Beyond Sample-Level Feedback: Using Reference-Level Feedback to Guide Data Synthesis
Shuhaib Mehri, Xiusi Chen, Heng Ji et al.
Beyond Sampling: Self-Sorting for Long-Context Ranking
Juseon Do, Sungwoo Han, Jingun Kwon et al.
Beyond Semantics: How Temporal Biases Shape Retrieval in Transformer and State-Space Models
Anooshka Bajaj, Deven Mahesh Mistry, Sahaj Singh Maini et al.
Beyond Single Words: MWE Identification in Bioinformatics Research Articles and Dispersion Profiling Across IMRaD
Jurgi Giraud, Andrew Gargett