Papers
2,804 papers found
Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws
Kush Bhatia, Wenshuo Guo, Jacob Steinhardt
On the Scaling Laws of Geographical Representation in Language Models
Nathan Godey, Éric de la Clergerie, Benoît Sagot
Reproducible Scaling Laws for Contrastive Language-Image Learning
Mehdi Cherti, Romain Beaumont, Ross Wightman et al.
Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic
Sachin Goyal, Pratyush Maini, Zachary C. Lipton et al.
Scaling Laws of Synthetic Images for Model Training ... for Now
Lijie Fan, Kaifeng Chen, Dilip Krishnan et al.
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin, Yaqi Zhao, Mingwu Zheng et al.
Data and Parameter Scaling Laws for Neural Machine Translation
Mitchell A Gordon, Kevin Duh, Jared Kaplan
Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Maor Ivgi, Yair Carmon, Jonathan Berant
Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay, Jason Wei, Hyung Chung et al.
ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws
Ruihang Li, Yixuan Wei, Miaosen Zhang et al.
Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large Language Models
Siqi Wang, Zhengyu Chen, Bei Li et al.
Scaling Laws for Linear Complexity Language Models
Xuyang Shen, Dong Li, Ruitao Leng et al.
Scaling Laws for Fact Memorization of Large Language Models
Xingyu Lu, Xiaonan Li, Qinyuan Cheng et al.
Scaling Laws of Decoder-Only Models on the Multilingual Machine Translation Task
Gaëtan Caillaut, Mariam Nakhlé, Raheel Qader et al.
Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls
Feiyang Kang, Newsha Ardalani, Michael Kuchnik et al.
Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions
Emmy Liu, Amanda Bertsch, Lintang Sutawika et al.
Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space?
Nandan Kumar Jha, Brandon Reagen
Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check
Nicholas Lourie, Michael Y. Hu, Kyunghyun Cho
Uncovering Scaling Laws for Large Language Models via Inverse Problems
Arun Verma, Zhaoxuan Wu, Zijian Zhou et al.
Scaling Laws for Native Multimodal Models
Mustafa Shukor, Enrico Fini, Victor Guilherme Turrisi da Costa et al.
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani, Orhan Firat, Markus Freitag et al.
Broken Neural Scaling Laws
Ethan Caballero, Kshitij Gupta, Irina Rish et al.
How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization
Jonas Geiping, Micah Goldblum, Gowthami Somepalli et al.
Scaling Laws For Deep Learning Based Image Reconstruction
Tobit Klug, Reinhard Heckel
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann, Claudius Gros