Papers
A Proposal for Scaling the Scaling Laws
Wout Schellaert, Ronan Hamon, Fernando Martínez-Plumed et al.
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher, Robert Geirhos, Shashank Shekhar et al.
(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning
Margaret Li, Sneha Kudugunta, Luke Zettlemoyer
Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Yi Tay, Mostafa Dehghani, Samira Abnar et al.
Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
Uncovering Neural Scaling Laws in Molecular Representation Learning
Dingshuo Chen, Yanqiao Zhu, Jieyu Zhang et al.
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design
Ibrahim M Alabdulmohsin, Xiaohua Zhai, Alexander Kolesnikov et al.
Scaling laws for language encoding models in fMRI
Richard Antonello, Aditya Vaidya, Alexander Huth
Scaling Laws for Hyperparameter Optimization
Arlind Kadra, Maciej Janowski, Martin Wistuba et al.
Observational Scaling Laws and the Predictability of Language Model Performance
Yangjun Ruan, Chris J. Maddison, Tatsunori Hashimoto
4+3 Phases of Compute-Optimal Neural Scaling Laws
Elliot Paquette, Courtney Paquette, Lechao Xiao et al.
An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem
Yoonsoo Nam, Nayara Fonseca, Seok Hyeong Lee et al.
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin, Jingfeng Wu, Sham M. Kakade et al.
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele, Elie Bakouch, Atli Kosson et al.
Dimension-free deterministic equivalents and scaling laws for random feature regression
Leonardo Defilippis, Bruno Loureiro, Theodor Misiakiewicz
Scaling laws for learning with real and surrogate data
Ayush Jain, Andrea Montanari, Eren Sasoglu
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Chaofan Tao, Qian Liu, Longxu Dou et al.
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov, Yaswanth Chittepu, Ryan Park et al.
Scaling Laws for BERT in Low-Resource Settings
Gorka Urbizu, Iñaki San Vicente, Xabier Saralegi et al.
Revisiting Scaling Laws for Language Models: The Role of Data Quality and Training Strategies
Zhengyu Chen, Siqi Wang, Teng Xiao et al.
Scaling Laws and Efficient Inference for Ternary Language Models
Tejas Vaidhya, Ayush Kaushal, Vineet Jain et al.
Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning
Andrei Mircea, Supriyo Chakraborty, Nima Chitsazan et al.
Diversity Explains Inference Scaling Laws: Through a Case Study of Minimum Bayes Risk Decoding
Hidetaka Kamigaito, Hiroyuki Deguchi, Yusuke Sakai et al.
Scaling Laws for Multilingual Language Models
Yifei He, Alon Benhaim, Barun Patra et al.