Papers
NanoFlow: Towards Optimal Large Language Model Serving Throughput
Kan Zhu, Yufei Gao, Yilong Zhao et al.
Neutrino: Fine-grained GPU Kernel Profiling via Programmable Probing
Songlin Huang, Chenshu Wu
Okapi: Decoupling Data Striping and Redundancy Grouping in Cluster File Systems
Sanjith Athlur, Timothy Kim, Saurabh Kadekodi et al.
Paralegal: Practical Static Analysis for Privacy Bugs
Justus Adam, Carolyn Zech, Livia Zhu et al.
Picsou: Enabling Replicated State Machines to Communicate Efficiently
Reginald Frank, Micah Murray, Chawinphat Tankuranand et al.
PipeThreader: Software-Defined Pipelining for Efficient DNN Execution
Yu Cheng, Lei Wang, Yining Shi et al.
PoWER Never Corrupts: Tool-Agnostic Verification of Crash Consistency and Corruption Detection
Hayley LeBlanc, Jacob R. Lorch, Chris Hawblitzel et al.
Principles and Methodologies for Serial Performance Optimization
Sujin Park, Mingyu Guan, Xiang Cheng et al.
QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach
Shouyang Dong, Yuanbo Wen, Jun Bi et al.
QOS: Quantum Operating System
Emmanouil Giortamis, Francisco Romão, Nathaniel Tornow et al.
Quake: Adaptive Indexing for Vector Search
Jason Mohoney, Devesh Sarda, Mengze Tang et al.
Quantum Virtual Machines
Runzhou Tao, Hongzheng Zhu, Jason Nieh et al.
Scalio: Scaling up DPU-based JBOF Key-value Store with NVMe-oF Target Offload
Xun Sun, Mingxing Zhang, Yingdi Shan et al.
Skybridge: Bounded Staleness for Distributed Caches
Robert Lyerly, Scott Pruett, Kevin Doherty et al.
Söze: One Network Telemetry Is All You Need for Per-flow Weighted Bandwidth Allocation at Scale
Weitao Wang, T. S. Eugene Ng
Stripeless Data Placement for Erasure-Coded In-Memory Storage
Jian Gao, Jiwu Shu, Bin Yan et al.
Tiered Memory Management Beyond Hotness
Jinshu Liu, Hamid Hadian, Hanchen Xu et al.
Tigon: A Distributed Database for a CXL Pod
Yibo Huang, Haowei Chen, Newton Ni et al.
Tintin: A Unified Hardware Performance Profiling Infrastructure to Uncover and Manage Uncertainty
Ao Li, Marion Sudvarg, Zihan Li et al.
To PRI or Not To PRI, That's the question
Yun Wang, Liang Chen, Jie Ji et al.
Training with Confidence: Catching Silent Errors in Deep Learning Training with Automated Proactive Checks
Yuxuan Jiang, Ziming Zhou, Boyu Xu et al.
Understanding Stragglers in Large Model Training Using What-if Analysis
Jinkun Lin, Ziheng Jiang, Zuquan Song et al.
WaferLLM: Large Language Model Inference at Wafer Scale
Congjie He, Yeqi Huang, Pei Mu et al.
Weave: Efficient and Expressive Oblivious Analytics at Scale
Mahdi Soleimani, Grace Jia, Anurag Khandelwal
WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training
Zheng Wang, Anna Cai, Xinfeng Xie et al.