Papers
XSched: Preemptive Scheduling for Diverse XPUs
Weihang Shen, Mingcong Han, Jialong Liu et al.
ZEN: Empowering Distributed Training with Sparsity-driven Data Synchronization
Zhuang Wang, Zhaozhuo Xu, Jingyi Xi et al.
ACCL+: an FPGA-Based Collective Engine for Distributed Applications
Zhenhao He, Dario Korolija, Yu Zhu et al.
Anvil: Verifying Liveness of Cluster Management Controllers
Xudong Sun, Wenjie Ma, Jiawei Tyler Gu et al.
A Tale of Two Paths: Toward a Hybrid Data Plane for Efficient Far-Memory Applications
Lei Chen, Shi Liu, Chenxi Wang et al.
Automatically Reasoning About How Systems Code Uses the CPU Cache
Rishabh Iyer, Katerina Argyraki, George Candea
Beaver: Practical Partial Snapshots for Distributed Cloud Services
Liangcheng Yu, Xiao Zhang, Haoran Zhang et al.
Burstable Cloud Block Storage with Data Processing Units
Junyi Shu, Kun Qian, Ennan Zhai et al.
Caravan: Practical Online Learning of In-Network ML Models with Labeling Agents
Qizheng Zhang, Ali Imran, Enkeleda Bardhi et al.
ChameleonAPI: Automatic and Efficient Customization of Neural Networks for ML Applications
Yuhan Liu, Chengcheng Wan, Kuntai Du et al.
Chop Chop: Byzantine Atomic Broadcast to the Network Limit
Martina Camaioni, Rachid Guerraoui, Matteo Monti et al.
Data-flow Availability: Achieving Timing Assurance in Autonomous Systems
Ao Li, Ning Zhang
Detecting Logic Bugs in Database Engines via Equivalent Expression Transformation
Zu-Ming Jiang, Zhendong Su
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
Yinmin Zhong, Shengyu Liu, Junda Chen et al.
dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving
Bingyang Wu, Ruidong Zhu, Zili Zhang et al.
DRust: Language-Guided Distributed Shared Memory with Fine Granularity, Full Transparency, and Ultra Efficiency
Haoran Ma, Yifan Qiao, Shi Liu et al.
DSig: Breaking the Barrier of Signatures in Data Centers
Marcos K. Aguilera, Clément Burgelin, Rachid Guerraoui et al.
Enabling Tensor Language Model to Assist in Generating High-Performance Tensor Programs for Deep Learning
Yi Zhai, Sijia Yang, Keyu Pan et al.
Fairness in Serving Large Language Models
Ying Sheng, Shiyi Cao, Dacheng Li et al.
FairyWREN: A Sustainable Cache for Emerging Write-Read-Erase Flash Interfaces
Sara McAllister, Yucong "Sherry" Wang, Benjamin Berg et al.
Fast and Scalable In-network Lock Management Using Lock Fission
Hanze Zhang, Ke Cheng, Rong Chen et al.
Flock: A Framework for Deploying On-Demand Distributed Trust
Darya Kaviani, Sijun Tan, Pravein Govindan Kannan et al.
Harvesting Memory-bound CPU Stall Cycles in Software with MSH
Zhihong Luo, Sam Son, Sylvia Ratnasamy et al.
High-throughput and Flexible Host Networking for Accelerated Computing
Athinagoras Skiadopoulos, Zhiqiang Xie, Mark Zhao et al.
Identifying On-/Off-CPU Bottlenecks Together with Blocked Samples
Minwoo Ahn, Jeongmin Han, Youngjin Kwon et al.