Papers
401 papers found
Basilisk: Using Provenance Invariants to Automate Proofs of Undecidable Protocols
Tony Nuda Zhang, Keshav Singh, Tej Chajed et al.
Bayesian Code Diffusion for Efficient Automatic Deep Learning Program Optimization
Isu Jeong, Seulki Lee
BlitzScale: Fast and Live Large Model Autoscaling with O(1) Host Caching
Dingyan Zhang, Haotian Wang, Yang Liu et al.
Building Bridges: Safe Interactions with Foreign Languages through Omniglot
Leon Schuermann, Jack Toubes, Tyler Potyondy et al.
Compass: Encrypted Semantic Search with High Accuracy
Jinhao Zhu, Liana Patel, Matei Zaharia et al.
DecDEC: A Systems Approach to Advancing Low-Bit LLM Quantization
Yeonhong Park, Jake Hyun, Hojoon Kim et al.
Decentralized, Epoch-based F2FS Journaling with Fine-grained Crash Recovery
Yaotian Cui, Zhiqi Wang, Renhai Chen et al.
Decouple and Decompose: Scaling Resource Allocation with DeDe
Zhiying Xu, Minlan Yu, Francis Y. Yan
Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems
Chang Lou, Dimas Shidqi Parikesit, Yujin Huang et al.
Deterministic Client: Enforcing Determinism on Untrusted Machine Code
Zachary Yedidia, Geoffrey Ramseyer, David Mazières
Disentangling the Dual Role of NIC Receive Rings
Boris Pismenny, Adam Morrison, Dan Tsafrir
EMT: An OS Framework for New Memory Translation Architectures
Siyuan Chai, Jiyuan Zhang, Jongyul Kim et al.
Enabling Efficient GPU Communication over Multiple NICs with FuseLink
Zhenghang Ren, Yuxuan Li, Zilong Wang et al.
Extending Applications Safely and Efficiently
Yusheng Zheng, Tong Yu, Yiwei Yang et al.
Fast and Synchronous Crash Consistency with Metadata Write-Once File System
Yanqi Pan, Wen Xia, Yifeng Zhang et al.
FineMem: Breaking the Allocation Overhead vs. Memory Waste Dilemma in Fine-Grained Disaggregated Memory Management
Xiaoyang Wang, Yongkun Li, Kan Wu et al.
Fork in the Road: Reflections and Optimizations for Cold Start Latency in Production Serverless Systems
Xiaohu Chai, Tianyu Zhou, Keyang Hu et al.
Kamino: Efficient VM Allocation at Scale with Latency-Driven Cache-Aware Scheduling
David Domingo, Hugo Barbalho, Marco Molinaro et al.
KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads
Yue Guan, Yuanwei Fang, Keren Zhou et al.
KRR: Efficient and Scalable Kernel Record Replay
Tianren Zhang, Sishuai Gong, Pedro Fonseca
Low End-to-End Latency atop a Speculative Shared Log with Fix-Ante Ordering
Shreesha G. Bhat, Tony Hong, Xuhao Luo et al.
Mako: Speculative Distributed Transactions with Geo-Replication
Weihai Shen, Yang Cui, Siddhartha Sen et al.
MettEagle: Costs and Benefits of Implementing Containers on Microkernels
Till Miemietz, Viktor Reusch, Matthias Hille et al.
Mirage: A Multi-Level Superoptimizer for Tensor Programs
Mengdi Wu, Xinhao Cheng, Shengyu Liu et al.