Publications
Publications by category, in reverse chronological order. A superscript 1 denotes a co-first author.
2025
- [ASPLOS] Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs. Proceedings of ASPLOS Conference 2025
2024
- [SOSP] Enabling Parallelism Hot Switching for Efficient Training of Large Language Models. Proceedings of SOSP Conference 2024
- [SC] Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs. Proceedings of SC Conference 2024
- [ASPLOS] SpotServe: Serving Generative Large Language Models on Preemptible Instances (Distinguished Artifact Award). Proceedings of ASPLOS Conference 2024
- [ASPLOS] SpecInfer: Accelerating Generative Large Language Model Serving with Speculative Inference and Token Tree Verification. Proceedings of ASPLOS Conference 2024
- [ASPLOS] Optimal Kernel Orchestration for Tensor Programs with Korch. Proceedings of ASPLOS Conference 2024
- [VLDB] Experimental Analysis of Large-scale Learnable Vector Storage Compression. Proc. VLDB Endow. 2024
- [ICDE] MFIX: An Efficient and Reliable Index Advisor via Multi-Fidelity Bayesian Optimization. Proceedings of ICDE Conference 2024
2023
- [OSDI] EinNet: Optimizing Tensor Programs with Derivation-Based Transformations. Proceedings of OSDI Conference 2023
- [VLDB] SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training. Proc. VLDB Endow. 2023
- [VLDB] Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism. Proc. VLDB Endow. 2023
- [SIGMOD] FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement. Proceedings of SIGMOD Conference 2023
2022
- [VLDB] HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework (Best Scalable Data Science Paper Award). Proc. VLDB Endow. 2022
- [VLDB] Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates. Proc. VLDB Endow. 2022
- [SIGMOD] HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training. Proceedings of SIGMOD Conference 2022
- [VLDBJ] P2CG: A Privacy Preserving Collaborative Graph Neural Network Training Framework. The VLDB Journal 2022