Xupeng Miao

Xupeng Miao is an Assistant Professor in the Department of Computer Science at Purdue University. Before that, he was a Postdoctoral Fellow working with Prof. Zhihao Jia and Prof. Tianqi Chen in the Catalyst Group and the Parallel Data Lab in the Computer Science Department at Carnegie Mellon University. He received his Ph.D. in computer science from Peking University in June 2022, supervised by Prof. Bin Cui. He is broadly interested in machine learning systems, data management, and distributed computing.

Prospective students: I am actively looking for strong, self-motivated PhD/MS students, postdocs, and (remote) interns who are interested in building systems for machine learning to join my group. If you would like to work with me, please email me your CV and transcripts.

news

Jun 18, 2024	I was awarded the WAIC 2024 Yunfan Award · Bright Stars!
Apr 24, 2024	We will present a tutorial on efficient LLM serving at ICML 2024.
Apr 23, 2024	I was invited to give a talk at the ASPLOS’24 XTensor workshop.
Apr 4, 2024	I was invited to give a talk at the MLSys’24 Young Professionals Symposium.
Feb 29, 2024	SpecInfer was accepted by ASPLOS 2024.
Feb 28, 2024	SpotServe has been selected for the Distinguished Artifact Award at ASPLOS 2024!
Feb 2, 2024	We will present a tutorial on data management for LLMs at SIGMOD 2024.
Dec 23, 2023	We released a survey on efficient generative LLM serving on arXiv.
Dec 7, 2023	One paper on distributed training over spot instances was accepted by NSDI 2024.
Nov 7, 2023	One paper about LLM serving over preemptible instances was accepted by ASPLOS 2024.
May 16, 2023	We announced SpecInfer, the first speculative LLM inference engine.
May 13, 2023	Three papers were accepted by VLDB 2023.
Mar 23, 2023	One paper was accepted by OSDI 2023.
Jan 30, 2023	I am grateful to have been awarded the 2022 ACM China Doctoral Dissertation Award.

selected publications

ASPLOS

SpotServe: Serving Generative Large Language Models on Preemptible Instances (Distinguished Artifact Award)

Xupeng Miao, Chunan Shi, Jiangfei Duan, Xiaoli Xi and 3 more authors

Proceedings of ASPLOS Conference 2024

@article{asplos24spotserve,
  title = {SpotServe: Serving Generative Large Language Models on Preemptible Instances},
  author = {Miao, Xupeng and Shi, Chunan and Duan, Jiangfei and Xi, Xiaoli and Lin, Dahua and Cui, Bin and Jia, Zhihao},
  journal = {Proceedings of ASPLOS Conference},
  year = {2024},
  notes = {(Distinguished Artifact Award)},
}

ASPLOS

SpecInfer: Accelerating Generative Large Language Model Serving with Speculative Inference and Token Tree Verification

Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng and 10 more authors

Proceedings of ASPLOS Conference 2024

@article{miao23specinfer,
  title = {SpecInfer: Accelerating Generative Large Language Model Serving with Speculative Inference and Token Tree Verification},
  author = {Miao, Xupeng and Oliaro, Gabriele and Zhang, Zhihao and Cheng, Xinhao and Wang, Zeyu and Wong, Rae Ying Yee and Zhu, Alan and Yang, Lijie and Shi, Xiaoxiang and Shi, Chunan and Chen, Zhuoming and Arfeen, Daiyaan and Abhyankar, Reyna and Jia, Zhihao},
  journal = {Proceedings of ASPLOS Conference},
  year = {2024},
  doi = {10.48550/arXiv.2305.09781},
}

NSDI

Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances

Jiangfei Duan¹, Ziang Song¹, Xupeng Miao¹, Xiaoli Xi and 4 more authors

Proceedings of NSDI Conference 2024

@article{nsdi24parcae,
  title = {Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances},
  author = {Duan, Jiangfei and Song, Ziang and Miao, Xupeng and Xi, Xiaoli and Lin, Dahua and Xu, Harry and Zhang, Minjia and Jia, Zhihao},
  journal = {Proceedings of NSDI Conference},
  cofirst = {true},
  year = {2024},
}

VLDB

SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training

Xupeng Miao, Yining Shi, Zhi Yang, Bin Cui and 1 more author

Proc. VLDB Endow. 2023

@article{miao2023sdpipe,
  title = {SDPipe: A Semi-Decentralized Framework for Heterogeneity-aware Pipeline-parallel Training},
  author = {Miao, Xupeng and Shi, Yining and Yang, Zhi and Cui, Bin and Jia, Zhihao},
  journal = {Proc. {VLDB} Endow.},
  volume = {16},
  year = {2023},
  publisher = {VLDB Endowment},
}

VLDB

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism

Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi and 3 more authors

Proc. VLDB Endow. 2023

@article{miao2023galvatron,
  title = {Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism},
  author = {Miao, Xupeng and Wang, Yujie and Jiang, Youhe and Shi, Chunan and Nie, Xiaonan and Zhang, Hailin and Cui, Bin},
  journal = {Proc. {VLDB} Endow.},
  volume = {16},
  number = {3},
  pages = {470--479},
  year = {2023},
  doi = {10.14778/3570690.3570697},
  publisher = {VLDB Endowment},
}

VLDB

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework (Best Scalable Data Science Paper)

Xupeng Miao, Hailin Zhang, Yining Shi, Xiaonan Nie and 3 more authors

Proc. VLDB Endow. 2022

@article{miao2021het,
  title = {{HET:} Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework},
  author = {Miao, Xupeng and Zhang, Hailin and Shi, Yining and Nie, Xiaonan and Yang, Zhi and Tao, Yangyu and Cui, Bin},
  journal = {Proc. {VLDB} Endow.},
  volume = {15},
  number = {2},
  pages = {312--320},
  year = {2022},
  publisher = {VLDB Endowment},
  notes = {(Best Scalable Data Science Paper)},
}

SIGMOD

HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training

Xupeng Miao, Yining Shi, Hailin Zhang, Xin Zhang and 3 more authors

In Proceedings of SIGMOD Conference 2022

@inproceedings{miao2022hetgmp,
  author = {Miao, Xupeng and Shi, Yining and Zhang, Hailin and Zhang, Xin and Nie, Xiaonan and Yang, Zhi and Cui, Bin},
  title = {{HET-GMP:} {A} Graph-based System Approach to Scaling Large Embedding Model Training},
  booktitle = {Proceedings of SIGMOD Conference},
  pages = {470--480},
  publisher = {{ACM}},
  year = {2022},
  doi = {10.1145/3514221.3517902},
}

SIGMOD

Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce

Xupeng Miao, Xiaonan Nie, Yingxia Shao, Zhi Yang and 3 more authors

In Proceedings of SIGMOD Conference 2021

@inproceedings{DBLP:conf/sigmod/MiaoNSYJM021,
  author = {Miao, Xupeng and Nie, Xiaonan and Shao, Yingxia and Yang, Zhi and Jiang, Jiawei and Ma, Lingxiao and Cui, Bin},
  title = {Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce},
  booktitle = {Proceedings of SIGMOD Conference},
  pages = {2262--2270},
  publisher = {{ACM}},
  year = {2021},
  doi = {10.1145/3448016.3452773},
}