Tony Zhao
Publications
Juntao Zhao, Qi Lu, Wei Jia, Borui Wan, Lei Zuo, Junda Feng, Jianyu Jiang, Yangrui Chen, Shuaishuai Cao, Jialing He, et al. (2026). MegaScale-Data: Scaling DataLoader for Multisource Large Foundation Model Training. EuroSys 2026.
Borui Wan, Juntao Zhao, Chuan Wu, Chuanxiong Guo, et al. (2026). Efficient LLM Serving on Hybrid Real-time and Best-effort Requests. IEEE INFOCOM 2026.
Juntao Zhao, Borui Wan, Chuan Wu, Yanghua Peng, Haibin Lin (2025). SplitQuant: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and Adaptive Quantization. PPoPP 2024 Poster; IEEE Cluster 2025.
Juntao Zhao, Jiuru Li, Chuan Wu (2025). Sandwich: Separating Prefill-Decode Compilation for Efficient CPU LLM Serving. DAC 2026.
Juntao Zhao, Borui Wan, Yanghua Peng, Haibin Lin, Yibo Zhu, Chuan Wu (2024). QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices. 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2024).
Juntao Zhao, Wenhao Lu, Sheng Wang, Lingpeng Kong, Chuan Wu (2024). QSpec: Speculative Decoding with Complementary Quantization Schemes. EMNLP 2025 Main.
Juntao Zhao, Borui Wan, Yanghua Peng, Haibin Lin, Chuan Wu (2024). LLM-PQ: Serving LLM on heterogeneous clusters with phase-aware partition and adaptive quantization. arXiv preprint arXiv:2403.01136.
Hanpeng Hu, Junwei Su, Juntao Zhao, Yanghua Peng, Yibo Zhu, Haibin Lin, Chuan Wu (2024). CDMPP: A device-model agnostic framework for latency prediction of tensor programs. Proceedings of the Nineteenth European Conference on Computer Systems.
Borui Wan, Juntao Zhao, Chuan Wu (2023). Adaptive message quantization and parallelization for distributed full-graph GNN training. Proceedings of Machine Learning and Systems.
Yu Chen, Tian Min, Juntao Zhao, Wei Cai (2022). Synchronization in games sound: an audiovisual study on player experience and performance. Proceedings of the 2nd Workshop on Games Systems.