MegaScale-Data: Scaling DataLoader for Multisource Large Foundation Model Training

Jan 1, 2026 · Juntao Zhao, Qi Lu, Wei Jia, Borui Wan, Lei Zuo, Junda Feng, Jianyu Jiang, Yangrui Chen, Shuaishuai Cao, Jialing He, and others
Last updated on Jan 1, 2026


© 2026 Me. This work is licensed under CC BY NC ND 4.0
