QSpec: Speculative Decoding with Complementary Quantization Schemes
Jan 1, 2024 · Juntao Zhao, Wenhao Lu, Sheng Wang, Lingpeng Kong, Chuan Wu
Last updated on Jan 1, 2024