QSpec: Speculative Decoding with Complementary Quantization Schemes
Jan 1, 2024 · Juntao Zhao, Wenhao Lu, Sheng Wang, Lingpeng Kong, Chuan Wu
Type: Journal article
Publication: arXiv preprint arXiv:2410.11305
Last updated on Jan 1, 2024