LLM-PQ: Serving LLM on heterogeneous clusters with phase-aware partition and adaptive quantization