Plato: Plan to Efficiently Decode for Large Language Model Inference

Publication
Conference on Language Modeling
Shuowei Jin
Applied Scientist at Amazon

My research interests include efficient algorithms and systems for LLM inference and training, LLM post-training recipes, and general machine learning systems.