Abstract: Analog Compute-In-Memory (ACIM) using Non-Volatile Memory arrays can accelerate large language model (LLM) inference by combining large weight capacity with efficient compute, yielding both energy and performance benefits. I will review IBM’s ACIM demonstrations, address the device, circuit, and architectural challenges unique to ACIM, and discuss future opportunities for LLM workloads.

Bio: Dr. Pritish Narayanan is a Principal Research Scientist at IBM Research, Almaden, where he leads Analog AI Accelerator design and test efforts. He has worked across the hardware ecosystem, from semiconductor fabrication to system software, and has given several keynote, invited, and tutorial talks.