Analog Compute-in-Memory Accelerators for Deep Learning
MTL Seminar Series
Pritish Narayanan, IBM
Abstract
Analog Compute-In-Memory (ACIM) using Non-Volatile Memory arrays can accelerate large language model (LLM) inference, combining large weight capacity with efficient compute and achieving energy and performance benefits. I will review IBM’s work on ACIM demos, addressing its unique device, circuit and architectural challenges and discussing future opportunities for LLM workloads.
Bio
Dr. Pritish Narayanan is a Principal Research Scientist at IBM Research, Almaden, where he leads Analog AI Accelerator design and test efforts. He has worked across the hardware ecosystem, from semiconductor fabrication to system software, and has given several keynote, invited, and tutorial talks.