Analog Compute-in-Memory Accelerators for Deep Learning

MTL Seminar Series
Pritish Narayanan, IBM

Abstract

Analog Compute-in-Memory (ACIM) using Non-Volatile Memory (NVM) arrays can accelerate large language model (LLM) inference: weights remain resident in dense NVM arrays and multiply-accumulate operations are performed in place, combining large weight capacity with efficient compute to deliver energy and performance benefits. I will review IBM's work on ACIM hardware demonstrations, address the unique device-, circuit-, and architecture-level challenges of this approach, and discuss future opportunities for LLM workloads.
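To give a flavor of the technique the talk covers, the following is a minimal conceptual sketch (not IBM's implementation) of an idealized analog compute-in-memory matrix-vector multiply: signed weights are mapped to differential pairs of NVM conductances, inputs are applied as voltages, and Ohm's law plus Kirchhoff's current law perform the multiply-accumulate on each bit line. The conductance scale `G_MAX` and the mapping scheme are illustrative assumptions.

```python
G_MAX = 25e-6  # assumed maximum device conductance (25 uS), illustrative only

def map_weights(weights):
    """Map signed weights in [-1, 1] to (G_plus, G_minus) conductance pairs.

    Positive weights are encoded on the G_plus device, negative weights
    on the G_minus device; the effective weight is (G_plus - G_minus).
    """
    gp, gm = [], []
    for row in weights:
        gp.append([G_MAX * w if w > 0 else 0.0 for w in row])
        gm.append([-G_MAX * w if w < 0 else 0.0 for w in row])
    return gp, gm

def analog_mvm(gp, gm, voltages):
    """Ideal crossbar readout: I_j = sum_i V_i * (G_plus[i][j] - G_minus[i][j])."""
    n_rows, n_cols = len(gp), len(gp[0])
    currents = [0.0] * n_cols
    for i in range(n_rows):
        for j in range(n_cols):
            currents[j] += voltages[i] * (gp[i][j] - gm[i][j])
    return currents

weights = [[0.5, -0.25], [1.0, 0.75]]  # 2x2 weight matrix (rows = inputs)
x = [0.2, 0.1]                          # input activations applied as volts
gp, gm = map_weights(weights)
y = analog_mvm(gp, gm, x)
# Digital reference: y_j = G_MAX * sum_i x_i * W_ij
ref = [G_MAX * sum(x[i] * weights[i][j] for i in range(2)) for j in range(2)]
```

In a real device, the seminar's "unique device, circuit and architectural challenges" enter exactly here: conductance drift, programming noise, and ADC quantization all perturb the ideal currents computed above.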