Geoffrey Burr¹, Pritish Narayanan¹, Stefano Ambrogio¹, Hsinyu Tsai¹, Robert M. Shelby¹

¹ IBM Almaden Research Center, San Jose, California, United States

Deep Neural Networks (DNNs) are very large artificial neural networks trained using very large datasets, typically using the supervised learning technique known as backpropagation. Currently, CPUs and GPUs are used for these computations. Over the next few years, we can expect special-purpose hardware accelerators based on conventional digital-design techniques to optimize the GPU framework for these DNN computations. Here there are opportunities to increase speed and reduce power for two distinct but related tasks: training and forward-inference. During training, the weights of a DNN are adjusted to improve network performance through repeated exposure to the labelled data-examples of a large dataset. Often this involves a distributed network of chips working together in the cloud. During forward-inference, already trained networks are used to analyze new data-examples, sometimes in a latency-constrained cloud environment and sometimes in a power-constrained environment (sensors, mobile phones, “edge-of-network” devices, etc.).

Even with the improved computational performance and efficiency expected from these special-purpose digital accelerators, there would remain an opportunity for still higher performance and better energy efficiency from neuromorphic computation based on analog memories.

In this presentation, we discuss the origin of this opportunity as well as the challenges inherent in delivering on it, with a particular focus on memory materials. We review our work towards neuromorphic chips for the hardware acceleration of training and inference of Fully-Connected DNNs [1-4]. We use arrays of emerging non-volatile memories (NVM), such as Phase Change Memory, to implement the synaptic weights connecting layers of neurons. We discuss the impact of real device characteristics – such as non-linearity, variability, asymmetry, and stochasticity – on performance, and describe how these effects determine the desired specifications for the analog resistive memories needed for this application. We present novel solutions that sidestep some of these issues in the near term, and describe some challenges in designing and implementing the CMOS circuitry around the NVM array. We end with an outlook on the prospects for analog memory-based DNN hardware accelerators.
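
As a rough illustration of the idea of storing synaptic weights in an NVM array, the Python sketch below models a single fully-connected layer whose weights are encoded as the difference of two conductances per synapse, together with a toy conductance-update rule that is non-linear and optionally noisy. Everything here is an assumption made for illustration and is not taken from [1-4]: the differential-pair encoding, the normalized conductance range, the saturating update formula, the abrupt reset, and all names (G_MIN, G_MAX, forward, potentiate, reset) are hypothetical.

```python
import random

# Hypothetical normalized conductance range (arbitrary units).
G_MIN, G_MAX = 0.0, 1.0


def forward(g_plus, g_minus, x):
    """Multiply-accumulate along each column of the array.

    Each weight is assumed to be the difference of two conductances,
    so output j is sum_i x[i] * (g_plus[i][j] - g_minus[i][j]).
    """
    n_out = len(g_plus[0])
    return [
        sum(x[i] * (g_plus[i][j] - g_minus[i][j]) for i in range(len(x)))
        for j in range(n_out)
    ]


def potentiate(g, step=0.1, noise=0.0, rng=random):
    """Apply one conductance-increasing pulse (toy model).

    Non-linearity: the increment shrinks as g approaches G_MAX.
    Stochasticity: the increment is scaled by a Gaussian factor
    whose relative spread is `noise` (0.0 disables it).
    """
    dg = step * (G_MAX - g) / (G_MAX - G_MIN)
    dg *= 1.0 + noise * rng.gauss(0.0, 1.0)
    return min(G_MAX, max(G_MIN, g + dg))


def reset(g):
    """Asymmetry (toy model): a decrease is abrupt, not gradual."""
    return G_MIN


if __name__ == "__main__":
    g_plus = [[0.5, 0.2], [0.1, 0.4]]
    g_minus = [[0.1, 0.2], [0.3, 0.1]]
    print(forward(g_plus, g_minus, [1.0, 2.0]))

    # Identical pulses give smaller and smaller steps near saturation.
    g = 0.0
    for _ in range(5):
        g = potentiate(g)
        print(round(g, 4))
```

In this toy model, the same programming pulse changes a weight by a different amount depending on the current conductance, and a decrease cannot be made in small steps at all. These are the kinds of non-ideal behavior that a training algorithm relying on small, symmetric weight updates would have to tolerate.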

[1] G. W. Burr et al., IEDM Tech. Digest, 29.5 (2014).
[2] G. W. Burr et al., IEEE Trans. Elec. Dev., 62(11), p. 3498 (2015).
[3] G. W. Burr et al., IEDM Tech. Digest, 4.4 (2015).
[4] P. Narayanan et al., IBM J. Res. Dev., 61(4/5), 11:1-11 (2017).