Efficient neural network acceleration on GPGPU using content addressable memory

Mohsen Imani, Daniel Peroni, Yeseong Kim, Abbas Rahimi, Tajana Rosing

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

45 Scopus citations

Abstract

Recently, neural networks have been demonstrated to be effective models for image processing, video segmentation, speech recognition, computer vision and gaming. However, energy-intensive computation and low performance are the primary bottlenecks of running neural networks. In this paper, we propose an energy- and performance-efficient network acceleration technique for the General Purpose GPU (GPGPU) architecture that utilizes specialized resistive nearest content addressable memory blocks, called NNCAM, by exploiting the computation locality of learning algorithms. NNCAM stores highly frequent patterns corresponding to neural network operations and searches for the most similar patterns to reuse the computation results. To improve NNCAM computation efficiency and accuracy, we propose layer-based associative update and selective approximation techniques. The layer-based update improves the data locality of NNCAM blocks by filling NNCAM values based on the frequent computation patterns of each neural network layer. To guarantee the appropriate level of computation accuracy while providing maximum energy savings, our design adaptively allocates the neural network operations to either NNCAM or GPGPU floating point units (FPUs). The selective approximation relaxes computation on neural network layers by considering the impact on accuracy. For evaluation, we integrate NNCAM blocks into the modern AMD Southern Islands GPU architecture. Our experimental evaluation shows that the enhanced GPGPU achieves 68% energy savings and a 40% speedup on four popular convolutional neural networks (CNNs), while keeping quality loss at an acceptable level of less than 2%.
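
The reuse idea described in the abstract can be illustrated with a small software sketch. This is not the authors' NNCAM design, whose details the abstract does not give. It is a hypothetical Python stand-in that assumes 8-bit quantized operands in [-1, 1], a nearest search by Hamming distance, and a distance threshold that decides between reusing a stored result and computing an exact multiply (standing in for the FPU). All names (`NearestLookupTable`, `fill_from_profile`, `max_distance`) are invented for illustration.

```python
from collections import Counter


def quantize(x, bits=8, lo=-1.0, hi=1.0):
    """Map a float to an unsigned integer code (assumed encoding, not from the paper)."""
    x = min(max(x, lo), hi)
    levels = (1 << bits) - 1
    return round((x - lo) / (hi - lo) * levels)


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


class NearestLookupTable:
    """Software stand-in for a nearest-match associative memory (hypothetical)."""

    def __init__(self, capacity, max_distance):
        self.capacity = capacity          # number of stored patterns
        self.max_distance = max_distance  # largest Hamming distance still reused
        self.entries = []                 # list of ((code_a, code_b), product)
        self.hits = 0
        self.misses = 0

    def fill_from_profile(self, operand_pairs):
        """Layer-based fill: keep the most frequent operand patterns of one layer."""
        counts = Counter()
        sums = {}
        for a, b in operand_pairs:
            key = (quantize(a), quantize(b))
            counts[key] += 1
            sums[key] = sums.get(key, 0.0) + a * b
        # Store the mean exact product observed for each frequent pattern.
        self.entries = [
            (key, sums[key] / n) for key, n in counts.most_common(self.capacity)
        ]

    def _distance(self, stored_key, key):
        return hamming(stored_key[0], key[0]) + hamming(stored_key[1], key[1])

    def multiply(self, a, b):
        """Reuse a stored product if a close pattern exists, else compute exactly."""
        key = (quantize(a), quantize(b))
        best = min(self.entries, key=lambda e: self._distance(e[0], key), default=None)
        if best is not None and self._distance(best[0], key) <= self.max_distance:
            self.hits += 1
            return best[1]
        self.misses += 1
        return a * b  # exact fallback, standing in for the FPU


# Usage: profile one layer, then serve multiplications from the table.
table = NearestLookupTable(capacity=1, max_distance=2)
table.fill_from_profile([(0.5, 0.5)] * 3 + [(0.25, -0.5)])
approx = table.multiply(0.5, 0.49)   # close pattern: reuses the stored 0.25
exact = table.multiply(-1.0, 1.0)    # distant pattern: falls back to -1.0
```

In this sketch, a separate table per layer with its own `max_distance` would play the role of selective approximation: a tighter threshold for accuracy-sensitive layers sends more operations to the exact path, while a looser one trades accuracy for more reuse.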

Original language: English
Title of host publication: Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1026-1031
Number of pages: 6
ISBN (Electronic): 9783981537093
State: Published - 11 May 2017
Event: 20th Design, Automation and Test in Europe, DATE 2017 - Swisstech, Lausanne, Switzerland
Duration: 27 Mar 2017 to 31 Mar 2017

Publication series

Name: Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017

Conference

Conference: 20th Design, Automation and Test in Europe, DATE 2017
Country/Territory: Switzerland
City: Swisstech, Lausanne
Period: 27/03/17 to 31/03/17

Bibliographical note

Publisher Copyright:
© 2017 IEEE.
