DualPIM: A Dual-Precision and Low-Power CNN Inference Engine Using SRAM- and eDRAM-based Processing-in-Memory Arrays

Sangwoo Jung, Jaehyun Lee, Huiseong Noh, Jong Hyeok Yoon, Jaeha Kung

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Recently, the machine learning community has focused on developing deep learning models that are not only accurate but also efficient to deploy on resource-limited devices. One popular approach to improving model efficiency is to aggressively quantize both features and weight parameters. However, quantization generally entails accuracy degradation; thus, additional compensation techniques are required. In this work, we present a novel network architecture, named DualNet, that leverages two separate bit-precision paths to effectively achieve high accuracy and low model complexity. On top of this new network architecture, we propose to utilize both SRAM- and eDRAM-based processing-in-memory (PIM) arrays, named DualPIM, to run each computing path in a DualNet on a dedicated PIM array. As a result, the proposed DualNet significantly reduces energy consumption by 81% on average compared to other quantized neural networks (i.e., 4-bit and ternary), while achieving 1.3% higher accuracy on average.
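The paper itself specifies how DualNet's two bit-precision paths are constructed and mapped onto the SRAM- and eDRAM-based PIM arrays; the sketch below is only a hypothetical PyTorch illustration of the general dual-precision idea in the abstract. It runs the same input through a 4-bit fake-quantized path and a ternary path of one convolution and sums the two partial outputs. The `DualPrecisionConv` module, the quantizers, and the summation rule are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical sketch of a dual-precision convolution block: one input,
# two weight-precision paths (4-bit and ternary), outputs combined.
# In a PIM mapping, the higher-precision path could sit in an SRAM array
# and the ternary path in an eDRAM array; this is an assumption, not the
# paper's actual design.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Symmetric uniform fake-quantization of weights to `bits` precision."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale


def ternarize(w: torch.Tensor) -> torch.Tensor:
    """Map weights to {-t, 0, +t} using a simple magnitude threshold."""
    thresh = 0.7 * w.abs().mean()
    mask = (w.abs() > thresh).float()
    # Scale t: mean magnitude of the surviving weights (0 if none survive).
    t = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)
    return torch.sign(w) * mask * t


class DualPrecisionConv(nn.Module):
    """One convolution evaluated along two precision paths, then summed."""

    def __init__(self, in_ch: int, out_ch: int, k: int = 3, high_bits: int = 4):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)
        self.high_bits = high_bits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_hi = fake_quantize(self.weight, self.high_bits)  # e.g., 4-bit path
        w_lo = ternarize(self.weight)                       # ternary path
        y_hi = F.conv2d(x, w_hi, padding=1)
        y_lo = F.conv2d(x, w_lo, padding=1)
        return y_hi + y_lo  # hypothetical combination rule


if __name__ == "__main__":
    x = torch.randn(1, 3, 32, 32)
    y = DualPrecisionConv(3, 16)(x)
    print(y.shape)  # torch.Size([1, 16, 32, 32])
```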

Original language: English
Title of host publication: Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 70-73
Number of pages: 4
ISBN (Electronic): 9781665409964
DOIs
State: Published - 2022
Event: 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022 - Incheon, Korea, Republic of
Duration: 13 Jun 2022 - 15 Jun 2022

Publication series

Name: Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022

Conference

Conference: 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
Country/Territory: Korea, Republic of
City: Incheon
Period: 13/06/22 - 15/06/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Keywords

  • convolutional neural networks
  • deep learning
  • processing-in-memory
  • quantized neural networks
