SPET: Transparent SRAM Allocation and Model Partitioning for Real-time DNN Tasks on Edge TPU

Changhun Han, Hoon Sung Chwa, Kilho Lee, Sangeun Oh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Deep neural networks (DNNs) have been deployed in many safety-critical real-time embedded systems. To support DNN tasks in real time, most previous studies have focused on GPUs or CPUs; the Edge TPU has not yet been studied with respect to real-time guarantees. This paper presents a real-time DNN framework for the Edge TPU that satisfies the timing requirements of multiple DNN inference tasks. The proposed framework provides 1) SRAM allocation and model partitioning techniques and 2) a mixed-integer programming (MIP)-based algorithm that determines the amount of SRAM and the number of segments for each task. Experimental results show that our framework provides 79% higher schedulability than the existing Edge TPU system.
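To make the decision problem in the abstract concrete, the following is a minimal sketch of choosing an SRAM share and a segment count per task. It is not the paper's formulation: the execution-time model, the feasibility check, the task parameters, and every name in it are hypothetical, and plain exhaustive search stands in for the MIP-based algorithm.

```python
from itertools import product

# Hypothetical task set: period (ms) plus a made-up execution-time model.
# None of these numbers or names come from the paper.
TASKS = [
    {"name": "A", "period": 20.0, "base": 6.0, "miss_penalty": 8.0},
    {"name": "B", "period": 50.0, "base": 10.0, "miss_penalty": 20.0},
]
TOTAL_SRAM = 8          # assumed SRAM budget, in abstract units
MAX_SEGMENTS = 3        # assumed cap on segments per task
SEGMENT_OVERHEAD = 0.5  # assumed cost (ms) of each extra segment


def exec_time(task, sram, segments):
    """Toy model: more SRAM lowers fetch cost; each extra segment adds overhead."""
    fetch = task["miss_penalty"] * (1.0 - sram / TOTAL_SRAM)
    return task["base"] + fetch + SEGMENT_OVERHEAD * (segments - 1)


def blocking(task, sram, segments):
    """Longest non-preemptive chunk, assuming equal-length segments."""
    return exec_time(task, sram, segments) / segments


def utilization(config):
    """Sum of execution time over period for a per-task (sram, segments) choice."""
    return sum(exec_time(t, s, k) / t["period"] for t, (s, k) in zip(TASKS, config))


def feasible(config):
    """Stand-in check (not the paper's analysis): utilization <= 1, and each
    task plus the longest segment of any other task fits in its period."""
    if utilization(config) > 1.0:
        return False
    for i, (task, (s, k)) in enumerate(zip(TASKS, config)):
        others = [
            blocking(u, su, ku)
            for j, (u, (su, ku)) in enumerate(zip(TASKS, config))
            if j != i
        ]
        if exec_time(task, s, k) + max(others, default=0.0) > task["period"]:
            return False
    return True


def search():
    """Enumerate every (sram, segments) pair per task and return the feasible
    configuration with the lowest utilization, or None if there is none."""
    options = list(product(range(TOTAL_SRAM + 1), range(1, MAX_SEGMENTS + 1)))
    best = None
    for config in product(options, repeat=len(TASKS)):
        if sum(s for s, _ in config) > TOTAL_SRAM:
            continue  # exceeds the shared SRAM budget
        if not feasible(config):
            continue
        util = utilization(config)
        if best is None or util < best[0]:
            best = (util, config)
    return best


if __name__ == "__main__":
    print(search())
```

In this toy model, giving all SRAM to task A and running task B as a single segment fails the check because B blocks A for too long, while splitting B into three segments passes. Exhaustive search only works at this scale; a solver-based formulation would be needed for realistic task sets.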

Original language: English
Title of host publication: 2023 60th ACM/IEEE Design Automation Conference, DAC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350323481
State: Published - 2023
Event: 60th ACM/IEEE Design Automation Conference, DAC 2023 - San Francisco, United States
Duration: 9 Jul 2023 → 13 Jul 2023

Publication series

Name: Proceedings - Design Automation Conference
Volume: 2023-July
ISSN (Print): 0738-100X

Conference

Conference: 60th ACM/IEEE Design Automation Conference, DAC 2023
Country/Territory: United States
City: San Francisco
Period: 9/07/23 → 13/07/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.
