Abstract
Wireless technologies such as WirelessHART are being adopted in industrial wireless sensor-actuator networks (IWSANs), which must provide reliable quality of control (QoC). This article focuses on adaptively selecting the best network path for reliable QoC in an IWSAN. The main challenge is estimating the time-varying packet delivery ratio (PDR) of each path. The IWSAN path selection problem is formulated in a multi-armed bandit (MAB) framework, and a novel algorithm, criticality-aware adaptive path learning (CAPL), is proposed. CAPL determines the criticality of each packet according to the degree of QoC degradation that would result if the packet were lost. The key novelty of CAPL is that it simultaneously considers the fundamental exploration-exploitation trade-off in MAB and QoC in the IWSAN: it uses low-criticality packets for exploration to measure the PDR, which minimizes the impact of exploration on QoC. CAPL is validated through extensive simulations and empirical studies of DC motor position control.
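The idea of spending only low-criticality packets on exploration can be illustrated with a minimal sketch. This is not the CAPL algorithm as published: the fixed criticality threshold, the UCB1-style exploration index, the Bernoulli link model, and all names (`CriticalityAwareSelector`, `simulate`, `threshold`) are assumptions introduced here for illustration only.

```python
import math
import random


class CriticalityAwareSelector:
    """Hypothetical sketch: explore paths only with low-criticality packets."""

    def __init__(self, n_paths, threshold=0.5):
        self.n_paths = n_paths
        self.threshold = threshold      # assumed criticality cutoff in [0, 1]
        self.sent = [0] * n_paths       # packets sent on each path
        self.delivered = [0] * n_paths  # packets delivered on each path

    def pdr(self, path):
        # Empirical PDR estimate; 0.0 until the path has carried a packet.
        if self.sent[path] == 0:
            return 0.0
        return self.delivered[path] / self.sent[path]

    def select(self, criticality):
        paths = range(self.n_paths)
        if criticality < self.threshold:
            # Low-criticality packet: use it to explore.
            untried = [p for p in paths if self.sent[p] == 0]
            if untried:
                return untried[0]
            total = sum(self.sent)
            # UCB1-style index (an assumption, not taken from the article).
            return max(
                paths,
                key=lambda p: self.pdr(p)
                + math.sqrt(2 * math.log(total) / self.sent[p]),
            )
        # High-criticality packet: exploit the best current PDR estimate.
        return max(paths, key=self.pdr)

    def update(self, path, success):
        # Record the delivery outcome observed for one packet.
        self.sent[path] += 1
        self.delivered[path] += int(success)


def simulate(true_pdr, n_packets, seed=0):
    """Run the selector against assumed Bernoulli links with fixed PDRs."""
    rng = random.Random(seed)
    selector = CriticalityAwareSelector(len(true_pdr))
    for _ in range(n_packets):
        criticality = rng.random()  # assumed uniform criticality in [0, 1)
        path = selector.select(criticality)
        selector.update(path, rng.random() < true_pdr[path])
    return selector
```

Calling `simulate([0.5, 0.95, 0.7], 5000)` returns a selector whose `pdr` estimates can be inspected per path. The sketch keeps the PDRs fixed for simplicity, whereas the article addresses time-varying PDRs, so a faithful version would need to discount or window old observations.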
| Original language | English |
|---|---|
| Pages (from-to) | 9123-9133 |
| Number of pages | 11 |
| Journal | IEEE Transactions on Industrial Informatics |
| Volume | 19 |
| Issue number | 8 |
| State | Published - 1 Aug 2023 |
Bibliographical note
Publisher Copyright: © 2005-2012 IEEE.
Keywords
- Exploration-exploitation trade-off
- multiarmed bandit (MAB)
- quality of control (QoC)
- reinforcement learning (RL)
- wireless sensor-actuator networks (WSAN)