AdaLo: Adaptive learning rate optimizer with loss for classification

Jae Jin Jeong, Gyogwon Koo

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Gradient-based algorithms are frequently used to optimize neural networks, and various methods have been developed to enhance their performance. Among them, the adaptive moment estimation (Adam) optimizer is well known for its effectiveness and ease of implementation. However, it generalizes poorly without a learning rate scheduler. It also carries a large computational burden because of its per-parameter learning rate terms, known as the second-order moments of the gradients. In this study, we propose a novel gradient descent algorithm called AdaLo, which stands for Adaptive Learning Rate Optimizer with Loss. AdaLo addresses both problems through its adaptive learning rate (ALR). First, the ALR adjusts the learning rate based on the model's training progress, specifically the loss value; AdaLo's ALR therefore effectively replaces traditional learning rate schedulers. Second, the ALR is a single scalar global learning rate, which reduces the computational burden. In addition, the stability of the proposed method is analyzed from the perspective of the learning rate. The advantages of AdaLo were demonstrated on non-convex functions, and simulation results indicated that the proposed optimizer outperformed Adam, AdaBelief, and diffGrad with regard to training error and test accuracy.
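The abstract states only that AdaLo replaces Adam's per-parameter second-moment terms with a single scalar learning rate driven by the current loss. The following is a minimal sketch of that idea; the mapping from loss to step size (`lr0 * loss / (loss + 1)`) is an illustrative placeholder, not the published ALR formula, and the first-moment update is borrowed from Adam for concreteness.

```python
def adalo_step(params, grads, m, loss, lr0=1e-3, beta=0.9):
    """One hypothetical AdaLo-style update on lists of scalar parameters.

    - `loss` sets a single global step size (the ALR), so no per-parameter
      second-moment buffer v_t is kept, unlike Adam.
    - The loss->ALR mapping below is an assumption for illustration only.
    """
    alr = lr0 * loss / (loss + 1.0)  # scalar ALR: shrinks as training loss falls
    new_params, new_m = [], []
    for p, g, mi in zip(params, grads, m):
        mi = beta * mi + (1.0 - beta) * g  # first-moment (momentum) estimate
        new_params.append(p - alr * mi)    # one global step size for all params
        new_m.append(mi)
    return new_params, new_m


# Usage: minimize f(x) = x^2 from x = 2; the step size decays with the loss.
params, m = [2.0], [0.0]
for _ in range(300):
    loss = params[0] ** 2
    grads = [2.0 * params[0]]
    params, m = adalo_step(params, grads, m, loss, lr0=0.1)
```

Because the ALR is tied to the loss itself, the step size decays automatically as training converges, which is the scheduler-free behavior the abstract describes.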

Original language: English
Article number: 121607
Journal: Information Sciences
Volume: 690
State: Published - Feb 2025

Bibliographical note

Publisher Copyright:
© 2024 Elsevier Inc.

Keywords

  • CIFAR
  • Gradient descent
  • Learning rate scheduler
  • Neural network
  • Optimization
