IEEE Published Article
This article is published by IEEE, and the copyright belongs to IEEE. The full text is available from the publisher.

Integrating Local and Global Frequency Attention for Multiteacher Knowledge Distillation


Abstract

Knowledge distillation (KD), particularly in multiteacher settings, presents significant challenges in effectively transferring knowledge from multiple complex models to a more compact student model. Traditional approaches often fall short in capturing the full spectrum of useful information. In this paper, we propose a novel method that integrates local and global frequency attention mechanisms to enhance the multiteacher KD process. By simultaneously addressing both fine-grained local details and broad global patterns, our approach improves the student model’s ability to assimilate and generalize from the diverse knowledge provided by multiple teachers. Experimental evaluations on standard benchmarks demonstrate that our method consistently outperforms existing multiteacher distillation techniques, achieving superior accuracy and robustness. Our results suggest that incorporating frequency-based attention mechanisms can significantly advance the effectiveness of KD in multiteacher scenarios, offering new insights and techniques for model compression and transfer learning.
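To make the idea concrete, the sketch below shows one plausible reading of the abstract: FFT-based attention gates applied to feature maps, with a "local" variant that gates each spatial frequency bin and a "global" variant that gates whole channels from the pooled spectrum, combined with a standard averaged multi-teacher logit-distillation loss. Every name and design detail here (FrequencyAttention, multi_teacher_kd_loss, the sigmoid gating) is an illustrative assumption, not the authors' published implementation, since only the abstract is freely available.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FrequencyAttention(nn.Module):
    """Reweights a feature map in the frequency domain (assumed design).

    mode="local"  : one gate per spatial frequency bin (fine-grained detail).
    mode="global" : one gate per channel from the pooled spectrum (broad patterns).
    """
    def __init__(self, channels: int, mode: str = "local"):
        super().__init__()
        self.mode = mode
        if mode == "global":
            self.fc = nn.Linear(channels, channels)
        else:
            self.conv = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Real 2-D FFT over the spatial dimensions -> complex spectrum.
        spec = torch.fft.rfft2(x, norm="ortho")              # (B, C, H, W//2+1)
        mag = spec.abs()                                     # real-valued magnitude
        if self.mode == "global":
            # One sigmoid gate per channel, derived from the pooled spectrum.
            gate = torch.sigmoid(self.fc(mag.mean(dim=(-2, -1))))  # (B, C)
            spec = spec * gate[..., None, None]
        else:
            # A sigmoid gate for every frequency bin preserves local detail.
            gate = torch.sigmoid(self.conv(mag))             # (B, C, H, W//2+1)
            spec = spec * gate
        # Inverse FFT back to the spatial domain, same size as the input.
        return torch.fft.irfft2(spec, s=x.shape[-2:], norm="ortho")

def multi_teacher_kd_loss(student_logits, teacher_logits_list, T: float = 4.0):
    """Average temperature-scaled KL divergence against each teacher
    (plain Hinton-style logit distillation; the frequency attention above
    would act on intermediate features, not on this loss)."""
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    losses = [
        F.kl_div(log_p_s, F.softmax(t / T, dim=1), reduction="batchmean") * T * T
        for t in teacher_logits_list
    ]
    return torch.stack(losses).mean()

if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)
    local_att = FrequencyAttention(64, mode="local")
    global_att = FrequencyAttention(64, mode="global")
    fused = local_att(feats) + global_att(feats)  # combine both frequency views
    print(fused.shape)                            # torch.Size([2, 64, 32, 32])

The summation of the two attended feature maps is one simple way to fuse the local and global branches; the paper may well use a learned fusion instead.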

Keywords

Knowledge distillation, frequency attention mechanisms, model compression, deep learning

Authors

Z. Yao
Graduate School of Science and Engineering, Hosei University, Tokyo, Japan
X. Cheng
Graduate School of Science and Engineering, Hosei University, Tokyo, Japan
Z. Zhang
School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, China
M. Du
Instrumentation Technology and Economy Institute, Beijing, China
W. Yu
Fujiang Laboratory, Sichuan Civil-military Integration Institute, Mianyang, China

Publication Details

Type
proceedings
Publisher
IEEE