Abstract:
Anomaly detection in industrial manufacturing faces challenges such as limited training data, noise interference, and the inefficiency of single-model architectures. To address these problems, this article proposes a cross-architecture knowledge distillation framework in which a transformer (DINOv2-small) serves as the teacher network and a lightweight CNN (ResNet18) serves as the student network. The two are connected by ResNet block adapters, which address the gradient mismatch between heterogeneous architectures. A cross-attention mechanism further enhances multi-level feature transfer through query-key-value interaction. Evaluation on the VisA dataset shows that the approach achieves state-of-the-art results: image-level AUROC of 94.5%, pixel-level AUROC of 98.6%, pixel-level AP of 47.1%, and region-level AUPRO of 93.3%, while maintaining an inference time of 12 seconds. Compared with existing methods, the model shows stronger robustness and localization accuracy on complex textures (e.g., 'cashew') and continuous defective regions (e.g., 'pipe_fryum'). The study provides a low-cost, real-time solution for industrial inspection that balances efficiency and accuracy. Future work should explore channel pruning and quantization for further optimization. © 2025 IEEE.
Year: 2025
Page: 572-577
Language: English