Abstract:
Fine-grained images exhibit small inter-class differences and large intra-class variance. Existing classification algorithms focus only on extracting and learning representations of salient local features within a single image and ignore the heterogeneous local semantic information that discriminates between multiple images. As a result, they struggle to attend to the subtle details that distinguish categories, and the learned features lack sufficient discriminative power. This paper proposes a progressive network that learns image information at different levels of granularity in a weakly supervised manner. First, an attention accumulation object localization module (AAOLM) is constructed; for a single image, it integrates attention information from different training epochs and feature extraction stages to localize the semantic target. Second, a multi-image heterogeneous local interactive graph module (HLIGM) is designed. After the salient local region features of each image are extracted, HLIGM builds a graph network and, guided by the category labels, aggregates information across the local region features of multiple images to enhance the discriminative power of the representation. Finally, the optimization information generated by HLIGM is fed back to the backbone through knowledge distillation, so that the backbone can directly extract strongly discriminative features and the computational overhead of building the graph is avoided in the test phase. Experiments on multiple datasets demonstrate the effectiveness of the proposed method and show that it improves fine-grained classification accuracy. © 2024 Science Press. All rights reserved.
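The abstract does not state the form of the distillation objective. As an illustration only, the sketch below shows a generic soft-target distillation loss (temperature-softened KL divergence), a common choice for feeding a stronger branch's predictions back to a backbone. The mapping of "teacher" to the HLIGM-enhanced branch and "student" to the backbone, the function names, and the temperature value are all assumptions and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical roles: teacher = graph-enhanced branch, student = backbone.
    This is a generic formulation, not the loss used in the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


if __name__ == "__main__":
    # Identical logits give zero loss; mismatched logits give a positive loss.
    print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print(distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]))
```

Under this kind of objective the graph is needed only while training. At test time the backbone runs alone, which is consistent with the abstract's stated goal of avoiding graph construction during inference.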
Source:
Acta Automatica Sinica
ISSN: 0254-4156
Year: 2024
Issue: 11
Volume: 50
Page: 2219-2230