Research Output Search

Query:

Scholar Name: 童同 (Tong Tong)

MSAByNet: A multiscale subtraction attention network framework based on Bayesian loss for medical image segmentation SCIE
Journal Article | 2025, 103 | BIOMEDICAL SIGNAL PROCESSING AND CONTROL

Abstract :

Medical image segmentation is a critical and complex process in medical image processing and analysis. With the development of artificial intelligence, the application of deep learning in medical image segmentation is becoming increasingly widespread. Existing techniques are mostly based on the U-shaped convolutional neural network and its variants, such as the U-Net framework, which uses skip connections or element-wise addition to fuse features from different levels in the decoder. However, these operations often weaken the compatibility between features at different levels, leading to a significant amount of redundant information and imprecise lesion segmentation. The construction of the loss function is a key factor in neural network design, but traditional loss functions lack domain generalization, and the interpretability of domain-invariant features needs improvement. To address these issues, we propose a Bayesian loss-based Multi-Scale Subtraction Attention Network (MSAByNet). Specifically, we propose an inter-layer and intra-layer multi-scale subtraction attention module, setting different receptive field sizes for modules at different levels to avoid losing feature-map resolution and edge detail features. Additionally, we design a multi-scale deep spatial attention mechanism to learn spatial-dimension information and enrich multi-scale differential information. Furthermore, we introduce a Bayesian loss, re-modeling the image in spatial terms, enabling our MSAByNet to capture stable shapes and improving domain generalization performance. We have evaluated our proposed network on two publicly available datasets: the BUSI dataset and the Kvasir-SEG dataset. Experimental results demonstrate that the proposed MSAByNet outperforms several state-of-the-art segmentation methods. The codes are available at https://github.com/zlxokok/MSAByNet.
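
The subtraction-based fusion described above can be illustrated with a minimal sketch: two same-resolution feature maps are convolved and their absolute difference is taken, so the fused output carries differential rather than redundant information. This is an illustrative reconstruction under assumed shapes, not the authors' implementation (their code is linked above).

```python
# Illustrative sketch of a subtraction unit (assumption, not MSAByNet's code).
import torch
import torch.nn as nn

class SubtractionUnit(nn.Module):
    """Fuses two feature maps via |conv(A) - conv(B)| so the output keeps
    differential information instead of additive, redundant information."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv_a = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv_b = nn.Conv2d(channels, channels, 3, padding=1)
        self.fuse = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        diff = torch.abs(self.conv_a(feat_a) - self.conv_b(feat_b))
        return self.fuse(diff)

# Toy usage: two decoder levels brought to the same resolution.
a, b = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
print(SubtractionUnit(64)(a, b).shape)  # torch.Size([1, 64, 32, 32])
```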

Keyword :

Bayesian loss; Deep convolutional neural networks; Deep learning; Medical image segmentation; Multi-scale processing

Cite:


GB/T 7714 Zhao, Longxuan , Wang, Tao , Chen, Yuanbin et al. MSAByNet: A multiscale subtraction attention network framework based on Bayesian loss for medical image segmentation [J]. | BIOMEDICAL SIGNAL PROCESSING AND CONTROL , 2025 , 103 .
MLA Zhao, Longxuan et al. "MSAByNet: A multiscale subtraction attention network framework based on Bayesian loss for medical image segmentation" . | BIOMEDICAL SIGNAL PROCESSING AND CONTROL 103 (2025) .
APA Zhao, Longxuan , Wang, Tao , Chen, Yuanbin , Zhang, Xinlin , Tang, Hui , Zong, Ruige et al. MSAByNet: A multiscale subtraction attention network framework based on Bayesian loss for medical image segmentation . | BIOMEDICAL SIGNAL PROCESSING AND CONTROL , 2025 , 103 .

Version :

MSAByNet: A multiscale subtraction attention network framework based on Bayesian loss for medical image segmentation Scopus
Journal Article | 2025, 103 | Biomedical Signal Processing and Control
MSAByNet: A multiscale subtraction attention network framework based on Bayesian loss for medical image segmentation EI
Journal Article | 2025, 103 | Biomedical Signal Processing and Control
Deep learning-based prediction of HER2 status and trastuzumab treatment efficacy of gastric adenocarcinoma based on morphological features SCIE
Journal Article | 2025, 23 (1) | JOURNAL OF TRANSLATIONAL MEDICINE

Abstract :

Background: First-line treatment for advanced gastric adenocarcinoma (GAC) with human epidermal growth factor receptor 2 (HER2) positivity is trastuzumab combined with chemotherapy. In clinical practice, HER2 positivity is identified through immunohistochemistry (IHC) or fluorescence in situ hybridization (FISH), whereas deep learning (DL) can predict HER2 status based on tumor histopathological features. However, it remains uncertain whether these deep learning-derived features can predict the efficacy of anti-HER2 therapy.
Methods: We analyzed a cohort of 300 consecutive surgical specimens and 101 biopsy specimens, all undergoing HER2 testing, along with 41 biopsy specimens from patients receiving trastuzumab-based therapy for HER2-positive GAC.
Results: We developed a convolutional neural network (CNN) model using surgical specimens that achieved an area under the curve (AUC) of 0.847 in predicting HER2 amplification, and an AUC of 0.903 in predicting HER2 status specifically in patients with HER2 2+ expression. The model also predicted HER2 status in gastric biopsy specimens, achieving an AUC of 0.723. Furthermore, when our classifier was trained using the 41 HER2-positive gastric biopsy specimens that had undergone trastuzumab treatment, it demonstrated an AUC of 0.833 for the (CR + PR) / (SD + PD) subgroup.
Conclusion: This work explores an algorithm that utilizes hematoxylin and eosin (H&E) staining to accurately predict HER2 status and assess the response to trastuzumab in GAC, potentially facilitating clinical decision-making.
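
As a rough illustration of the evaluation setting described above, the sketch below aggregates patch-level CNN probabilities into slide-level scores and computes an AUC. The mean-pooling aggregation rule and all data are synthetic assumptions for illustration, not the paper's pipeline.

```python
# Hypothetical slide-level AUC evaluation from patch-level CNN probabilities.
import numpy as np
from sklearn.metrics import roc_auc_score

def slide_score(patch_probs: np.ndarray) -> float:
    """Aggregate patch probabilities into one slide-level score (mean pooling)."""
    return float(patch_probs.mean())

rng = np.random.default_rng(0)
labels, scores = [], []
for slide_label in rng.integers(0, 2, size=40):        # 40 synthetic slides
    base = 0.65 if slide_label else 0.35               # HER2+ slides skew higher
    patch_probs = np.clip(rng.normal(base, 0.2, 100), 0, 1)  # 100 patches/slide
    labels.append(int(slide_label))
    scores.append(slide_score(patch_probs))

print(f"slide-level AUC: {roc_auc_score(labels, scores):.3f}")
```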

Keyword :

Deep learning; Efficacy; Gastric adenocarcinoma; HER2; Trastuzumab

Cite:


GB/T 7714 Wu, Zhida , Wang, Tao , Lan, Junlin et al. Deep learning-based prediction of HER2 status and trastuzumab treatment efficacy of gastric adenocarcinoma based on morphological features [J]. | JOURNAL OF TRANSLATIONAL MEDICINE , 2025 , 23 (1) .
MLA Wu, Zhida et al. "Deep learning-based prediction of HER2 status and trastuzumab treatment efficacy of gastric adenocarcinoma based on morphological features" . | JOURNAL OF TRANSLATIONAL MEDICINE 23 . 1 (2025) .
APA Wu, Zhida , Wang, Tao , Lan, Junlin , Wang, Jianchao , Chen, Gang , Tong, Tong et al. Deep learning-based prediction of HER2 status and trastuzumab treatment efficacy of gastric adenocarcinoma based on morphological features . | JOURNAL OF TRANSLATIONAL MEDICINE , 2025 , 23 (1) .

Version :

Deep learning-based prediction of HER2 status and trastuzumab treatment efficacy of gastric adenocarcinoma based on morphological features Scopus
Journal Article | 2025, 23 (1) | Journal of Translational Medicine
A novel framework for segmentation of small targets in medical images SCIE
Journal Article | 2025, 15 (1) | SCIENTIFIC REPORTS
WoS CC Cited Count: 1

Abstract :

Medical image segmentation represents a pivotal and intricate procedure in the domain of medical image processing and analysis. With the progression of artificial intelligence in recent years, the utilization of deep learning techniques for medical image segmentation has witnessed escalating popularity. Nevertheless, the intricate nature of medical images poses challenges, and the segmentation of diminutive targets is still in its early stages. Current networks encounter difficulties in addressing the segmentation of exceedingly small targets, especially when the number of training samples is limited. To overcome this constraint, we have implemented a proficient strategy to enhance lesion images containing small targets under constrained samples. We introduce a segmentation framework termed STS-Net, specifically designed for small target segmentation. This framework leverages the established capacity of convolutional neural networks to acquire effective image representations. The proposed STS-Net network adopts a ResNeXt50-32x4d architecture as the encoder, integrating attention mechanisms during the encoding phase to amplify the feature representation capabilities of the network. We evaluated the proposed network on four publicly available datasets. Experimental results underscore the superiority of our approach in the domain of medical image segmentation, particularly for small target segmentation. The codes are available at https://github.com/zlxokok/STSNet.
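
A minimal sketch of the encoder design named above: a torchvision ResNeXt50-32x4d backbone truncated before its classifier, with a squeeze-and-excitation style channel-attention block on the deepest features. The attention block here is an assumed stand-in; the actual STS-Net module may differ (see the repository linked above).

```python
# Sketch of a ResNeXt50-32x4d encoder with channel attention (assumptions).
import torch
import torch.nn as nn
from torchvision.models import resnext50_32x4d

class SEBlock(nn.Module):
    """Squeeze-and-excitation style channel attention."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.fc(x.mean(dim=(2, 3)))      # global average pool -> channel weights
        return x * w[:, :, None, None]

backbone = resnext50_32x4d(weights=None)                  # random init for the demo
encoder = nn.Sequential(*list(backbone.children())[:-2])  # drop avgpool + fc head
attn = SEBlock(2048)
feats = attn(encoder(torch.randn(1, 3, 224, 224)))
print(feats.shape)  # torch.Size([1, 2048, 7, 7])
```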

Cite:


GB/T 7714 Zhao, Longxuan , Wang, Tao , Chen, Yuanbin et al. A novel framework for segmentation of small targets in medical images [J]. | SCIENTIFIC REPORTS , 2025 , 15 (1) .
MLA Zhao, Longxuan et al. "A novel framework for segmentation of small targets in medical images" . | SCIENTIFIC REPORTS 15 . 1 (2025) .
APA Zhao, Longxuan , Wang, Tao , Chen, Yuanbin , Zhang, Xinlin , Tang, Hui , Lin, Fuxin et al. A novel framework for segmentation of small targets in medical images . | SCIENTIFIC REPORTS , 2025 , 15 (1) .

Version :

A novel framework for segmentation of small targets in medical images Scopus
Journal Article | 2025, 15 (1) | Scientific Reports
Multimodal Cross Global Learnable Attention Network for MR images denoising with arbitrary modal missing SCIE
Journal Article | 2025, 121 | COMPUTERIZED MEDICAL IMAGING AND GRAPHICS

Abstract :

Magnetic Resonance Imaging (MRI) generates medical images of multiple sequences, i.e., multimodal images, from different contrasts. However, noise reduces the quality of MR images and in turn affects the doctor's diagnosis of diseases. Existing filtering methods, transform-domain methods, statistical methods and Convolutional Neural Network (CNN) methods mainly aim to denoise individual sequences of images without considering the relationships between multiple different sequences. They cannot balance the extraction of high-dimensional and low-dimensional features in MR images, and it is hard for them to maintain a good balance between preserving image texture details and denoising strength. To overcome these challenges, this work proposes a controllable Multimodal Cross-Global Learnable Attention Network (MMCGLANet) for MR image denoising with arbitrary modal missing. Specifically, a weight-sharing encoder is employed to extract the shallow features of the image, and Convolutional Long Short-Term Memory (ConvLSTM) is employed to extract the associated features between different frames within the same modality. A Cross Global Learnable Attention Network (CGLANet) is employed to extract and fuse image features both across modalities and within the same modality. In addition, a sequence code is employed to label missing modalities, which allows for arbitrary modal missing during model training, validation, and testing. Experimental results demonstrate that our method achieves good denoising results on different public and real MR image datasets.
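
The "sequence code" idea for arbitrary modal missing can be sketched as follows: each MR modality contributes a binary presence flag, missing inputs are zero-filled, and the flag vector conditions the network. Modality names, shapes, and the zero-fill rule below are assumptions for illustration, not the paper's exact scheme.

```python
# Hypothetical sequence-code encoding for arbitrary modal missing.
import torch

MODALITIES = ["T1", "T2", "FLAIR", "PD"]

def encode_batch(images: dict, shape=(1, 1, 64, 64)):
    """Stack available modalities, zero-fill missing ones, emit a presence code."""
    code = torch.tensor([[float(m in images) for m in MODALITIES]])
    stacked = torch.cat(
        [images.get(m, torch.zeros(shape)) for m in MODALITIES], dim=1
    )
    return stacked, code

# Toy usage: only T1 and FLAIR are available for this subject.
x, code = encode_batch({"T1": torch.randn(1, 1, 64, 64),
                        "FLAIR": torch.randn(1, 1, 64, 64)})
print(x.shape, code)  # torch.Size([1, 4, 64, 64]) tensor([[1., 0., 1., 0.]])
```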

Keyword :

Arbitrary modal missing; Controllable; Cross global attention; Multimodal fusion; Multimodal MR image denoising

Cite:


GB/T 7714 Jiang, Mingfu , Wang, Shuai , Chan, Ka-Hou et al. Multimodal Cross Global Learnable Attention Network for MR images denoising with arbitrary modal missing [J]. | COMPUTERIZED MEDICAL IMAGING AND GRAPHICS , 2025 , 121 .
MLA Jiang, Mingfu et al. "Multimodal Cross Global Learnable Attention Network for MR images denoising with arbitrary modal missing" . | COMPUTERIZED MEDICAL IMAGING AND GRAPHICS 121 (2025) .
APA Jiang, Mingfu , Wang, Shuai , Chan, Ka-Hou , Sun, Yue , Xu, Yi , Zhang, Zhuoneng et al. Multimodal Cross Global Learnable Attention Network for MR images denoising with arbitrary modal missing . | COMPUTERIZED MEDICAL IMAGING AND GRAPHICS , 2025 , 121 .

Version :

Multimodal Cross Global Learnable Attention Network for MR images denoising with arbitrary modal missing Scopus
Journal Article | 2025, 121 | Computerized Medical Imaging and Graphics
Multimodal Cross Global Learnable Attention Network for MR images denoising with arbitrary modal missing EI
Journal Article | 2025, 121 | Computerized Medical Imaging and Graphics
DiffSteISR: Harnessing diffusion prior for superior real-world stereo image super-resolution SCIE
Journal Article | 2025, 623 | NEUROCOMPUTING

Abstract :

Although diffusion prior-based single-image super-resolution has demonstrated remarkable reconstruction capabilities, its potential in the domain of stereo image super-resolution remains underexplored. One significant challenge lies in the inherent stochasticity of diffusion models, which makes it difficult to ensure that the generated left and right images exhibit high semantic and texture consistency. This poses a considerable obstacle to advancing research in this field. Therefore, we introduce DiffSteISR, a pioneering framework for reconstructing real-world stereo images. DiffSteISR utilizes the powerful prior knowledge embedded in a pre-trained text-to-image model to efficiently recover the lost texture details in low-resolution stereo images. Specifically, DiffSteISR implements a time-aware stereo cross attention with temperature adapter (TASCATA) to guide the diffusion process, ensuring that the generated left and right views exhibit high texture consistency, thereby reducing the disparity error between the super-resolved images and the ground truth (GT) images. Additionally, a stereo omni attention control network (SOA ControlNet) is proposed to enhance the consistency of super-resolved images with GT images in pixel, perceptual, and distribution space. Finally, DiffSteISR incorporates a stereo semantic extractor (SSE) to capture unique viewpoint soft semantic information and shared hard tag semantic information, thereby effectively improving the semantic accuracy and consistency of the generated left and right images. Extensive experimental results demonstrate that DiffSteISR accurately reconstructs natural and precise textures from low-resolution stereo images while maintaining high semantic and texture consistency between the left and right views.
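
Below is a minimal sketch of cross-view attention with a learnable temperature, the core mechanism behind the TASCATA module named above. The published module also injects diffusion-timestep information and other details omitted here, so treat this as an assumption-laden toy rather than the authors' design.

```python
# Toy cross-view attention with learnable temperature (assumption, not TASCATA).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StereoCrossAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.temperature = nn.Parameter(torch.ones(1))  # learnable attention sharpness

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        # left, right: (batch, tokens, dim); left-view queries attend to the right view.
        attn = self.q(left) @ self.k(right).transpose(-2, -1)
        attn = F.softmax(attn * self.temperature / left.shape[-1] ** 0.5, dim=-1)
        return left + attn @ self.v(right)   # residual cross-view fusion

l, r = torch.randn(2, 256, 64), torch.randn(2, 256, 64)
print(StereoCrossAttention(64)(l, r).shape)  # torch.Size([2, 256, 64])
```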

Keyword :

ControlNet; Diffusion model; Reconstructing; Stereo image super-resolution; Texture consistency

Cite:


GB/T 7714 Zhou, Yuanbo , Zhang, Xinlin , Deng, Wei et al. DiffSteISR: Harnessing diffusion prior for superior real-world stereo image super-resolution [J]. | NEUROCOMPUTING , 2025 , 623 .
MLA Zhou, Yuanbo et al. "DiffSteISR: Harnessing diffusion prior for superior real-world stereo image super-resolution" . | NEUROCOMPUTING 623 (2025) .
APA Zhou, Yuanbo , Zhang, Xinlin , Deng, Wei , Wang, Tao , Tan, Tao , Gao, Qinquan et al. DiffSteISR: Harnessing diffusion prior for superior real-world stereo image super-resolution . | NEUROCOMPUTING , 2025 , 623 .

Version :

DiffSteISR: Harnessing diffusion prior for superior real-world stereo image super-resolution EI
Journal Article | 2025, 623 | Neurocomputing
DiffSteISR: Harnessing diffusion prior for superior real-world stereo image super-resolution Scopus
Journal Article | 2025, 623 | Neurocomputing
A universal parameter-efficient fine-tuning approach for stereo image super-resolution SCIE
Journal Article | 2025, 151 | ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Abstract :

Despite advances in the use of the pre-train-then-fine-tune strategy in low-level vision tasks, the increasing size of models presents significant challenges for this paradigm, particularly in terms of training time and memory consumption. In addition, unsatisfactory results may occur when pre-trained single-image models are directly applied to a multi-image domain. In this paper, we propose an efficient method for transferring a pre-trained single-image super-resolution transformer network to the domain of stereo image super-resolution (SteISR) using a parameter-efficient fine-tuning approach. Specifically, the concepts of stereo adapters and spatial adapters are introduced and incorporated into the pre-trained single-image super-resolution transformer network. Subsequently, only the inserted adapters are trained on stereo datasets. Compared with the classical full fine-tuning paradigm, our method effectively reduces training time and memory consumption by 57% and 15%, respectively. Moreover, this method allows us to train only 4.8% of the original model parameters, achieving state-of-the-art performance on four commonly used SteISR benchmarks. This technology is expected to improve stereo image resolution in various fields such as medical imaging and autonomous driving, thereby indirectly enhancing the accuracy of depth estimation and object recognition tasks.
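
The adapter-based fine-tuning pattern described above can be sketched generically: freeze a pre-trained block and train only a small bottleneck adapter inserted after it. The bottleneck design and dimensions below are assumptions; the paper's stereo and spatial adapters are more specialized, and the 4.8% figure refers to the authors' full model, not this toy.

```python
# Generic bottleneck-adapter sketch (assumption, not the paper's adapters).
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""
    def __init__(self, dim: int, hidden: int = 16):
        super().__init__()
        self.down, self.up = nn.Linear(dim, hidden), nn.Linear(hidden, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))

backbone = nn.Linear(256, 256)            # stands in for a pre-trained block
for p in backbone.parameters():
    p.requires_grad = False               # frozen during fine-tuning
model = nn.Sequential(backbone, Adapter(256))

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable}/{total} ({100 * trainable / total:.1f}%)")
```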

Keyword :

Autonomous driving; Parameter-efficient fine-tuning; Stereo image super-resolution; Transfer learning

Cite:


GB/T 7714 Zhou, Yuanbo , Xue, Yuyang , Zhang, Xinlin et al. A universal parameter-efficient fine-tuning approach for stereo image super-resolution [J]. | ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE , 2025 , 151 .
MLA Zhou, Yuanbo et al. "A universal parameter-efficient fine-tuning approach for stereo image super-resolution" . | ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE 151 (2025) .
APA Zhou, Yuanbo , Xue, Yuyang , Zhang, Xinlin , Deng, Wei , Wang, Tao , Tan, Tao et al. A universal parameter-efficient fine-tuning approach for stereo image super-resolution . | ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE , 2025 , 151 .

Version :

A universal parameter-efficient fine-tuning approach for stereo image super-resolution EI
Journal Article | 2025, 151 | Engineering Applications of Artificial Intelligence
A universal parameter-efficient fine-tuning approach for stereo image super-resolution Scopus
Journal Article | 2025, 151 | Engineering Applications of Artificial Intelligence
Parameter-efficient fine-tuning for single image snow removal SCIE
Journal Article | 2025, 265 | EXPERT SYSTEMS WITH APPLICATIONS

Abstract :

The degradation types of snow are complex and diverse. Existing methods employ sophisticated model architectures to model sufficient visual representations for snow removal. To remove snow more efficiently, inspired by the powerful visual representations of pre-trained large models and the parameter-efficient fine-tuning paradigm in the field of natural language processing, we have pioneered the exploration of applying parameter-efficient fine-tuning in low-level vision. Taking the desnowing task as the starting point, we introduce TuneSnow, a parameter-efficient fine-tuning framework that can be integrated with a desnowing network to improve desnowing performance. Initially, we introduce Hybrid Adapters for the efficient fine-tuning of pre-trained vision models. We then propose a Progressive Multi-Scale Perception (PMSP) module to harness the feature representation potential of pre-trained models. Finally, we present a Degraded Area Restoration (DAR) module based on a Multi-Scale Fusion Refinement (MSFR) module to recover details after desnowing. Extensive experiments demonstrate that our approach trains only 15% of the parameters and delivers state-of-the-art performance on multiple publicly available datasets. TuneSnow can serve as a plug-and-play component to enhance the performance of other U-shaped image restoration models, including derain, dehaze, deblur, and more. The code and datasets in this study are available at https://github.com/dxw2000/PEFT-TuneSnow.
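
The training setup implied by this kind of fine-tuning can be sketched in a few lines: only the inserted adapter parameters are marked trainable and handed to the optimizer, while the pre-trained weights stay frozen. The module names below are hypothetical stand-ins, not TuneSnow's actual structure.

```python
# Hypothetical freeze-and-filter training setup for adapter fine-tuning.
import torch.nn as nn
from torch.optim import AdamW

model = nn.ModuleDict({
    "backbone": nn.Conv2d(3, 64, 3, padding=1),   # stand-in pre-trained part
    "adapter": nn.Conv2d(64, 64, 1),              # stand-in tunable part
})
for name, p in model.named_parameters():
    p.requires_grad = "adapter" in name           # tune adapters only

optimizer = AdamW((p for p in model.parameters() if p.requires_grad), lr=1e-4)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable, "trainable parameters")
```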

Keyword :

Parameter-efficient fine-tuning; Pre-trained model; Segment anything model; Snow removal

Cite:


GB/T 7714 Dai, Xinwei , Zhou, Yuanbo , Qiu, Xintao et al. Parameter-efficient fine-tuning for single image snow removal [J]. | EXPERT SYSTEMS WITH APPLICATIONS , 2025 , 265 .
MLA Dai, Xinwei et al. "Parameter-efficient fine-tuning for single image snow removal" . | EXPERT SYSTEMS WITH APPLICATIONS 265 (2025) .
APA Dai, Xinwei , Zhou, Yuanbo , Qiu, Xintao , Tang, Hui , Tong, Tong . Parameter-efficient fine-tuning for single image snow removal . | EXPERT SYSTEMS WITH APPLICATIONS , 2025 , 265 .

Version :

Parameter-efficient fine-tuning for single image snow removal Scopus
Journal Article | 2025, 265 | Expert Systems with Applications
Parameter-efficient fine-tuning for single image snow removal EI
Journal Article | 2025, 265 | Expert Systems with Applications
MHAVSR: A multi-layer hybrid alignment network for video super-resolution SCIE
Journal Article | 2025, 624 | NEUROCOMPUTING

Abstract :

Video super-resolution (VSR) aims to restore high-resolution (HR) frames from low-resolution (LR) frames; the key to this task is to fully utilize the complementary information between frames to reconstruct high-resolution sequences. Current works tackle this by exploiting either a sliding-window strategy or a recurrent architecture for single alignment, which either lacks long-range modeling ability or is prone to frame-by-frame error accumulation. In this paper, we propose a Multi-layer Hybrid Alignment network for VSR (MHAVSR), which combines a sliding window with a recurrent structure and extends the number of propagation layers on top of this hybrid structure. At each propagation layer, alignment operations are performed simultaneously on bidirectional neighboring frames and on hidden states from recursive propagation, which improves the alignment while fully utilizing both the short-term and long-term information in the video sequence. Next, we present a flow-enhanced dual-deformable alignment module, which improves the accuracy of deformable convolution offsets with optical flow and fuses the separate alignment results of the hybrid alignment to reduce the artifacts caused by alignment errors. In addition, we introduce a spatial-temporal reconstruction module to complement the representation capacity of the model at different scales. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches. In particular, on the Vid4 test set, our model exceeds IconVSR by 0.82 dB in terms of PSNR with a similar number of parameters. Codes are available at https://github.com/fzuqxt/MHAVSR.
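
The hybrid propagation idea can be illustrated with a toy bidirectional recurrent pass over a frame sequence; the flow-enhanced deformable alignment and multi-layer stacking of MHAVSR are omitted, so this is only a structural sketch under assumed shapes.

```python
# Toy bidirectional recurrent propagation over frames (structural sketch only).
import torch
import torch.nn as nn

class BiPropagation(nn.Module):
    def __init__(self, ch: int = 16):
        super().__init__()
        self.fwd = nn.Conv2d(2 * ch, ch, 3, padding=1)
        self.bwd = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, ch, h, w)
        b, t, c, h, w = frames.shape
        state = frames.new_zeros(b, c, h, w)
        fwd_states = []
        for i in range(t):                       # forward pass over time
            state = torch.relu(self.fwd(torch.cat([frames[:, i], state], dim=1)))
            fwd_states.append(state)
        state = frames.new_zeros(b, c, h, w)
        out = []
        for i in reversed(range(t)):             # backward pass refines forward states
            state = torch.relu(self.bwd(torch.cat([fwd_states[i], state], dim=1)))
            out.insert(0, state)
        return torch.stack(out, dim=1)

print(BiPropagation()(torch.randn(1, 5, 16, 32, 32)).shape)  # (1, 5, 16, 32, 32)
```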

Keyword :

Deformable convolution; Hybrid propagation; Long-short term information; Multi-layer alignment; Video super-resolution

Cite:


GB/T 7714 Qiu, Xintao , Zhou, Yuanbo , Zhang, Xinlin et al. MHAVSR: A multi-layer hybrid alignment network for video super-resolution [J]. | NEUROCOMPUTING , 2025 , 624 .
MLA Qiu, Xintao et al. "MHAVSR: A multi-layer hybrid alignment network for video super-resolution" . | NEUROCOMPUTING 624 (2025) .
APA Qiu, Xintao , Zhou, Yuanbo , Zhang, Xinlin , Xue, Yuyang , Lin, Xiaoyong , Dai, Xinwei et al. MHAVSR: A multi-layer hybrid alignment network for video super-resolution . | NEUROCOMPUTING , 2025 , 624 .
HSINet: A Hybrid Semantic Integration Network for Medical Image Segmentation EI
Conference Paper | 2025, 2302 CCIS, 339-353 | 19th Chinese Conference on Image and Graphics Technologies and Applications, IGTA 2024

Abstract :

Medical image segmentation is crucial in medical image analysis. In recent years, deep learning, particularly convolutional neural networks (CNNs) and Transformer models, has significantly advanced this field. To fully leverage the abilities of CNNs and Transformers in extracting local and global information, we propose HSINet, which employs a Swin Transformer and the newly introduced Deep Dense Feature Extraction (DFE) block to construct dual encoders. A Swin Transformer and DFE Encoded Feature Fusion (TDEF) module is designed to merge features from the two branches, and a Multi-Scale Semantic Fusion (MSSF) module further promotes the full utilization of low-level and high-level features from the encoders. We evaluated the proposed network on a private familial cerebral cavernous malformation dataset (SG-FCCM) and the ISIC-2017 challenge dataset. The experimental results indicate that the proposed HSINet outperforms several other advanced segmentation methods, demonstrating its superiority in medical image segmentation.
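
A minimal sketch of dual-branch fusion in the spirit of the TDEF module named above: features from a Transformer branch and a CNN branch at matching resolution are concatenated and re-projected. The module design is an assumption, not the published implementation.

```python
# Hypothetical dual-branch (Transformer + CNN) feature fusion sketch.
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    def __init__(self, ch: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 1),     # merge concatenated branches
            nn.BatchNorm2d(ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, trans_feat: torch.Tensor, cnn_feat: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([trans_feat, cnn_feat], dim=1))

t_feat = torch.randn(1, 96, 56, 56)   # e.g., a Swin stage output
c_feat = torch.randn(1, 96, 56, 56)   # matching CNN (DFE-like) features
print(DualBranchFusion(96)(t_feat, c_feat).shape)  # torch.Size([1, 96, 56, 56])
```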

Keyword :

Convolutional neural networks; Deep neural networks; Semantic segmentation

Cite:


GB/T 7714 Zong, Ruige , Wang, Tao , Zhang, Xinlin et al. HSINet: A Hybrid Semantic Integration Network for Medical Image Segmentation [C] . 2025 : 339-353 .
MLA Zong, Ruige et al. "HSINet: A Hybrid Semantic Integration Network for Medical Image Segmentation" . (2025) : 339-353 .
APA Zong, Ruige , Wang, Tao , Zhang, Xinlin , Gao, Qinquan , Kang, Dezhi , Lin, Fuxin et al. HSINet: A Hybrid Semantic Integration Network for Medical Image Segmentation . (2025) : 339-353 .

Version :

HSINet: A Hybrid Semantic Integration Network for Medical Image Segmentation Scopus
Other | 2025, 2302 CCIS, 339-353 | Communications in Computer and Information Science
Contrastive Learning via Randomly Generated Deep Supervision EI
Conference Paper | 2025 | 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Abstract :

Unsupervised visual representation learning has gained significant attention in the computer vision community, driven by recent advancements in contrastive learning. Most existing contrastive learning frameworks rely on instance discrimination as a pretext task, treating each instance as a distinct category. However, this often leads to intra-class collision in a large latent space, compromising the quality of learned representations. To address this issue, we propose a novel contrastive learning method that utilizes randomly generated supervision signals. Our framework incorporates two projection heads: one handles conventional classification tasks, while the other employs a random algorithm to generate fixed-length vectors representing different classes. The second head executes a supervised contrastive learning task based on these vectors, effectively clustering instances of the same class and increasing the separation between different classes. Our method, Contrastive Learning via Randomly Generated Supervision (CLRGS), significantly improves the quality of feature representations across various datasets and achieves state-of-the-art performance in contrastive learning tasks.
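
The random-supervision idea can be sketched as follows: each class is assigned a fixed, randomly generated target vector, and a second projection head is trained to pull normalized sample embeddings toward their class vector. The cosine-similarity loss and dimensions below are assumptions for illustration, not necessarily the CLRGS objective.

```python
# Illustrative sketch of training against fixed random class vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, dim = 10, 128
torch.manual_seed(0)
class_vectors = F.normalize(torch.randn(num_classes, dim), dim=1)  # fixed random targets

head = nn.Linear(512, dim)                        # second projection head
features = torch.randn(32, 512)                   # backbone features (toy batch)
labels = torch.randint(0, num_classes, (32,))

z = F.normalize(head(features), dim=1)
# Pull each embedding toward its class's random target vector (cosine loss).
loss = (1 - (z * class_vectors[labels]).sum(dim=1)).mean()
loss.backward()
print(f"loss: {loss.item():.3f}")
```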

Cite:


GB/T 7714 Wang, Shibo , Ma, Zili , Chan, Ka-Hou et al. Contrastive Learning via Randomly Generated Deep Supervision [C] . 2025 .
MLA Wang, Shibo et al. "Contrastive Learning via Randomly Generated Deep Supervision" . (2025) .
APA Wang, Shibo , Ma, Zili , Chan, Ka-Hou , Liu, Yue , Tong, Tong , Gao, Qinquan et al. Contrastive Learning via Randomly Generated Deep Supervision . (2025) .
