Query:
Scholar Name: Zhao Tiesong (赵铁松)
Abstract :
Sonar technology has been widely used in underwater surface mapping and remote object detection owing to its light-independent characteristics. Recently, the boom in artificial intelligence has further spurred sonar image (SI) processing and understanding techniques. However, intricate marine environments and diverse nonlinear postprocessing operations may degrade the quality of SIs, impeding accurate interpretation of underwater information. Efficient image quality assessment (IQA) methods are therefore crucial for quality monitoring in sonar imaging and processing. Existing IQA methods either overlook the unique characteristics of SIs or focus solely on typical distortions in specific scenarios, which limits their generalization capability. In this article, we propose a unified sonar IQA method that overcomes the challenges posed by diverse distortions. Although degradation conditions vary, ideal SIs consistently share certain properties: they must be task-centered and exhibit attribute consistency. We derive a comprehensive set of quality attributes from both the task background and the visual content of SIs. These attribute features are represented in just ten dimensions and ultimately mapped to the quality score. To validate the effectiveness of our method, we construct the first comprehensive SI dataset. Experimental results demonstrate the superior performance and robustness of the proposed method.
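As an illustration of the final step described above (ten attribute dimensions regressed to a single quality score), here is a minimal PyTorch sketch; the attribute extraction, layer sizes, and training details are assumptions rather than the authors' implementation.

```python
# Minimal sketch (not the authors' code): a 10-dimensional attribute vector
# regressed to a scalar quality score, mirroring the "ten attribute features
# -> quality score" mapping. Attribute extraction is a placeholder.
import torch
import torch.nn as nn

class AttributeQualityRegressor(nn.Module):
    def __init__(self, num_attributes: int = 10):
        super().__init__()
        # Small MLP mapping attribute features to a scalar quality score.
        self.mlp = nn.Sequential(
            nn.Linear(num_attributes, 32),
            nn.ReLU(inplace=True),
            nn.Linear(32, 1),
        )

    def forward(self, attributes: torch.Tensor) -> torch.Tensor:
        # attributes: (batch, 10) hypothetical task- and content-derived features
        return self.mlp(attributes).squeeze(-1)

if __name__ == "__main__":
    model = AttributeQualityRegressor()
    fake_attributes = torch.rand(4, 10)      # placeholder attribute features
    print(model(fake_attributes).shape)      # torch.Size([4])
```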
Keyword :
Attribute consistency; Degradation; Distortion; Image quality; image quality assessment (IQA); Imaging; Noise; Nonlinear distortion; no-reference (NR); Quality assessment; Silicon; Sonar; sonar imaging and processing; Sonar measurements
Cite:
GB/T 7714 | Cai, Boqin , Chen, Weiling , Zhang, Jianghe et al. Unified No-Reference Quality Assessment for Sonar Imaging and Processing [J]. | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING , 2025 , 63 . |
MLA | Cai, Boqin et al. "Unified No-Reference Quality Assessment for Sonar Imaging and Processing" . | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 63 (2025) . |
APA | Cai, Boqin , Chen, Weiling , Zhang, Jianghe , Junejo, Naveed Ur Rehman , Zhao, Tiesong . Unified No-Reference Quality Assessment for Sonar Imaging and Processing . | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING , 2025 , 63 . |
Abstract :
Anomaly detection can significantly aid doctors in interpreting chest X-rays. A common strategy is to use a pre-trained network to extract features from normal data and establish feature representations. However, when a pre-trained network is applied to more detailed X-rays, differences of similarity can limit the robustness of these feature representations. Therefore, we propose an intra- and inter-correlation learning framework for chest X-ray anomaly detection. First, to better leverage the similar anatomical structure information in chest X-rays, we introduce the Anatomical-Feature Pyramid Fusion Module for feature fusion. This module aims to obtain fusion features with both local details and global contextual information. These fusion features are initialized by a trainable feature mapper and stored in a feature bank to serve as centers for learning. Furthermore, to address the differences of similarity (FDS) introduced by the pre-trained network, we propose an intra- and inter-correlation learning strategy: 1) we use intra-correlation learning to establish intra-correlation between mapped features of individual images and semantic centers, thereby initially discovering lesions; 2) we employ inter-correlation learning to establish inter-correlation between mapped features of different images, further mitigating the differences of similarity introduced by the pre-trained network and achieving effective detection results even in diverse chest disease environments. Finally, a comparison with 18 state-of-the-art methods on three datasets demonstrates the superiority and effectiveness of the proposed method across various scenarios.
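The intra-correlation idea (comparing mapped features of one image against semantic centers stored in a feature bank) can be sketched as a nearest-center dissimilarity score; the code below is only illustrative and does not reproduce the released framework.

```python
# Illustrative sketch only: score an image by the distance between its mapped
# features and a bank of normal "semantic centers" (intra-correlation in spirit).
import torch
import torch.nn.functional as F

def intra_correlation_score(features: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
    """features: (N, D) mapped patch features of one image.
    centers:  (K, D) feature-bank centers built from normal training data.
    Returns a scalar anomaly score: high when patches are far from all centers."""
    features = F.normalize(features, dim=-1)
    centers = F.normalize(centers, dim=-1)
    sim = features @ centers.t()              # (N, K) cosine similarities
    nearest = sim.max(dim=-1).values          # best-matching center per patch
    return (1.0 - nearest).mean()             # mean dissimilarity as anomaly score

if __name__ == "__main__":
    feats = torch.randn(196, 256)             # hypothetical 14x14 patch features
    bank = torch.randn(50, 256)               # hypothetical feature-bank centers
    print(float(intra_correlation_score(feats, bank)))
```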
Keyword :
chest X-ray; correlation learning; feature fusion; Medical anomaly detection; transfer learning
Cite:
GB/T 7714 | Xu, Shicheng , Li, Wei , Li, Zuoyong et al. Facing Differences of Similarity: Intra- and Inter-Correlation Unsupervised Learning for Chest X-Ray Anomaly Detection [J]. | IEEE TRANSACTIONS ON MEDICAL IMAGING , 2025 , 44 (2) : 801-814 . |
MLA | Xu, Shicheng et al. "Facing Differences of Similarity: Intra- and Inter-Correlation Unsupervised Learning for Chest X-Ray Anomaly Detection" . | IEEE TRANSACTIONS ON MEDICAL IMAGING 44 . 2 (2025) : 801-814 . |
APA | Xu, Shicheng , Li, Wei , Li, Zuoyong , Zhao, Tiesong , Zhang, Bob . Facing Differences of Similarity: Intra- and Inter-Correlation Unsupervised Learning for Chest X-Ray Anomaly Detection . | IEEE TRANSACTIONS ON MEDICAL IMAGING , 2025 , 44 (2) , 801-814 . |
Abstract :
Due to the complex underwater imaging environment, existing Underwater Image Enhancement (UIE) techniques are unable to handle the increasing demand for high-quality underwater content in broadcasting systems. Thus, a robust quality assessment method is highly desirable to effectively compare the quality of different enhanced underwater images. To this end, we propose a novel quality assessment method for enhanced underwater images that utilizes multiple levels of features at various stages of the network's depth. We first select underwater images with different distortions to analyze the characteristics of different UIE results at various feature levels. We find that low-level features are more sensitive to color information, while mid-level features are more indicative of structural differences. Based on this, a Channel-Spatial-Pixel Attention Module (CSPAM) is designed for low-level perception to capture color characteristics across channel, spatial, and pixel dimensions. To capture structural variations, a Parallel Structural Perception Module (PSPM) with convolutional kernels of different scales is introduced for mid-level perception. For high-level perception, due to the accumulation of noise, an Adaptive Weighted Downsampling (AWD) layer is employed to restore the semantic information. Furthermore, a new top-down multi-level feature fusion method is designed, in which information from different levels is integrated through a Selective Feature Fusion (SFF) mechanism, producing semantically rich features and enhancing the model's feature representation capability. Experimental results demonstrate the superior performance of the proposed method over competing image quality evaluation methods.
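A minimal sketch of a channel-plus-spatial attention block in the spirit of the described CSPAM is shown below; the paper's module additionally operates on the pixel dimension, and the layer sizes here are assumed for illustration only.

```python
# Hedged sketch: a simple channel + spatial attention block; the exact CSPAM
# design in the paper may differ.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Channel attention via global average pooling + 1x1 convolutions.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention computed from the channel-wise mean map.
        self.spatial = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel(x)                            # channel reweighting
        x = x * self.spatial(x.mean(dim=1, keepdim=True))  # spatial reweighting
        return x

if __name__ == "__main__":
    block = ChannelSpatialAttention(64)
    print(block(torch.randn(1, 64, 56, 56)).shape)
```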
Keyword :
image quality assessment; multi-level perception; Underwater image enhancement
Cite:
GB/T 7714 | Xu, Yiwen , Lin, Yuxiang , He, Nian et al. Multi-Level Perception Assessment for Underwater Image Enhancement [J]. | IEEE TRANSACTIONS ON BROADCASTING , 2025 . |
MLA | Xu, Yiwen et al. "Multi-Level Perception Assessment for Underwater Image Enhancement" . | IEEE TRANSACTIONS ON BROADCASTING (2025) . |
APA | Xu, Yiwen , Lin, Yuxiang , He, Nian , Wang, Xuejin , Zhao, Tiesong . Multi-Level Perception Assessment for Underwater Image Enhancement . | IEEE TRANSACTIONS ON BROADCASTING , 2025 . |
Abstract :
The introduction of multiple viewpoints in video scenes inevitably increases the bitrates required for storage and transmission. To reduce bitrates, researchers have developed methods that skip intermediate viewpoints during compression and delivery and ultimately reconstruct them using Side Information (SInfo). Typically, depth maps are used to construct SInfo. However, these methods suffer from reconstruction inaccuracies and inherently high bitrates. In this paper, we propose a novel multi-view video coding method that leverages the image generation capability of a Generative Adversarial Network (GAN) to improve the reconstruction accuracy of SInfo. Additionally, we incorporate information from adjacent temporal and spatial viewpoints to further reduce SInfo redundancy. At the encoder, we construct a spatio-temporal Epipolar Plane Image (EPI) and use a convolutional network to extract the latent code of a GAN as SInfo. At the decoder, we combine the SInfo and adjacent viewpoints to reconstruct intermediate views using the GAN generator. Specifically, we establish a joint encoder constraint on reconstruction cost and SInfo entropy to achieve an optimal trade-off between reconstruction quality and bitrate overhead. Experiments demonstrate a significant improvement in Rate-Distortion (RD) performance compared to state-of-the-art methods.
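The joint constraint on reconstruction cost and SInfo entropy is essentially a rate-distortion objective; a hedged sketch with placeholder distortion and rate terms is given below, without the actual GAN or entropy model.

```python
# Minimal sketch of a joint rate-distortion objective (reconstruction cost plus
# side-information entropy) with placeholder terms; not the paper's exact loss.
import torch
import torch.nn.functional as F

def rd_loss(reconstructed: torch.Tensor,
            reference: torch.Tensor,
            latent_likelihoods: torch.Tensor,
            lam: float = 0.01) -> torch.Tensor:
    """Distortion + lambda * estimated bits of the SInfo latent code."""
    distortion = F.mse_loss(reconstructed, reference)
    # Estimated rate in bits: -log2 p(latent), averaged over the batch.
    rate = (-torch.log2(latent_likelihoods.clamp_min(1e-9))).sum() / reference.shape[0]
    return distortion + lam * rate

if __name__ == "__main__":
    rec, ref = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
    probs = torch.rand(2, 128).clamp(0.01, 1.0)   # placeholder latent likelihoods
    print(float(rd_loss(rec, ref, probs)))
```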
Keyword :
Epipolar plane image; Generative adversarial network; Latent code learning; Multi-view video coding
Cite:
GB/T 7714 | Lan, Chengdong , Yan, Hao , Luo, Cheng et al. GAN-based multi-view video coding with spatio-temporal EPI reconstruction [J]. | SIGNAL PROCESSING-IMAGE COMMUNICATION , 2025 , 132 . |
MLA | Lan, Chengdong et al. "GAN-based multi-view video coding with spatio-temporal EPI reconstruction" . | SIGNAL PROCESSING-IMAGE COMMUNICATION 132 (2025) . |
APA | Lan, Chengdong , Yan, Hao , Luo, Cheng , Zhao, Tiesong . GAN-based multi-view video coding with spatio-temporal EPI reconstruction . | SIGNAL PROCESSING-IMAGE COMMUNICATION , 2025 , 132 . |
Abstract :
Natural video capture suffers from visual blurriness due to high motion of cameras or objects. The video blurriness removal task has been extensively explored for both human vision and machine processing; however, its computational cost remains a critical issue that has not yet been fully addressed. In this paper, we propose a novel Lightweight Video Deblurring (LightViD) method that achieves top-tier performance with an extremely low parameter count. The proposed LightViD consists of a blur detector and a deblurring network. In particular, the blur detector effectively separates blurry regions, thus avoiding both unnecessary computation and over-enhancement in non-blurry regions. The deblurring network is designed as a lightweight model. It employs a Spatial Feature Fusion Block (SFFB) to extract hierarchical spatial features, which are further fused by ConvLSTM for effective spatial-temporal feature representation. Comprehensive experiments with quantitative and qualitative comparisons demonstrate the effectiveness of our LightViD method, which achieves competitive performance on the GoPro and DVD datasets with a reduced computational cost of 1.63M parameters and 96.8 GMACs. Trained model available: https://github.com/wgp/LightVid.
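A toy sketch of the detect-then-deblur idea follows: a predicted blur mask gates a lightweight enhancement branch so sharp regions pass through unchanged. The module sizes and mask network are illustrative assumptions, not the LightViD architecture.

```python
# Toy sketch: mask-gated enhancement so non-blurry regions are left untouched.
import torch
import torch.nn as nn

class MaskGatedDeblur(nn.Module):
    def __init__(self, channels: int = 16):
        super().__init__()
        self.blur_detector = nn.Sequential(        # predicts a per-pixel blur mask
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1), nn.Sigmoid(),
        )
        self.deblur = nn.Sequential(               # lightweight enhancement branch
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        mask = self.blur_detector(frame)           # ~1 where blurry, ~0 where sharp
        residual = self.deblur(frame)
        return frame + mask * residual             # enhance only blurry regions

if __name__ == "__main__":
    net = MaskGatedDeblur()
    print(net(torch.rand(1, 3, 128, 128)).shape)
```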
Keyword :
Blur Detection; Computational efficiency; Computational modeling; Detectors; Feature extraction; Image restoration; Kernel; Spatial-Temporal Feature Fusion; Task analysis; Video Deblurring
Cite:
GB/T 7714 | Lin, L. , Wei, G. , Liu, K. et al. LightViD: Efficient Video Deblurring with Spatial-Temporal Feature Fusion [J]. | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (8) : 1-1 . |
MLA | Lin, L. et al. "LightViD: Efficient Video Deblurring with Spatial-Temporal Feature Fusion" . | IEEE Transactions on Circuits and Systems for Video Technology 34 . 8 (2024) : 1-1 . |
APA | Lin, L. , Wei, G. , Liu, K. , Feng, W. , Zhao, T. . LightViD: Efficient Video Deblurring with Spatial-Temporal Feature Fusion . | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (8) , 1-1 . |
Abstract :
In large-scale surveillance of urban or rural areas, an effective placement of cameras is critical in maximizing surveillance coverage or minimizing the economic cost of cameras. Existing Surveillance Camera Placement (SCP) methods generally focus on physical coverage of surveillance by implicitly assuming a uniform distribution of interested targets or objects across all blocks, which is, however, uncommon in real-world scenarios. In this paper, we are the first to propose a target-aware SCP (tSCP) model, which prioritizes optimization based on uneven target densities, allowing cameras to preferentially cover blocks with more interested targets. First, we define target density as the likelihood of interested targets occurring in a block, which is positively correlated with the importance of the block. Second, we combine aerial imagery with a lightweight object detection network to identify target density. Third, we formulate tSCP as an optimization problem to maximize target coverage in the surveillance area, and solve this problem with a target-guided genetic algorithm. Our method optimizes the rational and economical utilization of cameras in large-scale video surveillance. Compared with state-of-the-art methods, our tSCP achieves the highest target coverage with a fixed number of cameras (8.31%-14.81% more than its peers), or utilizes the minimum number of cameras to achieve a preset target coverage. Codes are available at https://github.com/wu-hongxin/tSCP_main.
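To illustrate the target-coverage objective only, the snippet below greedily selects camera sites by density-weighted marginal coverage; the paper itself solves this with a target-guided genetic algorithm, which is not reproduced here, and all inputs are hypothetical.

```python
# Greedy stand-in for the coverage objective: pick sites that add the most
# density-weighted coverage. The paper's solver is a genetic algorithm.
def greedy_placement(coverage_sets, block_density, num_cameras):
    """coverage_sets: list of sets, blocks covered by each candidate camera site.
    block_density:  dict block -> target density (importance).
    num_cameras:    budget of cameras to place."""
    covered, chosen = set(), []
    for _ in range(num_cameras):
        gains = [sum(block_density[b] for b in s - covered) for s in coverage_sets]
        best = max(range(len(coverage_sets)), key=lambda i: gains[i])
        if gains[best] <= 0:
            break                                  # nothing useful left to cover
        chosen.append(best)
        covered |= coverage_sets[best]
    return chosen, covered

if __name__ == "__main__":
    sets = [{0, 1}, {1, 2, 3}, {3, 4}]             # toy candidate camera coverages
    density = {0: 5.0, 1: 1.0, 2: 2.0, 3: 4.0, 4: 0.5}
    print(greedy_placement(sets, density, num_cameras=2))
```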
Keyword :
Internet of Things (IoT); large-scale video surveillance; smart city; Surveillance Camera Placement (SCP)
Cite:
GB/T 7714 | Wu, H. , Zeng, Q. , Guo, C. et al. Target-Aware Camera Placement for Large-Scale Video Surveillance [J]. | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (12) : 1-1 . |
MLA | Wu, H. et al. "Target-Aware Camera Placement for Large-Scale Video Surveillance" . | IEEE Transactions on Circuits and Systems for Video Technology 34 . 12 (2024) : 1-1 . |
APA | Wu, H. , Zeng, Q. , Guo, C. , Zhao, T. , Chen, C.W. . Target-Aware Camera Placement for Large-Scale Video Surveillance . | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (12) , 1-1 . |
Abstract :
Super-Resolution (SR) algorithms aim to enhance the resolution of images. Massive deep-learning-based SR techniques have emerged in recent years. In such cases, a visually appealing output may contain additional details compared with its reference image. Accordingly, fully referenced Image Quality Assessment (IQA) cannot work well; however, reference information remains essential for evaluating the quality of SR images. This poses a challenge to SR-IQA: how to balance the referenced and no-reference scores for user perception? In this paper, we propose a Perception-driven Similarity-Clarity Tradeoff (PSCT) model for SR-IQA. Specifically, we investigate this problem from both referenced and no-reference perspectives and design two deep-learning-based modules to obtain referenced and no-reference scores. We present a theoretical analysis of their tradeoff based on Human Visual System (HVS) properties and also calculate adaptive weights for them. Experimental results indicate that our PSCT model is superior to state-of-the-art methods on SR-IQA. In addition, the proposed PSCT model is also capable of evaluating quality scores in other image enhancement scenarios, such as deraining, dehazing, and underwater image enhancement. The source code is available at https://github.com/kekezhang112/PSCT.
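The tradeoff between referenced and no-reference scores with adaptive weights can be sketched as a learned convex blend; the weight network below is a placeholder, not the HVS-derived weighting described in the paper.

```python
# Sketch of an adaptive similarity-clarity blend: w * referenced + (1-w) * no-reference.
import torch
import torch.nn as nn

class ScoreFusion(nn.Module):
    def __init__(self, feat_dim: int = 8):
        super().__init__()
        # Predicts the tradeoff weight from image-level features (hypothetical).
        self.weight_net = nn.Sequential(
            nn.Linear(feat_dim, 16), nn.ReLU(inplace=True),
            nn.Linear(16, 1), nn.Sigmoid(),
        )

    def forward(self, ref_score, nr_score, feats):
        w = self.weight_net(feats).squeeze(-1)     # adaptive weight in (0, 1)
        return w * ref_score + (1.0 - w) * nr_score

if __name__ == "__main__":
    fusion = ScoreFusion()
    ref, nr = torch.tensor([0.8]), torch.tensor([0.6])
    print(fusion(ref, nr, torch.rand(1, 8)))
```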
Keyword :
Deep learning; Demulsification; Feature extraction; Image enhancement; Image quality; Job analysis; Optical resolving power; Quality control
Cite:
GB/T 7714 | Zhang, Keke , Zhao, Tiesong , Chen, Weiling et al. Perception-Driven Similarity-Clarity Tradeoff for Image Super-Resolution Quality Assessment [J]. | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (7) : 5897-5907 . |
MLA | Zhang, Keke et al. "Perception-Driven Similarity-Clarity Tradeoff for Image Super-Resolution Quality Assessment" . | IEEE Transactions on Circuits and Systems for Video Technology 34 . 7 (2024) : 5897-5907 . |
APA | Zhang, Keke , Zhao, Tiesong , Chen, Weiling , Niu, Yuzhen , Hu, Jinsong , Lin, Weisi . Perception-Driven Similarity-Clarity Tradeoff for Image Super-Resolution Quality Assessment . | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (7) , 5897-5907 . |
Abstract :
Low-light image enhancement is a challenging task due to the limited visibility in dark environments. While recent advances have shown progress in integrating CNNs and Transformers, inadequate local-global perceptual interaction still impedes their application in complex degradation scenarios. To tackle this issue, we propose BiFormer, a lightweight framework that facilitates local-global collaborative perception via bilateral interaction. Specifically, our framework introduces a core CNN-Transformer collaborative perception block (CPB) that combines local-aware convolutional attention (LCA) and a global-aware recursive transformer (GRT) to simultaneously preserve local details and ensure global consistency. To promote perceptual interaction, we adopt a bilateral interaction strategy for both local and global perception, which involves local-to-global second-order interaction (SoI) in the dual domain, as well as a mixed-channel fusion (MCF) module for global-to-local interaction. The MCF is also a highly efficient feature fusion module tailored for degraded features. Extensive experiments conducted on low-level and high-level tasks demonstrate that BiFormer achieves state-of-the-art performance. Furthermore, it exhibits a significant reduction in model parameters and computational cost compared to existing Transformer-based low-light image enhancement methods.
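A rough sketch of bilateral (local-to-global and global-to-local) interaction between a CNN branch and a Transformer branch, followed by channel-mixing fusion, is given below; layer choices are assumptions and do not reproduce BiFormer's CPB, SoI, or MCF designs.

```python
# Rough sketch: each branch gates the other, then features are channel-mixed.
import torch
import torch.nn as nn

class BilateralFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.local_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.global_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.mix = nn.Conv2d(2 * channels, channels, 1)    # channel-mixing fusion

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        l2g = global_feat * self.local_gate(local_feat)    # local-to-global interaction
        g2l = local_feat * self.global_gate(global_feat)   # global-to-local interaction
        return self.mix(torch.cat([l2g, g2l], dim=1))

if __name__ == "__main__":
    fuse = BilateralFusion(32)
    x = torch.randn(1, 32, 64, 64)
    print(fuse(x, x).shape)
```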
Keyword :
bilateral interaction; Collaboration; Convolutional neural networks; hybrid CNN-Transformer; Image enhancement; Lighting; Low-light image enhancement; mixed-channel fusion; Task analysis; Transformers; Visualization
Cite:
GB/T 7714 | Xu, R. , Li, Y. , Niu, Y. et al. Bilateral Interaction for Local-Global Collaborative Perception in Low-Light Image Enhancement [J]. | IEEE Transactions on Multimedia , 2024 , 26 : 1-13 . |
MLA | Xu, R. et al. "Bilateral Interaction for Local-Global Collaborative Perception in Low-Light Image Enhancement" . | IEEE Transactions on Multimedia 26 (2024) : 1-13 . |
APA | Xu, R. , Li, Y. , Niu, Y. , Xu, H. , Chen, Y. , Zhao, T. . Bilateral Interaction for Local-Global Collaborative Perception in Low-Light Image Enhancement . | IEEE Transactions on Multimedia , 2024 , 26 , 1-13 . |
Abstract :
Making full use of spatial-temporal information is the key to removing compressed video artifacts. Recently, many deep-learning-based compression artifact reduction methods have emerged. Among them, a series of methods based on deformable convolution have shown excellent capabilities in spatio-temporal feature extraction. However, local deformable offset prediction and unidirectional pixel-wise inter-frame feature alignment limit the full utilization of temporal features in existing methods. Additionally, compressed video shows inconsistent degrees of distortion on different frequency components, and their restoration difficulty is also nonuniform. To address these problems, we propose an enlarged motion-aware and frequency-aware network (EMAFA) to further extract spatio-temporal information and enhance information of different frequency components. To perceive different degrees of motion artifacts between compressed frames as accurately as possible, we design a bidirectional dense propagation pattern with a pixel-wise and patch-wise deformable convolution (PIPA) module in the feature domain. In addition, we propose a multi-scale atrous deformable alignment (MSADA) module to enrich spatio-temporal features in the image domain. Moreover, we design a multi-direction frequency enhancement (MDFE) module with multiple direction convolutions to enhance the features of different frequency components. Experimental results show that the proposed method performs better than state-of-the-art methods in both objective evaluation and visual perception experience. Supplementary experiments on Internet streamed video with hybrid distortions demonstrate that our method also exhibits considerable generalizability for quality enhancement.
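The multi-direction convolution idea behind the MDFE module can be sketched with parallel horizontal, vertical, and square kernels whose responses are summed; this is a conceptual illustration under assumed kernel sizes, not the paper's module.

```python
# Loose sketch of multi-direction convolution: direction-specific kernels summed.
import torch
import torch.nn as nn

class MultiDirectionConv(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.horizontal = nn.Conv2d(channels, channels, (1, 5), padding=(0, 2))
        self.vertical = nn.Conv2d(channels, channels, (5, 1), padding=(2, 0))
        self.square = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum of direction-specific responses, emphasizing different frequency content.
        return self.horizontal(x) + self.vertical(x) + self.square(x)

if __name__ == "__main__":
    block = MultiDirectionConv(16)
    print(block(torch.randn(1, 16, 32, 32)).shape)
```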
Keyword :
Circuits and systems; compressed video artifact reduction; Convolution; Feature extraction; Quantization (signal); Task analysis; video compression; video quality enhancement; Video recording
Cite:
GB/T 7714 | Liu, W. , Gao, W. , Li, G. et al. Enlarged Motion-Aware and Frequency-Aware Network for Compressed Video Artifact Reduction [J]. | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (10) : 1-1 . |
MLA | Liu, W. et al. "Enlarged Motion-Aware and Frequency-Aware Network for Compressed Video Artifact Reduction" . | IEEE Transactions on Circuits and Systems for Video Technology 34 . 10 (2024) : 1-1 . |
APA | Liu, W. , Gao, W. , Li, G. , Ma, S. , Zhao, T. , Yuan, H. . Enlarged Motion-Aware and Frequency-Aware Network for Compressed Video Artifact Reduction . | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 34 (10) , 1-1 . |
Abstract :
Video compression leads to compression artifacts, among which Perceivable Encoding Artifacts (PEAs) degrade user perception. Most existing state-of-the-art Video Compression Artifact Removal (VCAR) methods indiscriminately process all artifacts, thus leading to over-enhancement in non-PEA regions. Therefore, accurate detection and localization of PEAs are crucial. In this paper, we propose the largest-ever Fine-grained PEA database (FPEA). First, we employ the popular video codecs VVC and AVS3, as well as their common test settings, to generate four types of spatial PEAs (blurring, blocking, ringing, and color bleeding) and two types of temporal PEAs (flickering and floating). Second, we design a labeling platform and recruit sufficient subjects to manually locate all the above types of PEAs. Third, we propose a voting mechanism combined with feature matching to synthesize all subjective labels into final PEA labels with fine-grained locations. Besides, we also provide Mean Opinion Score (MOS) values of all compressed video sequences. Experimental results show the effectiveness of the FPEA database on both VCAR and compressed Video Quality Assessment (VQA). We envision that the FPEA database will benefit the future development of VCAR, VQA, and perception-aware video encoders. The FPEA database has been made publicly available.
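Label synthesis by voting can be sketched as per-pixel majority voting over subject masks; the actual pipeline also uses feature matching, which is omitted in this illustrative snippet, and the threshold value is an assumption.

```python
# Tiny sketch: a pixel is kept as a PEA location only if at least `threshold`
# of the subjects marked it.
import numpy as np

def fuse_labels(subject_masks: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """subject_masks: (num_subjects, H, W) binary masks from individual labelers.
    Returns a fused binary mask via per-pixel voting."""
    votes = subject_masks.mean(axis=0)                 # fraction of subjects per pixel
    return (votes >= threshold).astype(np.uint8)

if __name__ == "__main__":
    masks = np.random.randint(0, 2, size=(5, 4, 4))    # toy masks from 5 subjects
    print(fuse_labels(masks))
```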
Keyword :
Perceivable encoding artifact; video compression; video compression artifact removal; video quality assessment
Cite:
GB/T 7714 | Lin, L. , Wang, M. , Yang, J. et al. Toward Efficient Video Compression Artifact Detection and Removal: A Benchmark Dataset [J]. | IEEE Transactions on Multimedia , 2024 , 26 : 1-12 . |
MLA | Lin, L. et al. "Toward Efficient Video Compression Artifact Detection and Removal: A Benchmark Dataset" . | IEEE Transactions on Multimedia 26 (2024) : 1-12 . |
APA | Lin, L. , Wang, M. , Yang, J. , Zhang, K. , Zhao, T. . Toward Efficient Video Compression Artifact Detection and Removal: A Benchmark Dataset . | IEEE Transactions on Multimedia , 2024 , 26 , 1-12 . |