Publication Search

Query:

Scholar name: 柯逍 (Ke, Xiao)

GFENet: Generalization Feature Extraction Network for Few-Shot Object Detection Scopus
Journal article | 2024 , 1-1 | IEEE Transactions on Circuits and Systems for Video Technology

Abstract :

Few-shot object detection achieves rapid detection of novel-class objects by training detectors with a minimal number of novel-class annotated instances. Transfer learning-based few-shot object detection methods have shown better performance compared to other methods such as meta-learning. However, when training with base-class data, the model may gradually bias towards learning the characteristics of each category in the base-class data, which could result in a decrease in learning ability during fine-tuning on novel classes, and further overfitting due to data scarcity. In this paper, we first find that the generalization performance of the base-class model has a significant impact on novel-class detection performance, and we propose a generalization feature extraction network framework to address this issue. This framework perturbs the base model during training to encourage it to learn generalization features and reduces the impact of changes in object shape and size on overall detection performance, improving the generalization performance of the base model. Additionally, we propose a feature-level data augmentation method based on self-distillation to further enhance the overall generalization ability of the model. Our method achieves state-of-the-art results on both the COCO and PASCAL VOC datasets, with a 6.94% improvement on the PASCAL VOC 10-shot dataset.
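To make the feature-level, self-distillation-style augmentation idea concrete, the sketch below perturbs backbone features with noise and enforces consistency with the unperturbed prediction. This is an illustrative PyTorch sketch only, not the authors' GFENet code; `FeatureNoiseAug`, `self_distill_loss`, and the noise scaling are assumptions.

```python
# Hedged sketch: feature-level augmentation + self-distillation consistency loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureNoiseAug(nn.Module):
    """Perturb intermediate features with Gaussian noise scaled to their std (assumption)."""
    def __init__(self, noise_ratio: float = 0.1):
        super().__init__()
        self.noise_ratio = noise_ratio

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return feat
        noise = torch.randn_like(feat) * feat.std(dim=(2, 3), keepdim=True)
        return feat + self.noise_ratio * noise

def self_distill_loss(backbone, head, images):
    """Consistency between predictions from clean and perturbed features."""
    feats = backbone(images)                      # (B, C, H, W)
    aug = FeatureNoiseAug()(feats)                # feature-level augmentation
    with torch.no_grad():
        teacher_logits = head(feats)              # clean branch acts as the teacher
    student_logits = head(aug)                    # perturbed branch is the student
    return F.kl_div(F.log_softmax(student_logits, dim=-1),
                    F.softmax(teacher_logits, dim=-1), reduction="batchmean")

backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 20))
print(self_distill_loss(backbone, head, torch.randn(2, 3, 64, 64)).item())
```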

Keyword :

Adaptation models; Computational modeling; Data augmentation; Data models; Feature extraction; Few-shot learning; Object detection; Self-distillation; Shape; Training; Transfer learning

Cite:


GB/T 7714 Ke, X. , Chen, Q. , Liu, H. et al. GFENet: Generalization Feature Extraction Network for Few-Shot Object Detection [J]. | IEEE Transactions on Circuits and Systems for Video Technology , 2024 : 1-1 .
MLA Ke, X. et al. "GFENet: Generalization Feature Extraction Network for Few-Shot Object Detection" . | IEEE Transactions on Circuits and Systems for Video Technology (2024) : 1-1 .
APA Ke, X. , Chen, Q. , Liu, H. , Guo, W. . GFENet: Generalization Feature Extraction Network for Few-Shot Object Detection . | IEEE Transactions on Circuits and Systems for Video Technology , 2024 , 1-1 .


StegFormer: Rebuilding the Glory of Autoencoder-Based Steganography EI
Conference paper | 2024 , 38 (3) , 2723-2731 | 38th AAAI Conference on Artificial Intelligence, AAAI 2024

Abstract :

Image hiding aims to conceal one or more secret images within a cover image of the same resolution. Due to strict capacity requirements, image hiding is commonly called large-capacity steganography. In this paper, we propose StegFormer, a novel autoencoder-based image-hiding model. StegFormer can conceal one or multiple secret images within a cover image of the same resolution while preserving the high visual quality of the stego image. In addition, to mitigate the limitations of current steganographic models in real-world scenarios, we propose a normalizing training strategy and a restrict loss to improve the reliability of the steganographic models under realistic conditions. Furthermore, we propose an efficient steganographic capacity expansion method to increase the capacity of steganography and enhance the efficiency of secret communication. Through this approach, we can increase the relative payload of StegFormer to 96 bits per pixel without any training strategy modifications. Experiments demonstrate that our StegFormer outperforms existing state-of-the-art (SOTA) models. In the case of single-image steganography, there is an improvement of more than 3 dB and 5 dB in PSNR for secret/recovery image pairs and cover/stego image pairs, respectively.
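As a back-of-the-envelope check of the quoted payload, assuming 8-bit RGB secret images at the same resolution as the cover (an assumption for illustration, not a detail taken from the paper), hiding four such secrets corresponds to 96 bits per cover pixel:

```python
# Illustrative payload arithmetic only; the assumptions are not from the paper.
bits_per_secret_pixel = 3 * 8          # RGB, 8 bits per channel
secrets_hidden = 4                     # same-resolution secret images (assumed)
relative_payload_bpp = secrets_hidden * bits_per_secret_pixel
print(relative_payload_bpp)            # 96 bits per pixel of the cover image
```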

Keyword :

Artificial intelligence; Image enhancement; Learning systems; Steganography

Cite:


GB/T 7714 Ke, Xiao , Wu, Huanqi , Guo, Wenzhong . StegFormer: Rebuilding the Glory of Autoencoder-Based Steganography [C] . 2024 : 2723-2731 .
MLA Ke, Xiao et al. "StegFormer: Rebuilding the Glory of Autoencoder-Based Steganography" . (2024) : 2723-2731 .
APA Ke, Xiao , Wu, Huanqi , Guo, Wenzhong . StegFormer: Rebuilding the Glory of Autoencoder-Based Steganography . (2024) : 2723-2731 .


基于时空交叉感知的实时动作检测方法 (Real-time action detection method based on spatio-temporal cross-perception) CSCD PKU
Journal article | 2024 | 电子学报 (Acta Electronica Sinica)

Abstract :

Spatio-temporal action detection relies on learning both the spatial and temporal information of videos. Current state-of-the-art CNN-based action detectors adopt 2D CNN or 3D CNN architectures and achieve remarkable results. However, because of the complexity of the network structures and the way spatio-temporal information is perceived, these methods usually run in a non-real-time, offline manner. The main challenges of spatio-temporal action detection are to design an efficient detection network architecture and to effectively perceive and fuse spatio-temporal features. With these problems in mind, this paper proposes a real-time action detection method based on spatio-temporal cross-perception. The method first enhances temporal information by shuffling and re-ordering the input video. Since a 2D or 3D backbone alone cannot effectively model spatio-temporal features, a multi-branch feature extraction network based on spatio-temporal cross-perception is proposed. To address the limited descriptive power of single-scale spatio-temporal features, a multi-scale attention network is proposed to learn long-term temporal dependencies and spatial context information. For fusing the temporal and spatial features from these two different sources, a new motion-saliency-enhanced fusion strategy is proposed that encodes and cross-maps spatio-temporal information, guides the fusion between temporal and spatial features, and highlights more discriminative spatio-temporal representations. Finally, action tubes are linked online from the frame-level detector results. The proposed method achieves accuracies of 84.71% and 78.4% on the two spatio-temporal action datasets UCF101-24 and JHMDB-21, respectively, outperforming existing state-of-the-art methods while running at 73 frames per second. In addition, to address the high inter-class similarity and easily confused hard samples in the JHMDB-21 dataset, a key-frame optical-flow action detection method based on action representations is proposed, which avoids computing redundant optical flow and further improves detection accuracy.
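As a rough illustration of cross spatio-temporal fusion, the sketch below combines a per-frame 2D branch and a clip-level 3D branch through a saliency-style gate. It is a hedged PyTorch toy, not the paper's network; `CrossSTFusion`, the channel count, and the gating form are assumptions.

```python
# Hedged sketch: gated fusion of 2D (appearance) and 3D (motion) branch features.
import torch
import torch.nn as nn

class CrossSTFusion(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.spatial = nn.Conv2d(3, channels, 3, padding=1)           # per-frame 2D branch
        self.temporal = nn.Conv3d(3, channels, (3, 3, 3), padding=1)  # clip-level 3D branch
        self.gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (B, 3, T, H, W); take the clip centre as the key frame
        b, c, t, h, w = clip.shape
        key = clip[:, :, t // 2]                        # (B, 3, H, W)
        spat = self.spatial(key)                        # spatial features of the key frame
        temp = self.temporal(clip).mean(dim=2)          # temporal features pooled over T
        g = self.gate(temp)                             # motion-saliency gate in [0, 1]
        return g * spat + (1 - g) * temp                # gated cross fusion

fused = CrossSTFusion()(torch.randn(2, 3, 8, 112, 112))
print(fused.shape)  # torch.Size([2, 64, 112, 112])
```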

Keyword :

Multi-scale attention (多尺度注意力); Real-time action detection (实时动作检测); Spatio-temporal cross-perception (时空交叉感知)

Cite:


GB/T 7714 柯逍 , 缪欣 , 郭文忠 . 基于时空交叉感知的实时动作检测方法 [J]. | 电子学报 , 2024 .
MLA 柯逍 et al. "基于时空交叉感知的实时动作检测方法" . | 电子学报 (2024) .
APA 柯逍 , 缪欣 , 郭文忠 . 基于时空交叉感知的实时动作检测方法 . | 电子学报 , 2024 .

Version :

基于时空交叉感知的实时动作检测方法 (Real-time action detection method based on spatio-temporal cross-perception) CSCD PKU
Journal article | 2024 , 52 (2) , 574-588 | 电子学报
基于时空交叉感知的实时动作检测方法 (Real-time action detection method based on spatio-temporal cross-perception) CSCD PKU
Journal article | 2024 , 52 (02) , 574-588 | 电子学报
Splitting the backbone: A novel hierarchical method for assessing light field image quality SCIE
Journal article | 2024 , 178 | OPTICS AND LASERS IN ENGINEERING

Abstract :

The rising popularity of light field imaging underscores the pivotal role of image quality in user experience. However, evaluating the quality of light field images presents significant challenges owing to their high-dimensional nature. Current quality assessment methods for light field images predominantly rely on machine learning or statistical analysis, often overlooking the interdependence among pixels. To overcome this limitation, we propose an innovative approach that employs a universal backbone network and introduces a dual-task framework for feature extraction. Specifically, we integrate a staged "primary-secondary" hierarchical evaluation mode into the universal backbone networks, enabling accurate quality score inference while preserving the intrinsic information of the original image. Our proposed approach reduces inference time by over 75% compared to existing methods, simultaneously achieving state-of-the-art results in terms of evaluation metrics. By harnessing the efficiency of neural networks, our framework offers an effective solution for the quality assessment of light field images, providing superior accuracy and speed compared to current methodologies.
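The "primary-secondary" dual-task idea can be pictured as one shared backbone with a quality-regression head and an auxiliary head that preserves the image's intrinsic information. The PyTorch sketch below is an assumption-laden illustration (module names, shapes, and the choice of reconstruction as the secondary task are mine, not the paper's).

```python
# Hedged sketch: one shared backbone, a primary quality head and a secondary reconstruction head.
import torch
import torch.nn as nn

class DualTaskIQA(nn.Module):
    def __init__(self, channels: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU())
        self.quality_head = nn.Sequential(                # primary task: quality score
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1))
        self.recon_head = nn.Sequential(                  # secondary task: reconstruction
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(channels, 3, 4, stride=2, padding=1))

    def forward(self, x):
        feats = self.backbone(x)
        return self.quality_head(feats).squeeze(-1), self.recon_head(feats)

score, recon = DualTaskIQA()(torch.randn(2, 3, 128, 128))
print(score.shape, recon.shape)  # torch.Size([2]) torch.Size([2, 3, 128, 128])
```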

Keyword :

Deep learning; Image quality assessment; Light field images; Multitasking mode

Cite:


GB/T 7714 Guo, Wenzhong , Wang, Hanling , Ke, Xiao . Splitting the backbone: A novel hierarchical method for assessing light field image quality [J]. | OPTICS AND LASERS IN ENGINEERING , 2024 , 178 .
MLA Guo, Wenzhong et al. "Splitting the backbone: A novel hierarchical method for assessing light field image quality" . | OPTICS AND LASERS IN ENGINEERING 178 (2024) .
APA Guo, Wenzhong , Wang, Hanling , Ke, Xiao . Splitting the backbone: A novel hierarchical method for assessing light field image quality . | OPTICS AND LASERS IN ENGINEERING , 2024 , 178 .

Version :

Splitting the backbone: A novel hierarchical method for assessing light field image quality Scopus
Journal article | 2024 , 178 | Optics and Lasers in Engineering
Splitting the backbone: A novel hierarchical method for assessing light field image quality EI
Journal article | 2024 , 178 | Optics and Lasers in Engineering
Two-path target-aware contrastive regression for action quality assessment SCIE
Journal article | 2024 , 664 | INFORMATION SCIENCES

Abstract :

Action quality assessment (AQA) is a challenging vision task due to the complexity and variance of the scoring rules embedded in the videos. Recent approaches have reduced the prediction difficulty of AQA via learning action differences between videos, but there are still challenges in learning scoring rules and capturing feature differences. To address these challenges, we propose a two-path target-aware contrastive regression (T2CR) framework. We propose to fuse direct and contrastive regression and exploit the consistency of information across multiple visual fields. Specifically, we first directly learn the relational mapping between global video features and scoring rules, which builds occupational domain prior knowledge to better capture local differences between videos. Then, we acquire the auxiliary visual fields of the videos through sparse sampling to learn the commonality of feature representations in multiple visual fields and eliminate the effect of subjective noise from a single visual field. To demonstrate the effectiveness of T2CR, we conduct extensive experiments on four AQA datasets (MTL-AQA, FineDiving, AQA-7, JIGSAWS). Our method is superior to state-of-the-art methods without elaborate structural design and fine-grained information.
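A minimal way to picture fusing direct and contrastive regression: predict an absolute score from the query features and, in parallel, a score difference against an exemplar with a known score, then combine the two. The sketch below is illustrative; `TwoPathRegressor`, the equal weighting, and the feature dimension are assumptions, not the published T2CR design.

```python
# Hedged sketch: combining a direct score path and an exemplar-relative path.
import torch
import torch.nn as nn

class TwoPathRegressor(nn.Module):
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.direct = nn.Linear(feat_dim, 1)              # absolute score from query features
        self.relative = nn.Linear(feat_dim * 2, 1)        # score difference vs. an exemplar

    def forward(self, query_feat, exemplar_feat, exemplar_score):
        direct_score = self.direct(query_feat)
        delta = self.relative(torch.cat([query_feat, exemplar_feat], dim=-1))
        contrastive_score = exemplar_score + delta        # regress the difference, add it back
        return 0.5 * (direct_score + contrastive_score)   # fuse the two paths (assumed weighting)

q, e = torch.randn(4, 128), torch.randn(4, 128)
print(TwoPathRegressor()(q, e, torch.full((4, 1), 80.0)).shape)  # torch.Size([4, 1])
```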

Keyword :

Action quality assessment; Multi-view information; Video understanding

Cite:


GB/T 7714 Ke, Xiao , Xu, Huangbiao , Lin, Xiaofeng et al. Two-path target-aware contrastive regression for action quality assessment [J]. | INFORMATION SCIENCES , 2024 , 664 .
MLA Ke, Xiao et al. "Two-path target-aware contrastive regression for action quality assessment" . | INFORMATION SCIENCES 664 (2024) .
APA Ke, Xiao , Xu, Huangbiao , Lin, Xiaofeng , Guo, Wenzhong . Two-path target-aware contrastive regression for action quality assessment . | INFORMATION SCIENCES , 2024 , 664 .

Version :

Two-path target-aware contrastive regression for action quality assessment Scopus
Journal article | 2024 , 664 | Information Sciences
Two-path target-aware contrastive regression for action quality assessment EI
Journal article | 2024 , 664 | Information Sciences
Text-based person search via cross-modal alignment learning SCIE
Journal article | 2024 , 152 | PATTERN RECOGNITION

Abstract :

Text-based person search aims to use text descriptions to search for corresponding person images. However, due to the obvious pattern differences between the image and text modalities, aligning the two modalities remains a challenging problem. Most existing approaches only consider semantic alignment within a global context or partial parts, lacking consideration of how to match image and text in terms of differences in modality information. Therefore, in this paper, we propose an efficient Modality-Aligned Person Search network (MAPS) to address this problem. First, we suppress image-specific information by image feature style normalization to achieve modality knowledge alignment and reduce information differences between text and image. Second, we design a multi-granularity modal feature fusion and optimization method to enrich the modal features. To address the problem of useless and redundant information in the multi-granularity fused features, we propose a Multi-granularity Feature Self-optimization Module (MFSM) to adaptively adjust the corresponding contributions of different granularities in the fused features of the two modalities. Finally, to address the problem of information inconsistency in the training and inference stages, we propose a Cross-instance Feature Alignment (CFA) to help the network enhance category-level generalization ability and improve retrieval performance. Extensive experiments demonstrate that our MAPS achieves state-of-the-art performance on all text-based person search datasets, and significantly outperforms other existing methods.
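Image feature style normalization can be approximated by removing per-image, per-channel statistics (an instance-normalization-like operation). The snippet below is a generic illustration under that assumption; `style_normalize` is a hypothetical helper, not the MAPS implementation.

```python
# Hedged sketch: strip per-image "style" statistics from a feature map.
import torch

def style_normalize(feat: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """feat: (B, C, H, W) image feature map -> style-normalized features."""
    mu = feat.mean(dim=(2, 3), keepdim=True)        # per-image, per-channel mean
    sigma = feat.std(dim=(2, 3), keepdim=True)      # per-image, per-channel std
    return (feat - mu) / (sigma + eps)              # remove image-specific style statistics

x = torch.randn(2, 256, 24, 8)
print(style_normalize(x).mean(dim=(2, 3)).abs().max() < 1e-4)  # ~zero mean per channel
```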

Keyword :

CNN; Cross-modality; Image-text retrieval; Person re-identification

Cite:


GB/T 7714 Ke, Xiao , Liu, Hao , Xu, Peirong et al. Text-based person search via cross-modal alignment learning [J]. | PATTERN RECOGNITION , 2024 , 152 .
MLA Ke, Xiao et al. "Text-based person search via cross-modal alignment learning" . | PATTERN RECOGNITION 152 (2024) .
APA Ke, Xiao , Liu, Hao , Xu, Peirong , Lin, Xinru , Guo, Wenzhong . Text-based person search via cross-modal alignment learning . | PATTERN RECOGNITION , 2024 , 152 .

Version :

Text-based person search via cross-modal alignment learning Scopus
Journal article | 2024 , 152 | Pattern Recognition
Text-based person search via cross-modal alignment learning EI
Journal article | 2024 , 152 | Pattern Recognition

面料(实例分割) [Fabric (instance segmentation)] incoPat
Design patent | 2022-11-05 | CN202230737663.7

Abstract :

1. Name of the design product: Fabric (instance segmentation). 2. Use of the design product: for use as a garment fabric. 3. Key design points: the combination of shape, pattern, and color. 4. Picture or photograph that best shows the design points: the front view. 5. The design for which protection is sought includes color. 6. Surfaces not easily or commonly seen in use: the rear view is omitted; as this is a flat (two-dimensional) product, the left, right, top, and bottom views are omitted. 7. Other remarks: the design is a single, non-repeating unit.

Cite:


GB/T 7714 柯逍 , 石晓楠 . 面料(实例分割) : CN202230737663.7[P]. | 2022-11-05 00:00:00 .
MLA 柯逍 et al. "面料(实例分割)" : CN202230737663.7. | 2022-11-05 00:00:00 .
APA 柯逍 , 石晓楠 . 面料(实例分割) : CN202230737663.7. | 2022-11-05 00:00:00 .


Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment SCIE
Journal article | 2023 , 70 (1) , 223-237 | IEEE TRANSACTIONS ON BROADCASTING

Abstract :

While feature extraction employing pre-trained models proves effective and efficient for no-reference video tasks, it falls short of adequately accounting for the intricacies of the Human Visual System (HVS). In this study, we propose, for the first time, an approach that Integrates spatio-temporal Visual Stimuli into Video Quality Assessment (IVS-VQA). Exploiting the heightened sensitivity of optic rod cells to edges and motion, along with the capability to track motion via conjugate gaze, our approach affords a distinctive perspective on video quality assessment. To capture significant changes at each timestamp, we incorporate edge information to enhance the feature extraction of the pre-trained model. To tackle pronounced motion across the timeline, we introduce an interactive temporal disparity query employing a dual-branch transformer architecture. This approach adeptly introduces feature biases and extracts comprehensive global attention, culminating in enhanced emphasis on non-continuous segments within the video. Additionally, we integrate low-level color texture information within the temporal domain to comprehensively capture distortions spanning various scales, both higher and lower. Empirical results illustrate that the proposed model attains state-of-the-art performance across all six benchmark databases, along with their corresponding weighted averages.
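One simple way to realise "edge information enhances the pre-trained feature extraction" is to boost frames with a Sobel edge map before the backbone. The snippet below is a hedged toy under that assumption; the blending weight and luminance conversion are illustrative, not taken from the paper.

```python
# Hedged sketch: Sobel edge emphasis applied to frames before a (not shown) backbone.
import torch
import torch.nn.functional as F

def sobel_edges(gray: torch.Tensor) -> torch.Tensor:
    """gray: (B, 1, H, W) -> edge magnitude of the same shape."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

frames = torch.rand(2, 3, 224, 224)                 # a couple of video frames
gray = frames.mean(dim=1, keepdim=True)             # simple luminance proxy (assumption)
edge_boosted = frames + 0.5 * sobel_edges(gray)     # emphasise edges before feature extraction
print(edge_boosted.shape)                           # torch.Size([2, 3, 224, 224])
```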

Keyword :

Dual-branch transformer network; No-reference video quality assessment; Spatial and temporal stimuli; User-generated content

Cite:


GB/T 7714 Guo, Wenzhong , Zhang, Kairui , Ke, Xiao . Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment [J]. | IEEE TRANSACTIONS ON BROADCASTING , 2023 , 70 (1) : 223-237 .
MLA Guo, Wenzhong et al. "Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment" . | IEEE TRANSACTIONS ON BROADCASTING 70 . 1 (2023) : 223-237 .
APA Guo, Wenzhong , Zhang, Kairui , Ke, Xiao . Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment . | IEEE TRANSACTIONS ON BROADCASTING , 2023 , 70 (1) , 223-237 .

Version :

Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment EI
Journal article | 2024 , 70 (1) , 223-237 | IEEE Transactions on Broadcasting
Integrates Spatiotemporal Visual Stimuli for Video Quality Assessment Scopus
Journal article | 2023 , 70 (1) , 1-15 | IEEE Transactions on Broadcasting
U-Transformer-based multi-levels refinement for weakly supervised action segmentation SCIE
Journal article | 2023 , 149 | PATTERN RECOGNITION
WoS CC Cited Count: 1

Abstract :

Action segmentation is a research hotspot in human action analysis, which aims to split videos into segments of different actions. Recent algorithms have achieved great success in modeling based on temporal convolution, but these methods weight local or global timing information through additional modules, ignoring the existing long-term and short-term information connections between actions. This paper proposes a U-Transformer structure based on multi-level refinement, introduces neighborhood attention to learn the neighborhood information of adjacent frames, and aggregates video frame features to effectively process long-term sequence information. Then a loss optimization strategy is proposed to smooth the original classification results and generate a more accurate calibration sequence by introducing a pairing similarity optimization method based on deep feature learning. In addition, we propose a timestamp-supervised training method to generate complete information for actions based on pseudo-label predictions of action boundaries. Experiments on three challenging action segmentation datasets, 50Salads, GTEA, and Breakfast, show that our model outperforms state-of-the-art models, and our weakly supervised model also performs comparably to fully supervised methods.
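Neighborhood attention over adjacent frames can be emulated by masking standard dot-product attention to a temporal band. The sketch below is an illustrative approximation; `neighborhood_attention`, the window size, and the absence of learned projections are simplifying assumptions, not the paper's module.

```python
# Hedged sketch: banded (neighborhood) attention over frame-level features.
import torch
import torch.nn.functional as F

def neighborhood_attention(x: torch.Tensor, window: int = 7) -> torch.Tensor:
    """x: (B, T, D) frame features; each frame attends only to +/- `window` frames."""
    b, t, d = x.shape
    idx = torch.arange(t)
    mask = (idx[None, :] - idx[:, None]).abs() > window      # True = blocked position
    scores = x @ x.transpose(1, 2) / d ** 0.5                # (B, T, T) similarities
    scores = scores.masked_fill(mask, float("-inf"))         # keep only the temporal band
    return F.softmax(scores, dim=-1) @ x                     # neighborhood-weighted features

out = neighborhood_attention(torch.randn(2, 100, 64))
print(out.shape)  # torch.Size([2, 100, 64])
```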

Keyword :

Action segmentation; Multi-stages refinement; Timestamp supervision; U-Transformer

Cite:


GB/T 7714 Ke, Xiao , Miao, Xin , Guo, Wenzhong . U-Transformer-based multi-levels refinement for weakly supervised action segmentation [J]. | PATTERN RECOGNITION , 2023 , 149 .
MLA Ke, Xiao et al. "U-Transformer-based multi-levels refinement for weakly supervised action segmentation" . | PATTERN RECOGNITION 149 (2023) .
APA Ke, Xiao , Miao, Xin , Guo, Wenzhong . U-Transformer-based multi-levels refinement for weakly supervised action segmentation . | PATTERN RECOGNITION , 2023 , 149 .

Version :

U-Transformer-based multi-levels refinement for weakly supervised action segmentation Scopus
Journal article | 2024 , 149 | Pattern Recognition
U-Transformer-based multi-levels refinement for weakly supervised action segmentation EI
Journal article | 2024 , 149 | Pattern Recognition