Author:

Lu, Fang (Lu, Fang.) [1] | Bai, Qingyuan (Bai, Qingyuan.) [2] (Scholars:白清源)

Indexed by:

CPCI-S EI Scopus

Abstract:

This paper studies a special case of semi-supervised text categorization: building a text classifier from only a set P of labeled positive documents from one class (the positive class) and a large set U of unlabeled documents drawn from both the positive class and other diverse classes (the negative class). This kind of semi-supervised text classification is called positive and unlabeled learning (PU-Learning). Although effective methods for PU-Learning exist, they do not perform well when the labeled positive documents are very few. In this paper, we propose a refined PU-Learning method based on the known technique that combines the Rocchio and K-means algorithms. Because the set P may be very small (<= 5%), we not only extract more reliable negative documents from U but also enlarge P by extracting the most reliable positive documents from U. Our experimental results show that the refined method can perform better when the set P is very small.
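The two extraction steps described in the abstract can be illustrated with a minimal sketch. It assumes the common Rocchio formulation for PU learning, in which U is provisionally treated as negative and each class gets a prototype vector; the abstract does not give the paper's exact formulas, so the weights `alpha` and `beta`, the margin-based ranking used to enlarge P, and the function names are all assumptions made for illustration. The K-means refinement step is omitted.

```python
import math

def normalize(v):
    """Scale a term vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def mean(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two term vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def rocchio_prototypes(P, U, alpha=16.0, beta=4.0):
    """Build positive and negative prototypes, treating U as negative.
    alpha and beta are assumed weights, not values taken from the paper."""
    mp = mean([normalize(d) for d in P])
    mu = mean([normalize(d) for d in U])
    pos = [alpha * p - beta * u for p, u in zip(mp, mu)]
    neg = [alpha * u - beta * p for p, u in zip(mp, mu)]
    return pos, neg

def split_unlabeled(P, U, extra_pos=1):
    """Return (indices of reliable negatives in U, indices of likely positives in U).
    Ranking likely positives by similarity margin is a hypothetical choice."""
    pos, neg = rocchio_prototypes(P, U)
    margins = [cosine(pos, d) - cosine(neg, d) for d in U]
    reliable_neg = [i for i, m in enumerate(margins) if m <= 0]
    candidates = [i for i, m in enumerate(margins) if m > 0]
    candidates.sort(key=lambda i: margins[i], reverse=True)
    return reliable_neg, candidates[:extra_pos]

# Toy term vectors: the first dimension dominates in positive documents.
P = [[1.0, 0.0], [0.9, 0.1]]
U = [[1.0, 0.1], [0.0, 1.0], [0.1, 0.9], [0.2, 1.0]]
reliable_neg, likely_pos = split_unlabeled(P, U)
print(reliable_neg, likely_pos)
```

In this sketch the documents indexed by `likely_pos` would be added to P and those indexed by `reliable_neg` would serve as negative training data for a downstream classifier.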

Keyword:

cluster; semi-supervised learning; text categorization

Affiliations:

  • [ 1 ] [Lu, Fang]Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China
  • [ 2 ] [Bai, Qingyuan]Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China

Reprint Address:

  • 白清源

    [Bai, Qingyuan]Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China

Source:

2010 3RD INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND INFORMATICS (BMEI 2010), VOLS 1-7

ISSN: 1948-2914

Year: 2010

Page: 3075-3079

Language: English

Cited Count:

WoS CC Cited Count: 7

SCOPUS Cited Count: 8

ESI Highly Cited Papers on the List: 0
