Author:

Lu, Fang [1] | Bai, Qingyuan [2]

Indexed by:

EI

Abstract:

This paper studies a special case of semi-supervised text categorization. We want to build a text classifier from only a set P of labeled positive documents from one class (the positive class) and a large set U of unlabeled documents drawn from both the positive class and other diverse classes (the negative class). This kind of semi-supervised text classification is called positive and unlabeled learning (PU-Learning). Although some effective methods for PU-Learning exist, they do not perform well when there are very few labeled positive documents. In this paper, we propose a refined PU-Learning method based on the known technique that combines the Rocchio and K-means algorithms. Because the set P may be very small (≤5%), we not only extract more reliable negative documents from U but also enlarge P by extracting the most reliable positive documents from U. Our experimental results show that the refined method performs better when the set P is very small. ©2010 IEEE.
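The two extraction steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the K-means refinement is left out, the function names (`rocchio_prototypes`, `split_unlabeled`) and the `top_k` parameter are hypothetical, and the weights `alpha=16`, `beta=4` are values commonly used with Rocchio in the PU-learning literature, assumed here rather than taken from this record. Documents are assumed to be term-weight vectors of equal length.

```python
import math

def normalize(v):
    # Scale a vector to unit length; leave an all-zero vector unchanged.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def mean_vector(vectors):
    # Component-wise mean of equally sized vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def rocchio_prototypes(P, U, alpha=16.0, beta=4.0):
    # Rocchio prototypes, treating all of U as negative for now.
    # alpha and beta are assumed values, not taken from the paper.
    mp = mean_vector([normalize(d) for d in P])
    mu = mean_vector([normalize(d) for d in U])
    pos = [alpha * p - beta * u for p, u in zip(mp, mu)]
    neg = [alpha * u - beta * p for p, u in zip(mp, mu)]
    return pos, neg

def split_unlabeled(P, U, top_k=1):
    # Returns (indices of reliable negatives in U,
    #          indices of the top_k most reliable positives in U).
    pos, neg = rocchio_prototypes(P, U)
    margins = [cosine(d, pos) - cosine(d, neg) for d in U]
    reliable_neg = [i for i, m in enumerate(margins) if m < 0]
    candidates = sorted(
        (i for i, m in enumerate(margins) if m > 0),
        key=lambda i: margins[i],
        reverse=True,
    )
    return reliable_neg, candidates[:top_k]

# Toy data: one labeled positive document and four unlabeled ones.
P = [[3, 1, 0, 0]]
U = [[2, 1, 0, 0], [0, 0, 3, 1], [0, 1, 2, 2], [0, 0, 1, 3]]
reliable_neg, new_pos = split_unlabeled(P, U)
print(reliable_neg, new_pos)  # [1, 2, 3] [0]
```

In this sketch, the documents indexed by `new_pos` would be added to P and those indexed by `reliable_neg` would serve as the negative training set for a final classifier. Choosing `top_k` is a trade-off: a larger value enlarges P faster but risks admitting negatives.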

Keyword:

Biomedical engineering; Classification (of information); K-means clustering; Machine learning; Supervised learning; Text processing

Community:

  • [1] [Lu, Fang] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
  • [2] [Bai, Qingyuan] College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China

Year: 2010

Volume: 7

Page: 3075-3079

Language: English

SCOPUS Cited Count: 10

ESI Highly Cited Papers on the List: 0
