Semi-supervised Text Categorization with Only a Few Positive and Unlabeled Documents

Authors：

Lu, Fang ^[1] | Bai, Qingyuan (白清源) ^[2]

Indexed by：

CPCI-S EI Scopus

Abstract：

This paper studies a special case of semi-supervised text categorization. We want to build a text classifier from only a set P of labeled positive documents from one class (the positive class) and a large set U of unlabeled documents drawn from both the positive class and other diverse classes (the negative class). This kind of semi-supervised text classification is called positive and unlabeled learning (PU-Learning). Although some effective methods for PU-Learning exist, they do not perform well when the labeled positive documents are very few. In this paper, we propose a refined PU-Learning method built on the known technique that combines the Rocchio and K-means algorithms. Because the set P may be very small (≤ 5%), we not only extract more reliable negative documents from U but also enlarge P by extracting the most reliable positive documents from U. Our experimental results show that the refined method performs better when the set P is very small.
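The idea in the abstract of pulling reliable negatives and extra positives out of U can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only a generic Rocchio-style step, omits the K-means refinement entirely, and every function name, the toy term-count vectors, the weights alpha and beta, and the n_extra_pos cutoff are assumptions made for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def mean(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def rocchio_prototypes(P, U, alpha=16.0, beta=4.0):
    """Build positive/negative prototypes, treating all of U as negative.
    alpha and beta are illustrative weights, not values from the paper."""
    mp = mean([normalize(d) for d in P])
    mu = mean([normalize(d) for d in U])
    c_pos = [alpha * p - beta * u for p, u in zip(mp, mu)]
    c_neg = [alpha * u - beta * p for p, u in zip(mp, mu)]
    return c_pos, c_neg

def split_unlabeled(P, U, n_extra_pos=1):
    """Return (indices of reliable negatives in U,
               indices of the most positive-looking documents in U)."""
    c_pos, c_neg = rocchio_prototypes(P, U)
    # margin > 0 means the document is closer to the positive prototype
    margins = [cosine(d, c_pos) - cosine(d, c_neg) for d in U]
    reliable_neg = [i for i, m in enumerate(margins) if m < 0]
    candidates = sorted((i for i, m in enumerate(margins) if m > 0),
                        key=lambda i: margins[i], reverse=True)
    return reliable_neg, candidates[:n_extra_pos]

# Toy term-count vectors over a hypothetical 4-term vocabulary.
P = [[3, 1, 0, 0], [2, 2, 0, 0]]
U = [[3, 2, 0, 0], [0, 0, 3, 1], [0, 0, 1, 3], [0, 1, 2, 2]]
reliable_neg, extra_pos = split_unlabeled(P, U)
print(reliable_neg, extra_pos)
```

In this toy setup the enlarged positive set would be P plus the documents indexed by extra_pos, and a final classifier could then be trained on that set against the reliable negatives. How the paper actually selects and ranks candidates is not stated in the abstract.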

Keywords：

cluster; semi-supervised learning; text categorization

Affiliations：

[ 1 ] [Lu, Fang] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China
[ 2 ] [Bai, Qingyuan] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China

Reprint Address：

白清源
[Bai, Qingyuan] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China

Version：

Semi-supervised text categorization with only a few positive and unlabeled documents
2010，Proceedings - 2010 3rd International Conference on Biomedical Engineering and Informatics, BMEI 2010

Source ：

2010 3RD INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND INFORMATICS (BMEI 2010), VOLS 1-7

ISSN： 1948-2914

Year： 2010

Page： 3075-3079

Language： English

Cited Count：

WoS CC Cited Count： 7

SCOPUS Cited Count： 8

ESI Highly Cited Papers on the List： 0

Affiliated Colleges：

School of Mathematics and Statistics (records not explicitly assigned to a specific department of this school)
