Search Sciweavers | Sciweavers

WSDM
2009
ACM

136 views · Data Mining

Mining common topics from multiple asynchronous text streams

16 years 1 month ago

Download wsdm2009.org

Text streams are becoming more and more ubiquitous, in the forms of news feeds, weblog archives and so on, which result in a large volume of data. An effective way to explore the...

Xiang Wang 0002, Kai Zhang, Xiaoming Jin, Dou Shen

PAISI
2009
Springer

161 views · Security Privacy

Discovering Compatible Top-K Theme Patterns from Text Based on Users' Preferences

16 years 1 month ago

Download www.yongxintong.net

Discovering a representative set of theme patterns from a large amount of text for interpreting their meaning has always been concerned by researches of both data mining and inform...

Yongxin Tong, Shilong Ma, Dan Yu, Yuanyuan Zhang, ...

ICDE
2007
IEEE

211 views · Database

Document Representation and Dimension Reduction for Text Clustering

16 years 25 days ago

Download torch.cs.dal.ca

Increasingly large text datasets and the high dimensionality associated with natural language create a great challenge in text mining. In this research, a systematic study is cond...

M. Mahdi Shafiei, Singer Wang, Roger Zhang, Evange...

CICLING
2010
Springer

174 views · Natural Language Processing

Word Length n-Grams for Text Re-use Detection

15 years 10 months ago

Download users.dsic.upv.es

The automatic detection of shared content in written documents – which includes text reuse and its unacknowledged commitment, plagiarism – has become an important probl...

Alberto Barrón-Cedeño, Chiara Basile...

DASFAA
2004
IEEE

135 views · Database

Semi-supervised Text Classification Using Partitioned EM

15 years 10 months ago

Download www.cs.uic.edu

Text classification using a small labeled set and a large unlabeled data is seen as a promising technique to reduce the labor-intensive and time consuming effort of labeling traini...

Gao Cong, Wee Sun Lee, Haoran Wu, Bing Liu
