Abstract. This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
: Question answering systems have proven to be helpful to users because they can provide succinct answers that do not require users to wade through a large number of documents. How...
Jimmy J. Lin, Dennis Quan, Vineet Sinha, Karun Bak...
Abstract This vandalism detector uses features primarily derived from a wordpreserving differencing of the text for each Wikipedia article from before and after the edit, along wit...
Abstract. We report our experiences from participating to the controlled experiment of the ImageCLEF 2010 Wikipedia Retrieval task. We built an experimental search engine which com...
Avi Arampatzis, Savvas A. Chatzichristofis, Konsta...
The correct web site text content must be help to the visitors to find what they are looking for. However, the reality is quite different, many times the web page text content is a...