In this paper, we present the AutoCat system for product classification. AutoCat uses a vector space model, modified to consider product attributes unavailable in traditional docu...
One of the central challenges in sentiment-based text categorization is that not every portion of a document is equally informative for inferring the overall sentiment of the docum...
Anchor text has been shown to be effective in ranking [6] and a variety of information retrieval tasks on web pages. Some authors have expanded on anchor text by using the words ar...
Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain probl...
Word meaning ambiguity has always been an important problem in information retrieval and extraction, as well as in text mining (document clustering and classification). Knowledge di...
Henryk Rybinski, Marzena Kryszkiewicz, Grzegorz Pr...