In this paper, we propose a new application of a Bayesian language model based on the Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distr...
Provenance is the documentation concerning the origin of a result generated by a process, and provides explanations about who, how, what resources were used in a process, and the ...
Shrija Rajbhandari, Ian Wootten, Ali Shaikh Ali, O...
Short-text clustering is one of the most difficult tasks in natural language processing because of the low frequencies of the document terms. We are interested in analysing these kind...
Diego Ingaramo, David Pinto, Paolo Rosso, Marcelo ...
WikiWoods is an ongoing initiative to provide rich syntacto-semantic annotations for English Wikipedia. We sketch an automated processing pipeline to extract relevant textual cont...
Dan Flickinger, Stephan Oepen, Gisle Ytrestø...
Stemming is a fundamental step in processing textual data preceding the tasks of information retrieval, text mining, and natural language processing. The common goal of stemming ...