Measuring the similarity between documents and queries has been extensively studied in information retrieval. However, there are a growing number of tasks that require computing th...
In order to search within corpora written in two or more languages, the simplest and most effective approach is to translate the submitted request into the required language(s). To...
Community QA portals provide an important resource for non-factoid question-answering. The inherent noisiness of user-generated data makes the identification of high-quality cont...
The term online reputation addresses trust relationships amongst agents in dynamic open systems. These can appear as ratings, recommendations, referrals and feedback. Several repu...
As with any application of machine learning, web search ranking requires labeled data. The labels usually come in the form of relevance assessments made by editors. Click logs can...