Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...
We introduce the Ranked Feature Fusion framework for information retrieval system design. Typical information retrieval formalisms such as the vector space model, the bestmatch mo...
The typical workload in a database system consists of a mixture of multiple queries of different types, running concurrently and interacting with each other. Hence, optimizing per...
Inverse document frequency (IDF) is one of the most useful and widely used concepts in information retrieval. There have been various attempts to provide theoretical justification...
Finding biological entities (such as genes or proteins) that satisfy certain conditions from texts is an important and challenging task in biomedical information retrieval and tex...