Web entities, such as documents and hyperlinks, are created for different purposes, or intents. Existing intent-based retrieval methods largely focus on information seekers’ int...
The bit-sliced index (BSI) was originally defined in [ONQ97]. The current paper introduces the concept of BSI arithmetic. For any two BSI's X and Y on a table T, we show how ...
Denis Rinfret, Patrick E. O'Neil, Elizabeth J. O'N...
It is crucial for a web crawler to distinguish between ephemeral and persistent content. Ephemeral content (e.g., quote of the day) is usually not worth crawling, because by the t...
: We describe our participation in the TREC 2003 Robust and Web tracks. For the Robust track, we experimented with the impact of stemming and feedback on the worst scoring topics. ...
Jaap Kamps, Christof Monz, Maarten de Rijke, B&oum...
Information filtering (IF) systems usually filter data items by correlating a vector of terms (keywords) that represent the user profile with similar vectors of terms that represe...