Term extraction relates to extracting the most characteristic or important terms (words or phrases) in a document. This information is commonly used for improving the accuracy of ...
As the number of available Web pages grows, users experience increasing difficulty finding documents relevant to their interests. One of the underlying reasons for this is that mo...
SALSA is a link-based ranking algorithm that takes the result set of a query as input, extends the set to include additional neighboring documents in the web graph, and performs a...
Increasingly, taxonomies are being developed for a wide variety of industrial domains and specific applications within those domains. These industry or application specific taxono...
Chin Pang Cheng, Jiayi Pan, Gloria T. Lau, Kincho ...
Efficient similarity search in high-dimensional spaces is important to content-based retrieval systems. Recent studies have shown that sketches can effectively approximate L1 dist...