Text classification has matured as a research discipline over the last decade. Independently, business intelligence over structured databases has long been a source of insights fo...
Multi-label problems arise in various domains such as multitopic document categorization and protein function prediction. One natural way to deal with such problems is to construc...
The BioJournalMonitor is a decision support system for the analysis of trends and topics in the biomedical literature. Its main goal is to identify potential diagnostic and therap...
Supervised classification methods have been shown to be very effective for a large number of applications. They require a training data set whose instances are labeled to indicate...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...