We present a general framework for extracting specific information “on demand” from a large corpus such as the Web under resource constraints. Given a database wit...
Term signal is an existing text representation that depicts a term as a vector of frequencies of occurrences in a number of user-defined partitions of a document. Although term si...
Supphachai Thaicharoen, Tom Altman, Krzysztof J. C...
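The term-signal representation described in the abstract above can be illustrated with a minimal sketch. It assumes the document is already tokenized and that the partitions are equal-width, contiguous slices of the token sequence; the abstract says only that partitions are user-defined, so this choice, the function name `term_signal`, and its parameters are illustrative assumptions rather than the authors' method.

```python
def term_signal(tokens, term, n_partitions):
    """Count occurrences of `term` in each of `n_partitions` slices.

    Assumption: partitions are equal-width, contiguous ranges of token
    positions. Other partitioning schemes are equally valid.
    """
    if n_partitions <= 0:
        raise ValueError("n_partitions must be positive")
    signal = [0] * n_partitions
    n = len(tokens)
    for position, token in enumerate(tokens):
        if token == term:
            # Map the token position to its partition index (0-based).
            signal[position * n_partitions // n] += 1
    return signal


# Hypothetical example document, split on whitespace.
tokens = "data mining finds patterns in data and data helps mining".split()
data_signal = term_signal(tokens, "data", 5)
mining_signal = term_signal(tokens, "mining", 5)
```

With ten tokens and five partitions, each partition covers two consecutive tokens, so the resulting vector records where in the document a term occurs as well as how often.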
Although very widely used in unsupervised data mining, most clustering methods are affected by the instability of the resulting clusters w.r.t. the initialization of the algorithm ...
An easily implementable mixed-integer algorithm is proposed that generates a nonlinear kernel support vector machine (SVM) classifier with reduced input space features. A single ...
Large graph databases are commonly collected and analyzed in numerous domains. For reasons of either space efficiency or privacy protection (e.g., in the case of socia...