We cast name discrimination as a problem in clustering short contexts. Each occurrence of an ambiguous name is treated independently, and represented using second?order context vec...
: We combine the speed and scalability of information retrieval with the generally superior classification accuracy offered by machine learning, yielding a two-phase text classifie...
Recent work has shown the feasibility and promise of templateindependent Web data extraction. However, existing approaches use decoupled strategies ? attempting to do data record ...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
While the vast majority of clustering algorithms are partitional, many real world datasets have inherently overlapping clusters. Several approaches to finding overlapping clusters...
We propose a simple yet potentially very effective way of visualizing trained support vector machines. Nomograms are an established model visualization technique that can graphica...
Aleks Jakulin, Martin Mozina, Janez Demsar, Ivan B...