The integration of heterogeneous legacy databases requires understanding of database structure and content. We previously developed a theoretical and software infrastructure to sup...
Mark S. Schmalz, Joachim Hammer, Mingxi Wu, Oguzha...
The problem of simultaneous feature extraction and selection for classifier design is considered. A new framework is proposed, based on boosting algorithms that can either 1) s...
Location is the most essential presence information for mobile users. In this paper, we present an improved time-based clustering technique for extracting significant lo...
We apply the hypothesis of "One Sense Per Discourse" (Yarowsky, 1995) to information extraction (IE), and extend the scope of "discourse" from one single docum...
In this paper we analyze our recent research on the use of document analysis techniques for metadata extraction from PDF papers. We describe a package that is designed to extract ...