We introduce the corpus of United States Congressional bills from 1947 to 1998 for use by language research communities. The U.S. Policy Agenda Legislation Corpus Volume 1 (USPALC...
Background: Information obtained from diverse data sources can be combined in a principled manner using various machine learning methods to increase the reliability and range of k...
Bolan Linghu, Evan S. Snitkin, Dustin T. Holloway,...
Machine learning techniques for data extraction from semistructured sources exhibit different precision and recall characteristics. However, to date the formal relationship between...
Guizhen Yang, Saikat Mukherjee, I. V. Ramakrishnan
It is often expensive to acquire data in real-world data mining applications. Most previous data mining and machine learning research, however, assumes that a fixed set of trainin...
Anomaly detection for network intrusion detection is usually considered an unsupervised task. Prominent techniques, such as one-class support vector machines, learn a hypersphere ...