In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
The power and popularity of kernel methods stem in part from their ability to handle diverse forms of structured inputs, including vectors, graphs and strings. Recently, several m...
Darrin P. Lewis, Tony Jebara, William Stafford Nob...
We propose a new family of latent variable models called max-margin min-entropy (M3E) models, which define a distribution over the output and the hidden variables conditioned on ...
Kevin Miller, M. Pawan Kumar, Benjamin Packer, Dan...
Information extraction (IE) systems are costly to build because they require development texts, parsing tools, and specialized dictionaries for each application domain and each na...
Spam filtering is the task of labeling emails as spam or ham in an online setting. The online setting requires that the spam filter have a strong timely generalization a...