Sciweavers

» A Text Understander that Learns
ICML
2006
IEEE
16 years 7 months ago
Pachinko allocation: DAG-structured mixture models of topic correlations
Latent Dirichlet allocation (LDA) and other related topic models are increasingly popular tools for summarization and manifold discovery in discrete data. However, LDA does not ca...
Wei Li, Andrew McCallum
ICML
2006
IEEE
16 years 7 months ago
Topic modeling: beyond bag-of-words
Some models of textual corpora employ text generation methods involving n-gram statistics, while others use latent topic variables inferred using the "bag-of-words" assu...
Hanna M. Wallach
ICML
2004
IEEE
16 years 7 months ago
Leveraging the margin more carefully
Boosting is a popular approach for building accurate classifiers. Despite the initial popular belief, boosting algorithms do exhibit overfitting and are sensitive to label noise. ...
Nir Krause, Yoram Singer
ICML
2000
IEEE
16 years 7 months ago
Maximum Entropy Markov Models for Information Extraction and Segmentation
Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech t...
Andrew McCallum, Dayne Freitag, Fernando C. N. Per...
WWW
2008
ACM
16 years 7 months ago
Automatically refining the Wikipedia infobox ontology
The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia inf...
Fei Wu, Daniel S. Weld