In spite of the popularity of probabilistic mixture models for latent structure discovery from data, mixture models do not have a natural mechanism for handling sparsity, where ea...
A common approach in machine learning is to use a large amount of labeled data to train a model. Usually this model can then only be used to classify data in the same feature spac...
XML is fast becoming the standard format to store, exchange and publish over the web, and is getting embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...
XML has emerged as the primary standard of data representation and data exchange [13]. Although many software tools exist to assist the XML implementation process, data must be ma...
Electronic newsgroups are one of the primary means for the dissemination, exchange and sharing of information. We argue that the current newsgroup model is unsatisfactory, especial...