We describe the use of machine learning and data mining to detect and classify malicious executables as they appear in the wild. We gathered 1,971 benign and 1,651 malicious execu...
The traditional mention-pair model for coreference resolution cannot capture information beyond mention pairs for both learning and testing. To deal with this problem, we present ...
Xiaofeng Yang, Jian Su, Jun Lang, Chew Lim Tan, Ti...
A Bayesian network is an appropriate tool to deal with the uncertainty that is typical of real-life applications. Bayesian network arcs represent statistical dependence between dif...
In a corpus of jokes, a human might judge two documents to be the "same joke" even if characters, locations, and other details are varied. A given joke could be retold w...
Background: Word sense disambiguation (WSD) algorithms attempt to select the proper sense of ambiguous terms in text. Resources like the UMLS provide a reference thesaurus to be u...