The run-time performance of VLIW (very long instruction word) microprocessors depends heavily on the effectiveness of their associated optimizing compilers. Typical VLIW compiler pha...
Here we describe the subword approach we used in the 2006 ImageCLEF Medical Image Retrieval task. It is based on the assumption that neither fully inflected nor automatically stem...
In a corpus of jokes, a human might judge two documents to be the "same joke" even if characters, locations, and other details are varied. A given joke could be retold w...
Let $C$ be a code of length $n$ over an alphabet of size $q$. A word $d$ is a descendant of a pair of codewords $x, y \in C$ if $d_i \in \{x_i, y_i\}$ for $1 \le i \le n$. A code $C$ is an identifiable parent prop...
Most traditional text clustering methods use a "bag of words" (BOW) representation built from frequency statistics in a set of documents. BOW, however, ignores the ...
Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua L...