More and more documents on the World Wide Web are based on templates. On a technical level, this gives those documents very similar source code and DOM tree structures. G...
Background: Single nucleotide polymorphisms (SNPs) and genes that exhibit presence/absence variation have provided informative marker sets for bacterial and viral genotyping. Iden...
Erin P. Price, John Inman-Bamber, Venugopal Thiruv...
Content-based Image Retrieval (CBIR) is a computer vision application that aims at automatically retrieving images based on their visual content. Linear Discriminant Analysis and i...
Automatic classification of web pages is an effective way to deal with the difficulty of retrieving information from the Internet. Although there are many automatic classification...
As the amount of textual information grows explosively in various kinds of business systems, it becomes more and more desirable to analyze both structured data records and unstruc...