We have used a general-purpose data mining tool to determine whether we can find any ‘golden nuggets’ in the web access logs of a large academic web site. Our goal was to use...
Leximancer is a software system for performing conceptual analysis of text data in a largely language-independent manner. The system is modelled on Content Analysis and provides u...
A tanglegram is a pair of trees whose leaf sets are in one-to-one correspondence; matching leaves are connected by inter-tree edges. In applications such as phylogenetics or hierar...
Programming assignments are easy to plagiarize in such a way as to foil casual reading by graders. Graders can resort to automatic plagiarism detection systems, which can generate...
We present a method for picture detection in document page images, which can come from scanned or camera images, or rendered from electronic file formats. Our method uses OCR to s...