As the use of Electronic Medical Records (EMRs) becomes more widespread, so does the need for effective information discovery on them. Recently proposed EMR standards are XML-based...
This paper identifies and explores the problem of seed selection in a web-scale crawler. We argue that seed selection is not a trivial but very important problem. Selecting proper...
Abstract. Dissimilarity measurement plays a crucial role in contentbased image retrieval, where data objects and queries are represented as vectors in high-dimensional content feat...
TOP-SURF is an image descriptor that combines interest points with visual words, resulting in a high performance yet compact descriptor that is designed with a wide range of conte...
This paper reports on and discusses a set of user experiments using the TREC 2003 Web interactive track protocol. The focus is on comparing humans and machine algorithms in terms ...
Mingfang Wu, Gheorghe Muresan, Alistair McLean, Mu...