Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...
During the past few years, hypermedia systems have emerged as an essential component of many application domains ranging from software engineering to library information systems. ...
Shahram Ghandeharizadeh, Luis Ramos, Zubair Asad, ...
Over the years the amount and range of electronic text stored on the WWW has expanded rapidly, overwhelming both users and tools designed to index and search the information. It is...
Recently the world of the web has become more social and more real-time. Facebook and Twitter are perhaps the exemplars of a new generation of social, real-time web services and w...