Information obtained by merging data extracted from problem reporting systems – such as Bugzilla – and versioning systems – such as Concurrent Versions System (CVS) – is wi...
Many documents on the Web use weakly structured formats. Because of their weak semantics and the heterogeneity of their formats, the information conveyed by...
Successful information management implies the ability to design accurate representations of the real world of interest, in spite of the diversity of perceptions from the applicati...
Relationships are an integral part of the design of a database. Comparing and integrating relationships from heterogeneous databases requires that the relationships be mapped to ea...
We give a new view on building content clusters from page-pair models. We measure the heuristic importance between every pair of pages by computing the distance of their accessed positi...