From Proc. CAiSE05, LNCS 3520, pages 460-474. © Springer-Verlag 2005. Semi-structured data sources, such as XML, HTML or CSV files, present special problems when performing data int...
We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...
With the development of inexpensive storage devices, space usage is no longer a bottleneck for computer users. However, the increasingly large amount of personal information poses ...
Molecular-biological annotation data is continuously being collected, curated and made accessible in numerous public data sources. Integration of this data is a major challenge in ...
The need for information integration is paramount in many biological disciplines, because of the large heterogeneity in both the types of data involved and in the diversity of app...