Sciweavers

2677 search results - page 205 / 536
» Extracting Structured Data from Web Pages
CIDR
2009
15 years 7 months ago
The Case for a Structured Approach to Managing Unstructured Data
The challenge of managing unstructured data represents perhaps the largest data management opportunity for our community since managing relational data. And yet we are risking let...
AnHai Doan, Jeffrey F. Naughton, Akanksha Baid, Xi...
WWW
2010
ACM
15 years 6 months ago
Talking about data: sharing richly structured information through blogs and wikis
Several projects have brought rich data semantics to collaborative wikis, but blogging platforms remain primarily limited to text. As blogs comprise a significant portion...
Edward Benson, Adam Marcus, Fabian Howahl, Da...
WSDM
2010
ACM
16 years 1 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
ADVIS
2006
Springer
16 years 15 days ago
Structural and Event Based Multimodal Video Data Modeling
Investments in multimedia technology enable us to store many more reflections of the real world in the digital world as videos. By recording videos about real world entities, we carry...
Hakan Öztarak, Adnan Yazici
EMNLP
2008
15 years 8 months ago
Mining and Modeling Relations between Formal and Informal Chinese Phrases from Web Corpora
We present a novel method for discovering and modeling the relationship between informal Chinese expressions (including colloquialisms and instant-messaging slang) and their forma...
Zhifei Li, David Yarowsky