In this paper, we report on our experience creating an automated, human-assisted process to extract metadata from documents in a large (>100,000), dynamically growi...
Jianfeng Tang, Kurt Maly, Steven J. Zeil, Mohammad...
We have developed a web-repository crawler that is used for reconstructing websites when backups are unavailable. Our crawler retrieves web resources from the Internet Archive, Go...
Physical design of modern systems-on-chip is extremely challenging. Such digital integrated circuits often contain tens of millions of logic gates, intellectual property blocks, e...
Aaron N. Ng, Igor L. Markov, Rajat Aggarwal, Venky...
Over the last decade, the major firms and cultural institutions that have dominated media and information industries in the U.S. and globally have been challenged by people adopti...
Modern distributed information retrieval techniques require accurate knowledge of collection size. In non-cooperative environments, where detailed collection statistics are not av...