Recently many large scale computer systems are built in order to meet the high storage and processing demands of compute and data-intensive applications. MapReduce is one of the mo...
This paper presents a series of tools for the extraction of specialized corpora from the web and its subsequent analysis mainly with statistical techniques. It is an integrated sy...
1 Tags are an important information source in Web 2.0. They can be used to describe users’ topic preferences as well as the content of items to make personalized recommendations....
In this paper, we established a newborn screening system under the HL7/Web Services frameworks. We rebuilt the NTUH Newborn Screening Laboratory's original standalone architec...
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...