On script-generated web sites, many documents share a common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Users prefer navigating subjects through organized topics across an abundance of resources to scanning lists of pages retrieved from search engines. We propose a framework to cluster frequent items...
This paper describes the architecture of a Bulgarian–Bulgarian question answering system — BulQA. The system relies on a partially parsed corpus for answer extraction. The que...
Topical blog post retrieval is the task of ranking blog posts with respect to their relevance to a given topic. To improve topical blog post retrieval, we incorporate textual cred...
In document retrieval tasks, random projection (RP) is a useful technique for dimensionality reduction. It can be computed very quickly, and recalculation is not necessary for any chang...