Where Information Retrieval (IR) and Text Categorization deliver a set of (ranked) documents according to a query, users of large document collections would rather like to receiv...
In many criminal cases, forensically collected data contain valuable information about a suspect’s social networks. An investigator often has to manually extract information fro...
Rabeah Al-Zaidy, Benjamin C. M. Fung, Amr M. Youss...
A typical web search engine consists of three principal parts: crawling engine, indexing engine, and searching engine. The present work aims to optimize the performance of the cra...
Konstantin Avrachenkov, Alexander N. Dudin, Valent...
Automated language identification of written text is a well-established research domain that has received considerable attention in the past. By now, efficient and effecti...
Recently, high-resolution digital cameras have made the digitization process more flexible and convenient than traditional scanning technology. Therefore, document image analysis ...