Techniques for automatically identifying acronym patterns are important for improving the many applications that rely on search. This task is challenging, d...
We perform a statistical analysis and describe the asymptotic behavior of the frequency and size distribution of δ-occurrent, minimal δ-occurrent, and maximal δ-occurrent item...
Whereas Information Retrieval (IR) and Text Categorization deliver a set of (ranked) documents in response to a query, users of large document collections would rather receiv...
Recently, the efficiency of the outlier detection algorithm ORCA was improved by RCS (Randomization with faster Cutoff update and Space utilization after pruning), which changes the ...
This paper introduces new methods for measuring the specificity of terms using inside and outside information. The specificity of a term is the quantity of domain-specific information contain...