The past few years have seen explosive growth in scientific and regulatory documents related to the patent system. Relevant information is siloed into many heterogeneous...
Siddharth Taduri, Gloria T. Lau, Kincho H. Law, Ha...
In many text retrieval tasks, it is highly desirable to obtain a "similarity profile" of the document collection for a given query. We propose sampling-based techniques ...