Statistical bilingual word alignment has been well studied in the context of machine translation. This paper adapts the bilingual word alignment algorithm to a monolingual scenario ...
To learn concepts over massive data streams, it is essential to design inference and learning methods that operate in real time with limited memory. Online learning methods such a...
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
In this paper, we show that a bio-inspired classifier’s accuracy can be dramatically improved if it operates on intelligent features. We propose a novel set of intelligent feat...
M. Zubair Shafiq, Syed Ali Khayam, Muddassar Faroo...
Modern trading and cluster applications require microsecond latencies and almost no losses in data centers. This paper introduces an algorithm called FineComb that can estimate ...