Scalable similarity search is the core of many large scale learning or data mining applications. Recently, many research results demonstrate that one promising approach is creatin...
We have developed a threaded parallel data streaming approach using Logistical Networking (LN) to transfer multi-terabyte simulation data from computers at NERSC to our local anal...
Viraj Bhat, Scott Klasky, Scott Atchley, Micah Bec...
Abstract. With the invention of biotechnological high throughput methods like DNA microarrays, biologists are capable of producing huge amounts of data. During the analysis of such...
Abstract. Query optimization is an important functionality of modern database systems and often based on estimating the selectivity of queries before actually executing them. Well-...
The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the r...