Recognizing the alternative ways people use to reference an entity, is important for many Web applications that query structured data. In such applications, there is often a mismat...
In this paper, we propose GAD (General Activity Detection) for fast clustering on large scale data. Within this framework we design a set of algorithms for different scenarios: (...
Jiawei Han, Liangliang Cao, Sangkyum Kim, Xin Jin,...
In this paper, an editing algorithm based on the projection of the examples in each dimension is presented. The algorithm, that we have called EOP (Editing by Ordered Projection) h...
The Social Informatics Data Grid (SIDGrid) is a new cyberinfrastructure designed to transform how social and behavioral scientists collect and annotate data, collaborate and share...
Theproblemof efficiently and accurately locating patterns of interest in massivetimeseries data sets is an important and non-trivial problemin a wide variety of applications, incl...