A visual word lexicon can be constructed by clustering primitive visual features, and a visual object can be described by a set of visual words. Such a "bag-of-words" re...
In this paper we present two contributions that improve the accuracy and speed of an image search system based on bag-of-features: a contextual dissimilarity measure (CDM) and an effici...
Each facial event gives rise to complex variation in facial appearance. In this paper, we propose similarity features to describe the facial appearance for video-based facial even...
We propose a fully automatic framework to detect and extract arbitrary human motion volumes from real-world videos collected from YouTube. Our system is composed of two s...
Juan Carlos Niebles, Bohyung Han, Andras Ferencz, ...
Learning typical motion patterns or activities from videos of crowded scenes is an important visual surveillance problem. To detect typical motion patterns in crowded scenarios, w...