In this paper, we present a systematic framework for recognizing realistic actions from videos “in the wild.” Such unconstrained videos are abundant in personal collections as...
This contribution describes an almost parameterless iterative context compilation method, which produces feature layers that are especially suited for mixed bottom-up top-down ass...
We adapted the Bubbles procedure [Vis. Res. 41 (2001) 2261] to examine the effective use of information during the first 282 ms of face identification. Ten participants each viewe...
We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists o...
We propose a framework that learns functional object categories from spatio-temporal data sets such as those abstracted from video. The data is represented as one activity graph t...
Muralikrishna Sridhar, Anthony G. Cohn, David C. H...