We present an active learning approach to choose image annotation requests among both object category labels and the objects’ attribute labels. The goal is to solicit those labe...
Many applications involve multiple modalities, such as text and images, that describe the problem of interest. In order to leverage the information present in all the modalities, on...
In this work we present a new crowd analysis algorithm powered by behavior priors that are learned on a large database of crowd videos gathered from the Internet. The algorithm wo...
Mikel Rodriguez, Josef Sivic, Ivan Laptev, Jean-Yv...
Complex human activities occurring in videos can be defined in terms of temporal configurations of primitive actions. Prior work typically hand-picks the primitives, their total...
In this work, we propose to use attributes and parts for recognizing human actions in still images. We define action attributes as the verbs that describe the properties of human...
Bangpeng Yao, Xiaoye Jiang, Aditya Khosla, Andy La...