This paper proposes a method for capturing the performance of a human or an animal from a multi-view video sequence. Given an articulated template model and silhouettes from a m...
Juergen Gall (BIWI, ETH Zurich), Carsten Stoll (Ma...
We present a system for Monocular Simultaneous Localization and Mapping (Mono-SLAM) relying solely on video input. Our algorithm makes it possible to precisely estimate the cam...
Inferring 3D body pose as well as viewpoint from a single silhouette image is a challenging problem. We present a new generative model to represent shape deformations according to...
We present spatio-temporal feature descriptors that can be inferred from video and used as building blocks in action recognition systems. They capture the evolution of "elementar...
The problem of accurate 6-DoF pose estimation of 3D objects based on their shape has so far been solved only for specific object geometries. Edge-based recognition and trackin...