Modeling and Inferring Human Intents and Hidden Functional Objects for Trajectory Prediction

Dan Xie1, Tianmin Shu1, Sinisa Todorovic2 and Song-Chun Zhu1

Center for Vision, Cognition, Learning, and Autonomy, UCLA1

School of EECS, Oregon State University2

Overview

Introduction

This paper presents an approach to predicting human trajectories in surveillance videos by reasoning about latent human intentions and the locations of latent functional objects in the scene. The functionality of an object is defined by how it affects people's trajectories: it either attracts people who approach it to satisfy a certain need (e.g., vending machines can quench thirst), or repels people who avoid it (e.g., grass lawns). The low resolution of surveillance videos does not allow for reliable object detection. Most objects of interest can therefore be detected only by their functionality, since they appear as "dark matter" emanating "dark energy" that affects people. Given an initial observation of human trajectories in the video, we address three tasks: (1) localizing latent functional objects in the scene; (2) inferring each person's latent intention to reach a particular functional object; and (3) predicting how the human trajectories unfold in the unobserved future parts of the video. We make two assumptions: that people are familiar with the scene layout (whereas our approach is not), and that they prefer to take the shortest paths toward the intended "dark matter" while avoiding obstacles. This allows us to formulate a new Agent-based Lagrangian mechanics, wherein human behavior is probabilistically modeled as the motion of a particle-agent intending to reach a certain destination in the attraction and repulsion force fields of the functional objects. A data-driven Markov Chain Monte Carlo process is used for inference. Our evaluation on videos of public squares and courtyards demonstrates the effectiveness of our approach in localizing functional objects and predicting people's trajectories, as well as superior performance relative to prior work that does not reason about human intentions and functional objects in the scene.
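The shortest-path assumption can be illustrated with a toy sketch. This is not the paper's Agent-based Lagrangian model or its MCMC inference: it assumes a small, fully known occupancy grid, 4-connected moves, and a given list of candidate goals, and all function names (`distance_field`, `predict_path`, `infer_intent`) are hypothetical.

```python
from collections import deque

# 4-connected moves: down, up, right, left
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def distance_field(grid, goal):
    """BFS step count from every walkable cell (0) to `goal`; None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def predict_path(grid, start, goal):
    """Follow the distance field downhill from `start` to `goal` (one shortest path)."""
    dist = distance_field(grid, goal)
    if dist[start[0]][start[1]] is None:
        return []  # goal cannot be reached from start
    rows, cols = len(grid), len(grid[0])
    path = [start]
    r, c = start
    while (r, c) != goal:
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and dist[nr][nc] == dist[r][c] - 1:
                r, c = nr, nc
                break
        path.append((r, c))
    return path


def infer_intent(grid, observed, goals):
    """Pick the candidate goal the observed track made the most progress toward.

    Assumes every observed cell can reach every candidate goal.
    """
    def progress(goal):
        dist = distance_field(grid, goal)
        (r0, c0), (r1, c1) = observed[0], observed[-1]
        return dist[r0][c0] - dist[r1][c1]

    return max(goals, key=progress)


# Toy scene: 1 marks an obstacle (e.g., a lawn), 0 is walkable ground.
scene = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
goal_a, goal_b = (2, 3), (0, 3)      # two hypothetical functional objects
track = [(0, 0), (1, 0), (2, 0)]     # observed part of a trajectory

intent = infer_intent(scene, track, [goal_a, goal_b])
future = predict_path(scene, track[-1], intent)
```

Here the observed track moves closer to `goal_a` and farther from `goal_b`, so `goal_a` is selected and the remaining path is extrapolated along the bottom row. The paper's setting is harder: the functional objects themselves are latent and must be localized jointly with the intents.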

Paper

Dan Xie, Sinisa Todorovic and Song-Chun Zhu. Inferring "Dark Matter" and "Dark Energy" from Videos. IEEE International Conference on Computer Vision (ICCV), 2013. [pdf][poster]

@inproceedings{XieDarkMatter2013,
  title     = {Inferring "Dark Matter" and "Dark Energy" from Videos},
  author    = {Dan Xie and Sinisa Todorovic and Song-Chun Zhu},
  year      = {2013},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)}
}

Dan Xie, Tianmin Shu, Sinisa Todorovic and Song-Chun Zhu. Modeling and Inferring Human Intents and Hidden Functional Objects for Trajectory Prediction. Under preparation, 2016.

@unpublished{XieDarkMatterJournal2016,
  title     = {Modeling and Inferring Human Intents and Hidden Functional Objects for Trajectory Prediction},
  author    = {Dan Xie and Tianmin Shu and Sinisa Todorovic and Song-Chun Zhu},
  year      = {2016},
  note      = {Under preparation}
}

Demo

Contact

Questions? Please contact Tianmin Shu (tianmin.shu [at] ucla.edu).