SIG-12: Tutorial on Stochastic Image Grammars

ORGANIZERS

Prof. Song-Chun Zhu, sczhu@stat.ucla.edu

Prof. Ales Leonardis, ales.leonardis@fri.uni-lj.si

Prof. Sinisa Todorovic, sinisa@eecs.oregonstate.edu

COURSE DESCRIPTION

Stochastic image grammar (SIG) is a general theoretical framework that includes hierarchical representations of objects, events, and their spatiotemporal contexts, as well as associated learning and inference algorithms. Compared with alternative frameworks, SIG has been shown to be competitive, and often superior, in accuracy, learnability, and scalability when answering “What/Who” and “Where/When” queries. SIG has also been used successfully to formalize additional types of reasoning, including causality (“Why”) and synthesis (e.g., prediction and postdiction). Importantly, unlike the alternatives, SIG provides a unified framework for addressing all of these diverse vision problems.

Computer vision has experienced a resurgence of SIGs in recent years, e.g., AND-OR graphs, sum-product networks, Markov logic networks, and deep learning. This momentum is due in part to two successful workshops, SIG-09 and SIG-11, organized in conjunction with CVPR 2009 and ICCV 2011. The workshops aimed to formulate a unified theoretical framework of SIGs and to demonstrate their merit in computer vision.

SIGs, however, have not reached their full potential in vision, for a number of reasons. First, there appears to be a disconnect between current progress and previous work. Researchers who are new to SIGs tend to “re-invent the wheel” at almost all levels of formalism, from introducing new terms for already established concepts to re-deriving old theoretical results. Second, the SIG subcommunity has thus far provided relatively poor research infrastructure in terms of teaching material, open-source code, implementation documentation, datasets, and standardized evaluation methodologies. Such infrastructure is instrumental for ensuring continued progress and a broader involvement of both experts and beginners in SIG-related research.

We view CVPR 2012 as a great opportunity to organize a full-day short course on SIGs. The course aims to address the above issues by:

Providing lectures, slides, and other teaching material on the key theoretical foundations of SIGs,

Presenting live demos of our software for SIG-based object and activity recognition, with a focus on answering “What/Who”, “Where/When” and “Why” queries,

Establishing a solid infrastructure for research on SIGs in terms of a designated website for sharing open-source code, datasets, evaluation methodologies, presentations, and technical reports.

SIG-12: Tutorial on Stochastic Image Grammars

for Object, Scene and Event Understanding