Natural images contain a broad spectrum of stochastic visual patterns, such as texture, point, line, curve, and graph processes, as well as geometry and shape. Developing mathematical models of these patterns is crucial for building general-purpose, robust vision systems. In general, we make two observations:

1. The spectrum of patterns should be continuous. Thus models ranging from texture up to geometry should form a nested family and be mutually compatible. Visual patterns differ from each other in two aspects:
   a) What are the basic elements (particles)? Candidates range from pixels, wavelet bases, textons, lines, curves, and junctions up to geometric descriptions such as spline bases and meaningful parts. Evidently, the definition of elements such as "textons" and "meaningful parts" must be governed by proper mathematical models of the whole pattern.
   b) What are the potential functions characterizing the interactions among elements? In statistical mechanics, a stochastic pattern is modeled by a Gibbs distribution with potentials defined on a neighborhood system. Other intuitive physical concepts, such as "forces" and "diffusion", can then be derived. Of course, a natural question arises: how do we define a good neighborhood system?
2. Models of visual patterns should be learned from observations.
   a) We must use generative models which naturally reflect the hierarchical organization of particles.
   b) Descriptive models are the precursors of generative models. Descriptive models can be learned under a minimax entropy learning theory, and generative models are learned through an EM-type algorithm that integrates minimax entropy learning.
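To make the Gibbs formulation in observation 1 concrete, here is a minimal sketch of a distribution of the form p(I) = (1/Z) exp(-U(I)), where the energy U(I) sums a potential over neighboring pairs. The 4-nearest-neighbor system, the Potts-style pair potential, and the names `neighbors`, `pair_potential`, `energy`, and `unnormalized_gibbs` are all assumptions chosen for illustration; they are not the models developed in this work, and the partition function Z is left uncomputed.

```python
import math


def neighbors(i, j, height, width):
    """Return the 4-connected neighbors of pixel (i, j) inside the grid.

    The 4-neighborhood is an illustrative assumption, not a recommended choice.
    """
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for a, b in candidates if 0 <= a < height and 0 <= b < width]


def pair_potential(x, y, beta=1.0):
    """Hypothetical Potts-style potential: cost beta when two neighbors differ."""
    return 0.0 if x == y else beta


def energy(image, beta=1.0):
    """Sum the pair potential over all neighboring pairs, each counted once."""
    height, width = len(image), len(image[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            for a, b in neighbors(i, j, height, width):
                if (a, b) > (i, j):  # skip the reversed copy of each pair
                    total += pair_potential(image[i][j], image[a][b], beta)
    return total


def unnormalized_gibbs(image, beta=1.0):
    """Return exp(-U(I)); dividing by the partition function Z would give p(I)."""
    return math.exp(-energy(image, beta))


if __name__ == "__main__":
    smooth = [[0, 0], [0, 0]]
    checker = [[0, 1], [1, 0]]
    print(energy(smooth), energy(checker))
    print(unnormalized_gibbs(smooth), unnormalized_gibbs(checker))
```

Under this toy potential a smooth patch has lower energy, and therefore higher probability, than a checkerboard patch. Changing `neighbors` or `pair_potential` changes which patterns the model favors, which is why the choice of neighborhood system matters.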
We are making the following attempts at studying these problems: