[One Example] [More Examples] [Background] [Summary] [Applications] [References]

One Example:       

Parameters Used

  Image Size              300*200
  Sketchable Pixels       18,185 (~25%)
  Primitive Number        230
  Primitive Width         7
  Primitive Parameters    2,350 (~3.5%)
  MRF Parameters          5*7*13 = 455
  Total Parameters        2,805 (~4.7%)
(a) input image    (b) sketching pursuit process    (c) sketches
(f) synthesized image    (e) synthesized texture    (d) sketch image

Figure 1. One example of our primal sketch model. 

For the input image in Figure 1(a), we run a greedy algorithm called the "sketching pursuit process" (Figure 1(b)) to obtain the sketches, which are represented by a graph as shown in Figure 1(c). Figure 1(d) shows the sketch image, which is modeled by a generative model with the dictionary of visual primitives shown in Figure 2. Figure 1(e) shows the textures synthesized with the Julesz ensemble model. The sketch image provides the boundary condition for the textures. The final synthesized image, shown in Figure 1(f), combines the sketch image and the synthesized textures.
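The last step, combining the sketch image with the synthesized textures, can be pictured with a toy example. The sketch below assumes a per-pixel binary mask marking the sketchable pixels and simply takes each pixel from one source or the other; the function name `compose`, the mask, and the tiny arrays are hypothetical. It does not reproduce the texture synthesis itself, which in the model is conditioned on the sketch image as a boundary condition.

```python
def compose(sketch, texture, sketchable):
    """Per pixel: take the sketch value where the mask is set, else the texture value."""
    return [
        [s if m else t for s, t, m in zip(s_row, t_row, m_row)]
        for s_row, t_row, m_row in zip(sketch, texture, sketchable)
    ]

# Toy 2x3 "images" (assumed values, for illustration only).
sketch = [[10, 20, 30],
          [40, 50, 60]]
texture = [[1, 2, 3],
           [4, 5, 6]]
sketchable = [[True, False, False],
              [True, True, False]]

final = compose(sketch, texture, sketchable)
print(final)
```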

Figure 2. Visual primitives learned from natural images.


More Examples





Background

        In his monumental book (Vision, 1982), Marr proposed the primal sketch as an intermediate representation in his representational framework for deriving shape information from images. This representation is the stage between the original image and the 2.5-D sketch. The primal sketch was meant to be a first-level internal representation of generic images in terms of image primitives such as bars, edges, and terminators. However, despite many inspiring observations, Marr provided neither an explicit mathematical model nor a rigorous definition of the dictionary of visual primitives.

Figure 3. Marr's Representational Framework. 


Summary of our Primal Sketch Model

        We [1] propose a mathematical theory of primal sketch and define sketchability. The theory consists of four components:

  1. The center of our theory is a primal sketch model for natural images, which integrates the MRF and wavelet theories;
  2. A sketching pursuit process, which combines the matching pursuit procedure (Mallat and Zhang, 1993) for the sketchable part by adding one base at a time, and the filter pursuit procedure (Zhu, Wu, and Mumford 1997) for the non-sketchable part by adding one filter at a time;
  3. A definition of sketchability;
  4. Learning a dictionary of primitives (or textons) for image sketching.
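
Component 2 builds on matching pursuit, which adds one base at a time. As a rough illustration of that greedy step only, here is a minimal sketch of generic matching pursuit on a toy 1-D signal. The dictionary (four orthonormal edge-like and bar-like patterns), the signal, and all function names are assumptions made for this example. It is not the sketching pursuit process of [1], which operates on image primitives and also handles the non-sketchable part by filter pursuit.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy matching pursuit over unit-norm atoms, one atom per iteration."""
    residual = list(signal)
    selected = []  # (atom index, coefficient) pairs, in order of selection
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        coeffs = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(coeffs[k]))
        c = coeffs[best]
        selected.append((best, c))
        # Subtract that atom's contribution from the residual.
        residual = [r - c * a for r, a in zip(residual, dictionary[best])]
    return selected, residual

# Hypothetical toy dictionary: step edge, bar, alternating pattern, flat.
dictionary = [normalize(v) for v in (
    [-1, -1, 1, 1],
    [-1, 1, 1, -1],
    [-1, 1, -1, 1],
    [1, 1, 1, 1],
)]

# Signal built as 3 * edge + 1 * bar.
signal = [3 * e + b for e, b in zip(dictionary[0], dictionary[1])]

selected, residual = matching_pursuit(signal, dictionary, n_atoms=2)
print(selected)                 # atoms chosen, strongest first
print(dot(residual, residual))  # energy left unexplained
```

Because this toy dictionary is orthonormal, two iterations recover the signal exactly. With an overcomplete dictionary the residual generally shrinks more gradually, and a stopping rule on the residual energy would replace the fixed `n_atoms`.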

        In plain words, we summarize our model as:

  1. Automatically separate the image into "sketchable" (structures) and "non-sketchable" (textures) parts by a "sketching pursuit process" which computes "sketchability";
  2. Model the structures by a generative model (Wavelet-like model) with a dictionary of "visual primitives" learned from natural images;
  3. Model the textures by a descriptive model (a Markov random field model): the Julesz ensemble model;
  4. Model the spatial relationships of the structures by a descriptive model: Gestalt fields.



Applications

  1. Image and video compression: as shown in the first example, we achieve a compression ratio of roughly 1:20 relative to the original image (2,805 parameters for a 300*200 image).

  2. Edge detection: the model leads to a new edge detection method that incorporates structural information.
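
The roughly 1:20 figure in item 1 can be checked against the "Parameters Used" table of the first example. The short calculation below assumes that the ratio compares the number of model parameters with the number of pixels, one stored value each. This is our reading of the table, not a statement about the actual bit rate or any entropy coding.

```python
# Values copied from the "Parameters Used" table.
width, height = 300, 200
primitive_params = 2350
mrf_params = 5 * 7 * 13

pixels = width * height
total_params = primitive_params + mrf_params
fraction = total_params / pixels   # parameters per pixel
ratio = pixels / total_params      # pixels per parameter

print(f"total parameters: {total_params}")
print(f"fraction of pixel count: {fraction:.1%}")
print(f"compression ratio: about 1:{ratio:.1f}")
```

Under this assumption the total of 2,805 parameters is about 4.7% of the 60,000 pixels, which corresponds to a ratio slightly better than 1:20.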



References

  1. Cheng-en Guo, Song-chun Zhu, and Yingnian Wu,
    "A Mathematical Theory of Primal Sketch and Sketchability",
    Proc. of International Conference on Computer Vision, Nice, France, 2003.

