The first step in computing bottom-up saliency is to generate image pyramids for each feature to enable computations on different scales. Three features are considered: intensity, orientation, and color. For the intensity feature, we convert the input image to gray-scale and generate a Gaussian pyramid with 5 scales $s_0$ to $s_4$ by successively low-pass filtering and subsampling the input image, i.e., scale $s_{i+1}$ has half the width and height of scale $s_i$.
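The following sketch illustrates this pyramid construction, assuming OpenCV as the image library; `cv2.pyrDown` combines the Gaussian low-pass filter and the subsampling in one call. The function name and the library choice are illustrative, not part of the original system.

```python
import cv2

def gaussian_pyramid(image_bgr, num_scales=5):
    """Scales s0..s4: each level has half the width and height of the last."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype("float32")
    pyramid = [gray]
    for _ in range(num_scales - 1):
        pyramid.append(cv2.pyrDown(pyramid[-1]))  # low-pass filter + subsample
    return pyramid
```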
The intensity maps are created by center-surround mechanisms, which compute the intensity differences between image regions and their surroundings. We compute two kinds of maps: the on-center maps $I_{\text{on}}$ for bright regions on a dark background, and the off-center maps $I_{\text{off}}$ for dark regions on a bright background. Each pixel in these maps is computed as the difference between a center $c$ and the average of its surround ($I_{\text{on}}$), or vice versa ($I_{\text{off}}$). Here, $c$ is a pixel in one of the scales $s_2$ to $s_4$, and the surround average is computed for two different radii $\sigma$. This yields 12 intensity scale maps $I_{i,s,\sigma}$ with $i \in \{\text{on}, \text{off}\}$, $s \in \{s_2, s_3, s_4\}$, and $\sigma \in \{\sigma_1, \sigma_2\}$.
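A minimal sketch of this step, again assuming OpenCV: the surround average is approximated with a square box filter, the differences are rectified at zero so each polarity responds only to its own contrast, and the concrete radii (3 and 7 here) are placeholders, since the text only states that two different radii are used.

```python
import cv2
import numpy as np

def center_surround(scale_map, radius):
    """On- and off-center contrast for one scale and one surround radius."""
    k = 2 * radius + 1
    surround = cv2.blur(scale_map, (k, k))     # box average as surround estimate
    on = np.maximum(scale_map - surround, 0)   # bright on dark: c - surround
    off = np.maximum(surround - scale_map, 0)  # dark on bright: surround - c
    return on, off

def intensity_scale_maps(pyramid, radii=(3, 7)):  # radii are assumed values
    """2 polarities x 3 scales (s2..s4) x 2 radii = 12 intensity scale maps."""
    maps = {"on": [], "off": []}
    for s in pyramid[2:5]:
        for r in radii:
            on, off = center_surround(s, r)
            maps["on"].append(on)
            maps["off"].append(off)
    return maps
```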
The maps for each $i$ are summed up by inter-scale addition $\oplus$, i.e., all maps are first resized to scale $s_2$ and then added up pixel by pixel, yielding the intensity feature maps $I_i = \bigoplus_{s,\sigma} I_{i,s,\sigma}$.
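Inter-scale addition can be sketched as follows: every map is resized to the resolution of scale $s_2$ and the results are summed pixel by pixel (bilinear resizing is an assumption; the text does not specify the interpolation).

```python
import cv2
import numpy as np

def interscale_add(maps, target_shape):
    """Resize all maps to the shape of scale s2 and sum them pixelwise."""
    h, w = target_shape
    total = np.zeros((h, w), dtype="float32")
    for m in maps:
        total += cv2.resize(m, (w, h), interpolation=cv2.INTER_LINEAR)
    return total

# e.g. I_on = interscale_add(maps["on"], pyramid[2].shape), likewise for "off"
```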
To obtain the orientation maps, four oriented Gabor pyramids are created, detecting bar-like features of the orientations $\theta \in \{0°, 45°, 90°, 135°\}$. The maps on scales $s_2$ to $s_4$ of each pyramid are summed up by inter-scale addition, yielding 4 orientation feature maps $O_\theta$.
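A sketch of the orientation channel, reusing the `interscale_add` helper from above and assuming OpenCV's Gabor kernels; the kernel size and filter parameters (sigma, wavelength, aspect ratio) are illustrative guesses, as the text does not specify them.

```python
import cv2
import numpy as np

def orientation_feature_maps(gray_pyramid):
    """One Gabor pyramid per orientation, inter-scale added over s2..s4."""
    features = []
    for theta in np.deg2rad([0, 45, 90, 135]):
        # ksize=(9, 9), sigma=2.0, lambd=8.0, gamma=0.5, psi=0 are assumed values
        kernel = cv2.getGaborKernel((9, 9), 2.0, theta, 8.0, 0.5, 0)
        # magnitude of the filter response: bar-like features of either polarity
        responses = [np.abs(cv2.filter2D(s, cv2.CV_32F, kernel))
                     for s in gray_pyramid[2:5]]
        features.append(interscale_add(responses, gray_pyramid[2].shape))
    return features
```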
To compute the color feature maps, the color image is first converted into the perceptually uniform CIE LAB color space [2], which represents colors similarly to human perception. Its three parameters represent the luminance of the color (L), its position between red and green (A), and its position between yellow and blue (B). From the LAB image, a color image pyramid $P_{LAB}$ is generated, from which four color pyramids $P_R$, $P_G$, $P_B$, and $P_Y$ are computed for the colors red, green, blue, and yellow. The maps of these pyramids show to which degree a color is represented in the image, i.e., the maps in $P_R$ show the brightest values at red regions and the darkest values at green regions. Luminance is already considered in the intensity maps, so we ignore this channel here. The pixel value $P_{\gamma,s}(x,y)$ in map $s$ of pyramid $P_\gamma$, $\gamma \in \{R, G, B, Y\}$, is obtained from the distance between the corresponding pixel $p_{LAB,s}(x,y)$ and the color prototype, e.g., for red the prototype $p_R = (a_R, b_R)$. Since $p_{LAB,s}(x,y)$ is of the form $(a, b)$ once the luminance channel is dropped, this yields:

$$P_{R,s}(x,y) = \lVert p_{LAB,s}(x,y) - p_R \rVert = \sqrt{(a - a_R)^2 + (b - b_R)^2},$$

and analogously for green, blue, and yellow.
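The color pyramid construction might look as follows, assuming OpenCV's 8-bit LAB conversion (where a and b are stored with an offset of 128) and illustrative prototype coordinates; the actual prototype values used by the system are not given in the text. The distance map is inverted per level so that regions matching the prototype appear brightest, as described above.

```python
import cv2
import numpy as np

# Illustrative (a, b) prototypes in OpenCV's 8-bit LAB encoding (offset 128);
# the original system's prototype coordinates are not specified in the text.
PROTOTYPES = {"R": (255.0, 128.0), "G": (0.0, 128.0),
              "B": (128.0, 0.0), "Y": (128.0, 255.0)}

def color_pyramids(image_bgr, num_scales=5):
    """P_R, P_G, P_B, P_Y: per-scale similarity to each color prototype."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB).astype("float32")
    lab_pyr = [lab]
    for _ in range(num_scales - 1):
        lab_pyr.append(cv2.pyrDown(lab_pyr[-1]))
    pyramids = {}
    for gamma, (a0, b0) in PROTOTYPES.items():
        maps = []
        for lvl in lab_pyr:  # luminance channel lvl[..., 0] is ignored
            d = np.sqrt((lvl[..., 1] - a0) ** 2 + (lvl[..., 2] - b0) ** 2)
            maps.append(d.max() - d)  # invert: prototype color -> brightest
        pyramids[gamma] = maps
    return pyramids
```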
On these pyramids, the color contrast is computed by on-center-off-surround differences, yielding 24 color scale maps $C_{\gamma,s,\sigma}$ with $\gamma \in \{R, G, B, Y\}$, $s \in \{s_2, s_3, s_4\}$, and $\sigma \in \{\sigma_1, \sigma_2\}$. The maps of each color are inter-scale added into 4 color feature maps $C_\gamma = \bigoplus_{s,\sigma} C_{\gamma,s,\sigma}$.
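Combining the pieces above, the color channel could reuse the `center_surround` and `interscale_add` helpers sketched earlier (the radii are again placeholder values); only the on-center polarity is needed here, since each color pyramid already encodes its own opponent direction.

```python
def color_feature_maps(pyramids, radii=(3, 7)):  # radii are assumed values
    """4 colors x 3 scales (s2..s4) x 2 radii = 24 color scale maps,
    inter-scale added into one feature map per color."""
    features = {}
    for gamma, pyr in pyramids.items():
        scale_maps = []
        for s in pyr[2:5]:
            for r in radii:
                on, _ = center_surround(s, r)  # on-center-off-surround only
                scale_maps.append(on)
        features[gamma] = interscale_add(scale_maps, pyr[2].shape)
    return features
```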