The first step for computing bottom-up saliency is to generate
image pyramids for each feature to enable computations on
different scales. Three features are considered:
intensity, orientation, and color. For the feature intensity, we
convert the input image into gray-scale and generate a Gaussian
pyramid with 5 scales $s_0$ to $s_4$ by successively low-pass
filtering and subsampling the input image, i.e., scale $s_{i+1}$
has half the width and height of scale $s_i$.
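As a minimal sketch of this step, assuming NumPy and OpenCV are available: `cv2.pyrDown` performs exactly the low-pass-filter-and-subsample operation described above. The function name and the five-scale default follow the text; everything else is an illustrative choice.

```python
import cv2
import numpy as np

def gaussian_pyramid(gray, num_scales=5):
    """Scales s0..s4: each scale is the previous one low-pass
    filtered and subsampled to half its width and height."""
    scales = [np.float32(gray)]
    for _ in range(num_scales - 1):
        scales.append(cv2.pyrDown(scales[-1]))  # Gaussian blur + 2x subsample
    return scales
```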
The intensity maps are created by center-surround mechanisms,
which compute the intensity differences between image regions and
their surroundings.
We compute two kinds of maps, the on-center maps $I_{\mathit{on}}$
for bright regions on dark background, and the off-center maps
$I_{\mathit{off}}$: Each pixel in these maps is computed
by the difference between a center $c$
and a surround $\sigma$
($I_{\mathit{on}} = c - \sigma$) or vice
versa ($I_{\mathit{off}} = \sigma - c$).
Here, $c$ is a pixel in one of the scales $s_2$
to $s_4$, and $\sigma$ is the average of
the surrounding pixels for two different radii $\sigma_1$ and $\sigma_2$.
This yields 12 intensity scale maps $I_{i,s,\sigma}$
with $i \in \{\mathit{on}, \mathit{off}\}$,
$s \in \{s_2, \ldots, s_4\}$, and $\sigma \in \{\sigma_1, \sigma_2\}$.
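A sketch of the center-surround step, continuing the pyramid sketch above: the surround average is implemented here as a box filter, and the concrete radii 3 and 7 as well as the clamping of negative differences to zero are assumptions made for illustration, since the text only specifies "two different radii" and a difference.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def center_surround(scale_img, radius):
    """One on/off map pair for a single scale and surround radius.
    The surround is approximated by a box-filter average over a
    (2*radius+1)^2 neighborhood (assumption; the text only says
    'average of the surrounding pixels')."""
    surround = uniform_filter(scale_img, size=2 * radius + 1)
    on = np.maximum(scale_img - surround, 0.0)   # c - sigma: bright on dark
    off = np.maximum(surround - scale_img, 0.0)  # sigma - c: dark on bright
    return on, off

def intensity_scale_maps(pyr, scales=(2, 3, 4), radii=(3, 7)):
    """12 intensity scale maps: {on, off} x scales s2..s4 x two radii.
    The radii values 3 and 7 are illustrative assumptions."""
    maps = {"on": [], "off": []}
    for s in scales:
        for r in radii:
            on, off = center_surround(pyr[s], r)
            maps["on"].append((s, on))
            maps["off"].append((s, off))
    return maps
```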
The maps for each $i$ are summed up by inter-scale addition
$\oplus$, i.e., all maps are resized to scale 2 and then added up
pixel by pixel, yielding the intensity feature maps
$I_i = \bigoplus_{s,\sigma} I_{i,s,\sigma}$.
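Inter-scale addition can be sketched as follows; bilinear interpolation for the resizing is an assumption, as the text only specifies resizing to scale 2 and pixel-wise addition. The commented usage lines show a hypothetical wiring with the helpers sketched above.

```python
import cv2
import numpy as np

def inter_scale_addition(maps_with_scales, target_shape):
    """Resize every (scale, map) pair to the resolution of scale 2
    and sum pixel by pixel."""
    h, w = target_shape
    total = np.zeros((h, w), np.float32)
    for _, m in maps_with_scales:
        total += cv2.resize(m, (w, h), interpolation=cv2.INTER_LINEAR)
    return total

# Hypothetical usage with the earlier sketches:
# pyr   = gaussian_pyramid(gray)
# maps  = intensity_scale_maps(pyr)
# I_on  = inter_scale_addition(maps["on"],  pyr[2].shape)
# I_off = inter_scale_addition(maps["off"], pyr[2].shape)
```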
To obtain the orientation maps, four oriented Gabor
pyramids are created, detecting bar-like features of the orientations
$\theta \in \{0^{\circ}, 45^{\circ}, 90^{\circ}, 135^{\circ}\}$.
The maps 2 to 4 of each pyramid are summed up by inter-scale
addition, yielding 4 orientation feature maps $O_\theta$.
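The orientation channel might be sketched as below, reusing `gaussian_pyramid` and `inter_scale_addition` from above and OpenCV's Gabor kernel; the kernel size and the sigma/lambda/gamma parameters are illustrative choices, not values from the text, and taking the absolute filter response is likewise an assumption.

```python
import cv2
import numpy as np

def orientation_feature_maps(pyr):
    """Four Gabor pyramids for bar-like features at 0, 45, 90 and
    135 degrees; maps 2..4 of each are combined by inter-scale
    addition into one feature map per orientation."""
    feats = {}
    for theta in (0, 45, 90, 135):
        kernel = cv2.getGaborKernel(ksize=(9, 9), sigma=2.0,
                                    theta=np.deg2rad(theta),
                                    lambd=8.0, gamma=0.5)  # illustrative parameters
        maps = [(s, np.abs(cv2.filter2D(pyr[s], cv2.CV_32F, kernel)))
                for s in (2, 3, 4)]
        feats[theta] = inter_scale_addition(maps, pyr[2].shape)
    return feats
```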
To compute the color feature maps, the color image is first
converted into the uniform CIE LAB color space [2], which
represents colors similarly to human perception. The three
parameters in the model represent the luminance of the color (L),
its position between red and green (A), and its position between
yellow and blue (B). From the LAB image, a color image pyramid
$P_{LAB}$ is generated, from which four color
pyramids $P_R$, $P_G$, $P_B$, and $P_Y$ are computed for the
colors red, green, blue, and yellow. The maps of these pyramids
show to which degree a color is represented in an image, i.e.,
the maps in $P_R$ show the brightest values at red regions and
the darkest values at green regions. Luminance is already
considered in the intensity maps, so we ignore this channel
here. The pixel value $P_R(x, y)$ in map $s$ of pyramid $P_R$
is obtained by the distance between the corresponding pixel
$p_{LAB}$ and the prototype for red $r = (r_a, r_b)$. Since
$p_{LAB}$ is of the form $(l, a, b)$, this yields:
$$P_R(x, y) = \sqrt{(a - r_a)^2 + (b - r_b)^2}.$$
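A sketch of this computation, reusing `gaussian_pyramid` from above: the prototype coordinates in the a-b plane (e.g., red at maximum a), the inversion of the distance so that prototype-colored regions appear brightest, and computing the distance at full resolution before building the pyramid are all assumptions made for this sketch; the text specifies only the distance to the prototype.

```python
import cv2
import numpy as np

def color_pyramids(bgr, num_scales=5):
    """Pyramids P_R, P_G, P_B, P_Y from the a-b plane of LAB.
    8-bit OpenCV LAB stores a and b shifted into [0, 255], with
    the neutral axis at 128."""
    lab = np.float32(cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB))
    a, b = lab[..., 1], lab[..., 2]  # luminance channel L is ignored
    prototypes = {"R": (255, 128), "G": (0, 128),   # red/green: a axis
                  "B": (128, 0),   "Y": (128, 255)} # blue/yellow: b axis
    pyramids = {}
    for name, (pa, pb) in prototypes.items():
        dist = np.sqrt((a - pa) ** 2 + (b - pb) ** 2)
        # Invert so regions matching the prototype are brightest
        # (assumption: some monotone inversion of the distance).
        pyramids[name] = gaussian_pyramid(dist.max() - dist, num_scales)
    return pyramids
```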
On these pyramids, the color contrast is computed by on-center-off-surround
differences, yielding color scale maps $C_{\gamma,s,\sigma}$
with $\gamma \in \{\mathit{red}, \mathit{green}, \mathit{blue}, \mathit{yellow}\}$,
$s \in \{s_2, \ldots, s_4\}$, and $\sigma \in \{\sigma_1, \sigma_2\}$.
The maps of each color are inter-scale added into 4 color
feature maps $C_\gamma = \bigoplus_{s,\sigma} C_{\gamma,s,\sigma}$.
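Putting the color channel together with the earlier helpers: using only the on-center response (color present in the center, weaker in the surround) matches the on-center-off-surround wording, and the scales and radii mirror the assumptions made for the intensity channel above.

```python
def color_feature_maps(bgr):
    """Four color feature maps C_gamma: on-center-off-surround
    contrast on each color pyramid, then inter-scale addition."""
    feats = {}
    for name, pyr in color_pyramids(bgr).items():
        maps = [(s, center_surround(pyr[s], r)[0])  # on-center response only
                for s in (2, 3, 4) for r in (3, 7)]
        feats[name] = inter_scale_addition(maps, pyr[2].shape)
    return feats
```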