There are many motivations for using features rather than pixels
directly. For mobile robots, a critical motivation is that
feature-based systems operate much faster than pixel-based systems
[17]. The features used here have the same structure as
the Haar basis functions, i.e., the step functions introduced by Alfred
Haar to define wavelets [6]. They are also used in
[8,9,10,17].
Fig. 4 (left) shows the eleven basis features, i.e.
edge, line, diagonal and center surround features. The base resolution
of the object detector is
pixels, thus, the set of
possible features in this area is very large (642592 features, see
[9] for calculation details).
A single feature is computed efficiently on input images
using integral images [17], also known as summed area
tables [8,9]. An integral image $I$ is
an intermediate representation for the image and contains the sum of
gray scale pixel values of image $N$ with height $y$ and width $x$,
i.e.,
\[ I(x, y) = \sum_{x'=0}^{x} \sum_{y'=0}^{y} N(x', y'). \]
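The following minimal Python sketch illustrates why an integral image makes feature evaluation cheap: once the table is built, the sum over any rectangle needs only four lookups, so a two-rectangle edge feature costs a handful of additions regardless of its size. The function names (`integral_image`, `rect_sum`, `edge_feature`) and the zero-padded table layout are assumptions made for this illustration, not the implementation used in the cited work.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed area table for a 2D list of gray values.

    Entry ii[y][x] holds the sum of img[r][c] for all r < y and c < x. The
    extra zero row and column (an assumption of this sketch) avoid special
    cases at the image border.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def edge_feature(ii, x, y, w, h):
    """Two-rectangle edge feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 4, 3)      # whole image
inner = rect_sum(ii, 1, 1, 2, 2)      # 6 + 7 + 10 + 11
edge = edge_feature(ii, 0, 0, 4, 3)   # columns 0-1 minus columns 2-3
```

Line, diagonal and center surround features can be assembled the same way from three or more `rect_sum` calls with appropriate weights.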
To detect a feature, a threshold is required. This threshold is determined automatically during a fitting process such that the number of misclassified examples is minimal. Furthermore, the return values of the feature are determined such that the error on the examples is minimized. The examples are given as a set of images labeled as positive or negative samples. This set is also used in the learning phase, which is briefly described next.
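The fitting step can be sketched as a simple threshold search over the feature values of the labeled examples. The Python sketch below is an illustration under stated assumptions, not the procedure of the cited work: labels are taken to be +1 (positive) and -1 (negative), candidate thresholds are midpoints between neighbouring feature values, and the two return values are chosen as the mean label on each side of the threshold, which minimizes the squared error. The name `fit_stump` is hypothetical.

```python
def fit_stump(values, labels):
    """Fit a threshold on scalar feature values with labels in {+1, -1}.

    Returns (threshold, below, above, errors), where `below` and `above` are
    the values returned for feature values < threshold and >= threshold.
    Assumes a non-empty list of examples.
    """
    distinct = sorted(set(values))
    # Candidate thresholds: one below the minimum, then the midpoints
    # between neighbouring distinct feature values.
    candidates = [distinct[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]

    best = None
    for t in candidates:
        # Errors when predicting +1 at or above the threshold, -1 below it.
        err_pos = sum(
            1 for v, y in zip(values, labels) if (1 if v >= t else -1) != y
        )
        # The opposite polarity misclassifies exactly the other examples.
        err = min(err_pos, len(values) - err_pos)
        if best is None or err < best[1]:
            best = (t, err)
    threshold, errors = best

    def mean_label(side):
        return sum(side) / len(side) if side else 0.0

    # Assumption: return values are the least-squares fit on each side.
    below = mean_label([y for v, y in zip(values, labels) if v < threshold])
    above = mean_label([y for v, y in zip(values, labels) if v >= threshold])
    return threshold, below, above, errors


# Feature responses on six hypothetical training windows and their labels.
feature_values = [-12.0, -8.0, -5.0, 3.0, 7.0, 9.0]
sample_labels = [-1, -1, -1, 1, -1, 1]
threshold, below, above, errors = fit_stump(feature_values, sample_labels)
```

Because the return values are signed means, the polarity of the feature does not need to be stored separately: a feature that responds more strongly to negative samples simply receives a negative `above` value.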