Three papers today: head segmentation, review paper on ear biometrics, and a cute one on image-based shaving.
"Segmentation of a head into face, ears, neck and hair for knowledge-based analysis-synthesis coding of videophone sequences" by Kampmann: The task is to segment a video stream into face, ears, neck, and hair regions. A motivating application is smart videoconferencing compression, since regions like the face should be transmitted at higher fidelity than other regions.
They assume they start with eyes and mouth center positions as well as chin and cheek contours. In a new image, they find areas with likely skin tone using the eyes and mouth locations, and they use this tone to segment the image between skin and background. Using the chin contour, they break the skin pixels into face, neck, and ears regions.
To find hair, they assume they have a segmentation of the head from the background and they subtract the skin pixels.
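The skin-tone step could be sketched roughly as follows. The landmark handling, patch sizes, and color-distance threshold are all my own illustrative choices, not details from the paper:

```python
import numpy as np

def segment_skin(image, eye_l, eye_r, mouth, tol=30.0):
    """Split pixels into skin / non-skin by color distance to a skin
    tone sampled near known eye and mouth positions.
    `image` is an (H, W, 3) float array; landmarks are (row, col).
    Patch placement and `tol` are illustrative, not from the paper."""
    samples = []
    for (r, c) in (eye_l, eye_r, mouth):
        # sample a small patch just below each landmark as likely skin
        patch = image[r + 2:r + 6, c - 2:c + 3].reshape(-1, 3)
        samples.append(patch)
    skin_tone = np.concatenate(samples).mean(axis=0)
    dist = np.linalg.norm(image - skin_tone, axis=2)
    return dist < tol  # boolean (H, W) skin mask
```

The hair step would then be a set difference: given a head-vs-background mask, hair pixels are roughly `head_mask & ~skin_mask`.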
"The ear as a biometric" by Hurley et al.: This is an ear biometrics review paper.
The paper makes the point that there is significant variability in ear shape between people, and that the ear: 1) changes little with age, 2) doesn't change with facial expression, 3) unlike fingerprints, poses no hygiene issue, 4) unlike iris or retina scanning, provokes no user fear of harm, 5) is not affected by makeup or obscured by facial hair, though it can be changed with jewelry or obscured by head hair.
Burge and Burger demonstrated the potential for the ear as a biometric and also used computer vision to recognize ears. They segmented the ear using Canny edges and compared two such segmentations by comparing the Voronoi diagrams of the segmentations using a novel distance measure. They identify occlusion by hair as a major obstacle.
A number of authors have used principal components analysis on the raw pixels of registered images as a preprocessing step when recognizing ears.
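The PCA-on-raw-pixels preprocessing could be sketched with a plain SVD; `k` and the flattened-image layout here are generic choices, not specifics from any one of the cited papers:

```python
import numpy as np

def pca_features(images, k):
    """Project registered, flattened images onto their top-k principal
    components ("eigen-ears"). `images` is an (n, d) array; returns
    (mean, components, per-image coefficients). `k` is a free parameter."""
    mean = images.mean(axis=0)
    centered = images - mean
    # SVD of the centered data: rows of vt are the principal directions
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                  # (k, d)
    coeffs = centered @ components.T     # (n, k)
    return mean, components, coeffs
```

Recognition would then typically be nearest-neighbor matching in the k-dimensional coefficient space rather than in raw pixel space.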
The authors appear only to discuss approaches for registered ears.
"Image-based Shaving" by Nguyen et al.: This is a graphics paper discussing automatic removal of beards in images. They express a given image in terms of beard and non-beard PCA components.
Given a set of non-beard images, a naive approach is to find the PCA components accounting for most of the energy, and when a bearded image comes in, express it in terms of the non-beard principal components. However, as beards tend to be spatially large and significantly different from skin pixels, the reconstruction process tries to reconstruct the beard and produces poor results. As a first-pass fix for this problem, they use a robust estimator, where pixel mismatch imposes an asymptotically L1 error, instead of the L2 error of regular PCA. This improves the results, but can also produce overly smooth reconstructions.
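One common way to get an asymptotically L1 penalty is iteratively reweighted least squares with Charbonnier-style weights; this sketch uses that scheme, which may differ in detail from the paper's estimator:

```python
import numpy as np

def robust_project(x, mean, components, n_iter=20, eps=1.0):
    """Express x in a PCA basis under a robust (asymptotically L1)
    penalty via iteratively reweighted least squares. Pixels with large
    residuals -- e.g. beard pixels -- get down-weighted instead of
    dominating the fit as in plain L2 PCA. The weighting function and
    `eps` are illustrative choices. `components` is (k, d), rows unit."""
    r = x - mean
    coeffs = components @ r              # plain L2 solution as the start
    for _ in range(n_iter):
        resid = r - components.T @ coeffs
        # weight ~ 1/|resid| for large residuals => ~L1 behaviour there
        w = 1.0 / np.sqrt(resid ** 2 + eps ** 2)
        # weighted least squares: solve (C W C^T) a = C W r
        cw = components * w              # scale each pixel column by w
        coeffs = np.linalg.solve(cw @ components.T, cw @ r)
    return mean + components.T @ coeffs
```

With uniform weights this reduces to the ordinary L2 projection, which is why clean images reconstruct exactly while outlier pixels are largely ignored.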
As a further refinement, they estimate the beard space directly. To do this, they take a dataset of bearded faces, and for each bearded face they compute the difference between it and the automatically shaved version produced by the previous method. This set of difference vectors spans the beard space. They can then express a new bearded image with the standard PCA error (L2), using beard and non-beard principal components, and reweight the beard coefficients to delete the beard or even make it more pronounced.
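The fit-then-reweight step could look like the following; the basis names and the ordinary-least-squares joint fit are my own illustrative framing of the description above:

```python
import numpy as np

def reweight_beard(x, mean, face_basis, beard_basis, scale=0.0):
    """Express x - mean in the joint (face + beard) subspace by L2 least
    squares, then rescale the beard coefficients: scale=0 shaves the
    beard, scale>1 makes it more pronounced. Basis names are
    illustrative; per the paper, the bases come from PCA on non-beard
    images and on beard difference vectors respectively."""
    B = np.vstack([face_basis, beard_basis])       # (kf + kb, d)
    # standard L2 fit of coefficients for the stacked basis
    coeffs, *_ = np.linalg.lstsq(B.T, x - mean, rcond=None)
    kf = face_basis.shape[0]
    coeffs[kf:] *= scale                           # reweight beard part
    return mean + B.T @ coeffs
```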
As yet a further refinement, they exploit the spatial locality of beards. When determining the beard subspace by considering the difference images, they use an MRF to segment beard pixels (generally pixels with a large difference) from non-beard pixels. They then zero out the entries in the difference vectors that are not beard pixels.
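The zeroing step is simple once a beard mask exists; here a plain magnitude threshold stands in for the paper's MRF segmentation (a deliberate simplification):

```python
import numpy as np

def mask_difference(diff, thresh=0.2):
    """Zero out non-beard entries of a beard difference vector.
    The paper segments beard pixels with an MRF; this magnitude
    threshold is a simplified stand-in for that step."""
    mask = np.abs(diff) > thresh        # "beard" pixels: large difference
    return np.where(mask, diff, 0.0)
```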
With their technique, they can also perform beard addition, though they must blend the beard and face layers as a postprocessing step to ensure an even transition from beard to face.
They use hand-registered data.
They have preliminary results for automatic glasses removal, but they don't discuss them.