This paper assumes the foreground and background colors are locally constant within small windows, and additionally imposes a smoothness constraint on the alpha mask. With these assumptions, the alpha mask can be recovered exactly by solving an unconstrained quadratic program. In the case of color images, the authors relax their assumptions somewhat, assuming only that the pixel values in foreground and background windows lie on lines in RGB space (in general, they could lie anywhere in the RGB cube). In this case, their optimization problem still reduces to a QP.
Their method can be interpreted as producing an affinity function between pixels, even before the alpha matte is known. This can be used to help the user of the matting system, by showing the user which pixels are thought to be similar, allowing the user to quickly spot and fix incorrect assumptions.
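To make the "quadratic program" part concrete, here is a minimal sketch on a toy 1-D strip of pixels. It is not the paper's matting Laplacian: the per-edge `weights` are a stand-in for the pixel affinities, and the function name, the soft-constraint strength `lam`, and the `None`-for-unlabeled convention are all my own assumptions. The point is only that a quadratic smoothness cost plus a few user scribbles reduces to one linear solve.

```python
def solve_chain_alpha(weights, scribbles, lam=100.0):
    """Minimize sum_i w_i * (a_i - a_{i+1})^2 + lam * sum_{scribbled i} (a_i - s_i)^2.

    weights:   n-1 positive affinities between neighboring pixels (hypothetical
               stand-in for the affinities derived from the image)
    scribbles: n entries, 0.0 (background), 1.0 (foreground), or None (unknown);
               at least one entry must not be None
    Setting the gradient to zero gives the tridiagonal system
    (L + lam * D) a = lam * D s, solved here with the Thomas algorithm.
    """
    n = len(scribbles)
    assert n >= 2 and len(weights) == n - 1
    diag = [0.0] * n
    rhs = [0.0] * n
    # Graph Laplacian of the chain: each edge adds w to both endpoint diagonals.
    for i, w in enumerate(weights):
        diag[i] += w
        diag[i + 1] += w
    # Soft scribble constraints add lam on the diagonal and lam * s on the right.
    for i, s in enumerate(scribbles):
        if s is not None:
            diag[i] += lam
            rhs[i] = lam * s
    off = [-w for w in weights]  # symmetric off-diagonal entries

    # Forward elimination.
    c = [0.0] * (n - 1)
    d = [0.0] * n
    c[0] = off[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = off[i] / denom
        d[i] = (rhs[i] - off[i - 1] * d[i - 1]) / denom

    # Back substitution.
    alpha = [0.0] * n
    alpha[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        alpha[i] = d[i] - c[i] * alpha[i + 1]
    return alpha


if __name__ == "__main__":
    # One background scribble on the left, one foreground scribble on the right.
    print(solve_chain_alpha([1.0, 1.0, 0.01, 1.0], [0.0, None, None, None, 1.0], lam=1e6))
```

With uniform weights, alpha ramps linearly between the two scribbles; making one edge weak (low affinity) concentrates nearly all of the change in alpha across that edge, which is the behavior you want at a foreground/background boundary.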
It seems like a fairly simple idea, and the results are impressive. They are able, for example, to matte a woman and her fly-away hair, preserving long thin filaments of hair that are surrounded by background. A matte with this much detail is probably not needed for texture-based descriptors, but would be great for shape-based descriptors. A long, thin thread is a very distinctive shape.
To get the per-segment probability for a pixel, they use two sources of information. First, they look at the pixel location; since the images are aligned, hair occurs in the same parts of the image. Because hairstyles differ, they express the probability that a pixel will be hair using a mixture model, hand-clustering the training data into six hair types: {thin, thick} x {short, medium, long}. The second source of information is the pixel color; they model the probability of a color given a segment type using a Gaussian mixture model. Together, these two sources of information give their per-class evidence. They combine this with a prior that penalizes neighboring pixels having different segment types, assigning a penalty of either zero or one. To recover the approximately optimal solution to this MRF-type problem, they use loopy belief propagation.
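The pieces described above fit together roughly as in the sketch below. It is a toy under loud assumptions: pixels are a 1-D strip of grayscale intensities rather than aligned color images, the location prior is a plain per-pixel table rather than a mixture over hair types, the zero-or-one neighbor penalty is scaled by a `penalty` weight I made up, and exhaustive search stands in for loopy belief propagation (which is only feasible because the strip is tiny). All function names are hypothetical.

```python
import math
from itertools import product


def gmm_pdf(x, components):
    """Density of a 1-D Gaussian mixture; components are (weight, mean, std)."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in components
    )


def unary_costs(intensities, location_prior, color_models):
    """cost[i][k] = -log( P(label k at pixel i) * p(intensity_i | label k) )."""
    costs = []
    for x, prior in zip(intensities, location_prior):
        costs.append([
            -math.log(prior[k] * gmm_pdf(x, color_models[k]) + 1e-300)
            for k in range(len(color_models))
        ])
    return costs


def potts_energy(labels, costs, penalty):
    """Per-pixel evidence plus a penalty for each pair of differing neighbors."""
    data = sum(costs[i][label] for i, label in enumerate(labels))
    smooth = penalty * sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


def best_labeling(costs, penalty):
    """Exhaustive minimizer; a stand-in for loopy BP on a tiny problem."""
    n_labels = len(costs[0])
    return min(
        product(range(n_labels), repeat=len(costs)),
        key=lambda labels: potts_energy(labels, costs, penalty),
    )


if __name__ == "__main__":
    # Label 0 = "hair" (dark), label 1 = "skin" (bright); values are made up.
    color_models = [[(1.0, 0.1, 0.1)], [(1.0, 0.8, 0.1)]]
    intensities = [0.1, 0.15, 0.6, 0.12, 0.8, 0.85]
    flat_prior = [[0.5, 0.5]] * len(intensities)
    costs = unary_costs(intensities, flat_prior, color_models)
    print(best_labeling(costs, penalty=0.0))
    print(best_labeling(costs, penalty=10.0))
```

In this made-up strip, the third pixel is brighter than its dark neighbors, so the color evidence alone labels it as skin. Once the neighbor penalty is turned on, it becomes cheaper to label it hair like its neighbors, which is the kind of clean-up the prior is meant to provide.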
The idea seems sound to me, but the methods are old school. They could probably get better initial segment type estimates if they used texture (hair vs. skin should be easy to distinguish with texture). There is probably also a clever way to replace the MRF inference with matting as described in the previous paper, though there might be issues: 1) going from foreground-background to multiple segment types, 2) conceptually dealing with "soft" segmentations (alpha values), and 3) mapping the idea of wanting to keep foreground and background constant to the idea of wanting to preserve evidence.