Mattes are used in photography and special effects filmmaking to combine two or more image elements into a single, final image. Usually, mattes are used to combine a foreground image (e.g. actors on a set or a spaceship) with a background image (e.g. a scenic vista or a starfield with planets). In this case, the matte is the background painting. In film and stage, mattes can be physically huge sections of painted canvas, portraying large scenic expanses of landscapes.
In film, the principle of a matte requires masking certain areas of the film emulsion to selectively control which areas are exposed. Many complex special-effects scenes have included dozens of discrete image elements, requiring very complex use of mattes and the layering of mattes on top of one another. As an example of a simple matte, we may wish to depict a group of actors in front of a store, with a massive city and sky visible above the store's roof. We would have two images (the actors on the set, and the image of the city) to combine onto a third. This would require two masks/mattes. One would mask everything above the store's roof, and the other would mask everything below it. By using these masks/mattes when copying these images onto the third, we can combine the images without creating ghostly double exposures. In film, this is an example of a static matte, where the shape of the mask does not change from frame to frame. Other shots may require mattes that change, to mask the shapes of moving objects, such as human beings or spaceships. These are known as traveling mattes. Traveling mattes enable greater freedom of composition and movement, but they are also more difficult to accomplish.
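The store-and-city example can be sketched digitally. The following is only an illustration under stated assumptions: the images are tiny grids of labels rather than film frames, and the function name and the single binary matte (whose inverse plays the role of the second, complementary mask) are hypothetical.

```python
def composite_with_static_matte(foreground, background, matte):
    """Combine two same-sized images using a binary matte.

    Where matte is 1 the foreground pixel is copied; where it is 0 the
    background pixel is copied. Each output pixel comes from exactly one
    source, so there is no double exposure.
    """
    return [
        [f if m else b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, matte)
    ]

# Hypothetical 4x3 plates: "A" marks the actors' plate, "C" the city plate.
actors = [["A"] * 3 for _ in range(4)]
city = [["C"] * 3 for _ in range(4)]
# The roof line lies between rows 1 and 2: city above, actors below.
matte = [[0] * 3, [0] * 3, [1] * 3, [1] * 3]
frame = composite_with_static_matte(actors, city, matte)
```

Because the matte is the same for every frame, this corresponds to a static matte; a traveling matte would supply a different grid for each frame.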
Compositing techniques known as chroma keying that remove all areas of a certain color from a recording - colloquially known as "bluescreen" or "greenscreen" after the most popular colors used - are probably the best-known and most widely used modern techniques for creating traveling mattes, although rotoscoping and multiple motion control passes have also been used in the past. Computer-generated imagery, either static or animated, is also often rendered with a transparent background and digitally overlaid on top of modern film recordings using the same principle as a matte - a digital image mask.
The in-camera matte shot, also known as the Dawn Process, is created by first mounting a piece of glass in front of the camera. Black paint is applied to the glass where the background will be replaced. The actors are then filmed with minimal sets. The director shoots several minutes of extra footage to be used as test strips. The matte painter then develops a test strip (with the blacked-out areas in the shot) and projects a frame of the matted shot onto the easel-mounted glass. This test footage clip is used as the reference to paint the background or scenery to be matted in on a new piece of glass. The live-action part of the glass is painted black, and more of the test footage is then exposed to adjust and confirm color matching and edge line-up. Then the critical parts of the matted live-action scene (with the desired actions and actors in place) are threaded up for burning the painted elements into the black areas. The flat black paint on the glass blocks light from the part of the film it covers, preventing double exposure over the latent live-action scenes.
To begin a bipack matte shot, the live-action portion is filmed. The film is loaded and projected onto a piece of glass that has been painted first black, then white. The matte artist decides where the matte line will be and traces it on the glass, then paints in the background or scenery to be added. Once the painting is finished, the matte artist scrapes away the paint on the live-action portions of the glass. The original footage and a clean reel are loaded into the bipack, with the original threaded so that it passes the shutter in front of the clean film. The glass is lit from behind, so that when both reels are run, only the live action is transferred to the clean film. The reel of original footage is then removed and a piece of black cloth is placed behind the glass. The glass is lit from the front, and the new reel is rewound and run again. The black cloth prevents the already exposed footage from being exposed a second time, and the background scenery is thereby added to the live action.
The rotoscope was a device used to project film (namely live-action footage) onto a canvas to act as a reference for artists. This was used perhaps most famously in older Disney animated movies, such as Snow White and the Seven Dwarfs, which had notably realistic animation. The technique had a few other uses, such as in 2001: A Space Odyssey, where artists manually traced and painted alpha mattes for each frame. Rotoscoping was also used to achieve the fluid animations in Prince of Persia, which were impressive for the time. Unfortunately, the technique is very time-consuming, and capturing semi-transparency with it was difficult. A digital variant of rotoscoping exists today, with software helping users avoid some of the tedium, for instance by interpolating mattes between a few frames.
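Interpolating a matte between hand-drawn key frames can be illustrated with a short sketch. It assumes the matte is stored as an outline of (x, y) points listed in the same order on both key frames, and that simple linear blending is acceptable; real tools may use curves and other schemes, and the function name is hypothetical.

```python
def interpolate_outline(outline_a, outline_b, t):
    """Linearly blend two outlines with matching point order; t runs from 0 to 1."""
    return [
        ((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
        for (xa, ya), (xb, yb) in zip(outline_a, outline_b)
    ]

# Hypothetical outlines drawn by hand on frames 0 and 10.
key_frame_0 = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]
key_frame_10 = [(10.0, 0.0), (14.0, 0.0), (14.0, 8.0)]

# Outline for frame 5, halfway between the two key frames.
frame_5 = interpolate_outline(key_frame_0, key_frame_10, 5 / 10)
```

The artist then only needs to correct the frames where the interpolated outline drifts away from the subject.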
Often, it is desirable to extract two or more mattes from a single image. This process, dubbed "matting" or "pulling a matte," is most commonly used to separate the foreground and background elements of an image, and these images are often individual frames of a video file. Compositing techniques are a relatively simple way of pulling a matte: the foreground from a greenscreen scene could be imposed on an arbitrary background scene, for instance. Attempting to matte an image that was not shot this way is significantly more difficult. Several algorithms have been designed in an effort to address this challenge.
Ideally, a matting algorithm would separate an input video stream I_rgb into three output streams: a full-color, foreground-only stream αF_rgb with a pre-multiplied alpha (as in alpha compositing), a full-color background stream B_rgb, and a single-channel stream α of the partial coverage of the pixels in the foreground stream. This ideal algorithm could take any arbitrary video as input, including video where the foreground and background are dynamic, where there are multiple depths in the background, where overlapping regions of background and foreground share the same color and have no texture, and other features that such algorithms traditionally have difficulty dealing with. Unfortunately, achieving this algorithm is impossible due to the loss of information that occurs when translating a real-world scene into a two-dimensional video. Smith and Blinn formally proved this in 1996.
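The loss of information can be illustrated with the standard compositing relation I = αF + (1 − α)B, applied per color channel. The sketch below is not Smith and Blinn's proof; it only shows, for a single channel and hypothetical values in the range 0 to 1, that two different scenes can produce the same observed pixel.

```python
def composite(alpha, foreground, background):
    """Per-channel compositing: I = alpha * F + (1 - alpha) * B."""
    return alpha * foreground + (1 - alpha) * background

# Scene 1: a fully opaque mid-gray foreground over a black background.
opaque_gray = composite(1.0, 0.5, 0.0)

# Scene 2: a white foreground covering half the pixel over the same background.
half_white = composite(0.5, 1.0, 0.0)

# Both scenes yield the same observed value, so the observation alone
# cannot reveal which alpha and foreground produced it.
same_observation = opaque_gray == half_white
```

With three observed color values per pixel but seven unknowns (α, three foreground channels, and three background channels), extra assumptions or extra measurements are needed to pin down a unique answer.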
Matting also has some other fundamental limitations. The process cannot reconstruct parts of the background that are occluded by the foreground, and any sort of approximation will be limited. Additionally, the foreground and background of an image still have an effect on each other due to shadows being cast and light being reflected between them. When compositing an image or video from mattes of different origin, missing or extra shadows and other details of light can ruin the impact of the new image.
The process of matting itself is a difficult problem to solve. It has been under research since the 1950s, and yet its most popular use, filmmaking, still resorts to the classic but constrained compositing method. Specifically, filmmakers use a kind of global color model. This technique is based on a global color assumption; for instance, that the entire background is green. (Incidentally, this is why weather forecasters sometimes appear to have invisible ties: the color of the tie is similar to that of the background, leading the algorithm to classify the tie as part of the background stream.) Any color could in theory be used, but the most common are green and blue. Luminance matting (also called black-screen matting) is another variation of the global color model. Instead of a color, it assumes that the background is darker than a user-defined value.
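Both variations of the global color model can be sketched as per-pixel tests. This is a deliberately crude illustration, not a production keyer: the function names, the green-dominance factor, and the darkness threshold are assumptions, pixels are 8-bit (r, g, b) tuples, and the output is a hard 0/1 matte with no partial coverage.

```python
def green_screen_alpha(pixel, dominance=1.3):
    """Return 0.0 (background) if green clearly dominates red and blue, else 1.0."""
    r, g, b = pixel
    return 0.0 if g > dominance * max(r, b, 1) else 1.0

def luminance_alpha(pixel, threshold=40):
    """Black-screen variant: anything darker than the threshold is background."""
    r, g, b = pixel
    luma = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luma weights
    return 0.0 if luma < threshold else 1.0

backdrop = green_screen_alpha((20, 220, 30))    # green backdrop: keyed out
skin = green_screen_alpha((200, 150, 120))      # skin tone: kept
green_tie = green_screen_alpha((30, 200, 40))   # green tie: keyed out as well
```

The last call reproduces the "invisible tie" effect: the test looks only at color, so a foreground object in the backdrop's color is classified as background.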
Another approach is to use a local color model. This model assumes the background to be a static, previously known image, so in this case the background stream is given. A simple matte can be pulled by comparing the actual video stream with the known background stream. Lighting and camera angle requirements are very strict, unlike in global color models, but there is no restriction on the possible colors in the foreground stream.
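Comparing each frame with the known background can be sketched as follows. The function name, the sum-of-absolute-differences distance, and the tolerance value are assumptions chosen for illustration; pixels are 8-bit (r, g, b) tuples.

```python
def difference_matte(frame, clean_plate, tolerance=30):
    """Mark a pixel as foreground (1) where it differs from the known background."""
    matte = []
    for frame_row, plate_row in zip(frame, clean_plate):
        row = []
        for pixel, plate_pixel in zip(frame_row, plate_row):
            # Sum of absolute channel differences between frame and background.
            distance = sum(abs(c - p) for c, p in zip(pixel, plate_pixel))
            row.append(1 if distance > tolerance else 0)
        matte.append(row)
    return matte

# Hypothetical one-row images: the first pixel shows only sensor noise,
# the second is covered by a foreground object.
clean_plate = [[(90, 90, 90), (10, 120, 10)]]
frame = [[(92, 88, 91), (200, 150, 120)]]
matte = difference_matte(frame, clean_plate)
```

The tolerance shows why lighting and camera position must be tightly controlled: any change in the background between the clean plate and the shot is indistinguishable from a foreground object.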
There also exist machine learning tools that can pull mattes with the assistance of a user. Often, these tools require iteration on the part of the user - an algorithm provides a result based on a training set, and the user adjusts the set until the algorithm provides the desired result. An example of this is using a manually-created coarse matte with a trimap segmentation, so called because it separates the image into three regions: known background, known foreground, and an unknown region. In this case, the algorithm attempts to label the unknown region based on the user's input, and the user can iterate through multiple trimaps for better results. Knockout, a plug-in tool for Adobe Photoshop, is an implementation of this process.
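One way to derive a trimap from a manually created coarse matte is sketched below. The approach (marking every pixel near a foreground/background boundary as unknown), the function name, the band width, and the 0/128/255 label values are assumptions for illustration and are not taken from Knockout or any other specific tool.

```python
BACKGROUND, UNKNOWN, FOREGROUND = 0, 128, 255

def trimap_from_coarse_matte(coarse, band=1):
    """Label pixels within `band` of a foreground/background boundary as UNKNOWN."""
    height, width = len(coarse), len(coarse[0])
    trimap = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the coarse labels found in the neighbourhood of (x, y).
            neighbours = {
                coarse[ny][nx]
                for ny in range(max(0, y - band), min(height, y + band + 1))
                for nx in range(max(0, x - band), min(width, x + band + 1))
            }
            if len(neighbours) > 1:
                row.append(UNKNOWN)
            else:
                row.append(FOREGROUND if coarse[y][x] else BACKGROUND)
        trimap.append(row)
    return trimap

coarse = [[0, 0, 0, 1, 1, 1]]  # hypothetical one-row coarse matte
trimap = trimap_from_coarse_matte(coarse)
```

A matting algorithm would then estimate coverage only inside the unknown band, and the user could widen or redraw the band where the result is unsatisfactory.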
Another digital matting approach was proposed by McGuire et al. It makes use of two imaging sensors along the same optical axis, and uses data from both of them. (There are various ways to achieve this, such as using a beam-splitter or per-pixel polarization filters.) The system simultaneously captures two frames that differ by about half the dynamic range at background pixels but are identical at foreground pixels. Using the differences between the backgrounds of the two images, McGuire et al. are able to extract a high-resolution foreground matte from the scene. This method still retains some of the shortcomings of compositing techniques - namely, the background must be relatively neutral and uniform - but it introduces several benefits, such as precise sub-pixel results, better support for natural illumination, and allowing the foreground to be the color that a compositing technique would identify as part of the background matte. However, this means that intentionally masking something in the foreground by coating it in the same color as the background is impossible.
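The idea of recovering coverage from two frames that differ only at background pixels can be illustrated with simple algebra. This is a simplified sketch, not the algorithm published by McGuire et al.: it assumes each frame follows I = αF + (1 − α)B per pixel, that the foreground term is identical in both frames, and that the background difference is a known constant (here 0.5 on a 0 to 1 scale, matching "about half the dynamic range"). Subtracting the two frames then gives I1 − I2 = (1 − α)(B1 − B2).

```python
def alpha_from_two_frames(i1, i2, background_gap=0.5):
    """Solve I1 - I2 = (1 - alpha) * (B1 - B2) for alpha, clamped to [0, 1]."""
    background_visibility = (i1 - i2) / background_gap
    return min(1.0, max(0.0, 1.0 - background_visibility))

# Hypothetical pixel values on a 0..1 scale.
pure_foreground = alpha_from_two_frames(0.3, 0.3)      # identical in both frames
pure_background = alpha_from_two_frames(0.75, 0.25)    # differs by the full gap
half_covered = alpha_from_two_frames(0.625, 0.375)     # alpha=0.5, F=0.5, B=0.75/0.25
```

Note that the foreground color never enters the calculation, which is consistent with the observation that the foreground may have any color, including that of the background.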
A third approach to digital matting is using three video streams with different focusing distances and depths of field. As with the previous method, all three image sensors share a common optical axis, though now the algorithm uses information about what part of the image is in focus in which video feed to generate its foreground matte. With this technique, both the foreground and background can have dynamic content, and there are no restrictions on what colors or complexity the background has.
All of these approaches share one notable weakness: they cannot take arbitrary videos as inputs. In video, as distinct from film, chroma key requires the background of the original video to be a single color. The other two techniques require more information in the form of synchronized but slightly different videos.
Mattes and widescreen filming
Another use of mattes in filmmaking is to create a widescreen effect. In this process, the top and bottom of a standard frame are matted out, or masked, with black bars, i.e. the film print has a thick frame line. Then the frame within the full frame is enlarged to fill a screen when projected in a theater.
Thus, in "masked widescreen" an image with an aspect ratio of 1.85:1 is created by using a standard, 1.37:1 frame and matting out the top and bottom. If the image is matted during the filming process it is called a hard matte due to its sharp edge. In contrast, if the full frame is filled during filming and the projectionist is relied upon to matte out the top and bottom in the theater, it is referred to as a soft matte, as the aperture plate is not on the focal plane and causes a soft edge.
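The amount of the frame hidden by such a matte follows from the two aspect ratios. In this small sketch the function name is hypothetical, and the full frame width is assumed to be retained.

```python
def matted_height_fraction(frame_ratio=1.37, target_ratio=1.85):
    """Fraction of the full frame height left visible after matting top and bottom.

    For a frame of width w, the full height is w / frame_ratio and the
    visible height is w / target_ratio, so the ratio of the two is
    frame_ratio / target_ratio.
    """
    return frame_ratio / target_ratio

visible = matted_height_fraction()   # roughly 0.74 of the frame height
hidden_per_bar = (1 - visible) / 2   # roughly 0.13 at the top and at the bottom
```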
In video, a similar effect is often used to present widescreen films on a conventional 1.33:1 television screen. In this case, the process is called letterboxing. In letterboxing, however, the top and bottom of the actual image are not matted out. Instead, the whole picture is scaled down so that its full width fits the screen, letting the viewer see the areas at the left and right of the picture that would be cropped if the film were shown fullscreen on television, and so preserving the widescreen composition on a nearly square TV screen. As a result, the top of the image sits slightly lower than usual, the bottom sits higher, and the unused portion of the screen is covered by black bars. For video transfers, transferring a "soft matte" film to a home video format with the full frame exposed, thus removing the mattes at the top and bottom, is referred to as an "open matte transfer." In contrast, transferring a "hard matte" film to a home video format with the theatrical mattes intact is referred to as a "closed matte transfer."
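The size of the letterbox bars can be worked out from the screen dimensions and the film's aspect ratio. The function name and the 640×480 screen are assumptions for illustration.

```python
def letterbox(screen_width, screen_height, film_ratio):
    """Return (image_height, bar_height) in pixels for a film scaled to full width."""
    image_height = min(screen_height, round(screen_width / film_ratio))
    # Split the unused rows evenly between the top and bottom bars.
    bar_height = (screen_height - image_height) // 2
    return image_height, bar_height

# A 1.85:1 film on a hypothetical 640x480 (1.33:1) screen.
image_height, bar_height = letterbox(640, 480, 1.85)
```

In this case the picture occupies 346 of the 480 rows, leaving a 67-row black bar above and below it.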
A "garbage matte" is a matte, often hand-drawn and sometimes quickly made, that is used to exclude parts of an image that another process, such as bluescreen, would not remove. The name stems from the fact that the matte removes "garbage" from the procedurally produced image. "Garbage" might include a rig holding a model, or the lighting grid above the top edge of the bluescreen.
Mattes that do the opposite, forcing inclusion of parts of the image that might otherwise have been removed by the keyer, such as too much blue reflecting on a shiny model ("blue spill"), are often called "holdout mattes", and can be created with the same tool.
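How garbage and holdout mattes modify a keyer's output can be sketched as follows. The function name is hypothetical, and the rule that a holdout matte takes priority where the two overlap is an assumption; compositing tools may combine them differently.

```python
def apply_garbage_and_holdout(key_alpha, garbage, holdout):
    """Force pixels out (garbage=1) or in (holdout=1); otherwise keep the keyer's alpha."""
    result = []
    for k_row, g_row, h_row in zip(key_alpha, garbage, holdout):
        result.append([
            1.0 if h else (0.0 if g else k)
            for k, g, h in zip(k_row, g_row, h_row)
        ])
    return result

# Hypothetical one-row example. Pixels, left to right: lighting grid (wrongly
# kept by the keyer), model, shiny part of the model with blue spill (wrongly
# thinned by the keyer), bluescreen.
key_alpha = [[1.0, 1.0, 0.2, 0.0]]
garbage = [[1, 0, 0, 0]]
holdout = [[0, 0, 1, 0]]
final_alpha = apply_garbage_and_holdout(key_alpha, garbage, holdout)
```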