Charles Petzold: Taper Transforms with Matrix3DProjection: An Analytical Approach

July 31, 2009
Roscoe, N.Y.

The new Matrix3DProjection class in Silverlight 3 allows programmers to apply 3D transforms to 2D elements, but it also makes available non-affine 2D transforms, including a type of transform sometimes known as a "taper transform." In a blog entry last week I showed how to derive non-affine transforms from rectangles whose corners are moved to arbitrary coordinates. That's a rather interactive approach. Today I'd like to look at taper transforms in a more analytical manner.

Let's begin with the basics. A two-dimensional linear transform can be represented by a 2×2 matrix:

M11	M12
M21	M22

I've identified the cells using property names of the Matrix structure defined in the Windows Presentation Foundation and Silverlight. You can use this matrix to transform points (x, y) to points (x', y') with standard matrix multiplication:

(x, y)	×	M11	M12	=	(x', y')
		M21	M22

The transform formulas are:

x' = M11·x + M21·y
y' = M12·x + M22·y

With this 2×2 matrix transform, you can scale in the horizontal direction (by setting M11) or the vertical direction (M22), and you can perform rotation and shear by various combinations of the values of the four cells. The default matrix that performs no transformation has a diagonal of 1's:

1	0
0	1
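To make the row-vector convention concrete, here is a minimal sketch in Python rather than Silverlight code. The function name transform_2x2 and the sample matrices are my own choices for illustration and are not part of any WPF or Silverlight API.

```python
import math

def transform_2x2(point, m):
    """Multiply the row vector (x, y) by a matrix given as ((M11, M12), (M21, M22))."""
    x, y = point
    (m11, m12), (m21, m22) = m
    return (m11 * x + m21 * y, m12 * x + m22 * y)

identity = ((1, 0), (0, 1))
scale = ((2, 0), (0, 3))          # M11 scales x, M22 scales y
angle = math.radians(90)
rotate = ((math.cos(angle), math.sin(angle)),
          (-math.sin(angle), math.cos(angle)))

print(transform_2x2((5, 7), identity))   # (5, 7)
print(transform_2x2((5, 7), scale))      # (10, 21)
print(transform_2x2((1, 0), rotate))     # approximately (0, 1)
print(transform_2x2((0, 0), scale))      # (0, 0): the origin always maps to itself
```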

But there's a problem here: Although you can scale, rotate, and shear, you cannot perform the type of transform known as translation, which simply shifts an object to another location on the 2D plane.

The mathematician August Ferdinand Möbius (1790–1868) realized that translation could be included in the 2D transform by adding an extra dimension to make what are called homogeneous coordinates. Basically, two-dimensional translation is equivalent to three-dimensional shear. (A more extensive discussion appears on pages 300-306 of my book 3D Programming for Windows.) The transform matrix looks like this (again, using property names of the WPF and Silverlight Matrix structure):

M11	M12	0
M21	M22	0
OffsetX	OffsetY	1

Translation factors named OffsetX and OffsetY have been added as a third row. When applying this transform matrix, a two-dimensional point (x, y) is first converted to a three-dimensional point (x, y, 1) on the XY plane where Z equals 1, and then that point is multiplied by the matrix:

(x, y, 1)	×	M11	M12	0	=	(x', y', 1)
		M21	M22	0
		OffsetX	OffsetY	1

The 3D point that results from this calculation is also on the XY plane where Z equals 1, so the Z coordinate can simply be ignored. The transform formulas are:

x' = M11·x + M21·y + OffsetX
y' = M12·x + M22·y + OffsetY

This is the standard two-dimensional affine transform supported by the Windows Presentation Foundation and Silverlight. The values in the third column of the matrix are constants and cannot be changed. These values are necessary to keep the entire transform restricted to the XY plane. Consequently, the affine transform can never transform a square into anything other than a parallelogram.
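The parallelogram claim is easy to check numerically. The following Python sketch applies an affine transform to the corners of a unit square and compares opposite edges. The function names and the matrix values are arbitrary choices of mine for illustration, not anything defined by WPF or Silverlight.

```python
def transform_affine(point, m):
    """Apply (x, y, 1) x [[M11, M12, 0], [M21, M22, 0], [OffsetX, OffsetY, 1]]."""
    x, y = point
    m11, m12, m21, m22, offset_x, offset_y = m
    return (m11 * x + m21 * y + offset_x, m12 * x + m22 * y + offset_y)

def edge(a, b):
    """Vector from point a to point b."""
    return (b[0] - a[0], b[1] - a[1])

# Arbitrary example values combining scale, shear, and translation.
matrix = (2, 0.5, 1, 1.5, 10, 20)
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
p0, p1, p2, p3 = [transform_affine(p, matrix) for p in square]

# Opposite edges of the result are equal vectors, so the shape is a parallelogram.
print(edge(p0, p1), edge(p3, p2))
print(edge(p0, p3), edge(p1, p2))
```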

But suppose you could change those values in the third column. What would happen? What would a non-affine transform look like?

Here's a hypothetical two-dimensional 3×3 matrix capable of non-affine transforms:

M11	M12	M13
M21	M22	M23
OffsetX	OffsetY	M33

Just as with the affine transform matrix, we can represent a two-dimensional point (x, y) as a three-dimensional point (x, y, 1) and multiply the point by that transform:

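(x, y, 1)	×	M11	M12	M13	=	(x', y', z')
		M21	M22	M23
		OffsetX	OffsetY	M33

Now we're in big trouble. We've broken out of two dimensions, or at least the XY plane where Z equals 1, as the transform formulas show:

x' = M11·x + M21·y + OffsetX
y' = M12·x + M22·y + OffsetY
z' = M13·x + M23·y + M33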

This is a problem because we're still trying to work in two dimensions, and somehow we have to project that three-dimensional point back on the XY plane where Z equals 1. The standard solution (and perhaps the simplest) is to divide all three coordinates by z':

(x', y', z') → (x'/z', y'/z', z'/z') → (x'/z', y'/z', 1)

Now we're back on the XY plane where Z equals 1, but with a potential problem: Division is involved, so there might well be singularities where z' equals zero, which would result in infinite coordinates. This is what makes the transform "non-affine." By definition, affine transforms do not involve infinity.

Try to get a little intuitive feel for the effect of the third column on the transformed coordinates. By default, M33 is 1; decreasing that value toward zero uniformly enlarges the coordinates, and increasing it makes everything uniformly smaller. That's not very interesting, and usually M33 is kept at 1 for convenience. When M33 equals 1, a positive value of M13 makes the divisor z' grow as X gets larger, so X and Y values are scaled down progressively and the image tapers in the positive X direction. Similarly, a positive value of M23 scales X and Y values down as Y gets larger. Negative values of M13 and M23 scale X and Y values up for a reverse tapering.
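A few numbers can help with that intuition. This Python sketch multiplies (x, y, 1) by a hypothetical 3×3 matrix and divides by z'. The function name, the sample points, and the value 0.01 for M13 are assumptions made only for this illustration.

```python
def transform_projective(point, m):
    """Multiply (x, y, 1) by a 3x3 matrix, then divide by z' to return to the Z = 1 plane."""
    x, y = point
    (m11, m12, m13), (m21, m22, m23), (offset_x, offset_y, m33) = m
    xp = m11 * x + m21 * y + offset_x
    yp = m12 * x + m22 * y + offset_y
    zp = m13 * x + m23 * y + m33
    if zp == 0:
        raise ZeroDivisionError("singularity: z' is zero for this point")
    return (xp / zp, yp / zp)

taper_x = ((1, 0, 0.01), (0, 1, 0), (0, 0, 1))   # positive M13
shrink = ((1, 0, 0), (0, 1, 0), (0, 0, 2))       # M33 = 2 halves everything

# With y fixed at 100, the transformed y falls as x grows: a taper toward +X.
for x in (0, 100, 200):
    print(x, transform_projective((x, 100), taper_x))

print(transform_projective((10, 20), shrink))
```

With a negative M13 such as -0.01, the same function raises the singularity error at x = 100, where z' reaches zero.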

Silverlight 3 has not introduced a 3×3 matrix where the values of the third column can be set for non-affine transforms. However, Silverlight 3 has lifted the Matrix3D structure from WPF, and made it available for creating Matrix3DProjection objects, which can then be set to the Projection property of 2D Silverlight elements.

As you've seen, 2D graphics requires 3D homogeneous coordinates and a 3×3 matrix to allow translation to be defined along with the linear transforms. Analogously, translation in three-dimensional space is equivalent to skewing in four-dimensional space, so the 3D transform is a 4×4 matrix. These are the property names of the Matrix3D structure:

M11	M12	M13	M14
M21	M22	M23	M24
M31	M32	M33	M34
OffsetX	OffsetY	OffsetZ	M44

A point in 3D space (x, y, z) is represented as a 4D point (x, y, z, 1) for multiplication by the matrix:

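(x, y, z, 1)	×	M11	M12	M13	M14	=	(x', y', z', w')
		M21	M22	M23	M24
		M31	M32	M33	M34
		OffsetX	OffsetY	OffsetZ	M44

Notice that the fourth dimension is represented by the letter W because we've run out of letters after Z. The transform formulas are:

x' = M11·x + M21·y + M31·z + OffsetX
y' = M12·x + M22·y + M32·z + OffsetY
z' = M13·x + M23·y + M33·z + OffsetZ
w' = M14·x + M24·y + M34·z + M44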

Points are projected back into 3D space with the following process:

(x', y', z', w') → (x'/w', y'/w', z'/w', w'/w') → (x'/w', y'/w', z'/w', 1)

Non-affine transforms are required in 3D graphics for perspective effects: Objects seem to get smaller as they recede from the viewer's vantage point. This is a three-dimensional taper transform. Additionally, at some point prior to rendering, all the Z coordinates are collapsed for the two-dimensional video display or printer.

In Silverlight 3, we're not really dealing with 3D space. There aren't even Point3D and Vector3D structures to help us manipulate points in the third dimension. We're really only transforming 2D points of UIElement derivatives. Matrix3D is a bit of overkill for this purpose, but it's a familiar entity (at least to WPF programmers). Conceptually, the two-dimensional coordinate point (x, y) is treated as a four-dimensional coordinate point (x, y, 0, 1) for the matrix multiplication:

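(x, y, 0, 1)	×	M11	M12	M13	M14	=	(x', y', z', w')
		M21	M22	M23	M24
		M31	M32	M33	M34
		OffsetX	OffsetY	OffsetZ	M44

Notice that the third row of the matrix has no effect on the result. The transform formulas are consequently somewhat simpler:

x' = M11·x + M21·y + OffsetX
y' = M12·x + M22·y + OffsetY
z' = M13·x + M23·y + OffsetZ
w' = M14·x + M24·y + M44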

Points are projected back into 3D space normally by dividing by w':

(x', y', z', w') → (x'/w', y'/w', z'/w', w'/w') → (x'/w', y'/w', z'/w', 1)
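The observation that the third row drops out when z is 0 can be confirmed with a short Python sketch. The function name project_2d_point and all the matrix values are hypothetical and chosen only for illustration. This is plain arithmetic, not a call into the Silverlight Matrix3D structure.

```python
def project_2d_point(point, m):
    """Multiply (x, y, 0, 1) by a 4x4 matrix (a list of four rows) and divide by w'."""
    vector = (point[0], point[1], 0.0, 1.0)
    xp, yp, zp, wp = (sum(vector[row] * m[row][col] for row in range(4))
                      for col in range(4))
    return (xp / wp, yp / wp, zp / wp)

matrix = [[2.0, 0.5, 0.0, 1 / 256],   # M11, M12, M13, M14
          [0.0, 1.0, 0.0, 0.0],       # M21, M22, M23, M24
          [0.0, 0.0, 1.0, 0.0],       # M31, M32, M33, M34
          [0.0, 0.0, 0.0, 1.0]]       # OffsetX, OffsetY, OffsetZ, M44

changed = [row[:] for row in matrix]
changed[2] = [7.0, -3.0, 9.0, 4.0]    # arbitrary replacement for the third row

print(project_2d_point((256, 100), matrix))
print(project_2d_point((256, 100), changed))   # same result, because z = 0 cancels row three
```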

You might assume that the Z coordinate is simply ignored for rendering these points on the two-dimensional surface of the video display, but that is not the case. My experimentation reveals that any point where the Z coordinate is less than zero, or greater than one, is clipped. I will try to explore this problem and solutions in future blog entries.

For two-dimensional taper transforms, you can leave cells in the third column (as well as the third row) at their default values so the matrix transform looks like this:

M11	M12	0	M14
M21	M22	0	M24
0	0	1	0
OffsetX	OffsetY	0	M44

Because Z coordinates aren't involved at all in this calculation, you can visualize the transform as a non-affine 3×3 matrix such as the one I showed earlier, just with slightly different property names for the last column:

M11	M12	M14
M21	M22	M24
OffsetX	OffsetY	M44

The transform formulas are:

x' = M11·x + M21·y + OffsetX
y' = M12·x + M22·y + OffsetY
w' = M14·x + M24·y + M44

The final transformed point is projected back onto the plane from which it came by dividing by w':

(x', y', w') → (x'/w', y'/w', w'/w') → (x'/w', y'/w', 1)

For simplicity (and to help avoid singularities), I will keep M44 at its default value of 1, but this isn't a requirement.

Let's begin with a UIElement derivative, such as an Image element containing a bitmap:

As with RenderTransform, the Projection transform is relative to the upper-left corner of the element to which it's applied. For purposes of calculating the transformed points, the upper-left corner of the element is (0, 0). The element has a width (which I'll refer to as W) and a height (H). The upper-right corner is (W, 0), the lower-left corner is (0, H), and the lower-right corner is (W, H).

Suppose you want to define a taper transform that causes the right side of the image to be half its untransformed height, and you want the top edge of the image to remain horizontal. In other words, when x equals 0, you want y to be unchanged, but when x equals W, you want values of y to be halved, which means that w' should be 2, and M14 should be 1/W:

1	0	1/W
0	1	0
0	0	1

If you use this Matrix3D with the Matrix3DProjection class applied to the Projection property of the Image element, you'll get:

It's actually not quite what I envisaged. I wanted the width of the Image to remain the same, but here it's being halved just like the height at the right side. This can be corrected by setting M11:

2	0	1/W
0	1	0
0	0	1

And now the image is correct:

If you want the tapered side to be 1/3 of its normal height, you can set M14 to 2/W, and M11 to 3:

3	0	2/W
0	1	0
0	0	1

And here it is:
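The three matrices so far can be checked by transforming the four corners of the element. This Python sketch assumes an element size of 512 by 256, chosen only because those values divide cleanly. The function name taper_point is hypothetical.

```python
def taper_point(point, m11, m14):
    """Apply [[M11, 0, M14], [0, 1, 0], [0, 0, 1]] to (x, y, 1) and divide by w'."""
    x, y = point
    w = m14 * x + 1.0              # w' = M14*x + M44, with M24 = 0 and M44 = 1
    return (m11 * x / w, y / w)

W, H = 512.0, 256.0                # assumed element size
corners = [(0.0, 0.0), (W, 0.0), (0.0, H), (W, H)]

cases = [("M14 only", 1.0, 1 / W),
         ("half height", 2.0, 1 / W),
         ("one-third height", 3.0, 2 / W)]

results = {}
for label, m11, m14 in cases:
    results[label] = [taper_point(c, m11, m14) for c in corners]
    print(label, results[label])
```

In the first case the upper-right corner lands at x = W/2, which is the unwanted halving of the width. In the other two cases it stays at x = W while the lower-right corner rises to H/2 and H/3 respectively.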

If you prefer to taper at the top rather than the bottom (and let's go back to a right side that is half its normal height), the matrix is a little more complex. The value of M14 is still 1/W, and M11 is still 2, but a skewing factor must be introduced to shift the thing down:

2	H/W	1/W
0	1	0
0	0	1

When x is zero, y' equals y, but when x is W, y' equals y + H, which the division by w' = 2 turns into (y + H)/2, so the right side runs from H/2 down to H:

It's also possible to taper towards the center of the right side with a matrix that looks much the same but with a slightly different skew value:

2	H/(2W)	1/W
0	1	0
0	0	1

Here it is:
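To see where the right edge ends up with each skew value, this Python sketch transforms the two right-hand corners. The element size of 512 by 256 and the function name taper_point are again assumptions made for illustration.

```python
def taper_point(point, m11, m12, m14):
    """Apply [[M11, M12, M14], [0, 1, 0], [0, 0, 1]] to (x, y, 1) and divide by w'."""
    x, y = point
    w = m14 * x + 1.0              # w' = M14*x + M44, with M24 = 0 and M44 = 1
    return (m11 * x / w, (m12 * x + y) / w)

W, H = 512.0, 256.0                # assumed element size
right_edge = [(W, 0.0), (W, H)]

top_taper = [taper_point(p, 2.0, H / W, 1 / W) for p in right_edge]
center_taper = [taper_point(p, 2.0, H / (2 * W), 1 / W) for p in right_edge]

print(top_taper)      # right edge runs from y = H/2 to y = H
print(center_taper)   # right edge runs from y = H/4 to y = 3H/4
```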

You can generalize all the formulas for the matrix cells, and then derive similar matrices for tapering on the bottom, top, or left sides in the three variations, or you can use a class I've written called TaperTransform. This class has the following properties:

TaperFraction	double
TaperSide	Left, Top, Right, Bottom
TaperCorner	LeftOrRight, RightOrBottom, Both
TargetWidth
TargetHeight
Matrix3D
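As a rough idea of what such a generalization might look like for the right side only, here is a Python sketch that reproduces the three matrices shown earlier from a taper fraction. It is not the TaperTransform class. The function name, the parameter names, and the "bottom", "top", and "both" labels are hypothetical, and the formulas are simply the ones that reproduce the M11, M12, and M14 values used above.

```python
def right_taper_cells(fraction, moving_corner, width, height):
    """Return hypothetical (M11, M12, M14) cells for a taper on the right side.

    fraction: height of the right side relative to the untransformed height.
    moving_corner: "bottom", "top", or "both" (which right-hand corners move).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    m14 = (1 / fraction - 1) / width    # makes w' equal 1/fraction at x = width
    m11 = 1 / fraction                  # restores the original width
    skew = height * (1 / fraction - 1) / width
    m12 = {"bottom": 0.0, "top": skew, "both": skew / 2}[moving_corner]
    return m11, m12, m14

W, H = 512.0, 256.0                     # assumed element size
print(right_taper_cells(0.5, "bottom", W, H))   # M11 = 2, M12 = 0, M14 = 1/W
print(right_taper_cells(0.5, "top", W, H))      # M12 = H/W
print(right_taper_cells(0.5, "both", W, H))     # M12 = H/(2W)
```

Tapers on the other three sides would need the analogous cells in the second row and M24, which this sketch does not attempt.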

Here's a program to experiment with the TaperTransform class:

TaperTransformExperiment.html

and here's the source code.