The Distributive Law (a + b)c = ac + bc

# Aspect Ratio of a Rectangle in Perspective

Here's an interesting* problem I had to solve about a year ago. Suppose we have, for example, a photograph of a rectangular object taken from some angle. We can't necessarily know the size of the rectangle, because its co-ordinates in the image depend on its distance from the camera, the camera's field of view, and the image resolution. But can we work out its proportions, i.e. the aspect ratio, height divided by width?

A picture is worth $10^3$ words:

For convenience, let's put the camera (or the eye) at the origin, and let the photograph be a projection onto the plane $z = \lambda$, where $\lambda > 0$ is an unknown constant which depends on the parameters of the camera. $\renewcommand\vec[1]{\mathbf{#1}}\vec{a},\vec{b},\vec{c},\vec{d}$ are the 2D position vectors of the rectangle's corners in the photograph, and the camera projection is $$\pi\begin{pmatrix}x \\ y \\ z\end{pmatrix} = \begin{pmatrix}\lambda x / z \\ \lambda y / z \\ \lambda \end{pmatrix}$$

Let $\vec{a}',\vec{b}',\vec{c}',\vec{d}'$ be the position vectors of the actual rectangle's corners in 3D space. We are more interested in $\vec{u}$ and $\vec{v}$, since the aspect ratio $\newcommand\abs[1]{\left|#1\right|} \abs{\vec{v}} \, / \, \abs{\vec{u}}$ is what we're trying to find. Since we know the shape is a rectangle, we can write down some facts:

• $\vec{b}' = \vec{a}' + \vec{u}$,
• $\vec{c}' = \vec{a}' + \vec{v}$,
• $\vec{d}' = \vec{a}' + \vec{u} + \vec{v}$ because opposite sides of a rectangle are parallel, and
• $\vec{u} \cdot \vec{v} = 0$ because adjacent sides of a rectangle are perpendicular.

We don't know how far away from the camera the rectangle actually is; however, we don't need to. The aspect ratio will be preserved if we scale $\vec{a}',\vec{b}',\vec{c}',\vec{d}'$ by any factor, so without loss of generality,** let's choose $a'_z = \lambda$. Since $\vec{a} =\pi(\vec{a}')$, it follows immediately that $a'_x = a_x$ and $a'_y = a_y$.

Now let's try to solve for $\vec{u},\vec{v}$. We're assuming that $\vec{a},\vec{b},\vec{c},\vec{d}$ are known, because they can be read straight off from the photograph.***

• $\vec{b} = \pi(\vec{a}' + \vec{u})$, so $b_x = \dfrac{\lambda(a_x + u_x)}{\lambda + u_z}$ and $b_y = \dfrac{\lambda(a_y + u_y)}{\lambda + u_z}$.
• $\vec{c} = \pi(\vec{a}' + \vec{v})$, so $c_x = \dfrac{\lambda(a_x + v_x)}{\lambda + v_z}$ and $c_y = \dfrac{\lambda(a_y + v_y)}{\lambda + v_z}$.
• $\vec{d} = \pi(\vec{a}' + \vec{u} + \vec{v})$, so $d_x = \dfrac{\lambda(a_x + u_x + v_x)}{\lambda + u_z + v_z}$ and $d_y = \dfrac{\lambda(a_y + u_y + v_y)}{\lambda + u_z + v_z}$.

After rearranging, all of this can be written as a single matrix equation: $$\begin{pmatrix}1 &0& -b_x &0&0&0 \\ 0& 1 & -b_y &0&0&0 \\ 0&0&0& 1 &0& -c_x \\ 0&0&0&0& 1 & -c_y \\ 1 &0& -d_x & 1 &0& -d_x \\ 0& 1 & -d_y &0& 1 & -d_y \end{pmatrix} \begin{pmatrix} u_x \\ u_y \\ u_z/\lambda \\ v_x \\ v_y \\ v_z/\lambda \end{pmatrix} = \begin{pmatrix} b_x - a_x \\ b_y - a_y \\ c_x - a_x \\ c_y - a_y \\ d_x - a_x \\ d_y - a_y \end{pmatrix}$$ which can be solved to find $u_x$, $u_y$, $\frac{u_z}{\lambda}$, $v_x$, $v_y$ and $\frac{v_z}{\lambda}$.

So we need to know $\lambda$. Fortunately, we have one more equation: $\vec{u} \cdot \vec{v} = 0$. In particular, $$u_x v_x + u_y v_y + \lambda^2 \left(\frac{u_z}{\lambda}\right) \left(\frac{v_z}{\lambda}\right) = 0~~~\implies~~~\lambda = \sqrt{ \frac{-u_x v_x - u_y v_y}{\left(\frac{u_z}{\lambda}\right) \left(\frac{v_z}{\lambda}\right)} }$$

What a mess! Fortunately, the mathematician's job ends here and the computer takes over. The remainder is trivial: $u_z = \lambda(\frac{u_z}{\lambda})$ and $v_z = \lambda(\frac{v_z}{\lambda})$, hence we can compute $\abs{\vec{v}}\,/\,\abs{\vec{u}}$. The end.

Wait...

It would be remiss not to include some information about when this doesn't work. There are three things that might go wrong: our $6{\times}6$ matrix might be singular, and computing $\lambda$ might result in division by zero, or the square root of a negative number.

It's possible to check by inspection that the matrix is singular when $b_x = c_x = d_x$ or when $b_y = c_y = d_y$, but this isn't exhaustive. Fortunately, computers know how to do algebra, and MATLAB easily gives the determinant: the matrix is singular if and only if $$b_x c_y + c_x d_y + d_x b_y - b_y c_x - c_y d_x - d_y b_x = 0$$

How to interpret this? We expect that the task should be impossible when $\vec{b},\vec{c},\vec{d}$ are collinear, but we aren't sure if this is a necessary condition, so let's check. The determinant $$\det \begin{pmatrix} b_x & c_x & d_x \\ b_y & c_y & d_y \\ 1 & 1 & 1 \end{pmatrix} = b_x c_y + c_x d_y + d_x b_y - b_y c_x - c_y d_x - d_y b_x$$ gives the volume of a parallelepiped spanned by three vectors in the plane $z = 1$, so the determinant is zero if and only if the parallelepiped lies in a plane through the origin. This occurs precisely when the three points are collinear, because the intersection of these two planes is a line, and conversely given a line, there is a plane through the origin containing it.

This only happens when the camera lies in the plane of the rectangle. It would be rather difficult for a photographer to make this mistake by accident.

Murphy's theorem dictates that the formula for $\lambda$ breaks in many more circumstances. The "division by zero" error happens when either $u_z/\lambda$ or $v_z/\lambda$ come out to be zero. Of course $\lambda$ isn't actually $\infty$, rather our formula for $\lambda$ fails because $u_z = 0$ or $v_z = 0$ (or both).

If only one is zero, then we don't have enough information to calculate the other. This occurs when two sides of the rectangle's image are parallel. A quick sanity check indicates that in this case, we really can't compute the aspect ratio from the co-ordinates in the photograph alone:

My hands aren't stable enough to get them exactly equal, but you get the idea.

If both are zero then we probably got lucky and photographed the rectangle dead on, in which case not only do we have $\vec{a}' = \vec{a}$ but also $\vec{b}' = \vec{b}$, $\vec{c}' = \vec{c}$ and $\vec{d}' = \vec{d}$, and the problem is trivial. (We should still check if $\vec{u}\cdot\vec{v} = 0$.) However, $\lambda$ could instead be very large, in which case $u_z / \lambda$ and $v_z / \lambda$ will of course come out near zero, and the sides in our image will be so close to parallel that we can't reliably estimate the aspect ratio due to numerical instability.

The "square root of a negative number" error happens either because of numerical instability in one of the above cases, or because we labelled the corners $\vec{a},\vec{b},\vec{c},\vec{d}$ in the wrong order, or because the shape is not a rectangle at all.

*Perhaps you expected this endnote to contain a caveat. Sorry, the problem is interesting because I say so.

**We could simply choose $\lambda = a'_z$ rather than the other way round, but this is less obviously valid.

***This requires that we know where the point $(0,0,\lambda)$ is in the photograph's 2D co-ordinates. If the photograph hasn't been cropped, and the camera's field of view is symmetric, then this will be the centre of the photograph.

# Matrix Multiplication and Pascal's Triangle

Students who learn about matrix multiplication before linear map composition often think the definition seems arbitrary and unmotivated. (Why not just multiply each component separately, the way addition and scalar multiplication work?) Linear maps are the major reason we use matrices, and matrix multiplication really only makes sense once you figure this out.

But still, any grid of numbers is by definition a matrix, whether or not those numbers came from the co-ordinates of images of basis vectors under a linear map. Whatever context the data in the grid came from, we can do matrix multiplication. The matrix may even be interpreted post hoc as linear map. What's surprising is how often this bonus linear algebraic structure ends up being useful and elegant.

A few examples:

• A graph or network has two natural representations as a grid of numbers; an adjacency matrix $A$ gives the connections between nodes and other nodes, and an incidence matrix $B$ gives the two endpoints of each edge. $(A^n)_{i,j}$ happens to give the number of routes of length $n$ between nodes $i$ and $j$, and $B$ has a post hoc interpretation as the boundary map, which transforms a vector $\mathbf{e}$ of edges into a vector $B\mathbf{e}$ of nodes such that $\mathbf{e}$ is a disjoint union of paths with distinct endpoints $B\mathbf{e}$.
• Markov chain is a random process with a set of states, and each state has a probability distribution for determining the next state. If $P_{i,j}$ is the probability that state $i$ will evolve into state $j$, then it turns out that $(P^n)_{i,j}$ is the probability that state $i$ will evolve into state $j$ after $n$ steps.
• A multi-variable function $f(\mathbf{x})$ has a Hessian matrix $$H_{i,j} = \frac{\partial f}{\partial x_i \, \partial x_j}$$ which appears as a quadratic form $\newcommand\transp{^{\mathsf{T}}} \frac{1}{2}\mathbf{x}\transp H \mathbf{x}$ in $f$'s Taylor series, and its eigenvalues determine the nature of $f$'s stationary points.
• A set of random variables $X_i$ has a covariance matrix $C_{i,j} = \operatorname{Cov}(X_i, X_j)$. Given a linear combination $Y = \sum y_j X_j$, matrix multiplication strikes again: $(C \mathbf{y})_i = \operatorname{Cov}(Y, X_i)$.

How about Pascal's triangle? $$P = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 1 & 1 & 0 & 0 & \cdots \\ 1 & 2 & 1 & 0 & \cdots \\ 1 & 3 & 3 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$ Of course it's a matrix. The numbers are determined by the first row, and the formula $P_{i+1,j+1}$ $= P_{i,j} + P_{i,j+1}$. We also know this matrix as $\newcommand\ncr[2]{{}^{#1} \mathrm{C}_{#2}} P_{n,r} = \ncr{n}{r}$.

Alternatively we can write it as a symmetric matrix: $$S = \begin{pmatrix} 1 & 1 & 1 & 1 & \cdots \\ 1 & 2 & 3 & 4 & \cdots \\ 1 & 3 & 6 & 10 & \cdots \\ 1 & 4 & 10 & 20 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$ $P$ and $S$ are just grids of numbers; they're defined by a simple combinatorial rule, and there is no linear algebra in sight. Surely matrix multiplication is meaningless here? But in fact, $PP\transp = S$: $$\left(PP\transp\right) {}_{\!i,j} = \sum P_{i,k} \left(P^{\mathsf{T}}\right) {}_{\!k,j} = \underbrace{\sum \ncr{i}{k} \, \ncr{j}{k} = \ncr{i+j}{i}} = S_{i,j}$$ as required.

Let's take a look at that third step more closely - what we're saying is, the number of ways to choose $i$ items from a set of $i+j$ is the same as the number of ways to choose some number $k$ items from a set of $i$ and another $k$ from a set of $j$. There is indeed a natural correspondence between such choices; for example, how many ways are there of choosing $3$ of the symbols A,B,C,X,Y?

 $k = 0$ ABC $k = 1$ AABBCC XYXYXY BCXBCYACXACYABXABY $k = 2$ ABACBC XYXYXY CXYBXYAXY

This yields all $10 = \ncr{3+2}{3}$ ways. The correspondence is simple; choose $k$ of the first set A,B,C not to use, and choose $k$ of the second set X,Y to use.

A consequence of this fact is that if we truncate $S$ so that it is a finite square matrix, then its determinant is $1$. This is not difficult to show: $S = PP\transp$ is a product of a lower-triangular and upper-triangular matrix, each of which has only $1$s on the diagonal.

(This insight is lost on MATLAB:

>> det(pascal(20))ans = 1007.44031415979>> det(sym(pascal(30)))ans = 209

Thanks, IEEE 754.)

Can $P$ be interpreted as a linear map? In fact, it can. $P$ is the matrix of binomial coefficients, so binomial expansions are a good context to consider. If we identify polynomials $f(x) = a_0 + a_1 x + \ldots + a_n x^n$ with row vectors $[f(x)] = (a_0~a_1~\cdots~a_n)$, then $[1]\,P = [1]$, $[x]\,P = [1 + x]$, $[x^2]\,P$ $= [1 + 2x + x^2]$ $= [(1+x)^2]$, and in general $$[x^n]\,P = \sum P_{n,k} \left[x^k\right] = \left[\sum \ncr{n}{k} \, x^k\right] = \left[(1+x)^n\right]$$ Since matrix multiplication is linear, it follows immediately that $[f(x)] \,P = [f(1+x)]$. Therefore the matrix $P$ represents a linear transformation on polynomials given by the substitution $x \mapsto x+1$. Then we see that $P^a$ represents the substitution $x \mapsto x+a$, and so $(P^a)_{n,r}$ is the coefficient of $x^r$ in the binomial expansion $(x+a)^n$.

Somehow, wherever a grid of numbers comes from, linear algebra finds something to say about it.

Home