Here's an interesting* problem I had to solve about a year ago. Suppose we have, for example, a photograph of a rectangular object taken from some angle. We can't necessarily know the size of the rectangle, because its co-ordinates in the image depend on its distance from the camera, the camera's field of view, and the image resolution. But can we work out its proportions, i.e. the aspect ratio, height divided by width?

A picture is worth $10^3$ words:

For convenience, let's put the camera (or the eye) at the origin, and let the photograph be a projection onto the plane $z = \lambda$, where $\lambda > 0$ is an unknown constant which depends on the parameters of the camera. $\renewcommand\vec[1]{\mathbf{#1}}\vec{a},\vec{b},\vec{c},\vec{d}$ are the 2D position vectors of the rectangle's corners in the photograph, and the camera projection is $$\pi\begin{pmatrix}x \\ y \\ z\end{pmatrix} = \begin{pmatrix}\lambda x / z \\ \lambda y / z \\ \lambda \end{pmatrix}$$
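As a minimal sketch of this projection in code (the function name `project` and the check that the point is in front of the camera are assumptions for illustration, not part of the setup above):

```python
def project(point, lam):
    """Project a 3D point onto the image plane z = lam.

    Returns only the 2D image co-ordinates; the third component of the
    projection is always lam, so it is dropped here.
    """
    x, y, z = point
    if z <= 0:
        # Assumption of this sketch: only points in front of the camera are valid.
        raise ValueError("point must be in front of the camera (z > 0)")
    return (lam * x / z, lam * y / z)


# A point twice as far away as the image plane lands at half its x, y offsets.
print(project((2.0, 4.0, 2.0), lam=1.0))  # (1.0, 2.0)
```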

Let $\vec{a}',\vec{b}',\vec{c}',\vec{d}'$ be the position vectors of the actual rectangle's corners in 3D space. We are more interested in $\vec{u}$ and $\vec{v}$, the vectors along the two sides that meet at $\vec{a}'$, since the aspect ratio $\newcommand\abs[1]{\left|#1\right|} \abs{\vec{v}} \, / \, \abs{\vec{u}}$ is what we're trying to find. Since we know the shape is a rectangle, we can write down some facts:

• $\vec{b}' = \vec{a}' + \vec{u}$,
• $\vec{c}' = \vec{a}' + \vec{v}$,
• $\vec{d}' = \vec{a}' + \vec{u} + \vec{v}$ because opposite sides of a rectangle are parallel, and
• $\vec{u} \cdot \vec{v} = 0$ because adjacent sides of a rectangle are perpendicular.

We don't know how far away from the camera the rectangle actually is; however, we don't need to. The aspect ratio will be preserved if we scale $\vec{a}',\vec{b}',\vec{c}',\vec{d}'$ by any factor, so without loss of generality,** let's choose $a'_z = \lambda$. Since $\vec{a} =\pi(\vec{a}')$, it follows immediately that $a'_x = a_x$ and $a'_y = a_y$.

Now let's try to solve for $\vec{u},\vec{v}$. We're assuming that $\vec{a},\vec{b},\vec{c},\vec{d}$ are known, because they can be read straight off from the photograph.***

• $\vec{b} = \pi(\vec{a}' + \vec{u})$, so $b_x = \dfrac{\lambda(a_x + u_x)}{\lambda + u_z}$ and $b_y = \dfrac{\lambda(a_y + u_y)}{\lambda + u_z}$.
• $\vec{c} = \pi(\vec{a}' + \vec{v})$, so $c_x = \dfrac{\lambda(a_x + v_x)}{\lambda + v_z}$ and $c_y = \dfrac{\lambda(a_y + v_y)}{\lambda + v_z}$.
• $\vec{d} = \pi(\vec{a}' + \vec{u} + \vec{v})$, so $d_x = \dfrac{\lambda(a_x + u_x + v_x)}{\lambda + u_z + v_z}$ and $d_y = \dfrac{\lambda(a_y + u_y + v_y)}{\lambda + u_z + v_z}$.

After rearranging, all of this can be written as a single matrix equation: $$\begin{pmatrix}1 &0& -b_x &0&0&0 \\ 0& 1 & -b_y &0&0&0 \\ 0&0&0& 1 &0& -c_x \\ 0&0&0&0& 1 & -c_y \\ 1 &0& -d_x & 1 &0& -d_x \\ 0& 1 & -d_y &0& 1 & -d_y \end{pmatrix} \begin{pmatrix} u_x \\ u_y \\ u_z/\lambda \\ v_x \\ v_y \\ v_z/\lambda \end{pmatrix} = \begin{pmatrix} b_x - a_x \\ b_y - a_y \\ c_x - a_x \\ c_y - a_y \\ d_x - a_x \\ d_y - a_y \end{pmatrix}$$ which can be solved to find $u_x$, $u_y$, $\frac{u_z}{\lambda}$, $v_x$, $v_y$ and $\frac{v_z}{\lambda}$.
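To see where the rows come from, take the equation for $b_x$ and multiply both sides by $(\lambda + u_z)/\lambda$: $$b_x\left(1 + \frac{u_z}{\lambda}\right) = a_x + u_x ~~~\implies~~~ u_x - b_x\,\frac{u_z}{\lambda} = b_x - a_x$$ which is the first row. The same step applied to $d_x$ gives $$d_x\left(1 + \frac{u_z}{\lambda} + \frac{v_z}{\lambda}\right) = a_x + u_x + v_x ~~~\implies~~~ u_x - d_x\,\frac{u_z}{\lambda} + v_x - d_x\,\frac{v_z}{\lambda} = d_x - a_x$$ which is the fifth row; the remaining rows follow in the same way.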

This gives us $u_z$ and $v_z$ only up to the unknown factor $\lambda$, so we need to know $\lambda$. Fortunately, we have one more equation: $\vec{u} \cdot \vec{v} = 0$. In particular, $$u_x v_x + u_y v_y + \lambda^2 \left(\frac{u_z}{\lambda}\right) \left(\frac{v_z}{\lambda}\right) = 0~~~\implies~~~\lambda = \sqrt{ \frac{-u_x v_x - u_y v_y}{\left(\frac{u_z}{\lambda}\right) \left(\frac{v_z}{\lambda}\right)} }$$ where we take the positive root because $\lambda > 0$.

What a mess! Fortunately, the mathematician's job ends here and the computer takes over. The remainder is trivial: $u_z = \lambda(\frac{u_z}{\lambda})$ and $v_z = \lambda(\frac{v_z}{\lambda})$, hence we can compute $\abs{\vec{v}}\,/\,\abs{\vec{u}}$. The end.
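Here is a rough sketch of the computer's half of the job, using only the Python standard library. Rather than handing the full $6{\times}6$ system to a linear solver, it substitutes the first four rows into the last two by hand, which leaves a $2{\times}2$ system in $p = u_z/\lambda$ and $q = v_z/\lambda$; that shortcut, the name `aspect_ratio` and the error messages are all assumptions of this sketch rather than anything prescribed above.

```python
import math


def aspect_ratio(a, b, c, d):
    """Estimate |v| / |u| from the image corners a, b, c, d (2D tuples).

    Assumes co-ordinates are measured from the image of (0, 0, lambda) and
    that b, c, d are the images of a' + u, a' + v, a' + u + v respectively.
    """
    ax, ay = a
    bx, by = b
    cx, cy = c
    dx, dy = d

    # Rows 1-4 give u_x, u_y, v_x, v_y in terms of p = u_z/lambda and
    # q = v_z/lambda. Substituting into rows 5-6 leaves a 2x2 system:
    #   (bx - dx) p + (cx - dx) q = ax + dx - bx - cx
    #   (by - dy) p + (cy - dy) q = ay + dy - by - cy
    det = (bx - dx) * (cy - dy) - (cx - dx) * (by - dy)
    if det == 0:
        raise ValueError("b, c, d are collinear; the system is singular")
    rx = ax + dx - bx - cx
    ry = ay + dy - by - cy
    p = (rx * (cy - dy) - (cx - dx) * ry) / det  # Cramer's rule
    q = ((bx - dx) * ry - rx * (by - dy)) / det

    # Back-substitute into rows 1-4.
    ux, uy = bx - ax + bx * p, by - ay + by * p
    vx, vy = cx - ax + cx * q, cy - ay + cy * q

    # Perpendicularity of u and v fixes lambda^2.
    if p * q == 0:
        raise ValueError("u_z or v_z is zero; lambda cannot be recovered")
    lam_sq = -(ux * vx + uy * vy) / (p * q)
    if lam_sq < 0:
        raise ValueError("lambda^2 < 0; check the corner order or the shape")

    # u_z^2 = lambda^2 p^2 and v_z^2 = lambda^2 q^2.
    u_len = math.sqrt(ux * ux + uy * uy + lam_sq * p * p)
    v_len = math.sqrt(vx * vx + vy * vy + lam_sq * q * q)
    return v_len / u_len


# Synthetic check: lambda = 1, a' = (1, 1, 1), u = (2, 0, 1), v = (-1, 3, 2),
# so the true ratio is sqrt(14 / 5), roughly 1.673.
ratio = aspect_ratio((1.0, 1.0), (1.5, 0.5), (0.0, 4.0 / 3.0), (0.5, 1.0))
print(ratio, math.sqrt(14 / 5))  # the two values should agree closely
```

The exact comparisons against zero are only placeholders; with real pixel measurements a tolerance would be needed instead.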

Wait...

It would be remiss not to include some information about when this doesn't work. There are three things that might go wrong: our $6{\times}6$ matrix might be singular, or computing $\lambda$ might result in either division by zero or the square root of a negative number.

It's possible to check by inspection that the matrix is singular when $b_x = c_x = d_x$ or when $b_y = c_y = d_y$, but this isn't exhaustive. Fortunately, computers know how to do algebra, and MATLAB easily gives the determinant: the matrix is singular if and only if $$b_x c_y + c_x d_y + d_x b_y - b_y c_x - c_y d_x - d_y b_x = 0$$

How to interpret this? We expect that the task should be impossible when $\vec{b},\vec{c},\vec{d}$ are collinear, but we aren't sure whether collinearity is the only way for the matrix to be singular, so let's check. The determinant $$\det \begin{pmatrix} b_x & c_x & d_x \\ b_y & c_y & d_y \\ 1 & 1 & 1 \end{pmatrix} = b_x c_y + c_x d_y + d_x b_y - b_y c_x - c_y d_x - d_y b_x$$ gives the signed volume of the parallelepiped spanned by the position vectors of three points in the plane $z = 1$, so the determinant is zero if and only if those three vectors lie in a common plane through the origin. This occurs precisely when the three points are collinear, because the intersection of such a plane with the plane $z = 1$ is a line, and conversely, given a line in $z = 1$, there is a plane through the origin containing it.

This only happens when the camera lies in the plane of the rectangle. It would be rather difficult for a photographer to make this mistake by accident.
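In code, the singularity test is a one-liner, though with measured co-ordinates the determinant will rarely be exactly zero. The sketch below therefore also compares it against a relative tolerance; the function names and the default `rel_tol` are arbitrary choices for illustration.

```python
import math


def collinearity_det(b, c, d):
    """The 3x3 determinant from the text; zero exactly when b, c, d are collinear."""
    (bx, by), (cx, cy), (dx, dy) = b, c, d
    return bx * cy + cx * dy + dx * by - by * cx - cy * dx - dy * bx


def is_degenerate(b, c, d, rel_tol=1e-9):
    """True if b, c, d are collinear to within rel_tol (an arbitrary threshold)."""
    # The determinant equals the 2D cross product of (b - d) and (c - d), so
    # dividing by the two lengths gives |sin| of the angle between them.
    scale = math.hypot(b[0] - d[0], b[1] - d[1]) * math.hypot(c[0] - d[0], c[1] - d[1])
    return abs(collinearity_det(b, c, d)) <= rel_tol * scale


print(is_degenerate((0, 0), (1, 1), (2, 2)))  # True
print(is_degenerate((0, 0), (1, 0), (0, 1)))  # False
```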

Murphy's theorem dictates that the formula for $\lambda$ breaks in many more circumstances. The "division by zero" error happens when either $u_z/\lambda$ or $v_z/\lambda$ comes out to be zero. Of course $\lambda$ isn't actually $\infty$; rather, our formula for $\lambda$ fails because $u_z = 0$ or $v_z = 0$ (or both).

If only one is zero, then we don't have enough information to calculate the other. This occurs when two sides of the rectangle's image are parallel. A quick sanity check indicates that in this case, we really can't compute the aspect ratio from the co-ordinates in the photograph alone:

My hands aren't stable enough to get them exactly equal, but you get the idea.
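To see why parallel sides are the symptom, suppose $u_z = 0$ but $v_z \neq 0$. Then $\vec{a}'$ and $\vec{b}'$ both sit at depth $\lambda$, while $\vec{c}'$ and $\vec{d}'$ both sit at depth $\lambda + v_z$, so the projection gives $$\vec{b} - \vec{a} = \begin{pmatrix} u_x \\ u_y \end{pmatrix}, \qquad \vec{d} - \vec{c} = \frac{\lambda}{\lambda + v_z} \begin{pmatrix} u_x \\ u_y \end{pmatrix}$$ The two image sides are parallel and differ only in length.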

If both are zero then we probably got lucky and photographed the rectangle dead on, in which case not only do we have $\vec{a}' = \vec{a}$ but also $\vec{b}' = \vec{b}$, $\vec{c}' = \vec{c}$ and $\vec{d}' = \vec{d}$, and the problem is trivial. (We should still check if $\vec{u}\cdot\vec{v} = 0$.) However, $\lambda$ could instead be very large, in which case $u_z / \lambda$ and $v_z / \lambda$ will of course come out near zero, and the sides in our image will be so close to parallel that we can't reliably estimate the aspect ratio due to numerical instability.

The "square root of a negative number" error happens either because of numerical instability in one of the above cases, or because we labelled the corners $\vec{a},\vec{b},\vec{c},\vec{d}$ in the wrong order, or because the shape is not a rectangle at all.
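One way to turn these failure cases into explicit checks is sketched below. It takes the six solved quantities, with `p` and `q` standing for $u_z/\lambda$ and $v_z/\lambda$, and either returns $\lambda$ or reports which case was hit. The function name, the `None` return for the head-on case and the tolerance `tol` are all assumptions made for illustration.

```python
import math


def recover_lambda(ux, uy, p, vx, vy, q, tol=1e-9):
    """Return lambda, or None when the rectangle is seen head-on.

    p and q stand for u_z/lambda and v_z/lambda from the linear solve.
    Raises ValueError for the failure cases; tol is an arbitrary threshold.
    """
    dot_xy = ux * vx + uy * vy
    p_zero, q_zero = abs(p) <= tol, abs(q) <= tol

    if p_zero and q_zero:
        # Head-on view: u_z = v_z = 0, so lambda is not needed, but the
        # image itself must then already be a rectangle.
        if abs(dot_xy) > tol * math.hypot(ux, uy) * math.hypot(vx, vy):
            raise ValueError("image sides are not perpendicular: not a rectangle")
        return None
    if p_zero or q_zero:
        raise ValueError("one pair of image sides is parallel: lambda is undetermined")

    lam_sq = -dot_xy / (p * q)
    if lam_sq <= 0:
        raise ValueError("lambda^2 <= 0: wrong corner order, not a rectangle, or noise")
    return math.sqrt(lam_sq)


print(recover_lambda(2.0, 0.0, 1.0, -1.0, 3.0, 2.0))  # 1.0
```

Note that a fixed tolerance cannot tell a genuinely head-on photograph from one with a very large $\lambda$; both end up in the first branch.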

*Perhaps you expected this endnote to contain a caveat. Sorry, the problem is interesting because I say so.

**We could simply choose $\lambda = a'_z$ rather than the other way round, but this is less obviously valid.

***This requires that we know where the point $(0,0,\lambda)$ is in the photograph's 2D co-ordinates. If the photograph hasn't been cropped, and the camera's field of view is symmetric, then this will be the centre of the photograph.