This post is the second in a series of posts about my computational approach to the Zarankiewicz problem.

Let's begin with a quick summary of what the problem is. You have a grid with $m$ rows and $n$ columns, and in each cell of that grid you must write either a $1$ or a $0$. The objective is to fit as many $1$s as you possibly can, but no four $1$s may lie on the corners of a rectangle.*

$$\begin{pmatrix}\bbox[3pt,white]{1}&1&\bbox[3pt,white]{0} \\ 1&0&1 \\ \bbox[3pt,white]{0}&1&\bbox[3pt,white]{1}\end{pmatrix}~~~\begin{pmatrix} \bbox[3pt,#FF8080]{1}&0&\bbox[3pt,#FF8080]{1} \\ 1&1&0 \\ \bbox[3pt,#FF8080]{1}&0& \bbox[3pt,#FF8080]{1} \end{pmatrix}$$Left: a solution. Right: an invalid grid.

If you want to solve a discrete optimisation problem like this, the simplest thing to do is: try everything, throw away the things that aren't valid, measure the ones that are, and keep the best one. This is the "brute force" algorithm, though really it's a class of algorithms, because the definitions of "everything", "valid" and "measure" will vary depending on the problem being solved. Formally, we have a set of candidate solutions $X$, a constraint on $X$, and a profit function $\pi : X \to \mathbb{R}$.

Brute force is notoriously slow for combinatorial problems; $X$ is usually extremely large, even when the numbers in the problem are small. In the Zarankiewicz problem, we have $mn$ cells in the grid, and $2$ choices of what to write in each cell, so there are $2^{mn}$ possible ways of completing the grid. For a $5{\times}5$ grid, this number is about 34 million, for a $7{\times}7$ grid it's about 563 trillion. By the time we get to $31{\times}31$, the number of things to try is 290 digits long; we'll never be able to try that many things.**

 

But if we don't try everything, how can we be sure to find the best one? This is the challenge.

In fact, sometimes we really do have to try everything. We need problems like this in security: consider a combination lock, for example.

undefined

(Image modified from this one.)

To open this lock, you need all three cylinders in their correct orientations, so that the internal gaps align with where the rod needs to move. If the lock is well-designed, the only way to discover the correct combination is to try them all until you find the one which works. This is the brute force algorithm, where $X$ is "all sequences of three digits, from 000 to 999", there is no constraint, and $\pi$ is $1$ or $0$ depending on whether the lock opens or not.*** This takes up to $1000$ tries.

If there is an algorithm faster than brute-force, the combination lock has a design flaw. The image above suggests one, which is indeed a common weakness in cheap locks: if the rod can move through the first cylinder, we know the first digit is correct; if it also moves through the second cylinder, we know the second digit is also correct. This represents a structure in the problem - if we "measure" a combination $\renewcommand{\vec}{\mathbf} \vec{a} = (a_1,a_2,a_3)$ by the number of cylinders $\pi(\vec{a})$ the rod can move through, then whenever $\pi(\vec{a}) = 0$ and $b_1 = a_1$, we also know $\pi(\vec{b}) = 0$, for example. Importantly, we know this without trying $\vec{b}$.

This can indeed be turned into an efficient algorithm: for $0 \le a_1 \le 9$ try each possible $(a_1,0,0)$ until we find one with $\pi(\vec{a}) \ge 1$. Then for $0 \le a_2 \le 9$ try each possible $(a_1,a_2,0)$ until we find one with $\pi(\vec{a}) \ge 2$. Finally for $0 \le a_3 \le 9$ try each possible $(a_1,a_2,a_3)$ until we find one with $\pi(\vec{a}) = 3$. This takes at most $30$ tries: a significant improvement. That's bad for security, but good for computer science.

 

On an abstract level, this works because inspecting one candidate solution gives information about many others. In this example we were able to reject up to $100$ candidates per inspection. This points to a general (but vague) principle of combinatorial optimisation: efficient search algorithms exploit structure within the problem, to reject as many candidates per inspection as possible.

As with the combination lock, this usually requires decomposing the candidate solutions into components - in this case, the individual digits in the sequence. It may be more convenient to represent "unfilled" components explicitly, rather than choosing a default. The result can be interpreted as a tree, in which e.g. $(a_1,a_2,*)$ is "descended from" $(a_1,*,*)$. The presence of a $*$ means the candidate is a partial solution; a non-partial descendant is a completion.

If this is possible, we might be able to find logical relationships between the components and the validity or measurement of the full candidate. These can take various forms:

  1. If $\vec{a}$ is invalid, and $\vec{b}$ is a completion of $\vec{a}$, then $\vec{b}$ is also invalid.
  2. For some function $U$, $\pi(\vec{b}) \le U(\vec{a})$ whenever $\vec{b}$ is a completion of $\vec{a}$.

These allow us to instantly reject all completions of $\vec{a}$, whenever $\vec{a}$ is invalid or $U(\vec{a})$ is too small. There is another kind of structure which many problems have: symmetry.

  1. Candidates $\vec{a}$ can be transformed into other candidates $\sigma\vec{a}$, where $\sigma\vec{a}$ is valid exactly when $\vec{a}$ is, and $\pi(\sigma\vec{a}) = \pi(\vec{a})$.

This allows us to reject $\vec{a}$ whenever $\sigma\vec{a}$ has already been considered.****

 

Returning to the Zarankiewicz problem, one way to decompose the grid is as a sequence of rows. A row itself is a sequence of $1$s and $0$s, of a fixed length, which is convenient for representation on a computer. The rectangle-free constraint can be phrased as "no two rows have more than one $1$ bit in common", which can be tested by bitwise operations.

Most importantly, this decomposition is compatible with the structure of the problem: if a partial solution has a rectangle, then any completion also has a rectangle, as in (1.). Furthermore, by considering the weights (number of $1$s) of the rows in a partial solution, we can give an upper bound on the weight of its completions, as in (2.), based on the convex optimisation bound described in a previous post.

We can also exploit symmetry. The problem has lots of transformations as in (3.): any permutation of rows and columns will do, because swapping rows (or columns) cannot introduce or remove a rectangle, and does not change the weight. As we'll see in later posts, there are many ways that all this symmetry can be exploited, and some are much better than others.

 

*Formally, for $i \ne k$ and $j \ne l$, not all of $A_{i,j}$, $A_{i,l}$, $A_{k,j}$, $A_{k,l}$ can equal $1$. For pedants, only axis-aligned rectangles are forbidden.

**Combinatorial explosion is, for example, why IBM's Deep Blue and Google's AlphaGo can't just be replaced with about 15 lines of code.

***We could instead have defined a combination as "valid" if it opens the lock, but that would be less convenient for the analogy.

****Alternatively, we can reject $\vec{a}$ when $\sigma\vec{a}$ will be considered later, so long as we don't later reject $\sigma\vec{a}$ by circular reasoning. We can make this consistent by comparing $\vec{a}$ and $\sigma\vec{a}$ in some ordering, which need not be the order that candidates are considered in.