This post is 'number 0' in a series of posts about my computational approach to the Zarankiewicz problem.

Abstraction is the distillation of domain-specific problems into their pure mathematical essence. Much like the distillation of liquids, a wide variety of different substances often turn out to have the same essence (such as alcohol).

Here's a problem from the domain of electronics, via the same correspondent who posed the Card & Symbol Game problem:

I was typing along and the keyboard abruptly stopped sending the host anything—right in the middle of a word, I think, even. Further investigation indicated that it was probably a failure of the electronics in the keyboard.

So I took it apart. There's the keyboard proper, with two ribbon-ish cables running to sockets on a PCB. The PCB contains one 'big chip' (a 40-pin DIP part), three small common logic chips, and assorted other parts.

The keyboard has 95 keys; the keys are switches which allow current to flow from the output pins (on the 'big chip') to the input pins, so that key-presses can be detected. Here's (an abstract representation of) how it works:

But the 'big chip' only has 40 pins; to handle 95 keys, some pins must be used for more than one key. We will still be able to tell which key is pressed, so long as two key-switches don't connect the same output pin to the same input pin.

Although 'J' and 'K' are connected to the same input pin, the 'big chip' can distinguish between them by activating the output pins one at a time.

In this example, we can tell that 'K' and 'N' are both pressed: current flows from the output pin and is detected on both input pins, so both switches must have been closed.

Unfortunately, this fails when more keys are pressed: 'J'+'K'+'M' is indistinguishable from 'J'+'K'+'M'+'N'. It's easy to check that activating the second output pin instead of the third doesn't help determine whether 'N' is pressed. As our correspondent describes:

Unfortunately, it also turns out [the manufacturer] cheaped out and skipped the diodes in series with the keyswitches. This means that whenever you have three keys down that form three corners of a rectangle in the matrix, it's impossible to tell whether the key at the fourth corner is down.

Although the switches are meant to allow current to flow from the vertical wires to the horizontal wires, the 'J' key also allows current from the 'K' key to flow down to the 'M' key. Since 'M' is also pressed, the second input pin receives current - and 'N' incorrectly appears to be pressed. (Furthermore, all combinations of three of 'J','K','M','N' are indistinguishable in this way.)
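The ghosting effect is easy to reproduce in a few lines. This is a minimal sketch under idealised assumptions: the pin names and the placement of 'J', 'K', 'M' and 'N' are reconstructed from the description above, inactive output pins are treated as disconnected, and a closed switch conducts in both directions.

```python
# Hypothetical layout reconstructed from the description above: each key is a
# switch joining one output pin to one input pin. The pin names are assumptions.
LAYOUT = {
    "J": ("out2", "in1"),
    "K": ("out3", "in1"),
    "M": ("out2", "in2"),
    "N": ("out3", "in2"),
}
OUTPUTS = sorted({out_pin for out_pin, _ in LAYOUT.values()})


def scan(pressed):
    """Return the (output, input) pairs at which current is detected."""
    # Without diodes a closed switch conducts in both directions, so the
    # closed switches form an undirected graph on the wires.
    neighbours = {}
    for key in pressed:
        out_pin, in_pin = LAYOUT[key]
        neighbours.setdefault(out_pin, set()).add(in_pin)
        neighbours.setdefault(in_pin, set()).add(out_pin)

    detected = set()
    for active in OUTPUTS:
        # Activate one output pin and follow the current wherever it can go.
        reached = {active}
        frontier = [active]
        while frontier:
            wire = frontier.pop()
            for nxt in neighbours.get(wire, ()):
                if nxt not in reached:
                    reached.add(nxt)
                    frontier.append(nxt)
        detected |= {(active, wire) for wire in reached if wire.startswith("in")}
    return detected


# Two keys on one output pin are read correctly.
print(scan("KN") == {("out3", "in1"), ("out3", "in2")})
# Three corners of a rectangle look exactly like all four corners.
print(scan("JKM") == scan("JKMN"))
```

Every three-key subset of these four keys produces the same scan result as all four keys together, which is the ambiguity described above.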

Is there a way to prevent this from happening without using diodes?

Our first observation is that although we have a matrix of wires from the output pins to the input pins, not every crossing needs to be connected by a key-switch. If the 'N' key-switch were removed, current on the second input pin could not represent it being pressed. Generalising from this, we see that if no four key-switches form a rectangle in the matrix, then any combination of up to 3 key-presses can be uniquely determined by which input pins receive current from which output pins.

So, we need to design a matrix where some crossings are connected by key-switches, but no four key-switches form a rectangle. Here's an example using eight pins for nine key-switches:$$\begin{pmatrix} 1&1&1&0 \\ 1&0&0&1 \\ 0&1&0&1 \\ 0&0&1&1 \end{pmatrix}$$In this notation, $1$ indicates a key-switch and $0$ indicates no key-switch. Our goal is to connect $95$ key-switches in an $n{\times}m$ matrix with $n+m \le 40$, so it's natural to ask how many $1$s we can fit in a matrix of given dimensions, subject to the constraint that no four $1$s lie on the corners of a rectangle.

The distillation is complete, and now we have alcohol.

Definition. A rectangle-free matrix (or RFM) is a $\{0,1\}$-matrix such that no four $1$s lie on the corners of a rectangle; i.e. if $i \ne j$ and $k \ne l$, then $A_{i,k}$, $A_{i,l}$, $A_{j,k}$ and $A_{j,l}$ are not all $1$.

Definition. The weight $w(A)$ of an RFM $A$ is the number of $1$s in $A$.

Definition. An RFM $A$ is maximum-weight if $w(A) \ge w(B)$ for every RFM $B$ of the same dimensions.

Definition. $z(n,m)$ is the weight of a maximum-weight RFM with $n$ rows and $m$ columns.
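These definitions translate directly into code. The sketch below follows the definition literally (it is not optimised), represents a matrix as a list of rows, and uses the $4{\times}4$ example from earlier; the function names `is_rfm` and `weight` are just illustrative choices.

```python
from itertools import combinations


def weight(a):
    """Number of 1s in the matrix a (a list of rows of 0s and 1s)."""
    return sum(map(sum, a))


def is_rfm(a):
    """True if no four 1s of a lie on the corners of a rectangle."""
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    for i, j in combinations(range(n_rows), 2):      # i != j
        for k, l in combinations(range(n_cols), 2):  # k != l
            if a[i][k] and a[i][l] and a[j][k] and a[j][l]:
                return False
    return True


EXAMPLE = [
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]

print(is_rfm(EXAMPLE), weight(EXAMPLE))
```

Since swapping the two rows or the two columns of a rectangle gives the same rectangle, it is enough to check unordered pairs, which is what `combinations` does.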

In this language, our original question is whether $\max_{n+m \le 40} z(n,m) \ge 95$, but of course as mathematicians we are far more interested in the general case. At this point I went to Google looking for references to this problem, but I wasn't able to find anything relevant. (One of the problems with abstraction is there can be many different characterisations* of the same concept, so there are many different search terms to try.)

Our second observation is that "no four $1$s lie on the corners of a rectangle" is equivalent to "each pair of rows has at most one $1$ in common". But hold on a moment - we've seen $\{0,1\}$-matrices with this property before:

"each pair of cards has exactly one symbol in common."

"Exactly one" is certainly "at most one". Of course, CSGs have more constraints, not only that two rows must have a common $1$, but also that each row has the same number of $1$s. Nonetheless, clearly every CSG is an RFM, and we know from before that every FPP is a CSG. In fact this directly gives us a lower bound good enough to solve the original question:

Proposition. $z(20,20) \ge 96$.

Proof. There is an FPP of order $4$; let $P$ be its incidence matrix. $4^2 + 4 + 1 = 21$ and $4+1 = 5$, so $P$ is $21{\times}21$ and each row and each column has a weight of $5$, so $w(P)$ $= 21 \times 5$ $= 105$. Choose a particular $1$ in $P$, and let $A$ be the matrix formed by deleting its row and column from $P$. Then $A$ is $20{\times}20$ and $w(A)$ $= 105 - 5 - (5 - 1)$ $= 96$. $\square$
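The construction in this proof can be checked numerically. The sketch below assumes a particular way of writing down the plane of order $4$: a circulant matrix built from the set $\{0, 1, 4, 14, 16\}$, which I am taking to be a perfect difference set modulo $21$. That choice is an assumption of this example rather than part of the proof, so the code checks the resulting row intersections instead of relying on it.

```python
from itertools import combinations

# Assumption: D is a perfect difference set modulo 21, i.e. every non-zero
# residue arises exactly once as a difference of two elements of D.
V = 21
D = (0, 1, 4, 14, 16)


def plane_matrix():
    """Circulant 21x21 matrix with P[i][j] = 1 when (j - i) mod 21 lies in D."""
    return [[1 if (j - i) % V in D else 0 for j in range(V)] for i in range(V)]


def delete_row_and_column(a, r, c):
    """Copy of a with row r and column c removed."""
    return [[x for j, x in enumerate(row) if j != c]
            for i, row in enumerate(a) if i != r]


def shared_ones(a):
    """Number of common 1s for every pair of rows of a."""
    supports = [{j for j, x in enumerate(row) if x} for row in a]
    return [len(s & t) for s, t in combinations(supports, 2)]


def weight(a):
    return sum(map(sum, a))


P = plane_matrix()
A = delete_row_and_column(P, 0, 0)  # P[0][0] == 1 because 0 is in D

print(weight(P), weight(A), max(shared_ones(A)))
```

If every pair of rows of `P` shares exactly one column, then `P` has the "exactly one symbol in common" property, and `A` is a $20{\times}20$ RFM of the weight claimed in the proposition.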

The question now is: can we do better than FPPs?

There are two ways to go from here. One way is to write an algorithm to search for maximum-weight RFMs; another is to prove some theorems. Of course I did both. Here are some computational results:

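Values of $z(n,m)$ for $3 \le m \le n \le 17$:

$$\begin{array}{r|rrrrrrrrrrrrrrr}
n \backslash m & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 \\ \hline
3 & 6 \\
4 & 7 & 9 \\
5 & 8 & 10 & 12 \\
6 & 9 & 12 & 14 & 16 \\
7 & 10 & 13 & 15 & 18 & 21 \\
8 & 11 & 14 & 17 & 19 & 22 & 24 \\
9 & 12 & 15 & 18 & 21 & 24 & 26 & 29 \\
10 & 13 & 16 & 20 & 22 & 25 & 28 & 31 & 34 \\
11 & 14 & 17 & 21 & 24 & 27 & 30 & 33 & 36 & 39 \\
12 & 15 & 18 & 22 & 25 & 28 & 32 & 36 & 39 & 42 & 45 \\
13 & 16 & 19 & 23 & 27 & 30 & 33 & 37 & 40 & 44 & 48 & 52 \\
14 & 17 & 20 & 24 & 28 & 31 & 35 & 39 & 42 & 45 & 49 & 53 & 56 \\
15 & 18 & 21 & 25 & 30 & 33 & 36 & 40 & 44 & 47 & 51 & 55 & 58 & 61 \\
16 & 19 & 22 & 26 & 31 & 34 & 38 & 42 & 46 & 50 & 53 & 57 & 60 & 64 & 67 \\
17 & 20 & 23 & 27 & 32 & 36 & 39 & 43 & 47 & 51 & 55 & 59 & 63 & 67 & 70 & 74
\end{array}$$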
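The smallest entries of this table can be reproduced with a naive exhaustive search. To be clear, the sketch below is not the algorithm used to compute the table: it has no bounds or clever pruning and is only practical for very small $n$ and $m$. Rows are stored as bitmasks, and the function name `z` simply mirrors the notation.

```python
def z(n, m):
    """Maximum weight of an n-by-m rectangle-free matrix, by exhaustive search."""
    popcount = [bin(r).count("1") for r in range(1 << m)]
    best = 0

    def extend(chosen, total, limit):
        nonlocal best
        if len(chosen) == n:
            best = max(best, total)
            return
        # Rows can be permuted freely, so only try rows in non-increasing
        # order of bitmask; this removes most duplicate matrices.
        for row in range(limit, -1, -1):
            # Two rows may share at most one column, otherwise the shared
            # columns would give four 1s on the corners of a rectangle.
            if all(popcount[row & other] <= 1 for other in chosen):
                extend(chosen + [row], total + popcount[row], row)

    extend([], 0, (1 << m) - 1)
    return best


for size in (3, 4, 5):
    print(size, z(size, size))
```

The search space grows very quickly with $m$, so anything beyond the first few rows of the table needs a much smarter approach.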

Results for $n \le 2$ are trivial, and results for larger $n,m$ took too long to compute.** I'll write more about computer search in a future post; for now, we'll note that $\max_{n+m \le N} z(n,m)$ is achieved when $n$ and $m$ are as large and equal as possible, and focus our attention on square grids.*** Let's see what some maximum-weight RFMs actually look like:

$3{\times}3$: $\begin{pmatrix}1&1& \\ 1& &1 \\ &1&1\end{pmatrix}$

$4{\times}4$: $\begin{pmatrix}1&1&1& \\ 1& & &1 \\ &1& &1 \\ & &1&1\end{pmatrix}$

$5{\times}5$: $\begin{pmatrix}1&1&1&1& \\ 1& & & &1 \\ &1& & &1 \\ & &1& &1 \\ & & &1&1\end{pmatrix}$

$6{\times}6$: $\begin{pmatrix}1&1&1& & & \\ 1& & &1&1& \\ 1& & & & &1 \\ &1& &1& &1 \\ &1& & &1& \\ & &1& &1&1\end{pmatrix}$

$7{\times}7$: $\begin{pmatrix}1&1&1& & & & \\ 1& & &1&1& & \\ 1& & & & &1&1 \\ &1& &1& &1& \\ &1& & &1& &1 \\ & &1&1& & &1 \\ & &1& &1&1& \end{pmatrix}$

For clarity, the $0$s are left blank.

The $3{\times}3$ solution is, of course, the triangle CSG. If you're well-versed in combinatorics then you'll recognise the $7{\times}7$ solution as the Fano plane; additionally, the solution for $z(13,13) = 52$ is indeed the FPP of order $3$. The intermediate sizes are not of the form $n^2 + n + 1$, so they can't be FPPs.

All signs point to FPPs being optimal; can we prove it?

Theorem. If $v = n^2 + n + 1$, then $z(v,v)$ $\le v(n+1)$ with equality if and only if there is a finite projective plane of order $n$ (or $n \le 1$).

Proof. The result is trivial for $n \le 1$, so suppose $n \ge 2$.

Let $A$ be a $v{\times}v$ RFM, and consider combinations of two columns of $A$. For each combination, at most one row of $A$ can have $1$s in both columns. Let $w_i$ be the weight of row $i$; the $1$s in row $i$ account for $\frac{1}{2}w_i(w_i-1)$ combinations of two columns, and these combinations are distinct for different rows. Furthermore, there are at most $\frac{1}{2}v(v-1)$ combinations in total.

It follows that $\sum \frac{1}{2}w_i(w_i - 1)$ $\le \frac{1}{2}v(v-1)$. This can be rearranged**** to give$$w(A) \le v^2 - \sum (w_i - 1)^2$$This is a convex optimisation problem: while we want to maximise $w(A) = \sum w_i$, we also want to relax the constraint by minimising $\sum (w_i - 1)^2$.

Note that adjusting the $w_i$ does not correspond with actually moving $1$s around in the matrix, and a row weight distribution satisfying the constraint is not necessarily realisable as an RFM. We are only getting an upper bound on $z(v,v)$.

Claim. Without loss of generality, $\left|w_j - w_k\right| \le 1$ for every $j,k$.

Proof of claim. Suppose $w_k \ge w_j + 2$. Then substitute $w_j \mapsto w_j+1$ and $w_k \mapsto w_k-1$; we see that $\sum w_i$ is unchanged, but $\sum (w_i - 1)^2$ decreases by at least $2$, as$$w_j^2 + (w_k-2)^2 = (w_j-1)^2 + (w_k-1)^2 - 2\Delta$$where $\Delta = w_k - w_j - 1$ which by assumption is $\ge 1$. $\square$
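For anyone who wants to check the identity used in the claim, subtracting the new sum of squares from the old one and expanding term by term gives, with $\Delta$ as defined in the claim:

$$\begin{aligned}\left[(w_j-1)^2 - w_j^2\right] + \left[(w_k-1)^2 - (w_k-2)^2\right] &= (1 - 2w_j) + (2w_k - 3) \\ &= 2(w_k - w_j - 1) \\ &= 2\Delta\end{aligned}$$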

It follows that if we choose $w^\star$ with $vw^\star(w^\star-1)$ $\le v(v-1)$ $< v(w^\star+1)w^\star$, then $\sum w_i$ is maximised when some number $0 \le l < v$ of the $w_i$ equal $w^\star+1$ and the other $v-l$ equal $w^\star$, where $l$ is as large as the constraint $\sum w_i(w_i-1) \le v(v-1)$ allows.

In the case that $v = n^2 + n + 1$, we can verify that $w^\star = n+1$ satisfies the first inequality exactly, so $l=0$ and every $w_i = n+1$. If $A$ is an RFM with this row weight distribution, then because every combination of two columns is accounted for, the "every two columns share a unique row" property holds and so $A$ is the incidence matrix of an FPP of order $n$. We also have $w(A) = v(n+1)$.

Alternatively, if there is no FPP of order $n$, then there is no $v{\times}v$ RFM with this row weight distribution. Since the $w_i$ differ by at most one, and we cannot have every $w_i \ge n+1$, we must have every $w_i \le n+1$ with some $w_i < n+1$. Therefore $z(v,v) < v(n+1)$ as required. $\square$

This gives yet another characterisation of FPPs - this time as extremal solutions to combinatorial problems. The special case $n \le 1$ could actually have been avoided if we talked about complete CSGs instead of FPPs. Note that the convexity argument applies in other cases too:

Proposition. $z(20,20) \le 97$.

Proof. $w^\star = 4$ and $l = 17$, giving $w(A)$ $\le (20-17) \times 4 + 17 \times (4+1)$ $= 97$. $\square$
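The values of $w^\star$ and $l$ can be computed for any $v$ by following the convexity argument step by step. This is a minimal sketch for square matrices only; the name `convexity_bound` is made up for this example, and the output is only an upper bound on $z(v,v)$.

```python
def convexity_bound(v):
    """Return (w_star, l, bound) from the row-weight argument for z(v, v)."""
    # Largest w_star with v * w_star * (w_star - 1) <= v * (v - 1).
    w_star = 1
    while (w_star + 1) * w_star <= v - 1:
        w_star += 1

    # With every row at weight w_star, this much of the budget
    # sum w_i (w_i - 1) <= v (v - 1) is still unused.
    slack = v * (v - 1) - v * w_star * (w_star - 1)

    # Raising one row from w_star to w_star + 1 costs
    # (w_star + 1) * w_star - w_star * (w_star - 1) = 2 * w_star.
    l = slack // (2 * w_star)

    return w_star, l, v * w_star + l


for v in (7, 13, 15, 20, 21):
    print(v, convexity_bound(v))
```

For $v = 7, 13, 21$ there is no slack and the bound is $v(n+1)$, as in the theorem. For $v = 15$ the bound comes out as $63$, above the computed value $z(15,15) = 61$ in the table, so the bound is not always attained.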

I came up with this proof on Saturday. You can imagine my disappointment, then, when on Sunday I finally entered the right search terms into Google to find that the problem has been around since 1951, and the correspondence with FPPs was proved in 1958. Oh well!

*One example in this case is that maximum-weight RFMs are equivalent to minimum hitting sets: if we rephrase the problem as putting at least one $0$ in every rectangle using as few $0$s as possible, then we have a hitting set problem where the sets have size $4$ and there are $\frac{1}{4}n(n-1)m(m-1)$ of them. There are many minimum hitting set algorithms designed for small-set cases like this, but I think RFMs have some specific features which a general-purpose algorithm won't be able to exploit.

**My algorithm appears to be a lot faster than another researcher's even without the use of FPPs and convex optimisation to give lower and upper bounds. In this case my algorithm takes just under $14$ minutes to find $z(14,14)$ and just over $4$ hours to find $z(15,15)$; this may be asymptotically similar to Dybizbanski et al.'s algorithm which is feasible up to $z(16,16)$, though they didn't give running times. Using bounds to eliminate most of the search tree, my algorithm is able to find $z(16,17)$ in $83$ minutes and $z(17,17)$ in $2$ hours on a 1.9GHz processor. Compared to the usual standards of computational combinatorics, I am quite impatient. More in a future post.

***Intuitively, for fixed $n+m$, square grids have more space to fit more $1$s. Actually, this is still a conjecture, and may even be false.

****$\sum w_i(w_i - 1)$ $= \sum (w_i - 1)^2 + \sum(w_i - 1)$ $= \sum(w_i-1)^2 + w(A) - v$. This must be $\le v(v-1)$, giving the result.

Postscript. The computed values given above are consistent with the literature. I became a little worried when Damásdi et al. gave the value of $z(15,15)$ as $60$, citing a book from 1969; one of the authors did publish a correction soon after. My computed solution of weight $61$ is given below.

X X X X X : : : : : : : : : :
X : : : : X X X : : : : : : :
X : : : : : : : X X X : : : :
X : : : : : : : : : : X X X :
: X : : : X : : X : : X : : :
: X : : : : X : : X : : X : :
: X : : : : : X : : X : : : X
: : X : : X : : : : : : X : X
: : X : : : X : : : X X : : :
: : X : : : : X : X : : : X :
: : : X : X : : : : X : : X :
: : : X : : : X X : : : X : :
: : : X : : : : : X : X : : X
: : : : X X : : : X : : : : :
: : : : X : X : X : : : : X X
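The grid above can be verified mechanically. The following sketch parses it (with `X` for a $1$ and `:` for a $0$) and checks the weight and the "each pair of rows has at most one $1$ in common" property; the helper names are illustrative.

```python
from itertools import combinations

GRID = """\
X X X X X : : : : : : : : : :
X : : : : X X X : : : : : : :
X : : : : : : : X X X : : : :
X : : : : : : : : : : X X X :
: X : : : X : : X : : X : : :
: X : : : : X : : X : : X : :
: X : : : : : X : : X : : : X
: : X : : X : : : : : : X : X
: : X : : : X : : : X X : : :
: : X : : : : X : X : : : X :
: : : X : X : : : : X : : X :
: : : X : : : X X : : : X : :
: : : X : : : : : X : X : : X
: : : : X X : : : X : : : : :
: : : : X : X : X : : : : X X
"""


def parse(grid):
    """Turn the text grid into a list of rows of 0s and 1s."""
    return [[1 if cell == "X" else 0 for cell in line.split()]
            for line in grid.strip().splitlines()]


def is_rfm(a):
    """True if every pair of rows has at most one 1 in common."""
    supports = [{k for k, x in enumerate(row) if x} for row in a]
    return all(len(s & t) <= 1 for s, t in combinations(supports, 2))


A = parse(GRID)
print(len(A), len(A[0]), sum(map(sum, A)), is_rfm(A))
```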