The Distributive Law (a + b)c = ac + bc

Keyboards and Projective Geometry

This post is 'number 0' in a series of posts about my computational approach to the Zarankiewicz problem.


Abstraction is the distillation of domain-specific problems into their pure mathematical essence. Much like the distillation of liquids, a wide variety of different substances often turn out to have the same essence (such as alcohol).

Here's a problem from the domain of electronics, via the same correspondent who posed the Card & Symbol Game problem:

I was typing along and the keyboard abruptly stopped sending the host anything—right in the middle of a word, I think, even. Further investigation indicated that it was probably a failure of the electronics in the keyboard.

So I took it apart. There's the keyboard proper, with two ribbon-ish cables running to sockets on a PCB. The PCB contains one 'big chip' (a 40-pin DIP part), three small common logic chips, and assorted other parts.

The keyboard has 95 keys; the keys are switches which allow current to flow from the output pins (on the 'big chip') to the input pins, so that key-presses can be detected. Here's (an abstract representation of) how it works:


But the 'big chip' only has 40 pins; to handle 95 keys, some pins must be used for more than one key. We will still be able to tell which key is pressed, so long as two key-switches don't connect the same output pin to the same input pin.


Although 'J' and 'K' are connected to the same input pin, the 'big chip' can distinguish between them by activating the output pins one at a time. In this example, we can tell that 'K' and 'N' are both pressed: current flows from the output pin and is detected on both input pins, so both switches must have been closed.

Unfortunately, this fails when more keys are pressed. 'J'+'K'+'M' is indistinguishable from 'J'+'K'+'M'+'N':


It's easy to check that activating the second output pin instead of the third doesn't help determine whether 'N' is pressed. As our correspondent describes:

Unfortunately, it also turns out [the manufacturer] cheaped out and skipped the diodes in series with the keyswitches. This means that whenever you have three keys down that form three corners of a rectangle in the matrix, it's impossible to tell whether the key at the fourth corner is down.

Although the switches are meant to allow current to flow from the vertical wires to the horizontal wires, the 'J' key also allows current from the 'K' key to flow down to the 'M' key. Since 'M' is also pressed, the second input pin receives current - and 'N' incorrectly appears to be pressed. (Furthermore, all combinations of three of 'J','K','M','N' are indistinguishable in this way.)

Is there a way to prevent this from happening without using diodes?


Our first observation is that although we have a matrix of wires from the output pins to the input pins, not every crossing needs to be connected by a key-switch. If the 'N' key-switch were removed, current on the second input pin could not represent it being pressed. Generalising from this, we see that if no four key-switches form a rectangle in the matrix, then any combination of up to 3 key-presses can be uniquely determined by which input pins receive current from which output pins.

So, we need to design a matrix where some crossings are connected by key-switches, but no four key-switches form a rectangle. Here's an example using eight pins for nine key-switches:$$\begin{pmatrix} 1&1&1&0 \\ 1&0&0&1 \\ 0&1&0&1 \\ 0&0&1&1 \end{pmatrix}$$In this notation, $1$ indicates a key-switch and $0$ indicates no key-switch. Our goal is to connect $95$ key-switches in an $n{\times}m$ matrix with $n+m \le 40$, so it's natural to ask how many $1$s we can fit in a matrix of given dimensions, subject to the constraint that no four $1$s lie on the corners of a rectangle.

The distillation is complete, and now we have alcohol.

Definition. A rectangle-free matrix (or RFM) is a {0,1}-matrix such that no four $1$s lie on the corners of a rectangle; i.e. if $i \ne j$ and $k \ne l$, then $A_{i,k}$, $A_{i,l}$, $A_{j,k}$ and $A_{j,l}$ are not all $1$.

Definition. The weight $w(A)$ of an RFM $A$ is the number of $1$s in $A$.

Definition. An RFM $A$ is maximum-weight if $w(A) \ge w(B)$ for every RFM $B$ of the same dimensions.

Definition. $z(n,m)$ is the weight of a maximum-weight RFM with $n$ rows and $m$ columns.

In this language, our original question is whether $\max_{n+m \le 40} z(n,m) \ge 95$, but of course as mathematicians we are far more interested in the general case. At this point I went to Google looking for references to this problem, but I wasn't able to find anything relevant. (One of the problems with abstraction is there can be many different characterisations* of the same concept, so there are many different search terms to try.)


Our second observation is that "no four $1$s lie on the corners of a rectangle" is equivalent to "each pair of rows have at most one $1$ in common". But hold on a moment - we've seen $\{0,1\}$-matrices with this property before:

"each pair of cards has exactly one symbol in common."

"Exactly one" is certainly "at most one". Of course, CSGs have more constraints, not only that two rows must have a common $1$, but also that each row has the same number of $1$s. Nonetheless, clearly every CSG is an RFM, and we know from before that every FPP is a CSG. In fact this directly gives us a lower bound good enough to solve the original question:

Proposition. $z(20,20) \ge 96$.

Proof. There is an FPP of order $4$; let $P$ be its incidence matrix. $4^2 + 4 + 1 = 21$ and $4+1 = 5$, so $P$ is $21{\times}21$ and each row and each column has a weight of $5$, so $w(P)$ $= 21 \times 5$ $= 105$. Choose a particular $1$ in $P$, and let $A$ be the matrix formed by deleting its row and column from $P$. Then $A$ is $20{\times}20$ and $w(A)$ $= 105 - 5 - (5 - 1)$ $= 96$. $\square$

The question now is: can we do better than FPPs?


There are two ways to go from here. One way is to write an algorithm to search for maximum-weight RFMs; another is to prove some theorems. Of course I did both. Here are some computational results:

3 6                            
4 7 9                          
5 8 10 12                        
6 9 12 14 16                      
7 10 13 15 18 21                    
8 11 14 17 19 22 24                  
9 12 15 18 21 24 26 29                
10 13 16 20 22 25 28 31 34              
11 14 17 21 24 27 30 33 36 39            
12 15 18 22 25 28 32 36 39 42 45          
13 16 19 23 27 30 33 37 40 44 48 52        
14 17 20 24 28 31 35 39 42 45 49 53 56      
15 18 21 25 30 33 36 40 44 47 51 55 58 61    
16 19 22 26 31 34 38 42 46 50 53 57 60 64 67  
17 20 23 27 32 36 39 43 47 51 55 59 63 67 70 74

Results for $n \le 2$ are trivial, and results for larger $n,m$ took too long to compute.** I'll write more about computer search in a future post; for now, we'll note that $\max_{n+m \le N} z(n,m)$ is achieved when $n$ and $m$ are as large and equal as possible, and focus our attention on square grids.*** Let's see what some maximum-weight RFMs actually look like:

$3{\times}3$ $4{\times}4$ $5{\times}5$
$\begin{pmatrix}1&1& \\ 1& &1 \\ &1&1\end{pmatrix}$ $\begin{pmatrix}1&1&1& \\ 1& & &1 \\ &1& &1 \\ & &1&1\end{pmatrix}$ $\begin{pmatrix}1&1&1&1& \\ 1& & & &1 \\  &1& & &1 \\ & &1& &1 \\ & & &1&1\end{pmatrix}$
$6{\times}6$ $7{\times}7$
$\begin{pmatrix}1&1&1& & & \\ 1& & &1&1& \\ 1& & & & &1 \\ &1& &1& &1 \\ &1& & &1& \\ & &1& &1&1\end{pmatrix}$ $\begin{pmatrix}1&1&1& & & & \\ 1& & &1&1& & \\ 1& & & & &1&1 \\ &1& &1& &1& \\ &1& & &1& &1 \\ & &1&1& & &1 \\ & &1& &1&1& \end{pmatrix}$

For clarity, the $0$s are left blank.

The $3{\times}3$ solution is, of course, the triangle CSG. If you're well-versed in combinatorics then you'll recognise the $7{\times}7$ solution as the Fano plane; additionally, the solution for $z(13,13) = 52$ is indeed the FPP of order $3$. The intermediate sizes are not of the form $n^2 + n + 1$, so they can't be FPPs.


All signs point to FPPs being optimal; can we prove it?

Theorem. If $v = n^2 + n + 1$, then $z(v,v)$ $\le v(n+1)$ with equality if and only if there is a finite projective plane of order $n$ (or $n \le 1$).

Proof. The result is trivial for $n \le 1$, so suppose $n \ge 2$.

Let $A$ be an $v{\times}v$ RFM, and consider combinations of two columns of $A$. For each combination, at most one row of $A$ can have $1$s in both columns. Let $w_i$ be the weight of row $i$; the $1$s in row $i$ account for $\frac{1}{2}w_i(w_i-1)$ combinations of two columns, and these combinations are distinct for different rows. Furthermore, there are at most $\frac{1}{2}v(v-1)$ combinations in total.

It follows that $\sum \frac{1}{2}w_i(w_i - 1)$ $\le \frac{1}{2}v(v-1)$. This can be rearranged**** to give$$w(A) \le v^2 - \sum (w_i - 1)^2$$This is a convex optimisation problem: while we want to maximise $w(A) = \sum w_i$, we also want to relax the constraint by minimising $\sum (w_i - 1)^2$.

Note that adjusting the $w_i$ does not correspond with actually moving $1$s around in the matrix, and a row weight distribution satisfying the constraint is not necessarily realisable as an RFM. We are only getting an upper bound on $z(v,v)$.

Claim. Without loss of generality, $\left|w_j - w_k\right| \le 1$ for every $j,k$.

Proof of claim. Suppose $w_k \ge w_j + 2$. Then substitute $w_j \mapsto w_j+1$ and $w_k \mapsto w_k-1$; we see that $\sum w_i$ is unchanged, but $\sum (w_i - 1)^2$ decreases by at least $2$, as$$w_j^2 + (w_k-2)^2 = (w_j-1)^2 + (w_k-1)^2 - 2\Delta$$where $\Delta = w_k - w_j - 1$ which by assumption is $\ge 1$. $\square$

It follows that by choosing $w^\star$ with $vw^\star(w^\star-1)$ $\le v(v-1)$ $< v(w^\star+1)w^\star$, then $\sum w_i$ is maximised when some number $0 \le l < v$ of the $w_i = w^\star+1$, and the other $v-l$ of the $w_i = w^\star$.

In the case that $v = n^2 + n + 1$, we can verify that $w^\star = n+1$ satisfies the first inequality exactly, so $l=0$ and every $w_i = n+1$. If $A$ is an RFM with this row weight distribution, then because every combination of two columns is accounted for, the "every two columns share a unique row" property holds and so $A$ is the incidence matrix of an FPP of order $n$. We also have $w(A) = v(n+1)$.

Alternatively, if there is no FPP of order $n$, then there is no $v{\times}v$ RFM with this row weight distribution. Since the $w_i$ differ by at most one, and we cannot have every $w_i \ge n+1$, we must have every $w_i \le n+1$ with some $w_i < n+1$. Therefore $z(v,v) < v(n+1)$ as required. $\square$

This gives yet another characterisation of FPPs - this time as extremal solutions to combinatorial problems. The special case $n \le 1$ could actually have been avoided if we talked about complete CSGs instead of FPPs. Note that the convexity argument applies in other cases too:

Proposition. $z(20,20) \le 97$.

Proof. $w^\star = 4$ and $l = 17$, giving $w(A)$ $\le (20-17) \times 4 + 17 \times (4+1)$ $= 97$. $\square$


I came up with this proof on Saturday. You can imagine my disappointment, then, when on Sunday I finally entered the right search terms into Google to find that the problem has been around since 1951, and the correspondence with FPPs was proved in 1958. Oh well!


*One example in this case is that maximum-weight RFMs are equivalent to minimum hitting sets: if we rephrase the problem as putting at least one $0$ in every rectangle using as few $0$s as possible, then we have a hitting set problem where the sets have size $4$ and there are $\frac{1}{4}n(n-1)m(m-1)$ of them. There are many minimum hitting set algorithms designed for small-set cases like this, but I think RFMs have some specific features which a general-purpose algorithm won't be able to exploit.

**My algorithm appears to be a lot faster than another researcher's even without the use of FPPs and convex optimisation to give lower and upper bounds. In this case my algorithm takes just under $14$ minutes to find $z(14,14)$ and just over $4$ hours to find $z(15,15)$; this may be asymptotically similar to Dybizbanski et al.'s algorithm which is feasible up to $z(16,16)$, though they didn't give running times. Using bounds to eliminate most of the search tree, my algorithm is able to find $z(16,17)$ in $83$ minutes and $z(17,17)$ in $2$ hours on a 1.9GHz processor. Compared to the usual standards of computational combinatorics, I am quite impatient. More in a future post.

***Intuitively, for fixed $n+m$, square grids have more space to fit more $1$s. Actually, this is still a conjecture, and may even be false.

****$\sum w_i(w_i - 1)$ $= \sum (w_i - 1)^2 + \sum(w_i - 1)$ $= \sum(w_i-1)^2 + w(A) - v$. This must be $\le v(v-1)$, giving the result.


Postscript. The computed values given above are consistent with the literature. I became a little worried when Damásdi et al. gave the value of $z(15,15)$ as $60$, citing a book from 1969; one of the authors did publish a correction soon after. My computed solution of weight $61$ is given below.

X X X X X : : : : : : : : : :
X : : : : X X X : : : : : : :
X : : : : : : : X X X : : : :
X : : : : : : : : : : X X X :
: X : : : X : : X : : X : : :
: X : : : : X : : X : : X : :
: X : : : : : X : : X : : : X
: : X : : X : : : : : : X : X
: : X : : : X : : : X X : : :
: : X : : : : X : X : : : X :
: : : X : X : : : : X : : X :
: : : X : : : X X : : : X : :
: : : X : : : : : X : X : : X
: : : : X X : : : X : : : : :
: : : : X : X : X : : : : X X

Card Games and Latin Squares

In a previous post, we saw that a card game with some particular properties is essentially the same thing as a finite projective plane (FPP). For some $n \ge 2$,

  • Each card has $n+1$ symbols,
  • Each symbol appears on $n+1$ cards,
  • Any two cards share exactly one common symbol, and
  • Any two symbols appear together on exactly one card.

$n$ is the order of the FPP. All manner of elusive combinatorial structures turn out to be equivalent to, constructible from, or otherwise intimately connected with FPPs. This post is about one particular case: complete sets of mutually orthogonal Latin squares.


Backtrack for a moment. A Latin square of order $n$ is an $n{\times}n$ grid containing the numbers $1,2,\ldots,n$ such that each row and each column contains all $n$ numbers. By the pigeonhole principle this is the same as saying no number appears twice in any row or any column. (If a number appears twice in a row, then there is not enough space left to fit the rest of the numbers in that row.) Here are three Latin squares with $n=4$: $$\mathcal{A} = \begin{bmatrix} 1&2&3&4 \\ 2&1&4&3 \\ 3&4&1&2 \\ 4&3&2&1 \end{bmatrix} \hspace{.5cm} \mathcal{B} = \begin{bmatrix} 1&2&3&4 \\ 3&4&1&2 \\ 4&3&2&1 \\ 2&1&4&3 \end{bmatrix} \hspace{.5cm} \mathcal{C} = \begin{bmatrix} 1&2&3&4 \\ 4&3&2&1 \\ 2&1&4&3 \\ 3&4&1&2 \end{bmatrix}$$

Latin squares are not remotely rare. In fact, they are easy to construct from blank grids: if you do one row at a time, you will never reach a dead end. But orthogonal Latin squares are harder to come by. $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$ are not just any $4{\times}4$ Latin squares: together, they have the particularly elusive property that when any two are superimposed, every ordered pair of numbers is present. Here's $\mathcal{A}$ and $\mathcal{B}$ together:$$\begin{bmatrix}1,1&2,2&3,3&4,4 \\ 2,3&1,4&4,1&3,2 \\ 3,4&4,3&1,2&2,1 \\ 4,2&3,1&2,4&1,3\end{bmatrix}$$I.e. $\mathcal{A}$ and $\mathcal{B}$ are orthogonal. The same can be easily verified for $\mathcal{A}$ and $\mathcal{C}$, and for $\mathcal{B}$ and $\mathcal{C}$. Of course, by the pigeonhole principle, "every ordered pair of numbers appears once" is the same as saying "no ordered pair of numbers appears twice".


A natural* question to ask is how many mutually orthogonal Latin squares can be found.

Definition. Latin squares (of the same order) $\mathcal{A}_1,\ldots,\mathcal{A}_t$ are mutually orthogonal if for $i \ne j$, $\mathcal{A}_i$ and $\mathcal{A}_j$ are orthogonal.

Definition. $N(n)$ is the maximum size of a set of mutually orthogonal Latin squares (MOLS) of order $n$.

We have seen already that $N(4) \ge 3$. In fact, $N(4) = 3$.

Proposition. for $n \ge 2$, $N(n) \le n-1$.

Proof. Note that permuting the numbers of a Latin square (e.g. exchanging $1$s for $2$s and vice versa) preserves orthogonality. Therefore without loss of generality, each $\mathcal{A}_k$ has $[1~2~\cdots~n]$ in the first row.

Now look at, say, the second row** in the first column; let $s_k$ be the number there in $\mathcal{A}_k$. Every $s_k \ne 1$, since each $\mathcal{A}_k$ has a $1$ in the first row of the same column, and the number $1$ cannot appear twice in a column of a Latin square.

Furthermore, the $s_k$ are distinct, otherwise some ordered pair $s,s$ appears in this position and in the first row in the $s$th column, contradicting orthogonality.

So there are at most $n-1$ numbers which can appear in this position, and each $\mathcal{A}_k$ has a different number there. $\square$

The following definition is therefore fitting:

Definition. A set of MOLS of order $n \ge 2$ is complete if the size of the set is $n-1$.

So $\mathcal{A},\mathcal{B},\mathcal{C}$ are a complete set for $n=4$.


The main purpose of this post is to give a proof of the following fantastic theorem,*** due to R. C. Bose in 1938:

Theorem. There is a complete set of mutually orthogonal Latin squares of order $n$ if and only if there is a finite projective plane of order $n$.

The proof is constructive. We'll consider complete Card & Symbol Games (CSGs) with $n+1$ symbols per card, which for $n \ge 2$ are the same thing as FPPs, but the notation and intuition lends itself to a nice visual proof which accomplishes both the "if" and the "only if" in one fell swoop.

Before we attempt a general proof, it's worth trying an example. If we use our complete set $\mathcal{A},\mathcal{B},\mathcal{C}$ for $n=4$, there will be $4^2 + 4 + 1$ cards, each with $4+1$ symbols.

$5$ "special" cards are given below:

$I$ $J$
$\Omega~\sigma_1~\sigma_2~\sigma_3~\sigma_4$ $\Omega~\tau_1~\tau_2~\tau_3~\tau_4$
$z_{\mathcal{A}}$ $z_{\mathcal{B}}$ $z_{\mathcal{C}}$
$\Omega~\alpha_1~\alpha_2~\alpha_3~\alpha_4$ $\Omega~\beta_1~\beta_2~\beta_3~\beta_4$ $\Omega~\gamma_1~\gamma_2~\gamma_3~\gamma_4$

The other $4^2$ cards can be arranged in a $4{\times}4$ grid:

$\sigma_1~\tau_1~\alpha_1~\beta_1~\gamma_1$ $\sigma_1~\tau_2~\alpha_2~\beta_2~\gamma_2$ $\sigma_1~\tau_3~\alpha_3~\beta_3~\gamma_3$ $\sigma_1~\tau_4~\alpha_4~\beta_4~\gamma_4$
$\sigma_2~\tau_1~\alpha_2~\beta_3~\gamma_4$ $\sigma_2~\tau_2~\alpha_1~\beta_4~\gamma_3$ $\sigma_2~\tau_3~\alpha_4~\beta_1~\gamma_2$ $\sigma_2~\tau_4~\alpha_3~\beta_2~\gamma_1$
$\sigma_3~\tau_1~\alpha_3~\beta_4~\gamma_2$ $\sigma_3~\tau_2~\alpha_4~\beta_3~\gamma_1$ $\sigma_3~\tau_3~\alpha_1~\beta_2~\gamma_4$ $\sigma_3~\tau_4~\alpha_2~\beta_1~\gamma_3$
$\sigma_4~\tau_1~\alpha_4~\beta_2~\gamma_3$ $\sigma_4~\tau_2~\alpha_3~\beta_1~\gamma_4$ $\sigma_4~\tau_3~\alpha_2~\beta_4~\gamma_1$ $\sigma_4~\tau_4~\alpha_1~\beta_3~\gamma_2$

It is slow but simple to verify that this is indeed a complete CSG. Notice:

  • Each "cell" card must have one common symbol with each "special" card; since we can't have any more than $5$ cards with the symbol $\Omega$, this means each "cell" card must have exactly one $\sigma$, $\tau$, $\alpha$, $\beta$ and $\gamma$ symbol.
  • The subscripts on the $\sigma$ and $\tau$ symbols are used to determine the row and column of the cell.
  • The subscripts on the $\alpha$, $\beta$ and $\gamma$ symbols are given by the numbers in the Latin squares $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$.
  • The "each number appears once in each row and column" rule for Latin squares, and the "each ordered pair appears once" rule for orthogonal Latin squares, both correspond with the "each pair of symbols appears together on one card" rule for complete CSGs.


The example contains almost all of the insight; the rest is formalism, except for the following lemma. Notice that e.g. the two cards $\newcommand{\card}[1]{\boxed{{\strut}~#1~}} \card{\sigma_2~\tau_3~\alpha_4~\beta_1~\gamma_2}$ and $\card{\sigma_4~\tau_2~\alpha_3~\beta_1~\gamma_4}$ have a common symbol $\beta_1$ - this is a requirement for a complete CSG - but do complete sets of MOLS always satisfy this? It means that given two grid positions not in the same row or column, there must be one Latin square from the complete set of MOLS with the same number in both positions.$$\begin{bmatrix} 1&2&3&4 \\ 2&1&\boxed{4}&3 \\ 3&4&1&2 \\ 4&\boxed{3}&2&1 \end{bmatrix} \hspace{.5cm} \begin{bmatrix} 1&2&3&4 \\ 3&4&\boxed{1}&2 \\ 4&3&2&1 \\ 2&\boxed{1}&4&3 \end{bmatrix} \hspace{.5cm} \begin{bmatrix} 1&2&3&4 \\ 4&3&\boxed{2}&1 \\ 2&1&4&3 \\ 3&\boxed{4}&1&2 \end{bmatrix}$$It's not obvious why this has to happen.

Lemma. If $\mathcal{A}_1,\ldots,\mathcal{A}_{n-1}$ is a complete set of MOLS, $i \ne i'$ and $j \ne j'$, then exactly one $\mathcal{A}_k$ has $(\mathcal{A}_k)_{i,j} = (\mathcal{A}_k)_{i',j'}$.

Proof. Without loss of generality, we can permute symbols so that every $(\mathcal{A}_k)_{i,j} = 1$. Let $l_k$ be the row containing the $1$ in column $j'$ of $\mathcal{A}_k$.

Since they are Latin squares, every $l_k \ne i$. Also, the $l_k$ are distinct: every pair of the Latin squares each has a $1$ in the same cell in column $j$, so by orthogonality they cannot also each have a $1$ in the same cell of column $j'$.

Therefore the $n-1$ distinct values of $l_k$ are each of the $n-1$ rows other than $i$, so one $l_k = i'$. $\square$


OK, now the proof of Bose's theorem. To avoid drowning in a sea of unnecessary variables, the subscript $-$ indicates an unknown number between $1$ and $n$.

Proof. Fix a distinguished symbol $\Omega$, and let $I = \card{\Omega~\sigma_1~\cdots~\sigma_n}$, $J = \card{\Omega~\tau_1~\cdots~\tau_n}$ and $n-1$ cards $z_{\mathcal{A}},z_{\mathcal{B}},\ldots,z_{\mathcal{H}}$ be the cards with symbol $\Omega$, with $z_{\mathcal{A}} = \card{\Omega~\alpha_1~\cdots~\alpha_n}$, $z_{\mathcal{B}} = \card{\Omega~\beta_1~\cdots~\beta_n}$, and so on up to $z_{\mathcal{H}} = \card{\Omega~\eta_1~\cdots~\eta_n}$. Denote these as the special cards.

Note that $\sigma_{-},\tau_{-},\alpha_{-},\ldots,\eta_{-}$ must be distinct: each can only appear once alongside $\Omega$, and there are $n+1$ such distinct cards.

Note that we're just choosing names, not imposing restrictions; the construction works either starting with a complete CSG, or starting from a blank slate in an attempt to construct one.

Proof (continued). We have accounted for $n+1$ cards and all $n^2 + n + 1$ symbols, so there are $n^2$ more cards; denote these as the cell cards. Any cell card $c$ must have exactly one symbol in common with each special card, and by assumption $c$ does not have the symbol $\Omega$. It follows that $c = \card{\sigma_{-}~\tau_{-}~\alpha_{-}~\cdots~\eta_{-}}$.

There must be $n$ cell cards with each $\sigma_{-}$, and $n$ cell cards with each $\tau_{-}$. Moreover, each ordered pair $\sigma_{-},\tau_{-}$ must appear on exactly one card. This gives an enumeration of the cell cards as $c_{i,j}$ for $1 \le i,j \le n$:

$\sigma_1~\tau_1~\alpha_{-}~\cdots~\eta_{-}$ $\cdots$ $\sigma_1~\tau_n~\alpha_{-}~\cdots~\eta_{-}$
$\vdots$ $\ddots$ $\vdots$
$\sigma_n~\tau_1~\alpha_{-}~\cdots~\eta_{-}$ $\cdots$ $\sigma_n~\tau_n~\alpha_{-}~\cdots~\eta_{-}$

Now let $c_{i,j}$'s subscripts $\alpha_{-},\ldots,\eta_{-}$ be the numbers $\mathcal{A}_{i,j},\ldots,\mathcal{H}_{i,j}$ respectively: if the CSG is given, use the subscripts to determine the Latin squares, and vice versa.

We now check that from a complete set of MOLS, the resulting incidence structure is a complete CSG:

  • By construction each card has $n+1$ symbols.
  • $\Omega$ appears on $n+1$ cards. Each $\sigma_{-}$ and each $\tau_{-}$ appear on one special card and on $n$ cell cards, making $n+1$. Each $\alpha_{-}$ appears on one special card and $n$ cell cards since each number appears $n$ times in $\mathcal{A}$, making $n+1$; $\beta_{-},\ldots,\eta_{-}$ are similar.
  • By construction each two special cards have exactly $\Omega$ in common, and each special card has one symbol in common with each cell card. Two cell cards with the same $\sigma_{-}$ do not have a common $\tau_{-}$ and vice versa, and either case has no common $\alpha_{-},\ldots,\eta_{-}$ since each number appears only once in each row and column of $\mathcal{A},\ldots,\mathcal{H}$. Two cell cards with the same $\alpha_{-}$ do not have a common $\beta_{-}$, as no ordered pair appears twice when $\mathcal{A}$ and $\mathcal{B}$ are superimposed; other pairs are similar. The tricky case of showing that two cell cards with no common $\sigma_{-}$ or $\tau_{-}$ do have a common symbol is handled by the lemma.
  • Each two of the same $\sigma_{-},\tau_{-},\alpha_{-},\ldots,\eta_{-}$ appear together on one special card. Each $\sigma_{-},\tau_{-}$ appear together on one cell card by construction. Each $\sigma_{-},\alpha_{-}$ and each $\tau_{-},\alpha_{-}$ appear together on one cell card because each number appears once in each row and each column of $\mathcal{A}$; $\beta,\ldots,\eta$ are similar. Each $\alpha_{-},\beta_{-}$ appear together on one cell card because each ordered pair of numbers appears once when $\mathcal{A}$ and $\mathcal{B}$ are superimposed; other pairs are similar.

Next, we check that from a complete CSG, the result is a complete set of MOLS:

  • Each $\sigma_{-}$ and $\alpha_{-}$ appear together on one card, so each number appears in one row of $\mathcal{A}$; columns and $\mathcal{B},\ldots,\mathcal{H}$ are similar. So the squares are indeed Latin.
  • Each $\alpha_{-}$ and $\beta_{-}$ appear together on one card, so each ordered pair appears once when $\mathcal{A}$ and $\mathcal{B}$ are superimposed; other pairs are similar.
  • The set of MOLS is complete since there are $n-1$ special cards $z_{\mathcal{A}},\ldots,z_{\mathcal{H}}$.

The result follows because for $n \ge 2$, a complete CSG is equivalent to a finite projective plane. $\square$

So as with FPPs, we can systematically construct a complete set of MOLS when $n$ is a power of a prime, it's impossible when $n=6$,**** and $n=12$ is an open problem.


*Mathematicians are greedy, so of course we want as many MOLS as possible.

**This proposition is often stated without the condition that $n \ge 2$; but the proof requires there to be a second row. Indeed, when $n=1$ there is a counterexample in the set containing the unique Latin square of order $1$, all pairs of distinct Latin squares from which are orthogonal.

***The theorem is well known to combinatorists, but proofs seem to be thin on the ground - even Wikipedia (as of writing) says "a set of n - 1 MOLS is equivalent to a finite projective plane of order n" with no citation.

****In fact, $N(6) = 1$.

Card Games and Projective Geometry

A correspondent writes:

A few times, over the past few weeks, I've seen a particular friend of mine, and he's had a game with him, a game called Spot It. It takes the form of a bunch of cards, each of which has eight different symbols on it; the game mechanic depends on the fact that each pair of cards has exactly one symbol in common.


It turns out there are 57 distinct symbols. 42 of them appear on eight cards each, 14 appear on seven cards each, and one appears on six. This looks suspiciously like "57 symbols, 57 cards, each card has eight symbols, each symbol appears on eight cards" with two cards missing.


It's trivial to deduce what the 'missing' cards are, and, with them added, it is then a 57-card deck with the same property; that any two cards have exactly one symbol in common.

Then I noticed that 57 equals 57. That is, that the number of cards in the 'complete' deck equals the number of distinct symbols. Then, I drew up a 57x57 matrix, where one axis is card identity and the other is symbol identity. It looks symmetric enough, and, on checking, it turns out that the "one bit in common" property holds in the other dimension too (which means, mapped back into cards, that, for every pair of symbols, there is exactly one card that has both of them).

Then a few questions (paraphrased):

  1. Why $57$; is there anything special about this number?
  2. If we choose a different number of symbols per card, can we always construct a card game like this? How many cards would there be?
  3. Are there always the same number of cards and symbols in a complete game? If a game has fewer cards than symbols, is it always possible to complete it?
  4. Is it possible to add more cards even if none are "missing"?

I added a question of my own.

  1. Is the incidence matrix actually symmetric (for some permutation of rows and columns)? If so, do all such card games have symmetric incidence matrices?

I found question E interesting because it implies that there is a bijection between cards and symbols with nice properties;* in particular it relates these card games to certain regular graphs of diameter $2$. More on this later.


Here's an example of what we're talking about. The triangle game has cards $a,b,c$ and symbols $\alpha,\beta,\gamma$:

$a$ $b$ $c$
$\beta,\gamma$ $\alpha,\gamma$ $\alpha,\beta$

There are $3$ cards and $3$ symbols, each card has $2$ symbols, each symbol appears on $2$ cards, each pair of cards has one symbol in common, and each pair of symbols appears together on one card. There is a pleasing duality between cards and symbols, and we can write down the dual of this game:

$\alpha$ $\beta$ $\gamma$
$b,c$ $a,c$ $a,b$

In the dual game, the card $\alpha$ has symbols $b,c$ because those were the cards with the symbol $\alpha$ in the original game. Notice that in this case, the dual is essentially the same game: there's a bijection $$\varphi = \{~a \mapsto \alpha,~b \mapsto \beta,~c \mapsto \gamma~\}$$ such that a card $x$ has a symbol $\varphi(y)$ if and only if the card $\varphi(x)$ has the symbol $y$ in the dual game. This relates to the fact that the incidence matrix is symmetric:

  $\begin{matrix} \alpha&\beta&\gamma \end{matrix}$
$\begin{matrix} a \\ b \\ c \end{matrix}$ $\begin{pmatrix} 0&1&1 \\ 1&0&1 \\ 1&1&0 \end{pmatrix}$


It's usually a good idea to come up with some notation.

Definition. An incidence structure is $(C,S,\in)$ where $C$ and $S$ are disjoint finite non-empty sets, and $\in$ is a relation with domain $C$ and range $S$.**

Interpret $C$ as the set of cards, $S$ as the set of symbols, and $x \in \sigma$ as "card $x$ has symbol $\sigma$". The bit about domains and ranges means there are no "blank cards" or "phantom symbols" - every card has a symbol, and every symbol appears on some card.

Definition. For $x : C$, $S_x = \{~\sigma : S~|~x \in \sigma~\}$, and for $\sigma : S$, $C_{\sigma} = \{~x : C~|~x \in \sigma~\}$.

$S_x$ is the set of symbols on a card $x$, and $C_\sigma$ is the set of cards with symbol $\sigma$.

Definition. A Card & Symbol Game (or CSG) is an incidence structure such that

$\newcommand\prop[1]{[\textsf{#1}]}\prop{UNIFORM}$ For some $n \ge 1$, every card has $n+1$ symbols (i.e. $\exists n \ge 1 \bullet \forall x : C \bullet \#S_x = n+1$)

$\prop{1-COMMON}$ Any two distinct cards have a unique common symbol (i.e. $\forall x,y : C~|~x \ne y \bullet \#(S_x \cap S_y) = 1$)

Call this $\prop{CSG} \equiv \prop{UNIFORM} \land \prop{1-COMMON}$.

The use of $n+1$ rather than $n$ is because mathematicians like to go back and change their notation in order to make things easier later, once they know what will make things easier.


Some more definitions will be useful for our investigation:

Definition. An incidence structure is square if there are the same number of cards and symbols (i.e. $\#C = \#S$). Call this $\prop{SQUARE}$.

Definition. An incidence structure is regular if the number of symbols on each card equals the number of cards with each symbol, and this number is at least $2$. Call this $\prop{REGULAR}$.

Definition. Incidence structures $\mathcal{G} = (C,S,\in)$ and $\mathcal{G}' = (C',S',\in')$ are isomorphic (or $\mathcal{G} \cong \mathcal{G}'$) if there are bijections $\varphi : C \to C'$ and $\psi : S \to S'$ such that $x \in \sigma$ if and only if $\varphi(x) \mathrel{\in'} \psi(\sigma)$.

Obviously if $\mathcal{G}$ satisfies a property $\prop{P}$ and $\mathcal{G} \cong \mathcal{G}'$, then $\mathcal{G}'$ also satisfies $\prop{P}$.


We'd like to talk about duals:

Definition. The dual of an incidence structure $\mathcal{G} = (C,S,\in)$ is $\mathcal{G}^\star = (S,C,\ni)$, where $\sigma \ni x$ if and only if $x \in \sigma$.

Definition. $\mathcal{G}$ is self-dual if $\mathcal{G} \cong \mathcal{G}^\star$. Call this $\prop{SELF-DUAL}$.

Obviously $\mathcal{G}^{\star\star} = \mathcal{G}$. It will be useful to have a notation for dual properties:

Meta-definition. For a property $\prop{P}$, an incidence structure $\mathcal{G}$ satisfies $\newcommand\propd[1]{[\textsf{#1}^\star]}\propd{P}$ if and only if $\mathcal{G}^\star$ satisfies $\prop{P}$. For example,

$\propd{UNIFORM}$ For some $n \ge 1$, every symbol appears on $n+1$ cards

$\propd{1-COMMON}$ Any two distinct symbols appear together on a unique card

Obviously $[\mathsf{P}^{\star\star}] \equiv \prop{P}$. Due to the symmetry between cards and symbols in their definitions, $\prop{SQUARE}$, $\prop{REGULAR}$ and $\prop{SELF-DUAL}$ are each equivalent to their dual properties.


We'll be interested in finding more logical connections between these properties; let's write $\newcommand\infer{\vDash} \prop{P} \infer \prop{Q}$ to mean that $\prop{Q}$ can be inferred from $\prop{P}$.

Proposition. $\prop{REGULAR} \equiv \prop{SQUARE} \land \prop{UNIFORM} \land \propd{UNIFORM}$.

Proof. Obviously $\prop{REGULAR} \equiv \prop{UNIFORM} \land \propd{UNIFORM}$ with the same $n$. Consider $\in$ as a set of pairs $x \mapsto \sigma$. Then $\#{\in} = \#C \times \#S_x$ for each $x : C$, and also $\#{\in} = \#S \times \#C_\sigma$ for each $\sigma : S$. Since $\#C_\sigma = \#S_x$, we have $\#C = \#S$.

Conversely, suppose $\#C = \#S$, then $\#C \times \#S_x = \#{\in} = \#S \times \#C_\sigma$ for each $x:C$, $\sigma:S$, so $\#C_\sigma = \#S_x$. $\square$

So if $\mathcal{G}$ is square, uniform and dual-uniform, then the values of $n$ are the same. This is not necessarily the case if $\mathcal{G}$ is not square: for example, the incidence structure $\mathcal{B}_3$ defined by the incidence matrix $$\begin{pmatrix} 0&0&0&~&1&1&1 \\ 0&0&1&~&1&1&0 \\ 0&1&0&~&1&0&1 \\ 0&1&1&~&1&0&0 \\ 1&0&0&~&0&1&1 \\ 1&0&1&~&0&1&0 \\ 1&1&0&~&0&0&1 \\ 1&1&1&~&0&0&0 \end{pmatrix}$$ is uniform with each $\#S_x = 3$ and dual-uniform with each $\#C_\sigma = 4$. (Exercise: generalise this construction to $\mathcal{B}_n$ with $2^n$ cards, $2n$ symbols, each card having $n$ symbols and each symbol appearing on $2^{n-1}$ cards.) This isn't a CSG, as it doesn't satisfy $\prop{1-COMMON}$; another construction gives $\mathcal{C}_4$ with incidence matrix $$\begin{pmatrix} 1&1&1&0&0&0 \\ 1&0&0&1&1&0 \\ 0&1&0&1&0&1 \\ 0&0&1&0&1&1 \end{pmatrix}$$ which generalises to $\mathcal{C}_n$ with $n$ cards, $\frac{1}{2}n(n-1)$ symbols, each card having $n-1$ symbols and each symbol appearing on $2$ cards. (The details of this construction are left as another exercise.) This is a CSG satisfying $\propd{UNIFORM}$ but not $\propd{1-COMMON}$.


Some second-order theorems:

Meta-theorem. $(\prop{P} \infer \prop{Q}) \infer (\propd{P} \infer \propd{Q})$.

Proof. Suppose there is an incidence structure $\mathcal{G}$ satisfying $\propd{P} \land \neg \propd{Q}$. Then $\mathcal{G}^\star$ satisfies $\prop{P} \land \neg \prop{Q}$, a contradiction. $\square$

Meta-theorem. $\prop{SELF-DUAL} \land \prop{P} \infer \propd{P}$.

Proof. If $\mathcal{G}$ satisfies $\prop{P}$ then $\mathcal{G}^\star \cong \mathcal{G}$ satisfies $\propd{P}$. $\square$

These will be useful later.


We turn our attention now to projective geometry. The reason for this will be immediately clear:

Definition. A finite projective plane (or FPP) is an incidence structure $(P,L,\in)$ where $P$ is a set of points, $L$ is a set of lines, $p \in \lambda$ means "point $p$ is on line $\lambda$", and

$\prop{1-COMMON}$ Any two distinct points are on a unique common line

$\propd{1-COMMON}$ Any two distinct lines meet at a unique common point

$\prop{NON-DEGENERATE}$ There are four distinct points such that no three are on any line

Call this $\prop{FPP} \equiv \prop{1-COMMON} \land \propd{1-COMMON} \land \prop{NON-DEGENERATE}$.

Definition. $P_\lambda$ is the set of points on a line $\lambda$, and $L_p$ is the set of lines through a point $p$, as before.

The words "point" and "line" are deliberately undefined - we can interpret them in a geometric sense (if we're generous about how straight a line has to be):


This is the Fano plane, the smallest FPP. But we could instead define "point" to mean card, and "line" to mean symbol, and then we can draw the Fano plane as a card game diagram:

$a$ $b$ $c$ $d$ $e$ $f$ $g$
$\alpha,\beta,\gamma$ $\alpha,\delta,\epsilon$ $\alpha,\zeta,\eta$ $\beta,\delta,\eta$ $\beta,\epsilon,\zeta$ $\gamma,\epsilon,\eta$ $\gamma,\delta,\zeta$

It's easy to check that the Fano plane is a CSG, and isomorphic to its dual (which is also an FPP),

$\alpha$ $\beta$ $\gamma$ $\delta$ $\epsilon$ $\zeta$ $\eta$
$a,b,c$ $a,d,e$ $a,f,g$ $b,d,g$ $b,e,f$ $c,e,g$ $c,d,f$

via the isomorphism $\varphi = \{~a \mapsto \alpha,~b \mapsto \beta,~\ldots,~g \mapsto \eta~\}$ and $\psi = \varphi^{-1}$.

In fact, every FPP is a CSG, and its dual is also an FPP. Let's prove this.

Theorem. The dual of an FPP is an FPP (i.e. $\prop{FPP} \infer \propd{FPP}$).

Proof. We only need to check $\propd{NON-DEGENERATE}$.

Choose $p,q,r,s : P$ as in $\prop{NON-DEGENERATE}$. By $\prop{1-COMMON}$, there are lines $\lambda,\mu,\nu,\xi : L$ joining $p$ to $q$, $q$ to $r$, $r$ to $s$ and $s$ to $p$ respectively. Suppose, without loss of generality, that $\lambda,\mu,\nu$ meet at a single point: by $\propd{1-COMMON}$ the only point on both $\lambda$ and $\mu$ is $q$, and so $\nu$ contains all three of $q,r,s$, a contradiction. $\square$

By a meta-theorem, it follows that $(\prop{FPP} \infer \prop{P}) \infer (\prop{FPP} \infer \propd{P})$. I.e., if every FPP satisfies some property $\prop{P}$, then every FPP also satisfies $\propd{P}$.

Lemma. Every line in an FPP contains at least $3$ points.

Proof. Choose $p,q,r,s : P$ as in $\prop{NON-DEGENERATE}$, and without loss of generality let $p \notin \lambda$ and $q \notin \lambda$. By $\prop{1-COMMON}$ there are $\mu,\nu,\xi : L_p$ with $q \in \mu$, $r \in \nu$ and $s \in \xi$, which are distinct by $\prop{NON-DEGENERATE}$.

Then by $\propd{1-COMMON}$ there are $t,u,v : P_\lambda$ with $t \in \mu$, $u \in \nu$ and $v \in \xi$ which are distinct as each pair of $\mu,\nu,\xi$ already meet at $p$. $\square$

By the dual lemma, every point lies on at least $3$ lines.

Lemma. For any two points in an FPP, there is a line containing neither.

Proof. By $\prop{1-COMMON}$ choose $\mu : L$ with $p \in \mu$ and $q \in \mu$. By the lemma, there is a point $r \in P_\mu - p - q$. By $\prop{NON-DEGENERATE}$ there is a point $s : P - r$ with $s \notin \mu$, so choose $\lambda : L$ with $r \in \lambda$ and $s \in \lambda$. By $\propd{1-COMMON}$, $\lambda$ and $\mu$ only meet at $r$, so $p \notin \lambda$ and $q \notin \lambda$. $\square$

By the dual lemma, for any two lines, there is a point on neither.

Theorem. Every FPP is a CSG (i.e. $\prop{FPP} \infer \prop{CSG}$).

Proof. We have $\prop{1-COMMON}$ already, so we only need to check $\prop{UNIFORM}$: that every point is on the same number of lines.

We will show that given any point $p : P$ and line $\lambda : L$ with $p \notin \lambda$, the number of lines through $p$ equals the number of points on $\lambda$. Given $\mu : L_p$, $\mu \ne \lambda$ so by $\propd{1-COMMON}$ there is a unique point $q : P_\lambda$ with $q \in \mu$. Conversely, given $q : P_\lambda$, by $\prop{1-COMMON}$ there is a unique $\mu : L_p$ with $q \in \mu$. This gives a bijection $: L_p \to P_\lambda$.

Then given distinct points $p,q : P$, by the lemma there is a line $\lambda : L$ with $p \notin \lambda$ and $q \notin \lambda$, so $\#L_p = \#P_\lambda = \#L_q$. $\square$

Notice that the proof also gives us $\prop{FPP} \infer \prop{REGULAR}$, as $\#L_p = \#P_\lambda$.


The problem right now is that the dual of a CSG is not necessarily a CSG; $\mathcal{C}_n^\star$ satisfies $\prop{UNIFORM}$ but not $\prop{1-COMMON}$, for example. Worse still, call this one $\mathcal{L}_5$:

$a$ $b$ $c$ $d$ $e$
$\alpha,\Omega$ $\beta,\Omega$ $\gamma,\Omega$ $\delta,\Omega$ $\epsilon,\Omega$

This CSG is uninteresting because the common symbol is always $\Omega$. Nonetheless, it's a CSG by our definition, and the dual game

$\Omega$ $\alpha$ $\beta$ $\gamma$ $\delta$ $\epsilon$
$a,b,c,d,e$ $a$ $b$ $c$ $d$ $e$

satisfies neither $\prop{UNIFORM}$ nor $\prop{1-COMMON}$. Of course we could change our definition of CSGs to include $\propd{UNIFORM}$ and $\propd{1-COMMON}$, so that the dual of a CSG is always a CSG. But this is unsatisfying and may be too strict: it would disallow "incomplete" CSGs, as deleting cards from a CSG preserves $\prop{UNIFORM}$ and $\prop{1-COMMON}$ but neither $\propd{UNIFORM}$ nor $\propd{1-COMMON}$.

A better approach would be to try to isolate what it means for a CSG to have "missing" cards, and then relate this to "completeness". We'll consider a CSG to be "complete" if it satisfies $\propd{UNIFORM}$ and $\propd{1-COMMON}$, i.e. every symbol appears on the same number of cards, and every pair of symbols appears together, as in the completed original game. Conveniently, this is exactly $\propd{CSG}$.

Definition. A CSG is complete if its dual is a CSG. Call this $\prop{COMPLETE} \equiv \prop{CSG} \land \propd{CSG}$.

Definition. An extension of an incidence structure $\mathcal{G} = (C,S,\in)$ is an incidence structure $\mathcal{G}' = (C',S',\in')$ for which $C \subseteq C'$, $S \subseteq S'$, and $\in$ is the restriction of $\in'$ to $C$ and $S$ (i.e. ${\in} \subseteq {\in'}$ as sets of pairs). The extension is non-trivial if $\mathcal{G}' \ne \mathcal{G}$.

An expansion is an extension such that $\in$ is the restriction of $\in'$ to $C$. Write $\mathcal{G} \le \mathcal{G}'$ for an expansion.

A completion is an expansion which is a complete CSG.

Proposition. If $\mathcal{G} \le \mathcal{G}'$ is a non-trivial expansion, then $C \ne C'$.

Proof. Suppose $C = C'$. Then $\in$ is the restriction of $\in'$ to its own domain, so ${\in} = {\in'}$ and hence $S' =$ range of ${\in'} =$ range of ${\in} = S$, and the expansion is trivial; a contradiction. $\square$

Definition. An incidence structure is CSG-expansible if it has a non-trivial expansion which is a CSG. Call this $\prop{CSG-EXPANSIBLE}$.

An incidence structure is completable if it has a completion. Call this $\prop{COMPLETABLE}$.

So extensions and expansions may add more cards and more symbols, but an extension might add new symbols to existing cards, whereas an expansion cannot change any existing cards. An expansion is trivial if and only if it adds no cards. A completion need not be non-trivial, so $\prop{COMPLETE} \infer \prop{COMPLETABLE}$.


Now let's try to answer some of the questions.

Theorem. $\prop{1-COMMON} \land \prop{REGULAR} \infer \#C = n^2 + n + 1$.

Proof. Fix a card $x : C$. Given any other card $y : C - x$, by $\prop{1-COMMON}$ there is a unique $\sigma : S_x$ for which $y$ is in $C_{\sigma} - x$. Therefore $\{x\}$ and the $C_{\sigma} - x$ form a partition of $C$.

By $\prop{REGULAR}$, $\# S_x = n+1$, and each $\# (C_{\sigma} - x) = n$. Therefore $\#C = 1 + n(n+1)$. $\square$

This answers question A: $57 = 7^2 + 7 + 1$.

Theorem. An incidence structure is a complete CSG if and only if it is either an FPP or a triangle.

Proof. Since complete CSGs satisfy $\prop{1-COMMON}$ and $\propd{1-COMMON}$, we only need to consider complete CSGs which don't satisfy $\prop{NON-DEGENERATE}$. These are degenerate finite projective planes, all of which are classified:

  • The empty set. This is not an incidence structure by our definition.
  • A single line, with any number of points. This fails $\prop{UNIFORM}$, as we need $n \ge 1$.
  • As before, but one of the points is on any number of additional lines. This still fails $\prop{UNIFORM}$.
  • A single point, with any number of lines. This fails $\propd{UNIFORM}$, as we need $n \ge 1$.
  • As before, but one of the lines contains any number of additional points. This still fails $\propd{UNIFORM}$.
  • A line $\mu$ with $C_{\mu} = \{p_1,\ldots,p_{n+1}\}$, and an additional point $q$ with $S_q = \{\lambda_1,\ldots,\lambda_{n+1}\}$, where each $C_{\lambda_i} = \{p_i, q\}$. This satisfies $\prop{UNIFORM}$ only when $n=1$, giving the triangle.

Conversely, $\prop{FPP} \infer \prop{CSG}$ and by a meta-theorem $\prop{FPP} \infer \propd{FPP} \infer \propd{CSG}$, so every FPP is a complete CSG, as is the triangle. $\square$

This answers the first part of question B: all FPPs with $n \le 10$ are known, and there is no FPP with $n=6$, so there is no complete CSG with $7$ symbols per card, for example. $n$ is called the order of the FPP.

Corollary. Every complete CSG is regular.

Proof. $\prop{FPP} \infer \prop{REGULAR}$, and the triangle is regular. $\square$

This answers the second part of question B: if there is a complete CSG with $n+1$ symbols per card, then it has $n^2 + n + 1$ cards.

Corollary. Every complete CSG is square.***

Proof. $\prop{REGULAR} \infer \prop{SQUARE}$. $\square$

This answers question C: yes, a complete CSG must have equal numbers of cards and symbols; and no, there are CSGs which are not completable. For example, a CSG with $1$ card and $7$ symbols is not completable, though it has "missing cards" in the sense that it is CSG-expansible; however, it has a CSG-expansion which is not CSG-expansible.**** In contrast, $\mathcal{L}_n$ is not completable for $n \ge 3$, but $\mathcal{L}_n \le \mathcal{G}$ is a non-trivial CSG-expansion exactly when $\mathcal{G} \cong \mathcal{L}_{n'}$ for some $n' > n$; so however many cards are added, with whatever new symbols, more can still be added!

Some things are known about completability: for example, Dow showed in 1982 that, in other words, any CSG with $\#S = n^2 + n + 1$ and $\#C > n^2 - 2\sqrt{n+3} + 6$ is completable, and if $\#C > n^2 - n + 1$ then the completion is unique (up to isomorphism).

Theorem. $\prop{COMPLETE} \infer \neg \prop{CSG-EXPANSIBLE}$.

Proof. Let $\mathcal{G} \le \mathcal{G}'$ be a non-trivial expansion, and fix $x : C$ and $y : C' \setminus C$. Let $\chi : S$ be the common symbol between $x$ and $y$. There is another symbol $\sigma : S_x - \chi$, and since $G$ satisfies $\propd{UNIFORM}$ there is another card $w : C_\sigma - x$.

Suppose $G'$ satisfies $\prop{1-COMMON}$. Let $\tau : S$ be the unique common symbol between $w$ and $y$. Since $\mathcal{G}$ satisfies $\propd{1-COMMON}$ there is a card $z : C_\chi - x$ with $z \in \tau$. Then $y$ and $z$ have two common symbols $\chi$ and $\tau$, a contradiction. Therefore $\mathcal{G}'$ is not a CSG. $\square$

This answers question D: no, if the CSG is complete then we can't add more cards and still have a CSG.


Question E asks if an incidence matrix is equal to its own transpose. This is apparently a stronger condition than $\prop{SELF-DUAL}$, which only requires the incidence matrix to be some permutation of its transpose.

Definition. An incidence structure is self-polar if there is a bijection $\varphi : C \to S$ such that $x \in \varphi(y)$ if and only $y \in \varphi(x)$. Call this $\prop{SELF-POLAR}$.

Proposition. $\prop{SELF-POLAR} \infer \prop{SELF-DUAL}$.

Proof. $\psi = \varphi^{-1}$ will do. $\square$

Proposition. Every self-dual CSG is complete.

Proof. By meta-theorem, $\prop{SELF-DUAL} \land \prop{CSG} \infer \propd{CSG}$. $\square$

FPPs can be systematically constructed using vector spaces over finite fields, yielding FPPs where $n$ can be any prime power. (It's an open problem whether there are any FPPs where $n$ is not a prime power.) FPPs constructed this way are always self-polar; FPPs constructed by other methods are not necessarily self-dual, and apparently it's another open problem as of 2009 whether every self-dual FPP is self-polar.

To answer question E: the particular card game introduced above is a complete CSG with $8$ symbols per card, so it is isomorphic to the unique FPP with $n=7$, which is self-polar. However, there are complete CSGs which are not self-dual, the smallest two of which are isomorphic to a dual pair of FPPs with $n=9$, so they have $91$ cards each with $10$ symbols.

For general incidence structures, $\prop{SELF-POLAR}$ is strictly stronger than $\prop{SELF-DUAL}$. In 2003, Brouwer et al. found the smallest possible self-dual incidence structure which is not self-polar:

$a$ $b$ $c$ $d$ $e$ $f$ $g$
$\alpha,\beta,\gamma,\epsilon$ $\alpha,\beta,\delta,\zeta$ $\alpha,\zeta,\eta$ $\beta,\epsilon,\eta$ $\alpha,\gamma$ $\beta,\delta$ $\gamma,\delta$

There are $7$ others of the same size, up to isomorphism.


Returning to the start of this post, our correspondent also noted that (in other words) there is a one-to-one correspondence between CSGs and constant-weight codes for which each codeword has Hamming weight $n+1$ and each pair has Hamming distance $2n$. We've shown that complete CSGs are almost exactly the same thing as finite projective planes; another closely related area is the study of symmetric block designs, which (for $\lambda = 1$) are exactly the same thing as complete CSGs, although this result is a theorem rather than by definition.

I said we'd come back to regular graphs with diameter $2$. If the incidence matrix of a CSG is symmetric (i.e. the CSG is self-polar), then we can interpret it as the adjacency matrix of a graph in which $x,y:C$ are joined by an edge if and only if $x \in \varphi(y)$. In this context,

  • $\prop{UNIFORM}$ means the graph is $n+1$-regular, i.e. each vertex has $n+1$ edges;
  • $\prop{1-COMMON}$ means each pair of distinct vertices $x,y:C$ are joined by a unique route of length $2$;
  • $\prop{SELF-POLAR}$ means the graph is undirected.

For $x \in \varphi(x)$, the graph has an edge from $x$ to itself, and self-edges may be used to form routes.***** This gives a correspondence between self-polar CSGs and graphs of this form. Here's the graph corresponding to the Fano plane:


However, the graph produced depends on which polarity $\varphi$ is used. Automorphisms of the graph correspond to conjugates of $\varphi$, but in general the polarities may not form a single conjugacy class in the group of automorphisms of the FPP.

Another thing you can do with CSGs is colour the edges of complete graphs such that the monochrome sets are themselves complete graphs. The colour of an edge joining two cards is given by their unique common symbol. Here's the Fano plane again, with the symbols $\alpha,\beta,\ldots,\eta$ in order of hue:


Here each monochrome set is complete on $3$ vertices, because the Fano plane is dual-uniform; and each pair of colours meets at one vertex, because the Fano plane is dual-1-common. So that's nice.


*Combinatorics would greatly benefit from a theorem which says "if two sets appear near each other, have the same cardinality, and this cardinality is an unusually mundane number like $57$, then there is a natural bijection with some nice properties which explains why this has happened."

**Combinatorists may recognise that the usual definition of incidence structure does not require that everything has some incidence. The relation symbol is $\in$ because we can use a set-theoretic model where $S \subseteq \mathbb{P} C$. Astute combinatorists may recognise that the usual definition of a uniform incidence structure is the dual of the definition given here.

***Ryser's theorem gives the converse: every square CSG is complete!

****This follows because Ryser's theorem gives a strict bound $\#C < n^2 + n + 1$, so it cannot be expanded indefinitely.

*****The friendship theorem implies that the only such graph with no self-edges is the triangle. This is a corollary of the fact that every polarity $\varphi$ of an FPP has an absolute point $x:P$ such that $x \in \varphi(x)$.