## Generalizations of Nim

June 18, 2017

Probably everyone knows that the P-positions (previous player has won) in Nim are exactly those $(x_1,\ldots,x_r)$ such that $x_1 \oplus \cdots \oplus x_r = 0$, where $\oplus$ denotes bitwise XOR. For example, $(3,2,1)$ is a P-position, because $3 \oplus 2 \oplus 1 = 11_2 \oplus 10_2 \oplus 01_2 = 00_2 = 0$, whereas one glance at $(9,7,5,1)$ shows it is an N-position (next player wins), because only the pile of size $9$ contains $2^3$ in binary. Any winning move must take at least $2$ counters from this pile: doing so leaves a pile of size $7 = 4+2+1$, which is ideal for balancing the contribution $7 \oplus 5 \oplus 1 = 3$ from the other piles. Therefore we leave $3$ counters, taking $6$, reaching the P-position $(3,7,5,1)$.

Every finite impartial game is equal to a nimber. Nim is unusual in that knowing the P-positions immediately determines all its nimbers: $(x_1,\ldots,x_r) = m \star$ if and only if $(x_1,\ldots,x_r) + m\star = 0$ (disjoint sum of games) if and only if $(x_1,\ldots,x_r,m)$ is a P-position.

Theorem. The nimber of $(x_1,\ldots, x_r)$ is $(x_1 \oplus \cdots \oplus x_r)\star$.

Proof. By the previous paragraph, it suffices to find a winning move when $x_1 \oplus \cdots \oplus x_r = 2^s + M$ with $M < 2^s$. By induction, we may assume that all options of $(x_1,\ldots,x_r)$ have their expected nimbers. By reordering piles we may assume without loss of generality that $x_1 = 2^{s+1}A + 2^s + B$ where $A \in \mathbb{N}_0$ and $B < 2^s$. We take all but $2^{s+1}A + 2^s-1$ counters from the first pile, pause to admire the situation, and then take further counters to leave exactly $2^{s+1}A + (2^{s+1}A \oplus x_2 \oplus \cdots \oplus x_r)$. The parenthetical quantity is

$2^s \oplus B \oplus x_1 \oplus \cdots \oplus x_r = 2^s \oplus B \oplus 2^s \oplus M = B \oplus M < 2^s,$

so this is always possible. $\Box$

A small generalization of this argument finds explicit moves giving options of $(x_1,\ldots,x_r)$ with all nimbers $m' \star$ with $m' < x_1 \oplus \cdots \oplus x_r$.

Example. Before I learned about Combinatorial Game Theory I got thrashed at Nim by someone whose main tactic was to reduce to the zero position $(3,2,1)$. (Of course my opponent could also win any non-balanced position with two piles.) In memory of this occasion, consider the position $(19,13,3,2,1) = 19\star + 13\star = 30 \star$. To reach $(16 + c) \star$ with $0 \le c \le 7$, we observe that $30$ and $16 + c$ both have $16$ in their binary expansion, but only $30$ has $8$ (present in the pile $13$). We mentally cancel out the $16$s, leaving $(3,13,3,2,1)$ with target $s \star$, and play, as a first step, to $(3,7,3,2,1)$. We then calculate that

$3 \oplus 3 \oplus 2 \oplus 1 \oplus c = 3 \oplus c.$

We must leave $3 \oplus c$ counters in the half-eaten pile of size $7$; since $0 \le c \le 7$, this is always possible. For instance, if $c = 5$ then since $3 \oplus 5 = 6$, we leave $6$ counters. Since $3 \oplus 6 \oplus 3 \oplus 2 \oplus 1 = 6$, on restoring the cancelled subpiles, we reach a position of nimber $(19 \oplus 6 \oplus 3 \oplus 2 \oplus 1)\star = 21 \star$, as required.

Strong players of games instinctively maximize their mobility. This principle gives a decent strategy for games ranging from Settlers of Catan (a resource acquisition game in which early mobility is essential) and Othello (in which weak players often minimize their mobility, by playing to maximizing their counters in the early game; typically this creates multiple good moves for the opponent) and even a reasonable heuristic for chess, albeit one that has been criticized by Turing. It can be seen at work in the proof and example above, where the half-eaten pile of size

\begin{aligned} 2^{s+1}A + 2^s-1 &= 2^{s+1}A + 1 + 2 + \cdots + 2^{s-1} \\ &= 2^{s+1}A \oplus 1 \oplus 2 \oplus \cdots \oplus 2^{s-1} \end{aligned}

gives us ample options.

### $p$-Nim

Nim boils down to a fight over the nimber $x_1 \oplus \cdots \oplus x_r$. With the end firmly in mind, let us ask for a generalization in which $\oplus$ is replaced with $\oplus_p$, the analogue of $\oplus$ defined with addition modulo $p$. For instance,

$12 \oplus_3 3 \oplus_3 2 \oplus_3 1 = 110_3 \oplus_3 010_3 \oplus_3 002_3 \oplus_3 001_3 = 120_3 = 15,$

and in the hoped for generalization, $(12,3,2,1)$ will have nimber $15{}\star$ and (this is just a restatement) options with nimbers $0,\star,\ldots, 14{}\star$, but not $15{}\star$. Consider the option $9\star$. This can be reached only by removing $3$ counters from both the piles of sizes $12$ and $3$. So it seems we have to permit moves in multiple piles. The position $(1,1,\ldots,1)$ with $p-1$ singleton piles shows that moves in up to $p-1$ piles may be required. So, as a first try, we make the following definition.

Definition. Naive $p$-Nim is the impartial game in which each move consists of up to $p-1$ moves in Nim.

Thus, writing $d$ for Hamming distance, a position $x \in \mathbb{N}_0^r$ in naive $p$-Nim has $y \in \mathbb{N}_0^r$ as an option if and only if $y_i \le x_i$ for each $i$ and $1 \le d(x,y) \le p-1$.

Suppose, inductively, that the options of $(x_1,\ldots, x_r)$ have the required nimbers. Then $x$ has as options all $m' \star$ with $m' < m$.

Sketch proof. Take $s$ maximal such that $p^s$ appears more often in $m$ than $m'$. Cancelling as in the example, we may assume that $p^s$ is the greatest power of $p$ appearing in the base $p$-forms of $x_1, \ldots, x_r$. Take $m-m'-1$ subpiles $p^s$ from piles containing $p^s$ in $p$-ary, and in a remaining pile of size $\alpha p^s + B$ take all but $(\alpha-1)p^s + p^s-1$ of its counters; then use its subpile of size $(p-1) + (p-1)p + \cdots + (p-1)p^{s-1}$ to reach $m' \star$. $\Box$

However, $(x_1,\ldots, x_r)$ may well have further options. For instance, $(3,2,1)$ has the option $(3,0,0)$ with nimber $3$. This destroys the inductive foundations for the sketch proof above. In fact $(3,2,1) = 6\star$ in naive $3$-Nim.

The obvious fix is to bar all moves taking counters $[c_1,\ldots,c_r]$ in which $c_1 \oplus_p \cdots \oplus_p c_r = 0$. (The square brackets are used to distinguish moves from positions.) But still there are too many options: for instance $(6,3)$ should be a P-position but, according to the current rules, we can move by $[4,2]$ or $[5,1]$ to the P-positions $(2,1)$ and $(1,2)$. A computer search finds the following illustrative examples, listed with their intended nimber and the additional moves that must be made illegal:

• $(3,6), 0, \{[2,4],[1,5]\}$,
• $(3,7), 1, \{[2,7],[1,5]\}$,
• $(4,8), 0, \{[2,7]\}$,
• $(5,6), 2, \{[5,4],[4,5]\}$,
• $(5,7), 0, \{[4,5]\}$,
• $(6,6), 3, \{[5,1],[4,2],[2,4],[1,5]\}$
• $(1,5,7), 1, \{[0,4,5]\}$,
• $(2,3,7), 0, \{[2,1,0],[2,2,2]],[0,2,7],[0,1,5]\}$.

Let $v_p(x)$ denote the highest power of $p$ dividing $x \in \mathbb{N}$, and let $v_p(0) = \infty$. Some inspection of these examples may suggest the following definition.

Definition. $p$-Nim is naive $p$-Nim barring any move taking counters $[c_1,\ldots,c_r]$ such that

$v_p(c_1 \oplus_p \cdots \oplus_p \cdots \oplus_p c_r) > \mathrm{min} \{v_p(c_1), \ldots, v_p(c_r) \}.$

By the definition of $v_p(0)$, this generalizes our first attempted rule. Moreover, it bars any move not changing the purported nimber $x_1 \oplus_p \cdots \oplus_p x_r$ of $(x_1,\ldots,x_r)$. The following result is a special case of Lemma 3.1 in this paper of Irie.

Theorem [Irie] The nimber of the $p$-Nim position $(x_1,\ldots,x_r)$ is $(x_1 \oplus_p \cdots \oplus_p x_r)\star$.

Proof. Let $m = x_1 \oplus_p \cdots \oplus_p x_r$. Let $p^b = \nu_p(m-m')$ and let $p^s$ be the greatest power of $p$ appearing in $m-m'$ in $p$-ary. Cancelling as before, we may assume that $p^s$ is the greatest power of $p$ appearing in the base $p$-forms of $x_1, \ldots, x_r$. Similarly, we may cancel the powers $p^a$ with $a < b$ between $m$ and $m'$. So the only powers of $p$ that appear are between $p^b$ and $p^s$. (This will guarantee that our move satisfies the valuation condition.) Now play as in the sketch proof, replacing $p^s - 1$ with

$p^s - p^b = (p-1)p^b + (p-1)p^{b+1} + \cdots + (p-1)p^{s-1}.$

Since $m \star$ is not an option of $(x_1,\ldots,x_r)$, the mex rule implies that the nimber of $(x_1,\ldots,x_r)$ is $m \star$, as required. $\Box$

In fact Irie’s result is more general: a surprising effect of the valuation condition is that allowing moves in $p$ or more piles does not create options with new nimbers. This leads to the notion of the $p$-saturation of a game. The main focus of Irie’s paper is the $p$-saturation of Welter’s game, which is shown to have a remarkable connection with the representation theory of the symmetric group.

### Back to naive $p$-Nim

It seems that in many cases the nimbers in naive $p$-Nim are given by ordinary (naive?) addition. For example, this is true for all $k$-pile positions whenever $p > k$. When $p =3$, the exceptions for $3$ pile positions (in increasing order) with a small first pile are listed below.

• $(1,x,y)$: $(1,1,1) = 0$,
• $(2,x,y)$: $(2,2,2) = 0, (2,2,3) = \star$,
• $(3,x,y)$: $(2,2,3) = \star, (3,3,3) = 0, (3,3,4) = 2\star$,
• $(4,x,y)$: $(3,3,4) = 2\star, (4,4,4) = 0, (4,4,5) = \star, (4,4,6) = 3\star$, $(4,5,5) = 3\star$.

For instance, the matrix below shows nimbers for $(2,x,y)$ with $0\le x,y \le 4$.

$\left(\begin{matrix} 2 & 3 & 4 & 5 & 6 \\ 3 & 4 & 5 & 6 & 7 \\ 4 & 5 & 0 & 1 & 8 \\ 5 & 6 & 1 & 8 & 9 \\ 6 & 7 & 8 & 9 & 10 \end{matrix} \right)$

Once the pattern $(2,x,y) = (2+x+y)\star$ is established (this happens by $(2,4,4)$), the mex rule implies that it continues.

## Burnside’s method

April 24, 2017

Burnside proved in 1901 that if $p$ is an odd prime then a permutation group containing a regular subgroup isomorphic to $C_{p^2}$ is either imprimitive or 2-transitive. His proof was an early application of character theory to permutation groups. Groups with this property are now called B-groups.

Burnside attempted to generalize his 1901 result in two later papers: in 1911, he claimed a proof that $C_{p^n}$ is a B-group for any prime $p$ and any $n \ge 2$, and in 1921, he claimed a proof that all abelian groups, except for elementary abelian groups, are B-groups. The first claim is correct, but his proof has a serious gap. This error appears to have been unobserved (or, just possibly, observed but ignored, since the result was soon proved in another way using Schur’s theory of $S$-rings) until 1994 when it was noted by Peter Neumann, whose explication may be found in his introduction to Burnside’s collected works. In 1995, Knapp extended Burnside’s argument to give a correct proof. Burnside’s second claim is simply false: for example, $S_4 \wr S_2$ acts primitively on $\{1,2,3,4\}^2$, and has a regular subgroup isomorphic to $C_4 \times C_4$. In one of my current projects, I’ve simplified Knapp’s proof and adapted Burnside’s character-theoretic methods to show, more generally, that any cyclic group of composite order is a B-group.

The purpose of this post is to record some proofs omitted for reasons of space from the draft paper. This companion post has some notes on B-groups that may be of more general interest.

### Sums over roots of unity

Let $\xi$ be a primitive $n$th root of unity. Define

$R(r) = \{ r, r + p^{n-1}, \ldots, r+ (p-1)p^{n-1} \}$

for $0 < r < p^{n-1}$. Define a subset $Z$ of $\{1,\ldots,p^n-1\}$ to be null if there exists $s \in \mathbb{N}_0$ and distinct $r_{ij} \in \{1,\ldots, p^{n-1}-1\}$ for $0 \le i \le p-1$ and $1 \le j \le s$ such that $r_{ij} \equiv i$ mod $p$ for each $i$ and $j$ and

$Z = \bigcup_{i=0}^{p-1} \bigcup_{j=1}^s R(r_{ij}).$

Proposition 6.2 Let $\omega = \zeta^{p^{n-1} c}$ where $c$ is not divisible by $p$. Let $\mathcal{O} \subseteq \{1,\ldots,p^n-1\}$. Then

$\sum_{i \in \mathcal{O}} \zeta^i = \sum_{i \in \mathcal{O}} \omega^i$

if and only if either

1. $\mathcal{O}$ is null; or
2. $\mathcal{O} = \{p^{n-1}, \ldots, (p-1)p^{n-1} \} \; \cup \; \bigcup_{i=1}^{p-1} R(r_i) \; \cup \; Z$ where $Z$ is a null set, the $r_i$ are distinct elements of $\{1,\ldots,p^{n-1}-1\}\backslash Z$ and $r_i \equiv i$ mod $p$ for each $i$.

Proof. Since the minimum polynomial of $\zeta$ is

$1+X^{p^{n-1}} + \cdots + X^{(p-1)p^{n-1}}$

we have $\sum_{i \in R(r)} \zeta^i = 0$. Since $\xi^{p^{n-1}} = \omega$, we have $\sum_{i \in R(r)} \omega^i = p\omega^r$. It follows that $\sum_{i \in Z} \xi^i = \sum_{i \in Z} \omega^i = 0$ if $Z$ is a null set. (For the second equality, note the contributions from the $R(r_{ij})$ for fixed $j$ combine to give $p + p\omega + \cdots + p\omega^{p-1} = 0$.) For (2) we have $\sum_{i=1}^{p-1} \xi^{i p^{n-1}} = \omega + \cdots + \omega^{p-1} = -1$, $\sum_{i=1}^{p-1} \omega^{i p^{n-1}} = (p-1)$ and $\sum_{i=1}^{p-1} p \omega^i = - p$. This proves the ‘if’ direction.

Conversely, by Lemma 2.1 in the paper, $\mathcal{O} \backslash \{p^{n-1},\ldots,(p-1)p^{n-1} \}$ is a union of some of some of the sets $R(r)$. There exists a unique subset $A$ of $\{1,\ldots,p-1\}$ and unique $j_i \in \mathbb{N}$ for $0 \le i \le p-1$ and unique $r_{ij} \in \{1,\ldots, p^{n-1}-1\}$ for $1 \le j \le s$ and $0 \le i \le j_i$ such that

$\mathcal{O} = \{ p^{n-1} i : i \in A \} \cup \bigcup_{i=0}^{p-1}\bigcup_{j=1}^{i_j} R(r_{ij}).$

We have $\sum_{i \in \mathcal{O}} \zeta^i = \sum_{i \in A} \omega^i$ and $\sum_{i \in \mathcal{O}} \omega^i = |A| + \sum_{i=0}^{p-1} pj_i \omega^i$. Therefore

$|A| + \sum_{i=0}^{p-1} (pj_i - [i \in A]) X^i$

has $\omega$ as a root. Since this polynomial has degree at most $p-1$ and the minimal polynomial of $\omega$ is $1+X +\cdots + X^{p-1}$, it follows that the coefficients are constant. Hence

$|A| + pj_0 = pj_i - [i \in A]$

for $1\le i \le p-1$. If $A = \varnothing$ then $j_0 = j_1 = \ldots = j_{p-1}$ and $\mathcal{O}$ is null. Otherwise, taking the previous displayed equation mod $p$ we see that $A = \{1,\ldots,p-1\}$, and, moreover, the $j_i$ are constant for $1 \le i \le p-1$. (This holds even if $p=2$.) Set

$s = j_1 = \ldots = j_{p-1}.$

We have $j_0 = s-1$. Hence, $s \in \mathbb{N}$ and choosing in any way $r_i \in \{1,\ldots,p^{n-1}-1\}$ for $1 \le i \le p-1$ such that $r_i \in \mathcal{O}$, we see that $\mathcal{O}$ is the union of $\{p^{n-1}, \ldots, (p-1)p^{n-1}\}$, the sets $R(r_i)$ for $1 \le i \le p-1$ and a null set. $\Box$.

### Ramanujan matrices.

For $p$ prime and $n \in \mathbb{N}$ we define $R(p^n)$ to be the matrix

$\left( \begin{matrix} \scriptstyle 1 & \scriptstyle1 & \scriptstyle 1 & \scriptstyle\ldots & \scriptstyle 1 & \scriptstyle1 & \scriptstyle1 \\ \scriptstyle-1 & \scriptstyle p-1 & \scriptstyle p-1 & \scriptstyle \ldots & \scriptstyle p-1 & \scriptstyle p-1 & \scriptstyle p-1 \\ \scriptstyle0 & \scriptstyle -p & \scriptstyle p(p-1) & \scriptstyle \ldots & \scriptstyle p(p-1) & \scriptstyle p(p-1) & \scriptstyle p(p-1) \\ \scriptstyle 0 & \scriptstyle 0 & \scriptstyle -p^2 & \scriptstyle \ldots & \scriptstyle p^2(p-1) & \scriptstyle p^2(p-1) & \scriptstyle p^2(p-1) \\ \scriptstyle\vdots & \scriptstyle \vdots & \scriptstyle \vdots & \scriptstyle \ddots & \scriptstyle \vdots & \scriptstyle \vdots & \scriptstyle \vdots \\ \scriptstyle 0 & \scriptstyle 0 & \scriptstyle 0 & \scriptstyle \ldots & \scriptstyle -p^{n-2} & \scriptstyle p^{n-2}(p-1) & \scriptstyle p^{n-2}(p-1) \\ \scriptstyle 0 & \scriptstyle 0 & \scriptstyle 0 & \scriptstyle \ldots & \scriptstyle 0 & \scriptstyle -p^{n-1} & \scriptstyle p^{n-1}(p-1) \end{matrix} \right).$

More generally, if $d$ has prime factorization $p_1^{a_1} \ldots p_s^{a_s}$, we define $R(d) = R(p_1^{a_1}) \otimes \cdots \otimes R(p_s^{a_s})$. The rows and columns of $R(d)$ are labelled by the divisors $D$ of $2^m p^n$, as indicated below for the case $d = 2^3 p$, with $p$ and odd prime.

$R(2^3 p) = \begin{matrix} 1 \\ 2 \\ 2^2 \\ 2^3 \\ p \\ 2p \\ 2^2p \\ 2^3p \end{matrix} \left( \begin{array}{cccc|cccc} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ -1 & 1 & 1 & 1 & -1 & 1 & 1 & 1 \\ 0 & -2 & 2 &2 & 0 & -2 & 2 & 2 \\ 0 & 0 & -2^2 & 2 & 0 & 0 & -2^2 & 2^2\\ \hline -1 & -1 & -1 & -1 & 1 & 1 & 1 & 1 \\ 1 & -1 & -1 & -1 & -1 & 1 & 1 & 1 \\ 0 & 2 & -2 & -2 & 0 & -2 & 2 & 2 \\ 0 & 0 & 2^2 & -2^2 & 0 & 0 & -2^2 & 2^2 \end{array} \right).$

Say that a partition of $D \backslash \{ d\}$ is coprime if the highest common factor of the numbers in each part is $1$. The aim of the game is to find a non-empty set of rows, $X$ say, of $R(d)$ such that the subsets of columns (excluding column $d$) on which the sum of the rows in $X$ are equal form a coprime partition of the divisors.

There is an application to B-groups only when $d$ is even, in which case we may assume that $2 \in X$. Proposition 6.7 in the paper states that in this case, the only way to win this game when $d = 2^n$, $d = 2^n p$ or $d = 2 p^n$, where $p$ is an odd prime, is to put every single row in $X$. This implies that $C_{2^m p^n}$ is a B-group when $n \le 1$ or $m \le 1$. However the result on the game may well hold more generally.

#### Constant sums for $p^n$

The only coprime partition of $D \backslash \{p^n\}$ has a singleton part, so the row sums are all equal. By adding $p^{e-1}$ to each entry in row $p^e$ for $e \in \{1,\ldots, n\}$, we obtain the matrix below.

$\left( \begin{matrix} \scriptstyle 1 & \scriptstyle1 & \scriptstyle 1 & \scriptstyle\ldots & \scriptstyle 1 & \scriptstyle1 & \scriptstyle1 \\ \scriptstyle 0 & \scriptstyle p & \scriptstyle p & \scriptstyle \ldots & \scriptstyle p & \scriptstyle p & \scriptstyle p \\ \scriptstyle p & \scriptstyle 0 & \scriptstyle p^2 & \scriptstyle \ldots & \scriptstyle p^2 & \scriptstyle p^2 & \scriptstyle p^2 \\ \scriptstyle p^2 & \scriptstyle p^2 & \scriptstyle 0 & \scriptstyle \ldots & \scriptstyle p^3 & \scriptstyle p^3 & \scriptstyle p^3 \\ \scriptstyle\vdots & \scriptstyle \vdots & \scriptstyle \vdots & \scriptstyle \ddots & \scriptstyle \vdots & \scriptstyle \vdots & \scriptstyle \vdots \\ \scriptstyle p^{n-2} & \scriptstyle p^{n-2} & \scriptstyle p^{n-2} & \scriptstyle \ldots & \scriptstyle 0 & \scriptstyle p^{n-1} & \scriptstyle p^{n-1} \\ \scriptstyle p^{n-1} & \scriptstyle p^{n-1} & \scriptstyle p^{n-1} & \scriptstyle \ldots & \scriptstyle p^{n-1} & \scriptstyle 0 & \scriptstyle p^n \end{matrix} \right)$

which still has constant sums over the rows in $X$. Let $x_i = 1$ if $p^i \in X$ and let $x_i = 0$ otherwise. Writing numbers in base $p$, the row sums over $X$ are, for each column,

\begin{aligned} 1 \quad &\quad x_nx_{n-1} \ldots x_3x_2x_0 \\ p \quad &\quad x_n x_{n-1} \ldots x_3x_1x_0 \\ p^2\quad &\quad x_n x_{n-1} \ldots x_2x_1x_0 \\ \vdots \; \quad &\quad \qquad \vdots \\ p^{n-2} \quad &\quad x_nx_{n-2} \ldots x_2x_1x_0 \\ p^{n-1} \quad &\quad x_{n-1}x_{n-2} \ldots x_2x_1x_2. \end{aligned}

(Note in each case there are $n$ digits, the most significant corresponding to $p^{n-1}$.) The numbers in the right column are equal, hence all the $x_i$ are equal and $X = D$, as required. Taking $p=2$ this gives an alternative proof of Proposition 6.7(i).

#### Proofs of further claims on the Ramanujan matrices

Let $R(d)^{\circ}$ denote $R(d)$ rotated by a half turn. The following result was stated without proof in the paper; it implies that each $R(d)$ is invertible, with inverse $\frac{1}{d}R(d)^\circ$. (Update: I later found a much better proof, now outlined in the paper, using a lower-upper decomposition of $R(d)$.)

Proposition. We have $R(p^n)^{-1} = \frac{1}{p^n} R(p^n)^\circ$ and $\det R(p^n) = p^{n(n+1)}/2$.

Proof. Since $R(p^n)^\circ_{p^sp^t} = R(p^n)_{p^{n-s}p^{n-t}}$, we have $R(p^n)^\circ_{p^np^t} = 1$ for all $t \in \{0,\ldots, n\}$ and

\begin{aligned} R(p^n)^\circ_{p^sp^t} &= \begin{cases} (p-1)p^{n-s-1} & n-t \ge n-s \\ -p^{n-s-1} & n-t=n-s-1 \\ 0 & n-t \le n-s-2 \end{cases} \\ &= \begin{cases} (p-1)p^{n-s-1} & s \ge t \\ -p^{n-s-1} & s = t-1 \\ 0 & s \le t-2 \end{cases} \end{aligned}

for $s \in \{0,\ldots, n-1\}$ and $t \in \{0,\ldots,n\}$. We use this to show that

$\sum_{c=0}^n R(p^n)_{p^rp^c} R(p^n)^{\circ}_{p^c p^{r'}} = p^n [r=r']$

for $r,r' \in \{0,\ldots, n\}$. When $r=0$ the first term in each product is $1$, so the left-hand side is the column sum of the $n-r'$-th of $R(p^n)$; this is $p^n$ if $r'=0$ and $0$ otherwise, as required. Now suppose that $r \ge 1$. Since $R(p^n)_{p^rp^c}$ vanishes when $c \le r-2$, the left-hand side is

$-p^{r-1}R(p^n)^\circ_{p^{r-1}p^{r'}} + \sum_{c=r}^n p^{r-1}(p-1) R(p^n)_{p^cp^{r'}}.$

Take out a factor $p^{r-1}$ to define $L$. Substitute for $R(p^n)^\circ_{p^cp^{r'}}$, and split off the summand from $R(p^n)^\circ_{p^np^{r'}} = 1$ to get

\begin{aligned} L &= \begin{cases} (p-1)p^{n-r} & r' \le r-1 \\ -p^{n-r} & r' = r \\ 0 & r' \ge r+1 \end{cases} \\ & \qquad + (p-1) \sum_{c=r}^{n-1} (p-1) \begin{cases} (p-1)p^{n-c-1} & c \ge r' \\ -p^{n-c-1} & c = r'-1 \\ 0 & c \le r'-2 \end{cases} + p-1.\end{aligned}

We must now consider three cases. When $r=r'$ we get

\begin{aligned} L &= p^{n-r} + (p-1)\sum_{c=r}^{n-1} (p-1)p^{n-c-1} + p-1 \\ &= p^{n-r} + (p-1)(p^{n-r}-1) + p-1 \\ &= p^{n-r+1} \end{aligned}

as required. When $r \ge r'+1$ we have $c > r'$ in all summands so

\begin{aligned} L &= -(p-1)p^{n-r} + (p-1)\sum_{c=r}^{n-1} (p-1)p^{n-c-1} + p^{r-1}(p-1) \\ &= (p-1) \bigl( -p^{n-r} + (p^{n-r}-1) + 1 \bigr) \\ &= 0. \end{aligned}

When $r \le r'-1$ the first non-zero summand occurs for $c=r'-1$ so we have

\begin{aligned} L &= - (p-1)p^{n-r'} + (p-1) \sum_{c=r'}^{n-1} (p-1)p^{n-c-1} + p^{r-1}(p-1) \\ &= (p-1) \bigl( -p^{n-r'} + (p^{n-r'}-1) + 1 \bigr) \\ &=0. \end{aligned}

Now taking determinants, using that $\det R(p^n)^\circ = \det R(p^n)$ (the matrices are conjugate by the matrix with $1$s on its antidiagonal and $0$s elsewhere), we get $1 / \det R(p^n) = (p^n)^{n+1} \det R(p^n)$. Hence $\det R(p^n) = p^{n(n+1)/2}$, as required.

### Jordan block matrices.

Let $J$ be the $m \times m$ unipotent upper-triangular Jordan block matrix over $\mathbb{F}_p$. We have $(J-I)^m = 0$, hence $(J-I)^{p^r} = 0$ whenever $p^r \ge m$. On the other hand if $k \le m-1$ then $J^k$ has a $1$ in position $(1,k)$. Therefore $J^{p^r} = I$ if and only if $p^r \ge m$. We also need a result on the relative trace matrix $I + J + \cdots + J^{p^{r}-1}$. Note that $\binom{p}{k}$ is divisible by $p$ for $k \in \{1,\ldots, p-1\}$. (For instance, a $p$-cycle acts freely on the set of $k$-subsets of $\{1,\ldots, p\}$.) An easy inductive argument using $\binom{p-1}{k-1} + \binom{p-1}{k} = \binom{p}{k}$ shows that $\binom{p-1}{k} \equiv (-1)^k$ mod $p$ for all $k$. Now Lucas’ Theorem implies that $\binom{p^r-1}{k} \equiv (-1)^k$ mod $p$. Hence

$(J-I)^{p^r-1} = I + J + \cdots + J^{p^r-1}.$

It follows that the right-hand side is $0$ if and only if $p^r > m$.

## Model characters for wreath products with symmetric groups

April 24, 2017

Let $G$ be a finite group. The model character for $G$ is $\sum_{\chi \in \mathrm{Irr}(G)} \chi$. A nice short paper by Inglis, Richardson and Saxl gives a self-contained inductive proof that if $\pi_{2r}$ is the permutation character of $\mathrm{Sym}_{2r}$ acting by conjugacy on its set of fixed-point-free involutions then

$\sum_r (\pi_{2r} \times \mathrm{sgn}_{n-2r}) \bigl\uparrow^{\mathrm{Sym}_n}$

is the model character for $\mathrm{Sym}_n$. Assuming Pieri’s rule, that if $\alpha$ is a partition of $r$ then $\chi^\alpha \times \mathrm{sgn}_{\mathrm{Sym}_t} \uparrow^{\mathrm{Sym}_{r+t}} = \sum \chi^\lambda$, where the sum is over all partitions $\lambda$ obtained from $\alpha$ by adding $t$ boxes, no two in the same row, this follows from the well-known fact (proved inductively in the paper) that $\pi_{2r} = \sum_{\gamma \in \mathrm{Par}(r)} \chi^{2\gamma}$.

Note that $\pi_{2r} \times \mathrm{sgn}_{n-2r}\bigl\uparrow^{\mathrm{Sym}_n}$ is the induction of a linear character from the centralizer of the involution $(1,2)\ldots (2r-1,2r) \in \mathrm{Sym}_n$. (When $r=0$ we count the identity as an involution as an honorary involution.) Up to conjugacy, each involution is used exactly once to define the model character.

In an interesting paper, Baddeley generalizes the Inglis–Richardson–Saxl result to a larger class of groups. He makes the following definition.

Definition. An involution model for a finite group $G$ is a collection $\{ (\tau_1, \rho_1), \ldots, (\tau_c, \rho_c) \}$ such that $\{\tau_1,\ldots, \tau_c\}$ is a set of conjugacy-class representatives for the involutions of $G$ and $\rho_i : C_G(u_i) \rightarrow \mathbb{C}$ is a linear character for each $i \in \{1,\ldots, c\}$, chosen so that

$\sum_{\chi \in \mathrm{Irr}(G)} \chi = \sum_{i=1}^c \rho_i \bigl\uparrow_{\mathrm{Cent}_G(\tau_i)}^G$

For example, if an abelian group $A$ has an involution model then, since each centralizer is $A$ itself, comparing degrees shows that $c = |A|$, and so $A$ is an elementary abelian 2-group. Conversely, any such group clearly has an involution model. By the Frobenius–Schur count of involutions, a necessary condition for a group to have an involution model is that all its irreducible representations are defined over the reals.

Baddeley’s main theorem is as follows.

Theorem. [Baddeley] If a finite group $H$ has an involution model then so does $H \wr \mathrm{Sym}_n$.

The aim of this post is to sketch my version of Baddeley’s proof of his theorem in the special case when $H = C_2$. Some familiarity with the theory of conjugacy classes and representations of wreath products in Chapter 4 of James & Kerber, Representation theory of the symmetric group is assumed. The characters $\rho_i$ defined below differ from Baddeley’s by a factor of $\mathrm{Inf}_{\mathrm{Sym}_n}^{H \wr \mathrm{Sym}_n} \mathrm{sgn}_{\mathrm{Sym}_n}$; this is done to make $\phi_r$ (defined below) a permutation character, in analogy with the Inglis–Richardson–Saxl character $\pi_r$.

#### Aside: the Hyperoctahedral group

The group of all $n \times n$ matrices with entries $\pm 1$ that become permutation matrices when all $-1$ entries are changed to $1$ is isomorphic to $H \wr S_n$. Thus $C_2 \wr \mathrm{Sym}_n$ is the hyperoctahedral group of symmetries of the $n$-hypercube. It is a nice exercise to identify $\mathrm{Sym}_4$, the rotational symmetry group of the cube, as an explicit index $2$ subgroup of $H \wr \mathrm{Sym}_3$.

### Preliminaries

From now on let $H = \langle h \rangle \cong C_2$. The group $\mathrm{Sym}_n$ acts on $H^{\times n}$ by

$(b_1,\ldots,b_n)^\sigma = (b_{1\sigma^{-1}},\ldots,b_{n\sigma^{-1}}).$

This is a place permutation: the element $b_i$, in position $i$ on the left-hand side, occupies position $b_{i\sigma}$ on the right-hand side. Let $G = H^{\times n} \rtimes \mathrm{Sym}_n \cong H \wr \mathrm{Sym}_n.$

We write elements of $G$ as $(b_1,\ldots,b_n;t)$ where each $b_i \in \{1,h\}$.

#### Imprimitive action of $G$

For each $i \in \{1,\ldots, n\}$ introduce a formal symbol $\overline{i}$. (This could be thought of as $-i$, but I find that bar makes for a more convenient notation.) Let $\Omega = \{1, \ldots, n, \overline{1}, \ldots, \overline{n} \}$. Given $\sigma \in \mathrm{Sym}_{\{1,\ldots,n\}}$, we define $\overline{\sigma} \in \mathrm{Sym}_{\{\overline{1},\ldots,\overline{n}\}}$ by $i \overline{\sigma} = i$ for all $i \in \{1,\ldots, n\}$ and $\overline{i} \overline{\sigma} = \overline{i\sigma}$ for all $\overline{i} \in \{\overline{1},\ldots,\overline{n} \}$. Then $G$ is isomorphic to the subgroup $G_n \le \mathrm{Sym}_\Omega$ defined by $G_n = B_n \rtimes T_n$ where

$B_n = \langle (1, \overline{1}), \ldots, (n, \overline{n}) \rangle$

and

$T_n = \langle \sigma \overline{\sigma} : \sigma \in \mathrm{Sym}_n \rangle.$

So $G_n$ acts imprimitively on $\Omega$ with blocks $\{1, \overline{1}\}$, $\ldots$, $\{n,\overline{n}\}$.

#### Irreducible representations of $G$

Let $\epsilon : \{1,h\} \rightarrow \{\pm 1\}$ be the faithful character of $H \cong C_2$ and let $\widetilde{\epsilon}^{\times s}$ denote the linear character of $H \wr \mathrm{Sym}_s$ on which each of the $s$ factors of $H$ in the base group factor acts as $\epsilon$. Given a bipartition $(\lambda|\mu) \in \mathrm{BPar}(n)$ with $\lambda \in \mathrm{Par}(t)$ and $\mu \in \mathrm{Par}(n-t)$, we define

$\chi^{(\lambda|\mu)} = \bigl( \mathrm{Inf}_{\mathrm{Sym}_t}^{H \wr \mathrm{Sym}_t} \chi^\lambda \times \widetilde{\epsilon}^{\times (n-t)} \mathrm{Inf}_{\mathrm{Sym}_{n-t}}^{H \wr \mathrm{Sym}_{n-t}} \chi^\mu \bigr) \bigl\uparrow^{H \wr \mathrm{Sym}_n}.$

Basic Clifford theory shows that the characters $\chi^{(\lambda|\mu)}$ for $(\lambda|\mu) \in \mathrm{BPar}(n)$ form a complete irredundant set of irreducible characters of $G$. For example, the $n$-dimensional representation of $G$ as the symmetry group of the $n$-hypercube has character labelled by the bipartition $\bigl((n-1)|(1)\bigr)$.

#### Conjugacy classes of involutions in $G$

Since $G_n/B_n \cong T_n \cong \mathrm{Sym}_n$, any involution in $G$ is of the form $(b_1,\ldots,b_n ; \sigma)$ where $\sigma \in \mathrm{Sym}_n$ is an involution. Moreover, as the calculation $(h,1;(1,2))^2 = (h,h)$ suggests, the place permutation action of $\sigma$ on $(b_1,\ldots,b_n)$ permutes amongst themselves the indices $i$ such that $b_i = h$. By applying a suitable place permutation we may assume that $b_1,\ldots b_{n-s} = 1$ and $b_{n-s+1} = \ldots = b_n = h$, for some $s \in \{0,\ldots, n\}$. Now using that $(h,h ; (1,2))$ is conjugate, by $(h,1)$, to $(1,1;(1,2))$, we see that a set of conjugacy class representatives for the involutions in $G$ is

$\{ (\stackrel{n-s}{\overbrace{1, \ldots, 1}} , \stackrel{s}{\overbrace{h, \ldots, h}} ; (1,2) \ldots (2r-1,2r) \}$

for $r$ and $s$ such that $2r+s \le n$. The generalized cycle-type invariant defined in James–Kerber can be used to show no two of these representatives are conjugate.

#### Centralizers of involutions in $G$

As an element of $G_n$, the involution defined above is

$\tau_s^{(r)} = (n-s+1,\overline{n-s+1}) \ldots (n,\overline{n}) \theta_r\overline{\theta_r}$

where, by definition, $\theta_r = (1,2)\ldots (2r-1,2r) \in T_{2r} \le \mathrm{Sym}_r$. If $(b_1,\ldots,b_n;\sigma)$ commutes with $\tau_s^{(r)}$ then, passing to the quotient, $\sigma$ commutes with $\theta_r$. Therefore the non-singleton orbits

$\{1,2\}, \{\overline{1},\overline{2}\}, \ldots, \{2r-1,2r\}, \{\overline{2r-1},\overline{2r}\}$

of $\tau_s^{(r)}$ are permuted by $(b_1,\ldots,b_n;\sigma)$, as are the remaining non-singleton orbits $\{n-s+1, \overline{n-s+1}\}, \ldots, \{n,\overline{n} \}$.

Therefore

$\mathrm{Cent}_{G_n}(\tau_s^{(r)}) = \mathrm{Cent}_{G_{2r}}(\theta_r\overline{\theta_r}) \times G_{n-(2r+s)} \times G_s$

where $G_{n-(2r+s)}$ acts on $\{2r+1, \ldots, n-s, \overline{2r+1}, \ldots, \overline{n-s}\}$ and $G_s$ acts on $\{n-s+1,\ldots,n,\overline{n-s+1},\ldots,\overline{n}\}$. Clearly $(b_1,b_2,\ldots, b_{2r-1},b_{2r}) \in G_r$ commutes with

$(1,2)\ldots (2r-1,2r) (\overline{1},\overline{2})\ldots(\overline{2r-1},\overline{2r})$

if and only if $b_1 = b_2, \ldots, b_{2r-1} = b_{2r}$. Therefore the first factor is permutation isomorphic to

$D_r \rtimes \mathrm{Cent}_{T_{2r}}(\theta_r\overline{\theta_r})$

where

$D_r = \{(b_1,b_1,\ldots,b_r,b_r) : b_1, \ldots, b_r \in H \}$.

Set $E_r = \mathrm{Cent}_{T_{2r}}(\theta_r\overline{\theta_r})$. Note that $E_r$ is permutation isomorphic to $C_2 \wr \mathrm{Sym}_r$, acting with one orbit on $\{1,\ldots,r\}$ and another on $\{\overline{1},\ldots,\overline{r}\}$. (One has to get used to the two different ways in which the group $C_2 \wr \mathrm{Sym}_r$ arises; in this post I’ve used $H$ when the $C_2$ comes from the base group.)

For example, if $n=7$ then

$\tau_2^{(2)} = (1,2)(\overline{1},\overline{2})(3,4)(\overline{3},\overline{4})(6,\overline{6})(7,\overline{7}) \in G_7$

and the centralizer of $\tau_2^{(2)}$ is generated by $(1, \overline{1})(2, \overline{2})$, $(3,\overline{3})(4,\overline{4})$, $(5, \overline{5})$, $(6,\overline{6})$, $(7,\overline{7})$ in the base group $B_7$ and $(1,2)(\overline{1},\overline{2})$, $(1,3)(2,4)(\overline{1},\overline{3})(\overline{2},\overline{4})$ and $(6,7)(\overline{6},\overline{7})$ in the top group $T_7$. The first two top group generators generate $E_2$.

#### Definition of the linear representations and reduction

The second and third factors of the centralizer are both complete wreath product, so, by analogy with the Inglis–Richardson–Saxl paper, it is natural guess to define $\rho_s^{(r)}: \mathrm{Cent}_{G_n}(\tau_s^{(r)}) \rightarrow \mathbb{C}$ so that $\rho_s^{(r)}$ restricts to:

• The trivial character on $D_r \rtimes \mathrm{Cent}_{T_{2r}}(\theta_r\overline{\theta_r}) = D_r \rtimes E_r$;
• $\mathrm{Inf}_{\mathrm{Sym}_{n-(2r+s)}}^{H \wr \mathrm{Sym}_{n-(2r+s)}} \mathrm{sgn}_{\mathrm{Sym}_{n-(2r+s)}}$ on $G_{n-(2r+s)}$;
• $\widetilde{\epsilon}^{\times s} \mathrm{Inf}_{\mathrm{Sym}_s}^{H \wr \mathrm{Sym}_s} \mathrm{sgn}_{\mathrm{Sym}_s}$ on $G_s$.

That is (omitting the details of the inflations for brevity),

$\rho_s^{(r)} = 1_{D_r \rtimes E_r} \times \mathrm{Inf} \;\mathrm{sgn}_{\mathrm{Sym}_{n-(2r+s)}} \times \widetilde{\epsilon}^{\times s} \mathrm{Inf}\; \mathrm{sgn}_{\mathrm{Sym}_s}.$

Define

$\phi_r = 1_{D_r \rtimes E_r}\bigl\uparrow^{G_{2r}}.$

Since $\widetilde{\epsilon}^{\times 2r}$ restricts to the trivial character of $D_r \rtimes E_r$, we have $\phi_r = \widetilde{\epsilon}^{\times r}\phi_r$. The definition of $\rho_s^{(r)}$ above is therefore symmetric with respect to $1_H$ and $\epsilon$. Moreover, if

$\phi_r = \sum_{(\alpha|\beta) \in \mathrm{BPar}(2r)} m_{\alpha\beta} \chi^{(\alpha|\beta)}$

then, by Pieri’s rule for the hyperoctahedral group (this follows from Pieri’s rule for the symmetric group in the same way as the hyperoctahedral branching rule follows from the branching rule for the symmetric group — for the latter see Lemma 4.2 in this paper),

$\rho_s^{(r)} \bigl\uparrow^{G_n} = \sum_{(\alpha|\beta) \in \mathrm{BPar}(2r)} \sum_{(\lambda|\mu)} m_{\alpha\beta} \chi^{(\lambda|\mu)}$

where the second sum is over all bipartitions $(\lambda|\mu)$ of $n$ such that $\lambda$ is obtained by adding $n-(2r+s)$ boxes, no two in the same row, to $\alpha$ and $\mu$ is obtained by adding $s$ boxes, again no two in the same row, to $\beta$. Therefore Baddeley’s theorem holds if and only if $\phi_r$ is multiplicity-free, with precisely the right constituents for the Pieri inductions as $s$ varies to give us every character of $H \wr \mathrm{Sym}_n$ exactly once. This is the content of the following proposition.

Proposition. $\phi_r = \sum_{(\gamma|\delta) \in \mathrm{BPar}(r)} \chi^{(2\gamma|2\delta)}$

### Proof of the proposition

To avoid some messy notation I offer a ‘proof by example’. I believe it shows all the essential ideas of the general case.

Proof by example. Take $r=3$. We have

$D_3 = \langle (1,\overline{1})(2, \overline{2}), (3,\overline{3})(4,\overline{4}), (5, \overline{5})(6, \overline{6}) \rangle \le B$

and so $\phi_3$ is induced from the trivial character of

$\mathrm{Cent}_{G_3}(\tau_3^{(0)}) = D_3 \rtimes E_3$

(Recall that $\tau_3^{(0)} = \theta_3 \overline{\theta_3}$ where $\theta_3 = (1,2)(3,4)(5,6)$ and $E_3 = \mathrm{Cent}_{T_6}(\theta_3\overline{\theta_3})$.) To apply Clifford theory, it would be much more convenient if we induced from a subgroup of $G_6 = B_6 \rtimes T_6$ containing the full base group $B_6$. We arrange this by first inducing up to $B_6 \rtimes E_3$. (For the action of $E_3$, it is best to think of $B_6$ as $(H \times H)^{\times 3}$.) The calculation

$1_{D_1}\bigl\uparrow^{G_2} = \widetilde{1_H}^{\times 2} + \widetilde{\epsilon}^{\times 2}$

shows that, on restriction to $B_6$, the induced character

$1_{D_3 \rtimes E_3} \bigl\uparrow^{B_6 \rtimes E_3}$

is the sum of all products $\psi_1 \times \psi_2 \times \psi_3$ where each $\psi_i$ is one of the irreducible characters $1_{H \wr C_2} = \widetilde{1_H}^{\times 2}$ or $\eta_{H \wr C_2} = \widetilde{\epsilon}^{\times 2}$ on the right-hand side above. The centralizer $\mathrm{Cent}_{T_6}(\theta_3 \overline{\theta_r})$ acts transitively on the 3 factors: glueing together the products in the same orbits into induced characters we get that $1_{D_3 \rtimes E_3} \uparrow^{B_6 \rtimes E_3}$ has the following irreducible constituents:

• $\widetilde{1_{H \wr C_2}}^{\times 3} 1_{S_3}$
• $\bigl( \widetilde{1_{H \wr C_2}}^{\times 2} 1_{S_2} \times \eta_{H \wr C_2} \bigr) \bigl\uparrow_{(B_4 \rtimes E_2) \times (B_2 \rtimes E_1)}^{B_6 \rtimes E_3}$
• $1_{H \wr C_2} \times \widetilde{\eta_{H \wr C_2}}^{\times 2} 1_{S_2} \bigl\uparrow_{(B_2 \rtimes E_1) \times (B_4 \rtimes E_2)}^{B_6 \rtimes E_3}$
• $\widetilde{\eta_{H \wr C_2}}^{\times 3} 1_{S_3}$.

Note that the ’tilde-construction’ enters in two ways: once to combine characters of each two $H$-factors in the same orbit of $\tau_r$, and then again to combine the characters obtained in this way. As a small check, observe that the sum of degrees is $1 + 3 + 3 + 1 = 8$, which is the index of $D_3 \rtimes E_3$ in $B_6 \rtimes E_3$.

Reflecting the isomorphisms

$(H \wr C_2) \wr S_3 \cong H \wr (C_2 \wr S_3) \cong H \wr E_3 \cong B_6 \rtimes E_3$

we rewrite these characters as follows:

• $\widetilde{1_H}^{\times 6} 1_{E_3}$
• $\widetilde{1_H}^{\times 4} 1_{E_2} \times \widetilde{\epsilon}^{\times 2} 1_{E_1} \uparrow^{B_6 \rtimes E_3}$
• $\widetilde{1_H}^{\times 2} 1_{E_1} \times \widetilde{\epsilon}^{\times 4} 1_{E_2} \uparrow^{B_6 \rtimes E_3}$
• $\widetilde{\epsilon}^{\times 6} 1_{E_3}$.

It is now routine to induce ‘in the top group’ up to

$G_6 = C_2 \wr S_6 = B_6 \rtimes T_6$

using the decomposition of $\pi_r$ into characters labelled by even partitions. For the second summand we use transitivity of induction, starting at the subgroup $(B_4 \rtimes E_2) \times (B_2 \rtimes E_1)$ and going via $(B_4 \rtimes T_4) \times (B_2 \rtimes T_2)$. The third summand is dealt with similarly. Thus $\phi_3$ is the sum of the $\chi^{(\lambda|\mu)}$ for the following bipartitions $(\lambda|\mu)$:

• $\bigl((6)|\varnothing\bigr), \bigl((4,2)|\varnothing\bigr), \bigl((2,2,2)|\varnothing\bigr)$
• $\bigl((4)|(2)\bigr), \bigl((2,2)|(2)\bigr)$
• $\bigl((2)|(4)\bigr), \bigl((2)|(2,2)\bigr)$
• $\bigl(\varnothing|(6)\bigr), \bigl(\varnothing|(4,2)\bigr), \bigl(\varnothing|(2,2,2)\bigr)$.

as required. $\Box$

## Some Stirling Number identities by differentiation

March 20, 2017

Let $G(z)$ be the exponential generating function enumerating a family of combinatorial objects. For example, $-\log(1-z) = \sum_{n=1}^\infty z^n/n$ is the e.g.f. enumerating cycles (there are $(n-1)!$ cycles on $\{1,\ldots,n\}$) and $\mathrm{e}^z -1 = \sum_{n=1}^\infty z^n/n!$ is the e.g.f enumerating non-empty sets. Then $\exp G(z)$ is the e.g.f. enumerating set partitions where each part carries a $G$-structure. For example,

$\exp (\mathrm{e}^z - 1) = \sum_{n=0}^\infty B_n \frac{z^n}{n!}$

where the Bell Number $B_n$ is the number of set partitions of $\{1,\ldots, n\}$. We can keep track of the number of parts with a further variable. For example

$\exp \bigl( -x \log (1-z) \bigr) = \sum_{n=0}^\infty \sum_{m=0}^\infty \genfrac{[}{]}{0pt}{}{n}{m} x^m \frac{z^n}{n!}$

where the (unsigned) Stirling Number of the First Kind $\genfrac{[}{]}{0pt}{}{n}{m}$ is the number of permutations of $\{1, \ldots, n\}$ having exactly $m$ disjoint cycles. Similarly

$\exp \bigl( x(\mathrm{e}^z - 1) \bigr) = \sum_{n=0}^\infty \sum_{m=0}^\infty \genfrac{\{}{\}}{0pt}{}{n}{m} x^m \frac{z^n}{n!}$

where the Stirling Number of the Second Kind $\genfrac{\{}{\}}{0pt}{}{n}{m}$ is the number of set partitions of $\{1, \ldots, n\}$ into exactly $m$ parts.

All this is explained beautifully in Chapter 3 of Wilf’s book generating functionology, in a way that leads readily into the high-brow modern take on these ideas using combinatorial species. For my planned combinatorics textbook I expect to deal with products of exponential generating functions in an ad-hoc way, and probably not do much more, since it’s impossible to compete with Wilf’s exposition.

#### Two part structures where one part is boring

A special case of the multiplication rule is that $\exp(z) G(z)$ is the e.g.f enumerating two-part set compositions $(A,B)$ where $A$ carries no extra structure and $B$ carries a $G$-structure. For example, if $d_n$ is the number of derangements of $\{1,\ldots, n\}$ then, since any permutation is uniquely determined by its set of fixed points and the derangement it induces on the remaining points, we have $\exp(z) G(z) = \sum_{n=0}^\infty n! z^n/n! = 1/(1-z)$, giving

$G(z) = \exp(-z)/(1-z).$

For another example, observe that the e.g.f enumerating the $r^n$ functions $f : \{1,\ldots, n\} \rightarrow \{1,\ldots, r\}$ is

$\sum_{r=0}^\infty r^n \frac{w^r}{r!} \frac{z^n}{n!} = \sum_{r=0}^\infty \frac{w^r}{r!} \sum_{n=0}^\infty \frac{(rz)^n}{n!} = \sum_{r=0}^\infty \frac{w^r}{r!} \mathrm{e}^{rz} = \exp (w \mathrm{e}^z).$

Each such function is uniquely determined by a pair $(A,B)$ where $A = \{1,\ldots,r\} \backslash \mathrm{im} f$ carries no extra structure, and $B = \mathrm{im} f$ carries a set composition of $\{1,\ldots, n\}$ into $|B|$ parts. Therefore the e.g.f. for set compositions is

$\exp(-w) \exp (w\mathrm{e}^z) = \exp(w (\mathrm{e}^z-1)).$

(Note this series is doubly exponential: the number of set compositions of $\{1,\ldots,n\}$ into exactly $m$ parts is the coefficient of $w^m/m! z^n/n!$.) Since there are $m!$ set compositions associated to each set partition into $m$ parts, we get

$\exp(w (\mathrm{e}^z-1) = \sum_{s=0}^\infty \sum_{n=0}^\infty \genfrac{\{}{\}}{0pt}{}{n}{m} w^m \frac{z^n}{n!}.$

as claimed earlier.

#### Binomial inversion

Multiplication by an exponential series is closely related to binomial inversion. Let $G(z) = \sum_{n=0}^\infty a_n z^n/n!$. Then

$\exp(z) G(z) = \sum_{n=0}^\infty b_n z^n/n!$

if and only if $b_n = \sum_{m=0}^n \binom{n}{m} a_m$.

This gives a very quick and elementary proof of the derangements formula: enumerating permutations by their number of fixed points we have $n! = \sum_{m=0}^n \binom{n}{m} d_m$, so

$\sum_{n=0}^\infty d_n z^n/n! = \exp(-z) \sum_{n=0}^\infty n! z^n/n! = \exp(-z)/(1-z);$

now take the coefficient of $z^n$ to get

$d_n = n!\bigl( 1-\frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!} \bigr).$

#### Differentiation

Let $G(z) = \sum_{n=1}^\infty a_n z^n/n!$ be the exponential generating function for $G$-structures. We have seen that the weighted exponential generating function for $G$-structured set partitions is

$\sum_{m=0}^\infty \sum_{n=0}^\infty a^{(m)}_n x^m \frac{z^n}{n!} = \exp x G(z)$.

Differentiating $k$ times with respect to $x$, dividing by $k!$, and then setting $x=1$ we get

$\sum_{n=0}^\infty \sum_{m=0}^n a^{(m)}_n \binom{m}{k} \frac{z^n}{n!} = G(z)^k/k! \exp G(z).$

Now $G(z)^k/k!$ is the exponential generating function for $G$-structured set partitions into exactly $k$ parts, so, taking coefficients of $z^n/n!$, we get

\begin{aligned} \sum_{m=0}^n a^{(m)}_n \binom{m}{k} &= \Bigl[ \frac{z^n}{n!}\Bigr] \sum_{r=0}^\infty a^{(k)}_r \frac{z^r}{r!} \exp G(z) \\ &= \sum_{r=0}^n a^{(k)}_r \frac{n!}{r!} [z^{n-r}] \exp G(z) \\ &= \sum_{r=0}^n \binom{n}{r} a^{(k)}_r \bigl[\frac{z^{n-r}}{(n-r)!}\Bigr] \exp G(z) \\ &= \sum_{r=0}^n \binom{n}{r} a^{(k)}_r b_{n-r} \end{aligned}

where $b_n$ is the number of $G$-structured set partitions of $\{1,\ldots,n\}.$

This identity gives uniform proofs of two Stirling Number identities.

Taking $G(z) = -\log (1-z)$ to enumerate Stirling Numbers of the First Kind we have $\exp G(z) = 1/(1-z)$ and $b_n = n!$ so

$\sum_{m=0}^n \genfrac{[}{]}{0pt}{}{n}{m} \binom{m}{k} = \sum_{r=0}^n \binom{n}{r} \genfrac{[}{]}{0pt}{}{r}{k} (n-r)! = \genfrac{[}{]}{0pt}{}{n+1}{k+1}$

where the final equality holds because the middle sum counts triples $(X, \sigma, \tau)$ where $X$ is an $r$-subset of $\{1,\ldots, n\}$, $\sigma$ is a permutation of $X$ having exactly $k$ disjoint cycles and $\tau$ is a cycle on $\{1,\ldots,n,n+1\}\backslash X$; such triples are in obvious bijection with permutations of $\{1,\ldots,n ,n+1\}$ having exactly $k+1$ disjoint cycles.

Taking $G(z) = \mathrm{e}^z-1$ to enumerate Stirling Numbers of the Second Kind we have $\exp G(z) = \sum_{n=0}^\infty B_n \frac{z^n}{n!}$ and $b_n = B_n$, so

$\sum_{m=0}^n \genfrac{\{}{\}}{0pt}{}{n}{m} \binom{m}{k} = \sum_{r=0}^n \binom{n}{r} \genfrac{\{}{\}}{0pt}{}{r}{k} B_{n-r}.$

This, and the analogous identity for the Stirling Numbers of the First Kind, may be compared with

$\sum_{m=0}^n \binom{n}{m} \genfrac{\{}{\}}{0pt}{}{m}{k} = \genfrac{\{}{\}}{0pt}{}{n+1}{k+1}$

which follows from counting set partitions according to the size of the part containing $n+1$.

#### Bijective proofs

The left-hand side of the general identity above counts $G$-structured set partitions of $\{1,\ldots,n\}$ having exactly $k$ distinguished parts, and maybe some further undistinguished parts. (Typically differentiation transforms generating functions in this way). The right-hand side counts the same objects, by enumerating triples consisting of an $r$-subset $X$ of $\{1,\ldots,n\}$, a $G$-structured set partition of $X$ into $k$ distinguished parts, and an (undistinguished) $G$-structured set partition of $\{1,\ldots,n\}\backslash X$. This gives fully bijective proofs of both identities. So maybe they are not so deep: that said, even the special case $k=1$ of the first, namely

$\sum_{m=0}^n \genfrac{[}{]}{0pt}{}{n}{m} m = \genfrac{[}{]}{0pt}{}{n+1}{2}$

seemed non-obvious to me.

## Regular abelian subgroups of permutation groups

March 15, 2017

A B-group is a group $K$ such that if $G$ is a permutation group containing $K$ as a regular subgroup then $G$ is either imprimitive or $2$-transitive. (Regular subgroups are always transitive in this post.) The term ‘B-group’ was introduced by Wielandt, in honour of Burnside, who showed in 1901 that if $p$ is an odd prime then $C_{p^2}$ is a B-group. This is a companion to his important 1906 theorem that a transitive permutation group of prime degree is either solvable or $2$-transitive.

One might imagine that, post-classification, it would be clear which groups are B-groups, but this is far from the case. The purpose of this post is to collect some ancillary results with a bearing on this question.

### Quasiprimitivity

When Burnside’s argument succeeds in proving that $G$ is imprimitive, it does so by showing that the kernel $N$ of an irreducible non-trivial constituent of the permutation character of $G$ is intransitive. The orbits of $N$ are then a non-trivial block system for $G$. Thus $G$ is not even quasiprimitive. That one always gets this stronger result is explained by Corollary 3.4 in this paper by Li. The details are spelled out below.

Claim. If $G$ is a permutation group containing a regular abelian subgroup $K$ then $G$ is primitive if and only if $G$ is quasiprimitive.

Proof. Suppose $B$ is part of a non-trivial block system $\mathcal{B}$ for $G$. Let $L$ be the kernel of the induced action of $K$ on $\mathcal{B}$. The key observation is that $K/L$ acts regularly on $\mathcal{B}$: it acts semiregularly, since $K/L$ is abelian, and since $K$ is transitive, $K/L$ is transitive on $\mathcal{B}$. Similarly, since $K$ is transitive, $L$ is transitive on the block $B$ (given $\alpha, \beta \in B$, there exists $k \in K$ such that $\alpha^k = \beta$; then $B^k = B$ and because $k$ acts regularly on $\mathcal{B}$, we have $k \in L$). So we can factor $K$ as a top part $K/L$, acting regularly on $\mathcal{B}$, and a bottom part $L$ acting regularly on each block. In particular, the kernel of $G$ acting on $\mathcal{B}$ contains $L$ and $L$ has $\mathcal{B}$ as its orbits. Therefore $\mathcal{B}$ is the set of orbits of a normal subgroup of $G$, and $G$ is not quasiprimitive. $\Box$

Dropping the condition that $K$ be abelian, one finds many imprimitive but quasiprimitive groups: they can be obtained from factorizations $G = HK$ of a finite simple group $G$ where $H \cap K = 1$ and $H$ is not maximal. One nice example is $A_7 = F_{21}S_5$ where $F_{21}$ is the transitive Frobenius group of order $21$ inside $A_7$ and $S_5$ is generated by $(1,2,3,4,5)$ and $(1,2)(6,7)$. Here $F_{21}$ is not maximal because it is a subgroup of $\mathrm{GL}_3(\mathbb{F}_2)$ in its action on the $7$ non-zero vectors of $\mathbb{F}_2^3$. For an easier example, take $A_5 = \langle (1,2,3,4,5) \rangle A_4$.

### Factorizable groups

Say that a finite group $K$ is factorizable if there exists $t > 1$ and groups $K_1, \ldots, K_t$ such that $K \cong K_1 \times \cdots \times K_t$ where $|K_1| = \ldots = |K_t| \ge 3$. If $K$ is factorizable with $|K_i| = m$ for each $i$, then, since each $K_i$ acts as a regular subgroup of $S_m$, $K$ is a regular subgroup of $S_m \wr S_t$ in its primitive action on $\{1,\ldots,m\}^t$. Therefore no factorizable group is a B-group. This example appears as Theorem 25.7 in Wielandt Permutation Groups.

### Affine groups

An interesting source of examples is affine groups of the form $G = V \rtimes H$ where $V = \mathbb{F}_p^n$ for some prime $p$ and $H \le \mathrm{GL}(V)$. Here $V$ acts by translation on itself and $H$ is the point stabiliser of $0$. Any non-trivial block is invariant under the action of $V$ and so is a linear subspace of $V$. Therefore $G$ is primitive if and only if $H$ acts irreducibly on $V$. The action of $G$ is 2-transitive if and only if the point stabiliser $H$ is transitive. Therefore if there exists a linear group $H$ acting irreducibly but not transitively on $\mathbb{F}_p^n$, then $C_p^n$ is not a B-group.

#### Symmetric group representations

Consider the symmetric group $S_n$ acting on $\langle v_1, \ldots, v_n \rangle$ by permutation matrices. Let $V = \langle v_i - v_j : 1 \le i 2$ and $n \ge 2$ then the orbit of $v_1-v_2$ does not contain $v_2-v_1$, and hence $C_p^n$ is not a B-group in these cases. If $p = 2$ and $n \ge 4$ then the orbit of $v_1+v_2$ does not contain $v_1+v_2+v_3+v_4$. Therefore $C_2^m$ is not a B-group when $m$ is even and $m \ge 4$.

#### Exotic regular abelian subgroups of affine groups

An interesting feature of these examples, noted by Li in Remark 1.1 following the main theorem, is that $V \rtimes H$ may have regular abelian subgroups other than the obvious translation subgroup $V$. Let $t_v : V \rightarrow V$ denote translation by $v \in V$.
Take $n=2r+1$, fix $s \le r$, and consider the subgroup

$K = \langle (2,3)t_{v_1+v_2}, \ldots, (2s,2s+1)t_{v_1+v_{2s}}, t_{v_{2s+2}}, \ldots, t_{v_{2r+1}} \rangle$.

By the multiplication rule

$h t_v h' t_{v'} = h h' t_{vh' + v'}$

we see that $(2t,2t+1)t_{v_1+v_{2t}}$ has square $t_{v_{2t}+v_{2t+1}}$ and that the generators of $K$ commute. Therefore $K \cong C_4^s \times C_2^{2(r-s)}$, and no group of this form, with $r \ge 2$ is a B-group. (Since these groups are factorizable, this also follows from Wielandt’s result above.)

Li’s paper includes this example: the assumption that $n$ is odd, needed to ensure that $V$ is irreducible, is omitted. In fact it is an open problem to decide when $C_2^n$ is a B-group.

Perhaps surprisingly, this example does not generalize to odd primes. To see this, we introduce some ideas from an interesting paper by Caranti, Della Volta and Sala. Observe that if $K$ is a regular abelian subgroup of $G$ then for each $v \in V$, there exists a unique $g_v \in K$ such that $0g_v = v$. There exists a unique $h_v \in H$ such that $g_v = h_v t_v$. By the multiplication rule above we have

$h_u t_u h_v t_v = h_u h_v t_{uh_v + v} = h_vh_u t_{vh_u+u} = h_vt_vh_ut_u$

for all $u, v \in V$. Therefore $\{h_v : v \in V\}$ is an abelian subgroup of $H$ and $uh_v + v = vh_u + u$ for all $u,v \in V$. Replacing $v$ with $v+w$, we have

\begin{aligned} uh_{v+w} + (v+w) &= (v+w)h_u + u \\ &= vh_u + wh_u + u \\ &= uh_v + v + uh_w + w - u \end{aligned}

and so, cancelling $v+w$, we get the striking linearity property

$h_{v+w} = h_v + h_w - \mathrm{id}$.

Equivalently, $h_{v+w}-\mathrm{id} = (h_v-\mathrm{id}) + (h_w-\mathrm{id})$. Since $(h_v - \mathrm{id})(h_w - \mathrm{id}) = (h_vh_w - \mathrm{id}) - (h_v - \mathrm{id}) - (h_w - \mathrm{id})$, it follows that the linear maps $\{h_v - \mathrm{id} : v \in V \}$ form a subalgebra of $\mathrm{End}(V)$. (This is essentially Fact 3 in the linked paper.)

#### Abelian regular subgroups of odd degree affine symmetric groups

Suppose that $K$ is a regular abelian subgroup of $V \rtimes S_n$ and that there exists $v \in V$ such that $h_v \not= \mathrm{id}$. The matrix $M$ representing $h_v$ in the basis $v_2-v_1, \ldots, v_n-v_1$ of $V$ is a permutation matrix if $1 h_v = 1$. If $1h_v = a > 1$ and $bh_v = 1$ then from

$(v_i - v_1)h_v = (v_{ih}-1) - (v_a-v_1)$

we see that $M$ has $-1$ in each entry in the column for $v_a-v_1$, and the only other non-zero entries are a unique $1$ in the row for $v_i-v_1$, for each $i \not= b$. For example, $(1,2,3,4)$ is represented by

$\left( \begin{matrix} -1 & 1 & 0 \\ -1 & 0 & 1 \\ -1 & 0 & 0 \end{matrix} \right).$

By the linearity property above, $h_{2v} = 2h_v - \mathrm{id}$. But if $p > 2$ then $2M - I$ is not of either of these forms. Therefore $V$ is the unique regular abelian subgroup of $V \rtimes S_n$.

Suppose that $p = 2$. Suppose that $h_v$ has a cycle of length at least $4$. By relabelling, we may assume this cycle is $(1,2,\ldots,a)$ where $a$ is a 2-power. The matrix representing $h_v$ has $-1$ entries in column $2$, and the matrix representing $h_v^{-1}$ has $-1$ entries in column $a$. Therefore the matrix representing $h_v + h_v^{-1} + \mathrm{id}$ has $-1$ entries in both columns $2$ and $a$, and so is not of the permitted form. It follows that, when $p=2$ and $n=2r+1$ is odd, the only abelian regular subgroups of $V \rtimes S_n$ are isomorphic to $C_4^s \times C_2^{2r-2s}$. (In fact it seems they are precisely the subgroups constructed above.)

#### Abelian regular subgroups of even degree affine symmetric groups

Now suppose that $n$ is even. Let $U = V / \langle v_1 + \cdots + v_n \rangle$ and consider $U \rtimes S_n$. When $n=4$ the action of $S_4$ is not faithful: after factoring out the kernel $\langle (1,2)(3,4), (1,3)(2,4)\rangle$ we get $\mathbb{F}_2^2 \rtimes S_3 \cong S_4$, which has abelian regular subgroups $C_2 \times C_2$ and $C_4$. When $n=6$, there is an abelian regular subgroup isomorphic to $C_8 \times C_2$, generated by $t_{v_1+v_3}(3,4,5,6)$ and $t_{v_1+v_2}$. The obstruction seen above does not apply: for example, in the basis $v_1+v_3, v_1+v_4,v_1+v_5,v_1+v_6$, we have

$(3,4,5,6) + (3,6,5,4) + \mathrm{id} \mapsto \left( \begin{matrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{matrix} \right)$

which is the matrix representing $(1,2)(3,6,5,4)$, thanks to the relations present in the quotient module. However, in this case the action of $S_6$ on $\mathbb{F}_2^4$ is transitive (any element of $U$ is congruent to some $v_i + v_j$) and in fact $C_8 \times C_2$ is a B-group. (A related, remarkable fact, is that the action of $A_6$ on $U$ extends to a $2$-transitive action of $A_7$, giving an example of a $3$-transitive affine group, $U \rtimes A_7$.) A computer calculation shows that, up to conjugacy, there are also three abelian regular subgroups of $U \rtimes S_n$ isomorphic to $C_4 \times C_4$ and two isomorphic to $C_4 \times C_2 \times C_2$. If $n \ge 8$ then a similar argument to the odd degree case shows that any abelian regular subgroup has exponent at most $4$, so we get no new examples.

#### Elementary abelian subgroups other than $V$

The following lemma has a simple ad-hoc proof.

Lemma. If $n < p$ then the only regular abelian subgroups of $V \rtimes \mathrm{GL}(V)$ are elementary abelian.

Proof. Suppose that $K$ is an elementary abelian subgroup of $V \rtimes \mathrm{GL}(V)$. Let $h_v t_v \in K$. Since $h_v - 1$ is a nilpotent $n \times n$-matrix and $n \textless\; n$ we have $(h_v - 1)^p = 0$ The $p$-th power of $h_v t_v \in V$ is therefore $t_w$ where $w = v + vh_v + \cdots + vh_v^{p-1}$. Since each Jordan block in $h_v$ has size at most $p-1$, we have $w=0$. Therefore $K$ has exponent $p$. $\Box$

Note however that there exist elementary abelian subgroups other than the obvious $V$ whenever $n \ge 2$. Possibly they can be classified by the correspondence with nil-algebras in the Caranti, Della Volta, Sala paper.

#### Algebra groups

Given $n \in \mathbb{N}$ and a field $F$, define an algebra group to be a subgroup of $\mathrm{GL}_n(F)$ of the form $\{I + X : X \in A \}$ where $A$ is a nilalgebra of $n \times n$ matrices. (We may assume all elements of $A$ are strictly upper triangular.) The subalgebra property above shows that any abelian regular subgroup of an affine group of dimension $n$ over $\mathbb{F}_2$ is an extension

$1 \rightarrow C_2^r \hookrightarrow K \twoheadrightarrow \{I + X : X \in A \} \rightarrow 1$

of an algebra group by an elementary abelian 2-group. If $F$ has characteristic $2$ then, since

$(I+X)(I+Y) = (I+XY) + (I+X) + (I+Y)$

by the dual of the calculation above, a subgroup $G$ of $\mathrm{GL}_n(F)$ is an algebra group if and only if $\{ I + g : g \in G \}$ is additively closed.

An exhaustive search shows that if $n \le 4$ then every 2-subgroup of $\mathrm{GL}_n(\mathbb{F}_2)$ is an algebra group; note it suffices to consider subgroups up to conjugacy. But when $n = 5$ there are, up to conjugacy, 110 algebra subgroups of $\mathrm{GL}_5(\mathbb{F}_2)$, and 66 non-algebra subgroups, of which 34 are not isomorphic to an algebra subgroup. Of these:

• 1 is abelian, namely $C_8$;
• 4 are nilpotent of class 2, namely $\langle x, c : x^8 = c^2 = 1, x^c = x^5 \rangle$, $\langle x, y, c : x^4 = y^2 = c^4 = 1, x^c = xy, y^c = y \rangle$, $\langle x, y, c : x^2 = y^2 = c^8 = 1, x^c = xy, y^c = y \rangle$, $\langle x,y,c : x^4 = y^2 = c^4 = 1, x^c = xy, y^c = y\rangle$;
• 17 are nilpotent of class 3;
• 12 are nilpotent of class 4.

The appearance of $C_8$ is explained by the following lemma, which implies that there is an algebra group isomorphic to $C_{2^m}$ if and only if $m \le 2$.

Lemma. Let $g \in \mathrm{GL}_n(\mathbb{F}_2)$ have order $2^m$. Then $\langle g \rangle$ is an algebra group if and only if all Jordan blocks in $g$ have dimension at most $3$.

Proof. Let $1 + X$ be a Jordan block of $g$ of dimension $d$ and order $2^m$. The subalgebra of $d\times d$-matrices generated by $X$ has dimension $d-1$, and so has size $2^{d-1}$. Therefore if $\langle g \rangle$ is an algebra group then $2^{d-1} \le 2^m$. Hence $d-1 \le m$. On the other hand, since $X^{d-1} \not=0$, if $2^r < d$ then $(1+X)^{2^r} \not=0$, and so the order of $1+X$ is the minimal $s$ such that $2^s \ge d$. Hence $2^{m-1} < d$. Combining these inequalities we get

$2^{m-1} < d \le m+1$

and so $m \le 2$, and, since $d-1 \le m$, we have $d \le 3$. The converse is easily checked. $\Box$.

## A result of Newell on plethysms of symmetric functions

January 7, 2017

It’s often said that part of the unique character of mathematics is that it builds on itself. Old results may be forgotten, but are only very rarely found to be incorrect. But changes in the ‘expected general background’ make many old papers impenetrable. For example, a long standing conjecture of Foulkes was introduced in a paper with the title ‘On the concomitants of the quintic and sextic up to degree four in the coefficients of the ground form‘. How many algebraists working today would immediately know even roughly what it is about? Of them, I’m sure still fewer would expect to read his paper with any pleasure. Certainly I cannot.

The purpose of this post is to do a small, but detailed, translation exercise on a paper of Newell from 1951, with the title ‘A theorem on the plethysm of $S$-functions‘. This is a relatively late publication from the stable of British algebraists working on invariant theory, so one might hope it would be fairly accessible.

### Overview

Fix $m, n \in \mathbb{N}$ and let $\lambda$ be a partition of $mn$. Newell’s paper is cited in an important paper of Weintraub for the pair of results stated verbatim below:

\begin{aligned} ([m+1] \odot [1^n],[\lambda_1+1,\ldots,\lambda_n+1]) &= ([m] \odot [n], [\lambda]) \\ ([m+1] \odot [n],[\lambda_1+1,\ldots,\lambda_n+1]) &= ([m] \odot [1^n], [\lambda]) \end{aligned}.

Weintraub defines all his notation (which was standard for the time) clearly. For example, $[\lambda]$ denotes the irreducible representation of the symmetric group labelled by the partition $\lambda$, and $\odot$ is the analogue of the plethysm product $\circ$ for the symmetric group, with the variables swapped. Writing $s_\lambda$ for the Schur function labelled by the partition $\lambda$, an equivalent statement in the language of symmetric functions is:

\begin{aligned} \langle s_{(1^n)} \circ s_{(m+1)}, s_{\lambda + (1^n)} \rangle &= \langle s_{(n)} \circ s_{(m)}, s_\lambda \rangle \\ \langle s_{(n)} \circ s_{(m+1)}, s_{\lambda + (1^n)} \rangle &= \langle s_{(1^n)} \circ s_{(m)}, s_\lambda \rangle \end{aligned}.

Even a careful visual inspection of Newell’s paper reveals nothing that looks remotely like Weintraub’s statement (or my restatement). But in fact, the upper displayed equations are a special case of Newell’s Theorem 1, stated verbatim below:

#### Theorem 1.

If $\{m\} \otimes \{n\} = \sum \{\nu\}$ where $m$ and $n$ are integers then for any integer $k \le n$

$\sum g_{(1^k)\xi\nu} \{\nu\} = [(m-1)\otimes (1^k)][\{m\} \otimes \{n-k\}]$.

To be fair to Newell, $g$ is defined earlier: one reads ‘… where $g_{rst}$ is defined from the multiplication of $S$-functions by means of $\{r\}\{s\} = g_{rst}\{t\}$‘. So alles klar?

### Translation up to Theorem 1

Much of Newell’s notation was standard at its time: $\{\nu\}$ is the Schur function $s_\nu$ and $\otimes$ is the plethystic product corresonding to Weintraub’s $\odot$. (So again, the order is reversed compared to $\circ$.) The hypothesis $\{m\} \otimes \{n\} = \sum \{\nu\}$ may still seem a little mysterious to modern eyes, since a general plethysm is certainly not multiplicity-free: nowadays we might write ‘$s_{(n)} \circ s_{(m)} = \sum c_\nu s_{\nu}$ where $c_\nu \in \mathbb{N}_0$ for each partition $\nu$ of $mn$‘. Only the $g$ remains to be understood: Newell defines $g_{rst}$ for natural numbers $r, s, t$, but then uses general partitions as the coefficients. Even in the special case, and with the benefit of knowing what is meant, his definition seems highly unclear to me.

A clue to the correct interpretation is given by the start of the sentence defining $g$ quoted above: ‘It is known [(2) 349] that if $\{\lambda\} \otimes \{n\} = \sum \{\nu\}$ then $\sum g_{1\xi\nu} \{\xi\} = \{\lambda\} \otimes \{n-1\}[g_{1\mu \lambda} \{\mu\}]$ …’. This must be parsed bearing in mind the summation convention that the repeated letters $\xi$ and $\mu$ are summed over. So, in modern language, it says: ‘if $s_n \circ s_\lambda = \sum c_\nu s_\nu$ then

$\sum_\nu c_\nu \sum_\xi g_{1\xi\nu} s_\xi = (s_{(n-1)} \circ s_\lambda)\sum_\mu g_{1\mu\lambda} s_\mu.'$

(I have included the parentheses around $s_{(n-1)} \circ s_\lambda$ as a small editorial mercy.) After staring at this for a while, I decided it must express the following symmetric function identity:

$(s_{(n)} \circ s_\lambda)\!\downarrow = (s_{(n-1)} \circ s_\lambda)(s_\lambda\!\downarrow)$,

where $s_\lambda\!\downarrow$ is the sum of all Schur functions $s_\mu$ labelled by partitions $\mu$ obtained from $\lambda$ by removing a box. (This identity is proved below, using the symmetric group.) Reciprocally, we have

$\langle s_\lambda\!\downarrow, s_\mu\rangle = \langle s_\lambda, s_{(1)}s_\mu \rangle.$

Thus $\downarrow$ is the adjoint to multiplication by $s_{(1)}$, and Newell’s $g_{1\mu \nu}$ are the corresponding coefficients. Generalizing freely, we can guess the correct definition of $g_{\theta\xi \nu}$.

Definition. Given partitions $\theta, \mu, \nu$ we define

$g_{\theta \mu\nu} = \langle s_\nu, s_\theta s_\mu \rangle.$

If we believe this is correct, then after one more piece of guesswork where we amend $g_{(1^k)\xi\nu}\{\nu\}$ in the statement of Theorem 1 to $g_{(1^k)\xi\nu}\{\xi\}$ (making it consistent with the above, and also with the analogous Theorem 1A), the conclusion of Theorem 1 becomes

$\sum_\nu c_\nu \sum_\xi \langle s_\xi s_{(1^k)}, s_\nu \rangle s_\xi = (s_{(1^k)}\circ s_{(m-1)})(s_{(n-k)} \circ s_{(m)}).$

The left-hand side is

$\sum_\xi \langle s_\xi s_{(1^k)}, \sum_\nu c_\nu s_\nu \rangle s_\xi = \sum_\xi \langle s_\xi s_{(1^k)}, s_{(n)} \circ s_{(m)} \rangle s_\xi.$

Therefore Newell’s Theorem 1 can be stated as follows:

$\sum_{\xi} \langle s_{\xi} s_{(1^k)}, s_{(n)} \circ s_{(m)} \rangle s_\xi = (s_{(1^k)}\circ s_{(m-1)})(s_{(n-k)} \circ s_{(m)}).$

If $s_\nu$ appears in $s_{(n)} \circ s_{(m)}$ then $\nu$ has at most $n$ parts. On the other hand, by Pieri’s rule, $s_{\xi}s_{(1^k)}$ is the sum of all partitions $\nu$ obtained from $\xi$ by adding $k$ boxes, no two in the same column. Therefore, in the special case when $k=n$, we have

$\langle s_\xi s_{(1^n)}, s_{(n)} \circ s_{(m)} \rangle = \langle s_{\xi + (1^n)}, s_{(n)} \circ s_{(m)} \rangle.$

Since $s_{(0)} \circ s_{(m)} = s_{(0)} = 1$, the unit symmetric function, we obtain

$\langle s_{\xi + (1^n)}, s_{(n)} \circ s_{(m)} \rangle = \langle s_\xi, s_{(1^n)} \circ s_{(m-1)}\rangle.$

This is equivalent to the first displayed equation in my restatement of Weintraub’s version of Newell’s result. The second displayed equation can be translated similarly.

### Proof of Theorem 1

We prove Newell’s theorem in the symmetric group, replacing $m$ with $m+1$ for consistency with Weintraub’s statement. Let $\Omega$ be the collection of all set partitions of $\{1,\ldots,(m+1)n\}$ into $n$ sets each of size $m+1$, and let $M$ denote the corresponding permutation module for $\mathbb{C}S_{(m+1)n}$ with basis $\Omega$. In Weintraub’s notation, the permutation character of $M$ is $[m+1] \odot [n]$.

Consider the restricted module $M\!\downarrow_{S_k \times S_{(m+1)n-k}}$, where $S_k$ permutes $\{1,\ldots, k\}$ and $S_{(m+1)n-k}$ permutes $\{k+1,\ldots, (m+1)n\}$. The maximal summand of $M$ on which $S_k$ acts as the sign representation is spanned by the symmetrized set partitions $v_\mathcal{P}$ for $\mathcal{P} \in \Omega$, where

$v_\mathcal{P} = \mathcal{P} \sum_{\sigma \in S_k} \sigma \mathrm{sgn}(\sigma).$

Observe that $v_\mathcal{P} = 0$ unless $\mathcal{P}$ has at most one entry from $\{1,\ldots, k\}$ in each of its $n$ constituent $(m+1)$-sets. If $X$ and $Y$ are two such subsets, with entries $x_1, \ldots, x_{m}$ and $y_1, \ldots, y_{m}$ in $\{k+1,\ldots,(m+1)n\}$ respectively, then

$v_\mathcal{P} (x_1,y_1)\ldots (x_{m},y_{m}) = -v_\mathcal{P}.$

It follows that

\begin{aligned}\quad [m+1] \odot [n] & \!\downarrow_{S_k \times S_{(m+1)n-k}} \\ & \hskip0.1in = \mathrm{sgn}_{S_k} \times ([m] \odot [(1^k)])([m+1] \odot [n-k]) + \pi \end{aligned}

where $\pi$ is a sum of irreducible characters of $S_k \times S_{(m+1)n-k}$ not of the form $\mathrm{sgn}_{S_k} \times \chi^\tau$. By Frobenius reciprocity,

\begin{aligned} \langle [m+1] \odot [n], & \mathrm{sgn}_{S_k} \times \chi^\xi \uparrow^{S_{(m+1)n}} \rangle \\ &\hskip0.2in = \langle ([m] \odot [(1^k)])([m+1] \odot [n-k]), \chi^\xi \rangle \end{aligned}

for each partition $\xi$ of $(m+1)n -k$. This is equivalent to

$\langle s_{(n)} \circ s_{(m+1)}, s_{(1^k)}s_\xi \rangle = \langle (s_{(1^k)} \circ s_m)(s_{(n-k)} \circ s_{(m+1)}, s_\xi \rangle$

which is obtained from my restatement of Newell’s theorem by taking the inner product with $s_\xi$. The companion result follows by a sign twist. $\Box$

No wonder old results are frequently reproved: its invariably less work than reading the original.

## Henry IV: Donmar at King’s Cross

December 2, 2016

This is a well established production so hardly needs a review, least of all from me, but here goes anyway. Harriet Walter’s performance as the titular King was astonishing: graceful, subtle and slightly knowing. Her command of Shakespeare’s language was superb, and orders of magnitude beyond the rest of the cast. Often this didn’t matter: the ruthless editing of the two parts into a single play meant many more complicated speeches were cut, and robust physical theatricality carried many other parts.

Jade Anouka’s Hospur made every scene she was in sparkle (even her dead body had a commanding presence!). Clare Dunne’s performance as Prince Hal seemed to get stronger throughout the evening. Shakespeare’s comparisons to his earlier play Richard II survived the editing and gave some needed depth to the death-bed scene between father and son in the final act. Sophie Stanton’s Falstaff left me a bit cold. The tavern scene where she and Prince Hal imitate Henry IV (and his likely reaction to Hal’s disportations) was filled with the camaraderie and bouyant wit I expected from Falstaff, but at other times, particularly in the scenes with the lesser actors, she seemed less full of her role.

There was an attempt, not entirely successful I think, to give Prince Hal’s final rejection of Falstaff extra weight by use of the prison setting. This framing also helps to excuse the many clunky props, such as a toy dog, or the assorted lollipops the actors sucked on intermittently, or the fake swords that were unrecognisable as weapons of any sort. Perhaps the point was to emphasise the juvenile nature of much of the early goings on? But to pick a tough comparison, in Lilies, the prison setting works brilliantly to emphasise the key themes: here it sometimes seemed unhelpful. Maybe it works better in the companion plays, Julius Ceasar and The Tempest? That said, there was effective use of the theatre space (the audience sits on all sides—the atmosphere was frequently intense), and some nice symbolism, for example, when Hotspur is slain on top of the prison yard’s basketball court’s centre circle, a map of England and Wales spray-painted around him.

The all female cast helped to bring out the gender themes, for example the generally contemptuous treatment of Mistress Quickly, and the marriage between Lady Percy (played by Sheila Atim, showing great promise) and Hotspur. The programme has an interesting article by Charlotte Higgins on the 2:1 problem: of the 10 theatres that received the most money from the Arts Council, women account for about 1/3 of the board members, artistic directors and actors. Higgins makes the interesting point that the colossal gender balance in Shakespeare (of 981 roles, 826 are men, 155 are women) shapes audience expectations of casts, and has knock-on effects for careers throughout British theatre.

Overall: an enjoyable evening, but certainly not one for the purists.