## Henry IV: Donmar at King’s Cross

December 2, 2016

This is a well established production so hardly needs a review, least of all from me, but here goes anyway. Harriet Walter’s performance as the titular King was astonishing: graceful, subtle and slightly knowing. Her command of Shakespeare’s language was superb, and orders of magnitude beyond the rest of the cast. Often this didn’t matter: the ruthless editing of the two parts into a single play meant many more complicated speeches were cut, and robust physical theatricality carried many other parts.

Jade Anouka’s Hospur made every scene she was in sparkle (even her dead body had a commanding presence!). Clare Dunne’s performance as Prince Hal seemed to get stronger throughout the evening. Shakespeare’s comparisons to his earlier play Richard II survived the editing and gave some needed depth to the death-bed scene between father and son in the final act. Sophie Stanton’s Falstaff left me a bit cold. The tavern scene where she and Prince Hal imitate Henry IV (and his likely reaction to Hal’s disportations) was filled with the camaraderie and bouyant wit I expected from Falstaff, but at other times, particularly in the scenes with the lesser actors, she seemed less full of her role.

There was an attempt, not entirely successful I think, to give Prince Hal’s final rejection of Falstaff extra weight by use of the prison setting. This framing also helps to excuse the many clunky props, such as a toy dog, or the assorted lollipops the actors sucked on intermittently, or the fake swords that were unrecognisable as weapons of any sort. Perhaps the point was to emphasise the juvenile nature of much of the early goings on? But to pick a tough comparison, in Lilies, the prison setting works brilliantly to emphasise the key themes: here it sometimes seemed unhelpful. Maybe it works better in the companion plays, Julius Ceasar and The Tempest? That said, there was effective use of the theatre space (the audience sits on all sides—the atmosphere was frequently intense), and some nice symbolism, for example, when Hotspur is slain on top of the prison yard’s basketball court’s centre circle, a map of England and Wales spray-painted around him.

The all female cast helped to bring out the gender themes, for example the generally contemptuous treatment of Mistress Quickly, and the marriage between Lady Percy (played by Sheila Atim, showing great promise) and Hotspur. The programme has an interesting article by Charlotte Higgins on the 2:1 problem: of the 10 theatres that received the most money from the Arts Council, women account for about 1/3 of the board members, artistic directors and actors. Higgins makes the interesting point that the colossal gender balance in Shakespeare (of 981 roles, 826 are men, 155 are women) shapes audience expectations of casts, and has knock-on effects for careers throughout British theatre.

Overall: an enjoyable evening, but certainly not one for the purists.

## Sylow subgroups of symmetric groups

November 27, 2016

The purpose of this post is to collect some proof of known results on Sylow subgroups of symmetric groups that are scattered across the literature, and in one case, wrongly stated in the standard reference. Throughout permutations act on the right. Let $S_{p^r}$ denote the symmetric group on $\{0,1,\ldots,p^r-1\}$.

### Rooted trees

Let $T_r$ be the rooted $p$-ary tree with levels numbered from $0$ at the root to $r$ at the leaves. Let $T_r(\ell)$ denote the set of $p^\ell$ vertices at level $\ell$. We label the vertices in $T_r(\ell)$ by $\{0,1,\ldots, p^\ell-1\}$, so that $\beta \in T_r(\ell+1)$ is a child of $\alpha \in T_r(\ell)$ if and only if $\beta \equiv \alpha$ mod $p^\ell$. Thus the children of $\alpha \in T_r(\ell)$ are

$\alpha, \alpha + p^\ell, \ldots, \alpha + (p-1)p^\ell \in T_r(\ell+1)$

and $\gamma$ is a descendant of $\alpha$ if and only if $\gamma \equiv \alpha$ mod $p^\ell$. For example, the children of the root vertex $\varnothing$ are labelled $0,1,\ldots, p-1$ and correspond to congruence classes modulo $p$, the children of $1 \in T_r(1)$ are labelled (in base $p$) by $01, 11, \ldots, (p-1)1$, and correspond to the congruence classes modulo $p^2$ of numbers congruent to $1$ modulo $p$, and so on. All this may be seen from the tree $T_3$ for $p=3$ shown below.

Let $\mathrm{Aut}(T_r)$ be the group of automorphisms of $T_r$. Any automorphism is determined by its action on the $p^r$ leaves of $T_r$, so, when it is useful to do so, we regard $\mathrm{Aut}(T_r)$ as a subgroup of the symmetric group $S_{p^r}$.

Let $0 \le \ell < r$. For each vertex $\alpha \in T_r(\ell)$, let $g(\alpha) \in \mathrm{Aut}(T_r)$ fix $\alpha$ and all vertices of $T_r$ that are not a descendant of $\alpha$, and permute the descendants of $\alpha$ (of any generation) by adding $p^{\ell}$ modulo $p^{\ell+1}$. More formally, the vertex $\alpha + ip^\ell + jp^{\ell+1}$ where $0\le i < p$ and $j \in \mathbb{N}_0$ is sent to $\alpha + (i+c \ \mathrm{mod}\ p)p^\ell +jp^{\ell+1}$. For example, if $r=3$ then $g(01) = (1,10,19)$ and

$g(1) = (1,4,7)(10,13,16)(19,22,25).$

(We rely on the length of the $p$-ary form of the vertex to indicate the intended level.) Let $B(\ell)$ be the subgroup of $\mathrm{Aut}(T_r)$ generated by all the $g(\alpha)$ for $\alpha \in T_r(\ell)$. Each subgroup $B(\ell)$ is normalized by the $B(j)$ for $j < \ell$, so $P_r = B(r-1) B(r-2) \ldots B(0)$ is a $p$-subgroup of $S_{p^r}$. We have $|P_r| = p^e$ where

$e = 1 + p + \cdots + p^{r-1}.$

It is well known that the highest power of $p$ dividing $n!$ is $\sum_{i \ge 1} \lfloor n/p^i \rfloor$, so $P_r$ is a Sylow $p$-subgroup of $S_{p^r}$.

We now make the connection with the other standard construction of $P_r$. Let $\phi : P_r \rightarrow P_{r-1}$ be the group homomorphism defined by restricting $g \in \mathrm{Aut}(T_r)$ to $T_r(r-1)$. Thus $\phi(g)$ describes the action of $g$ on the $p^{r-1}$ vertices at level $r-1$, or, equivalently, on the $p^{r-1}$ congruence classes of numbers in $\{0,1,\ldots,p^r-1\}$ modulo $p^{r-1}$.

Lemma 1.$P_r$ is permutation isomorphic to the iterated wreath product $C_p \wr (C_p \wr \cdots \wr C_p)$ with $r$ factors, in its intransitive action.

Proof. By induction $\phi(P_r) \cong B(r-2)\ldots B(0)$ is permutation isomorphic to the imprimitive wreath product with $r-1$ factors. Therefore $B(r-1) \cong C_p^{p^{r-1}}$ is the base group in a semidirect product $B(r-1) \rtimes B(r-2) \ldots B(0)$ permutation isomorphic to $C_p \wr (C_p \wr \cdots \wr C_p)$ in its imprimitive action. $\Box$

### Centre and normalizer

The centre of $P_r$ and its normalizer in $S_{p^r}$ can be described by slightly more involved inductive arguments. The following elements of $P_r$ will be useful: for $0 \le \ell < r$, let $z_r(\ell) = \prod_{\alpha \in T_r(\ell)} g(\alpha)$. For example, if $p=3$ and $r=3$ then

\begin{aligned} z_3(0) &= (0,1,2)(3,4,5)(6,7,8)(9,10,11) \ldots (24,25,26), \\ z_3(1) &= (0,3,6)(1,4,7)(2,5,8)(9,12,15)\ldots (20,23,26), \\ z_3(2) &= (0,9,18)(1,10,19)(2,11,20)(3,12,21) \ldots (8,17,26). \end{aligned}

Observe that $\phi(z_r(\ell)) = z_{r-1}(\ell)$ for $0 \le \ell < r-1$, and that $\phi(z_r(r-1)) = \mathrm{id}$.

Lemma 2. $Z(P_r) = \langle z_r(r-1) \rangle$.

Proof. If $r=1$ then $z_1(0) = (0,1,\ldots,p)$ generates $P_1$. Let $r \ge 2$ and let $z \in Z(P_r)$. By induction $\phi(z) \in \langle \phi(z_r(r-2))\rangle$. Hence $z \in \phi^{-1} \langle \phi(z_r(r-2)) \rangle = B_{r-1} \langle z_{r}(r-2) \rangle$. Let $z = b z_r(r-2)^i$ where $0 \le i \le p-1$ and $b \in B_{r-1}$. Since $B(r-1)$ is abelian, we see that $z_r(r-2)^i$ commutes with all permutations in $B(r-1)$. If $i\not=0$ then there exists $\alpha \in T_r(r-1)$ such that $\alpha z \not= \alpha$ and so

$\alpha g(\alpha) z_r(r-2)^i = (\alpha +p^n) z_r(r-2)^i = \alpha + p^{r-1} + ip^{r-2}$

whereas

$\alpha z_r(r-2)^i g(\alpha) = (\alpha +ip^{r-2}) g_x = \alpha + ip^{r-2}$,

a contradiction. Therefore $i=0$ and $z = \prod_{\beta \in T_r(r-1)} g(\beta)^{i_\beta}$ where $0 \le i_\beta < p$ for each $\beta$. A similar argument considering the actions of the generators $g(\alpha)$ for $\alpha \in T_r(r-2)$ now shows that $i_\beta$ is independent of $\beta$, and so $g \in \langle z_r(r-1)\rangle$. $\Box$

The normalizer $N_{S_{p^r}}(P_r)$ turns out to be a split extension of $P_r$. Let $c$ be a primitive root modulo $p$. Let $\alpha \in T_r(\ell)$. Let $h(\alpha)$ fix $\alpha$ and every vertex that is not a descendant of $\alpha$ and permute the descendants of $\alpha$ (of any generation) by sending $\alpha + ip^\ell + jp^{\ell+1}$ where $0 \le i < \ell$ and $0 \le j < p^{r-\ell-1}$ to $\alpha + (ci \ \mathrm{mod}\ p)p^\ell + jp^{\ell+1}$. Thus $h(\alpha)$ fixes the child $0\alpha$ of $\alpha$, and permutes the remaining $p-1$ children $1\alpha \ldots (p-1)\alpha$ by a $(p-1)$-cycle, acting compatibly on their descendants in turn. Let $k_r(\ell) = \prod_{\alpha \in T_r(\ell)}h(\alpha)$. For example, if $p=3$ and $r=3$ then $h(01) = (12,21)$, $h(1) = (4,7)(13,16)(22,25)$ and

\begin{aligned}\scriptstyle k_3(0) = (1, 2)(4, 5)(7, 8)(10, 11)(13, 14)(16, 17)(19, 20)(22, 23)(25, 26), \\ \scriptstyle k_3(1) = (3, 6)(4, 7)(5, 8)(12, 15)(13, 16)(14, 17)(21, 24)(22, 25)(23, 26), \\ \scriptstyle k_3(2) = (9, 18)(10, 19)(11, 20)(12, 21)(13, 22)(14, 23)(15, 24)(16, 25)(17, 26). \end{aligned}

The permutation $k_3(0)$ is shown below.

Proposition 3. $N_{S_{p^r}}(P_r) = P_r \rtimes \langle k_r(\ell) : 0 \le \ell < r-1 \rangle$.

Proof. Let $g \in N_{S_{p^r}}(P_r)$. By a similar argument to Lemma 1, we have $g \in B(r-2)\ldots B(0) \langle k_3(0), \ldots, k_3(r-2) \rangle h$ where $h \in \ker \phi$. By composing $h$ with a suitable power of $z_r(r-1)$, we may assume that $0h = 0$. Let $h^\star$ denote the permutation induced by $h$ on $\{p^{\ell-1},\ldots,(p-1)p^{\ell-1}\}$. Since $Z(P_r) = \langle z_r(r-1) \rangle$ is a characteristic subgroup of $P_r$, it is normalized by $g$. Therefore $h^\star$ normalizes the $p$-cycle $(0,p^{\ell-1},\ldots,(p-1)p^{\ell-1})$ forming part of the disjoint cycle form of $z_r(r-1)$. Therefore $h^\star$ acts on $\{p^{\ell-1},\ldots,(p-1)p^{\ell-1}\}$ as some power of $k_r(r-1)$. Since $h$ normalizes $Z(P_r)$, it follows that $h$ acts compatibly on all the other $p$-cycles in the disjoint cycle form of $z_r(r-1)$, and so $h \in \langle k_r(r-1) \rangle$, as required. $\Box$

In particular, a Sylow $2$-subgroup of $S_{2^r}$ is self-normalizing. The following corollary is also immediate from Proposition 3.

Corollary 4. The number of Sylow $p$-subgroups of $S_{p^r}$ is $(p^r)!/p^{e}(p-1)^r$ where $e = 1 + p + \cdots + p^{r-1}$. $\Box$

### Derived group

Using the transitive action of $P_r$ on the vertices at each level of $T_r$, it is not hard to prove the following lemma.

Lemma 5. The derived group $P_r'$ of $P_r$ contains the permutations $g_\alpha g_\beta^{-1}$ for all $\alpha, \beta$ vertices at the same level of $T_r$. $\Box$

Corollary 6. The permutations $g(\alpha) g(\beta)^{-1}$ for $\alpha, \beta$ vertices at the same level of $T_r$ generate $P_r'$ and $P_r/P_r' \cong C_p^r$.

Proof. Let $\gamma_\ell \in T_r(\ell)$ be the vertex labelled $0$ at level $l$ of $T_r$. Let $H$ be the subgroup generated by the permutations $g(\alpha) g(\beta)^{-1}$. By Lemma 5, $\mathrm{Aut}(T_r)$ is generated by the $g(\gamma_\ell)$ and $H$. Moreover, the $g(\gamma_\ell)$ commute modulo $H$. Therefore $P_r/H$ is abelian, and so $H = P_r'$, with the permutations $g(\gamma_\ell)$ generating the quotient group. $\Box$

### Weisner’s counting argument

In On the Sylow subgroups of the symmetric and alternating groups,
Amer. J. Math. 47 (1925), 121–124, Louis Weisner gives a counting argument that, specialized to $P_r$, claims to show that $P_r$ has exactly $(p^r)!/p^e (p-1)^e$ Sylow $p$-subgroups, where, as above $e = 1 + p + \cdots + p^{r-1}$. The exponent of $p$ is correct, but by Corollary 4, the exponent of $p-1$ is too large. Weisner’s error appears to be on page 123, where he implicitly assumes that Sylow subgroups of $S_{p^r}$ with the same base group are equal. This is false whenever $p \not= 2$ and $r \ge 2$: for example

\begin{aligned} \langle (036), (147), (258) \rangle \rtimes \langle (012)(345)(678) \rangle, \\ \langle (063), (147), (258) \rangle \rtimes \langle (012)(378)(645) \rangle \end{aligned}

are distinct.

An alternative reference for the correct result is La structure des p-groupes de Sylow des groupes symétriques finis, Léo Kaloujnine, Ann. Sci. École Norm. Sup. (3) 65 (1948), 239–276: see page 262. In the footnote, Kaloujnine notes that, in general, the action of $N_{S_{p^r}}(P_r)$ on $P_r$ does not induce the full automorphism group. Indeed, when $p=2$ and $r=2$, the base group $\langle (02), (13)\rangle$ of the wreath product

$C_2 \wr C_2 = \langle (02), (13), (01)(23) \rangle$

is not even a characteristic subgroup of $C_2 \wr C_2$. For more on this see On the structure of standard wreath products of groups, Peter M. Neumann, Math. Z. 84 1964, 343–373.

The mistake in Weisner’s paper is pointed out on page 73 of an M.Sc. thesis by Sandra Covello.

### Final remarks

#### Lower central series

The lower central series of $P_r$ was found by Weir in The Sylow subgroups of the symmetric groups, Proc. Amer. Math. Soc. 6 (1955) 534–541. Another description is given in the paper of Kaloujnine. There has been much further related work: see for example Lie algebra associated with the group of finitary automorphisms of p-adic tree, N. V. Bondarenko, C. K. Gupta V. I. Sushchansky, J. Algebra 324 (2010), 2198–2218 and the references to earlier papers therein. Using Lemma 6, one gets the description in Proposition 8 below. (This is almost certainly equivalent to a result in Kaloujnine’s paper stated using his formalism of tableaux polynomials, but the close connection with cyclic codes seems worth noting.)

Definition 7. Let $P_r^m$ be the $m$th term in the lower central series of $P_r$, so $P_r^1 = P_r'$ and $P_r^m = [P_r, P_r^{m-1}]$ for $m \ge 2$. Let $B(\ell,m) = P_r^m \cap B(\ell)$ for $0 \le s < r$.

Let $V = \mathbb{F}_p^{p^\ell}$. Index coordinates of vectors in $V$ by the set $\{0,1,\ldots,p^{\ell}-1\}$ labelling the vertices in $T_r(\ell)$. Given $v \in V$, we may define a corresponding element of $B(\ell)$ by $\prod_{\alpha=0}^{p^\ell-1} g(\alpha)^{v_\alpha}$. This sets up a bijective correspondence between subgroups of $B(\ell)$ and subspaces of $\mathbb{F}_p^{p^\ell}$. In turn, a subspace of $\mathbb{F}_2^{p^\ell}$ invariant under the shift map sending $(v_0,\ldots,v_{p^\ell-2},v_{p^\ell-1})$ to $(v_{p^\ell-1},v_0, \ldots, v_{p^\ell-2})$ corresponds to an ideal in $\mathbb{F}_p[x]/(x^{p^\ell}-1)$. (This is a basic observation in the theory of cyclic error-correcting codes.)

Proposition 8. We have

$P_r^m = B(r-1,m)B(r-2,m) \ldots B(0,m)$

where $B(\ell, m)$ corresponds to the ideal in $\mathbb{F}_p[x]/(x^{p^\ell}-1)$ generated by $(x-1)^m$.

Proof. Under the correspondence above, multiplication by $x$ corresponds to conjugation by $(0,1,\ldots,p^{\ell}-1) \in P_\ell$. Each $B(\ell,m)$ is invariant under conjugation by $P_r$, so we see that the subspaces of $V$ corresponding to the $B(\ell,m)$ are all $x$-invariant. Hence they are ideals in $\mathbb{F}_p[x]/(x^{p^\ell}-1)$. It is easily seen by induction that $(x-1)^m$ is in the ideal corresponding to $B(\ell,m)$. Since the lower central series is strictly decreasing, and $(x-1)^m$ generates an ideal of codimension $m$, the proposition follows. $\Box$

Corollary 9. Let $M$ be the natural permutation module for $P_r$, defined over a field of characteristic $p$. Then the socle series for $M$ and the radical series for $M$ coincide; the $m$th term (from the top) in either is the image of $(t-1)^m$ where $t = (0,1,\ldots, p^r)$. $\Box$

Thus the radical and socle series of $M$ considered just as an $\mathbb{F}_p\langle t \rangle$-module coincide with the radical and socle series of $M$ as an $\mathbb{F}_pP_r$-module. This is not really unexpected: the trivial module is the unique simple $\mathbb{F}\langle t \rangle$-module, and $M$ is uniserial even when considered as an $\mathbb{F}_p\langle t \rangle$-module. Still, it seems non-obvious without this injection of theory that $\mathrm{im}(t-1)^m = \mathrm{ker}(t-1)^{p^r-m}$ is $P_r$-invariant.

It is a nice exercise to find the decomposition of the analogously defined permutation module over $\mathbb{C}$: it has exactly $p-1$ summands of dimension $p^\ell$ for each $\ell \in \{1,\ldots, r-1\}$ and $p$ $1$-dimensional summands, affording the characters of $P_r / B(r-1)\ldots B(1) \cong C_p$.
As a final application of the correspondence after Definition 7, we prove the following proposition.

Proposition 10. There are $(p-1)^r$ conjugacy classes of $p^r$-cycles in $P_r$. All $p^r$-cycles in $N_{S_{p^r}}(P_r)$ are conjugate.

Proof. Let $g = g_0g_1 \ldots g_{r-1} \in B(0)B(1) \ldots B(r-1)$. Let $c_i$ be the sum of the components of the vector corresponding to $g_i$ under the correspondence between $B(\ell)$ and subspaces of $\mathbb{F}_p^\ell$ set up after Definition 7. Standard results on conjugacy classes in wreath products (see for example Chapter 4 of The representation theory of the symmetric group, Encyclopedia of Mathematics and its Applications, vol. 16, Addison-Wesley 1981) show that $g$ is a $p^r$-cycle if and only if $c_\ell \not= 0$ for each $\ell$. Moreover, the conjugacy class of $g$ depends only on $(c_0,c_1,\ldots,c_{r-1}) \in (\mathbb{F}_p^\times)^{p}$. By Proposition 3, the normalizer has a single orbit on this set. $\Box$

#### Automorphism groups of cyclic codes

The previous section implies that the automorphism group of a $p$-ary cyclic code of length $p^r$ contains a Sylow $p$-subgroup of $S_{p^r}$. In fact we get more: provided we label positions in the sensible way, so that the cyclic shift is $(0,1,\ldots,p^r-1)$, the automorphism group contains our distinguished Sylow subgroup $S_{p^r}$. The results on $N_{S_{p^r}}(P_r)$ above can be used to give a more powerful version of a result of Brand on equivalences of codes.

Proposition 11. Cyclic codes $C$ and $C'$ of length $p^r$ are equivalent if and only if they are equivalent by an element of $\langle k_r(\ell) : 0 \le \ell < r-1 \rangle$.

Proof. Let $Cg= C'$. We have seen that $P_r \le \mathrm{Aut}(C)$ and $P_r \le \mathrm{Aut}(C')$. Hence $C'g^{-1}P_rg = C'$ so $P_r^g \le \mathrm{Aut}(C')$. By Sylow’s Theorem, there exists $h \in \mathrm{Aut}(C')$ such that $P_r^{gh} = P_r$. Let $gh = k \in N_{S_{p^r}}(P_r)$. Clearly we have $C' = C'h = Cgh = Ck$. Now apply Proposition 3 to $k$. $\Box$

For comparison, Brand’s result (see Lemma 3.1 in Polynomial isomorphisms of combinatorial objects, Graphs and Combinatorics 7 7–14) says that, for codes of arbitrary length, if $P$ is a Sylow $p$-subgroup of $C$, then the equivalence may be assumed to conjugate $(0,1,\ldots,p^r-1)$ to another permutation in $P$.

#### Application to Sylow’s Theorem

Let $G$ be a finite group of order $n$. If $G$ acts transitively on a set of size coprime to $p$ then any point stabiliser contains a Sylow $p$-subgroup of $G$. The existence of Sylow subgroups of $S_n$ gives a very convenient setting to exploit this observation. By letting $G$ act regularly on itself, we may regard $G$ as a subgroup of $S_n$. Fix a Sylow $p$-subgroup $P$ of $S_n$. The coset space $S_n/P$ has size coprime to $p$. Let $Pg$ be in an orbit of $G$ of size coprime to $p$ and let $H \le G$ be the stabiliser of $Pg$. Now $PgH = Pg$ implies that $gHg^{-1}$ is a subgroup of $P$, so $gHg^{-1}$ is a $p$-subgroup of $G$ of order divisible by the highest power of $p$ dividing $|G|$.

#### Intransitivity of derived group

By Corollary 6, any permutation in $P_r'$ permutes amongst themselves the odd and even numbers in $\{0,1,\ldots,p^r-1\}$. Hence $P_r'$ is not transitive. More generally, let $Q \le S_n$ be a transitive $p$-group. By embedding $Q$ in a Sylow $p$-subgroup $P$ of $S_n$, we see that $Q' \le P'$ is not transitive. Of course this can be proved in many other ways: for example, if $Q'$ is transitive then $Q = RQ'$ where $R$ is a point stabiliser. But $Q' \le \Phi(Q)$, where $\Phi(G)$ is the Frattini subgroup of non-generators’ of $Q$, so we have $Q = R$, a contradiction.

#### Automorphisms of permutation groups

Let $G \le S_n$ be a transitive permutation group. It is natural to ask when every automorphism of $G$ is induced by the action of $N_{S_n}(G)$. For example, this is the case whenever $G$ acts regularly.

#### Magma code

MAGMA code constructing all the permutations and groups defined in this post is available here.

## Algebraic numbers and character tables

November 12, 2016

Let $\zeta$ be a primitive $p$th root of unity in $\mathbb{C}$. Let $a$ be a primitive root modulo $p$, so the field automorphism $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta):\mathbb{Q})$ defined by $\zeta^\sigma = \zeta^a$ generates the Galois group. Fix $t$ dividing $p-1$, let $b = a^t$, and let $\alpha = \zeta + \zeta^b + \cdots + \alpha^{b^{s-1}}$ where $st = p-1$. Thus $\mathbb{Q}(\alpha)$ is the fixed field of $\tau = \sigma^{t}$, the unique element of the Galois group of order $s$. We have $[\mathbb{Q}(\zeta) : \mathbb{Q}(\alpha)] = t$ and $[\mathbb{Q}(\alpha) : \mathbb{Q}] = s$.

A few days ago, while reading Burnside’s proof (in Section 252 of his wonderful book, Theory of groups of finite order) that a permutation group of prime degree $p$ is either 2-transitive or a Frobenius group contained in an affine general linear group $\mathrm{AGL}_2(\mathbb{F}_p)$, I needed the identity

$\alpha \alpha^{-\sigma^j} + \alpha^{\sigma} \alpha^{-\sigma^{j+1}} + \cdots + \alpha^{\sigma^{t-1}}\alpha^{-\sigma^{j+(t-1)}} = -s$

where $0 < j < t$. One rather sneaky proof is to look at the intended conclusion of Burnside's argument, namely that $G$ is Frobenius of order $ps$, and work back. I've put in details of the construction of the character table below, but taking this as read, the proof is essentially one line.

#### Character tables of affine Frobenius groups

Let $G = \langle g, h | g^p = h^s = 1, g^h = g^b \rangle$. Setting $N = \langle g \rangle$, we have $G = N \rtimes \langle h \rangle$. In the action of $G$ on the cosets of $\langle h\rangle$, the non-identity elements of $N$ act as $p$-cycles, and all elements of $G \backslash N$ have a unique fixed point. Therefore $G$ is a Frobenius group. The conjugacy class of $g$ is $\{g, g^h, \ldots, g^{h^{s-1}}\} = \{g,g^b, \ldots, g^{b^{s-1}} \}$, and the non-identity conjugacy classes have representatives

$g, g^a, \ldots, g^{a^{t-1}}, h, h^2, \ldots, h^{s-1}$

and are of sizes $s, s, \ldots, s, p, p, \ldots, p$ respectively.

There are $s$ linear characters of $G$ inflated from $G / \langle g \rangle \cong C_{s-1}$. Let $\phi : N \rightarrow \mathbb{C}$ be defined by $\phi(g) = \zeta$. Let $\theta = \phi\uparrow^G$. Then

\begin{aligned} \theta(g) &= \phi(g) + \phi(g^h) + \cdots + \phi(g^{h^{s-1}}) \\&= \phi(g) + \phi(g^b) + \cdots + \phi(g^{b^{s-1}}) \\ &= \zeta + \zeta^b + \cdots + \zeta^{b^{s-1}} \\ &= \alpha, \end{aligned}

and more generally, $\theta(g^{a^j}) = \alpha^{\sigma^j}$ for all $j$. It follows that $\theta, \theta^\sigma, \ldots, \theta^{\sigma^{s-1}}$ are $s$ distinct characters of $G$. By Frobenius reciprocity

\begin{aligned} \langle \theta, \theta \rangle &= \langle \theta\downarrow_N, \phi \rangle \\ &= \langle \phi, \phi + \phi^{\tau} + \cdots + \phi^{\tau^{s-1}} \rangle \\ &= 1. \end{aligned}

The sum of the squares of the degrees of the characters constructed so far is $1^2 \times s + s^2 \times t = s + s^2 t = s (1+st) = sp$, so we have found all irreducible characters.

#### Conclusion

Let $0 < j < t$. From

$|G| \langle \theta, \theta^{\sigma^j} \rangle = s^2 + s(\alpha \alpha^{-\sigma^j} + \alpha^{\sigma} \alpha^{-\sigma^{j+1}} + \cdots + \alpha^{\sigma^{t-1}} \alpha^{-\sigma^{j+(t-1)}})$

we get

$\alpha \alpha^{-\sigma^j} + \alpha^{\sigma} \alpha^{-\sigma^{j+1}} + \cdots + \alpha^{\sigma^{t-1}}\alpha^{-\sigma^{j+(t-1)}} = \frac{0-s^2}{s} = -s$

for $j\not= 0$. If $j=0$, the left-hand side is $|G|$ and we get instead $p-s$.

#### Two related well known tricks

In Wielandt’s well-known proof of Sylow’s Theorem, one needs to know that if $m$ is coprime to $p$ then $\binom{p^a m}{p^a} \not\equiv 0$ mod $p$. This follows immediately from Lucas’ Theorem, or one can follow Wielandt and use a short direct argument, but it is in the spirit of his proof to fix a set $\Omega$ of size $p^a m$ and let $g \in \mathrm{Sym}(\Omega)$ have $m$ disjoint $p^a$ cycles. Then $\langle g \rangle$ clearly has $m$ fixed points, in its action on the set of subsets of $\Omega$ of size $p^a$ and all other orbits have size divisible by $p$. So $\binom{p^a m}{p^a} \equiv m$ modulo $p$.

Another nice argument, applying linear algebra to algebraic number theory, shows that if $\alpha$ and $\beta$ are algebraic numbers then $\alpha \beta$ is also algebraic. Indeed, $\alpha \beta$ is an eigenvalue of $X_\alpha \otimes X_\beta$ where $X_\alpha$ and $X_\beta$ are irreducible integer matrices having $\alpha$ and $\beta$ as eigenvalues. (For example, take the companion matrices of the minimal polynomials.) This trick also shows that $\alpha \beta^{-1}$ is an eigenvalue of an integer matrix, $Y$ say, and so $1 + \alpha\beta^{-1}$ is an eigenvalue of $I + Y$. Therefore $1 + \alpha\beta^{-1}$ is algebraic, and multiplying through by $\beta$ (using the result just proved), we get that $\alpha + \beta$ is algebraic.

## Putnam 2016

November 4, 2016

Today I picked up the October issue of the American Mathematical Monthly. As traditional, the first article has the questions in this year’s William Lowell Putnam Competition. As displacement activity, I did Paper A, allowing myself two of the usual three hours, but, as is my normal habit, full use of computer algebra. For the entertainment of others (and possibly as a warning to myself), here is a record of the thought processes of an averagely procrastinating mathematician.

A1. Let $A$ and $B$ be points on the same branch of the hyperbola $xy=1$. Suppose that $P$ is a point lying between $A$ and $B$ on this hyperbola, such that the area of the triangle $APB$ is as large as possible. Show that the region bounded by the hyperbola and the chord $AP$ has the same area as the region bounded by the hyperbola and chord $PB$.

Thoughts / solution. This looks like a routine optimization problem. Put $A$ at $(a,1/a)$, $B$ at $(b,1/b)$ and $P$ at $(p,1/p)$, with $a > 0$. The first region is formed from the quadrilateral with vertices $(a,0),(p,0),(p,1/p),(a,1/p)$, missing out the area below the hyperbola from $a$ to $p$. So its area is

$\displaystyle \frac{(p-a)(1/a+1/p)}{2} - \log \frac{p}{a}.$

Similarly the second region has area

$\displaystyle \frac{(b-p)(1/p+1/b)}{2} - \log \frac{b}{p}.$

I can see that $p$ is eliminated when we add the logs. This looks hopeful. I add the areas and promptly make an algebraic slip that says the sum is independent of $p$. This can’t be right. Turning to Mathematica I get the correct sum, and by differentiating, see that it is minimized when $p = \sqrt{ab}$. This maximizes the area of the triangle, and sure enough, the expressions above for the areas under the chords turn out to be equal when $p = \sqrt{ab}$.

(In retrospect, this had to be the right $p$, because by the moral linear independence of arbitrary transcendentals, there’s no way the expressions will be equal without the logs agreeing.)

A2. Let $a_0 = 1, a_1 = 2$, and $a_n = 4a_{n-1} - a_{n-2}$ for $n \ge 2$. Find an odd prime factor of $a_{2015}$.

Thoughts / solution. I immediately load the problem into Mathematica and compute the factorizations of $a_1, \ldots, a_{100}$. It looks like $181$ is always a factor of $a_{5m}$ when $m$ is odd. Since $2015 = 5 \times 13 \times 31$, this would do it. After some dithering with $2 \times 2$ matrices versus direct manipulation of the recurrence, I use Mathematica with

b[0] := b0; b[1] := b1; b[n_] := 4b[n-1] - b[n-2]

to find $b_{10} = -40545 b_0 + 151316 b_1$. It follows that

$b_{n+10} = -40545b_{n} + 151316b_{n+1}$

for all $n$. Since $151316 = 181 \times 836$, the claim follows by induction. Thinking that this wouldn’t have been much fun to do without computer algebra, I press on.

A3. Compute

$\displaystyle \log_2 \Bigl( \prod_{a=1}^{2015} \prod_{b=1}^{2015} (1 + \mathrm{e}^{2\pi i ab/2015}) \Bigr)$.

Here $i$ is the imaginary unit (that is, $i^2 = -1$).

Thoughts / solution. My immediate thought is ‘who is going to know the definition of the complex exponential function, but get confused by $i$‘? Next I put the product into Mathematica, replacing $2015$ with a general $N$. After some confusing errors, I realize that $N$ (even used in pattern matching as N_) is a bad choice, because it clashes with the built in numerical evaluation function. This hitch resolved, the small values suggest that the product is $0$ when $N$ (or rather, a more Mathematica-friendly letter) is even, and a power of $2$ when $N$ is odd. A trivial change in the code lets me compute

$\displaystyle Q(a) = \prod_{b=1}^{N} (1 + \mathrm{e}^{2\pi i ab/N}).$

Of course its still a power of $2$, but the exponent is still not obvious. Fairly quickly I realise I’ve encountered the product

$(1+\zeta)(1+\zeta^2) \ldots (1+\zeta^{M}),$

where $\zeta$ is a primitive $M$th root of unity, before: the standard (and I feel slightly tired) trick, is to note that, up to a sign, it’s obtained by evaluating $X^M-1=(X-1)(X-\zeta)\ldots (X-\zeta^{M-1})$ at $X=-1$. So $(1+\zeta)(1+\zeta^2) \ldots (1+\zeta^{M}) = 2$ and, after some stupid slips and general lack of progress, I realise that

$Q(a) = 2^{\gcd(N,a)}$.

This reduces the problem to evaluating $\sum_{a=1}^N \gcd(N,a)$. Turning to a small example, I see how this works for $N = 15$: we have $\phi(15) = 8$ coprime numbers with gcd $1$, then $3,6,9,12$ each contribute $3$, and $5,10$ each contribute $5$. I go off on a tangent trying to get a general formula with the Principle of Inclusion and Exclusion. Nothing seems very nice, and, noting that the sum is over all $a$, not those $a$ satisfying some ‘sievable’ property, it starts to feel like the wrong approach. Going back to the example, I realise that there are exactly $\phi(N/d)$ numbers with gcd $d$. So we need

$P(N) = \prod_{c \mid n} 2^{\phi(c) N/c}.$

It looks like $P$ might be multiplicative. Yes it is. I evaluate $P(p^a)$, then realise I’m wasting time, since I should remember from A2 that $2015 = 5 \times 13 \times 31$ is square-free. Clearly

$P(p) = 2^p \times 2^{p-1} = 2^{2p-1}.$.

So $P(2015) = 2^{9 \times 25 \times 61}.$

A4. For each real number $x$, let

$\displaystyle f(x) = \sum_{n \in S_x} \frac{1}{2^n}$

where $S_x$ is the set of positive integers $n$ for which $\lfloor nx \rfloor$ is even. What is the largest real number $L$ such that $f(x) \ge L$ for all $x \in [0,1)$? (As usual $\lfloor z \rfloor$ denotes the greatest integer less than or equal to $z$.)

Thoughts / solution. I’m getting a bit tired now, but this looks like an interesting problem. Once again, I load it into Mathematica. This takes a bit more effort, but after a few minutes I’ve got the code and graph shown below.

It looks like the minimum is attained at $2/3$. I evaluate N[f[1000, 2/3 - 0.0001], 20] and get $0.57142857142857142857$. I should recognise this immediately as $4/7$, but instead put it into Google.

I’m now quite keen to solve this problem. I look at some small cases by hand, seeing that if $x \in [0,1/2)$ we get an immediate contribution of $2^{-2}$, so $x \ge 1/2$ is indicated. Then $x \in [1/3,2/3)$ makes sure that we don’t get $2^{-3}$, so it looks like $x \le 2/3$. But then we have $x \in [1/2,3/4)$, so $4 \in S_x$ and we do get $2^{-4}$. Maybe this is unavoidable, … I have my first ‘aha’ moment of the competition, when I realise that this greedy approach is going to solve the problem: since $1/2^n$ is equal to $1/2^{n+1} + 1/2^{n+2} + \cdots$, nothing is more important at the $n$th step than avoiding $n \in S_x$, if we can. I celebrate by taking 10 minutes off to play some blitz chess.

Back at the problem, I see that I have to rule out an interval of the form $[2r/n, (2r+1)/n)$ containing $2/3$, such that the lower bound from the greedy strategy after $n-1$ steps is less than $2r/n$. Then we would dodge’ left of this interval, and lose control of $x$. I try to make this work for at least 10 minutes, not getting anywhere much. Maybe only prime values of $n$ are relevant? I get Mathematica to show me the relevant intervals with the code below.

This shows that $x \ge 5/8$ is a lower bound from the greedy strategy not coming from prime $n$. But more importantly, I finally see the pattern: starting at $n=3m-1$, we have

$\frac{2}{3} \in [(2m-1)/(3m-1), 2m/(3m-1));$

since $2m-1$ is odd, the greedy strategy tells us that $x \ge (2m-1)/(3m-1)$. For the next $m$ we have

$\frac{2}{3} \in [(2m-1)/3m, (2m+1)/3m,]$

so $x \ge (2m-1)/3m$, but this is already implied. Next

$\frac{2}{3} \in [2m/(3m+1), (2m+1)/(3m+2)),$

and since $2m/(3m+1) \le (2m-1)/(3m-1)$, there is no way to avoid $3m+1 \in S_x$. So taking $x$ arbitrarily close to $2/3$ we can get $f(x)$ arbitrarily close to

$\frac{1}{2^1} + \frac{1}{2^4} + \frac{1}{2^7} + \cdots = \frac{1}{2} \times \frac{1}{1-1/8} = \frac{4}{7}.$

A5. Let $q$ be an odd positive integer, and let $N_q$ denote the number of integers $a$ such that $0 < a < q/4$ and $\mathrm{gcd}(a,q) = 1$. Show that $N_q$ is odd if and only if $q$ is of the form $p^k$ with $k$ a positive integer and $p$ a prime congruent to $5$ or $7$ modulo $8$.

Thoughts. My immediate thought: this is too much, and I’m wasting time doing this anyway. Maybe it is something to do with arithmetic sums (which I know I don’t know well). I decide to skip it.

A6. Let $n$ be a positive integer. Suppose that $A, B$, and $M$ are $n \times n$ matrices with real entries such that $AM = MB$ and such that $A$ and $B$ have the same characteristic polynomial. Prove that $\mathrm{det}(A-MX) = \mathrm{det}(B-XM)$ for every $n \times n$ matrix $X$ with real entries.

Thoughts / sketched solution. Surely as a professional user of linear algebra, this can’t be hard for me? Reluctantly I commit to solving it. Immediately I see that if $M$ is invertible then we’re done, since then $M^{-1}AM = B$ and so

$\mathrm{det}(B-XM) = \mathrm{det}(M^{-1}A-X) \mathrm{det}(M)$

and similarly

$\mathrm{det}(A-MX) = \mathrm{det}(M)\mathrm{det}(M^{-1}A - X).$

Thinking about the other extreme case, when $M$ is $0$, the $AM = MB$ condition tells us nothing, but now all we need is $\mathrm{det}(A) = \mathrm{det}(B)$, which follows from the assumption on the characteristic polynomials.

By Fitting’s Lemma we can write $M$ as the direct sum of a nilpotent matrix and an invertible matrix. I mess around with block decompositions for a few minutes: it looks very fiddly. Throwing some more technology at the problem, I decide we can assume $M$ is diagonalizable (by density passing to the complex numbers), and in fact diagonal (by conjugation). Then the block decompositions look more tractable: putting

$M = \left( \begin{matrix} D & 0 \\ 0 & 0 \end{matrix} \right), A = \left( \begin{matrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{matrix} \right), B = \left( \begin{matrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{matrix} \right)$

we get

$AM = \left( \begin{matrix} A_{11} D & 0 \\ A_{21} D & 0 \end{matrix} \right) = \left( \begin{matrix} D B_{11} & DB_{12} \\ 0 & 0 \end{matrix} \right) = MB$

so $A_{21} = 0$, $B_{12} = 0$ and $A_{11}D = DB_{11}$ with $D$ invertible. So we need

$\left( \begin{matrix} A_{11} - DX_{11} & A_{12} - DX_{12} \\ 0 & A_{22} \end{matrix} \right), \left( \begin{matrix} B_{11} - X_{11}D & 0 \\ B_{21} - X_{21} D & B_{22} \end{matrix} \right)$

to have the same determinant. The determinants factorize: the contributions from the top left are equal by the invertible case. The characteristic polynomials also factorize, so, as in the zero case, $\mathrm{det} A_{22} = \mathrm{det} B_{22}$.

Final thoughts. All this took about two hours, with some breaks for chess / tea-making. I don’t feel inclined to go back to A5, so call it a day. I enjoyed doing A4, but otherwise it felt like a bit of a slog. Maybe I am too quick to reach for my trusty computer algebra system? Or maybe my experience also shows the increasing artificiality of time-limited, closed-book, no-internet, no-collaboration, no-technological-assistance tests such as the Putnam?

## Semisimple modular reductions

October 30, 2016

Let $G$ be a finite group and let $\rho : G \rightarrow \mathrm{GL}_d(\mathbb{Q})$ be a representation of $G$ by $d \times d$ matrices with rational entries. We regard these matrices as acting on row vectors in an $d$-dimensional vector space $V$ with standard basis $e_1, \ldots, e_n$. The $\mathbb{Z}$-submodule of $V$ spanned by $\bigl\{e_i \rho(g) : i \in \{1,\ldots,d\}, g \in G\bigr\}$ is free of rank $d$, and so has a $\mathbb{Z}$-basis, say $u_1, \ldots, u_d$. In this new basis, the matrices representing elements of $G$ have integer entries. We may therefore assume that $\rho$ has image in $\mathrm{GL}_d(\mathbb{Z})$.

It is now easy to obtain a representation of $G$ in prime characteristic $p$: just think of the entries of each $\rho(g)$ as lying in $\mathbb{F}_p$ rather than $\mathbb{Z}$. A representation obtained in this way is called a $p$-modular reduction of $V$. The indefinite article is correct: $p$-modular reductions defined with respect to different $\mathbb{Z}$-bases need not be isomorphic. However, as in the Jordan–Hölder Theorem, they always have the same multiset of composition factors. (For a proof, see Theorem 32 in Serre, Linear representations of finite groups.)

I think of representations of finite groups as rather ‘rigid’ objects, so find it slightly surprising how much freedom there often is to obtain different $p$-modular reductions. Here is an example from the symmetric group. Switching to the language of modules, let $n \in \mathbb{N}$, let $p$ be a prime dividing $n$, and let $M = \langle e_1, \ldots, e_n \rangle_\mathbb{Q}$ be the natural permutation module for $\mathbb{Q} S_n$. The subspace

$U = \langle e_i - e_j : 1 \le i < j \le n \rangle_\mathbb{Q}$

of $M$ is a $\mathbb{Q}S_n$-submodule. One obvious $\mathbb{Z}$-basis for $U$ is $e_1-e_n, \ldots, e_{n-1}-e_n$. Provided $n \ge 4$, in this basis, the matrices representing the generators $(1,2)$ and $(1,2,\ldots, n)$ of $S_n$ are

$\left( \begin{matrix} 0 & 1 & 0 & \ldots & 0 \\ 1 & 0 & 0 & \ldots & 0 \\ 0 & 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & 1 \end{matrix}\right) \quad, \left( \begin{matrix} -1 & 1 & 0 & \ldots & 0 & 0 \\ -1 & 0 & 1 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & 0 & 0 & \ldots & 0 & 1 \\ -1 & 0 & 0 & \ldots & 0 & 0 \end{matrix} \right)$

respectively. Consider the $\mathbb{Q}$-basis for $U$ in which $e_{n-1} - e_n$ is replaced with

$v = e_1 + \cdots + e_{n-1} - (n-1)e_n = (e_1-e_n) + \cdots + (e_{n-1} - e_n).$

Since the coefficient of $e_{n-1} - e_n$ in $v$ is $1$, the change of basis matrix is triangular with $1$s on its diagonal. Therefore $e_1-e_n, \ldots, e_{n-2}-e_n, v$ is a $\mathbb{Z}$-basis for $U$. In this basis the matrices representing the chosen generators are conjugated to

$\left( \begin{matrix} 0 & 1 & 0 & \ldots & 0 \\ 1 & 0 & 0 & \ldots & 0 \\ 0 & 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & 1 \end{matrix}\right) ,\quad \left( \begin{matrix} -1 & 1 & 0 & \ldots & 0 & 0 \\ -1 & 0 & 1 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ -2 & -1 & -1 & \ldots & -1 & 1 \\ -n & 0 & 0 & \ldots & 0 & 1 \end{matrix} \right)$

respectively.

Let $W$ be the $\mathbb{F}_pS_n$-module obtained by $p$-modular reduction. Since $p$ divides $n$, we see from the final row of the matrices immediately above that $\langle v \rangle_{\mathbb{F}_p}$ is a $1$-dimensional trivial $\mathbb{F}_pS_n$-submodule of $W$. The quotient $W / \langle v \rangle$ is a simple $\mathbb{F}_pS_n$-module, $D$ say. By the result on composition factors mentioned earlier, any $p$-modular reduction of $U$ has composition factors $\mathbb{F}_p$ and $D$.

Suppose we scale the final basis vector $v$ by $\gamma \in \mathbb{Q}$. If the coefficient of $e_{i} - e_n$ in $v \rho(g)$ is $\alpha$, then the coefficient of $e_i - e_n$ in $\gamma v \rho(g)$ is $\gamma \alpha$. Similarly, if the coefficient of $v$ in $(e_i - e_n) \rho(g)$ is $\beta$ then the coefficient of $\gamma v$ in $(e_i - e_n) \rho(g)$ is $\gamma^{-1} \beta$. The effect on matrices is to scale the final column by $\gamma$, and the final row by $\gamma^{-1}$, leaving the entry in the bottom right unchanged. The new matrices are therefore

$\left( \begin{matrix} 0 & 1 & 0 & \ldots & 0 \\ 1 & 0 & 0 & \ldots & 0 \\ 0 & 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & 1 \end{matrix}\right) ,\quad \left( \begin{matrix} -1 & 1 & 0 & \ldots & 0 & 0 \\ -1 & 0 & 1 & \ldots & 0 & 0 \\ -1 & 0 & 0 & \ldots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ -2 & -1 & -1 & \ldots & -1 & \gamma \\ -n/\gamma & 0 & 0 & \ldots & 0 & 1 \end{matrix} \right)$

respectively. Let $p^a$ be the highest power of $p$ dividing $n$. Taking $\gamma = p^a$ we see that $U$ has a $p$-modular reduction in which $D$ appears as the unique bottom composition factor, and $\mathbb{F}_p$ as the unique top composition factor. Moreover, if $a \ge 2$ then by taking $\gamma = p$ we get a semisimple $p$-modular reduction of $U$.

This trick for moving composition factors up or down in the radical series of a module generalizes. In fact, over a suitable ring, one may move composition factors around almost at will. One striking application is Lemma 18.2 in Feit, Finite linear groups. The special case for rational representations is proved below.

Theorem. Let $G$ be a finite group and let $U$ be a $\mathbb{Q}G$-module. There is a field $K$ containing $\mathbb{Q}$ such that the $KG$-module $U \otimes_\mathbb{Q} K$ has a semisimple $p$-modular reduction.

Proof. Suppose that $U$ has a $p$-modular reduction having exactly $e$ composition factors. There is an $\mathbb{Z}$-basis of $U$ in which the representing matrices have the block form

$\left( \begin{matrix} X_{11} & pX_{12} & \ldots & pX_{1e} \\ X_{21} & X_{22} & \ldots & pX_{2e} \\ \vdots & \vdots & \ddots & \vdots \\ X_{e1} & X_{e2} & \ldots & X_{ee} \end{matrix} \right).$

Let $K = \mathbb{Q}[p^{1/e}]$. Let $t = p^{1/e}$. Scaling the basis vectors corresponding to the row containing $X_{kk}$ by $t^{k-1}$ for each $k$ gives a $\mathbb{Z}[t]$-basis of $U$ in which the representing matrices are

$\left( \begin{matrix} X_{11} & t^{e-1}X_{12} & \ldots & tX_{ee} \\ tX_{21} & X_{22} & \ldots & t^2X_{2e} \\ \vdots & \vdots & \ddots & \vdots \\ t^{e-1}X_{e1} & t^{e-2}X_{e2} & \ldots & X_{ee} \end{matrix} \right).$

The corresponding $p$-modular reduction, defined by quotienting $\mathbb{Z}[t]$ by its maximal ideal $\langle t \rangle$, is semisimple. $\Box$

The following theorem gives another generalization of the example above. Given a commutative ring $R$ and a partition $\lambda$, let $S_R^\lambda$ denote the corresponding Specht module for $RS_n$.

Theorem. Let $\lambda$ be a partition. There is a $p$-modular reduction of $S^\lambda_\mathbb{Q}$ isomorphic to $(S^\lambda_{\mathbb{F}_p})^\star$.

Proof. Let $n = |\lambda|$. Let $t$ be a tableau of shape $\lambda$ with distinct entries from $\{1,\ldots, n\}$. Let $a_\lambda = \sum_{\sigma} \sigma$, respectively $b_\lambda = \sum_\sigma \mathrm{sgn}(\sigma)$, where the sums are over all permutations $\sigma$ that permute amongst themselves the entries of each row, respectively column, of $t$. There are isomorphisms $S^\lambda_\mathbb{Z} \cong a_\lambda b_\lambda \mathbb{Z}S_n$ and $(S^\lambda_\mathbb{Z})^\star \cong b_\lambda a_\lambda \mathbb{Z}S_n$. Moreover, $a_\lambda b_\lambda$ is a pseudo-idempotent, i.e. $(a_\lambda b_\lambda)^2 = \delta a_\lambda b_\lambda$ for some non-zero $\delta \in \mathbb{Z}$. (For proofs of these facts see Lemma 5 and Exercise 20 in Chapter 7 of Fulton, Young tableaux.) Hence $b_\lambda a_\lambda \sigma \mapsto a_\lambda b_\lambda a_\lambda \sigma$ defines an injective homomorphism $(S^\lambda_\mathbb{Z})^\star \rightarrow S^\lambda_\mathbb{Z}$. Therefore $S^\lambda_\mathbb{Z}$ has a $\mathbb{Z}$-basis defining a reduction modulo $p$ of $S^\lambda_\mathbb{Q}$ isomorphic to $(S^\lambda_{\mathbb{F}_p})^\star$. $\Box$

An alternative description of the inclusion homomorphism in this proof is given by Fayers in Proposition 4.7 of On the structure of Specht modules. The results on Schaper layers in his paper can be used to give many further examples of semisimple $p$-modular reductions of Specht modules that do not require a field extension.

## The fixed point combinator and memoization in Haskell

October 7, 2016

The Fibonacci numbers $F_n$ for $n \in \mathbb{N}_0$ are defined by
$F_n = n$ if $n \le 1$ and $F_n = F_{n-1} + F_{n-2}$ otherwise.

fib n | n <= 1    = n
| otherwise = fib (n-1) + fib (n-2)


The translation is almost verbatim. Unfortunately, as a show-case for functional programming, the definition of fib is pretty terrible. The problem can be seen from the evaluation tree for fib 5 below.

The tree has $F_6 = 8$ leaves. More generally, one can show by induction that the tree for fib n has $F_{n+1}$ leaves. Clearly computing $F_n$ by adding up $F_{n+1}$ numbers, of which $F_n$ are $1$ and $F_{n-1}$ are $0$, is not time efficient.

A more minor objection is that the domain of fib is not $\mathbb{N}_0$. In fact, thanks to Haskell’s polymorphism, fib is defined on values of any numeric type. For example fib 1.5 evaluates to fib 0.5 + fib (-0.5), which, since both arguments are $\le 1$, evaluates to $0.5-0.5 = 0$. It seems remarkable that, without any thought (in fact, with the deliberate absence of thought), we have extended the domain of the Fibonacci function to all numbers understood by Haskell. Anyway, this objection is easily addressed:

fib' :: Integer -> Integer
fib' n | n <  0    = undefined
| n <= 1    = n
| otherwise = fib' (n-1) + fib' (n-2)


Here undefined is the error value, also known as ‘bottom’, that silently inhabits every type.

#### Memoization

To address the serious problem, we need to tell Haskell to remember, or memoize the values of fib, rather than recompute them each time. The following idiom works well and is amenable to generalization.

fibM = \n -> values !! n
where values = [fibAux m | m <- [0..]]
fibAux n | n <= 1    = n
| otherwise = fibM (n-2) + fibM (n-1)


To understand why this works, one has to have some understanding of Haskell’s evaluation model. Haskell is lazy. Roughly put: no value is evaluated unless its necessary for the computation to proceed. Thus if fibM is evaluated at 30 (say), the apparently infinite list [fibAux m | m <- [0..]] is only evaluated as far as its 31st member. Moreover, since values appears at the top level of the ‘where’ in the definition of fibM, it is shared between all invocations of fibM. (To use the correct terminology, it is a constant applicative form’.) Sharing a list means sharing all its members, so each list member, and so each value of fibM is computed at most once.

This point is maybe made more clearly by considering a simpler definition, that fails to memoize correctly.

fibM' n | n <= 1    = n
| otherwise = values !! (n-1) + values !! (n-2)
where values = [fibM' m | m <- [0..]]


The problem is that the list values could potentially depend on $n$ (considering replacing [0..] with [0..n]). and so cannot be shared between all invocations of fibM'. The difference with fibM is that the closure for fibM includes values; in contrast, each evaluation of fibM' n creates a new closure, with a new values list. (So fibM' uses quadratic storage, as well as exponential time—what a triumph!)

Writing fibM needs some care. We have to introduce an auxillary part-memoized’ function, and in turn we have to be careful in fibAux to use memoized values, rather than recursive calls to fibAux. Maybe we could make all this a one-time effort, by writing a function memoize :: (a -> a) -> (a -> a), such that memoize fib evaluates to a memoized version of fib?

A bit of thought suggests this is too ambitious. For example, what if the type a does not support testing for equality? Then we cannot even know if we are evaluating the memoized function on an argument already seen. But surely, we can write a memoizer for functions with domain and range the natural numbers less than $2^{64}$, with type signature memoizeInt :: (Int -> Int) -> (Int -> Int)?

I don’t understand Haskell or functional programming well enough to be completely confident here, but my belief is that no such memoizer can be defined. Consider the following failed attempt.

memoizeInt :: (Int -> Int) -> (Int -> Int)
memoizeInt f = \m -> values !! m
where values = [f m | m <- [0..]]


This is a useful memoizer for a function such as fac n = product [1..n] that may require a lot of work for each evaluation, but do not call themselves recursively. For instance, in ghci 7.8.4 on my laptop, evaluating fac (10^7) takes 4.5s; after defining facM = memoizeInt fac, the first evaluation of facM (10^7) takes 11s, and subsequent evaluations of facM (10^7) are instantaneous. (The result is in every case 0, because of the limited range of the Int type.)

The problem in fibM'' = memoizeInt fib is that when Haskell evaluates fibM'' n, the recursive evaluations fib (n-2) and fib (n-1) call the original fib function, not the memoized function fibM''. We would like to somehow ‘reach into’ the definition of fib so that the recursive calls are redirected, but the immutability of Haskell values makes this impossible.

#### Recursive functions as fixed points

The difficulty seems to be with functions that call themselves. Ignoring memoization, how do compilers cope such functions? We digress to consider a strategy sheds some light on the memoization problem. Consider the function fibBuilder below.

fibBuilder :: (Int -> Int) -> (Int -> Int)
fibBuilder f = g
where g n | n <= 1    = f n
| otherwise = f (n-2) + f (n-1)


The input of fibBuilder is a function. The output is another function which we shall see is a ‘better approximation’ to a correct Fibonacci function. Consider for example

[id m | m <- [0,1,2,3]
[g m | m <- [0,1,2,3]] where g = fibBuilder id
[h m | m <- [0,1,2,3]] where h = (fibBuilder (fibBuilder id))


The output for the identity function id is clearly [0,1,2,3]. Chasing through the evaluations of g we get g 2 $\rightarrow$ id 0 + id 1  $\rightarrow$ 0 + 1 $\rightarrow$ 1 and g 3  $\rightarrow$ id 1 + id 2 $\rightarrow$ 1 + 2 $\rightarrow$ 3. Hence h 3 $\rightarrow$ g 2 + g 1 where g = fibBuilder id evaluates to  1 + 1, and so to 2. Thus g agrees with fib on 0, 1, 2 and h agrees with fib on 0, 1, 2, 3.

An easy inductive argument shows that a function defined, like g and h, but with $r$ calls to fibBuilder agrees with fib on $\{0,1,\ldots, r+1\}$. We want to somehow ‘pass to the limit’, and get the function fibInf defined by calling fibBuilder infinitely many times. So fibInf should be the same as fibBuilder fibInf. By analogy with the infinite list ones = 1 : ones, we boldly define

fibInf = fibBuilder fibInf


The result is disappointing: no evaluation of fibInf ever terminates. For example, fibInf 3 $\rightarrow$ (fibBuilder fibInf) 3 $\rightarrow$ fibInf 2 + fibInf 1 $\rightarrow$ (fibBuilder fibInf) 2 + fibInf 1 $\rightarrow$ fibInf 1 + fibInf 0 + fibInf 1. Now the evaluation gets stuck, since each summand evaluates to itself. But there is a simple solution: extend the builder function so that it knows the initial conditions for the Fibonacci numbers, as well as the pattern of recursion. (This amounts to deleting one f in the original definition.)

fibBuilder' :: (Int -> Int) -> (Int -> Int)
fibBuilder' f = g
where g n | n <= 1    = n
| otherwise = f (n-2) + f (n-1)

fibInf' = fibBuilder' fibInf'


This works. While non-obvious to define, neither of the functions above calls itself recursively. So, at least in this case, we’ve seen that a compiler that can cope with functions that return other functions can cope with a truly recursive function.

An alternative definition, emphasising that fibInf and fibBuilder fibInf are the same is

fibInf' = fix fixBuilder' where fix k = f (fix k)

Thus the defining property of fibInf' also serves as a working Haskell definition. For more theory on the fixed point combinator see the Wikipedia page. If there was a prize for ‘function that seems the most useless, and is in fact the most useful’, fix would be a strong contender.

#### Memoizing builder functions

In the same spirit of ‘defining property as working definition’, we might try to define a memoized version of fib by

fibMInf = fibBuilder' (memoizeInt fibMInf)


Remarkably enough, this works. For example, chasing through the evaluation of fibMInf 3 we get fibMInf 3 $\rightarrow$ fibBuilder' (memoizeInt fibMInf) 3 $\rightarrow$ f 1 + f 2 where f = memoizeInt fibMInf. The closure for f has the list values, defined by values = [f m | m <- [0..]]. So, as if by magic, we now call a memoized function in the recursive evaluations.

So although we haven’t been able to define a good memoizer with type (Int -> Int) -> (Int -> Int), we can use the naive attempt above to write a good memoizer for builder functions.

memoizeBuilder :: ((Int -> Int) -> (Int -> Int)) -> (Int -> Int)
memoizeBuilder builder = fix (builder . memoizeInt)

fibMInf' = memoizeBuilder fibBuilder'


## Differences of permutation characters of symmetric groups

July 13, 2016

Here are a few small observations on permutation characters motivated by discussions at the Herstmonceux conference Algebraic combinatorics and group actions. Let $\chi^\lambda$ denote the irreducible character of the symmetric group $S_n$ labelled by the partition $\lambda$ of $n$. Let $\pi^\lambda$ denote the permutation character of $S_n$ acting on the cosets of the Young subgroup $S_\lambda$.

• Since $\pi^\lambda = \chi^\lambda + \phi$ where $\phi$ is an integral linear combination of characters $\chi^\mu$ for partitions $\mu$ dominating $\lambda$, any irreducible character of $S_n$ is an integral linear combination of the $\pi^\lambda$.

• This can be made more explicit as follows. Let $\delta = (n-1,\ldots,1,0)$. For $\sigma \in S_n$, let $\lambda \cdot \sigma = (\lambda + \delta)\sigma - \delta$, where $\sigma$ acts on $\lambda + \delta$ by place permutation. The Jacobi–Trudi formula states that

$\chi^\lambda = \sum_{\sigma \in S_n} \pi^{\lambda \cdot \sigma} \mathrm{sgn}(\sigma).$

• It is a special property of symmetric groups that any character is a difference of permutation characters. For example, the cyclic group $C_n$ has $\mathrm{d}(n)$ (the number of divisors of $n$) distinct permutation characters, but $n$ irreducible characters, so unless $n \le 2$, the additive group generated by the permutation characters is a proper subgroup of the integral character ring. And clearly no irreducible character of a finite group taking a non-rational value can be an integral linear combination of permutation characters.

• Since a transitive permutation character contains the trivial character exactly once, the trivial character is not a difference of two transitive permutation characters. A further example is given below.

Claim. $\chi^{(n-1,1)} + \chi^{(1^n)}$ is not a difference of two transitive permutation characters of $S_n$.

Proof. Let $\pi_K$ denote the permutation character of $S_n$ acting on the subgroup $K$ of $S_n$. Note that the sign character, $\chi^{(1^n)}$, appears in $\pi_K$ if and only if $K \le A_n$.

Let $s(K)$ be the number of orbits of $K$ on $\{1,\ldots, n\}$. Since

$\chi^{(n-1,1)} = \pi^{(n-1,1)} - 1_{S_n},$

it follows from Frobenius reciprocity and Mackey’s Formula that $\langle \pi_K, \chi^{(n-1,1)} \rangle = s(K)-1$. Consider $\chi^{(2,1^{n-2})}$. Multiplying the displayed equation above by $\mathrm{sgn}$, we get

$\chi^{(2,1^{n-2})} = \mathrm{sgn}_{S_{n-1}}\uparrow^{S_n} - \mathrm{sgn}_{S_n}.$

By Frobenius reciprocity

$\langle \pi_K, \mathrm{sgn}_{S_{n-1}}\uparrow^{S_n} \rangle = \langle \pi_K \uparrow^{S_n} \downarrow_{S_{n-1}}, \mathrm{sgn}_{S_{n-1}} \rangle.$

By Mackey’s Formula, the right-hand side is

$\sum \langle \pi_K \uparrow_{K^g \cap S_{n-1}}^{S_{n-1}}, \mathrm{sgn}_{S_{n-1}} \rangle,$

where the sum is over distinct double cosets $K g S_{n-1}$. It follows that $\langle \pi_K, \chi^{(2,1^{n-2})} \rangle \le s(K)-1$, with equality when $K \le A_n$.

Suppose that $H$ and $G$ are subgroups of $S_n$ such that $\pi_H - \pi_G$ contains $\chi^{(n-1,1)}$ and $\chi^{(1^n)}$. From $\chi^{(1^n)}$ we see that $H \le A_n$ and $G \not\le A_n$. From $\chi^{(n-1,1)}$, we see that $s(G) = s(H) -1$. By the previous paragraph,

$\langle \pi_H, \chi^{(2,1^{n-2})} \rangle = s(H)-1,$

whereas

$\langle \pi_G, \chi^{(n-2,1^{n-2})} \rangle \le s(G)-1 < s(H)-1.$

Hence $\chi^{(2,1^{n-2})}$ also appears in $\pi_H -\pi_G$. Therefore

$\chi^{(n-1,1)} + \chi^{(1^n)}$

is not a difference of two transitive permutation characters. $\Box$

• Since $\pi^{(n-r,r)} - \pi^{(n-r+1,r-1)} = \chi^{(n-r,r)}$ for $1 \le r \le n/2$ (this is a special case of the Jacobi—Trudi formula) each $\chi^{(n-r,r)}$ is a difference of two transitive permutation characters. Moreover, $\chi^{(1^n)} = \pi_{A_n} - 1_{S_n}$. Exhausting over all subgroups of $S_n$ by computer algebra supports the conjecture that, provided $n \ge 9$, these are the only irreducible characters of $S_n$ that are the difference of two transitive permutation characters.
• The behaviour of $S_6$ is interesting: all irreducible characters except for $\chi^{(3,2,1)}$ are the difference of two transitive permutation characters. This is explained in part by the outer automorphism that exists in this case.