## Difference attacks on toy block ciphers

May 22, 2019

As a toy model for a difference attack in my Cipher Systems course, I used the block cipher with plaintext and ciphertext space $\mathbb{F}_2^8$, key space $\mathbb{F}_2^8 \times \mathbb{F}_2^8$ and encryption functions defined by

$e_{(k,k')}(x) = P(x+k) + k',$

where $P : \mathbb{F}_2^8 \rightarrow \mathbb{F}_2^8$ is the pseudo-inverse function from AES. Thus $P$ is defined by identifying $\mathbb{F}_2^8$ with the finite field $\mathbb{F}_{2^8}$ and then inverting non-zero elements. The only fixed points of $P$ are the zero word and the identity $1$, since $\alpha^{-1} = \alpha$ forces $\alpha^2 = 1$ and hence $\alpha = 1$.
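As a sanity check, here is a minimal Python sketch of the toy cipher, assuming the AES conventions for $\mathbb{F}_{2^8}$ (reduction polynomial $x^8+x^4+x^3+x+1$, i.e. `0x11b`); the names `gf_mul` and `encrypt` are mine.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) with the AES reduction polynomial 0x11b."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return r

# pseudo-inverse as a lookup table: P[0] = 0, P[x] = x^{-1} otherwise
P = [0] * 256
for x in range(1, 256):
    P[x] = next(y for y in range(1, 256) if gf_mul(x, y) == 1)

def encrypt(x, k1, k2):
    """e_{(k,k')}(x) = P(x + k) + k', with + being XOR on bytes."""
    return P[x ^ k1] ^ k2
```

Since $P$ is an involution, decryption is simply $x = P(z + k') + k$.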

Suppose you know a single plaintext/ciphertext pair $(x,z)$. For each guessed first key $k_\star$ there is a unique $k'_\star$ such that $e_{(k_\star,k'_\star)}(x) = z$, namely

$k'_\star = P(x+k_\star) + z.$

Thus a single known plaintext/ciphertext pair does not reveal either component of the key. To motivate the attack in this post, let us consider what two pairs $(x,z)$ and $(x+\Delta, z+\Gamma)$ reveal. To save space, write $x_\Delta$ for $x + \Delta$. Suppose that $(k_\star, k'_\star)$ is a possible key:

\begin{aligned} x &\mapsto x + k_\star\hskip7.5pt \mapsto P(x+k_\star) \hskip7pt \mapsto P(x+k_\star) + k'_\star \hskip8pt = z \\ x_\Delta &\mapsto x_\Delta + k_\star \mapsto P(x_\Delta + k_\star) \mapsto P(x_\Delta+k_\star) +k'_\star = z + \Gamma.\end{aligned}

A little thought shows that another possible key is $(k_\star + \Delta, k'_\star + \Gamma)$:

\begin{aligned} x &\mapsto x_\Delta + k_\star \mapsto P(x_\Delta+k_\star) \mapsto P(x_\Delta+k_\star) + k'_\star + \Gamma = z \\ x_\Delta &\mapsto x + k_\star\hskip7.5pt \mapsto P(x + k_\star)\hskip7pt \mapsto P(x+k_\star) +k'_\star +\Gamma\hskip8pt = z + \Gamma.\end{aligned}

Observe that the differences of top and bottom are in both cases

$\Delta \mapsto \Delta \mapsto \Gamma \mapsto \Gamma.$

This illustrates the critical point that differences are invariant under key addition, but typically changed by non-linear functions such as $P$. By the lemma below, for any non-zero input difference $\Delta \in \mathbb{F}_2^8$, the output difference

$P(x+k_\star) + P(x + k_\star + \Delta)$

can be any one of $127$ of the $256$ elements of $\mathbb{F}_2^8$. This is usually taken as a sign of the cryptographic strength of $P$. But here it means that one can (apart from in one exceptional case) determine the pair $\{x+k_\star,x+k_\star+\Delta\}$ from $\Delta$ and $\Gamma$. For the toy cipher, this pair is $\{x+k, x+\Delta+k\}$, and since $x$ is known we determine $\{k, k+\Delta\}$, leaving exactly two possible first keys, each of which determines the second key uniquely.
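The two-pair key recovery can be demonstrated in a few lines (a self-contained sketch; the key bytes are chosen arbitrarily, and the field arithmetic uses the AES reduction polynomial `0x11b`):

```python
def gf_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return r

P = [0] * 256
for v in range(1, 256):
    P[v] = next(w for w in range(1, 256) if gf_mul(v, w) == 1)

k, k2 = 0xb4, 0x3e      # the secret key (k, k'); values chosen arbitrarily
x, D = 0x17, 0x01       # known plaintext and input difference Δ
z = P[x ^ k] ^ k2
zD = P[x ^ D ^ k] ^ k2
G = z ^ zD              # observed output difference Γ

# keep the first keys consistent with the observed output difference
candidates = [g for g in range(256) if P[x ^ g] ^ P[x ^ D ^ g] == G]
assert k in candidates and (k ^ D) in candidates
assert len(candidates) in (2, 4)   # 4 only in the exceptional case Γ = 1/Δ
```

Generically exactly the two keys $k$ and $k + \Delta$ survive; in the exceptional case $\Gamma = 1/\Delta$ there are four candidates.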

#### Differences through pseudo-inverse.

Lemma. Let $\delta \in \mathbb{F}_{2^8} \backslash \{0\}$. If $\gamma \in \mathbb{F}_{2^8} \backslash \{0,1/\delta\}$ then the equation $p(\alpha) + p(\alpha+\delta) = \gamma$ has exactly zero or two solutions.

Proof. If $\alpha \in \{0,\delta\}$ then

$p(\alpha)+p(\alpha+\delta) = p(0) + p(\delta) = 0 + 1/\delta = 1/\delta.$

We may therefore assume that $\alpha\not\in \{0,\delta\}$ and so $p(\alpha) = \alpha^{-1}$ and $p(\alpha+\delta) = (\alpha+\delta)^{-1}$. Now $\alpha^{-1} + (\alpha+\delta)^{-1} = \gamma$ if and only if $\delta = \gamma \alpha (\alpha + \delta)$. This is a quadratic equation in $\alpha$ with no repeated root: in characteristic $2$ a repeated root of $\gamma\alpha^2 + \gamma\delta\alpha + \delta$ would force the coefficient $\gamma\delta$ of $\alpha$ to vanish. $\Box$.

(In the exceptional case $\gamma = 1/\delta$, there are four solutions, namely $0$, $\delta$, and the two roots of $(\alpha/\delta)^2 + (\alpha/\delta) + 1$ lying in $\delta \mathbb{F}_{2^2}^\times \subseteq \mathbb{F}_{2^8}$. In fact, since $(\alpha/\delta)^2 + (\alpha/\delta) + 1/\gamma\delta$ has $2$ roots if and only if $1/\gamma\delta \in \{\beta^2 + \beta : \beta \in \mathbb{F}_{2^8}\}$, which is a co-dimension $1$ subspace of $\mathbb{F}_{2^8}$, there are exactly $127$ output differences $\gamma$ for each non-zero input difference $\delta$, each except $1/\delta$ arising from exactly two inputs $\alpha$.)
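The counts in the lemma and the parenthetical paragraph can be confirmed by brute force over the AES field (a self-contained sketch; `gf_mul` and the table `P` are my names):

```python
from collections import Counter

def gf_mul(a, b):
    """Multiply in GF(2^8) with the AES reduction polynomial 0x11b."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11b
        b >>= 1
    return r

P = [0] * 256
for v in range(1, 256):
    P[v] = next(w for w in range(1, 256) if gf_mul(v, w) == 1)

# output differences for input difference δ = 1, with multiplicities
counts = Counter(P[a] ^ P[a ^ 1] for a in range(256))
assert len(counts) == 127          # exactly 127 output differences occur
assert counts[1] == 4              # the exceptional γ = 1/δ = 1
assert all(c == 2 for g, c in counts.items() if g != 1)
```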

In my course I spent an entire lecture on the difference attack, simplifying the lemma by only considering $\Delta = 0000\,0001$, corresponding to $\delta = 1 \in \mathbb{F}_{2^8}$, and omitting the parenthetical paragraph entirely. Still I’m fairly sure no-one except the keenest M.Sc. students got the idea. Probably I will scrap it next year, and instead give an example from a different toy block cipher with a more easily defined $S$-box. Or maybe, since there is no chance of universal comprehension, I should just leave it to a problem sheet, with a hint that the lemma above follows from Hilbert’s Theorem 90. (Joke.)

#### Variations on pseudo-inverse

• The exceptional case $\gamma=1/\delta$ is annoying. It arises because $X^2+X+1$ has a root in $\mathbb{F}_{2^8}$. Over any $\mathbb{F}_{2^c}$ with $c$ odd, there are $2^{c-1}$ output differences for any non-zero input difference $\delta$. For instance, over $\mathbb{F}_{2^2}$ we only see the exceptional case (and even more embarrassingly, the pseudo-inverse function $p$ is linear), while over $\mathbb{F}_{2^3}$, defined by $\beta^3 + \beta = 1$, the output differences for input difference $1$ are $\{1,1+\beta, 1+\beta^2, 1+\beta+\beta^2\}$; these are the (no-longer-so) exceptional $\gamma = 1$, and the reciprocals of the non-zero elements in the subspace

$\{\alpha^2 + \alpha : \alpha \in \mathbb{F}_{2^3} \} = \langle \beta, \beta^2\rangle.$

• It is mathematically tempting to replace the tricky finite field $\mathbb{F}_{2^8}$ with $\mathbb{F}_p$ for $p$ an odd prime. Taking the input difference $1$, now applied (and extracted) using arithmetic in $\mathbb{F}_p$, we have

$p(x+1) - p(x) = 1/(x+1) - 1/x = -1/x(x+1)$

for $x\not\in \{-1,0\}$. Quadratic equations are now more easily understood: completing the square shows that $p(x+1) - p(x) = \gamma$, viewed as a quadratic in $x$, is solvable for exactly $(p-1)/2$ of the elements $\gamma \in \mathbb{F}_p^\times$, while the exceptional inputs $x \in \{-1,0\}$ both give output difference $1$. When $p \equiv 1 \bmod 3$ the difference $1$ is already among these $(p-1)/2$ values (as $x^2+x+1$ then has a root), so there are exactly $(p-1)/2$ output differences and $1$ has double the probability of the others; when $p \equiv 2 \bmod 3$ there are $(p+1)/2$.

• I hoped that $\mathbb{Z}/2^n\mathbb{Z}$ might give a nice example, using the function $q$ such that $q(2x) = 1/(2x+1) -1$ and $q(2x+1) = 1/(2x+1)$. This function is weak with respect to the input difference $1$, since, by definition, $q(2x+1)-q(2x) = 1$ for any $x$. Perhaps surprisingly, this makes the difference $1$ of little use in the attack: for an even input the output difference is always $1$, revealing nothing about the key. The situation for even input differences is somewhat messy. Clearly this is a non-starter for my course.
• Another try is to use $r(\alpha) = \alpha^3$ in $\mathbb{F}_{2^c}$. We have

$(\alpha+1)^3 + \alpha^3 = \alpha^2 + \alpha + 1,$

so as seen above, there is an affine codimension $1$ subspace of output differences. Unfortunately cubing is invertible only if $c$ is odd; this is precisely the case when the subspace is properly affine, and it rules out $\mathbb{F}_{2^8}$.
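The $\mathbb{F}_p$ count in the second bullet is easy to confirm empirically; the sketch below tests primes $p \equiv 1 \bmod 3$, for which the exceptional output difference $1$ also arises from the quadratic:

```python
# For p ≡ 1 (mod 3), the map x -> 1/x on F_p (fixing 0) has exactly
# (p-1)/2 output differences for input difference 1.
for p in (7, 13, 19, 31):
    def inv(x):
        # inverse in F_p by Fermat's little theorem, with inv(0) = 0
        return pow(x, p - 2, p) if x % p else 0
    diffs = {(inv(x + 1) - inv(x)) % p for x in range(p)}
    assert len(diffs) == (p - 1) // 2
```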

## The sum of all hook characters

April 2, 2019

Given a partition $\lambda$ of $n$, let $\chi^\lambda$ denote the irreducible character of the symmetric group $S_n$ canonically labelled by the partition $\lambda$. For example, $\chi^{(n)} = 1$ is the trivial character, $\chi^{(1^n)}$ is the sign character, and $\chi^{(n-1,1)}$ is the irreducible constituent of degree $n-1$ of the natural permutation character $\pi$. Thus $\pi(g) = |\mathrm{Fix}(g)|$ and $\chi^{(n-1,1)}(g) = |\mathrm{Fix}(g)| - 1$ for $g \in S_n$. Let $\Gamma = \sum_{a=0}^{n-1} \chi^{(n-a,1^a)}$ be the sum of all the characters labelled by hook partitions. Let $\mathcal{O}_n$ be the set of permutations in $S_n$ having only cycles of odd length.

A nice short paper by Jay Taylor uses characters of the symmetric group labelled by skew partitions to prove that

$\Gamma(g) = \begin{cases} 2^{\mathrm{cyc}(g) - 1} & g \in \mathcal{O}_n \\ 0 & g \not\in \mathcal{O}_n \end{cases}$

where $\mathrm{cyc}(g)$ is the number of cycles in $g$.

As Taylor notes, this result is a special case of a more general theorem of Regev, proved by him using Lie superalgebras. Taylor’s paper gives a short proof using skew characters of the symmetric group. The purpose of this post is to give an alternative proof, using only a basic result on exterior powers of the natural permutation character and to suggest some generalizations not contained in Regev’s original results.

#### Background: a generating function for exterior powers.

Let $\bigwedge^k$ denote the $k$th exterior power of a character or representation. Let $V$ be a complex representation of a finite group $G$ with character $\chi$ and let $g \in G$ have eigenvalues $\lambda_1, \ldots, \lambda_d$ in its action on $V$. Thus $\chi(g) = \lambda_1 + \cdots + \lambda_d$. Let $v_1,\ldots, v_d$ be a basis for $V$ in which $g$ acts diagonally. Then

$\bigwedge^r V = \langle v_{i_1} \wedge\cdots \wedge v_{i_r} : i_1 < \ldots < i_r \rangle_\mathbb{C}$

and we see that

$(\bigwedge^r \chi) (g) = \sum_{i_1 < \ldots < i_r} \lambda_{i_1} \ldots \lambda_{i_r}.$

This is the $r$-th elementary symmetric polynomial evaluated at the eigenvalues $\lambda_1, \ldots, \lambda_d$. Hence

$(1)\quad \sum_{r=0}^d (\bigwedge^r \chi)(g) x^r = (1+\lambda_1 x) \cdots (1+\lambda_d x).$

This generating function is the counterpart for exterior powers of the Molien series, defined using symmetric powers, that is fundamental to invariant theory.

#### Background: exterior powers of the natural module.

Let $V = \langle e_1, \ldots, e_n\rangle_\mathbb{C}$ be the natural representation of $S_n$ over the complex numbers. Now $\bigwedge^n V = \langle e_1 \wedge \cdots \wedge e_n \rangle$ is the sign representation of $S_n$, and correspondingly, $\bigwedge^n \pi = \chi^{(1^n)}$. More generally, $\bigwedge^r V = \langle e_{i_1} \wedge \cdots \wedge e_{i_r} : i_1 < \ldots < i_r\rangle_\mathbb{C}$ is the monomial representation of $S_n$ induced from $\mathrm{sgn}_{S_r} \times \mathbb{C}_{S_{n-r}}$. Its character is known to be

$(2)\quad\bigwedge^r \pi = \chi^{(n-r,1^r)} + \chi^{(n-r+1,1^{r-1})}$

for $1 \le r < n$. This decomposition is an immediate corollary of either the Young or Pieri rules. It may also be proved using Specht modules and an explicit isomorphism $\bigwedge^r S^{(n-1,1)} \cong S^{(n-r,1^r)}$: see for instance Section 5 of this paper with Giannelli and Lim.

#### Proof.

By (2) we have

$\chi^{(n-r,1^r)} = \bigwedge^r \pi - \bigwedge^{r-1} \pi + \cdots + (-1)^{r-1} \pi + (-1)^r 1.$

Hence the coefficient of $\bigwedge^k \pi$ in $\Gamma$ is $1$ if $n-k$ is odd, and $0$ if $n-k$ is even. For example,

\begin{aligned} \Gamma_4 &= \chi^{(4)} + \chi^{(3,1)} + \chi^{(2,1,1)} + \chi^{(1,1,1,1)} \\ &= 1+ (\pi - 1) + (\bigwedge^2 \pi - \pi + 1) + (\bigwedge^3 \pi - \bigwedge^2 \pi + \pi - 1) \\ &= \pi + \bigwedge^3 \pi \end{aligned}

Thus $\Gamma = \sum_{k} \bigwedge^k \pi$, where the sum is over all $k$ such that $n-k$ is odd.

This sum is related to (1). Let $g \in S_n$ have precisely $a_k$ cycles of length $k$, for each $k \in \{1,\ldots, n\}$. The eigenvalues of the $n \times n$ permutation matrix representing $g$ can be found by considering the cycles separately: each $k$-cycle contributes eigenvalues $1, \zeta_k, \ldots, \zeta_{k}^{k-1}$, where $\zeta_k$ is a primitive $k$th root of unity. Since $(1+x)(1+\zeta x) \cdots (1+\zeta^{k-1}x ) = 1 + (-1)^{k-1} x^k$, the generating function in (1) gives

$\sum_{k=0}^n (\bigwedge^k \pi)(g)x^k = (1+x)^{a_1}(1-x^2)^{a_2} \ldots (1+(-1)^{n-1} x^n)^{a_n}.$

Denote this polynomial by $F_g(x)$. We now have

$\Gamma(g) = \begin{cases} \frac{F_g(1) + F_g(-1)}{2} & n \equiv 1 \bmod 2 \\ \frac{F_g(1) - F_g(-1)}{2} & n \equiv 0 \bmod 2. \end{cases}$

Since

$F_g(1) = \begin{cases} 2^{\mathrm{cyc}(g)} & g \in \mathcal{O}_n \\ 0 & g \not\in \mathcal{O}_n \end{cases}$

and $F_g(-1) = 0$ for all $g$, the result follows.
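Both ingredients of the proof can be checked by brute force in small symmetric groups, using the standard fact that $(\bigwedge^k \pi)(g)$ is the signed count of $g$-invariant $k$-subsets (the principal minors of the permutation matrix); all function names below are mine:

```python
from itertools import combinations, permutations

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple i -> perm[i]."""
    seen, lens = set(), []
    for i in range(len(perm)):
        if i not in seen:
            j, c = i, 0
            while j not in seen:
                seen.add(j)
                j, c = perm[j], c + 1
            lens.append(c)
    return lens

def F_coeffs(perm):
    """Ascending coefficients of F_g(x) = prod over cycles of (1 + (-1)^(k-1) x^k)."""
    n = len(perm)
    c = [1] + [0] * n
    for k in cycle_lengths(perm):
        for i in range(n - k, -1, -1):   # multiply in place by 1 + (-1)^(k-1) x^k
            c[i + k] += (-1) ** (k - 1) * c[i]
    return c

def wedge_trace(perm, k):
    """(Λ^k π)(g): sum of sgn(g restricted to S) over setwise-fixed k-subsets S."""
    total = 0
    for S in combinations(range(len(perm)), k):
        if {perm[i] for i in S} == set(S):
            seen, cycles = set(), 0
            for s in S:
                if s not in seen:
                    cycles += 1
                    j = s
                    while j not in seen:
                        seen.add(j)
                        j = perm[j]
            total += (-1) ** (k - cycles)   # sgn of the restriction
    return total

n = 5
for g in permutations(range(n)):
    # the generating function identity (1), specialised to permutation matrices
    assert F_coeffs(g) == [wedge_trace(g, k) for k in range(n + 1)]
    # Taylor's formula for Γ(g)
    gamma = sum(wedge_trace(g, k) for k in range(n + 1) if (n - k) % 2 == 1)
    lens = cycle_lengths(g)
    expected = 2 ** (len(lens) - 1) if all(l % 2 == 1 for l in lens) else 0
    assert gamma == expected
```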

#### Possible generalizations.

Let $\Delta = \sum_{a} \chi^{(n-a,1^a)}$ where the sum is over all $a$ not congruent to $1$ modulo $3$. Then by (2), $\Delta = \sum_{b \ge 0} \bigwedge^{3b} \pi$, and the proof above suggests that $\Delta(g)$ may also have a fairly simple form. One might also look for analogous identities replacing $\bigwedge^k \pi$ with the permutation character of $S_n$ acting on the $k$-subsets of $\{1,\ldots, n\}$.

## Duality, polynomial representations and plethysms

February 27, 2019

The purpose of this post is to collect some standard results on duality and symmetric and exterior powers for representations of general and special linear groups. As a warm-up, here is a small true/false quiz with some, I think, rather unintuitive answers. Let $K$ be the algebraic closure of $\mathbb{F}_2$ and let $E$ be the natural representation of $\mathrm{GL}_2(K)$.

1. $\mathrm{Sym}^2 E$ is an irreducible representation of $\mathrm{GL}_2(K)$.
2. The restriction of $E$ to $\mathrm{GL}_2(\mathbb{F}_4)$ is self-dual.
3. $E$ is self-dual.
4. The restriction of $E$ to $\mathrm{SL}_2(\mathbb{F}_4)$ is self-dual.
5. The restriction of $\mathrm{Sym}^2 E$ to $\mathrm{SL}_2(\mathbb{F}_4)$ is self-dual.
6. The restriction of $\mathrm{Sym}^2 E$ to $\mathrm{GL}_2(\mathbb{F}_2)$ is self-dual, reducible and semisimple.

Answers, all of which should be clear by the end of this post: 1. False, 2. False, 3. False (but true if you interpreted duality as contragredient duality for algebraic groups), 4. True, 5. False, 6. True.

For deeper results and more of the general theory from coalgebras and algebraic groups, see Green, Polynomial representations of $\mathrm{GL}_n$, Springer Lecture Notes in Mathematics 830. These notes of mine might also be of interest.

### Background

Let $K$ be an infinite field. Fix $d \in \mathbb{N}$. Let $x_{ij}$ for $1 \le i,j \le d$ be $d^2$ commuting indeterminates and let $K[x_{ij}]$ be the polynomial ring they generate. Given $f \in K[x_{ij}]$ and a $d \times d$ matrix $X$, let $f(X)$ denote $f$ evaluated at the $d^2$ matrix coefficients $X_{ij}$ for $1 \le i,j \le d$. A representation $\rho : \mathrm{GL}_d(K) \rightarrow \mathrm{GL}_D(K)$ of the general linear group is said to be polynomial of degree $r$ if there exist polynomials $f_{IJ} \in K[x_{ij}]$ for $1 \le I, J \le D$ such that $\rho(X)_{IJ} = f_{IJ}(X)$ for all matrices $X$.

The natural left $\mathrm{GL}_d(K)$-module $E$ is a polynomial representation of $\mathrm{GL}_d(K)$ of dimension $d$ and degree $1$, in which $f_{IJ} = x_{IJ}$ for $1 \le I, J \le d$. The symmetric power $\mathrm{Sym}^r E$ (defined as a quotient of $E^{\otimes r}$) is a polynomial representation of degree $r$. For example if $d = 2$ then $\mathrm{Sym}^2 E = \langle e_1^2, e_2^2, e_1e_2 \rangle$ is the polynomial representation of degree $2$ in which

$\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) \mapsto \left( \begin{matrix} \alpha^2 & \beta^2 & \alpha \beta \\ \gamma^2 & \delta^2 & \gamma \delta \\ 2\alpha \gamma & 2\beta \delta & \alpha\delta + \beta\gamma \\ \end{matrix} \right).$

As a check on notational conventions, observe that the first column of the matrix on the right-hand side records the image

$(\alpha e_1 + \gamma e_2)^2 = \alpha^2 e_1^2 + \gamma^2 e_2^2 + 2\alpha\gamma e_1e_2$

of $e_1^2$. Hence $f_{11} = x_{11}^2$, $f_{21} = x_{21}^2$ and $f_{31} = 2x_{11}x_{21}$. Note that since $K$ is infinite, the degree of the matrix coefficients on the right-hand side is unambiguously two. This fails over finite fields: in fact, by polynomial interpolation, any function $f : \mathrm{GL}_d(\mathbb{F}_q) \rightarrow \mathbb{F}_q$ is a polynomial, of degree at most $(q-1)d^2$. Another difference between the infinite algebraic group $\mathrm{GL}_d(K)$ and the finite group $\mathrm{GL}_d(\mathbb{F}_q)$ arises when we define duality.
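As a check that the displayed matrix really defines a representation, multiplicativity can be verified numerically (a sketch over the integers; `sym2` is my name for the matrix above):

```python
def sym2(m):
    """Matrix of Sym^2 E in the basis e1^2, e2^2, e1e2, as displayed above."""
    (a, b), (c, d) = m
    return [[a*a, b*b, a*b],
            [c*c, d*d, c*d],
            [2*a*c, 2*b*d, a*d + b*c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A, B = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
# Sym^2 is a functor: the matrix of a product is the product of the matrices
assert sym2(matmul(A, B)) == matmul(sym2(A), sym2(B))
```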

### Duality and symmetric powers

Mimicking the definition from finite groups, we would define the dual of a representation $\rho : \mathrm{GL}_d(K) \rightarrow \mathrm{GL}(V)$ to be the vector space dual $V^\star$ with the action

$\rho^\star(X) \theta = [v \mapsto \theta(\rho(X^{-1}) v)].$

Let $v_1, \ldots, v_D$ be a basis for $V$, with dual basis $v_1^\star, \ldots, v_D^\star$. The calculation

$(\rho^\star(X) v_j^\star)v_i = v_j^\star(\rho(X^{-1})v_i) = \rho(X^{-1})_{ji}$

shows that $\rho^\star(X) v_j^\star = \sum_i \rho(X^{-1})_{ji} v_i^\star$, and so (carefully remembering the convention for matrices acting on column vectors), we have $\rho^\star(X) = \rho(X^{-1})^t$. The inverse and transpose combine to make this a well-defined representation, but except when $V$ is a direct sum of trivial representations, each of degree $0$, it is not polynomial.

Example 1 In the basis $e_1^2$, $e_2^2$, $e_1e_2$ of $\mathrm{Sym}^2 E$, we have

\begin{aligned} \rho_{(\mathrm{Sym}^2 E)^\star}\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) &= \left( \rho_{\mathrm{Sym}^2 E} \left( \frac{1}{\Delta} \left( \begin{matrix} \delta & -\beta \\ -\gamma & \alpha \end{matrix} \right) \right) \right)^t \\ &= \frac{1}{\Delta^2} \left( \begin{matrix} \delta^2 & \beta^2 & - \beta\delta\\ \gamma^2 & \alpha^2 & -\alpha\gamma \\ -2\gamma \delta & -2\alpha\beta & \alpha \delta + \beta \gamma \end{matrix} \right)^t \\ &= \frac{1}{\Delta^2} \left( \begin{matrix} \delta^2 & \gamma^2 & -2\gamma\delta \\ \beta^2 & \alpha^2 & -2\alpha\beta \\ -\beta\delta & -\alpha\gamma & \alpha \delta + \beta \gamma \\ \end{matrix} \right) \end{aligned}

where $\Delta = \alpha \delta - \beta\gamma$ denotes the determinant. The top-left coefficient is $\delta^2 / \Delta^2$, and this is not a polynomial in $\alpha, \beta, \gamma, \delta$. We shall see later that when this representation is restricted to the special linear group $\mathrm{SL}_2(K)$, duality is much better behaved.

Because conventional duality takes us out of the category of polynomial representations, a different definition of duality is used. We define the contragredient dual of a polynomial representation $V$ of $\mathrm{GL}_d(K)$ to be the vector space dual $V^\star$ with the action

$\rho^\circ(X) \theta = [v \mapsto \theta(\rho(X^t)v)]$

for $X \in \mathrm{GL}_d(K)$ and $\theta \in V^\star$. A very similar calculation to the above shows that $\rho^\circ(X) = \rho(X^t)^t$. Note that typically, the two transposes do not cancel. For instance,

\begin{aligned} \rho_{(\mathrm{Sym}^2 E)^\circ}\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) &= \left( \rho_{\mathrm{Sym}^2 E} \left( \begin{matrix} \alpha & \gamma \\ \beta & \delta \end{matrix} \right) \right)^t \\ &= \left( \begin{matrix} \alpha^2 & \gamma^2 & \alpha\gamma \\ \beta^2 & \delta^2 & \beta\delta \\ 2\alpha\beta & 2\gamma\delta & \alpha \delta + \beta \gamma \end{matrix} \right)^t \\ &= \left( \begin{matrix} \alpha^2 & \beta^2 & 2\alpha\beta \\ \gamma^2 & \delta^2 & 2\gamma\delta \\ \alpha\gamma & \beta\delta & \alpha \delta + \beta \gamma \end{matrix} \right)\end{aligned}

is the matrix for $\mathrm{Sym}^2 E$ but with $\alpha\beta$ and $\gamma\delta$ doubled and $2\alpha\gamma$ and $2\beta\delta$ halved.

To make this construction more explicit, we shall prove that $(\mathrm{Sym}^2 E)^\circ \cong \mathrm{Sym}_2E$, where

$\mathrm{Sym}_2E = \langle e_1 \otimes e_1, e_2 \otimes e_2, e_1 \otimes e_2 + e_2 \otimes e_1 \rangle$

is the symmetric square, but now constructed as a submodule of $E \otimes E$. (The formal definition is below.) Indeed, with respect to the indicated basis of $\mathrm{Sym}_2E$, the action is

$\rho_{\mathrm{Sym}_2E} \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) = \left( \begin{matrix} \alpha^2 & \beta^2 & 2\alpha\beta \\ \gamma^2 & \delta^2 & 2\gamma\delta \\ \alpha\gamma & \beta \delta & \alpha \delta + \beta \gamma \end{matrix} \right)$

exactly as for $\rho_{(\mathrm{Sym}^2 E)^\circ}$. For instance, the first column is now given by the image of $e_1 \otimes e_1$, namely

\begin{aligned} (\alpha e_1 + \gamma e_2) \otimes (\alpha e_1 + \gamma e_2) = & \alpha^2 (e_1 \otimes e_1) + \gamma^2 (e_2 \otimes e_2) \\ &\qquad + \alpha\gamma (e_1 \otimes e_2 + e_2 \otimes e_1). \end{aligned}

If $\mathrm{char} K \not= 2$ then $\mathrm{Sym}^2 E \cong \mathrm{Sym}_2 E$, via the isomorphism defined by

\begin{aligned} e_1^2 &\mapsto e_1 \otimes e_1 \\ e_2^2 &\mapsto e_2 \otimes e_2, \\ e_1e_2 &\mapsto \frac{1}{2}(e_1 \otimes e_2 + e_2 \otimes e_1). \end{aligned}

(Correspondingly, changing the basis of $\mathrm{Sym}^2 E$ to $e_1^2,e_2^2, 2e_1e_2$ doubles the third column and halves the third row of a matrix $\rho_{\mathrm{Sym}^2 E}(X)$, giving the matrix $\rho_{\mathrm{Sym}_2 E}(X)$.) But if $\mathrm{char} K = 2$ then the representations are non-isomorphic. Indeed, from the matrices above, it’s clear that $\mathrm{Sym}^2 E$ has $\langle e_1^2, e_2^2 \rangle$ as a subrepresentation, and not hard to see that in fact this is its unique non-trivial proper subrepresentation. The quotient is the one-dimensional determinant representation. (Some calculation is required: the warning example for $\mathrm{GL}_2(\mathbb{F}_2)$ in Example 11 below shows that uniqueness fails if $K$ is replaced with $\mathbb{F}_2$.) Similarly, $\mathrm{Sym}_2 E$ has $\langle e_1 \otimes e_2 + e_2 \otimes e_1 \rangle$ as its unique non-trivial proper subrepresentation. Thus, as expected from duality, the head of $\mathrm{Sym}^2 E$ is isomorphic to the socle of $\mathrm{Sym}_2 E$, both being the determinant representation.

More generally, $\mathrm{Sym}^r E$ has $L^{(r)}(E)$ as a subrepresentation, where $L^{(r)}(E) = \langle v^r : v \in E \rangle$; since $\mathrm{GL}_d(K)$ acts transitively on the set $\bigl\{v^r : v \in E\backslash \{0\} \bigr\}$, $L^{(r)}(E)$ is irreducible. Whenever $\mathrm{char} K$ is positive and $r \ge \mathrm{char} K$, it is a proper subrepresentation of $\mathrm{Sym}^r E$. For example, if $\mathrm{char} K = 2$ and $\dim E = 2$ then $L^{(5)}(E)$ is spanned by $v_1^5$, $v_1^4 v_2$, $v_1 v_2^4$ and $v_2^5$, since the binomial coefficients $\binom{5}{2}$ and $\binom{5}{3}$ are even. (This indicates the general proof.)

Returning to contragredient duality, it is a good exercise (performed in Proposition 5 below) to show that, generalizing our running example, $(\mathrm{Sym}^r E)^\circ \cong \mathrm{Sym}_r E$. For this it is important to be clear on the definition of $\mathrm{Sym}_r E$. We define it using the right action of the symmetric group $S_r$ by place permutation on $E^{\otimes r}$. This is specified on basis elements by

$(e_{i_1} \otimes \cdots \otimes e_{i_r}) \cdot \sigma = e_{i_{1\sigma^{-1}}} \otimes \cdots \otimes e_{i_{r\sigma^{-1}}}.$

(The inverse is correct for a right action of the symmetric group where permutations are composed by multiplying from left to right: it ensures that the basis element $e_{i_j}$ in position $j$ on the left-hand side appears in position $j\sigma$ on the right-hand side.)

Definition 2. Let $I(d,r) = \{1,\ldots, d\}^r$. We call the elements of $I(d,r)$ multi-indices. Given $\mathbf{i} \in I(d,r)$ we define

$e_\mathbf{i} = e_{i_1} \otimes \cdots \otimes e_{i_r} \in E^{\otimes r}$

and

$v_\mathbf{i} = \sum_{\sigma} e_\mathbf{i} \cdot \sigma$

where the sum is over a set of coset representatives for the stabiliser of $e_\mathbf{i}$ in the place permutation action. Then

$\mathrm{Sym}_r E = \langle v_\mathbf{i} : \mathbf{i} \in I(d,r), i_1 \le \ldots \le i_r \rangle$. For instance if $r=3$ and $d = 2$ then

$v_{(1,2,2)} = e_1 \otimes e_2 \otimes e_2 + e_2 \otimes e_1 \otimes e_2 + e_2 \otimes e_2 \otimes e_1.$

Suitable coset representatives for the stabiliser $\langle (2,3)\rangle$ of $e_1 \otimes e_2 \otimes e_2$ in $S_3$ are $\mathrm{id}$, $(1,2)$ and $(1,3)$.
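In a sketch, the summands of $v_\mathbf{i}$ are just the distinct rearrangements of the multi-index, one for each coset representative:

```python
from itertools import permutations

def v(i):
    """Distinct rearrangements of the multi-index i, one per coset rep."""
    return sorted(set(permutations(i)))

assert v((1, 2, 2)) == [(1, 2, 2), (2, 1, 2), (2, 2, 1)]
assert len(v((1, 1, 2, 2))) == 6   # index of the stabiliser: 4!/(2!2!)
```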

Example 3. As an example of the multi-index notation, note that if $X \in \mathrm{GL}_d(K)$ and $\mathbf{j} \in I(d,r)$ then

\begin{aligned} \rho_{E^{\otimes r}}(X) e_\mathbf{j} &= \sum_{i_1, \ldots, i_r \in \{1,\ldots, d\}} X_{i_1j_1} \ldots X_{i_rj_r} e_{i_1} \otimes \cdots \otimes e_{i_r} \\ &=\sum_{\mathbf{i} \in I(d,r)} X_{i_1j_1} \ldots X_{i_rj_r} e_\mathbf{i}. \end{aligned}

Summing over coset representatives in Definition 2 ensures that each summand in $v_\mathbf{i}$ is a different rearrangement of $e_\mathbf{i}$ and so, no matter what the characteristic, $\mathrm{Sym}_r E$ is the subspace of $E^{\otimes r}$ of fixed points for the place permutation action of $S_r$. By this observation and the following lemma, whose proof is left as an exercise, $\mathrm{Sym}_r E$ is a subrepresentation of $E^{\otimes r}$.

Lemma 4. The right action of $S_r$ on $E^{\otimes r}$ commutes with the left action of $\mathrm{GL}_d(K)$. $\Box$

Proposition 5. $(\mathrm{Sym}^r E)^\circ \cong \mathrm{Sym}_r E$.

Proof. Fix $X \in \mathrm{GL}_d(K)$ and $\mathbf{j} \in I(d,r)$. By the remark about rearrangements above, $v_\mathbf{j} = \sum_{\mathbf{j}'} e_\mathbf{j'}$, where, as a standing convention, the sum is over all distinct rearrangements $\mathbf{j}'$ of $\mathbf{j}$. By Example 3 we have

$\rho_{E^{\otimes r}}(X) e_\mathbf{j'} = \sum_{\mathbf{i} \in I(d,r)} X_{i_1j'_1} \ldots X_{i_rj'_r} e_\mathbf{i}.$

Therefore

$\rho_{\mathrm{Sym}_r E}(X) v_\mathbf{j} = \sum_{\mathbf{i}\in I(d,r)} \sum_{\mathbf{j}'} X_{i_1j'_1} \ldots X_{i_rj'_r} e_\mathbf{i}.$

The right-hand side is known to be in $\mathrm{Sym}_r E$. Looking at the coefficient of $e_\mathbf{i}$ where $i_1 \le \ldots \le i_r$ we see that the coefficient of $v_\mathbf{i}$ in the right hand side is

$\sum_{\mathbf{j}'} X_{i_1j'_1} \ldots X_{i_rj'_r}.$

We now turn to $(\mathrm{Sym}^r E)^\circ$. Given $\mathbf{i} \in I(d,r)$ define

$w_{\mathbf{i}} = e_{i_1} \ldots e_{i_r} \in \mathrm{Sym}^r E.$

Observe that the $w_\mathbf{i}$ for $i_1 \le \ldots \le i_r$ are a basis of $\mathrm{Sym}^r E$. Let $w_\mathbf{i}^\star$ denote the corresponding element of the dual basis of $(\mathrm{Sym}^r E)^\circ$. We have

$\rho_{(\mathrm{Sym}^r E)^\circ}(X) w_\mathbf{j}^\star = [w_\mathbf{i} \mapsto w_\mathbf{j}^\star \bigl( \rho_{\mathrm{Sym}^r E}(X^t) w_\mathbf{i}\bigr)].$

Since $\rho_{\mathrm{Sym}^r E}(X^t) w_\mathbf{i} = \sum_\mathbf{k} X^t_{k_1i_1} \ldots X^t_{k_ri_r} w_\mathbf{k}$, the coefficient of $w_\mathbf{i}^\star$ in the image of $w_\mathbf{j}^\star$ is

$\sum_{\mathbf{j}'} X^t_{j'_1i_1} \ldots X^t_{j'_ri_r}.$

Noting the transpose in $X^t$, this agrees with the coefficient of $v_\mathbf{i}$ in the image of $v_\mathbf{j}$. Therefore the map $v_\mathbf{k} \mapsto w_\mathbf{k}^\star$ defines an isomorphism of representations $\mathrm{Sym}_r E \cong (\mathrm{Sym}^r E)^\circ$. $\Box$

The proposition is an interesting example of a ‘self-generalizing’ result. To be entirely honest, some human intervention is required, but it is essentially a matter of book-keeping.

Corollary 6.
Let $V$ be a representation of $\mathrm{GL}_d(K)$. Then

$(\mathrm{Sym}^r V)^\circ \cong \mathrm{Sym}_r V^\circ.$

Proof. Let $D = \mathrm{dim} V$. By Proposition 5, taking $E = V$, there is a $D \times D$ matrix $P$ such that

$P^{-1} \bigl(\rho_{\mathrm{Sym}^r V}(Y^t)\bigr)^t P = \rho_{\mathrm{Sym}_r(V)}(Y)$

for all matrices $Y \in \mathrm{GL}_D(K)$. Hence, setting $Y = \rho_V(X^t)^t$, we have

$P^{-1} \bigl(\rho_{\mathrm{Sym}^r V}(\rho_V(X^t))\bigr)^t P = \rho_{\mathrm{Sym}_r(V)}\bigl(\rho_V(X^t)^t\bigr)$

for all matrices $X \in \mathrm{GL}_d(K)$. In the representation $(\mathrm{Sym}^r V)^\circ$ the matrix representing $X \in \mathrm{GL}_d(K)$ is, by definition, $\bigl(\rho_{\mathrm{Sym}^r V}(\rho_V(X^t))\bigr)^t$, while in the representation $\mathrm{Sym}_r V^\circ$, the matrix representing $X \in \mathrm{GL}_d(K)$ is, again by definition, $\rho_{\mathrm{Sym}_r V}\bigl(\rho_V(X^t)^t\bigr)$. By the previous displayed equation, these representing matrices are conjugate by the fixed matrix $P$. $\Box$.

### Exterior powers

Perhaps surprisingly, copying the construction of $\mathrm{Sym}_r E$ for exterior powers does not give a new representation. Let $d \ge r$. We first find the matrices for $\bigwedge^r E$.

Lemma 7. Let $X \in \mathrm{GL}_d(K)$. Let $Y = \rho_{\wedge^r E}(X)$ be the matrix representing $X$ in its action on $\bigwedge^r E$, with respect to the basis of all $e_{i_1} \wedge \cdots \wedge e_{i_r}$ where $i_1 < \ldots < i_r$. Then

$Y_{\mathbf{i} \mathbf{j}} = \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) X_{i_{1\sigma} j_1} \ldots X_{i_{r\sigma} j_r}.$

Proof. We have

\begin{aligned} \rho_{\wedge^r E}(X)& (e_{j_1} \wedge \cdots \wedge e_{j_r}) \\ & =\sum_{\mathbf{i} \in I(d,r)} X_{i_1j_1} \cdots X_{i_rj_r} (e_{i_1} \wedge \cdots \wedge e_{i_r}) \\ & =\sum_{i_1 < \ldots < i_r} \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) X_{i_{1\sigma}j_1} \ldots X_{i_{r\sigma}j_r} (e_{i_1} \wedge \cdots \wedge e_{i_r}). \end{aligned}

as required. $\Box$

Given $\mathbf{i} \in I(d,r)$, let

$u_\mathbf{i} = \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) e_\mathbf{i} \cdot \sigma.$

Observe that $u_\mathbf{i} = 0$ if the multi-index $\mathbf{i}$ has two equal entries. (This holds even if $\mathrm{char} K = 2$.) Let $\bigwedge_r E$ be the subspace of $E^{\otimes r}$ with basis all $u_\mathbf{i}$ for $\mathbf{i} \in I(d,r)$ such that $i_1 < \ldots < i_r$. It follows from Lemma 4 that $\bigwedge_r E$ is a subrepresentation of $E^{\otimes r}$. Let $X \in \mathrm{GL}_d(K)$. By Example 3,

$\rho_{\wedge_r E}(X) u_\mathbf{j} = \sum_{\mathbf{i} \in I(d,r)} X_{i_1j_1} \ldots X_{i_rj_r} \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) e_\mathbf{i} \cdot \sigma.$

The right-hand side is known to be in $\bigwedge_r E$. The coefficient of $e_\mathbf{i}$, where $i_1 < \ldots < i_r$, in the right-hand side is

$\sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) X_{i_{1\sigma} j_1} \ldots X_{i_{r\sigma}j_r}.$

Comparing with Lemma 7, we see that the matrices are the same. Therefore $\bigwedge^r E \cong \bigwedge_r E$.

Similarly one can show that $\bigwedge^r E \cong (\bigwedge^r E)^\circ$. For a more conceptual proof of this, one could argue that since $\bigwedge^r E$ is generated by the highest weight vector $e_1 \wedge \cdots \wedge e_r$, of weight $(1,\ldots,1,0,\ldots,0)$, it is irreducible; over any field the irreducible polynomial representations of $\mathrm{GL}_d(K)$ are self-dual.

### Special linear groups

We saw above that the contragredient duality used for representations of the infinite algebraic group $\mathrm{GL}_d(K)$ does not, in general, agree with duality as defined for representations of finite groups. There is an interesting exception to this which occurs when $d = 2$ and we restrict to the action of $\mathrm{SL}_2(K)$. Write $\cong_{\mathrm{SL}_2(K)}$ for an isomorphism that holds after this restriction. Here it is in our running example.

Example 8. Let $X = \left( \begin{matrix} \alpha & \beta \\ \gamma &\delta \end{matrix}\right)$. We saw in Example 1 that

$\rho_{(\mathrm{Sym}^2 E)^\star} (X) = \frac{1}{\Delta^2} \left( \begin{matrix} \delta^2 & \gamma^2 & -2\gamma\delta \\ \beta^2 & \alpha^2 & -2\alpha\beta \\ -\beta\delta & -\alpha\gamma & \alpha \delta + \beta \gamma \end{matrix} \right).$

If $X \in \mathrm{SL}_2(K)$ then $\Delta = 1$. Changing the basis of $(\mathrm{Sym^2}E)^\star$ by swapping the first and second basis vectors and negating the third we get an isomorphic representation $\rho'$ in which

$\rho'(X) = \left( \begin{matrix} \alpha^2 & \beta^2 & 2\alpha\beta \\ \gamma^2 & \delta^2 & 2\gamma\delta\\ \alpha\gamma & \beta\delta & \alpha \delta + \beta \gamma \end{matrix} \right).$

This agrees with the matrix representing $X$ in its action on $(\mathrm{Sym}^2 E)^\circ \cong (\mathrm{Sym}_2 E)$. Therefore $(\mathrm{Sym}^2 E)^\star \cong_{\mathrm{SL}_2(K)} (\mathrm{Sym}^2 E)^\circ$.

More generally we have the following proposition.

Proposition 9. Let $V$ be a representation of $\mathrm{GL}_2(K)$. Then $V^\star \cong_{\mathrm{SL}_2(K)} V^\circ$.

Proof. Let $J = \left( \begin{matrix} 0 & 1 \\ -1 & 0 \end{matrix} \right)$ and let $X = \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) \in \mathrm{SL}_2(K)$, so that $X^{-1} = \left( \begin{matrix} \delta & -\beta \\ -\gamma & \alpha \end{matrix} \right)$. By the following calculation

\begin{aligned} J \left(\begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right)^{-1} J^{-1} &= \left( \begin{matrix} 0 & 1 \\ -1 & 0 \end{matrix}\right) \left( \begin{matrix} \delta & -\beta \\ -\gamma & \alpha \end{matrix} \right) \left( \begin{matrix} 0 & -1 \\ 1 & 0 \end{matrix} \right) \\ &= \left( \begin{matrix} -\gamma & \alpha \\ -\delta & \beta \end{matrix} \right) \left( \begin{matrix} 0 & -1 \\ 1 & 0 \end{matrix} \right) \\ &= \left( \begin{matrix} \alpha & \gamma \\ \beta & \delta \end{matrix}\right), \end{aligned}

for any matrix $X \in \mathrm{SL}_2(K)$, the matrices $X^{-1}$ and $X^t$ are conjugate by the fixed matrix $J$. Hence so are $\rho_V(X^{-1})$ and $\rho_{V}(X^t)$, and since transposition preserves conjugacy, so are $\rho_V(X^{-1})^t = \rho_{V^\star}(X)$ and $\rho_V(X^t)^t = \rho_{V^\circ}(X)$. The proposition follows. $\Box$
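The conjugation identity is easy to check numerically. Here is a minimal sketch in Python (the helper `mat_mul` is my own, not from the post), using the adjugate formula for the inverse of a determinant-$1$ matrix so that everything stays in integers:

```python
# Check the identity J X^{-1} J^{-1} = X^t used in the proof of
# Proposition 9, for integer matrices of determinant 1.

def mat_mul(A, B):
    """Multiply two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

J = ((0, 1), (-1, 0))
J_inv = ((0, -1), (1, 0))

# Some determinant-1 integer matrices (alpha*delta - beta*gamma = 1).
for X in [((1, 1), (0, 1)), ((2, 1), (1, 1)), ((3, 2), (4, 3))]:
    a, b = X[0]; c, d = X[1]
    assert a * d - b * c == 1
    X_inv = ((d, -b), (-c, a))      # adjugate = inverse when det = 1
    X_t = ((a, c), (b, d))
    assert mat_mul(mat_mul(J, X_inv), J_inv) == X_t
print("J X^{-1} J^{-1} = X^t verified")
```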

### Plethysms

If $K$ has characteristic zero then it is known that

$\mathrm{Sym}^2 \mathrm{Sym}^n E \cong_{\mathrm{SL}_2(K)} \bigwedge^2 \mathrm{Sym}^{n+1} E.$

Example 10. We shall prove that when $n = 1$, this isomorphism holds for arbitrary $K$, provided that one side is replaced with its contragredient dual. By Lemma 7, given a matrix $X \in \mathrm{GL}_3(K)$ acting on a basis $z_1, z_2, z_3$, the matrix $\bigwedge^2 X$ representing $X$ in its action on $\langle z_1 \wedge z_2, z_1 \wedge z_3, z_2 \wedge z_3 \rangle$ with respect to the indicated basis is $Y$ where

$Y_{(i_1,i_2)(j_1,j_2)} = X_{i_1j_1}X_{i_2j_2}-X_{i_2j_1}X_{i_1j_2}.$

Applying this result when $X$ acts on $\mathrm{Sym}^2 E$ with respect to the basis $e_1^2, e_2^2, e_1e_2$, and setting $\Delta = (\alpha \delta - \beta \gamma)$ and $\Gamma = \alpha\delta + \beta\gamma$, we find that

\begin{aligned} \rho_{\wedge^2 \mathrm{Sym}^2 E}&\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix}\right) \\ &= \left( \begin{matrix} \alpha^2 \delta^2 - \beta^2\gamma^2 & \alpha^2 \gamma \delta - \gamma^2 \alpha\beta & \beta^2\gamma\delta - \delta^2 \alpha \beta \\ 2\alpha^2 \beta\delta - 2\alpha\gamma \beta^2 & \alpha^2 \Gamma - 2\alpha^2\gamma \beta & \beta^2 \Gamma - 2\beta^2\delta \alpha \\ 2\gamma^2 \beta\delta - 2\alpha\gamma \delta^2 & \gamma^2 \Gamma - 2\alpha\gamma^2\delta & \delta^2 \Gamma - 2\beta\delta^2 \gamma \end{matrix} \right) \\ &= \left( \begin{matrix} \Delta (\alpha \delta + \beta \gamma) & \Delta \alpha\gamma & -\Delta \beta\delta \\ 2 \Delta \alpha \beta & \Delta \alpha^2 & -\Delta \beta^2 \\ -2\Delta \gamma\delta & -\Delta\gamma^2 & \Delta \delta^2 \end{matrix} \right). \end{aligned}

Now use that $\Delta = 1$ for matrices in $\mathrm{SL}_2(K)$ and change basis by negating the third basis element to get the conjugate matrix on the left below

$\left(\begin{matrix} \alpha \delta + \beta \gamma & \alpha\gamma & \beta\delta \\ 2 \alpha \beta & \alpha^2 & \beta^2 \\ 2 \gamma\delta & \gamma^2 & \delta^2 \end{matrix} \right) \sim \left(\begin{matrix} \alpha^2 & \beta^2 & 2 \alpha \beta \\ \gamma^2 & \delta^2 & 2 \gamma\delta \\ \alpha\gamma & \beta\delta & \alpha \delta + \beta \gamma \end{matrix} \right).$

The matrix on the right is its conjugate by a permutation of the basis. We saw above in the example of contragredient duality that

$\rho_{(\mathrm{Sym}^2 E)^\circ}\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) = \left( \begin{matrix} \alpha^2 & \beta^2 & 2\alpha\beta \\ \gamma^2 & \delta^2 & 2\gamma\delta \\ \alpha\gamma & \beta\delta & \alpha \delta + \beta \gamma \end{matrix} \right).$

The matrices agree and so $(\mathrm{Sym}^2 E)^\circ \cong_{\mathrm{SL}_2(K)} \bigwedge^2 \mathrm{Sym}^2 E$. This isomorphism can be recast in various ways, for example, $\mathrm{Sym}_2 E \cong_{\mathrm{SL}_2(K)} \bigwedge^2 \mathrm{Sym}^2 E$ or $\mathrm{Sym}^2 E \cong_{\mathrm{SL}_2(K)} \bigwedge^2 \mathrm{Sym}_2 E$.
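As a sanity check, the following sketch verifies the chain of basis changes numerically over $\mathbb{F}_p$ for a sample matrix in $\mathrm{SL}_2(\mathbb{F}_p)$. The conventions for $\rho_{\mathrm{Sym}^2 E}$ and $\rho_{(\mathrm{Sym}^2 E)^\circ}$ are read off from the matrices displayed above; the function names are my own.

```python
# Verify that wedge^2 Sym^2 E agrees with (Sym^2 E)^circ after negating
# the third basis vector and permuting the basis cyclically, as in the text.

p = 101  # any odd prime

def sym2(X):
    """Matrix of X on Sym^2 E in the basis e1^2, e2^2, e1e2 (mod p)."""
    (a, b), (c, d) = X
    return [[a*a % p, b*b % p, a*b % p],
            [c*c % p, d*d % p, c*d % p],
            [2*a*c % p, 2*b*d % p, (a*d + b*c) % p]]

def wedge2(S):
    """Second exterior power, basis z1^z2, z1^z3, z2^z3 (Lemma 7 formula)."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    return [[(S[i1][j1]*S[i2][j2] - S[i2][j1]*S[i1][j2]) % p
             for (j1, j2) in pairs] for (i1, i2) in pairs]

def circ(X):
    """(Sym^2 E)^circ: the matrix displayed in the text."""
    (a, b), (c, d) = X
    return [[a*a % p, b*b % p, 2*a*b % p],
            [c*c % p, d*d % p, 2*c*d % p],
            [a*c % p, b*d % p, (a*d + b*c) % p]]

X = ((2, 3), (5, 8))                 # det = 16 - 15 = 1, so X is in SL_2(F_p)
W = wedge2(sym2(X))
D = [1, 1, -1]                       # negate the third basis vector
L = [[D[i]*W[i][j]*D[j] % p for j in range(3)] for i in range(3)]
perm = [1, 2, 0]                     # cyclic permutation of the basis
R = [[L[perm[i]][perm[j]] for j in range(3)] for i in range(3)]
assert R == circ(X)
print("wedge^2 Sym^2 E matches (Sym^2 E)^circ after the basis change")
```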

### Finite groups

Let $\mathrm{char} K = p$. By Proposition 9 any isomorphism of $\mathrm{SL}_2(K)$-representations restricts to an isomorphism of $\mathrm{SL}_2(\mathbb{F}_{p^m})$-representations, provided we systematically replace contragredient duality $\circ$ with conventional duality $\star$. For instance, Example 10 implies that

$(\mathrm{Sym}^2 E)^\star \cong_{\mathrm{SL}_2(\mathbb{F}_{p^m})} \bigwedge^2 \mathrm{Sym}^2 E.$

Here is a warning example showing that some features of the infinite case are not preserved by restriction.

Example 11. Let $E = \mathbb{F}_2^2$. We have

$\mathrm{GL}_2(\mathbb{F}_2) = \mathrm{SL}_2(\mathbb{F}_2) = \Bigl\langle \left(\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right), \left( \begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) \Bigr\rangle$

and, in the usual basis $e_1^2, e_2^2, e_1e_2$ of $\mathrm{Sym}^2 E$,

\begin{aligned}\rho_{\mathrm{Sym^2} E} \left( \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix} \right) \\ \rho_{\mathrm{Sym^2} E} \left( \begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix} \right) .\end{aligned}

Observe that the sums of the rows of both $3 \times 3$-matrices are constantly $1$. Therefore in the alternative basis $e_1^2, e_2^2, e_1^2+e_2^2 + e_1e_2$ we have

\begin{aligned}\rho'_{\mathrm{Sym^2} E} \left( \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix} \right) \\ \rho'_{\mathrm{Sym^2} E} \left( \begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix} \right) .\end{aligned}

Hence $\mathrm{Sym}^2 E \cong \mathbb{F}_2 \oplus E$ is reducible and semisimple, as claimed in the answer to the final quiz question.
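The base change can be checked by a short computation (a throwaway script; the helper `mul` is my own):

```python
# Conjugate the matrices of the two generators of GL_2(F_2) on Sym^2 E by
# the base-change matrix P (new basis e1^2, e2^2, e1^2+e2^2+e1e2) and check
# that a trivial summand splits off.

def mul(A, B):
    """3x3 matrix product over F_2."""
    return [[sum(A[i][k]*B[k][j] for k in range(3)) % 2 for j in range(3)]
            for i in range(3)]

P = [[1, 0, 1], [0, 1, 1], [0, 0, 1]]  # columns: e1^2, e2^2, e1^2+e2^2+e1e2
assert mul(P, P) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # P^{-1} = P over F_2

gens = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 1]],  # rho(0 1; 1 0) on e1^2, e2^2, e1e2
    [[1, 1, 1], [1, 0, 0], [0, 0, 1]],  # rho(1 1; 1 0)
]
for M in gens:
    N = mul(P, mul(M, P))               # P^{-1} M P, using P^{-1} = P
    # third row and column are those of the identity: F_2 splits off
    assert N[2] == [0, 0, 1] and N[0][2] == 0 and N[1][2] == 0
print("Sym^2 E = F_2 (+) E over F_2, as claimed")
```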

To end here is a problem combining ordinary and contragredient duality, to which I do not immediately have a complete solution.

Problem 12. Let $G$ be a finite subgroup of $\mathrm{GL}_d(K)$ and let $\rho : G \rightarrow \mathrm{GL}(V)$ be a representation of $G$. Let $V^\circ$ be the representation defined by $\rho^\circ(X) = \rho(X^t)^t$. What is the relationship between $V$, $V^\circ$ and $V^\star$?

For instance, if $K$ is a subfield of the real numbers then there is an orthogonal form on $K^d$ invariant under $G$. (This is part of the standard proof of Maschke’s Theorem.) Hence in this case we may assume that $G$ is a subgroup of the orthogonal group, and so $X^t = X^{-1}$ for all $X \in G$. It follows that

$\rho^\circ(X) = \rho(X^t)^t = \rho(X^{-1})^t = \rho^\star(X)$

for all $X \in G$, and so $V \cong V^\circ \cong V^\star$. The example $\mathrm{Sym}^2 E$, where $E$ is the natural representation of $\mathrm{SL}_2(\mathbb{F}_{2^m})$ and $m \ge 2$, has $V \not\cong V^\circ \cong V^\star$. It is also possible to have $V \cong V^\circ \not\cong V^\star$. For example, this is clearly the case if $G$ is a finite subgroup of $\mathrm{GL}_1(\mathbb{C})$ not contained in $\{ \pm 1\}$ and $V$ is the natural $1$-dimensional representation of $G$.

## Meet-in-the-middle attacks on block ciphers

November 25, 2018

A mathematical model for a block cipher is a collection $\mathcal{E}$ of invertible functions $e_k : \mathbb{F}_2^n \rightarrow \mathbb{F}_2^n$ indexed by keys $k \in \mathbb{F}_2^\ell$. We say that the block cipher $\mathcal{E}$ has block size $n$ and key length $\ell$. For example, DES has $n=64$ and $\ell = 56$, while its successor AES has $n=128$ and $\ell = 128$ (or $192$ or $256$ for the super-paranoid).

Let $\mathcal{E}$ and $\mathcal{E}'$ be block ciphers of block size $n$ and key lengths $\ell$ and $\ell'$, respectively. An easy way to boost the key length to $\ell + \ell'$, while keeping essentially the same cryptologics, is to compose the encryption functions from the two ciphers. Thus for $(k,k') \in \mathbb{F}_2^\ell \times \mathbb{F}_2^{\ell'}$ we define

$e_{(k,k')}(x) = e'_{k'}\bigl(e_k(x)\bigr).$

Perhaps surprisingly, when $\ell, \ell' \le n$, this cryptoscheme offers only $\mathrm{max}(\ell,\ell')+1$ bits of security, rather than the expected $\ell + \ell'$. The purpose of this post is to give a fairly precise costing of the meet-in-the-middle attack behind this.

As a notational convention, $\star$ denotes quantities unknown to the cryptanalyst and $\dagger$ quantities chosen or observed by the cryptanalyst. We measure the cost of an attack by the number of encryption/decryption operations, or cryptops, justifying this assumption in the course of the argument.

Let $(k_\star,k'_\star)$ be the (unknown) key. Let $x_\dagger$ be a chosen plaintext and let $z_\dagger = e_{(k_\star,k'_\star)}(x_\dagger)$ be its observed encryption. (We write $e_{(k_\star,k'_\star)}(x_\dagger)$ rather than $e'_{k'_\star}\bigl(e_{k_\star}(x_\dagger)\bigr)$ to emphasise that in the chosen plaintext model, we can perform two-stage encryption using the unknown key pair $(k_\star,k'_\star)$, but not each stage separately.) We denote by $d'_{k'}$ the inverse of the encryption function $e'_{k'}$. Let

\begin{aligned} E &= \{ \bigl(k, e_k(x_\dagger)\bigr) : k \in \mathbb{F}_2^\ell \} \\ D & = \{ \bigl(k', d'_{k'}(z_\dagger) \bigr) : k' \in \mathbb{F}_2^{\ell'} \}. \end{aligned}

Thus $|E| = 2^\ell$, $|D| = 2^{\ell'}$ and $2^\ell + 2^{\ell'}$ cryptops are required to get this far.

Using any reasonable algorithm, sorting $E$ by the second entry in each pair takes at most $2^\ell \ell s$ primitive operations for some smallish $s$. Similarly $D$ is sorted using $2^{\ell'} \ell' s$ primitive operations. Working through the sorted lists we pair up the $(k,y) \in E$ with the $(k',y) \in D$, for each $y \in \mathbb{F}_2^n$. In one possible implementation this makes a list of pointers to the relevant members of $D$ and $E$ for each $y \in \mathbb{F}_2^n$, so the cost in primitive operations is

$(|E|+|D|)b = (2^\ell + 2^{\ell'}) b$

for some small $b$. Since each cryptop for $\mathcal{E}$ or $\mathcal{E}'$ must produce an $n$-bit block, and $\ell, \ell' \le n$, it seems reasonable to assume that the cost of $2^\ell + 2^{\ell'}$ cryptops is significantly more than the cost of this sorting and pairing up.

We now have the sets

\begin{aligned} \mathcal{K}_y &= \{ k \in \mathbb{F}_2^\ell : e_{k}(x_\dagger) = y \} \\ \mathcal{K}'_y &= \{ k' \in \mathbb{F}_2^{\ell'} : d'_{k'}(z_\dagger) = y \} \end{aligned}

for each $y \in \mathbb{F}_2^n$. Observe that $\mathcal{K}_y \times \mathcal{K}'_y$ consists of exactly those key pairs (for the composed cryptosystem) that meet in the middle at $y$. Fix $p$ distinct plaintexts

$X_\dagger^{(1)}, \ldots, X_\dagger^{(p)} \in \mathbb{F}_2^n \backslash \{x_\dagger\}.$

For each $y \in \mathbb{F}_2^n$ we encrypt $X_\dagger^{(1)}, \ldots, X_\dagger^{(p)}$ using the guessed key pair $(k,k') \in \mathcal{K}_y \times \mathcal{K}'_{y}$, and check if

$e'_{k'}\bigl(e_{k}(X_\dagger^{(i)})\bigr) = (e'_{k'_\star} \circ e_{k_\star})(X^{(i)}_\dagger)$

for each $i$. If all $p$ encryptions agree, we assume that $(k_\star,k'_\star) = (k,k')$. (But for simplicity, let’s say we carry on checking all the other keys.) The number of pairs to be checked in the final step is

$M = \sum_{y \in \mathbb{F}_2^n} |\mathcal{K}_y| |\mathcal{K}'_y|$

and each pair requires two encryptions for each of the $p$ chosen plaintexts, making the cost of the final step $2Mp$ cryptops. The total cost in cryptops is therefore $2^{\ell} + 2^{\ell'} + 2Mp$.
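The whole attack fits in a toy implementation. The component ciphers below are inventions for the demonstration (a fixed public random permutation $S$ with the key XORed in before or after), not anything from the post, and the sort-and-pair phase is replaced by a hash table keyed on the middle value:

```python
# A toy meet-in-the-middle attack: n-bit blocks, n-bit keys,
# e_k(x) = S[x ^ k] and e'_{k'}(y) = S[y] ^ k'.

import random

random.seed(1)
n = 12
N = 1 << n
S = list(range(N))
random.shuffle(S)                    # the fixed public permutation
S_inv = [0] * N
for x, y in enumerate(S):
    S_inv[y] = x

def e(k, x):       return S[x ^ k]
def e_dash(k2, y): return S[y] ^ k2
def d_dash(k2, z): return S_inv[z ^ k2]

k_star, k2_star = random.randrange(N), random.randrange(N)
encrypt = lambda x: e_dash(k2_star, e(k_star, x))   # the attacker's oracle

x_chosen = 0
z = encrypt(x_chosen)

# Phase 1: 2^l + 2^l' cryptops, pairing keys by the middle value.
middles = {}
for k in range(N):
    middles.setdefault(e(k, x_chosen), []).append(k)
candidates = []
for k2 in range(N):
    for k in middles.get(d_dash(k2, z), []):
        candidates.append((k, k2))

# Phase 2: filter the M candidate pairs with p extra chosen plaintexts.
p = 2
extra = [1, 2]
survivors = [(k, k2) for (k, k2) in candidates
             if all(e_dash(k2, e(k, x)) == encrypt(x) for x in extra)]
print(f"{len(candidates)} pairs met in the middle; {len(survivors)} survived")
assert (k_star, k2_star) in survivors
```

Because the toy ciphers are so structured, every middle value is hit by exactly one key on each side, but the shape of the attack (and the need for the checking phase) is exactly as in the costing above.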

We now estimate $2^{\ell} + 2^{\ell'} + 2Mp$ for the random cipher, and use this as an approximation to DES and to AES.

### Random cipher

Suppose that each $e_k : \mathbb{F}_2^n \rightarrow \mathbb{F}_2^n$ and $e'_{k'} : \mathbb{F}_2^n \rightarrow \mathbb{F}_2^n$ is chosen uniformly and independently at random from the $(2^n)!$ permutations of $\mathbb{F}_2^n$. We know that $e_{k_\star}(x_\dagger) = y_\dagger$ and for each other key $k$, the probability is $\frac{1}{2^n}$ that $e_k(x_\dagger) = y_\dagger$. Dually, we know that $d'_{k'_\star}(z_\dagger) = y_\dagger$, and for each other key $k'$, the probability is $\frac{1}{2^n}$ that $d'_{k'}(z_\dagger) = y_\dagger$. Therefore $|\mathcal{K}_{y_\dagger}|$ and $|\mathcal{K}'_{y_\dagger}|$ are independently distributed as the shifted binomial distributions

$1 + \mathrm{Bin}(2^\ell-1, \frac{1}{2^n}) \quad\text{and}\quad 1 + \mathrm{Bin}(2^{\ell'}-1, \frac{1}{2^n}),$

respectively.

Similarly, if $y \not= y_\dagger$ then $|\mathcal{K}_y|$ and $|\mathcal{K}'_y|$ are independently distributed as $\mathrm{Bin}(2^\ell-1, \frac{1}{2^n})$ and $\mathrm{Bin}(2^{\ell'}-1, \frac{1}{2^n})$. (It is worth noting that $\mathcal{K}_{y}$ and $\mathcal{K}_{y'}$ are not independent: for example we have $\cup_{y \in \mathbb{F}_2^n} \mathcal{K}_y = \mathbb{F}_2^\ell$ where the union is disjoint, therefore $\sum_{y \in \mathbb{F}_2^n} |\mathcal{K}_y| = 2^\ell$.) It follows by linearity of expectation that

\begin{aligned} \mathbb{E}[M] &= \mathbb{E}\bigl[\sum_{y \in \mathbb{F}_2^n} |\mathcal{K}_y| |\mathcal{K}'_y|\bigr] \\ &= \bigl( 1 + \frac{2^\ell-1}{2^n} \bigr)\bigl( 1 + \frac{2^{\ell'}-1}{2^n} \bigr) + (2^n-1) \bigl( \frac{2^\ell-1}{2^n} \bigr)\bigl( \frac{2^{\ell'}-1}{2^n} \bigr) \\ &= \frac{2^{\ell+\ell'}}{2^n} + 1 - \frac{1}{2^{n}}. \end{aligned}

This is very well approximated by $\frac{2^{\ell + \ell'}}{2^n} + 1$. Hence the total cost is estimated as

$2^{\ell} + 2^{\ell'} + 2p \bigl( \frac{2^{\ell + \ell'}}{2^n} + 1\bigr).$

It remains to choose a suitable value for $p$. By our randomness assumption, $e'_{k'_\star} \circ e_{k_\star}$ and $e'_{k'} \circ e_k$ are each permutations of $\mathbb{F}_2^n$ chosen uniformly at random subject to the constraint of mapping $x_\dagger$ to $z_\dagger$. Thus for each chosen plaintext $X_\dagger \not= x_\dagger$, the ciphertexts $(e'_{k'_\star} \circ e_{k_\star})(X_\dagger)$ and $(e'_{k'} \circ e_k)(X_\dagger)$ are uniformly distributed on $\mathbb{F}_2^n \backslash \{z_\dagger\}$. The probability they agree is therefore $1/(2^n-1)$. More generally, the probability that $e'_{k'}\bigl(e_{k}(X_\dagger^{(i)})\bigr) = (e'_{k'_\star} \circ e_{k_\star})(X^{(i)}_\dagger)$ for all of the $p$ chosen $X_\dagger^{(i)}$ is

$\displaystyle \frac{1}{2^n-1} \times \ldots \times \frac{1}{2^n-p} \approx \frac{1}{2^{np}}.$

We have to check $M$ key pairs, so the expected number of ‘false’ keys is very nearly $\mathbb{E}\bigl[\frac{M}{2^{np}}\bigr]$; by the good approximation for $M$ above, this is very nearly

$\displaystyle \frac{2^{\ell+\ell'}}{2^{(1+p)n}} + \frac{1}{2^{np}}.$

Suppose for simplicity that $\ell = \ell'$. Setting $n = \nu \ell$, our estimate for the total work is then

$2^{\ell}+2^{\ell'} + 2p\bigl( 2^{(2-\nu)\ell} + 1 \bigr)$

cryptops, where $p$ should be chosen so that $\nu(1+p)$ is large compared to $2$. In particular, if $\nu$ is very large then, as one would expect, one can take $p=0$: no checking cryptops are needed.

#### DES as a random cipher

For DES, $\ell = 56$, $n = 64$, so $\nu = \frac{8}{7}$, $(2-\nu)\ell = \frac{6}{7} \times 56 = 48$ and we need $\frac{8}{7}(1+p)$ large compared to $2$. Taking $p = 2$, the total work is $2^{57} + 2^{50}$. Clearly for any reasonable choice of $p$, the cost of the attack is dominated by the first phase, and is about $2^{57}$ cryptops. Even in 2009, using a special purpose device, this took at most $2$ days.

#### AES as a random cipher

For AES with the shortest key length, $\ell = n = 128$, so $\nu = 1$ and we need $2p$ large compared to $2$. Again taking $p=2$, the total work is $2^{129} + 2^{130}$, roughly balanced between the first and second phases. For AES with the longest key length, $\ell = 256$ and $n=128$, so $\nu = \frac{1}{2}$ and we need $(1+p)/2$ large compared to $2$. Taking $p=5$, the total work is $2^{257} + 10 \times 2^{384}$. The second phase dominates. Of course neither attack is practical.

## An algebraic view of secret sharing

October 17, 2018

As last year I’m lecturing our 3rd year / 4th year / M.Sc. course on cipher systems. As part of the M.Sc. course I cover the Shamir Secret Sharing Scheme. Here is Shamir’s idea generalized to arbitrary commutative rings, with an application to secret sharing in a hierarchical organization. The idea is very obvious, so no originality is expected or claimed.

For a practical example, imagine that I have a critically important 20GB file. Using a secret-sharing scheme with threshold $3$, I issue shares of the file (each most probably still of size 20GB) to Amazon, Degoo (who promise a ‘top secret cloud drive’), Dropbox and Microsoft. If any single provider goes out of business, or maliciously corrupts its share, the file can still be reconstructed from the remaining three shares. Depending on how much I trust them (and the computational resources available on the various platforms), the reconstruction could be done in the cloud, or on my own machine. In the latter case, a provider can learn the secret only if three of them collude by pooling their shares: this would presumably be a serious breach of all their terms of service.

### Mathematical formulation

Fix $n \in \mathbb{N}$ and let $1 \le t \le n$. Fix a commutative ring $R$ and distinct ideals $I$, $J_1, \ldots, J_n$ of $R$. The secret is an element of $R/I$. We suppose that $R$ has a subset $S$ such that

1. if $A \subseteq \{1, \ldots, n\}$ is a $t$-set then the composition $S \hookrightarrow R \rightarrow \prod_{i \in A} R/J_i$ is injective;
2. if $A \subseteq \{1, \ldots, n\}$ is a $(t-1)$-set then the composition $S \hookrightarrow R \rightarrow R/I \times \prod_{i \in A} R/J_i$ is surjective.

In particular (2) implies that $S \hookrightarrow R \twoheadrightarrow R/I$ is surjective.

To share a secret $s + I \in R/I$, pick $r \in S$ such that $r \equiv s$ mod $I$; the share for algebraist $i$ is then $r + J_i \in R/J_i$. By hypothesis, for each $A \subseteq \{1,\ldots, n\}$ such that $|A| = t$, the unique element of $S$ congruent to $r$ modulo $J_i$ for each $i \in A$ is $r$ itself. Therefore any $t$ (or more) algebraists can pool their shares and determine $r$, and hence the secret $s + I = r + I$.

### Shamir’s Scheme

Fix a prime $p$ and distinct non-zero $c_1, \ldots, c_n \in \mathbb{F}_p$. Take $R = \mathbb{F}_p[X]$, $I = \langle X \rangle$, $J_i = \langle X - c_i \rangle$ and $S = \{ r \in R : \mathrm{deg}\, r < t \}$. (Thus the secret is $r(0)$, and the share for algebraist $i$ is $r(c_i)$.) Property (1) holds because a polynomial of degree $\le t-1$ is uniquely determined by its values at the $t$ distinct points $c_i$ for $i \in A$. Moreover, since the map

$\displaystyle S \rightarrow R/I \times \prod_{i \in A} R/J_i \cong \mathbb{F}_p \times \mathbb{F}_p^{t-1}$

is a linear isomorphism whenever $|A| = t-1$, we have (2) in the strongest possible form.
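Here is a minimal sketch of the scheme in code (the variable names are my own): shares are issued by evaluating a random polynomial of degree less than $t$, and the secret $r(0)$ is recovered by Lagrange interpolation at $0$.

```python
# Shamir's scheme: R = F_p[X], secret r(0), share for algebraist i is r(c_i);
# any t shares recover the secret by Lagrange interpolation at 0.

import random

p = 257
t, n = 3, 5

def share(secret, cs):
    """Pick r in S with r(0) = secret and return the shares r(c_i)."""
    coeffs = [secret] + [random.randrange(p) for _ in range(t - 1)]
    r = lambda x: sum(a * pow(x, j, p) for j, a in enumerate(coeffs)) % p
    return [r(c) for c in cs]

def reconstruct(points):
    """Lagrange interpolation at 0 from t pairs (c_i, r(c_i))."""
    secret = 0
    for i, (ci, yi) in enumerate(points):
        num = den = 1
        for j, (cj, _) in enumerate(points):
            if j != i:
                num = num * (-cj) % p
                den = den * (ci - cj) % p
        # pow(den, p - 2, p) is the inverse of den modulo the prime p
        secret = (secret + yi * num * pow(den, p - 2, p)) % p
    return secret

cs = [1, 2, 3, 4, 5]                 # distinct non-zero evaluation points
shares = share(123, cs)
assert reconstruct(list(zip(cs, shares))[:t]) == 123
assert reconstruct(list(zip(cs, shares))[2:5]) == 123
print("any", t, "of", n, "shares recover the secret")
```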

### Integer version

For another special case, fix a prime $p$, a parameter $c \ge 2$ (of which more later) and consecutive primes $q_1 < \ldots < q_n$ with $cp \le q_1$. Let $Q = q_1 \ldots q_t$. Take $R = \mathbb{Z}$, $I = p\mathbb{Z}$, $J_i = q_i \mathbb{Z}$ and

$S = \{0,1,\ldots, Q-1\}.$

Since each $r \in S$ is uniquely determined by its residues modulo any $t$ of the $q_i$, we have (1). Suppose the $t-1$ algebraists numbered $a_1, \ldots, a_{t-1}$ pool their shares. They can learn $r$ modulo $Q' = q_{a_1} \ldots q_{a_{t-1}}$. Now $S$ contains $\lfloor Q/Q' \rfloor$ ‘candidate secrets’ of the form $r+ b Q'$ where $b \in \mathbb{Z}$. Provided $c$ is sufficiently large,

$\displaystyle \frac{Q}{Q'} = \frac{q_1 \ldots q_t}{q_{a_1} \ldots q_{a_{t-1}}} \ge cp\frac{q_2 \ldots q_t}{q_{n-t+2} \ldots q_n} \ge p$.

Therefore the ‘candidate secrets’ cover all residue classes modulo $p$, and so (2) holds. Moreover, the sizes of the fibres over each $(r^\star + I, r + J_{a_1}, \ldots, r+J_{a_{t-1}})$ vary by at most $1$ with $r^\star$, and have mean size $Q / (p\, q_{a_1} \cdots q_{a_{t-1}})$. Provided $c$ is large, all the primes $q_i$ have about the same size, and the mean fibre size is approximately $c$.

#### Example

Provided $c$ is large, this variation in the fibre leaks very little information. For small $c$ it is a problem. For example, take $n=4$, $t=3$, $c = 2$, $p = 11$ and $q_1 = 23$, $q_2 = 29$, $q_3 = 31$, $q_4 = 37$. Thus $Q = q_1q_2q_3 = 20677$. For the secret $s$ we choose $7$, lifting it to

$r = 11161 \in \{0,\ldots, Q-1\}.$

The shares are then $6, 25, 1, 24$. If algebraists $3$ and $4$ cooperate they can learn $r$ modulo $31 \times 37$. This leaves

$\lfloor \frac{Q}{31 \times 37} \rfloor = 18$

possibilities for $r$. The mean fibre size is $18 / 11$ and since $18 = 2 \times 7 + 1 \times 4$, there are $7$ fibres of size $2$ and $4$ of size $1$ (namely those for $1 + 11\mathbb{Z}$, $4 + 11\mathbb{Z}$, $7 + 11\mathbb{Z}$ and $10 + 11\mathbb{Z}$). Choosing one of these at random gives them a $1/4$ chance of guessing the secret.
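The numbers in this example are easily checked by brute force (a throwaway script, using the values from the example):

```python
# Check the example: the shares of r = 11161, the 18 candidate secrets
# known to algebraists 3 and 4, and the fibre sizes modulo p = 11.

from collections import Counter

p, qs = 11, [23, 29, 31, 37]
Q = 23 * 29 * 31                      # q1*q2*q3 = 20677
r = 11161

assert r % p == 7                     # the secret
assert [r % q for q in qs] == [6, 25, 1, 24]   # the four shares

# Algebraists 3 and 4 know r modulo 31*37; enumerate the candidates in S.
known = r % (31 * 37)
candidates = [x for x in range(Q) if x % (31 * 37) == known]
assert len(candidates) == 18

fibres = Counter(x % p for x in candidates)
assert sorted(fibres.values()) == [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2]
assert sorted(c for c, m in fibres.items() if m == 1) == [1, 4, 7, 10]
print("fibre sizes:", dict(fibres))
```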

Choosing a small fibre is a good heuristic, but not infallible: it is possible that the secret corresponds to a large fibre—for example if algebraists $1$ and $2$ pool their shares then the fibres for $2 + 11 \mathbb{Z}$ and $9 + 11\mathbb{Z}$ have size $2$, and the rest have size $3$.

### Question

What other rings are amenable to the generalized Shamir scheme?

### Secret sharing in a hierarchy

One possible answer to this question is ‘multivariable polynomial rings’. Here is an idea along these lines.

Let $R = \mathbb{F}_p[X,Y]$. Fix $m, n \in \mathbb{N}$ and thresholds $t$, $u$ with $t \le m$ and $u \le n$. Let $S$ be the set of polynomials $f \in R$ with $\mathrm{deg}_X f < t$ and $\mathrm{deg}_Y f < u$. Fix distinct non-zero $c_1, \ldots, c_m \in \mathbb{F}_p$ and distinct non-zero $d_1, \ldots, d_n \in \mathbb{F}_p$. The aim is to share a secret $s \in \mathbb{F}_p$ between teams $T_1, \ldots, T_n$ where team $T_j$ consists of people $P_{1j}, \ldots, P_{mj}$. For this, choose $f \in S$ such that $f(0,0) = s$. For each $j \in \{1,\ldots, n\}$, let $f_j(X) = f(X, d_j)$. For each $j \in \{1,\ldots, n\}$ and each $i \in \{1,\ldots, m\}$, let $s_{ij} = f_j(c_i) = f(c_i,d_j)$. Issue $s_{ij}$ as the share to person $P_{ij}$.

The shares $s_{ij}$ are the shares in a normal Shamir Secret Sharing Scheme for $f_j(0)$. By the linear isomorphism in the first example above, when $t$ people in the team cooperate to learn $f_j(0)$, they in fact learn $f_j$ itself. Suppose that $u$ teams learn $f_{b_1}, \ldots, f_{b_u}$. Then since $\mathrm{deg}_Y f < u$, they can cooperate to learn $f$, and hence $f(0,0)$. Given a $(u-1)$-subset $B$ of $\{1,\ldots, n\}$, when the teams numbered in $B$ pool their shares they learn nothing useful, since for each $h^\star \in \mathbb{F}_p[X]$ with $\mathrm{deg}\, h^\star < t$, there is a unique polynomial $f^\star$ with $\mathrm{deg}_Y f^\star < u$ such that $f^\star(X, 0) = h^\star(X)$ and $f^\star(X, d_j) = f_j(X)$ for each $j \in B$. Explicitly,

$\displaystyle f^\star = h^\star(X) \prod_{k \in B} \frac{Y-d_k}{-d_k}+ \sum_{j \in B} f_j(X)\, \frac{Y}{d_j} \prod_{k \in B \atop k \not=j}\frac{Y-d_k}{d_j-d_k}.$

This polynomial is consistent with the secret being $h^\star (0)$; since $h^\star(0)$ takes each value in $\mathbb{F}_p$ equally often as $h^\star(X)$ varies over the polynomials in $\mathbb{F}_p[X]$ of degree at most $t-1$, no useful information is learned.
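Here is a sketch of the hierarchical scheme in code, under the small assumption that reconstruction is done by Lagrange interpolation at $0$ in each variable in turn (all names are my own):

```python
# Two-level scheme: f with deg_X f < t, deg_Y f < u and f(0,0) = s;
# person P_ij holds f(c_i, d_j). Reconstruction interpolates within each
# of u teams, then across teams.

import random

p = 257
t, u = 2, 2           # thresholds within a team and across teams
m, n = 3, 3           # team size and number of teams
cs = [1, 2, 3]        # distinct non-zero points for people
ds = [1, 2, 3]        # distinct non-zero points for teams

def interp_at_zero(points):
    """Lagrange interpolation at 0 over F_p from pairs (x, y)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num = den = 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (-xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total

secret = 42
# f(X, Y) = sum of a[k][l] X^k Y^l with a[0][0] the secret
a = [[secret if (k, l) == (0, 0) else random.randrange(p)
      for l in range(u)] for k in range(t)]
f = lambda x, y: sum(a[k][l] * pow(x, k, p) * pow(y, l, p)
                     for k in range(t) for l in range(u)) % p

shares = {(i, j): f(cs[i], ds[j]) for i in range(m) for j in range(n)}

# t people in each of the first u teams recover f_j(0) = f(0, d_j) ...
team_secrets = [interp_at_zero([(cs[i], shares[(i, j)]) for i in range(t)])
                for j in range(u)]
# ... and u team secrets recover f(0, 0).
assert interp_at_zero(list(zip(ds[:u], team_secrets))) == secret
print("recovered the secret from", t, "people in each of", u, "teams")
```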

It should be admitted that one can arrive at essentially the same situation, more simply, by sharing the secret between $n$ teams, and then sharing each 'team-share' between the $m$ people in each team, each time using the normal Shamir Scheme.

## Permutation modules and the simple group of order 168

June 26, 2018

In an earlier post I decomposed the permutation module for $\mathrm{GL}_3(\mathbb{F}_2)$ acting on the $7$ non-zero lines in $\mathbb{F}_2^3$, with coefficients in a field of characteristic $2$. Some related ideas give the shortest proof I know of the exceptional isomorphism between $\mathrm{PSL}_2(\mathbb{F}_7)$ and $\mathrm{GL}_3(\mathbb{F}_2)$. The motivation section below explains why what we do ‘must work’, but is not logically necessary.

### Motivation

Let $h \in \mathrm{GL}_3(\mathbb{F}_2)$ have order $7$ and let

$H = \mathrm{N}_{\mathrm{GL}_3(\mathbb{F}_2)}(\langle h\rangle) \cong C_7 \rtimes C_3.$

The simple modules for $\mathrm{GL}_3(\mathbb{F}_2)$ in characteristic $2$ are the trivial module, the natural module $E$, its dual $E^\star$, and the reduction modulo $2$ of a module affording the $8$-dimensional character induced from either of the two non-trivial linear characters of $H$. (A short calculation with Mackey’s Formula and Frobenius Reciprocity shows the module is irreducible in characteristic zero; then since $168/8$ is odd, it reduces modulo $2$ to a simple projective. Alternatively one can work over $\mathbb{F}_4$ and perform the analogue of the characteristic zero construction directly.) The permutation module for $\mathrm{GL}_3(\mathbb{F}_2)$ acting on the cosets of $H$, defined over $\mathbb{F}_2$, is projective. Since it has the trivial module in its socle and top, the only possible Loewy structures are

$\begin{matrix} \mathbb{F}_2 \\ E \oplus E^\star \\ \mathbb{F}_2 \end{matrix}, \quad \begin{matrix} \mathbb{F}_2 \\ E \\ E^\star \\ \mathbb{F}_2 \end{matrix}, \quad \begin{matrix} \mathbb{F}_2 \\ E^\star \\ E \\ \mathbb{F}_2 \end{matrix}.$

In every case, the permutation module contains a $4$-dimensional $\mathbb{F}_2\mathrm{GL}_3(\mathbb{F}_2)$-submodule $U$ having either $E$ or $E^\star$ as a top composition factor. Below we construct $U$ as a module for $\mathrm{PSL}_2(\mathbb{F}_7)$ and hence obtain the exceptional isomorphism in the form $\mathrm{PSL}_2(\mathbb{F}_7) \cong \mathrm{GL}(E)$.

In fact the final two Loewy structures can be ruled out because they are asymmetric with respect to $E$ and $E^\star$, and so paired under the outer automorphism of $\mathrm{GL}_3(\mathbb{F}_2)$. Thus if one exists then so does the other, and there would be two different projective covers of the trivial module (impossible).

### Construction

Let $G = \mathrm{PSL}_2(\mathbb{F}_7)$ regarded as a permutation group acting on the $8$ non-zero lines in $\mathbb{F}_7^2$. For $j \in \{0,1,\ldots, 6\}$, let $j$ denote the line through $(j,1)$, and let $\infty$ denote the line through $(1,0)$. Let $M = \langle e_j : j \in \{0,1,\ldots, 6\} \cup \{\infty\} \rangle_{\mathbb{F}_2}$ be the corresponding $\mathbb{F}_2G$-permutation module.

Let $h = (0, 1, \dots, 6) \in G$, corresponding to the Möbius map $x \mapsto x + 1$ and to the matrix $\left( \begin{matrix} 1 & 1 \\ 0 & 1 \end{matrix} \right)$ in $\mathrm{SL}_2(\mathbb{F}_7)$. The simple modules for $\mathbb{F}_2\langle h \rangle$ correspond to the factors of

$x^7 + 1 = (x+1)(x^3+x+1)(x^3+x^2+1).$

Since $\{1,2,4\} \le \mathbb{F}_7^\times$ is permuted by squaring, an idempotent killing the simple modules with minimal polynomial $x^3 + x + 1$ is $h^4 + h^2 + h = (h^3+h+1)h$. Therefore the module $U$ we require has codimension $1$ in the $8-3 = 5$-dimensional module

$M(h + h^2 + h^4) = \langle e_\infty \rangle_{\mathbb{F}_2} \oplus \left\langle \begin{matrix} e_0 + e_1 + e_3 \\ e_1 + e_2 + e_4 \\ e_2 + e_3 + e_5 \\ e_3 + e_4 + e_6 \end{matrix} \right\rangle_{\mathbb{F}_2}$

Here $(e_3+e_4+e_6)h = e_4+e_5 + e_0$ is the sum of the first three basis vectors, so in its right action, $h$ is represented by

$\left( \begin{matrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{matrix}\right)$.

By symmetry $e_\infty$ must appear in some vector in the $\mathbb{F}_2G$-module $U$ we require. It therefore seems a plausible guess that the module $U$ we require is generated by $u_0$ where

$u_0 = e_\infty + e_0 + e_1 + e_3$

and so $U$ has basis $u_0, u_1, u_2, u_3$ where $u_i = u_0 h^i$ for each $i$.

Let $t = (1,2,4)(3,5,6) \in \mathrm{N}_G(\langle h \rangle)$ and let

$s = (0,\infty)(1,3)(2,5)(4,6) \in G,$

chosen so that $h^t = h^2$ and $t^s = t^2$. The corresponding Möbius maps are $x \mapsto 2x$ and $x \mapsto 3/x$ and one choice of corresponding matrices in $\mathrm{SL}_2(\mathbb{F}_7)$ is $\left( \begin{matrix} 4 & 0 \\ 0 & 2 \end{matrix} \right)$ and $\left(\begin{matrix} 0 & 2 \\ 3 & 0 \end{matrix}\right)$. Extending the calculation for $h$ above shows that in their (right) action on $U$,

$h \mapsto \left( \begin{matrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{matrix}\right), t \mapsto \left( \begin{matrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{matrix}\right), s \mapsto \left( \begin{matrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{matrix}\right).$

The unique trivial submodule of $U$ is spanned by

$u_0 + u_2 + u_3 = e_0 + e_1 + \cdots + e_6 + e_\infty.$

Quotienting out, we obtain the $3$-dimensional representation of $G$ where

$h \mapsto \left( \begin{matrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 1 \end{matrix} \right), \; t \mapsto \left( \begin{matrix} 0 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{matrix}\right), \; s \mapsto \left( \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{matrix} \right)$

Since $\mathrm{PSL}_2(\mathbb{F}_7)$ has cyclic Sylow $3$-subgroups and cyclic Sylow $7$-subgroups, and a unique conjugacy class of elements of order $2$ (which contains $s$), one can see just from the matrices above that the representation is faithful. Comparing orders, we get $\mathrm{PSL}_2(\mathbb{F}_7) \cong \mathrm{GL}_3(\mathbb{F}_2)$.
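One can also confirm the final step by machine: the three matrices have the stated orders and generate a group of order $168$. A brute-force closure (the helpers `mul` and `order` are my own) suffices:

```python
# Check that the 3x3 matrices over F_2 representing h, t, s have orders
# 7, 3, 2 and generate a group of order 168 = |PSL_2(F_7)| = |GL_3(F_2)|.

def mul(A, B):
    return tuple(tuple(sum(A[i][k]*B[k][j] for k in range(3)) % 2
                       for j in range(3)) for i in range(3))

h = ((0, 1, 0), (0, 0, 1), (1, 0, 1))
t = ((0, 1, 1), (0, 1, 0), (1, 0, 1))
s = ((1, 0, 0), (0, 1, 0), (1, 0, 1))
I = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def order(M):
    k, P = 1, M
    while P != I:
        P, k = mul(P, M), k + 1
    return k

assert (order(h), order(t), order(s)) == (7, 3, 2)

# Breadth-first closure under the three generators.
group = {I}
frontier = [I]
while frontier:
    new = []
    for g in frontier:
        for x in (h, t, s):
            gx = mul(g, x)
            if gx not in group:
                group.add(gx)
                new.append(gx)
    frontier = new
assert len(group) == 168
print("the matrices generate a group of order", len(group))
```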

## A weighted Chebyshev’s order inequality

June 20, 2018

Let $w_i \in \mathbb{R}^{\ge 0}$ be non-negative weights and let $a_1, \ldots, a_n \in \mathbb{R}^{\ge 0}$, $b_1, \ldots, b_n \in \mathbb{R}^{\ge 0}$ be non-negative numbers such that

\begin{aligned} &a_1 \ge a_2 \ge \ldots \ge a_n \\ & b_1 \le b_2 \le \ldots \le b_n \end{aligned}.

Then

\begin{aligned} \bigl( \sum_{i} w_i \bigr) \bigl(\sum_{j}w_ja_jb_j \bigr) &= \frac{1}{2} \sum_{ij} w_iw_j (a_jb_j + a_ib_i) \\ &\le \frac{1}{2} \sum_{ij} w_iw_j (a_jb_i + a_ib_j) \\ &= \bigl( \sum_i w_ia_i\bigr) \bigl( \sum_j w_jb_j \bigr)\end{aligned}

where the middle step follows from the inequality $(a_jb_i+a_ib_j) - (a_jb_j + a_ib_i) = (a_j-a_i)(b_i-b_j) \ge 0$. In particular, if $\sum_i w_i = 1$ then we obtain

$\sum_i w_i a_i b_i \le \bigl( \sum_i w_i a_i \bigr)\bigl(\sum_i w_i b_i\bigr).$

This is a weighted version of Chebyshev’s order inequality. The weighted Chebyshev’s order inequality appears (with a different proof) in an interesting article in MAA Mathematics Magazine, where it is used to deduce Jensen’s inequality.
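The inequality is easy to test numerically; here is a throwaway check on random oppositely sorted data with random non-negative weights:

```python
# Check the weighted Chebyshev order inequality: a decreasing, b increasing,
# weights normalised to sum to 1.

import random

random.seed(0)
for _ in range(100):
    n = random.randrange(1, 10)
    a = sorted((random.random() for _ in range(n)), reverse=True)
    b = sorted(random.random() for _ in range(n))
    w = [random.random() for _ in range(n)]
    total = sum(w)
    w = [wi / total for wi in w]        # normalise so the weights sum to 1
    lhs = sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
    rhs = sum(wi * ai for wi, ai in zip(w, a)) * \
          sum(wi * bi for wi, bi in zip(w, b))
    assert lhs <= rhs + 1e-12           # tolerance for floating point error
print("weighted Chebyshev order inequality verified on random data")
```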