Why is Adobe Acrobat so innumerate?

July 12, 2019

The image below shows Adobe Acrobat’s conversion to Microsoft Powerpoint of a typical slide produced by LaTeX and Beamer. The original is shown below.

For the £70 annual fee Adobe charge for turning on this misfeature, I think one might expect something better. The execrable kerning is just one final kick in the teeth. Fortunately for my credit card, I was able to brandish the Consumer Contracts Regulations in an online chat and have been promised a full refund. I’ve seen similar glitches in journal papers: the culprit now seems clear. As an alternative on the Mac, I suggest using pdf-to-keynote and then exporting from Keynote. Or better still, have nothing to do with either Adobe or Microsoft — if only that were feasible. In fairness, I should note that the output from Adobes’s conversion is editable in Microsoft Powerpoint; unfortunately the only reasonable edit one could make it to delete the lot and start over.


Difference attacks on toy block ciphers

May 22, 2019

After setting an examination for my Cipher Systems course that turned me from hero to zero in the eyes of the students (you can guess why), I’m somewhat disillusioned with the subject. One unavoidable issue is that it’s hard to isolate the interesting mathematical ideas from the forest of notation and definitions needed to define a modern block cipher such as AES. With a few honorable exceptions, it seems most textbooks don’t even try.

As block ciphers go, AES is a relatively elegant design: its cryptographic strength comes ultimately from the pseudo-inverse function P : \mathbb{F}_2^8 \rightarrow \mathbb{F}_2^8 defined by identifying \mathbb{F}_2^8 with the finite field \mathbb{F}_{2^8} and then inverting non-zero elements. (The zero word is the unique fixed point of P.) In a typical round the input in \mathbb{F}_2^{128} is split into 16 bytes and the non-linear function P is applied to each byte; the bytes are then combined and mixed up by two \mathbb{F}_2-affine permutations, and these are followed by a final key addition in \mathbb{F}_2.

Stream ciphers seem to be even messier: I asked
without great success on crypto.stackexchange.com for a simply defined but not-trivially-breakable stream cipher I could use in my course.

As a toy model for a difference attack in my course, I used the block cipher with plaintext and ciphertexts \mathbb{F}_2^8, key space \mathbb{F}_2^8 \times \mathbb{F}_2^8 and encryption functions defined by

e_{(k,k')}(x) = P(x+k) + k'.

Suppose you know a single plaintext/ciphertext pair (x,z). For each guessed first key k_\star there is a unique k'_\star such that e_{(k_\star,k'_\star)}(x) = z, namely

k'_\star = P(x+k_\star) + z.

Thus a single known plaintext/ciphertext pair does not reveal either component of the key. To motivate the attack in this post, let us consider what two pairs (x,z) and (x+\Delta, z+\Gamma) reveal. To save space, write x_\Delta for x + \Delta. Suppose that (k_\star, k'_\star) is a possible key:

\begin{aligned} x &\mapsto x + k_\star\hskip7.5pt  \mapsto P(x+k_\star) \hskip7pt \mapsto P(x+k_\star) + k'_\star \hskip8pt = z \\ x_\Delta &\mapsto x_\Delta + k_\star \mapsto P(x_\Delta + k_\star) \mapsto P(x_\Delta+k_\star) +k'_\star = z + \Gamma.\end{aligned}

A little thought shows that another possible key is (k_\star + \Delta, k'_\star + \Gamma):

\begin{aligned} x &\mapsto x_\Delta + k_\star \mapsto P(x_\Delta+k_\star) \mapsto P(x_\Delta+k_\star) + k'_\star + \Gamma = z \\ x_\Delta &\mapsto x + k_\star\hskip7.5pt  \mapsto P(x + k_\star)\hskip7pt \mapsto P(x+k_\star) +k'_\star +\Gamma\hskip8pt  = z + \Gamma.\end{aligned}

Observe that the differences of top and bottom are in both cases

\Delta \mapsto \Delta \mapsto \Gamma \mapsto \Gamma.

This illustrates the critical point that differences are invariant under key addition, but typically changed by non-linear functions such as P. By the lemma below, for any non-zero input difference \Delta \in \mathbb{F}_2^8, the output difference

P(x+k_\star) + P(x + k_\star + \Delta)

can be 127 of the 256 elements of \mathbb{F}_{2}^8. This is usually taken as a sign of the cryptographic strength of P. But here it means that one can (apart from in one case) determine the pair \{x+k_\star,x+k_\star+\Delta\} from \Delta and \Gamma. For the toy cipher, this pair is \{x+k, x+\Delta+k\}, and so we determine \{k, k+\Delta\}, leaving exactly two possible keys.

Differences through pseudo-inverse.

Lemma. Let \delta \in \mathbb{F}_{2^8} \backslash \{0\}. If \gamma \in \mathbb{F}_{2^8} \backslash \{0,1/\delta\} then the equation p(\alpha) + p(\alpha+\delta) = \gamma has exactly zero or two solutions.

Proof. If \alpha \in \{0,\delta\} then

p(\alpha)+p(\alpha+\delta) = p(0) + p(\delta) = 0 + 1/\delta = 1/\delta.

We may therefore assume that \alpha\not\in \{0,\delta\} and so p(\alpha) = \alpha^{-1} and p(\alpha+\delta) = (\alpha+\delta)^{-1}. Now \alpha^{-1} + (\alpha+\delta)^{-1} = \gamma if and only if \delta = \gamma \alpha (\alpha + \delta). This is a quadratic equation in \alpha not having a repeated root. \Box.

(In the exceptional case \gamma = 1/\delta, there are four solutions, namely 0, 1, and the two roots of (\alpha/\delta)^2 + (\alpha/\delta) + 1 lying in \delta \mathbb{F}_{2^2}^\times \subseteq \mathbb{F}_{2^8}. In fact, since (\alpha/\delta)^2 + (\alpha/\delta) + 1/\gamma\delta has 2 roots if and only if 1/\gamma\delta \in \{\beta^2 + \beta : \beta \in \mathbb{F}_2^8\}, which is a co-dimension 1 subspace of \mathbb{F}_{2^8}, there are exactly 127 output differences \gamma for each non-zero input difference \delta, each except 1/\delta arising from exactly two inputs \alpha.)

In my course I spent an entire lecture on the difference attack, simplifying the lemma by only considering \Delta = 0000\,0001, corresponding to \delta = 1 \in \mathbb{F}_{2^8}, and omitting the parenthetical paragraph entirely. Still I’m fairly sure no-one except the keenest M.Sc. students got the idea. Probably I will scrap it next year, and instead give an example from a different toy block cipher with a more easily defined S-box. Or maybe, since there is no chance of universal comprehension, I should just leave it to a problem sheet, with a hint that the lemma above follows from Hilbert’s Theorem 90. (Joke.)

Variations on pseudo-inverse

  • The exceptional case \gamma=1/\delta is annoying. It arises because X^2+X+1 has a root in \mathbb{F}_{2^8}. Over any \mathbb{F}_{2^c} with c odd, there are 2^{c-1} output differences for any input difference \gamma. For instance, over \mathbb{F}_{2^2} we only see the exceptional case (and even more embarrassingly, the pseudo-inverse function p is linear), while over \mathbb{F}_{2^3}, defined by \beta^3 + \beta = 1, the output differences for input difference 1 are \{1,1+\beta, 1+\beta^2, 1+\beta+\beta^2\}; these are the (no-longer-so) exceptional \gamma = 1, and the reciprocals of the non-zero elements in the subspace

    \{\alpha^2 + \alpha : \alpha \in \mathbb{F}_{2^3} \} = \langle \beta, \beta^2\rangle.

  • It is mathematically tempting to replace the tricky finite field \mathbb{F}_{2^8} with \mathbb{F}_p for p an odd prime. Taking the input difference \delta, now applied (and extracted) using arithmetic in \mathbb{F}_p, we have

    p(x+1) - p(x) = 1/(x+1) - 1/x = 1/x(x+1)

    for x\not\in \{-1,0\}. Quadratic equations are now more easily understood: we can just complete the square to see that exactly (p-1)/2 of the elements \mathbb{F}_p^\times are output differences; again the output difference 1 has double the probability of the others.

  • I hoped that \mathbb{Z}/2^n\mathbb{Z} might give a nice example, using the function q such that q(2x) = 1/(2x+1) -1 and q(2x+1) = 1/(2x+1). This function is weak respect to the input difference of 1, since, by definition, q(2x+1)-q(2x) = 1 for any x. Perhaps surprisingly, this makes 1 less suitable for the attack. The situation for even input differences is somewhat messy. Clearly this is a non-starter for my course.
  • It’s tempting to try s(\alpha) = \alpha^2. But because of the identity

    s(\alpha + \delta) = (\alpha + \delta)^2 = \alpha^2 + \delta^2 = s(\alpha) + \delta^2,

    an input difference of \delta transitions deterministically (i.e. the input \alpha is irrelevant) to an output difference of \delta^2. One could replace the one-time-pad key addition in my toy block cipher with s and run essentially the same attack, but as a direct replacement for pseudo-inverse, squaring has completely the wrong properties.

  • Given this, one might try r(\alpha) = \alpha^3, working in \mathbb{F}_{2^c}. We have

    (\alpha+1)^3 + \alpha^3 = \alpha^2 + \alpha + 1,

    so by the argument already seen, there is a affine codimension 1 subspace of output differences. Unfortunately cubing is only invertible if c is odd; this is the case when the subspace is properly affine and rules out \mathbb{F}_{2^8}.

  • The sum of all hook characters

    April 2, 2019

    Given a partition \lambda of n, let \chi^\lambda denote the irreducible character of the symmetric group S_n canonically labelled by the partition \lambda. For example, \chi^{(n)} = 1 is the trivial character, \chi^{(1^n)} is the sign character, and \chi^{(n-1,1)} is the irreducible n-1-degree constituent of the natural permutation character \pi. Thus \pi(g) = |\mathrm{Fix}(g)| and \chi^{(n-1,1)}(g) = |\mathrm{Fix}(g)| - 1 for g \in S_n. Let \Gamma = \sum_{a=0}^{n-1} \chi^{(n-a,1^a)} be the sum of all the characters labelled by hook partitions. Let \mathcal{O}_n be the set of permutations in S_n having only cycles of odd length.

    A nice short paper by Jay Taylor uses characters of the symmetric group labelled by skew partitions to prove that

    \Gamma(g) = \begin{cases} 2^{\mathrm{cyc}(g) - 1} & g \in \mathcal{O}_n \\ 0 & g \not\in \mathcal{O}_n \end{cases}

    where \mathrm{cyc}(g) is the number of cycles in g.

    As Taylor notes, this result is a special case of a more general theorem of Regev, proved by him using Lie superalgebras. Taylor’s paper gives a short proof using skew characters of the symmetric group. The purpose of this post is to give an alternative proof, using only a basic result on exterior powers of the natural permutation character and to suggest some generalizations not contained in Regev’s original results.

    Background: a generating function for exterior powers.

    Let \bigwedge^k denote the kth exterior power of a character or representation. Let V be a complex representation of a finite group G with character \chi and let g \in G have eigenvalues \lambda_1, \ldots, \lambda_d in its action on V. Thus \chi(g) = \lambda_1 + \cdots + \lambda_d. Let v_1,\ldots, v_n be a basis for V in which g acts diagonally. Then

    \bigwedge^r V = \langle v_{i_1} \wedge\cdots \wedge v_{i_r} : i_1 < \ldots < i_r \rangle_\mathbb{C}

    and we see that

    (\bigwedge^r \chi) (g) = \sum_{i_1 < \ldots < i_r} \lambda_{i_1} \ldots \lambda_{i_r}.

    This is the r-th elementary symmetric polynomial evaluated at the eigenvalues \lambda_1, \ldots, \lambda_d. Hence

    (1)\quad \sum_{r=0}^d (\bigwedge^r \chi)(g) x^r = (1+\lambda_1 x) \cdots (1+\lambda_d x).

    This generating function is the counterpart for exterior powers of the Molien series, defined using symmetric powers, that is fundamental to invariant theory.

    Background: exterior powers of the natural module.

    Let V = \langle e_1, \ldots, e_n\rangle_\mathbb{C} be the natural representation of S_n over the complex numbers. Now \bigwedge^n V = \langle e_1 \wedge \cdots \wedge e_n \rangle is the sign representation of S_n, and correspondingly, \bigwedge^n \pi = \chi^{(1^n)}. More generally, \bigwedge^r V = \langle e_{i_1} \wedge \cdots \wedge e_{i_r} : i_1 < \ldots < i_r\rangle_\mathbb{C} is the monomial representation of S_n induced from \mathrm{sgn}_{S_r} \times \mathbb{C}_{S_{n-r}}. Its character is known to be

    (2)\quad\bigwedge^r \pi = \chi^{(n-r,1^r)} + \chi^{(n-r+1,1^{r-1})}

    for 1 \le r < n. This decomposition is an immediate corollary of either the Young or Pieri rules. It may also be proved using Specht modules and an explicit isomorphism \bigwedge^r S^{(n-1,1)} \cong S^{(n-r,1^r)}: see for instance Section 5 of this paper with Giannelli and Lim.


    By (2) we have

    \chi^{(n-r,1^r)} = \bigwedge^r \pi - \bigwedge^{r-1} \pi + \cdots + (-1)^{r-1} \pi + (-1)^r 1.

    Hence the coefficient of \bigwedge^k \pi in \Gamma is 1 if n-k is odd, and 0 if n-k is even. For example,

    \begin{aligned} \Gamma_4 &= \chi^{(4)} + \chi^{(3,1)} + \chi^{(2,1,1)} + \chi^{(1,1,1,1)} \\ &= 1+ (\pi - 1) + (\bigwedge^2 \pi - \pi + 1)  + (\bigwedge^3 \pi - \bigwedge^2 \pi + \pi - 1) \\ &= \pi + \bigwedge^3 \pi \end{aligned}

    Thus \Gamma = \sum_{k} \bigwedge^k \pi, where the sum is over all k such that n-k is odd.

    This sum is related to (1). Let g \in S_n have precisely a_k cycles of length k, for each k \in \{1,\ldots, n\}. The eigenvalues of the n \times n permutation matrix representing g can be found by considering the cycles separately: each k-cycle contributes eigenvalues 1, \zeta_k, \ldots, \zeta_{k}^{k-1}, where \zeta_k is a primitive kth root of unity. Since (1+x)(1+\zeta x) \cdots (1+\zeta^{k-1}x ) = 1 + (-1)^{k-1} x^k, the generating function in (1) gives

    \sum_{k=0}^n (\bigwedge^k \pi)(g)x^k = (1+x)^{a_1}(1-x^2)^{a_2} \ldots (1+(-1)^{n-1} x^n)^{a_n}.

    Denote this polynomial by F_g(x). We now have

    \Gamma(g) = \begin{cases} \frac{F_g(1) + F_g(-1)}{2} & n \equiv 1 \bmod 2  \\ \frac{F_g(1) - F_g(-1)}{2} & n \equiv 0 \bmod 2. \end{cases}


    F_g(1) = \begin{cases} 2^{\mathrm{cyc}(g)} & g \in \mathcal{O}_n \\ 0 & g \not\in \mathcal{O}_n \end{cases}

    and F_g(-1) = 0 for all g, the result follows.

    Possible generalizations.

    Let \Delta = \sum_{a} \chi^{(n-a,1^a)} where the sum is over all a not congruent to 1 modulo 3. Then by (2), \Delta = \sum_{b \ge 0} \bigwedge^{3b} \pi, and the proof above suggests that \Delta(g) may also have a fairly simple form. One might also look for analogous identities replacing \bigwedge^k \pi with the permutation character of S_n acting on the k-subsets of \{1,\ldots, n\}.

    Duality, polynomial representations and plethysms

    February 27, 2019

    The purpose of this post is to collect some standard results on duality and symmetric and exterior powers for representations of general and special linear groups. As a warm-up, here is a small true/false quiz with some, I think, rather unintuitive answers. Let K be the algebraic closure of \mathbb{F}_2 and let E be the natural representation of \mathrm{GL}_2(K).

    1. \mathrm{Sym}^2 E is an irreducible representation of \mathrm{GL}_2(K).
    2. The restriction of E to \mathrm{GL}_2(\mathbb{F}_4) is self-dual.
    3. E is self-dual.
    4. The restriction of E to \mathrm{SL}_2(\mathbb{F}_4) is self-dual.
    5. The restriction of \mathrm{Sym}^2 E to \mathrm{SL}_2(\mathbb{F}_4) is self-dual.
    6. The restriction of \mathrm{Sym}^2 E to \mathrm{GL}_2(\mathbb{F}_2) is self-dual, reducible and semisimple.

    Answers, all of which should be clear by the end of this post: 1. False, 2. False, 3. False (but true if you interpreted duality as contragredient duality for algebraic groups), 4. True, 5. False, 6. True.

    For deeper results and more of the general theory from coalgebras and algebraic groups, see Green, Polynomial representations of \mathrm{GL}_n, Springer Lecture Notes in Mathematics 830. These notes of mine might also be of interest.


    Let K be an infinite field. Fix d \in \mathbb{N}. Let x_{ij} for 1 \le i,j \le d be d^2 commuting indeterminates and let K[x_{ij}] be the polynomial ring they generate. Given f \in K[x_{ij}] and a d \times d matrix X, let f(X) denote f evaluated at the d^2 matrix coefficients X_{ij} for 1 \le i,j \le d. A representation \rho : \mathrm{GL}_d(K) \rightarrow \mathrm{GL}_D(K) of the general linear group is said to be polynomial of degree r is there exist polynomials f_{IJ} \in K[x_{ij}] for 1 \le I, J \le D such that \rho(X)_{IJ} = f_{IJ}(X) for all matrices X.

    The natural left \mathrm{GL}_d(K)-module E is a polynomial representation of \mathrm{GL}_d(K) of dimension d and degree 1, in which f_{IJ} = x_{ij} for 1 \le I \le J \le d. The symmetric power \mathrm{Sym}^r E (defined as a quotient of E^{\otimes r}) is a polynomial representation of degree r. For example if d = 2 then \mathrm{Sym}^2 E = \langle e_1^2, e_2^2, e_1e_2 \rangle is the polynomial representation of degree 2 in which

    \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) \mapsto \left( \begin{matrix} \alpha^2 & \beta^2 & \alpha \beta \\ \gamma^2 & \delta^2 & \gamma \delta \\ 2\alpha \gamma & 2\beta \delta &  \alpha\delta + \beta\gamma  \\ \end{matrix} \right).

    As a check on notational conventions, observe that the first column of the matrix on the right-hand side records the image

    (\alpha e_1 + \gamma e_2)^2 = \alpha^2 e_1^2 + \gamma^2 e_2^2 + 2\alpha\gamma e_1e_2

    of e_1^2. Hence f_{11} = x_{11}^2, f_{21} = x_{21}^2 and f_{31} = 2x_{11}x_{21}. Note that since K is infinite, the degree of the matrix coefficients on the right-hand side is unambiguously two. This fails over finite fields: in fact, by polynomial interpolation, any function f : \mathrm{GL}_d(\mathbb{F}_q) \rightarrow \mathbb{F}_q is a polynomial, of degree at most (q-1)d^2. Another difference between the infinite algebraic group \mathrm{GL}_d(K) and the finite group \mathrm{GL}_d(\mathbb{F}_q) arises when we define duality.

    Duality and symmetric powers

    Mimicking the definition from finite groups, we would define the dual of a representation \rho : \mathrm{GL}_d(K) \rightarrow \mathrm{GL}(V) to be the vector space dual V^\star with the action

    \rho^\star(X) \theta = [v \mapsto \theta(\rho(X^{-1}) v)].

    Let v_1, \ldots, v_D be a basis for V, with dual basis v_1^\star, \ldots, v_D^\star. The calculation

    (\rho^\star(X) v_j^\star)v_i = v_j^\star(\rho(X^{-1})v_i) = \rho(X^{-1})_{ji}

    shows that \rho^\star(X) v_j^\star = \sum_i \rho(X^{-1})_{ji} v_i^\star, and so (carefully remembering the convention for matrices acting on column vectors), we have \rho^\star(X) = \rho(X^{-1})^t. The inverse and transpose combine to make this a well-defined representation, but except when V is a direct sum of trivial representations, each of degree 0, it is not polynomial.

    Example 1 In the basis e_1^2, e_2^2, e_1e_2 of \mathrm{Sym}^2 E, we have

    \begin{aligned} \rho_{(\mathrm{Sym}^2 E)^\star}\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) &= \left( \rho_{\mathrm{Sym}^2 E} \left( \frac{1}{\Delta} \left( \begin{matrix}  \delta & -\beta \\ -\gamma & \alpha \end{matrix} \right) \right) \right)^t \\ &= \frac{1}{\Delta^2} \left( \begin{matrix} \delta^2  & \beta^2 & - \beta\delta\\ \gamma^2 & \alpha^2 & -\alpha\gamma \\ -2\gamma \delta &  -2\alpha\beta & \alpha \delta + \beta \gamma  \end{matrix} \right)^t \\ &= \frac{1}{\Delta^2} \left( \begin{matrix} \delta^2 & \gamma^2 & -2\gamma\delta \\ \beta^2 & \alpha^2 & -2\alpha\beta \\ -\beta\delta & -\alpha\gamma & \alpha \delta + \beta \gamma \\  \end{matrix} \right) \end{aligned}

    where \Delta = \alpha \delta - \beta\gamma denotes the determinant. The top-left coefficient is \delta^2 / \Delta^2, and this is not a polynomial in \alpha, \beta, \gamma, \delta. We shall see later that when this representation is restricted to the special linear group \mathrm{SL}_2(K), duality is much better behaved.

    Because conventional duality takes us out of the category of polynomial representations, a different definition of duality is used. We define the contragredient dual of a polynomial representation V of \mathrm{GL}_d(K) to be the vector space dual V^\star with the action

    \rho^\circ(X) \theta = [v \mapsto \theta(\rho(X^t)v)]

    for X \in \mathrm{GL}_d(K) and \theta \in V^\star. A very similar calculation to the above shows that \rho^\circ(X) = \rho(X^t)^t. Note that typically, the two transposes do not cancel. For instance,

    \begin{aligned} \rho_{(\mathrm{Sym}^2 E)^\circ}\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) &= \left( \rho_{\mathrm{Sym}^2 E} \left( \begin{matrix}  \alpha & \gamma \\ \beta & \delta \end{matrix} \right) \right)^t \\ &= \left( \begin{matrix} \alpha^2 & \gamma^2 & \alpha\gamma \\ \beta^2 & \delta^2 & \beta\delta \\  2\alpha\beta &  2\gamma\delta & \alpha \delta + \beta \gamma \end{matrix} \right)^t \\ &=  \left( \begin{matrix} \alpha^2 & \beta^2 & 2\alpha\beta \\  \gamma^2 & \delta^2 & 2\gamma\delta \\ \alpha\gamma & \beta\delta &  \alpha \delta + \beta \gamma \end{matrix} \right)\end{aligned}

    is the matrix for \mathrm{Sym}^2 E but with \alpha\beta and \gamma\delta doubled and 2\alpha\gamma and 2\beta\delta halved.

    To make this construction more explicit, we shall prove that (\mathrm{Sym}^2 E)^\circ \cong \mathrm{Sym}_2E, where

    \mathrm{Sym}_2E = \langle e_1 \otimes e_1, e_2 \otimes e_2, e_1 \otimes e_2 + e_2 \otimes e_1 \rangle

    is the symmetric square, but now constructed as a submodule of E \otimes E. (The formal definition is below.) Indeed, with respect to the indicated basis of \mathrm{Sym}_2E, the action is

    \rho_{\mathrm{Sym}_2E} \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) = \left( \begin{matrix} \alpha^2 & \beta^2 & 2\alpha\beta  \\ \gamma^2 &  \delta^2 & \beta\delta \\  \alpha\gamma & \beta \delta & \alpha \delta + \beta \gamma \end{matrix} \right)

    exactly as for \rho_{(\mathrm{Sym}^2 E)^\circ}. For instance, the first column is now given by the image of e_1 \otimes e_1, namely

    \begin{aligned} (\alpha e_1 + \gamma e_2) \otimes (\alpha e_1 + \gamma e_2) = & \alpha^2 (e_1 \otimes e_1) + \gamma^2 (e_2 \otimes e_2) \\ &\qquad + \alpha\gamma (e_1 \otimes e_2 + e_2 \otimes e_1). \end{aligned}

    If \mathrm{char} K \not= 2 then \mathrm{Sym}^2 E \cong \mathrm{Sym}_2 E, via the isomorphism defined by

    \begin{aligned} e_1^2 &\mapsto e_1 \otimes e_1 \\  e_2^2 &\mapsto e_2 \otimes e_2, \\  e_1e_2 &\mapsto \frac{1}{2}(e_1 \otimes e_2 + e_2 \otimes e_1).   \end{aligned}

    (Correspondingly, changing the basis of \mathrm{Sym}^2 E to e_1^2,e_2^2, 2e_1e_2 doubles the third column and halves the third row of a matrix \rho_{\mathrm{Sym}^2 E}(X), giving the matrix \rho_{\mathrm{Sym}_2 E}(X).) But if \mathrm{char} K = 2 then the representations are non-isomorphic. Indeed, from the matrices above, it’s clear that \mathrm{Sym}^2 E has \langle e_1^2, e_2^2 \rangle as a subrepresentation, and not hard to see that in fact this is its unique non-trivial proper subrepresentation. The quotient is the one-dimensional determinant representation. (Some calculation is required: the warning example for \mathrm{GL}_2(\mathbb{F}_2) in Example 11 below shows that uniqueness fails if K is replaced with \mathbb{F}_2.) Similarly, \mathrm{Sym}_2 E has \langle e_1 \otimes e_2 + e_2 \otimes e_1 \rangle as its unique non-trivial proper subrepresentation. Thus, as expected from duality, the head of \mathrm{Sym}^2 E is isomorphic to the socle of \mathrm{Sym}_2 E, both being the determinant representation.

    More generally, \mathrm{Sym}^r E has L^{(r)}(E) as a subrepresentation, where L^{(r)}(E) = \langle v^r : v \in E \rangle; since \mathrm{GL}_d(K) acts transitively on the set \bigl\{v^r : v \in E\backslash \{0\} \bigr\}, L^{(r)}(E) is irreducible. Whenever r \ge \mathrm{char} K it is a proper subrepresentation of \mathrm{Sym}^r E. For example, if \mathrm{char} K = 2 and \dim E = 2 then L^{(5)}(E) is spanned by v_1^4 v_2 and v_1 v_2^4. (This indicates the general proof.)

    Returning to contragredient duality, it is a good exercise (performed in Proposition 5 below) to show that, generalizing our running example, (\mathrm{Sym}^r E)^\circ \cong \mathrm{Sym}_r E. For this it is important to be clear on the definition of \mathrm{Sym}_r E. We define it using the right action of the symmetric group S_r by place permutation on E^{\otimes r}. This is specified on basis elements by

    (e_{i_1} \otimes \cdots \otimes e_{i_r}) \cdot \sigma = e_{i_{1\sigma^{-1}}} \otimes \cdot \otimes e_{i_{r\sigma^{-1}}}.

    (The inverse is correct for a right action of the symmetric group where permutations are composed by multiplying from left to right: it ensures that the basis element e_{i_j} in position j on the left-hand side appears in position j\sigma on the right-hand side.)

    Definition 2. Let I(d,r) = \{1,\ldots, d\}^r. We call the elements of I(d,r) multi-indices. Given \mathbf{i} \in I(d,r) we define

    e_\mathbf{i} = e_{i_1} \otimes \cdots \otimes e_{i_r} \in E^{\otimes r}


    v_\mathbf{i} = \sum_{\sigma} e_\mathbf{i} \cdot \sigma

    where the sum is over a set of coset representatives for the stabiliser of e_\mathbf{i} in the place permutation action. Then

    \mathrm{Sym}_r (V) = \langle v_\mathbf{i} : i \in I(d,r), i_1 \le \ldots \le i_r \rangle. For instance if r=3 and d = 2 then

    v_{(1,2,2)} = e_1 \otimes e_2 \otimes e_2 + e_2 \otimes e_1 \otimes e_2 + e_2 \otimes e_2 \otimes e_1.

    Suitable coset representatives for the stabiliser \langle (2,3)\rangle of e_1 \otimes e_2 \otimes e_2 in S_3 are \mathrm{id}, (1,2) and (1,3).

    Example 3. As an example of the multi-index notation, note that if X \in \mathrm{GL}_d(K) and \mathbf{j} \in I(d,r) then

    \begin{aligned} \rho_{E^{\otimes r}}(X) e_\mathbf{j} &= \sum_{i_1, \ldots, i_r \in \{1,\ldots, d\}} X_{i_1j_1} \ldots X_{i_rj_r} e_{i_1} \otimes \cdots \otimes e_{i_r} \\ &=\sum_{\mathbf{i} \in I(d,r)} X_{i_1j_1} \ldots X_{i_rj_r} e_\mathbf{i}. \end{aligned}

    Summing over coset representatives in Definition 2 ensures that each summand in v_\mathbf{i} is a different rearrangement of e_\mathbf{i} and so, no matter what the characteristic, \mathrm{Sym}_r V is the subspace of E^{\otimes r} of fixed points for the place permutation action of S_r. By this observation and the following lemma, whose proof is left as an exercise, \mathrm{Sym}_r V is a subrepresentation of V^{\otimes r}.

    Lemma 4. The right action of S_r on E^{\otimes r} commutes with the left action of \mathrm{GL}_d(K). \Box

    Proposition 5. (\mathrm{Sym}^r E)^\circ \cong \mathrm{Sym}_r E.

    Proof. Fix X \in \mathrm{GL}_d(K) and \mathbf{j} \in I(d,r). By the remark about rearrangements above, v_\mathbf{j} = \sum_{\mathbf{j}'} e_\mathbf{j'}, where, as a standing convention, the sum is over all distinct rearrangements \mathbf{j}' of \mathbf{j}. By Example 3 we have

    \rho_{E^{\otimes r}}(X) e_\mathbf{j'} = \sum_{i \in I(d,r)} X_{i_1j'_1} \ldots X_{i_rj'_r} e_\mathbf{i}.


    \rho_{\mathrm{Sym}_r}(X) v_\mathbf{j} = \sum_{\mathbf{i}\in I(d,r)} \sum_{\mathbf{j}'} X_{i_1j'_1} \ldots X_{i_rj'_r} e_\mathbf{i}.

    The right-hand side is known to be in \mathrm{Sym}_r E. Looking at the coefficient of e_\mathbf{i} where i_1 \le \ldots \le i_r we see that the coefficient of v_\mathbf{i} in the right hand side is

    \sum_{\mathbf{j}'} X_{i_1j'_1} \ldots X_{i_rj'_r}X_{i_rj'_r}.

    We now turn to (\mathrm{Sym}^r E)^\circ. Given \mathbf{i} \in I(d,r) define

    w_{\mathbf{i}} = e_{i_1} \ldots e_{i_r} \in \mathrm{Sym}^r E.

    Observe that the w_\mathbf{i} for i_1 \le \ldots \le i_r are a basis of \mathrm{Sym}^r E. Let w_\mathbf{i}^\star denote the corresponding element of the dual basis of (\mathrm{Sym}^r V)^\circ. We have

    \rho_{(\mathrm{Sym}^r E)^\circ}(X) w_\mathbf{j}^\star = [w_\mathbf{i} \mapsto w_\mathbf{j}^\star \bigl( \rho_{\mathrm{Sym}^r V}(X^t) w_\mathbf{i}].

    Since \rho_{\mathrm{Sym}^r E}(X^t) w_\mathbf{i} = \sum_\mathbf{k} X^t_{k_1i_1} \ldots X^t_{k_ri_r} w_\mathbf{k}, the coefficient of w_\mathbf{i}^\star in the image of w_\mathbf{j}^\star is

    \sum_{\mathbf{j}'} X^t_{j'_1i_1} \ldots X^t_{j'_ri_r}.

    Noting the transpose in X^t, this agrees with the coefficient of v_\mathbf{i} in the image of v_\mathbf{j}. Therefore the map v_\mathbf{k} \mapsto w_\mathbf{k}^\star defines a isomorphism of representations \mathrm{Sym}_r E \cong (\mathrm{Sym}^r E)^\circ. \Box

    The proposition is an interesting example of a `self-generalizing’ result. To be entirely honest, some human intervention is required, but it is essentially a matter of book-keeping.

    Corollary 6.
    Let V be a representation of \mathrm{GL}_d(K). Then

    (\mathrm{Sym}^r V)^\circ \cong \mathrm{Sym}_r V^\circ.

    Proof. Let D = \mathrm{dim} V. By Proposition 5, taking E = V, there is a D \times D matrix P such that

    P^{-1} \bigl(\rho_{\mathrm{Sym}^r V}(Y^t)\bigr)^t P = \rho_{\mathrm{Sym}_r(V)}(Y)

    for all matrices Y \in \mathrm{GL}_D(K). Hence, setting Y = \rho_V(X^t)^t, we have

    P^{-1} \bigl(\rho_{\mathrm{Sym}^r V}(\rho_V(X^t))\bigr)^t P = \rho_{\mathrm{Sym}_r(V)}\bigl(\rho_V(X^t)^t\bigr)

    for all matrices X \in \mathrm{GL}_d(K). In the representation (\mathrm{Sym}^r V)^\circ the matrix representing X \in \mathrm{GL}_d(K) is, by definition, \bigl(\rho_{\mathrm{Sym}^r V}(\rho_V(X^t))\bigr)^t, while in the representation \mathrm{Sym}_r V^\circ, the matrix representing X \in \mathrm{GL}_d(K) is, again by definition, \rho_{\mathrm{Sym}_r V}\bigl(\rho(X^t)^t\bigr). By the previous displayed equation, these representing matrices are conjugate by the fixed matrix P. \Box.

    Exterior powers

    Perhaps surprisingly, copying the construction of \mathrm{Sym}_r E for exterior powers does not give a new representation. Let d \ge r. We first find the matrices for \bigwedge^r E.

    Lemma 7. Let X \in \mathrm{GL}_d(K). Let Y = \rho_{\wedge^r E}(X) be the matrix representing X in its action on \bigwedge^r E, with respect to the basis of all e_{i_1} \wedge \cdots \wedge e_{i_r} where i_1 < \ldots < i_r. Then

    Y_{\mathbf{i} \mathbf{j}} = \sum_{\sigma \in S_r} X_{i_{1\sigma} j_1} \ldots X_{i_{r\sigma} j_r}.

    Proof. We have

    \begin{aligned} \rho_{\wedge^r E}(X)& (e_{j_1} \wedge \cdots \wedge e_{j_r}) \\ & =\sum_{\mathbf{i} \in I(d,r)} X_{i_1j_1} \cdots X_{i_rj_r} (e_{i_1} \wedge \cdots \wedge e_{i_r}) \\ & =\sum_{i_1 < \ldots < i_r} \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) X_{i_{1\sigma}j_1} \ldots X_{i_{r\sigma}j_r} (e_{i_1} \wedge \cdots \wedge e_{i_r}). \end{aligned}

    as required. \Box

    Given \mathbf{i} \in I(d,r), let

    u_\mathbf{i} = \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) e_\mathbf{i} \cdot \sigma.

    Observe that u_\mathbf{i} = 0 if the multi-index \mathbf{i} has two equal entries. (This holds even if \mathrm{char} K = 2.) Let \bigwedge_r E be the subspace of E^{\otimes r} with basis all u_\mathbf{i} for \mathbf{i} \in I(d,r) such that i_1 < \ldots < i_r. It follows from Lemma 4 that \bigwedge_r V is a subrepresentation of V^{\otimes r}. Let X \in \mathrm{GL}_d(K). By Example 3,

    \rho_{\wedge_r E}(X) u_\mathbf{j} = \sum_{\mathbf{i} \in I(d,r)} X_{i_1j_1} \ldots X_{i_rj_r} \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) e_\mathbf{i} \cdot \sigma.

    The right-hand side is known to be in \bigwedge_r V. The coefficient of e_\mathbf{i} where i_1 < \ldots < i_r in the right-hand side is

    \sum_{\sigma \in S_r} \mathrm{sgn}(\sigma) X_{i_{1\sigma} j_1} \ldots X_{i_{r\sigma}j_r}.

    Comparing with Lemma 7, we see that the matrices are the same. Therefore \bigwedge^r E \cong \bigwedge_r E.

    Similarly one can show that \bigwedge^r E \cong (\bigwedge^r E)^\circ. For a more conceptual proof of this, one could argue that since \bigwedge^r E is generated by the highest weight vector e_1 \wedge \cdots \wedge e_r, of weight (1,\ldots,1,0,\ldots,0), it is irreducible; over any field the irreducible polynomial representations of \mathrm{GL}_d(K) are self-dual.

    Special linear groups

    We saw above that the contragredient duality used for representations of the infinite algebraic group \mathrm{GL}_d(K) does not, in general, agree with duality as defined for representations of finite groups. There is an interesting exception to this which occurs when d = 2 and we restrict to the action of \mathrm{SL}_2(K). Write \cong_{\mathrm{SL}_2(K)} for an isomorphism that holds after this restriction. Here it is in our running example.

    Example 8. Let X = \left( \begin{matrix} \alpha & \beta \\ \gamma &\delta \end{matrix}\right). We saw in Example 1 that

    \rho_{(\mathrm{Sym}^2 E)^\star} (X) = \frac{1}{\Delta^2} \left( \begin{matrix} \delta^2 & \gamma^2 & -2\gamma\delta \\ \beta^2 & \alpha^2 & -2\alpha\beta \\ -\beta\delta & -\alpha\gamma & \alpha \delta + \beta \gamma   \end{matrix} \right).

    If X \in \mathrm{SL}_2(K) then \Delta = 1. Changing the basis of (\mathrm{Sym^2}E)^\star by swapping the first and second basis vectors and negating the third we get an isomorphic representation \rho' in which

    \rho'(X) = \left( \begin{matrix} \alpha & \beta^2 & 2\alpha\beta \\ \gamma^2 & \delta^2 &  2\gamma\delta\\ \alpha\gamma & \beta\delta & \alpha \delta + \beta \gamma   \end{matrix} \right).

    This agrees with the matrix representing X in its action on (\mathrm{Sym}^2 E)^\circ \cong (\mathrm{Sym}_2 E). Therefore (\mathrm{Sym}^2 E)^\star \cong_{\mathrm{SL}_2(K)} (\mathrm{Sym}^2 E)^\circ.

    More generally we have the following proposition.

    Proposition 9. Let V be a representation of \mathrm{GL}_2(K). Then V^\star \cong_{\mathrm{SL}_2(K)} V^\circ.

    Proof. Let J = \left( \begin{matrix} 0 & 1 \\ -1 & 0 \end{matrix} \right). By the following calculation

    \begin{aligned} J \left(\begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) J^{-1} &= \left( \begin{matrix} 0 & 1 \\ -1 & 0 \end{matrix}\right)  \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix}\right)^{-1} \left( \begin{matrix} 0 & -1 \\ 1 & 0 \end{matrix} \right) \\ &= \left( \begin{matrix} 0 & 1 \\ -1 & 0 \end{matrix}\right) \left( \begin{matrix} \delta & -\beta \\ -\gamma & \alpha \end{matrix} \right) \left( \begin{matrix} 0 & -1 \\ 1 & 0 \end{matrix} \right) \\ &= \left( \begin{matrix} -\gamma  & \alpha \\  \delta & \beta \end{matrix} \right) \left( \begin{matrix} 0 & -1 \\ 1 & 0 \end{matrix} \right) \\ &= \left( \begin{matrix} \alpha & \gamma \\ \beta & \delta \end{matrix}\right), \end{aligned}

    for any matrix X, the matrices X^{-1} and X^t are conjugate by the fixed matrix J. Hence so are \rho_V(X^{-1}) and \rho_{V}(X^t), and since transposition preserves conjugacy, so are \rho_V(X^{-1})^t and \rho_V(X^t)^t. The proposition follows. \Box.


    If K has characteristic zero then it is known that

    \mathrm{Sym}^2 \mathrm{Sym}^n E \cong_{\mathrm{SL}_2(K)} \bigwedge^2 \mathrm{Sym}^{n+1} E.

    Example 11. We shall prove that when n = 1, this isomorphism holds for arbitrary K, provided that one side is replaced with its contragredient dual. By Lemma 7, given a matrix X \in \mathrm{GL}_3(K) acting on a basis z_1, z_2, z_3, the matrix \bigwedge^2 X representing X in its action on \langle z_1 \wedge z_2, z_1 \wedge z_3, z_2 \wedge z_3 \rangle with respect to the indicated basis is Y where

    Y_{(i_1,i_2)(j_1,j_2)} = X_{i_1j_1}X_{i_2j_2}-X_{i_2j_1}X_{i_1j_2}.

    Applying this result when X acts on \mathrm{Sym}^2 E with respect to the basis e_1^2, e_2^2, e_1e_2, and setting \Delta = (\alpha \delta - \beta \gamma) and \Gamma = \alpha\delta  + \beta\gamma, we find that

    \begin{aligned} \rho_{\wedge^2 \mathrm{Sym}^2 E}&\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix}\right) \\  &= \left( \begin{matrix} \alpha^2 \delta^2 - \beta^2\gamma^2 & \alpha^2 \gamma \delta - \gamma^2 \alpha\beta & \beta^2\gamma\delta - \delta^2 \alpha \beta \\ 2\alpha^2 \beta\delta - 2\alpha\gamma \beta^2 & \alpha^2 \Gamma - 2\alpha^2\gamma \beta & \beta^2 \Gamma - 2\beta^2\delta \alpha \\ 2\gamma^2 \beta\delta - 2\alpha\gamma \delta^2 & \gamma^2 \Gamma  - 2\alpha\gamma^2\delta & \delta^2 \Gamma - 2\beta\delta^2 \gamma \end{matrix} \right) \\ &= \left( \begin{matrix} \Delta (\alpha \delta + \beta \gamma) & \Delta \alpha\gamma & -\Delta \beta\delta \\ 2 \Delta \alpha \beta & \Delta \alpha^2 & -\Delta \beta^2 \\ -2\Delta \gamma\delta & -\Delta\gamma^2 &  \Delta \delta^2 \end{matrix} \right). \end{aligned}

    Now use that \Delta = 1 for matrices in \mathrm{SL}_2(K) and change basis by negating the third basis element to get the conjugate matrix on the left below

    \left(\begin{matrix} \alpha \delta + \beta \gamma & \alpha\gamma & \beta\delta \\ 2  \alpha \beta &  \alpha^2 & \beta^2 \\ 2 \gamma\delta & \gamma^2 &  \delta^2 \end{matrix} \right) \sim \left(\begin{matrix} \alpha^2 & \beta^2 & 2  \alpha \beta  \\  \gamma^2 &  \delta^2 & 2 \gamma\delta \\ \alpha\gamma & \beta\delta & \alpha \delta + \beta \gamma \end{matrix} \right).

    The matrix on the right is its conjugate by a permutation of the basis. We saw above in the example of contragredient duality that

    \rho_{(\mathrm{Sym}^2 E)^\circ}\left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) = \left( \begin{matrix} \alpha^2 & \beta^2 & 2\alpha\beta \\  \gamma^2 & \delta^2 & 2\gamma\delta \\ \alpha\gamma & \beta\delta &  \alpha \delta + \beta \gamma \end{matrix} \right).

    The matrices agree and so (\mathrm{Sym}^2 E)^\circ \cong \bigwedge^2 \mathrm{Sym}^2 E. This isomorphism can be recast in various ways, for example, \mathrm{Sym}_2 E \cong \bigwedge^2 \mathrm{Sym}^2 E or \mathrm{Sym}^2 E \cong \bigwedge^2 \mathrm{Sym}_2 E.

    Finite groups

    Let \mathrm{char} K = p. By Proposition 9 any isomorphism of \mathrm{SL}_2(K)-representations restricts to an isomorphism of \mathrm{SL}_2(\mathbb{F}_{p^m})-representations, provided we systematically replace contragredient duality \circ with conventional duality \star. For instance, Example 10 implies that

    (\mathrm{Sym}^2 E)^\star \cong_{\mathrm{SL}_2(\mathbb{F}_{p^m})} \bigwedge^2 \mathrm{Sym}^2 E.

    Here is a warning example showing that some features of the infinite case are not preserved by restriction.

    Example 11. Let E = \mathbb{F}_2^2. We have

    \mathrm{GL}_2(\mathbb{F}_2) = \mathrm{SL}_2(\mathbb{F}_2) = \Bigl\langle \left(\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right), \left( \begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) \Bigr\rangle

    and, in the usual basis e_1^2, e_2^2, e_1e_2 of \mathrm{Sym}^2 E,

    \begin{aligned}\rho_{\mathrm{Sym^2} E} \left( \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 &  1 \end{matrix} \right) \\ \rho_{\mathrm{Sym^2} E} \left( \begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 1 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix} \right) .\end{aligned}

    Observe that the sums of the rows of both 3 \times 3-matrices are constantly 1. Therefore in the alternative basis e_1^2, e_2^2, e_1^2+e_2^2 + e_1e_2 we have

    \begin{aligned}\rho'_{\mathrm{Sym^2} E} \left( \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 &  1 \end{matrix} \right) \\ \rho'_{\mathrm{Sym^2} E} \left( \begin{matrix} 1 & 1 \\ 1 & 0 \end{matrix} \right) &= \left( \begin{matrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix} \right) .\end{aligned}

    Hence \mathrm{Sym}^2 E \cong \mathbb{F}_2 \oplus E is reducible and semisimple, as claimed in the answer to the final quiz question.

    To end here is a problem combining ordinary and contragredient duality, to which I do not immediately have a complete solution.

    Problem 12. Let G be a finite subgroup of \mathrm{GL}_d(K) and let \rho : G \rightarrow \mathrm{GL}(V) be a representation of G. Let V^\circ be the representation defined by \rho^\circ(X) = \rho(X^t)^t. What is the relationship between V, V^\circ and V^\star?

    For instance, if K is a subfield of the real numbers then there is an orthogonal form on K^d invariant under G. (This is part of the standard proof of Maschke’s Theorem.) Hence in this case we may assume that G is a subgroup of the orthogonal group, and so X^t = X^{-1} for all X \in G. It follows that

    \rho^\circ(X) = \rho(X^t)^t = \rho(X^{-1})^t = \rho^\star(X)

    for all X \in G, and so V \cong V^\circ \cong V^\star. The example \mathrm{Sym}^2 E, where E is the natural representation of \mathrm{SL}_2(\mathbb{F}_{2^m}) and m \ge 2, has V \not\cong V^\circ \cong V^\star. It is also possible to have V \cong V^\circ \not\cong V^\star. For example, this is clearly the case if G is a finite subgroup of \mathrm{GL}_1(\mathbb{C}) not contained in \{ \pm 1\} and V is the natural 1-dimensional representation of G.

    Meet-in-the-middle attacks on block ciphers

    November 25, 2018

    A mathematical model for a block cipher is a collection \mathcal{E} of invertible functions e_k : \mathbb{F}_2^n \rightarrow \mathbb{F}_2^n indexed by keys k \in \mathbb{F}_2^\ell. We say that the block cipher \mathcal{E} has block size n and key length \ell. For example, DES has n=64 and \ell = 56, while its successor AES has n=128 and \ell = 128 (or 192 or 256 for the super-paranoid).

    Let \mathcal{E} and \mathcal{E}' be block ciphers of block size n and key lengths \ell and \ell', respectively. An easy way to boost the key length to \ell + \ell', while keeping essentially the same cryptologics, is to compose the encryption functions from the two ciphers. Thus for (k,k') \in \mathbb{F}_2^\ell \times \mathbb{F}_2^{\ell'} we define

    e_{(k,k')}(x) = e'_{k'}\bigl(e_k(x)\bigr).

    Perhaps surprisingly, when \ell, \ell' \le n, this cryptoscheme offers only \mathrm{max}(\ell,\ell')+1 bits of security, rather than the expected \ell + \ell'. The purpose of this post is to give a fairly precise costing of the meet-in-the-middle attack behind this.

    As a notational convention, \star denotes quantities unknown to the cryptanalyst and \dagger quantities chosen or observed by the cryptanalyst. We cost by the number of encryption/decryption operations, or cryptops, justifying this assumption in the course of the argument.

    Let (k_\star,k'_\star) be the (unknown) key. Let x_\dagger be a chosen plaintext and let z_\dagger = e_{(k',k)}(x_\dagger) be its observed encryption. (We write e_{(k'_\star,k_\star)}(x_\dagger) rather than e'_{k'_\star}\bigl(e_{k_\star}(x_\dagger)\bigr) to emphasise that in the chosen plaintext model, we can perform two-stage encryption using the unknown key pair (k_\star,k'_\star), but not each stage separately.) We denote by d'_{k'} the inverse of the encryption function e'_{k'}. Let

    \begin{aligned} E &= \{ \bigl(k, e_k(x_\dagger)\bigr) : k \in \mathbb{F}_2^\ell \} \\ D & = \{ \bigl(k', d'_{k'}(z_\dagger) \bigr) : k' \in \mathbb{F}_2^\ell \} \end{aligned}

    Thus |E| = 2^\ell, |D| = 2^{\ell'} and 2^\ell + 2^{\ell'} cryptops are required to get this far.

    Using any reasonable algorithm, sorting E by the second entry in each pair takes at most 2^\ell \ell s primitive operations for some smallish s. Similarly D is sorted using 2^{\ell'} \ell' s primitive operations. Working through the sorted lists we pair up the (k,y) \in E with the (k',y) \in D, for each y \in \mathbb{F}_2^n. In one possible implementation this makes a list of pointers to the relevant members of D and E for each y \in \mathbb{F}_2^n, so the cost in primitive operations is

    (|E|+|D|)b = (2^\ell + 2^{\ell'}) b

    for some small b. Since each cryptop for \mathcal{E} must produce n bits, and similarly for \mathcal{E}', and typically n \ge \ell, it seems reasonable to assume that the cost of 2^\ell + 2^{\ell'} cryptops is significantly more than the cost of this sorting and pairing up.

    We now have the sets

    \begin{aligned} \mathcal{K}_y &= \{ k \in \mathbb{F}_2^\ell : e_{k}(x_\dagger) = y \} \\ \mathcal{K}'_y &= \{ k' \in \mathbb{F}_2^\ell : d'_{k'}(z_\dagger) = y \} \end{aligned}

    for each y \in \mathbb{F}_2^n. Observe that \mathcal{K}_y \times \mathcal{K}'_y is those key pairs (for the composed cryptosystem) that meet-in-the-middle at y. Fix p distinct plaintexts

    X_\dagger^{(1)}, \ldots, X_\dagger^{(p)} \in \mathbb{F}_2^n \backslash \{x_\dagger\}.

    For each y \in \mathbb{F}_2^n we encrypt X_\dagger^{(1)}, \ldots, X_\dagger^{(p)} using the guessed key pair (k,k') \in \mathcal{K}_y \times \mathcal{K}'_{y}, and check if

    e'_{k'}\bigl(e_{k}(X_\dagger^{(i)})\bigr) = (e_{k'_\star} \circ e_{k_\star})(X^{(i)}_\dagger)

    for each i. If all p encryptions agree, we assume that (k_\star,k'_\star) =  (k,k'). (But for simplicity, let’s say we carry on checking all the other keys.) The number of pairs to be checked in the final step is

    M = \sum_{y \in \mathbb{F}_2^n} |\mathcal{K}_y| |\mathcal{K}'_y|

    and each pair requires two encryptions, making the cost of the final step 2Mp cryptops. The total cost in cryptops is therefore 2^{\ell+\ell'} + 2Mp.

    We now estimate 2^{\ell+\ell'} + 2Mp for the random cipher, and use this as an approximation to DES and to AES.

    Random cipher

    Suppose that each e_k : \mathbb{F}_2^n \rightarrow \mathbb{F}_2^n and e'_{k'} : \mathbb{F}_2^n \rightarrow \mathbb{F}_2^n is chosen uniformly and independently at random from the (2^n)! permutations of \mathbb{F}_2^n. We know that e_{k_\star}(x_\dagger) = y_\dagger and for each other key k, the probability is \frac{1}{2^n} that e_k(x_\dagger) = y_\dagger. Dually, we know that d'_{k'_\star}(z_\dagger) = y_\dagger, and for each other key k', the probability is \frac{1}{2^n} that d'_{k'}(z_\dagger) = y_\dagger. Therefore |\mathcal{K}_{y_\dagger}| and |\mathcal{K'}_{y_\dagger}| are independently distributed as the shifted binomial distribution

    1 + \mathrm{Bin}(2^\ell-1, \frac{1}{2^n}).

    Similarly, if y \not= y_\dagger then |\mathcal{K}|_y and |\mathcal{K}|_y are independently distributed as \mathrm{Bin}(2^\ell-1, \frac{1}{2^n}). (It is worth noting that \mathcal{K}_{y} and \mathcal{K}_{y'} are not independent: for example we have \cup_{y \in \mathbb{F}_2^n} \mathcal{K}_y = \mathbb{F}_2^\ell where the union is disjoint, therefore \sum_{y \in \mathbb{F}_2^n} |\mathcal{K}_y|  = 2^\ell.) It follows by linearity of expectation that

    \begin{aligned} \mathbb{E}[M] &= \mathbb{E}\bigl[\sum_{y \in \mathbb{F}_2^\ell} |\mathcal{K}_y| |\mathcal{K}'_y|\bigr] \\ &= \bigl( 1 + \frac{2^\ell-1}{2^n} \bigr)\bigl( 1 + \frac{2^{\ell'}-1}{2^n} \bigr) + (2^n-1) \bigl( \frac{2^\ell-1}{2^n} \bigr)\bigl( \frac{2^{\ell'}-1}{2^n} \bigr)  \\ &= \frac{2^{\ell+\ell'}}{2^n} + 1 -  \frac{1}{2^{n}} \end{aligned}

    This is very well approximated by \frac{2^{\ell + \ell'}}{2^n} + 1. Hence the total cost is estimated as

    2^{\ell} + 2^{\ell'} + 2p \bigl( \frac{2^{\ell + \ell'}}{2^n} + 1\bigr).

    It remains to choose a suitable value for p. By our randomness assumption, e'_{k'_\star} \circ e_{k_\star} and e'_{k'} \circ e_k are each permutations of \mathbb{F}_2^n chosen uniformly at random subject to the constraint f(x_\dagger) = z_\dagger. Thus for each chosen plaintext X_\dagger \not= x_\dagger, the ciphertexts (e'_{k'_\star} \circ e_{k_\star})(X_\dagger) and (e'_{k'} \circ e_k)(X_\dagger) are uniformly distributed on \mathbb{F}_2^n \backslash \{z_\dagger\}. The probability they agree is therefore 1/(2^n-1). More generally, the probability that e'_{k'}\bigl(e_{k}(X_\dagger^{(i)})\bigr) = (e'_{k'_\star} \circ e_{k_\star})(X^{(i)}_\dagger) for all of the p chosen X_\dagger^{(i)} is

    \displaystyle \frac{1}{2^n-1} \times \ldots \times \frac{1}{2^n-p} \approx \frac{1}{2^{np}}.

    We have to check M key pairs, so the expected number of ‘false’ keys is very nearly \mathbb{E}\bigl[\frac{M}{2^{np}}\bigr]; by the good approximation for M above, this is very nearly

    \displaystyle \frac{2^{\ell+\ell'}}{2^{(1+p)n}} + \frac{1}{2^{np}}.

    Suppose for simplicity that \ell = \ell'. Setting n = \nu \ell, our estimate for the total work is then

    2^{\ell}+2^{\ell'} + 2p\bigl(  2^{(2-\nu)\ell} + 1 \bigr)

    cryptops, where p should be chosen so that \nu(1+p) is large compared to 2. In particular, if \nu is very large then, as one would expect, then one can take p=0: no checking cryptops are needed.

    DES as a random cipher

    For DES, \ell = 56, n = 64, so \nu = \frac{8}{7}, (2-\nu)\ell = \frac{6}{7} \times 56 = 48 and we need \frac{8}{7}(1+p) large compared to 2. Taking p = 2, the total work is 2^{57} + 2^{50}. Clearly for any reasonable choice of p, the cost of the attack is dominated by the first phase, and is about 2^{57} cryptops. Even in 2009, using a special purpose device, this took at most 2 days.

    AES as a random cipher

    For AES with the shortest key length, \ell = n = 128, so \nu = 1 and we need 2p large compared to 2. Again taking p=2, the total work is 2^{129} + 2^{130}, roughly balanced between the first and second phases. For AES with the longest key length, \ell = 256 and n=128, so \nu = \frac{1}{2} and we need (1+p)/2 large compared to 2. Taking p=5, the total work is 2^{257} + 10 \times 2^{378}. The second phase dominates. Of course neither attack is practical.

    A cipher chosen to have many collisions in the middle

    Let \mathcal{K} = \mathcal{K}' =  \mathbb{F}_{2^\ell} \times \mathbb{F}_2^m and define e_{(a,b)} : \mathbb{F}_{2^\ell} \rightarrow \mathbb{F}_{2^\ell} by e_{(a,b)}(x) = ax. (This is multiplication in the finite field: we will ignore that the encryption functions e_{(0,b)} are not injective.) Define e'_{(a',b')} : \mathbb{F}_2^{\ell} \rightarrow \mathbb{F}_{2^\ell} by e'_{(a',b')}(x) = x+a'. The set \mathcal{K}_y \times \mathcal{K}'_{y} of keys that meet-in-the-middle at y \in \mathbb{F}_2^\ell is

    \bigl\{\bigl((y/x^\dagger,u), (y+z^\dagger, v) \bigr) : u, v \in \mathbb{F}_2^m \bigr\}.

    Again we ignore the vanishing small chance that x^\dagger = 0. Hence

    |M| = \sum_{y \in \mathbb{F}_2^\ell} |\mathcal{K}_y||\mathcal{K}'_y| = \sum_{y \in \mathbb{F}_2^\ell} 2^{2m} = 2^{\ell+2m}.

    The estimate for the cost is

    2^{2\ell} + 2Mp = 2^{2\ell} + 2^{\ell+2m+1}p.

    Using the heuristic for p from the random cipher, we have \nu = \ell/n = 1, so we need 1+p large compared to 2. Taking p=2 the total cost is 2^{2\ell} + 2^{\ell+2m+2}. In this case the chance is 1/2^{2m} that the true key is found, but since the second components of each half of the encryption key are irrelevant, this will not worry the attacker. One reason for not using e_{(a,b)}(x) = x+a is that the composed encryption functions for keys \bigl((a,b),(a',b')) are then the same whenever a+a' = k, giving a further collapse. For the composed cipher as defined, since the affine group on \mathbb{F}_{2^\ell} is sharply 2-transitive, we could in fact have taken p=1.

    An algebraic view of secret sharing

    October 17, 2018

    As last year I’m lecturing our 3rd year / 4th year / M.Sc. course on cipher systems. As part of the M.Sc. course I cover the Shamir Secret Sharing Scheme. Here is Shamir’s idea generalized to arbitrary commutative rings, with an application to secret sharing in a hierarchical organization. The idea is very obvious, so no originality is expected or claimed.

    For a practical example, imagine that I have a critically important 20GB file. Using a secret-sharing scheme with threshold 3, I issue shares of the file (each most probably still of size 20GB) to Amazon, Degoo (who promise a ‘top secret cloud drive’) Dropbox and Microsoft. If any single provider goes out of business, or maliciously corrupts the share, the file can still be reconstructed from the remaining three shares. Depending on how much I trust them (and the computational resources available on the various platforms), the reconstruction could be done in the cloud, or on my own machine. In the latter case, a provider can learn the secret only if three of them collude by pooling their shares: this would presumably be a serious breach of all their terms of service.

    Mathematical formulation

    Fix n \in \mathbb{N} and let 1 \le t \le n. Fix a commutative ring R and distinct ideals I, J_1, \ldots, J_n of R. The secret is an element of R/I. We suppose that R has a subset S such that

    1. if A \subseteq \{1, \ldots, n\} is a t-set then the composition S \hookrightarrow R \rightarrow \prod_{i \in A} R/J_i is injective;
    2. if A \subseteq \{1, \ldots, n\} is a (t-1)-set then the composition S \hookrightarrow R \rightarrow R/I \times \prod_{i \in A} R/J_i is surjective.

    In particular (2) implies that S \hookrightarrow R \twoheadrightarrow R/I is surjective.

    To share a secret s + I \in R/I, pick r \in S such that r \equiv s mod I; the share for algebraist i is then r + J_i \in R/J_i. By hypothesis, for each A \subseteq \{1,\ldots, n\} such that |A| = t, the unique element of R congruent to r modulo J_i for each i \in A is r itself. Therefore any t (or more) algebraists can pool their shares and determine r, and hence s = r+I.

    Shamir’s Scheme

    Fix a prime p and distinct c_1, \ldots, c_n \in \mathbb{F}_p. Take R = \mathbb{F}_p[X], I = \langle X \rangle, and J_i = \langle X - c_i \rangle and S = \{ r \in R : \mathrm{deg} f < t \}. (Thus the secret is r(0), and the share for algebraist i is r(c_i).) Property (1) holds because a polynomial of degree \le t-1 is uniquely determined by its values at the t distinct points c_i for i \in A. Moreover, since the map

    \displaystyle S \rightarrow R/I \times \prod_{i \in A} R/J_i \cong \mathbb{F}_p \times \mathbb{F}_p^{t-1}

    is a linear isomorphism whenever |A| = t-1, we have (2) in the strongest possible form.

    Integer version

    For another special case, fix a prime p, a parameter c \ge 2 (of which more later) and consecutive primes q_1 \le \ldots \le q_n with cp \le q_1. Let Q = q_1 \ldots q_t. Take R = \mathbb{Z}, I = p\mathbb{Z}, J_i = q_i \mathbb{Z} and

    S = \{0,1,\ldots, Q-1\}.

    Since each r \in S is uniquely determined by its residues modulo any t of the q_i, we have (1). Suppose the t-1 algebraists numbered a_1, \ldots, a_{t-1} pool their shares. They can learn r modulo Q' = q_{a_1} \ldots q_{a_{t-1}}. Now S contains \lfloor Q/Q' \rfloor ‘candidate secrets’ of the form r+ b Q' where b \in \mathbb{Z}. Provided c is sufficiently large,

    \displaystyle \frac{Q}{Q'} = \frac{q_1 \ldots q_t}{q_{a_1} \ldots q_{a_{t-1}}} \ge cp\frac{q_2 \ldots q_t}{q_{n-t+2} \ldots q_n} \ge p.

    Therefore the ‘candidate secrets’ cover all residue classes modulo p, and so (2) holds. Moreover, the sizes of the fibres over each (r^\star + I, r + J_{a_1}, \ldots, r+J_{a_{t-1}}) vary by at most 1 with r^\star, and have mean size Q / p q_{a_1} \ldots q_{a_{t-1}}. Provided c is large, all the primes q_i have about the same size, and the mean fibre size is approximately c.


    Provided c is large, this variation in the fibre leaks very little information. For small c is it a problem. For example, take n=4, t=3, c = 2, p = 11 and q_1 = 23, q_2 = 29, q_3 = 31, q_4 = 37. Thus Q = q_1q_2q_3 = 20677. For the secret s we choose 7, lifting it to

    r = 11161 \in \{0,\ldots, Q-1\}.

    The shares are then 6, 25, 1, 24. If algebraists 3 and 4 cooperate they can learn r modulo 31 \times 27. This leaves

    \lfloor \frac{Q}{31 \times 37} \rfloor  = 18

    possibilities for r. The mean fibre size is 18 / 11 and since 18 = 2 \times 7 + 1 \times 4, there are 7 fibres of size 2 and 4 of size 1 (namely those for 1 + 11\mathbb{Z}, 4 + 11\mathbb{Z}, 7 + 11\mathbb{Z} and 10 + 11\mathbb{Z}). Choosing one of these at random gives them a 1/4 chance of guessing the secret.

    Choosing a small fibre is a good heuristic, but not infallible: it is possible that the secret corresponds to a large fibre—for example if algebraists 1 and 2 pool their shares then the fibres for 2 + 11 \mathbb{Z} and 9 + 11\mathbb{Z} have size 2, and the rest have size 3.


    What other rings are amenable to the generalized Shamir scheme?

    Secret sharing in a hierarchy

    One possible answer to this question is ‘multivariable polynomial rings’. Here is an idea along these lines.

    Let R = \mathbb{F}_p[X,Y]. Fix m, n \in \mathbb{N} and thresholds t, u with t \le m and u \le n. Let S be the set of polynomials f \in R with \mathrm{deg}_X f < t and \mathrm{deg}_Y f < u. Fix distinct non-zero c_1, \ldots, c_m \in \mathbb{F}_p and distinct d_1, \ldots, d_n \in \mathbb{F}_p. The aim is to share a secret s \in \mathbb{F}_p between teams T_1, \ldots, T_n where team T_j consists of people P_{1j}, \ldots, p_{mj}. For this, choose f \in S such that f(0,0) = s. For each j \in \{1,\ldots, n\}, let f_j(X) = f(X, d_j). For each j \in \{1,\ldots, n\} and each i \in \{1,\ldots, m\}, let s_{ij} = f_j(c_i) = f(c_i,d_j). Issue s_{ij} as the share to person P_{ij}.

    The shares s_{ij} are the shares in a normal Shamir Secret Sharing Scheme for f_j(0). By the linear isomorphism in the first example above, when t people in the team cooperate to learn f_j(0), they in fact learn f_j itself. Suppose that u teams learn f_{b_1}, \ldots, f_{b_u}. Then since \mathrm{deg}_Y f < u, they can cooperate to learn f, and hence f(0,0). Given a (u-1)-subset B of \{1,\ldots, n\}, when the teams numbered in B pool their shares they learn nothing useful, since for each h^\star \in \mathbb{F}[X], there is a unique polynomial f^\star with \mathrm{deg}_Y < u such that f(X, 0) = h^\star(X) and f(X, d_j) = f_j(X) for each j \in B. Explicitly,

    \displaystyle f^\star = h^\star(X) \prod_{j \in B} \frac{Y-d_k}{-d_k}+ \sum_{j \in B}Y \prod_{k \in B \atop k \not=j}\frac{Y-d_k}{d_j-d_k} f_j(X).

    This polynomial is consistent with the secret being h^\star (0); since h^\star(0) takes each value in \mathbb{F}_p equally often as h^\star(X) varies over the polynomials in \mathbb{F}_p[X] of degree at most t-1 no useful information is learned.

    It should be admitted that one can arrive at essentially the same situation, more simply, by sharing the secret between n teams, and then sharing each 'team-share' between the m people in each team, each time using the normal Shamir Scheme.

    Permutation modules and the simple group of order 168

    June 26, 2018

    In an earlier post I decomposed the permutation module for \mathrm{GL}_3(\mathbb{F}_2) acting on the 7 non-zero lines in \mathbb{F}_2^3, with coefficients in a field of characteristic 2. Some related ideas give the shortest proof I know of the exceptional isomorphism between \mathrm{PSL}_2(\mathbb{F}_7) and \mathrm{GL}_3(\mathbb{F}_2). The motivation section below explains why what we do ‘must work’, but is not logically necessary.


    Let h \in \mathrm{GL}_3(\mathbb{F}_2) have order 7 and let

    H  = \mathrm{N}_{\mathrm{GL}_3(\mathbb{F}_2)}(\langle h\rangle) \cong C_7 \rtimes C_3.

    The simple modules for \mathrm{GL}_3(\mathbb{F}_2) in characteristic 2 are the trivial module, the natural module E, its dual E^\star, and the reduction modulo 2 of a module affording the 8-dimensional character induced from either of the two non-trivial linear characters of H. (A short calculation with Mackey’s Formula and Frobenius Reciprocity shows the module is irreducible in characteristic zero; then since 168/8 is odd, it reduces modulo 2 to a simple projective. Alternatively one can work over \mathbb{F}_4 and perform the analogue of the characteristic zero construction directly.) The permutation module for \mathrm{GL}_3(\mathbb{F}_2) acting on the cosets of H, defined over \mathbb{F}_2, is projective. Since it has the trivial module in its socle and top, the only possible Loewy structures are

    \begin{matrix} \mathbb{F}_2 \\ E \oplus E^\star \\ \mathbb{F}_2 \end{matrix}, \quad \begin{matrix} \mathbb{F}_2 \\ E \\ E^\star \\ \mathbb{F}_2 \end{matrix}, \quad \begin{matrix} \mathbb{F}_2 \\ E^\star \\ E \\ \mathbb{F}_2. \end{matrix}.

    In every case, the permutation module contains a 4-dimensional \mathbb{F}_2\mathrm{GL}_3(\mathbb{F}_2)-submodule U having either E or E^\star as a top composition factor. Below we construct U as a module for \mathrm{PSL}_2(\mathbb{F}_7) and hence obtain the exceptional isomorphism in the form \mathrm{PSL}_2(\mathbb{F}_7) \cong \mathrm{GL}(E).

    In fact the final two Loewy structures can be ruled out because they are asymmetric with respect to E and E^\star, and so paired under the outer automorphism of \mathrm{GL}_3(\mathbb{F}_2). Thus if one exists then so does the other, and there would be two different projective covers of the trivial module (impossible).


    Let G = \mathrm{PSL}_2(\mathbb{F}_7) regarded as a permutation group acting on the 8 non-zero lines in \mathbb{F}_7^2. For j \in \{0,1,\ldots, 6\}, let j denote the line through (j,1), and let \infty denote the line through (1,0). Let M = \langle e_j : j \in \{0,1,\ldots, 6\} \cup \{\infty\} \rangle_{\mathbb{F}_2} be the corresponding \mathbb{F}_2G-permutation module.

    Let h = (0, 1, \dots, 6) \in G, corresponding to the Möbius map x \mapsto x + 1 and to the matrix \left( \begin{matrix} 1 & 1 \\ 0 & 1 \end{matrix} \right) in \mathrm{SL}_2(\mathbb{F}_7). The simple modules for \mathbb{F}_2\langle h \rangle correspond to the factors of

    x^7 + 1 = (x+1)(x^3+x+1)(x^3+x^2+1).

    Since \{1,2,4\} \le \mathbb{F}_7^\times is permuted by squaring, an idempotent killing the simple modules with minimal polynomial x^3 + x + 1 is h^4 + h^2 + h = (h^3+h^2+1)h. Therefore the module U we require has codimension 1 in the 8-3 = 5-dimensional module

    M(h + h^2 + h^4) = \langle e_\infty \rangle_{\mathbb{F}_2} \oplus \left\langle \begin{matrix} e_0 + e_1 + e_3 \\ e_1 + e_2 + e_4 \\ e_2 + e_3 + e_5 \\ e_3 + e_4 + e_6 \end{matrix} \right\rangle_{\mathbb{F}_2}

    Here (e_3+e_4+e_6)h = e_4+e_5 + e_0 is the sum of the first three basis vectors, so in its right action, h is represented by

    \left( \begin{matrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{matrix}\right).

    By symmetry e_\infty must appear in some vector in the \mathbb{F}_2G-module U we require. It therefore seems a plausible guess that U we require is generated by u_0 where

    u_0 = e_\infty + e_0 + e_1 + e_3

    and so U has basis u_0, u_1, u_2, u_3 where u_i = u_0 h^i for each i.

    Let t = (1,2,4)(3,5,6) \in \mathrm{N}_G(\langle h \rangle) and let

    s = (0,\infty)(1,3)(2,5)(4,6) \in G,

    chosen so that h^t = h^2 and t^s = t^2. The corresponding Möbius maps are x \mapsto 2x and x \mapsto 3/x and one choice of corresponding matrices in \mathrm{SL}_2(\mathbb{F}_7) is \left( \begin{matrix} 4 & 0 \\ 0 & 2 \end{matrix} \right) and \left(\begin{matrix} 0 & 2 \\ 3 & 0 \end{matrix}\right). Extending the calculation for h above shows that in their (right) action on U,

    h \mapsto \left( \begin{matrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{matrix}\right), t \mapsto \left( \begin{matrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 1 \end{matrix}\right), s \mapsto \left( \begin{matrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{matrix}\right).

    The unique trivial submodule of U is spanned by

    u_0 + u_2 + u_3 = e_0 + e_1 + \cdots + e_6 + e_\infty.

    Quotienting out, we obtain the 3-dimensional representation of G where

    h \mapsto \left( \begin{matrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 1 \end{matrix} \right), \; t \mapsto \left( \begin{matrix} 0 & 1 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{matrix}\right), \; s \mapsto \left( \begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{matrix} \right)

    Since \mathrm{PSL}_2(\mathbb{F}_7) has cyclic Sylow 3-subgroups and cyclic Sylow 7-subgroups, and a unique conjugacy class of elements of order 2 (which contains s), one can see just from the matrices above that the representation is faithful. Comparing orders, we get \mathrm{PSL}_2(\mathbb{F}_7) \cong \mathrm{GL}_3(\mathbb{F}_2).