## Actions in invariant theory

A basic problem in invariant theory is to find the invariants for the action of $\mathrm{GL}_2(\mathbb{C})$ on binary forms of a given degree. For example, an invariant of the quadratic binary form $ax^2+2bxy+cy^2$ is the discriminant $b^2-ac$.

The purpose of this post is to do all the tedious book-keeping needed to explain, in modern language, the sense in which $b^2-ac$ is invariant, keeping careful track of all the various actions. We will also prove that $b^2-ac$ is, in a sense to be made precise, the unique invariant of $ax^2+2bxy + cy^2$.

### Actions on $V$ and on $V^\star$

Throughout let $g = \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) \in \mathrm{GL}_2(\mathbb{C})$. Let $V = \mathbb{C}^2$, thought of as a set of column vectors, acted on by $\mathrm{GL}_2(\mathbb{C})$ in the natural way:

$g \cdot \left( \begin{matrix} x \\ y \end{matrix} \right) = \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) \left( \begin{matrix} x \\ y \end{matrix} \right) = \left( \begin{matrix} \alpha x + \beta y \\ \gamma x + \delta y \end{matrix} \right).$

Let $V^\star$ be the dual space to $V$, thought of as a set of row vectors. Then $\mathrm{GL}_2(\mathbb{C})$ acts on $V^\star$ by

$g \cdot \left( \begin{matrix} a & b \end{matrix} \right) = \left( \begin{matrix} a & b \end{matrix} \right) \left( \begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right)^{-1} = \left( \begin{matrix} a \delta - b \gamma & -a \beta + b \alpha \end{matrix} \right)\frac{1}{\alpha\delta-\beta\gamma}.$

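More briefly, we may write $g \cdot \theta = \theta g^{-1}$, for $\theta \in V^\star$. Note that the inverse is needed to make this a left action: without it, $(gh) \cdot \theta$ would be $\theta g h$, which is $h$ applied after $g$ rather than $g$ applied after $h$.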
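The role of the inverse can be checked numerically. The following is a minimal sketch in exact arithmetic; the helper names `mat_mul`, `mat_inv`, `act_dual` and `naive` are illustrative choices, not notation from the post.

```python
from fractions import Fraction

def mat_mul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(g):
    """Inverse of an invertible 2x2 matrix, using exact fractions."""
    (al, be), (ga, de) = g
    det = Fraction(al * de - be * ga)
    return [[de / det, -be / det], [-ga / det, al / det]]

def act_dual(g, theta):
    """g . theta = theta g^{-1} for a row vector theta = [a, b]."""
    inv = mat_inv(g)
    return [theta[0] * inv[0][j] + theta[1] * inv[1][j] for j in range(2)]

def naive(g, theta):
    """theta -> theta g, i.e. the same rule with the inverse left out."""
    return [theta[0] * g[0][j] + theta[1] * g[1][j] for j in range(2)]

g = [[2, 1], [1, 1]]
h = [[1, 3], [0, 1]]
theta = [5, 7]

# With the inverse, (gh) . theta agrees with g . (h . theta).
left_action = act_dual(mat_mul(g, h), theta) == act_dual(g, act_dual(h, theta))
# Without the inverse, the two sides differ for this non-commuting pair.
naive_is_left_action = naive(mat_mul(g, h), theta) == naive(g, naive(h, theta))
```

For this particular pair, `left_action` is true and `naive_is_left_action` is false, because $g$ and $h$ do not commute.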

### Action on binary $r$-forms

We now show how the action of $\mathrm{GL}_2(\mathbb{C})$ on $V^\star$ induces
an action on the space of binary $r$-forms. By definition, a binary $r$-form is a function $V \rightarrow \mathbb{C}$ of the form

$\left( \begin{matrix} x \\ y \end{matrix} \right) \mapsto a_0 x^r + \binom{r}{1} a_1 x^{r-1} y + \cdots + \binom{r}{j} a_j x^{r-j} y^j + \cdots + a_r y^r$

for some $a_0, \ldots, a_r \in \mathbb{C}$. (The binomial coefficients could be omitted.) Observe that if $\theta^{(1)}, \ldots, \theta^{(r)} \in V^\star$ then

$\left( \begin{matrix} x \\ y \end{matrix} \right) \mapsto \prod_{i=1}^r \theta^{(i)} \left( \begin{matrix} x \\ y \end{matrix} \right)$

is a binary $r$-form. Conversely, the binary $r$-form
$\left( \begin{matrix} x \\ y \end{matrix} \right) \mapsto x^j y^{r-j}$
is equal to $\theta^j \phi^{r-j}$ where $\theta$, $\phi$ form the standard basis for $V^\star$. We may therefore identify the vector space of binary $r$-forms with $\mathrm{Sym}^r V^\star$. By functoriality of $\mathrm{Sym}^r$, the action of $\mathrm{GL}_2(\mathbb{C})$ is given by

$g \cdot (\theta^{(1)} \ldots \theta^{(r)}) = (g \cdot \theta^{(1)}) \ldots (g \cdot \theta^{(r)}) = (\theta^{(1)} g^{-1}) \ldots (\theta^{(r)} g^{-1})$

on the elements spanning $\mathrm{Sym}^r V^\star$, and so by $g \cdot f = f g^{-1}$ for a general $f \in \mathrm{Sym}^r V^\star$.

Example. In the quadratic case, if

$f \left( \begin{matrix} x \\ y \end{matrix} \right) = a x^2 + 2b xy + c y^2$

then

$\begin{aligned} \left( \left(\begin{matrix} \alpha & \beta \\ \gamma & \delta \end{matrix} \right) \cdot f \right) \left( \begin{matrix} x \\ y \end{matrix} \right) &= f \left( \frac{1}{\alpha\delta-\beta\gamma} \left( \begin{matrix} \delta x - \beta y \\ -\gamma x + \alpha y \end{matrix} \right) \right) \\ &= a' x^2 + 2b' xy + c'y^2 \end{aligned}$

where $(\alpha\delta-\beta\gamma)^2 a' = a \delta^2 - 2b \gamma \delta + c \gamma^2$,
$(\alpha\delta-\beta\gamma)^2 b' = -a\beta\delta + b (\alpha\delta + \beta\gamma) - c \alpha\gamma$ and $(\alpha\delta-\beta\gamma)^2c' = a \beta^2 - 2b \alpha\beta + c \alpha^2$. Since

$(a',b',c') = (a,b,c) \left( \begin{matrix} \delta^2 & -\delta \beta & \beta^2 \\ -2\gamma \delta & \alpha\delta + \beta\gamma & -2\alpha \beta \\ \gamma^2 & -\alpha \gamma & \alpha^2 \end{matrix} \right) \frac{1}{(\alpha \delta-\beta\gamma)^2}$

the new coefficients $a', b', c'$ are related to the old by the matrix $\mathrm{Sym}^2 g^{-1}$.
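As a sanity check on the book-keeping in this example, the sketch below compares the matrix formula for $(a',b',c')$ with a direct evaluation of $f(g^{-1}v)$ on a grid of points. The function names and the sample matrix are illustrative assumptions; exact fractions are used so that equality tests are meaningful.

```python
from fractions import Fraction

def transformed_coeffs(g, coeffs):
    """Coefficients (a', b', c') of g . f, read off from the 3x3 matrix above."""
    (al, be), (ga, de) = g
    a, b, c = coeffs
    det2 = Fraction(al * de - be * ga) ** 2
    return (
        (a * de**2 - 2 * b * ga * de + c * ga**2) / det2,
        (-a * be * de + b * (al * de + be * ga) - c * al * ga) / det2,
        (a * be**2 - 2 * b * al * be + c * al**2) / det2,
    )

def evaluate(coeffs, x, y):
    """Value of a x^2 + 2b xy + c y^2 at (x, y)."""
    a, b, c = coeffs
    return a * x * x + 2 * b * x * y + c * y * y

def act_on_form(g, coeffs, x, y):
    """(g . f)(x, y) = f(g^{-1} (x, y)), computed directly from the definition."""
    (al, be), (ga, de) = g
    det = Fraction(al * de - be * ga)
    return evaluate(coeffs, (de * x - be * y) / det, (-ga * x + al * y) / det)

g = [[2, 1], [3, 4]]   # determinant 5
f = (1, 2, 3)          # the form x^2 + 4xy + 3y^2
new = transformed_coeffs(g, f)

# A quadratic form is determined by its values, so agreement on this grid
# confirms that the two computations give the same form.
agree = all(act_on_form(g, f, x, y) == evaluate(new, x, y)
            for x in range(-2, 3) for y in range(-2, 3))
```

Here `new` works out to $(-\tfrac15, 0, \tfrac15)$, so $g \cdot f = \tfrac15(y^2 - x^2)$ for this choice of $g$ and $f$.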

### Invariants: modern definition

The actions of $\mathrm{GL}_2(\mathbb{C})$ on $V$ and $V^\star$ induce an action on $V^\star \otimes V$, defined by $g \cdot (\theta \otimes v) = (g \cdot \theta) \otimes (g \cdot v)$. Consider the map $F : V^\star \otimes V \rightarrow \mathbb{C}$ defined by $F(\theta \otimes v) = \theta(v)$. By definition of the actions, we have

$F( g \cdot (\theta \otimes v) ) = F( \theta g^{-1} \otimes g v ) = \theta (g^{-1} g v) = \theta(v) = F(\theta \otimes v).$

More generally, we may define $F_r : \mathrm{Sym}^r (V^\star) \otimes V \rightarrow \mathbb{C}$ by $F_r(f \otimes v) = f(v)$, and again it follows that $F_r( g \cdot (f \otimes v) ) = F_r(f \otimes v)$ for all $f \in \mathrm{Sym}^r (V^\star)$ and $v \in V$. This shows that $F_r$ is an invariant of $\mathrm{Sym}^r (V^\star) \otimes V$ of weight $0$, in the sense defined below.
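The case $r=1$ is easy to test by machine. The following sketch, with illustrative helper names and sample values, applies $g$ to both $\theta$ and $v$ and checks that the pairing $\theta(v)$ does not change.

```python
from fractions import Fraction

def mat_inv(g):
    """Inverse of an invertible 2x2 matrix, using exact fractions."""
    (al, be), (ga, de) = g
    det = Fraction(al * de - be * ga)
    return [[de / det, -be / det], [-ga / det, al / det]]

def pair_after_action(g, theta, v):
    """Evaluate F(g . (theta tensor v)) = (theta g^{-1})(g v)."""
    inv = mat_inv(g)
    g_theta = [theta[0] * inv[0][j] + theta[1] * inv[1][j] for j in range(2)]
    g_v = [g[i][0] * v[0] + g[i][1] * v[1] for i in range(2)]
    return g_theta[0] * g_v[0] + g_theta[1] * g_v[1]

theta, v = [5, 7], [3, -2]
g = [[2, 1], [3, 4]]

# theta(v) = 5*3 + 7*(-2) = 1, before and after acting by g.
unchanged = pair_after_action(g, theta, v) == theta[0] * v[0] + theta[1] * v[1]
```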

Definition. Let $W$ be a representation of $\mathrm{GL}_2(\mathbb{C})$. An invariant of $W$ of weight $e$ is a function $F : W \rightarrow \mathbb{C}$ such that $F(g\cdot w) = (\det g)^e F(w)$ for all $g \in \mathrm{GL}_2(\mathbb{C})$ and $w \in W$.

Because of the presence of $V$ in $\mathrm{Sym}^r (V^\star) \otimes V$, it is also common to call the function $F_r$ above a covariant. Thinking of $F_r$ as ‘evaluate $f$’ explains the comment towards the end of I.3 in Hilbert’s Theory of algebraic invariants: ‘The simplest example of a covariant is the form $f$ itself.’

### Invariants and covariants: classical definition

Let $f(x,y) = ax^2 + 2bxy + cy^2$. Set $x = \alpha x' + \beta y'$ and $y = \gamma x' + \delta y'$ and substitute for $x$ and $y$ in $f(x,y)$ to get

$f(x,y) = f(\alpha x' + \beta y', \gamma x' + \delta y') = a' {x'}^2 + 2b' x'y' + c' {y'}^2$

where $a' = a \alpha^2 + 2b \alpha \gamma + c \gamma^2$, $b' = a \alpha \beta + b(\alpha\delta + \beta\gamma) + c\gamma\delta$ and $c' = a \beta^2 + 2b \beta\delta + c \delta^2$. A covariant of $f$ of weight $e$ is a function of $a,b,c$ and $x,y$ which is multiplied by $(\alpha \delta - \beta\gamma)^e$ when $a,b,c,x,y$ are replaced with $a',b',c',x',y'$ respectively. (To test this one must, of course, rewrite the expression in terms of $a,b,c,x,y$.) An invariant of $f$ of weight $e$ is a covariant of weight $e$ that is independent of $x$ and $y$.

For example, the function $(a,b,c) \mapsto ac - b^2$ is an invariant of weight $2$, since

$\begin{aligned} a'c'-b'^2 &= (a \alpha^2 + 2b\alpha \gamma + c\gamma^2)(a \beta^2 + 2b\beta\delta +c \delta^2) \\ &\quad - (a \alpha\beta + b(\alpha\delta+\beta\gamma) + c \gamma\delta)^2 \\ &= (\alpha\delta-\beta\gamma)^2 (ac-b^2).\end{aligned}$
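The weight $2$ identity can be spot-checked with integer arithmetic. In the sketch below, $a'$, $b'$, $c'$ are taken to be the coefficients of ${x'}^2$, $2x'y'$, ${y'}^2$ after expanding $f(\alpha x' + \beta y', \gamma x' + \delta y')$; the function names and the test ranges are illustrative assumptions.

```python
from itertools import product

def substitute(g, coeffs):
    """Coefficients (a', b', c') after x = al x' + be y', y = ga x' + de y'."""
    (al, be), (ga, de) = g
    a, b, c = coeffs
    return (
        a * al**2 + 2 * b * al * ga + c * ga**2,        # coefficient of x'^2
        a * al * be + b * (al * de + be * ga) + c * ga * de,  # half the x'y' coefficient
        a * be**2 + 2 * b * be * de + c * de**2,        # coefficient of y'^2
    )

def disc(coeffs):
    """The classical invariant ac - b^2."""
    a, b, c = coeffs
    return a * c - b * b

def weight_two_holds(g, coeffs):
    """Check a'c' - b'^2 == (al de - be ga)^2 (ac - b^2) for one g and one form."""
    (al, be), (ga, de) = g
    return disc(substitute(g, coeffs)) == (al * de - be * ga) ** 2 * disc(coeffs)

# Small exhaustive check over integer matrices and one fixed form.
# Singular matrices are included: the polynomial identity holds for them too.
f = (1, 2, 3)
all_ok = all(weight_two_holds([[al, be], [ga, de]], f)
             for al, be, ga, de in product(range(-2, 3), repeat=4))
```

For example, with $g$ having rows $(2,1)$ and $(3,4)$, the new coefficients are $(55, 60, 65)$ and $55 \cdot 65 - 60^2 = -25 = 5^2 \cdot (1 \cdot 3 - 2^2)$.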

As in Hilbert’s remark, the function $(a,b,c,x,y) \mapsto ax^2+2bxy+cy^2$ is a covariant of weight $0$.

This definition is equivalent to the definition by actions above. Define $f'(x',y') = a'{x'}^2 + 2b'x'y' + c'{y'}^2$. Observe that if $g$ is the matrix above then

$g \left( \begin{matrix} x' \\ y' \end{matrix} \right) = \left( \begin{matrix} x \\ y \end{matrix} \right)$

and, by definition of the coefficients $a',b',c'$, we have

$f(x,y) = f'(x',y').$

Hence

$(g \cdot f') \left( \begin{matrix} x \\ y \end{matrix} \right) = f' g^{-1} \left( \begin{matrix} x \\ y \end{matrix} \right) = f' \left( \begin{matrix} x' \\ y' \end{matrix} \right) = f \left( \begin{matrix} x \\ y \end{matrix} \right).$

Thus $g \cdot f' = f$. So a covariant $F : \mathrm{Sym}^2 V^\star \otimes V \rightarrow \mathbb{C}$ of weight $e$ is the same as a function $F$ of $a',b',c',x',y'$ that is multiplied by $(\alpha\delta-\beta\gamma)^e$ when $a',b',c',x',y'$ are replaced with $a,b,c,x,y$. Up to replacing $e$ with $-e$, this is equivalent to the classical definition just given. It is interesting to note how the classical definition suppresses the inverse in the action by ‘starting’ with $x'$ and $f'$.

Exercise. More generally, let $f(x,y) = \sum_{j=0}^p \binom{p}{j} a_j x^{p-j}y^j$. Define $f'(x',y') = f(x,y)$, and suppose that $f'(x',y') = \sum_{j=0}^p \binom{p}{j}a_j' {x'}^{p-j}{y'}^j$. Show that $g \cdot f' = f$ and hence that the row vectors of coefficients $(a_0,\ldots, a_p)$ and $(a_0',\ldots,a_p')$ are related by

$(a'_0,\ldots, a'_p) = (a_0,\ldots,a_p) \mathrm{Sym}^p g.$

For example, when $p=2$ we get

$(a'_0,a'_1,a'_2) = (a_0,a_1,a_2) \left( \begin{matrix} \alpha^2 & \alpha \beta & \beta^2 \\ 2\alpha\gamma & \alpha \delta + \beta\gamma & 2\beta\delta \\ \gamma^2 & \gamma\delta & \delta^2 \end{matrix} \right).$

This agrees with the example earlier, noting that we now have $g \cdot f' = f$, whereas in the earlier example, we had $g \cdot f = f'$. As a further check, one can observe that if $g \cdot f' = f$ and $h \cdot f'' = f'$ then $(gh) \cdot f'' = f$ and correspondingly

$(a''_0,a''_1,a''_2) = (a_0',a_1',a_2') \mathrm{Sym}^p h = (a_0,a_1,a_2) \mathrm{Sym}^p g \mathrm{Sym}^p h.$
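The composition check for $p=2$ amounts to the statement that $g \mapsto \mathrm{Sym}^2 g$ is multiplicative. A minimal sketch follows, with `sym2` a hypothetical helper that writes out the $3 \times 3$ matrix from the exercise.

```python
def sym2(g):
    """The 3x3 matrix with (a0', a1', a2') = (a0, a1, a2) sym2(g), as in the exercise."""
    (al, be), (ga, de) = g
    return [
        [al * al, al * be, be * be],
        [2 * al * ga, al * de + be * ga, 2 * be * de],
        [ga * ga, ga * de, de * de],
    ]

def mat_mul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

g = [[2, 1], [3, 4]]
h = [[1, 3], [0, 1]]

# Sym^2 g Sym^2 h should equal Sym^2 (gh).
multiplicative = mat_mul(sym2(g), sym2(h)) == sym2(mat_mul(g, h))
```

For these matrices both sides have rows $(4, 14, 49)$, $(12, 47, 182)$ and $(9, 39, 169)$.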

### Invariants of the binary quadratic form

Now suppose that $F : \mathrm{Sym}^2 V^\star \rightarrow \det^e$ is an invariant of weight $e$. Assume also that $F$ is a homogeneous polynomial of degree $d$, so $F$ is given by a homogeneous polynomial of degree $d$ in the coefficients $a, b, c$ of the form. (By working in $V^\star \otimes_\mathbb{C} \mathbb{C}[t]$, and using that $\mathrm{GL}(V)$ preserves the $t$-degree, it is easy to show that any polynomial invariant is a sum of such homogeneous polynomial invariants.) So $F$ is an element of

$\bigl( \mathrm{Sym}^d (\mathrm{Sym}^2 V^\star)\bigr)^\star$

with the property that $F(g \cdot f) = \det(g)^e F(f)$ for all $f \in \mathrm{Sym}^d \mathrm{Sym}^2 V^\star$. In characteristic zero, although not in prime characteristic, symmetric powers commute with duality. So finally, noting that $(g \cdot F)(f) = \det(g)^{-e} F(f)$ (one last inversion), we obtain an element

$G \in \mathrm{Sym}^d \mathrm{Sym}^2 V$

spanning a $1$-dimensional subrepresentation isomorphic to $\det^{-e}$. The coefficients in the matrices of $g$ acting on $\mathrm{Sym}^2 V$ and on $\det$ are of degree two in $\alpha,\beta,\gamma,\delta$. Hence $g$ acts with degree $2d$ on the left-hand side, and with degree $-2e$ on the right-hand side. It follows that there are no non-zero invariants unless $e=-d$. In this case, identifying $\det$ with $\bigwedge^2 V$, the number of linearly independent invariants is the multiplicity

$[\mathrm{Sym}^d \mathrm{Sym}^2 V : (\bigwedge^2 V)^{\otimes d}].$

By the Cayley–Sylvester formula, this multiplicity is one when $d$ is even, and zero when $d$ is odd. (For a symmetric group proof of the Cayley–Sylvester formula see Corollary 2.12 in this paper of Eugenio Giannelli.) Since powers of the discriminant give an invariant in each even degree, the discriminant generates the ring of invariants.
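The parity pattern can also be seen by counting torus weights. This is not the symmetric group proof cited above, only a sketch under two standard assumptions. First, for a polynomial representation of $\mathrm{GL}_2(\mathbb{C})$ that is homogeneous of degree $2d$, the copies of $\det^d$ are exactly the trivial summands on restriction to $\mathrm{SL}_2(\mathbb{C})$. Second, the number of trivial summands is the dimension of the weight $0$ space minus the dimension of the weight $2$ space for the torus of matrices $\mathrm{diag}(t, t^{-1})$. The function name `invariant_multiplicity` is an illustrative choice.

```python
from itertools import combinations_with_replacement

def invariant_multiplicity(d):
    """Multiplicity of det^d in Sym^d Sym^2 V, computed from torus weights."""
    # Sym^2 V has weights 2, 0, -2 under diag(t, 1/t). A basis of Sym^d Sym^2 V
    # is indexed by multisets of d such weights, and the weight of a basis
    # vector is the sum of the weights in its multiset.
    counts = {}
    for multiset in combinations_with_replacement((2, 0, -2), d):
        weight = sum(multiset)
        counts[weight] = counts.get(weight, 0) + 1
    # Trivial SL_2 summands: dim(weight 0 space) - dim(weight 2 space).
    return counts.get(0, 0) - counts.get(2, 0)

multiplicities = [invariant_multiplicity(d) for d in range(1, 9)]
```

For $d = 1, \ldots, 8$ this gives $0, 1, 0, 1, 0, 1, 0, 1$, in agreement with the statement that there is one invariant in each even degree and none in odd degree.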