Invariants and Solving Polynomials

I’m back from Ohio, and ready to get back to the math blogging. Had my preliminary exams (written quals) today, so that should explain the delay in posting. Instead of covering anything I saw at YMC (though I will likely look at Grassmannians, Schubert Calculus, and Equivariant Schubert Calculus in the future), I’m going to talk about classical invariant theory and how to use it to solve the general degree 3 polynomial.

Classical invariant theory was important to great people like Hilbert, though nowadays we have Geometric Invariant Theory to look at (and I probably will, once I learn more of it). Today I’ll be focusing on cubics, though similar methods solve the quartic. For a LOT more in-depth explanation, see “Classical Invariant Theory” by Peter Olver; it’s a London Mathematical Society student text, and rather good.

Now we take the cubic Q(x)=ax^3+3bx^2+3cx+d; the 3’s are there to simplify formulas later. First, we must discuss the difference between an invariant and a covariant of the cubic. We’ll set up some definitions.

Rather than the cubic directly, we will work with general binary forms. That is, we take Q(x,y)=\sum_{i=0}^n\left(\begin{array}{c}n\\i\end{array}\right)a_i x^i y^{n-i}. Note that Q(x,1) is the general polynomial of degree n, as desired. We let 2 by 2 matrices act on Q(x,y) by a simple change of variables: if A=\left[\begin{array}{cc}\alpha&\beta\\\gamma&\delta\end{array}\right], then A\cdot Q(x,y)=Q(\alpha x+\beta y,\gamma x+\delta y). We do, however, require that A be invertible, to make it a genuine change of variables.
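To make the conventions concrete, here is the n=3 case written out. Matching it against the cubic above gives a_3=a, a_2=b, a_1=c, a_0=d. The second line is just one sample change of variables (\alpha=\beta=\delta=1, \gamma=0) applied to the form x^3+y^3.

```latex
Q(x,y) = a_0 y^3 + 3a_1 x y^2 + 3a_2 x^2 y + a_3 x^3,
\qquad
Q(x,1) = a_3 x^3 + 3a_2 x^2 + 3a_1 x + a_0 .

A = \left[\begin{array}{cc} 1 & 1 \\ 0 & 1 \end{array}\right]:
\qquad
x^3 + y^3 \;\longmapsto\; (x+y)^3 + y^3 = x^3 + 3x^2 y + 3x y^2 + 2y^3 .
```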

An invariant of weight k of the form Q(x,y) is a function I(a_0,\ldots,a_n) such that I(a)=(\alpha\delta-\beta\gamma)^k I(\bar{a}), where \bar{a} is the new set of coefficients after the change of variables. A covariant of weight k is a function J(a_0,\ldots,a_n,x,y) such that J(a,x,y)=(\alpha\delta-\beta\gamma)^k J(\bar{a},\bar{x},\bar{y}), where again, the bars denote having been acted on by the change of variables. So an invariant is just a covariant that doesn’t explicitly depend on x,y.

Now, I’ll state a few things without justification, because they are disgusting computations and are best carried out in the privacy of your own home, rather than in front of people, even online.

The discriminant of the cubic is an invariant of weight six, given by the formula \Delta=a_0^2a_3^2-6a_0a_1a_2a_3+4a_0a_2^3-3a_1^2a_2^2+4a_1^3a_3. This invariant doesn’t just come from nowhere, however. It can be found using resultants, but that would take us a bit far afield for the present discussion.
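If you would rather let a computer do the disgusting part, here is a minimal sketch in Python that checks the weight-six behaviour on examples with exact rational arithmetic. The helper names (`poly_mul`, `substitute`, `discriminant`, `weight_six_holds`) are made up for this illustration, and the check is phrased so that it does not depend on which form you call "old" and which "new": substituting x \to \alpha x+\beta y, y \to \gamma x+\delta y should multiply \Delta by (\alpha\delta-\beta\gamma)^6.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def substitute(c, alpha, beta, gamma, delta):
    """Coefficients of Q(alpha*x + beta*y, gamma*x + delta*y).

    c[i] is the full coefficient of x^i y^(3-i) in Q (binomial factor included).
    We set y = 1 while expanding; homogeneity means nothing is lost.
    """
    result = [0, 0, 0, 0]
    for i, ci in enumerate(c):
        term = [ci]
        for _ in range(i):
            term = poly_mul(term, [beta, alpha])    # factor (alpha*x + beta)
        for _ in range(3 - i):
            term = poly_mul(term, [delta, gamma])   # factor (gamma*x + delta)
        result = [r + t for r, t in zip(result, term)]
    return result

def discriminant(c):
    """The discriminant formula from the post, after removing the binomial factors."""
    a0, a1, a2, a3 = Fraction(c[0]), Fraction(c[1], 3), Fraction(c[2], 3), Fraction(c[3])
    return (a0**2 * a3**2 - 6 * a0 * a1 * a2 * a3 + 4 * a0 * a2**3
            - 3 * a1**2 * a2**2 + 4 * a1**3 * a3)

def weight_six_holds(c, alpha, beta, gamma, delta):
    """True if the substitution scales the discriminant by det^6."""
    det = alpha * delta - beta * gamma
    return discriminant(substitute(c, alpha, beta, gamma, delta)) == det**6 * discriminant(c)

# Example: the form x^3 + y^3 under a matrix of determinant 5.
example_ok = weight_six_holds([1, 0, 0, 1], 2, 1, 1, 3)
```

This only tests particular cubics and matrices, so it is a sanity check rather than a proof.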

For any binary form, the most important covariant is the Hessian, given by H=Q_{xx}Q_{yy}-Q_{xy}^2, where subscripts are derivatives. For the cubic, this gives \frac{1}{36}H=(a_1a_3-a_2^2)x^2+(a_0a_3-a_1a_2)xy+(a_0a_2-a_1^2)y^2. There is one more covariant worth mentioning, and it is the Jacobian of two covariants. If K,L are covariants, then their Jacobian is J=K_xL_y-K_yL_x, and for this case we need the Jacobian of Q and H. We call it T=Q_xH_y-Q_yH_x.
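As a quick worked example (my choice of test case, picked because everything stays small), take Q=x^3+y^3, so a_0=a_3=1 and a_1=a_2=0. The formulas above give:

```latex
Q_{xx} = 6x, \quad Q_{yy} = 6y, \quad Q_{xy} = 0
\;\;\Longrightarrow\;\;
H = Q_{xx}Q_{yy} - Q_{xy}^2 = 36xy,

T = Q_x H_y - Q_y H_x = (3x^2)(36x) - (3y^2)(36y) = 108\,(x^3 - y^3),

\Delta = a_0^2 a_3^2 = 1 .
```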

It turns out that, up to a reasonable notion of generating, \Delta, H, and T, along with the cubic itself, generate all covariants of our cubic equation. This is actually why Hilbert proved the original Basis Theorem: to show that there was a finite list of such covariants. Now we come to the notion of the syzygy, which brings to mind another of Hilbert’s great theorems. Another computation says that there is one syzygy for the covariants of the cubic, and that is T^2=2^4 3^6\Delta Q^2-H^3. From this, we will solve the cubic equation using elementary methods.
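The syzygy is easy to spot-check numerically. The following is a minimal sketch, not a proof: it evaluates Q, H, T and \Delta at a chosen point (x,y) using the formulas above and compares the two sides exactly. The function names are invented for this example, and passing integers or `Fraction` values keeps the arithmetic exact.

```python
from fractions import Fraction

def covariants(a0, a1, a2, a3, x, y):
    """Evaluate Q, H, T and Delta for the cubic with coefficients a0..a3 at (x, y)."""
    # Hessian coefficients: H = 36*(A x^2 + B x y + C y^2)
    A = a1 * a3 - a2**2
    B = a0 * a3 - a1 * a2
    C = a0 * a2 - a1**2
    Q = a0 * y**3 + 3 * a1 * x * y**2 + 3 * a2 * x**2 * y + a3 * x**3
    H = 36 * (A * x**2 + B * x * y + C * y**2)
    # First partial derivatives of Q and H
    Qx = 3 * a1 * y**2 + 6 * a2 * x * y + 3 * a3 * x**2
    Qy = 3 * a0 * y**2 + 6 * a1 * x * y + 3 * a2 * x**2
    Hx = 36 * (2 * A * x + B * y)
    Hy = 36 * (B * x + 2 * C * y)
    T = Qx * Hy - Qy * Hx
    delta = (a0**2 * a3**2 - 6 * a0 * a1 * a2 * a3 + 4 * a0 * a2**3
             - 3 * a1**2 * a2**2 + 4 * a1**3 * a3)
    return Q, H, T, delta

def syzygy_holds(a0, a1, a2, a3, x, y):
    """True if T^2 == 2^4 3^6 Delta Q^2 - H^3 at the given point."""
    Q, H, T, delta = covariants(a0, a1, a2, a3, x, y)
    return T**2 == 2**4 * 3**6 * delta * Q**2 - H**3

# Example: an arbitrary cubic and point with rational entries.
sample_ok = syzygy_holds(Fraction(1, 2), -3, 5, 7, 2, Fraction(-1, 3))
```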

Reorganized, the equation becomes H^3=2^4 3^6 \Delta Q^2-T^2=(108\sqrt{\Delta} Q-T)(108\sqrt{\Delta}Q+T). Now we must check that these two terms have no common linear factors. If they did, that factor would divide both their sum 216\sqrt{\Delta}Q and their difference 2T, so T and Q would have a common root, and their resultant would vanish (again those resultants…perhaps I should do a post on them). But it can be checked that the resultant is a multiple of the cube of the discriminant. We’ll assume the discriminant doesn’t vanish, which means that the cubic has three distinct roots (the other cases are easier anyway).
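For a sanity check on a single example (again my choice, not part of the general argument), take Q=x^3+y^3. Direct computation gives \Delta=1, H=36xy and T=108(x^3-y^3), and the two factors come out as:

```latex
108\sqrt{\Delta}\,Q - T = 108(x^3+y^3) - 108(x^3-y^3) = 216\,y^3,
\qquad
108\sqrt{\Delta}\,Q + T = 216\,x^3,

(216\,y^3)(216\,x^3) = 46656\,x^3y^3 = (36xy)^3 = H^3 .
```

The two factors visibly share no linear factor here, and each is already a perfect cube.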

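Now we recall that the Hessian is quadratic, so we can factor it with no problem, and call it H=LM, with L and M the distinct linear factors. Then H^3=L^3M^3, and since the two factors on the right-hand side above share no linear factor, unique factorization forces one of them to be a constant multiple of L^3 and the other a constant multiple of M^3. Absorbing the constants into L and M, the above equation can be split up into L^3=108\sqrt{\Delta}Q-T and M^3=108\sqrt{\Delta}Q+T. This should immediately make us happy, because we can add L^3+M^3=216\sqrt{\Delta}Q, and so we get Q=\frac{L^3+M^3}{216\sqrt{\Delta}}=\frac{(L+M)(L+\omega M)(L+\omega^2M)}{216\sqrt{\Delta}} where \omega is a primitive cube root of unity. Each of the three linear factors vanishes at exactly one root of the cubic.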
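Here is a minimal numerical sketch of the whole procedure, under some loud assumptions: \Delta\neq 0, a_3\neq 0, and the leading Hessian coefficient a_1a_3-a_2^2 is nonzero (so both L and M involve x). The name `solve_cubic` and the shortcut it takes are mine: instead of factoring H first, it reads L=\lambda x+\nu y off the x^3 and x^2y coefficients of L^3, and uses H=LM only to pair up the cube roots via \lambda\mu=36(a_1a_3-a_2^2). The coefficients of T were obtained by expanding T=Q_xH_y-Q_yH_x by hand. It uses floating-point complex arithmetic, so the answers are approximate.

```python
import cmath

def solve_cubic(a0, a1, a2, a3):
    """Roots of a3 x^3 + 3 a2 x^2 + 3 a1 x + a0 via L^3 + M^3 = 216 sqrt(Delta) Q.

    Assumes Delta != 0, a3 != 0 and a1*a3 - a2**2 != 0 (see the text above).
    """
    # Hessian coefficients: H = 36*(A x^2 + B x y + C y^2)
    A = a1 * a3 - a2**2
    B = a0 * a3 - a1 * a2
    C = a0 * a2 - a1**2
    delta = (a0**2 * a3**2 - 6 * a0 * a1 * a2 * a3 + 4 * a0 * a2**3
             - 3 * a1**2 * a2**2 + 4 * a1**3 * a3)
    s = cmath.sqrt(delta)

    # x^3 and x^2 y coefficients of T = Q_x H_y - Q_y H_x (expanded by hand)
    t3 = 108 * (a3 * B - 2 * a2 * A)
    t2 = 108 * (2 * a3 * C + a2 * B - 4 * a1 * A)

    # Same two coefficients of L^3 = 108 s Q - T and M^3 = 108 s Q + T
    l3, l2 = 108 * s * a3 - t3, 108 * s * 3 * a2 - t2
    m3, m2 = 108 * s * a3 + t3, 108 * s * 3 * a2 + t2

    # L = lam*x + nu_l*y: lam^3 = l3 and 3*lam^2*nu_l = l2
    lam = l3 ** (1 / 3)          # any cube root will do for L
    nu_l = l2 / (3 * lam**2)
    # M = mu*x + nu_m*y: mu is pinned down by L*M = H, i.e. lam*mu = 36*A
    mu = 36 * A / lam
    nu_m = m2 / (3 * mu**2)

    # Roots of Q(x, 1): solve L + omega^k M = 0 with y = 1, for k = 0, 1, 2
    omega = cmath.exp(2j * cmath.pi / 3)
    return [-(nu_l + omega**k * nu_m) / (lam + omega**k * mu) for k in range(3)]

# Example: x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3), i.e. a3=1, a2=-2, a1=11/3, a0=-6.
roots = solve_cubic(-6, 11 / 3, -2, 1)
```

Note that the discriminant here is negative even though all three roots are real, so \sqrt{\Delta} and the linear forms L, M are genuinely complex in this example.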

And lo and behold, the cubic has been solved! A similar procedure using syzygies and covariants solves the quartic also. Now, those who know Galois Theory can immediately see that this cannot work for the quintic, because the general quintic cannot be solved in radicals. So what goes wrong? My understanding, which may be flawed, is that there are too many invariants and covariants, and that the syzygies grow more complex faster than the polynomials themselves. To solve the cubic requires solving a quadratic, to solve the quartic requires the cubic, but it appears that this method of solving a quintic requires solving a general degree 6 equation, leading to a dramatic failure.

EDIT: I don’t believe that I made it clear that L and M are chosen such that they satisfy the equations L^3=108\sqrt{\Delta}Q-T and M^3=108\sqrt{\Delta}Q+T. There is ambiguity otherwise, in that the lead coefficient of the Hessian H must be split in some way between L and M, and these equations tell you how.

About Charles Siegel

Charles Siegel is currently a postdoc at Kavli IPMU in Japan. He works on the geometry of the moduli space of curves.
This entry was posted in Algebraic Geometry, Group Theory.
