## Groebner Bases and Automatic Geometry

Today I’m going to talk about using modern methods of algebraic geometry to automate (in large part) the proofs of statements of plane geometry. The discussion mostly follows Cox, Little, and O’Shea‘s “Ideals, Varieties, and Algorithms” which is a great book to use to learn the basics of algebraic geometry with lots of examples.

So we’re going to be looking in the polynomial ring over the complex numbers, $\mathbb{C}[x_1,\ldots,x_n]$, and we’ll first have to develop some machinery.

First, we must define a monomial ordering on $\mathbb{C}[x_1,\ldots,x_n]$ to be any relation > on $\mathbb{Z}^n_{\geq 0}$, or equivalently, any relation on the set of monomials $x^\alpha,\alpha\in\mathbb{Z}^n_{\geq 0}$, which is a total order, if $\alpha>\beta$ and $\gamma\in\mathbb{Z}^n_{\geq 0}$ then $\alpha+\gamma>\beta+\gamma$, and $>$ is a well-ordering on $\mathbb{Z}^n_{\geq 0}$. This means that every nonempty subset of $\mathbb{Z}^n_{\geq 0}$ has a smallest element under $>$.

The three most important monomial orderings are lex, grlex, and grevlex. Lex is the lexicographic order, in this, $\alpha>\beta$ if $\alpha-\beta$ has positive first term. Grlex is the graded lexicographic order, in this, we define $|\alpha|=\sum a_i$, and $\alpha>\beta$ if $|\alpha|>|\beta|$ and if $|\alpha|=|\beta|$, we use lex order. Grevlex is graded reverse lexicographic ordering. In this, $\alpha>\beta$ if $|\alpha|>|\beta|$, and if $|\alpha|=|\beta|$ then $\alpha>\beta$ if the rightmost entry of $\alpha-\beta$ is negative.

And here is some terminology: let $f=\sum_\alpha a_\alpha x^\alpha$ and > a monomial ordering, then the multidegree of $f$ is $multideg(f)=\max(\alpha\in\mathbb{Z}^n_{\geq 0}:a_\alpha\neq 0)$

The leading coefficient of $f$ is $LC(f)=a_{multideg(f)}$, and the leading monomial of $f$ is $LM(f)=x^{multideg(f)}$. The leading term of $f$ is $LT(f)=LC(f)LM(f)$.

It turns out that this is enough to define a division algorithm on $\mathbb{C}[x_1,\ldots,x_n]$, which is analogous to the standard division algorithm:

Fix a monomial ordering $>$ on $\mathbb{Z}^n_{\geq 0}$ for all that follows and let $F=(f_1,\ldots,f_s)$ be an ordered $s$-tuple of polynomials.

Then every $f\in \mathbb{C}[x_1,\ldots,x_n]$ can be written $f=a_1f_1+\ldots+a_sf_s+r$, where $r$ is either zero or a polynomial which is not divisible by any of $LT(f_1),\ldots,LT(f_s)$. We call $r$ the remainder of $f$ on division by $F$. Furthermore, if $a_if_i\neq 0$ then we have $multideg(f)\geq multideg(a_if_i)$.

How do we proceed with this? Well, we first test the lead term of our polynomial for division by the lead term of $f_1$. If it succeeds, we can perform monomial division, and then subtract the appropriate multiple of $f_1$ to cancel the lead term and repeat. If $f$ has lead term not divisible by $f_1$, proceed to $f_2$, etc. Unfortunately, this process depends on the order of division, and the remainder is not unique (in fact, it can be zero for some ordering and nonzero for another). There is, however, a way to patch this.

A finite subset $G=\{g_1,\ldots,g_t\}$ of an ideal $I$ is said to be a Groebner Basis if $\langle LT(g_1),\ldots,LT(g_t)\rangle=\langle LT(I)\rangle$. It shouldn’t be too surprising that this is the right notion, because the problem before would be if you got things that had lead term on the right but not the left.

It is a theorem that every ideal $I$ other than $\{0\}$ has a Groebner basis. Furthermore, any Groebner Basis for an ideal $I$ is a generating set of $I$.
And so, the fact motivating Groebner bases, is that if we let $\overline{f}^G$ be the remainder of $f$ upon division by $G$, and $G$ is a Groebner Basis, then $\overline{f}^G$ is independent of the ordering of $G$, and $f\in \langle G\rangle$ iff $\overline{f}^G=0$.

So this solves the ideal membership problem, which is to determine whether a polynomial is in a given ideal (which corresponds to whether a function vanishes on a given variety, geometrically speaking).

Best of all, there is an algorithm for finding Groebner bases of ideals. First we define the $S$-polynomial of $f,g\in\mathbb{C}[x_1,\ldots,x_n]$ nonzero. If $multideg(f)=\alpha$ and $multideg(g)=\beta$ then let $\gamma=(\gamma_1,\ldots,\gamma_n)$, where $\gamma_i=\max(\alpha_i,\beta_i)$ for each $i$. We call $x^\gamma$ the least common multiple of $LM(f)$ and $LM(g)$, written $x^\gamma=lcm(LM(f),LM(g))$

The $S$-polynomial of $f$ and $g$ is $S(f,g)=\frac{x^\gamma}{LT(f)}f-\frac{x^\gamma}{LT(g)}g$.

This is in fact leading towards the notion of syzygy, which is where the $S$ comes from, and this concept is extremely important, though I won’t go into more detail in this post.

So now I’ll describe the Buchberger Algorithm.

Let $I=\langle f_1,\ldots,f_s\rangle\neq\{0\}$ be a polynomial ideal. Then we can extend $F=(f_1,\ldots,f_s)$ to a Groebner basis by the following algorithm:

Input: $F=(f_1,\ldots,f_s)$

Output: $G=(g_1,\ldots,g_t)$ a Groebner basis such that $F\subseteq G$

Define $G=F$, then let $G'=G$, and for each pair $\{p,q\},p\neq q\in G'$, Find $S=\overline{S(p,q)}^{G'}$. If $S\neq 0$ then set $G=G\cup\{S\}$. Do this until $G=G'$.

This algorithm is not optimal, but it works. We can shorten it by noticing that if two terms have relatively prime leading monomials, then the $S$-polynomial must be zero, and by using other, more advanced, concepts.

We will use this machinery to give an example of what is called Automatic Geometric Theorem Proving. From here on, we will work with a very specific and simple example, and just note that everything done can be dealt with in more generality, and for harder problems (though you need more and more processing power to do it).Let $A,B,C,D$ be the vertices of a parallelogram in the plane. Then the diagonals $\overline{AD}$ and $\overline{BC}$ bisect each other at a point $N$.

This is a fairly simple theorem in Euclidean geometry, and it will serve as an example here, because anything more interesting is too computationally intense and has polynomials that are too large to show off.

First we must translate this into polynomials. We place $A$ at the origin $(0,0)$, and we will use $u_i$ for independent variables and $x_i$ for constrained variables. $B=(0,u_1)$. $C=(u_2,u_3)$, and $D=(x_1,x_2)$.

Then the statements that the figure is a parallelogram are $0=\frac{x_2-u_3}{x_1-u_2}$ and $\frac{u_3}{u_2}=\frac{x_2}{x_1-u_1}$. The denominators can be cleared and we get hypotheses $h_1:x_2-u_3=0$ and $h_2:(x_1-u_1)u_3-x_2u_2=0$

Now we define $N=(x_3,x_4)$, and the statements that $N$ is on both lines are $\frac{x_4}{x_3}=\frac{u_3}{x_1}$ and $\frac{x_4}{x_3-u_1}=\frac{u_3}{u_2-u_1}$, and so we get $h_3:x_4x_1-x_3u_3=0$ and $h_4:x_4(u_2-u_1)-(x_3-u_1)u_3=0$

We then use the Pythagorean Theorem to write the conclusions as polynomials, $g_1:x_1^2-2x_1x_3-2x_4x_2+x_2^2=0$ and $g_2:2x_3u_1-2x_3u_2-2x_4u_3-u_1^2+u_2^2+u_3^2=0$

We then compute a Groebner basis for $I=\langle h_1,\ldots,h_4\rangle\subset \mathbb{C}[u_1,u_2,u_3,x_1,x_2,x_3,x_4]$ using lexicographic ordering with $x_1>\ldots>x_4>u_1>u_2>u_3$.

We get $I=\langle f_1,\ldots,f_6\rangle$, where
$f_1=x_1x_4+x_4u_1-x_4u_2-u_1u_3$
$f_2=x_1u_3-u_1u_3-u_2u_3$
$f_3=x_2-u_3$
$f_4=x_3u_3+x_4u_1-x_4u_2-u_1u_3$
$f_5=x_4u_1^2-x_4u_1u_2-\frac{1}{2}u_1^2u_3+\frac{1}{2}u_1u_2u_3$
$f_6=x_4u_1u_3-\frac{1}{2}u_1u_3^2$
The variety, that is, the set of common zeroes, is reducible, because $f_2=(x_1-u_1-u_2)u_3$, so denoting the variety of an ideal $I$ by $V(I)$, we have $V(f_1,\ldots,f_6)=$ $V(f_1,x_1-u_1-u_2,f_3,\ldots,f_6)\cup V(f_1,u_3,f_3,\ldots,f_6)$

In fact, $V=V'\cup U_1\cup U_2\cup U_3$, where

$V'=V\left(x_1-u_1-u_2,x_2-u_3,x_3-\frac{u_1+u_2}{2},x_4-\frac{1}{2}u_3\right)$

$U_1=V\left(x_2,x_4,u_3\right)$

$U_2=V\left(x_1,x_2,u_1-u_2,u_3\right)$

$U_3=V\left(x_1-u_2,x_2-u_3,x_3u_3-x_4u_2,u_1\right)$

are irreducible.

We note that on $U_1,U_2$ we have the equation $u_3=0$, which causes our paralellogram to be a line, that is, degenerate. $U_3$ is also a degenerate case, where $u_1=0$.

So, if the conclusions are zero on every point of $V'$, then they are true, and we can check this easily, by substituting values for the $x_i$ in the conclusions $g_1$ and $g_2$, and then showing that the conclusions reduce to the zero polynomial, which they, in fact, do.