Hi, today I want start a series of posts on some of the most puzzling questions of number theory. The guiding theme will be that of multiple zeta values. They are particular numbers which are deeply engraved in lots of the research of the last thirty years. And they are a recurring them in the deep connections that are discovered between combinatorics, mathematical physics, higher arithmetics (for example the structure of the absolute Galois group $\mathrm{Gal}(\bar{\mathbb{Q}}/\mathbb{Q})$, low dimension topology (like the classification of knots, links etc...). Despite of their importance, very little is known about those numbers arithmetically, and most of the theory is conjectural. Due to their ubiquity, any significant progress in the understanding of those numbers is guaranteed to have far reaching consequences.
In that series of articles, I'll try to give a survey of what is currently known. Of course due to the variety and difficulty of the associated material the task is bound to be out of my modest reach, but I'll try to do my best in the hope it could be useful or at least in the hope of having a lot of fun along the way.
Definition 1. The multiple zeta values, or MZV's for short, are the numbers, $$ \zeta(s_1, \dots, s_r) \overset{def.}{=} \sum_{n_1 > \cdots > n_r > 0} \frac{1}{n_1^{s_1}\dots n_r^{s_r}} $$ Where $s_1, \dots, s_r$ are positive integers.
This sum converges as soon as $s_1 \ge 2$. The number $r$ is called the length of the multi zeta value. Its weight $n$ is the sum, $n = s_1 + \dots + s_r$. The multiple zeta values constitute an algebra of real numbers with coefficients in $\mathbb{Q}$ because we know a way to express the product of two $\mathbb{Q}$-linear combinations of them as a $\mathbb{Q}$-linear combination of them. We'll talk of that but not today...Conjecture 1. In his famous paper of 1994, D. Zagier formulated the conjecture that the dimension $d_n$ of the $\mathbb{Q}$-vector space spanned in $\mathbb{R}$ by the multi-zeta values of weight $n$ are given by the following Poincaré series, $$ \sum_{n \ge 0} d_n \, z^n = \frac{1}{1 - z^2 - z^3} $$
Table 1. Those dimensions satisfy the following recurrence relation, \begin{align*} & d_0 = 1, d_1 = 0, d_2 = 1, \\ & d_{n+3} = d_{n+1} + d_{n}. \end{align*} Which enable their fast computation, \begin{align*} \frac{1}{1 - z^2 - z^3} &= 1+{z}^{2}+{z}^{3}+{z}^{4}+2\,{z}^{5}+2\,{z}^{6}+3\,{z}^{7}+4\,{z}^{8}+ 5\,{z}^{9}+7\,{z}^{10}+9\,{z}^{11} \\ & +12\,{z}^{12}+16\,{z}^{13}+21\,{z}^{ 14}+28\,{z}^{15}+37\,{z}^{16}+49\,{z}^{17}+65\,{z}^{18}+86\,{z}^{19} \\ & + 114\,{z}^{20}+151\,{z}^{21}+200\,{z}^{22}+265\,{z}^{23}+351\,{z}^{24}+ 465\,{z}^{25}+616\,{z}^{26} \\ & +816\,{z}^{27}+1081\,{z}^{28}+1432\,{z}^{29 }+1897\,{z}^{30}+2513\,{z}^{31}+3329\,{z}^{32} \\ & +4410\,{z}^{33}+5842\,{z }^{34}+7739\,{z}^{35}+10252\,{z}^{36}+13581\,{z}^{37}+17991\,{z}^{38} \\ & + 23833\,{z}^{39}+31572\,{z}^{40}+41824\,{z}^{41}+55405\,{z}^{42}+73396 \,{z}^{43}+97229\,{z}^{44} \\ & +128801\,{z}^{45}+170625\,{z}^{46}+226030\,{ z}^{47}+299426\,{z}^{48} \\ & +396655\,{z}^{49}+525456\,{z}^{50} + o(z^{50}) \end{align*}
Conjecture 2. A very Strong conjecture is that the algebra of multiple zeta value is isomorphic to a polynomial algebra, \textit{i.e.} it is freely generated by some linear combinations of multiple zeta values which constitute a transcendence basis.
This says that we have, $$ \frac{1}{1 - z^2 - z^3} = \frac{1}{(1 - z)^{\ell_1}}\frac{1}{(1 - z^2)^{\ell_2}}\cdots\frac{1}{(1 - z^k)^{\ell_k}}\cdots $$ where $\ell_k$ is the number of generator of weight $k$ in this polynomial algebra.
Conjecture 3. Another good conjecture is that the $\mathbb{Q}$ vector space of multiple zeta values in $\mathbb{R}$ is a commutative and connected graded Hopf algebra. Connected here means that it is of dimension one in degree zero. Then the famous Milnor-Moore theorem tells us that this algebra is a polynomial algebra and that it is dual to the enveloping algebra of a free Lie algebra.
Free Lie algebras have very useful linear basis called Lyndon basis. Those basis are indexed by Lyndon words. We'll explain that in a moment. The aim of the computations that follow is to relate the number of generators of those various conjectural algebras structure using combinatorics.
Definition 2. The Möbius function, $\mu : \mathbb{N}^* \rightarrow \mathbb{Z}$ is defined for each positive integer $n$ by, $$ \mu(n) = \begin{cases} 0 & \text{if $n$ isn't square free.} \\ (-1)^k & \text{if $n$ is square free with $k$ distinct prime factors.} \end{cases} $$
Of course, for a positive integer $n$, being square free means that all the exponents are $1$ in its unique prime decomposition $n = p_1^{\alpha_1} \dots p_k^{\alpha_k}$ . For example, the numbers $2$ and $6$ are both square free but $9 = 3^2$ and $12 = 3\times 2^2$ aren't.
Lemma. For all positive integer $n$, $$ \sum_{k \,|\, n} \mu(k) = \delta_{1, n}, $$ where $\delta_{i,j}$ is the Kroneker symbol, $$ \delta_{i,j} = \begin{cases} 1 & \text{if $i = j$,}\\ 0 &\text{else.}\end{cases} $$
There are several nice proofs of this fact. First remark: As $\mu(k) = 0$ if $k$ isn't square free, only the square free divisors of $n$ contribute to the sum. Second remark: Taking a square free divisor $k$ of $n$ amounts to selecting a subset of the prime divisors of $n$. The value of $\mu(k)$ is $1$ or $-1$ whether this subset has an even or an odd number of elements.
For a non-empty set $X$, there is an equal number of subsets with even and odd numbers of elements. If this set $X$ is the empty set however, there is only one subset, the empty set itself which has an even number of elements. Specifically, the empty set doesn't have a subset with an odd number of elements. I regard this small break in regularity as one of the subtle secrets of mathematics (but maybe I'm bit too romantic). To sum it up, the role of the empty set is played by the number $1$ which has no prime divisor and the role of the subsets are played by the square free divisors.
Now this is where the magics begins.
Theorem (Möbius Inversion Formula). Let $M$ be an abelian group. Let $(a_n)_{n \ge 1}$ and $(b_n)_{n \ge 1}$ be two sequences of elements of $M$ indexed by the positive integers. The following two statements are equivalent,
Remark. Here we'll only use the case where the $a_n$ and $b_n$ are integers, but really both the theorem and its proof are valid for any abelian group $M$.
Proof. The strategy is to prove that applying the two transformations successively gives the identity. \begin{align*} \sum_{uv = n} \mu(u) \sum_{d\,|\,v} a_d = \sum_{ud \,|\, n}\mu(u) \, a_d = \sum_{uv = n} \underset{\delta_{1,u}}{\underbrace{\Biggl(\sum_{k\,|\, u}\mu(u)\Biggr)}} a_v = a_n \end{align*} Conversely, \begin{align*} \sum_{d\,|\,n} \sum_{uv = d} \mu(u)\, b_v = \sum_{uv \,|\, n}\mu(u) \, b_v = \sum_{uv = n} \underset{\delta_{1,u}}{\underbrace{\Biggl(\sum_{k\,|\, u}\mu(u)\Biggr)}} b_v = b_n \end{align*} End of the demonstration. ☐
First, we recall that a word, of length $n$, on an alphabet $X$ is a sequence of $n$ letters from $X$, $$ w = \langle\,a_1, a_2, \dots, a_n\,\rangle $$ where $a_1, a_2, \dots, a_n \in X$. The set of such words is called the free monoid on $X$ and is denoted $X^*$. The unit element of this monoid is the empty word denoted by $\varepsilon = \langle\,\rangle$. The multiplication of this monoid is simply concatenation. $$ \langle\,a_1, a_2, \dots, a_n\,\rangle \cdot \langle\,b_1, b_2, \dots, b_m\,\rangle = \langle\,a_1, a_2, \dots, a_n ,b_1, b_2, \dots, b_m\,\rangle $$
Now if we identify the letters of the alphabets with the corresponding words of length one $a = \langle\,a\,\rangle$ we can get rid of this cumbersome notation and write $a_1 a_2 \dots a_n$ for the word $\langle\,a_1, a_2, \dots, a_n\,\rangle$.
We won't say much on the free monoid itself now. Just that there is a useful induction principle (on the length of the words) associated to it.
Principle (Induction on Length). Let $w \in X^*$ be a word on the alphabet $X$ then, one and only one of the two following situations can occur,
Definition 3. Two words $w$ and $w'$ are called conjugated, and we note $w \sim w'$, if $w = u v$ and $w' = v u$ for some $u, v \in X^*$.
Example 1. On the alphabet $X = \{\,x_0, x_1\,\}$, the following words are all conjugated, \begin{align*} & x_0 x_0 x_1 x_0 x_1, && x_0 x_1 x_0 x_1 x_0, && x_1 x_0 x_1 x_0 x_0, && x_0 x_1 x_0 x_0 x_1, && x_1 x_0 x_0 x_1 x_0. \end{align*}
Theorem 1. The relation of conjugation just defined is an equivalence relation.
While the proofs of its reflexivity and its symmetry are trivial, the direct proof of its transitivity is a bit difficult. I let you check that as a funny exercise. Instead of a direct proof, we introduce a simple action of the infinite cyclic group $\mathbb{Z}$ on $X^*$ such that conjugated words are precisely those words that are in the same orbit. This action is defined on the generator $1$ of $\mathbb{Z}$ by the bijection $\sigma : X^* \rightarrow X^*$, defined as follows, \begin{align*} \sigma(\varepsilon) &= \varepsilon \\ \sigma(au) &= u a, \end{align*} where $a \in X$ and $u \in X^*$.
Definition 4. The orbit of a word $w$, denoted $[\,w\,]$, is the set of all the different words that we get by successive application of $\sigma$, $$ [\,w\,] = \{\, \sigma^n(w) \,|\, n \in\mathbb{Z} \,\}. $$
Example 2. Taking the same example as before, \begin{align*} \sigma(x_0 x_0 x_1 x_0 x_1) &= x_0 x_1 x_0 x_1 x_0 , \\ \sigma(x_0 x_1 x_0 x_1 x_0) &= x_1 x_0 x_1 x_0 x_0 , \\ \sigma(x_1 x_0 x_1 x_0 x_0) &= x_0 x_1 x_0 x_0 x_1 , \\ \sigma(x_0 x_1 x_0 x_0 x_1) &= x_1 x_0 x_0 x_1 x_0 , \\ \sigma(x_1 x_0 x_0 x_1 x_0) &= x_0 x_0 x_1 x_0 x_1 . \end{align*}
The cardinal of this orbit is finite and is a divisor of the length of the word $w$. Two words $w$ and $w'$ are conjugated if and only if their orbits are equal, and so, the orbit of a word is precisely its conjugacy class.
The size of the conjugacy class of a word $w$ is the is the length $r$ of the shortest word $u \in X^*$ such that $w = u^s$. The positive integer $s$ is called the period of the word $w$. Of course, we have $n = rs$ where $n$ is the length of $w$.
Definition 5. A word of length $n$ is $aperiodic$ if the size of its orbit is $n$.
Lyndon words on a given alphabet $X$ are words satisfying a minimal condition.
Definition 6. More precisely, a Lyndon word $w$, on a given alphabet $X$, is a word $w \in X^*$ that is,
Example 3. On the alphabet $X = \{\,x_0, x_1\,\}$ ordered by $x_0 \ge x_1$, the Lyndon words of length inferior or equal to six are, \begin{align*} &x_0, &&x_0x_0x_0x_0x_0x_1, &&x_0x_0x_0x_0x_1, &&x_0x_0x_0x_0x_1x_1, \\ &x_0x_0x_0x_1, & &x_0x_0x_0x_1x_0x_1, & &x_0x_0x_0x_1x_1, & &x_0x_0x_0x_1x_1x_1, \\ &x_0x_0x_1, & &x_0x_0x_1x_0x_1, & &x_0x_0x_1x_0x_1x_1, & &x_0x_0x_1x_1, \\ &x_0x_0x_1x_1x_0x_1, & &x_0x_0x_1x_1x_1, & &x_0x_0x_1x_1x_1x_1, & &x_0x_1, \\ &x_0x_1x_0x_1x_1, & &x_0x_1x_0x_1x_1x_1, & &x_0x_1x_1, & &x_0x_1x_1x_1, \\ &x_0x_1x_1x_1x_1, & &x_0x_1x_1x_1x_1x_1, & &x_1. & \end{align*}
Example 4. On the alphabet $X = \{\,x_0, x_1, x_2\,\}$ ordered by $x_0 \ge x_1 \ge x_2$, the Lyndon words of length inferior or equal to four are, \begin{align*} &x_0, & &x_0x_0x_0x_1, & &x_0x_0x_0x_2, & &x_0x_0x_1, & &x_0x_0x_1x_1, \\ &x_0x_0x_1x_2, & &x_0x_0x_2, & &x_0x_0x_2x_1, & &x_0x_0x_2x_2, & &x_0x_1, \\ &x_0x_1x_0x_2, & &x_0x_1x_1, & &x_0x_1x_1x_1, & &x_0x_1x_1x_2, & &x_0x_1x_2, \\ &x_0x_1x_2x_1, & &x_0x_1x_2x_2, & &x_0x_2, & &x_0x_2x_1, & &x_0x_2x_1x_1, \\ &x_0x_2x_1x_2, & &x_0x_2x_2, & &x_0x_2x_2x_1, & &x_0x_2x_2x_2, & &x_1, \\ &x_1x_1x_1x_2, & &x_1x_1x_2, & &x_1x_1x_2x_2, & &x_1x_2, & &x_1x_2x_2, \\ &x_1x_2x_2x_2, & &x_2. \end{align*}
What is the number $\lambda_n$ of Lyndon words of length $n$ on an alphabet with $q$ letters? Well first, we remark that any word of length $n$ can be written uniquely as a power of an aperiodic word. If we denote by $\alpha_n$ the number of aperiodic words on $q$ letters we can transcribe this remark as follows, $$ q^n = \sum_{d \,|\, n} \alpha_d. $$ If we now apply the Möbius inversion formula we get, $$ \alpha_n = \sum_{rs = n} \mu(r)\,q^s. $$
There are $n$ aperiodic words in the orbit of any Lyndon words of length $n$ (a Lyndon word is aperiodic). There is exactly one Lyndon word in the orbit of any aperiodic word (it is the minimal element by lexicographic order). From that we get, $$ \lambda_n = \frac{1}{n} \alpha_n $$
And so the number of Lyndon words of length $n$ on an alphabet with $q$ letters is, $$ \lambda_n = \frac{1}{n} \sum_{rs = n} \mu(r)\,q^s. $$
Example 5. We denote by $G_q(t)$ the generating series of Lyndon words on an alphabet with $q$ elements, \begin{align*} G_q(t) &= \sum_{n \ge 1} \left( \frac{1}{n} \sum_{rs = n} \mu(r)\,q^s \right)t^n \\ &= \sum_{r \ge 1 \atop s \ge 1} \frac{\mu(r)\,q^s\,t^{rs}}{rs}. \end{align*} An easy computation gives, \begin{align*} G_{{2}}(t)&=2\,t+{t}^{2}+2\,{t}^{3}+3\,{t}^{4}+6\,{t}^{5}+9\,{t}^{6}+18\,{t}^{7}+30\,{t}^{8}+56\,{t}^{9}+99\,{t}^{10} + \dots \\ G_{{3}}(t)&=3\,t+3\,{t}^{2}+8\,{t}^{3}+18\,{t}^{4}+48\,{t}^{5}+116\,{t}^{6}+312\,{t}^{7}+810\,{t}^{8}+2184\,{t}^{9} + \dots \\ G_{{4}}(t)&=4\,t+6\,{t}^{2}+20\,{t}^{3}+60\,{t}^{4}+204\,{t}^{5}+670\,{t}^{6}+2340\,{t}^{7}+8160\,{t}^{8}+29120\,{t}^{9} + \dots \\ G_{{5}}(t)&=5\,t+10\,{t}^{2}+40\,{t}^{3}+150\,{t}^{4}+624\,{t}^{5}+2580\,{t}^{6}+11160\,{t}^{7}+48750\,{t}^{8} + \dots \\ G_{{6}}(t)&=6\,t+15\,{t}^{2}+70\,{t}^{3}+315\,{t}^{4}+1554\,{t}^{5}+7735\,{t}^{6}+39990\,{t}^{7}+209790\,{t}^{8} + \dots \\ &\dots \end{align*}
Once I was working on Lyndon words on a weighted alphabet and wanted to count them both by length and weight. I did computer experiments for an hour but I had to stop and go out to give a course on algebra to undergraduate students. On my way to the University, I was in the subway and thought of a doubly indexed Möbius inversion formula. I really have no idea how it occurred to me... but I was lucky enough to have a pen and paper on me so I could write down the theorem and its proof. I've just found this paper while searching into an old box of papers. The formula is too cute not to be mentioned here.
Theorem 1. Let $M$ be an abelian group. Let $(a_{m,n})_{m \ge 1 \atop n \ge 1}$ and $(b_{m,n})_{m \ge 1 \atop n \ge 1}$ be two sequences of elements of $M$ indexed by pairs of positive integers. The following two statements are equivalent,
Proof. Same strategy as for the usual Möbius inversion formula. \begin{align*} \sum_{tr = m \atop ts = n} \mu(t) \sum_{du = r \atop dv = s} a_{u,v} &= \sum_{tdu = m \atop tdv = n}\mu(t) \, a_{u,v} = \sum_{cu = m \atop cv = n} \underset{\delta_{1,c}}{\underbrace{\Biggl(\sum_{t\,|\, c}\mu(t)\Biggr)}} a_{u,v} = a_{m,n} \end{align*} Conversely, \begin{align*} \sum_{du = m \atop dv = n}\sum_{tr = u \atop ts = v} \mu(t) \, b_{r,s} &= \sum_{dtr = m \atop dts = n}\mu(t) \, b_{r,s} = \sum_{kr = m \atop ks = n} \underset{\delta_{1,k}}{\underbrace{\Biggl(\sum_{t\,|\, k}\mu(t)\Biggr)}} b_{r,s} = b_{m,n} \end{align*} End of the demonstration. ☐