Café Math : Cassipoeia Type System (Part 1)
1, 1, 2, 2, 1, 8, 6, 7, 14, 27, 26, 80, 133, 170, 348, 765, 1002, 2176, 4682, ...   (A121350)
Math Blog
Phd Thesis
Financial Maths
About me

Cassipoeia Type System (Part 1)

Sat, 28 Jan 2012 20:45:01 GMT

Cassiopeia is a small computer algebra system I am writing. It is still very basic, but it is aimed at having a rich a powerful type system. This article is the first of a series on the theory behind that type system. The goam is to provide a conveinent dictionary between some sophisticated computer software type system (like that of C#) and the theory of categories which I chose as a foundation of the mathematical engine.

1. Terms

Definition 1. A signature $\Sigma$ is a given by an alphabet $A$ partitioned in two distinguished subalphabets $F$ and $V$, and a mapping $\alpha : A \to \mathbb{N}$ such that for all $v \in V$ we have $\alpha(v) = 0$.

Elements of $F$ are function symbols, elements of $V$ are variable symbols. Given a symbol $a \in A$, the integer $\alpha(a)$ is the arity of the symbol $a$. Variable symbols all have arity zero, and zero arity function symbols are called constant symbols.

Definition 2. A term $t$ of signature $\Sigma$, or $\Sigma$-term, is a tree with nodes labelled by the alphabet $A$. Outgoing degrees of nodes are constrained by the arity function $\alpha$ in the sense that each node labelled by a symbol $a \in A$ has degree $\alpha(a)$.

A term is uniquely decomposed in an expression of the form $t = (a\,;\,t_1, ..., t_n)$ with $\alpha(a) = n$. Such an expression can conveniently be written $a(t_1, ..., t_n)$ if no confusion arises. Terms of the form $t = (a\,;\,\,)$ i.e. with $\alpha(a)=0$ are identified with $a$ so we can write $t \in F$ to mean a constant term and $t \in V$ to mean a variable term.

Example 1. We take the signature $\Sigma$ given by $A=F=\{\, a, b \,\}$, $V = \emptyset$, $\alpha(a)=0$ and $\alpha(b)=2$. The set of all terms of that signature is infinite and is composed of binary trees as.

The number of such terms with $n$ nodes labelled by $b$ is the $n^\text{th}$ Catalan number, $$ \begin{align*} C_n = \frac{1}{n+1} \binom{2n}{n} \end{align*} $$ The generating series of this sequence of numbers is, $$ \begin{align*} G(t) &= \frac{1-\sqrt{1-4t}}{2t} \\ &= 1+t+2\,t^2+5\,t^3+14\,t^4+42\,t^5+132\,t^6+... \end{align*} $$

More generally, given any signature with a finite alphabet $A = F = \{\, a_1, ...,a_n \,\}$, $V = \emptyset$ and $\alpha(a_1) = k_1$, ..., $\alpha(a_n)=k_n$, we can count the number $N_k$ of terms of that signature having exactly $k$ nodes by introducing the following generating series, $$ \begin{align*} G(t) = \sum_{k \ge 0}\, N_k \,t^k \end{align*} $$ It is subject to the following algebraic equation, $$ \begin{align*} G(t) = t\,(G(t)^{k_1}+ ... + G(t)^{k_n}) \end{align*} $$ although this is not always easily solved for a compact closed form, one can easily pump the terms of the series using the Newton-Raphson method. For that one defines a sequence $(g_k)_{k\ge 0}$ of series approximations by recurrence, $$ \begin{align*} g_0 &= N_1\,t + o(t)\\ g_{k+1} &= g_k - \frac{g_k - t\,(g_k^{k_1} + ... + g_k^{k_n})}{1 - t\,(k_1\,g_k^{k_1 - 1} +...+k_n\,g_k^{k_n-1})} + o(t ^{2^{k+1}}). \end{align*} $$ where $N_1$ is the number of constant symbols in the signature. The series approximation $g_k$ is exact up to order $2^k$ which provides fast convergence.

Example 2. Taking $A=F=\{\, a, b \,\}$, $V = \emptyset$, $\alpha(a)=0$ and $\alpha(b)=3$ one obtains ternary trees. Applying the above method one finds the following approximation to the generating series after five iterations, $$ \begin{align*} G(t)& = t+t^4+3\,t^7+12\,t^{10}+55\,t^{13}+273\,t^{16}+1428\,t^{19}+7752\,t^{22}+43263\,t^{25}\\ &\quad +246675\,t^{28}+1430715\,t^{31} + o(t^{32}) \end{align*} $$

2. $\Sigma$-algebras

Definition 3. An algebra $A$ of signature $\Sigma$, or a $\Sigma$-algebra, is given by a set $A_\circ$ and for each $a \in F$ with $\alpha(a) = n$ a mapping of sets, $$ \begin{align*} f_a\,:\, A_\circ^n \to A_\circ \end{align*} $$

The set $A_\circ$ is the set of elements of the algebra. The mappings $(f_a)_{a\in F}$ are the operations of the algebra.

Definition 4. A morphism $\phi$ between two algebras $A$ and $A'$, is given by a mapping between their set of elements such that for each $a\in F$ with $\alpha(a) = n$ and all $x_1, ...,x_n \in A_\circ$ we have, $$ \begin{align*} \phi(f_a(x_1, ..., x_n)) = f'_a(\phi(x_1),...,\phi(x_n)) \end{align*} $$

2.1. Universal algebra

Definition 5. The universal algebra $U$ of signature $\Sigma$ is the $\Sigma$-algebra which elements are the terms of signature $\Sigma$ and which family of operations is defined for each $a \in F$ with $\alpha(a) = n$ and any terms $t_1, ..., t_n$ by, \begin{align*} f_a(t_1,..., t_n) = (a\,;\,t_1,...,t_n) \end{align*}

Theorem 1. (Universality) Given any $\Sigma$-algebra $A$ and any interpretation $\iota : V \to A$, there is one and only one morphism of $\Sigma$-algebra $\phi_U : U \to A$ such that for each $v \in V$ we have $\phi_U(v) = \iota(u)$

Proof. Existence and unicity both come from the induction principle.

Corollary. The universal $\Sigma$-algebra is unique up to a unique isomorphism of $\Sigma$-algebras (hence its name).

Proof. As usual, given $U_1$ and $U_2$ satisfying the definition we get two morphisms $\phi_{21} : U_1 \to U_2$ and $\phi_{12} : U_2 \to U_1$. By the unicity of $\phi_{1} : U_1 \to U_1$ and $\phi_{2} : U_2 \to U_2$ we get $\phi_{12} \circ \phi_{21} = \mathrm{Id}_{U_1}$ and $\phi_{21} \circ \phi_{12} = \mathrm{Id}_{U_2}$.

If we chose for one moment to distinguish an algebra $A$ from its set of elements $A_\circ$ and a morphism of algebras $\phi : A \to A'$ from the mapping of the underlying sets $\phi_\circ : A_\circ \to A'_\circ$ we get a seemingly dull covariant functor $(-)_\circ$ from the category of $\Sigma$-algebras to that of sets. The definition of the universal algebra describes another functor in the other direction, from the category of sets to that of $\Sigma$-algebras. We have a formal adjunction between those two functors. $$ \begin{align*} \mathrm{Hom}(U_V, A) \underset{nat.}{\simeq} \mathrm{Hom}(V, A_\circ) \end{align*} $$

The Hom-set on the left is the set of all morphisms of $\Sigma$-algebras from $U_V$ to $A$ and the Hom-set on the right is the set of all interpretations of the set of variable $V$ in the algebra $A$, that is, mappings associating an element of $A$ to any variable symbol. The formal adjunction means that any interpretation of the variable symbols in $A$ gives rise to a unique morphism of $\Sigma$-algebras $U_V \to A$ as the definition says, and that the converse is also true as is readily seen.

3. Type System

3.1. Types as terms

We consider a type system where types are described by terms called type terms. Variables of a type term are called type variables, constants are called primitive types, bound type terms are called generic types, unbound types are called non-generic types. Function symbols are called type constructors, and non-constant type terms are called non-primitive or compound types.

3.2. Subtyping

A subtyping relation is a partial order on the set of types. Due to soundness considerations we shall impose some restriction on such a relation.

Given two type terms $A$ and $B$ we say that $B$ extends $A$ or that $B$ is a subtype of $A$ and we note $B :A$

(To be continued...)


Jesam  posted 2012-04-29 07:35:44

Always a good job right here. Keep rolling on through.

Samuel VIDAL  posted 2012-04-29 21:05:30

Thanks ;-)