Café Math : Cassipoeia Type System (Part 1)

# Cassipoeia Type System (Part 1)

### Sat, 28 Jan 2012 20:45:01 GMT

Cassiopeia is a small computer algebra system I am writing. It is still very basic, but it is aimed at having a rich a powerful type system. This article is the first of a series on the theory behind that type system. The goam is to provide a conveinent dictionary between some sophisticated computer software type system (like that of C#) and the theory of categories which I chose as a foundation of the mathematical engine.

# 1. Terms

Definition 1. A signature $\Sigma$ is a given by an alphabet $A$ partitioned in two distinguished subalphabets $F$ and $V$, and a mapping $\alpha : A \to \mathbb{N}$ such that for all $v \in V$ we have $\alpha(v) = 0$.

Elements of $F$ are function symbols, elements of $V$ are variable symbols. Given a symbol $a \in A$, the integer $\alpha(a)$ is the arity of the symbol $a$. Variable symbols all have arity zero, and zero arity function symbols are called constant symbols.

Definition 2. A term $t$ of signature $\Sigma$, or $\Sigma$-term, is a tree with nodes labelled by the alphabet $A$. Outgoing degrees of nodes are constrained by the arity function $\alpha$ in the sense that each node labelled by a symbol $a \in A$ has degree $\alpha(a)$.

A term is uniquely decomposed in an expression of the form $t = (a\,;\,t_1, ..., t_n)$ with $\alpha(a) = n$. Such an expression can conveniently be written $a(t_1, ..., t_n)$ if no confusion arises. Terms of the form $t = (a\,;\,\,)$ i.e. with $\alpha(a)=0$ are identified with $a$ so we can write $t \in F$ to mean a constant term and $t \in V$ to mean a variable term.

Example 1. We take the signature $\Sigma$ given by $A=F=\{\, a, b \,\}$, $V = \emptyset$, $\alpha(a)=0$ and $\alpha(b)=2$. The set of all terms of that signature is infinite and is composed of binary trees as.

The number of such terms with $n$ nodes labelled by $b$ is the $n^\text{th}$ Catalan number, \begin{align*} C_n = \frac{1}{n+1} \binom{2n}{n} \end{align*} The generating series of this sequence of numbers is, \begin{align*} G(t) &= \frac{1-\sqrt{1-4t}}{2t} \\ &= 1+t+2\,t^2+5\,t^3+14\,t^4+42\,t^5+132\,t^6+... \end{align*}

More generally, given any signature with a finite alphabet $A = F = \{\, a_1, ...,a_n \,\}$, $V = \emptyset$ and $\alpha(a_1) = k_1$, ..., $\alpha(a_n)=k_n$, we can count the number $N_k$ of terms of that signature having exactly $k$ nodes by introducing the following generating series, \begin{align*} G(t) = \sum_{k \ge 0}\, N_k \,t^k \end{align*} It is subject to the following algebraic equation, \begin{align*} G(t) = t\,(G(t)^{k_1}+ ... + G(t)^{k_n}) \end{align*} although this is not always easily solved for a compact closed form, one can easily pump the terms of the series using the Newton-Raphson method. For that one defines a sequence $(g_k)_{k\ge 0}$ of series approximations by recurrence, \begin{align*} g_0 &= N_1\,t + o(t)\\ g_{k+1} &= g_k - \frac{g_k - t\,(g_k^{k_1} + ... + g_k^{k_n})}{1 - t\,(k_1\,g_k^{k_1 - 1} +...+k_n\,g_k^{k_n-1})} + o(t ^{2^{k+1}}). \end{align*} where $N_1$ is the number of constant symbols in the signature. The series approximation $g_k$ is exact up to order $2^k$ which provides fast convergence.

Example 2. Taking $A=F=\{\, a, b \,\}$, $V = \emptyset$, $\alpha(a)=0$ and $\alpha(b)=3$ one obtains ternary trees. Applying the above method one finds the following approximation to the generating series after five iterations, \begin{align*} G(t)& = t+t^4+3\,t^7+12\,t^{10}+55\,t^{13}+273\,t^{16}+1428\,t^{19}+7752\,t^{22}+43263\,t^{25}\\ &\quad +246675\,t^{28}+1430715\,t^{31} + o(t^{32}) \end{align*}

# 2. $\Sigma$-algebras

Definition 3. An algebra $A$ of signature $\Sigma$, or a $\Sigma$-algebra, is given by a set $A_\circ$ and for each $a \in F$ with $\alpha(a) = n$ a mapping of sets, \begin{align*} f_a\,:\, A_\circ^n \to A_\circ \end{align*}

The set $A_\circ$ is the set of elements of the algebra. The mappings $(f_a)_{a\in F}$ are the operations of the algebra.

Definition 4. A morphism $\phi$ between two algebras $A$ and $A'$, is given by a mapping between their set of elements such that for each $a\in F$ with $\alpha(a) = n$ and all $x_1, ...,x_n \in A_\circ$ we have, \begin{align*} \phi(f_a(x_1, ..., x_n)) = f'_a(\phi(x_1),...,\phi(x_n)) \end{align*}

## 2.1. Universal algebra

Definition 5. The universal algebra $U$ of signature $\Sigma$ is the $\Sigma$-algebra which elements are the terms of signature $\Sigma$ and which family of operations is defined for each $a \in F$ with $\alpha(a) = n$ and any terms $t_1, ..., t_n$ by, \begin{align*} f_a(t_1,..., t_n) = (a\,;\,t_1,...,t_n) \end{align*}

Theorem 1. (Universality) Given any $\Sigma$-algebra $A$ and any interpretation $\iota : V \to A$, there is one and only one morphism of $\Sigma$-algebra $\phi_U : U \to A$ such that for each $v \in V$ we have $\phi_U(v) = \iota(u)$

Proof. Existence and unicity both come from the induction principle.

Corollary. The universal $\Sigma$-algebra is unique up to a unique isomorphism of $\Sigma$-algebras (hence its name).

Proof. As usual, given $U_1$ and $U_2$ satisfying the definition we get two morphisms $\phi_{21} : U_1 \to U_2$ and $\phi_{12} : U_2 \to U_1$. By the unicity of $\phi_{1} : U_1 \to U_1$ and $\phi_{2} : U_2 \to U_2$ we get $\phi_{12} \circ \phi_{21} = \mathrm{Id}_{U_1}$ and $\phi_{21} \circ \phi_{12} = \mathrm{Id}_{U_2}$.

If we chose for one moment to distinguish an algebra $A$ from its set of elements $A_\circ$ and a morphism of algebras $\phi : A \to A'$ from the mapping of the underlying sets $\phi_\circ : A_\circ \to A'_\circ$ we get a seemingly dull covariant functor $(-)_\circ$ from the category of $\Sigma$-algebras to that of sets. The definition of the universal algebra describes another functor in the other direction, from the category of sets to that of $\Sigma$-algebras. We have a formal adjunction between those two functors. \begin{align*} \mathrm{Hom}(U_V, A) \underset{nat.}{\simeq} \mathrm{Hom}(V, A_\circ) \end{align*}

The Hom-set on the left is the set of all morphisms of $\Sigma$-algebras from $U_V$ to $A$ and the Hom-set on the right is the set of all interpretations of the set of variable $V$ in the algebra $A$, that is, mappings associating an element of $A$ to any variable symbol. The formal adjunction means that any interpretation of the variable symbols in $A$ gives rise to a unique morphism of $\Sigma$-algebras $U_V \to A$ as the definition says, and that the converse is also true as is readily seen.

# 3. Type System

## 3.1. Types as terms

We consider a type system where types are described by terms called type terms. Variables of a type term are called type variables, constants are called primitive types, bound type terms are called generic types, unbound types are called non-generic types. Function symbols are called type constructors, and non-constant type terms are called non-primitive or compound types.

## 3.2. Subtyping

A subtyping relation is a partial order on the set of types. Due to soundness considerations we shall impose some restriction on such a relation.

Given two type terms $A$ and $B$ we say that $B$ extends $A$ or that $B$ is a subtype of $A$ and we note $B :A$

(To be continued...)

Jesam  posted 2012-04-29 07:35:44

Always a good job right here. Keep rolling on through.

Samuel VIDAL  posted 2012-04-29 21:05:30

Thanks ;-)

Comment:

Name:

Email:

Website: