How do you apply $\Sigma$-typing to the definition of a category? (new to type theory here)


Here's a previous answer in which the author states that the definition of category is the following $\Sigma$-type. I'm not sure how that would look formally, so can you explain?

You don't have to restate the full definition. For simplicity use $\dots$ or whatever as fill-in-the-blanks.

Specifically, how do I build a $\Sigma$-type that is "composed of" the following types:

[Screenshot of the nLab page: the definition of a category as a set of dependent types.]

such that I can say that "the above is the definition of category".

There are 2 answers below.

BEST ANSWER

After reading your comments on SCappella's (excellent) answer, I think you might be confused about the logical underpinnings of type theory - so that is what I will try to address in this answer. (Throughout I will be using screenshots of LaTeX I've compiled, because I can't seem to get MSE to render proof trees or blackboard numerals.)

The "horizontal line" you are referring to is part of the definition of a Judgement, and (loosely) describes what strings of symbols we are able to write down inside of type theory.

There are two (fundamental) questions we can ask in type theory:

is $A$ a type?

and (assuming $A$ is a type)

does $a : A$?

These reflect two complementary notions, which together build the entire theory. We start out assuming that we have certain types, and known ways of gluing types together. That lets us build the family of all the types our type theory has access to; Typing Judgements give the formal underpinning of this part of the system. Of course, it's not all about types; we care about values too! Types typically come with constructors, which tell us how to build terms inhabiting that type. These typically come in the form of Introduction Rules and Elimination Rules.

To summarize:

  • Typing Judgments allow us to prove that a string of characters is actually a type
  • Introduction Rules allow us to take preexisting terms (using old types) and build new terms
  • Elimination Rules allow us to take big terms and evaluate them to get smaller terms

Here are some basic examples of these types of judgement in action:

[Image: judgement rules (typing, introduction, elimination) for the basic types]

In this system we introduce 3 basic types - naturals, the singleton type, and the empty type. Each comes with a Typing Judgement, Introduction Rules, and Elimination Rules (except the empty type has no introduction rules, because it's empty).

Similarly, we introduce 3 ways of making new types from old ones - $\Sigma$-types, $\Pi$-types, and $=$-types. Each of these also has (more complicated) Typing Judgements, as well as Introduction and Elimination Rules. I have omitted the introduction/elimination forms for $=$-types, because they are a bit unwieldy, and they don't add much to this discussion. You can find them on page 437 of The HoTT Book if you're interested.

We also typically introduce Computation Rules, which tell us how to evaluate our terms. I have also left these out because they will not be useful in the following discussion.
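To make these three kinds of rules concrete, here is a small sketch of how they surface in a proof assistant. This is Lean 4, and the names `two` and `double` are my own choices, not anything from the screenshots:

```lean
-- Typing judgement: `Nat` is a type, and `two : Nat`.
-- Introduction rules: `Nat.zero` and `Nat.succ` build terms of `Nat`.
def two : Nat := Nat.succ (Nat.succ Nat.zero)

-- Elimination: pattern matching consumes a `Nat` (via its recursor)
-- to produce a new term.
def double : Nat → Nat
  | Nat.zero   => Nat.zero
  | Nat.succ n => Nat.succ (Nat.succ (double n))

-- Computation rules are what let both sides below
-- reduce to the same normal form.
example : double two = 4 := rfl
```

The final `example` only type checks because the computation rules let Lean evaluate `double two` all the way down to `4`.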


In the linked post, the answerer describes a construction for the type $\text{Cat}$ of all (small) categories. They do this by saying "make a $\Sigma$-type containing all the stuff you need to be a category", and by explicitly listing these terms. SCappella does a fantastic job explaining why this is the correct type intuitively, but you want to know how to translate this definition into a "formal" setting; in particular, you ask about writing it judgementally.

First, let me tell you why we don't typically do this. We use judgements in order to formalize what strings of symbols are and aren't allowed, and to clarify what certain strings of symbols mean. Additionally we can impose rules saying certain strings of symbols are "the same", and this is what leads to the computational aspect of Type Theory. However, once we have formalized these ideas, it is often clear when a string of symbols is or isn't valid. In these cases, we sleep soundly knowing we can formalize things if we need to, but we save time by electing not to. You'll see why later in the post...

As an analogy, think about how most proofs in mathematics work - First Order Logic (which is what 99% of mathematicians use in their daily lives) is formalized in a very similar way to Type Theory. Indeed we have judgements telling us when a certain string of symbols is "true". Even so, you would be hard pressed to find a mathematician outside of a logic department who could even read a formal proof, let alone would write one!

In exactly the same way, as Type Theorists, we often say imprecise things, or give enough information to convince the reader that what we're saying could be formalized. That is good enough for mathematics at large, and it is good enough for us too.


In the linked definition of a category, then, we see almost a verbatim translation of the set theoretic definition of a category into type theoretic terms. In set theory, a category is a set (or a proper class...) of objects, together with a set of arrows between any two objects. There is a composition operation called $\circ$ on certain arrows, and there are rules about what must happen when they compose. The type theoretic definition is the same, except we carry around these bonus operations and rules inside of $\Sigma$ types! This is because in Type Theory we can meld our object of study with our logic. There is no "one line" way of writing down all the requirements of a category in set theory, because the sets and the first order logic are two different components! The fact that we can unify these is one of the main allures of type theory (at least for me).

For completeness' sake, here is the definition of $\text{Cat}$, the type of categories, in full gory detail:

[Image: the definition of $\text{Cat}$ as an iterated $\Sigma$-type]
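(For readers who can't see the image: the type has roughly the following shape, with $\dots$ standing in for the unit and associativity laws, exactly as in the other answer's construction.)

$$\text{Cat} \;:\equiv\; \sum_{C_0 : \mathcal{U}}\; \sum_{\hom : C_0 \to C_0 \to \mathcal{U}}\; \sum_{1 : \prod_{x : C_0} \hom(x,x)}\; \sum_{\circ \,:\, \prod_{x,y,z : C_0} \hom(y,z) \to \hom(x,y) \to \hom(x,z)} \dots$$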

(Here $\mathcal{U}$ is the "type of all types". There are some foundational issues I'm sweeping under the rug, but don't worry too much about them.)


It is "clear" that this is a well formed type, and any term which inhabits it (remember, $\text{Cat}$ is a type whose values are themselves types with bonus structure) is a category in the traditional sense. The reason there are no judgements is that there is no need for them. Judgements let us check whether a string of symbols really is a type, or whether a string of symbols really is a term inhabiting a type. In a comment you said that the judgement notation is "more beautiful". In the hopes of convincing you otherwise (and to procrastinate on my grading...) I've written up a full proof (more or less) that the above type really is a type. The only liberty I took was allowing myself to derive $A \texttt{ Type}$ from $A : \mathcal{U}$. This is not technically something you can do in a proof tree (it's basically an inversion lemma), but I doubt anybody is looking too closely at this. As an (awful, awful) exercise, try to understand the entire proof tree, and find the inevitable typos that lie inside it (I didn't proofread it at all - this is more for effect than anything).

[Image: the full proof tree - "Too many hours of my life"]

So the whole tree didn't quite fit on my monitor... Here's a copy of the whole thing if you want to look around it. Hopefully this clarifies why we don't often reason with judgements.


As a last exercise, let's try to find the category of groups as a term living inside $\text{Cat}$. I'll spare us both, and I won't write up a judgemental proof of the validity of the term we come up with. If you'd like to, I encourage you to do it exactly once. We will argue informally, as we would in any algebra class, safe in the knowledge that we could find an explicit proof term for the category of groups living in $\text{Cat}$ if we ever truly needed to.

Let's take the following definition for the category of groups:

[Image: the definition of $\mathsf{Grp}$ as an iterated $\Sigma$-type]
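(For readers who can't see the image: $\mathsf{Grp}$ has roughly the following shape, with $\dots$ standing in for the remaining unit and inverse laws, matching the tuple written out below.)

$$\mathsf{Grp} \;:\equiv\; \sum_{G : \mathcal{U}}\; \sum_{\mu : G \to G \to G}\; \sum_{\iota : G \to G}\; \sum_{1 : G}\; \Big(\prod_{x,y,z : G} \mu(\mu(x,y),z) = \mu(x,\mu(y,z))\Big) \times \dots$$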

Do you see why $G : \mathsf{Grp}$ if and only if $G$ is a group? Notice we are doing the same abuse of notation as we always do. Really $G$ doesn't have type $\mathsf{Grp}$! $$(G,\mu,\iota,1,\text{assoc},\text{left-inv},\text{right-inv},\text{left-unit},\text{right-unit}) : \mathsf{Grp}$$

In set theoretic foundations, we do the same thing. $G$ isn't a group, the tuple $(G, \mu, \iota, 1)$ is the group! Since $\Sigma$-types are dependent tuples, it should come as no surprise that they are the perfect candidate for replacing tuples in this style of definition (which is ubiquitous in mathematics!). In addition to tupling up the operations we want, we also tuple up terms corresponding to proofs that the operations behave according to the group laws.


You might guess that I've cheated, then, in saying $\mathsf{Grp} : \text{Cat}$, and you would be right! We actually want a (dependent) tuple, whose first element is $\mathsf{Grp}$ and whose following elements

  • define the category structure
  • witness the fact that the category laws hold

As an actual exercise, see if you can figure out what terms you need to add, and see if you can convince yourself that these terms would actually let you form a (horribly disgusting) proof term inhabiting $\text{Cat}$ which you could prove with a proof tree that makes my above example look extremely tame.

To get you started, consider $\text{hom}(G,H) = \sum_{f:G \to H}P(f)$ - what do you want $P$ to be?


This was super long, but there's a lot of depth in this material, and I wanted to make sure I touched on everything that might have been confusing.

I hope this helps ^_^

SECOND ANSWER

In set theory, you'd define a category to be a tuple $(C_0, \hom_C, 1, \circ)$ where: $C_0$ is a set; for each $x$ and $y$ in $C_0$, $\hom_C(x, y)$ is a set; for each $x$ in $C_0$, $1_x \in \hom_C(x, x)$; and for each $x$, $y$ and $z$ in $C_0$, $\circ: \hom_C(y, z) \times \hom_C(x, y) \to \hom_C(x, z)$ is a function. You also need some equations between things in $\hom_C(x, y)$. A sigma type, especially a nested sigma type, is the type theoretic version of this idea.

If $A$ is a type and $B: A \to Type$ is a type depending on $A$, $\sum_{a: A} B(a)$ is the type of pairs $(a, b)$ where $a: A$ and $b: B(a)$. Similarly, $\sum_{a: A}\sum_{b: B(a)} C(a, b)$ is the type of triples $(a, b, c)$ where $a: A$, $b: B(a)$ and $c: C(a, b)$.
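As a concrete illustration of these dependent pairs, here is a hypothetical sketch in Lean 4 syntax, where `Fin n` is the type of naturals below `n` (the names `pair` and `triple` are my own):

```lean
-- A Σ-type is a dependent pair: the type of the second
-- component depends on the value of the first.
def pair : Σ n : Nat, Fin n := ⟨3, 2⟩

#check pair.1   -- has type Nat
#check pair.2   -- has type Fin pair.1

-- Nesting Σ gives dependent triples, as described above.
def triple : Σ n : Nat, Σ m : Fin n, Fin m.val := ⟨3, ⟨2, 1⟩⟩
```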

The second important construct is the dependent product. You can think of $\prod_{a: A} B(a)$ as "for all $a: A$, $B(a)$". This reading is especially relevant when $B(a)$ is something that can only have at most one element. Depending on the exact type theory used and the exact setup, the equalities between morphisms will have that property.

The dependent product is still relevant for the other constructs in a category. For example, $1: \prod_{x: C_0} \hom_C(x, x)$. That is, for all $x: C_0$, $1_x: \hom_C(x, x)$. The type of $\circ$ will also be a dependent product, since it's a function for all $x, y, z: C_0$. Non-dependent function types can also be thought of as special cases of dependent products and it's typical to define $A \to B := \prod_{a: A} B$ to be the type of functions from $A$ to $B$.

So as a warmup, we'll find the type of categories without associativity or the identity laws. It's going to be

$$ \sum_{C_0: Type} \sum_{\hom_C: C_0 \times C_0 \to Type} \sum_{1: \prod_{x: C_0} \hom_C(x, x)} \prod_{x: C_0} \prod_{y: C_0} \prod_{z: C_0} \hom_C(y, z) \times \hom_C(x, y) \to \hom_C(x, z). $$

So we can see each part of the original tuple is simply placed under a $\Sigma$, except the last which is just placed after everything.
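If it helps to see this in a proof assistant: a Lean 4 `structure` is sugar for exactly this kind of iterated Σ-type. A sketch, with field names of my own choosing:

```lean
-- A "category without laws": each field corresponds to one
-- component under a Σ in the type above.
structure RawCat where
  Obj  : Type
  Hom  : Obj → Obj → Type
  id   : (x : Obj) → Hom x x
  comp : {x y z : Obj} → Hom y z → Hom x y → Hom x z
```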

To include our equalities, we'll need three more pieces of data. This just means that the sigma will be even more nested, but it's straightforward to add a few more $\Sigma$s. The first two pieces of data are the unit laws, that is, $unitl: \prod_{x: C_0} \prod_{y: C_0} \prod_{f: \hom_C(x, y)} \circ(x, y, y, 1_y, f) = f$ and $unitr: \prod_{x: C_0} \prod_{y: C_0} \prod_{f: \hom_C(x, y)} \circ(x, x, y, f, 1_x) = f$.

You'll notice I used all the arguments of $\circ$, but these could simply be written $1_y \circ f$ and $f \circ 1_x$ if so desired. Similarly, I've been writing $1_x$, but this should properly be $1(x)$.

Our last piece of data is associativity.

$$ assoc: \prod_{x: C_0} \prod_{y: C_0} \prod_{z: C_0} \prod_{w: C_0} \prod_{f: \hom_C(z, w)} \prod_{g: \hom_C(y, z)} \prod_{h: \hom_C(x, y)} (f \circ g) \circ h = f \circ (g \circ h) $$

Again, all we need to do to write this out formally is to put each piece of data under a sigma (except the last, which goes after all the sigmas).
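Collecting all the data and laws in one place, here is the whole type as a self-contained Lean 4 sketch (field names are mine; `comp f g` means $f \circ g$):

```lean
structure Category where
  Obj   : Type
  Hom   : Obj → Obj → Type
  id    : (x : Obj) → Hom x x
  comp  : {x y z : Obj} → Hom y z → Hom x y → Hom x z
  -- unit laws: 1_y ∘ f = f and f ∘ 1_x = f
  unitl : ∀ {x y : Obj} (f : Hom x y), comp (id y) f = f
  unitr : ∀ {x y : Obj} (f : Hom x y), comp f (id x) = f
  -- associativity: (f ∘ g) ∘ h = f ∘ (g ∘ h)
  assoc : ∀ {x y z w : Obj} (f : Hom z w) (g : Hom y z) (h : Hom x y),
            comp (comp f g) h = comp f (comp g h)
```

Each field after the first four is a proof obligation; building a term of `Category` means supplying exactly the "disgusting tuple" the first answer describes.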