- Type Theory
HoTT Math 1: Elementary group theory
This is the first in a series of posts on doing mathematics in Homotopy Type Theory (HoTT). Various conventions and notations are explained in the preamble; there are no additional prerequisites for this post.
A group is a set equipped with a constant , a unary operation and a binary operation that satisfy the usual axioms. In HoTT the group axioms must be motivated, so the group comes with three more components:
The axioms are concrete objects in proof-relevant mathematics! The last two axioms are actually the conjunction of two axioms since conjunction in HoTT correspond to products. It’s conceptually convenient to package them together and use and for the two conjuncts obtained by applying the projections. In fact, it makes sense to pack all of these into one
(but, for the sake of clarity, it is best to refrain from using things like for ).
The axioms types above are parametrized by and the three components , , , so one can form the type of all groups:
This level of formalism is cumbersome and since it is perfectly clear what the type is from the narrative description, it is best to avoid it. The only difference from classical mathematics is that the axioms are given as concrete objects rather than abstract statements.
Familiar identities, for example the fact that is its own inverse, can be obtained by combining the group axioms. We have a path and another path . Concatenating these two paths yields the desired path
There are other ways to see this, for example, symmetric reasoning yields
In general, these two paths are need not be the same but since is a set, i.e., a 0-type, there is at most one path in and therefore the two paths above must be the same.
Let’s try something a tad more complicated — an old favorite — the uniqueness of identity elements. To say that is a left identity element means exhibiting an element of the type
so shows that is an identity element. Similarly, shows that is a right identity element. Classically, we know that if is a left identity and if is a right identity then we must have . In HoTT, this corresponds to exhibiting an element of type
To prove this, let’s fix . Given and we have paths
and therefore a path
The classical proof would end here but to get the desired element of , we need to wrap things up as follows. Since the final path was obtained uniformly from , we have an element
and we can then unfix to obtain the desired element
I slightly abused -notation in the argument above but the idea was only to give a hint how the argument could be formalized. In fact, we did not need to worry about this at all. Since is a set, there is at most one path in and thus there is also at most one element in
by function extensionality (§4.9). Therefore, the unique choice principle (§3.9) applies to give the desired element of . In fact, also has exactly one element by function extensionality.
In the end, the classical proof of uniqueness of inverses was sufficient and unambiguous. In fact, the same is true for essentially all similar facts of elementary algebra. This is good for multiple reasons but most importantly this means that doing math in HoTT does not involve getting caught up in elementary stuff like this: you can invoke any time without referencing a particular proof since the proof is unique. There is one important caveat:
It is very important that the carrier type of a group is a set!
I fell into this trap when I asked this MathOverflow question. It is very tempting to think that the loop space is a group for every . This is only true if is a 1-type. Otherwise the carrier of is not necessarily a 0-type, the uniqueness of paths is lost, can have many different proofs, which are all relevant and must not be forgotten!
In conclusion, elementary group theory works fine in HoTT. The general feel is a bit different but it’s more fun than cumbersome to see how the axioms work in relevant mathematics. The same is true for much of elementary algebra and, ultimately, proof relevance is never cumbersome since proofs are almost always unique. It is important to remember that the carrier of an algebraic structure must always be a set for this to work!
In the next installment of HoTT Math, we will continue with elementary field theory which presents an additional difficulty since we must handle the fact that multiplicative inverses do not always exist…
Jesse McKeown wrote
Oh, hmm… I don’t think that’s quite the issue with non-sets carrying group operations; let me call your instead ; what you’ve got there, then is a term in ; and it really ought to be the only term in , because if you had another one, then you’d have a canonical name for an element in that somehow didn’t converge to reflectivity, using just the maps structure , which is just too weird.
I think what you’ve got a hold on there is that the other parts of the group structure can give you weird things; for instance, assuming and of the same does potentially give you an unknown identification , but the solution to that is to actually assume left and right identity elements separately, and then use the canonical identification you’ve constructed here to treat them as the same. (compare the (Joyal) definition of “equivalence” you quoted in the previous post, where the section and retraction are supposed separately, and then a canonical homotopy is found between them).
Much more difficult is associativity: as soon as there are 2-spheres floating around, one has room to distinguish and ; but everything not too wild that a classical mathematician would call a group is equivalent to some loopspace, and indeed “loopspace” is probably the correct way to say “group” in HoTT. The difficult part in generalizing the equations-and-more-equations definition, as with formalizing -sets, is in writing down all the necessary coherences.
Could you clarify your first paragraph? I don’t see what is “too weird.”
Mike Shulman wrote
i would have ended the proof at the same place that you said a classical proof would end. We wanted to prove “for all u and v, something or other”, and we did it by assuming u and v and constructing the something or other. That’s exactly what we had to do, by the propositions-as-types interpretation of “for all” and the introduction rule for Pi-types. Technically there is indeed a lambda-abstraction going on, but it doesn’t need to be mentioned in the mathematical English, any more than classical mathematical English feels the need to mention the difference between entailment and implication.
In particular, I think the phrase “since the final path was obtained uniformly” is misleading, because there is no non-uniform way to obtain anything – is there? I wouldn’t want people to think that whenever they prove a universally quantified statement in type theory they have to then check that the proof was performed “uniformly”, whatever that means.
Sounds like one of the traditional methods of proof to me: it would just be too weird to be otherwise. (-:
This may be one of the places where internal and external reasoning have to be distinguished. It may be the case that there is only one term we can write down in type theory which inhabits that type – that seems like a plausible sort of paramatricity result, although saying that is a long way from actually proving it. But that would be a meta-theoretic statement about type theory, very different from a claim inside type theory that that type has only one inhabitant. The latter is definitely not provable, because type theory has models in classical mathematics where LEM holds and thus non-uniform objects exist. (See e.g. Exercise 6.9.)
Trying to put it another way. This really isn’t a proof, so salt it to taste; but it would be weird to have a term that means a proof of in all -spaces that wasn’t homotopic to reflectivity because, on the one hand, there is only that equation in the discrete case, and on the other hand it really can’t be decidable whether a generic higher inductive type presents a discrete type or not; so whatever term would have to fail to be homotopic because it should also fail to be really computable.
(I had earlier thought that it’d be even weirder because for various reasons, I’d convinced myself that the nonstandard identification would have to live in the center of ; but actually for -spaces, that’s not weird at all, which is a fun exercise.)
I chose to write the proof it this way to illustrate what lies behind the generalization rule and the deduction theorem in HoTT. I agree that the wrap-up at the end of the proof doesn’t need spelling out, but I don’t think it would be right to stop where the classical proof would end since that would stop shy of the goal: an element of type . (Note that this can be fixed by starting the proof a little differently.)
Regarding uniformity, it depends what you mean by “obtain” but non-uniform proofs definitely happen if you assume some LEM. I think they can also happen in plain HoTT by fooling around with universes in silly ways.
I don’t think I agree – I think you can end the proof exactly at the place you end the classical proof, because the unmentioned step is exactly the same as the passage from entailment to implication that is always omitted in set-theory-based informal mathematics. Can you say exactly what you mean by “starting the proof differently”?
If you mean to refer to LEM-based proofs as non-uniform, then I definitely think the phrase “since the final path was obtained uniformly” is very misleading, since we would be doing exactly the same thing to finish the proof whether or not we had used LEM in it. The only difference is that in the latter case the resulting term would depend on the hypothesis of LEM. Uniformity has nothing to do with the introduction rule for Pi-types.
OK. I think I understand better. I think Mike hit the right issue here. By design, I’m not making constructive assumptions or anything of the kind. So weird things, even “too weird” ones, are very much allowed to happen!
I’ll have to think more about your other remarks.
Hm. Actually, I don’t think we disagree: the passage from entailment to implication is the deduction theorem and I wanted to illustrate what lies behind it in HoTT (since it’s a bit different than in classical first-order logic).
Uniformity is very clear in the -introduction rule: you need one expression to get . That’s all I’m saying by “the final path was obtained uniformly.”
Okay, I was just saying that the rest of it wouldn’t be needed if you didn’t have the specific goal of illustration.
It’s true that you need one expression, but that sort of uniformity has nothing to do with LEM. If we used LEM then there would still be one expression, it would just involve the assumed witness of LEM. Again I think that meaning of “uniformity” is a meta-theoretic concept that doesn’t need to be invoked when doing mathematics – everything we can say inside the type theory is automatically uniform in the sense of being given by one expression.
It’s still necessary to address the main point. If the truncated proof were submitted by a student, I would deduct a few points for not mentioning the relationship with an actual element of type . Of course, there are plenty of ways around that that don’t involve spelling out the -terms.
I agree that everything you can say inside formal type theory is uniform, but this not true in an informal setting. Verifying the hypotheses is a pretty standard mathematical practice. It’s often not necessary since the hypotheses are often clearly visible but, especially in combination with uses of generalization and the deduction theorem, it’s sometimes necessary to remind the reader why the hypotheses are valid. Not doing this can easily lead to mistakes! I’m sure good examples of this will pop up here pretty soon; this is not a good one since it’s painfully obvious that the hypothesis of -intro was satisfied.
If it were a student in an introduction-to-proofs class, then sure, I could see asking them to finish the proof by saying essentially “since we have assumed A and proven B, we can conclude that A implies B”. But that would be superfluous for a student in a group theory class. I guess I wasn’t sure what audience you were thinking of.
Can you please provide an example of something you can say in informal type theory that’s not uniform? I really can’t think of any. Indeed, the intent with informal type theory was that everything be formalizable, so there should be no differences of this sort — if there are, then that’s a defect of what one means by “informal type theory” that should be remedied.
Also, I would not want to encourage anyone to think of “uniformity” of the construction as a “hypothesis”. It’s a hypothesis in the meta-theory, a premise of the judgment rule for introducing terms in Pi-types (although even then it’s not that there is some property of something called “uniformity” that has to be verified), but it’s not a mathematical hypothesis of the sort that we generally assume and verify. Avoiding confusion on this point is especially important when we talk about equality, because judgmental equality, being a judgment, is not something that can freely be “assumed” and “proven” in the same way that propositional equality can.
My comment certainly wasn’t meant to encourage thinking of the premise of -intro as a hypothesis on the same level as the group axioms; I would rather discourage that. But I don’t understand your point. The point of informal type theory is to embed the calculus of judgements into common language. There has to be ways of saying informally how the use of this or that rule is justified (when that’s not immediately obvious). Ideally, I should never have to talk about rules explicitly, I should have enough key words to say that in prose. My key word for the premise of -intro is “uniformity.” You can disagree with the choice of word, I don’t think it’s perfect, it comes from computability theory. I’ll stick with it since it’s now more-or-less automatic for me to use this word in this way.
As for examples of non-uniformity in informal type theory. I would rather wait for an example to pop up since artificial examples are never convincing. Note that the point is to informally see when an informal argument is valid or not. If you’re looking for a valid proof that can’t be formalized, you’re looking for the wrong thing.
There are some very interesting borderline cases… The first thing I would try to generate an interesting one would be the following. Concoct an induction on where the induction step involves hopping one universe up. This can’t be formalized within the universe setting used in the book since the proof wouldn’t fit in any universe. A more subtle flaw is that this conflates and the indexing of the universes, which are not necessarily related at all. However, such an informal proof could be “rescued” by having a richer universe structure. The tricky bit is getting the universe hopping to feel natural. A non-convincing example is a simple unmotivated hop: There is a universe ; every universe is an element of another universe . Therefore, there is a such that for each . (Meh.)
I wonder if the typical ambiguity introduced by the universes could also be exploited to generate non-uniform arguments…
[Edit@14:18 - fixed annoying typo.]
I don’t agree that the point of informal type theory is to embed the calculus of judgments into common language. If that were all there was to it, then we could just describe all the judgments in words instead of symbols. I would say that the point is to modify the usual mathematical language so that it becomes formalizable in type theory rather than set theory, hopefully requiring mathematicians to change their ways of thinking and writing as little as possible. I would really rather that they not have to constantly think in terms of judgments at all, let alone explaining why the premises of a judgment rule are satisfied.
I remain to be convinced that there are any examples of “non-uniformity” that aren’t obviously wrong on other grounds. We already know that typical ambiguity is something we have to be careful with (this comes up in chapter 10, for instance), and in that case the answer is simple: typical ambiguity just means neglecting to write the universe indices, so if your argument is no longer valid when you put the indices back in, it’s wrong. So that example doesn’t convince me of the need to muddy the waters by introducing a new term of art “uniformity”.
It increasingly sounds like we’re handling the same stick from opposite ends. What I’m talking about is not writing math but verifying it. This is automatic in formal systems but it is not automatic in informal proofs. For informal type theory to make sense to mathematicians you need to be able to verify it in real time, without having to formalize it. These clues about the use of the rules are for that. When I prove and ten pages later prove , I will say something before concluding not because my argument would otherwise be wrong but because it’s easier to verify. For the same reason, if my aim is to get an element of type then, in some way or another, I will mention the desired element, not necessarily by writing the -expression, but I will find a way to point to the element. It’s not necessarily wrong to omit it but it’s not good form.
I don’t have a good line of attack for exploiting typical ambiguity. Your response is not really satisfactory though since it requires formalizing the proof. Mathematicians won’t like “asking the local type-theorist” any more than they like “asking the local set-theorist.” But typical ambiguity isn’t what I was exploiting in the universe hopping example. What is wrong with a universe large enough that there is a where each is a universe and ?
If it had been a 10 page proof, then I would agree. Your argument above is just very short, so I didn’t see the need. Perhaps there is no real disagreement there.
Adding universe levels is not “formalizing the proof”. Indeed, there are a few places in the book where we add back in the universe levels when it becomes necessary, but it’s still informal type theory, no proof assistants or local type theorists required.
Nothing is wrong with your universe V of course, except that we haven’t assumed such a universe in the type theory of the book. I would say that your argument There is a universe U(0); every universe U(n) is an element of another universe U(n+1). Therefore, there is a N→U such that U(n):U(n+1) for each n is not valid because the “U” in “N→U” cannot be given a universe level; and also because “to be a universe” is not a proposition nor is there a “type of universes”, so that there is no “operation” of going from one universe to the next.
Yep. I don’t think we disagree.
We probably draw the formalization line at different places. It’s a fuzzy line anyway…
But does that make a proof using such a large universe wrong? (Just a heads up, in case you hadn’t noticed: further in that direction are the HoTT analogue of “large cardinals” in ZFC…)
I think your parenthetical answered your own question: a proof using a universe “U_omega” would be “wrong” in the type theory of the book in the same sense that a proof using an inaccessible cardinal would be wrong if ZFC is the implicit foundation. We made a particular choice for what the type theory underlying the book would be, mainly just for definiteness; of course there are lots of potential variations.
Again, that’s not a satisfactory answer: what’s an informal reason why the “proof” is wrong? How do I convince someone with such a “proof” in hand that they are doing something which is actually wrong?
Aside: As I mentioned before the book’s handling of universes is highly objectionable since it rules out highly plausible (and even desirable!) stronger forms instead of being agnostic about them.
Here is what I would (like to) answer if someone went on MathOverflow and asked: Is this proof correct? (In reference to a hypothetical inductive proof using “universe hopping” as outlined above, giving an element of type as a conclusion.)
No, your argument is not correct because you hop to the next universe at each induction step. This makes it impossible to get a single expression that works for each so you can’t apply the -introduction rule to get an element of type in the end. To make your induction uniform, you need to fix an infinite sequence of increasing universes at the outset. However, this is a (mild) large universe hypothesis so you need to work in HoTT+ to make this work.
How would you do it for someone using an inaccessible in ZFC? I don’t see any difference.
If you can suggest a specific, well-defined type theory we could use as the foundation for the book that can nevertheless be “agnostic” about how strong of universes there are, I would be very interested. We had to pick something!!
Mike Shulman wrote
Probably your hypothetical answer should be talking about the N-induction rule rather than the Pi-introduction one? I think that’s a perfectly good explanation, but I don’t see the need in it for the word “uniform”; why not just say “to make the induction valid”?
The agnostic handling of universes would be to require that (1) any two types are in a common universe and (2) any universe is in a larger universe (larger in the sense of typical ambiguity). This gives all the minimum required universes and does not impose restrictions on the global structure of universes.
Can you explain in what sense the linearly ordered universes “restrict” the global structure of universes more than your preferred version? Either one would require adding some additional rules in order to get larger universes, and I don’t see anything about the linearly ordered universes that makes it any harder to add such additional rules.
It’s also worth noting that if what want wants is consistency strength corresponding to large cardinal axioms, one can obtain this without modifying the structure of the “universe” hierarchy by instead adding constructions that increase the “size” of the existing universes. For instance, if a particular universe is closed under inductive-recursive definitions of a suitable sort, then it automatically has a Mahlo-like property, without changing any of the rules that apply specifically to that universe. That’s one of the reasons I don’t have any problem with a fixed linearly ordered sequence of universes.
I wouldn’t downvote an answer that said that; I might even upvote if I’m in a good mood. I might make a comment if I’m in a very philosophic mood since that doesn’t make justice to the second problem with the “proof”: confounding the internal and the indexing of the universes. This one is deeper. The root cause here is that the concept of induction is epistemologically prior to the internal and its induction rule (which requires uniformity). The confusion stems from identifying both with their epistemological ancestor (which doesn’t require uniformity). I don’t like using the words “epistemological” on MathOverflow, so I would probably not include that much detail in my answer, but I definitely like the idea of indicating the key difference between the internal and its ancestral concept: uniformity.
That might be an ambiguity in formulation. My reading is that the sequence
are the only universes. Assuming my reading is incorrect and there may be other universes, I still get the impression that this fundamental sequence is cofinal. It would be good to clarify this.
(Not important, but still, there is a philosophical issue with what those indices actually are.)
Okay, I think this discussion has gone about as far as it can productively. (-: The word “uniform” doesn’t convey the desired meaning to me, but maybe if we found a better word I would be happier. What about just “internal”?
How would you even state a rule that means “these are the only universes”?
Agreed. :-) I think “uniform” is more precise but I concur that the reason is probably that I use it with that precise meaning in a variety of unrelated contexts.
How would any other type “happen” to be a universe?
Well, if you added a rule giving you another universe U_omega, and rules that U_i : U_omega for all i, then you’d have another universe. The U_i are the only universes we assert in the type theory, but nothing “rules out” adding other universes.
OK. It looks like the main part of this is (unintentional) ambiguity. Cofinality is still an issue: When we say that A is a type, we mean that it inhabits some universe . (Section 1.3.) I still like my two axioms better. Every type belongs to some universe by (1) and by (2) every universe starts an unbounded chain. The only loss is that doesn’t mean anything specific. (I can’t believe that would be useful for anything significant.)
It also resolves the (not so important) philosophical issue of the indices. The two axioms give you a chain of 17 universes whenever you need that many. However, the two axioms don’t give you an actual “infinite chain” in any definite sense. (As I mentioned earlier, an infinite chain of universes indexed by the internal would be “too weird,” to borrow Jesse’s expression.)
I don’t really see what is ambiguous. I guess you’re right about cofinality the way we phrased it, although I tend to prefer formalizing it with an implicit a la Tarski coercion, in which case cofinality is no longer a problem. I can see your argument, but I think the important advantages of simplicity and ease of understanding for newcomers still sways me to the choice we made in the book for purposes of exposition.
(FWIW, An infinite chain of universes indexed by the internal N did at least at one point exist in Agda, I think; possibly now things are slightly different, but I think the type indexing universe levels is still an internal type inside the type theory.)
Guillaume Brunerie wrote
I have a side question about universes. Would it be useful for something to have very large universes (like , , and so on)? I think that in set theory there are some arithmetic statements that can only be proved using very large cardinals, is the same thing true in type theory ?
Absolutely! Specifying the length of the hierarchy is not necessarily the right way to approach that at first. (It leads very fine grained analysis which makes little sense without the bigger picture.) More interesting (and more natural) are reflection principles which state that cofinally many universes “behave” just like the real universe for various notions of “behave”. Since the real universe, with everything in it, doesn’t exist as an actual type this is the only way to internalize global statements.
None of this has been studied with respect to HoTT but we already have a child poster for testing: determinacy principles. These are statements are about rather small types (, , ) but we already know from their set-theoretic analyses that many cannot be proven without the use of large universes.
Guillaume Brunerie replied to François G. Dorais
Thanks! Do you have a concrete example of something which would require a universe containing infinitely many smaller universes (something like an internal )?
Specifying the length of the hierarchy is not necessarily the right way to approach that at first.
Yes, I agree, I gave explicit ordinals just to make things simple.
I found a few papers about Martin-Löf type theory with a super universe. If I understood correctly, the idea is that there is a universe forming operation which associates to every family of types a universe containing and all of the . A super universe is then a universe closed under this universe forming operation (see http://www1.maths.leeds.ac.uk/~rathjen/Super.pdf and http://www2.math.uu.se/~palmgren/universe.pdf)
That seems interesting, but I didn’t find any other paper about that (or about type theory with more than omega universes) and neither Coq or Agda contain omega+1 universes.
The way universes are handled in Agda is quite similar to what you are talking about in some comments above. There is an (internalized) abstract type of universe levels called
Levelwith the following constants:
zero : Level_ _suc : Level -> Level max : Level -> Level -> Level
and for every
i : Level, there is a universe
Set iin Agda).
Then base types (including
Level) are put in
Type zero, the universe
Type iis of type
Type (suc i)and a Pi-type indexed by something in
Type iand whose codomain is in
Type jwill be in
Type (max i j). It doesn’t give you a universe where the Pi-type
(i : Level) -> Type iwould live, though, because the universe level of the codomain cannot depend on the input. Note that
Levelis abstract, it’s not the natural numbers but just some type having the appropriate operations.
Maybe the requirement that
Levelis small could be dropped, and the
maxoperation could be extended to any small family of levels?
[Apologies Guillaume, your comment apparently got caught in my spam box for a while.]
Interesting note about Agda. This seems to exploit the fact that some types might not belong to a universe. I find that an intriguing idea, especially from the perspective of foundations. It is, however, an interesting way to see that conflating Level and Nat can be dangerous.
It’s too early for good examples (well, it’s too early for me to come up with them). I outlined the way to get something natural that requires above but I haven’t yet tried to get something concrete using that strategy. One thing that does require some large universes is to make sure that the interpretation of in §10.5 satisfies strong comprehension and strong collection. The problem is not specific to interpreting set theory, though, it’s about quantifiers over “large” types. For example, quantifiers over live in , so you can’t use them to define types in even if the desired types are, in principle, “small” or “innocuous” enough to be in . Large universes allow you to work around that (but never fully since that would be as good as which is known to be inconsistent).
I don’t think it’s dangerous to conflate Level and Nat, at least not in the sense of inconsistency. It just yields a different type theory than the one we used in the book.
Do you have a proof of strong collection for V using large universes? I would be intrigued to see it.
Of course, the “resizing axioms” are intended to single out exactly those types that are “small” or “innocuous” enough to stay in the same universe…
Guillaume Brunerie replied to François G. Dorais
I think a simple example that requires is the following: assuming the axiom of choice, the V in §10.5 is a model of ZFC, but if you consider the V in you even get a model of ZFC + there exists inaccessible cardinals. And I don’t see how to get such a model without (and without induction-recursion).
Mike, since it’s only been a few days, I can’t say anything definite about HoTT with large universes. I’m still digesting plain HoTT. If you want an idea what large universes can do, you can look at the work of Michael Rathjen, Anton Setzer, and others on large universes in MLTT. It’s too early for me to determine what effects univalence might have there but since you have much more experience with HoTT, you might be able to work through it quicker.
Sure. I only asked because your comment “One thing that does require some large universes is to make sure that the interpretation of V in §10.5 satisfies strong comprehension and strong collection” implied to me that you knew that large universes did entail strong collection.
Update: A “natural” example of a non-uniform proof (via “universe hopping”) can be found in this later post. It illustrates a failure of the -rule, where there is an arithmetical statement such that one can prove but there is no proof of The problem is that there is no uniform term such that so the -intro (or generalization rule) does not apply.
However, and this is the point, the argument for is perfectly reasonable since the -rule is intuitively valid. It is therefore unacceptable for a foundation to explicitly exclude this argument. HoTT needs to be able to accommodate this type of argument via additional large universe hypotheses.
Note that the post linked above also shows that conflating the universe indices with the internal natural numbers does indeed raise the consistency strength of HoTT with AC and hence LEM.
A small question. how to define a subgroup in this setting? and a quotient group?
Subgroups and quotients are special cases of group homomorphisms.
Unlike in set theory, elements of a type don’t have a life outside of that type, so you can’t pick out the elements you like to form a subset. If you want to form a subtype you can use a dependent pair type whose elements are pairs where and . Then the first coordinate projection gives an embedding whose “image” is the right “subtype” of .
The existence of quotient groups is a little more delicate but they are all there. I plan to get there in a later post. In the mean time, you can check out the HoTT book and read about higher inductive types in chapter 6.