Stream: beginners

Topic: What actually *is* a tag?


view this post on Zulip Declan Joseph Maguire (Oct 21 2023 at 08:15):

Tags are super useful and elegant and I really like them, but what the hell are they, mathematically speaking? They just seem like magic objects to me, even though I understand how they function in a practical sense. Like what do they compile to? Do they compile to anything at all, or are they just an abstraction that vanishes? What's actually happening when I decide to add a branch to a function that returns a tag?

view this post on Zulip Brian Carroll (Oct 21 2023 at 11:09):

The tag ID is represented as an integer at runtime.
For each expression in your program that has a tag type, the type checker will figure out the set of all possible tags that can end up there. Then that set of tags gets a runtime representation where each tag is assigned an integer value.
So the same tag name A could be represented as different numbers in different places, depending on what other tags reach there.
Tags can also have payload values and in that case they form a data structure containing the payload values and the tag ID.

view this post on Zulip Brian Carroll (Oct 21 2023 at 11:09):

You did ask for a mathematical answer and I realise this is a programming answer!

view this post on Zulip Declan Joseph Maguire (Oct 21 2023 at 11:14):

Still helpful, it gives a skeleton that might help me double check my understanding when trying to understand a more mathy version. I guess part of why I want the maths answer is how neat and tight the rest of the type system is. It's just a highly comprehensible, unified system, and is otherwise conventional, but then you have tags that seem to blur the very lines between variable, name, and value.

view this post on Zulip Declan Joseph Maguire (Oct 21 2023 at 11:14):

They're so bizzare, even as they're so simple in practical use.

view this post on Zulip Richard Feldman (Oct 21 2023 at 12:35):

https://vimeo.com/653510682 around the 40 minute mark talks about the in-memory representation, kinda aimed at a C/C++ audience - I'm happy to elaborate on any of that!

view this post on Zulip Isaac Van Doren (Oct 21 2023 at 14:23):

I think of them as functions. So if you have MyType : [Foo Str] then Foo is a function Foo : Str -> [Foo Str] and when you write Foo "some string" you're applying Foo to the argument "some string". This isn't what Foo explicitly is in the language but I think it works conceptually.

It is however explicit in languages like Elm and Haskell. For example in elm:

> type Foo = Foo String
> Foo
<function> : String -> Foo

Here tags are just functions and you can use them in all the same ways. (in haskell they are called value constructors)

view this post on Zulip Ayaz Hafiz (Oct 21 2023 at 14:36):

Here is the math interpretation:
You can think of tags as forming an ADT/sum type/coproduct (I'll use coproduct). Specifically, for a given tag union type [Zero, Add x y], we have the inclusion operations

Zero[Zero,Add x y]Add x y[Zero,Add x y]\begin{align*} \tt{Zero} &\hookrightarrow \tt{[Zero, Add\ x\ y]}\\ \tt{Add\ x\ y} &\hookrightarrow \tt{[Zero, Add\ x\ y]} \end{align*}

Now, Roc has an important distinction between "open" and "closed" tag union types. The distinction is that a closed tag union cannot grow - that is, there is no way to transform the closed union type [Zero, Add x y] to any other union type. However, the open tag union type [Zero, Add x y]* can grow - that is, there is an operation I can perform to add a new variant to that type, thus making the type larger. We can thus think of some relationship like

Zero[Zero,Add x y]Add x y[Zero,Add x y]<AnyOtherTag>[Zero,Add x y]    [Zero,Add x y,<AnyOtherTag>]\begin{align*} \tt{Zero} &\hookrightarrow \tt{[Zero, Add\ x\ y]*}\\ \tt{Add\ x\ y} &\hookrightarrow \tt{[Zero, Add\ x\ y]*}\\ \tt{<AnyOtherTag>} &\hookrightarrow \tt{[Zero, Add\ x\ y]*} \implies \tt{[Zero, Add\ x\ y, <AnyOtherTag>]*} \end{align*}

If you think of all possible tag unions as sets, you can say adding the tag Sub x y to [Zero, Add x y]* is a function that maps [Zero, Add x y]* to [Zero, Add x y, Sub x y]* and leaves all other tag unions intact.

If you are familiar with category theory, we can say precisely that there is some category TagsTags consisting of all open and closed tag unions, where growing an open tag union by one variant is an endofunctor mapping that particular tag union. And, there is a free functor from a open tag union to a closed one, and a forgetful functor the other way, that form a monad over the category TagsTags

view this post on Zulip Declan Joseph Maguire (Oct 27 2023 at 11:50):

Okay from these replies and some independent research, I think I get how Tags work. Tags without payloads are basically Just Types but with weird rules that mean you can create them on the fly and treat them syntactically like values (if there's something deeper to this, tell me!). Without payloads, the type has only one inhabitant with no properties beyond that, but the with "with payload" version is still a little mysterious to me mathematically. The rules around scope and equality are still a little murky to me, which I think is where a lot of my confusion comes from.

view this post on Zulip Brendan Hansknecht (Oct 27 2023 at 11:58):

Tags without payloads are enums.

view this post on Zulip Brendan Hansknecht (Oct 27 2023 at 11:58):

Tags only disappear fully is there is only a single possible tag.

view this post on Zulip Brendan Hansknecht (Oct 27 2023 at 11:59):

With a payload, a tag is justed an enum and a union of data.

view this post on Zulip Brendan Hansknecht (Oct 27 2023 at 12:00):

Things get a tiny bit mor complex with recusion

view this post on Zulip Declan Joseph Maguire (Oct 27 2023 at 12:08):

Yeah on a moment-to-moment level I'd understood that's how they worked, I just found it bizzare (if extremely practical) that I could just write one function over here that spits out a Tag, another over there that accepts a Tag, and the two just know that they're talking about the same thing. Like that's not how things normally work, if you defined a type in a function then it wouldn't magically exist elsewhere. I guess it only works because Tags without values have no properties beyond their names and thus having things like that have such a wide scope won't cause things to stop working in one function just because you created a tag elsewhere. The payload is almost the same but opposite, a fungible bag of data unconstrained by its tag.

view this post on Zulip Brendan Hansknecht (Oct 27 2023 at 12:10):

Oh, that is really more about our type inference than tags specifically

view this post on Zulip Declan Joseph Maguire (Oct 27 2023 at 12:17):

With hindsight I can see that, but it was just so strange because it clearly would always work, but felt somehow like it shouldn't. Like it should somehow cause problems, contradictions. I think the more abstract side is starting to click though.

view this post on Zulip Declan Joseph Maguire (Oct 27 2023 at 12:19):

Like I'm confused by backpassing too, but that doesn't bother me because that's currently so opaque that I know it's just a matter of getting it, whereas Tags felt more mysterious because it felt like there was something that wasn't clicking even though I understood the pragmatic use perfectly.

view this post on Zulip Ayaz Hafiz (Oct 27 2023 at 13:38):

I think one key point to reconciling this is the fact that almost all types in Roc are structural, rather than nominal (except for opaque types, but we can disregard that here). Unlike in C/Rust/Java where if you declare, for example, and enum in a scope, each variant in the enum is scoped to the “name” of the enum. But in Roc, when enum variants (tags) compose into an enum, the enum does not have a “name” - its type is determined solely by the variants it consists of. That’s why passing a tag A between two functions that declare it separately works - because the types are compared as [A]=[A], which is true - whereas in, for example C++ if you try to compare OneEnum::A=OtherEnum::A, this is false, because the names of the enums are different.

view this post on Zulip Declan Joseph Maguire (Oct 27 2023 at 13:50):

Those are some great searchterms to investigate later. Yeah, the fact that you can be so loose with it, the fact that you don't need a scope or enclosing name, is what's so surprising to me. You can see in the other comments how it slowly dawns on me why this doesn't break things.

view this post on Zulip Johan Lövgren (Jan 01 2024 at 18:41):

Ayaz Hafiz said:

If you are familiar with category theory, we can say precisely that there is some category TagsTags consisting of all open and closed tag unions, where growing an open tag union by one variant is an endofunctor mapping that particular tag union. And, there is a free functor from a open tag union to a closed one, and a forgetful functor the other way, that form a monad over the category TagsTags

By the way, I have had this thought at the back of my mind for a while, sorry for resurrecting an old thread :sweat_smile: ... This sounds like something there should be a paper about. Is anyone writing a paper about this? Or is this based on some paper already? Maybe someone should write a paper about it? :grinning_face_with_smiling_eyes:

view this post on Zulip Ayaz Hafiz (Jan 01 2024 at 20:19):

I don't know of any academic literature about that interpretation exactly, but it's the same as the idea of "pointed sets" (as opposed to regular sets) in math which are well-studied.

view this post on Zulip Johan Lövgren (Jan 01 2024 at 21:01):

Thanks, that gives me something to read about!

view this post on Zulip Rolf Kreibaum (Jan 02 2024 at 01:04):

Hi, I got a bit nerd sniped by that category theory connection / question and tried to work out some details. I mostly ran into problems with that. I don't think the Tags subcategory "just works" and would need some work to define what the Objects actually are that is more concrete than "all Roc types". Once that Object set has some mathematical modeling, that will directly include a model of open and closed tag unions.

If you actually do that modelling, you'll probably want to start with a smaller subset of Roc.

Anyway, here is what I wrote about that earlier, without trying to model what the Objects and Arrows look like.


I don't think we can throw categories on this and expect it to be simple.

growing an open tag union by one variant is an endofunctor mapping that particular tag union

Here a functor is a mechanism to turn one union type into another union type. The example functor here is "Adding Sub x y to the union".

[Zero] -> [Zero, Sub x y]
[Zero, Add x y] -> [Zero, Add x y, Sub x y]
[Roc] -> [Roc, Sub x y]
[Zero, Add x y, Sub x y] -> [Zero, Add x y, Sub x y]

Now the last one is a bit weird. It doesn't actually change the type because "Sub x y" is already in there. But it hints at a problem this example functor has:

[Sub Marine] -> TYPE MISMATCH, does not compile.

So you'll need to define each such functor on a separate subset (sub-category) of the "Tags" category.

view this post on Zulip Rolf Kreibaum (Jan 02 2024 at 02:07):


there is a free functor from a open tag union to a closed one, and a forgetful functor the other way

I just arrived here from a podcast on Roc so I had some reading up to do on open and closed. But I don't think you can forget either way. I failed to construct the forgetful functor.

Here is what I tried:

Given a Roc function f : [Red Str] -> [Red Nat] (e.g. string length) we can extend it to f*: [Red Str]* -> [Red Nat]* by defining

f* : [Red Str]* -> [Red Nat]*
f* = \tag ->
    when tag is
        Red str -> f str
        x -> x

So in this instance the functor looks ok.

I don't know if an open tag union is actually allowed in return position in Roc. I assume it is allowed, otherwise we wouldn't have much of a Tags category to start with.

But let's consider g: [Red] -> [Green Str] with Red -> Green "Apple" as definition.
The extension would be defined on g*: [Red]* -> [Green Str]* - but where does this map Green 7? It is not allowed to map to Green 7, because the type [Green Str]* guarantees that any value tagged as Green must carry a Str payload.

I had no success trying to define a functor going the other way either :-(

view this post on Zulip Rolf Kreibaum (Jan 02 2024 at 02:21):


More on open/closed from the Tutorial

If you have an open union, that means it can accumulate more tags through conditional branches.
If you accept an open union, that means you have to handle the possibility that it has a tag you can't know about.

To me this sounds very much like "an open union you have" and "an open union you receive" are actually two different things that just happen to share a name.

The "open union you receive" seems to be a proper type. It can appear it type signatures.
The "open union you have" seems to be the type of an expression - and it feels a lot more like a "closed union" to me. Because I can pass the value of the expression to functions that require a closed union, I know that only values of type "closed union" can actually be the value of this expression.


Last updated: Jul 06 2025 at 12:14 UTC