Stream: beginners

Topic: open and closed tag unions


view this post on Zulip Loric Brevet (Feb 22 2022 at 08:39):

Hey, this is my first message here, so first congrats for making Roc, I am having a really good time digging the project!

I however have some difficulties really grasping the difference between open and closed tag unions. If I send this piece of code to the REPL, I am expecting that the return type would be inferred to a closed tag as the pattern matching in the definition states all the possible return tags (and no "_"), but it infers an open tag instead. Is there a reason why?

» something = 0
…
… stoplightColor =
…     if something > 0 then
…         Red
…     else if something == 0 then
…         Yellow
…     else
…         Green
…
… stoplightColor

Yellow : [ Green, Red, Yellow ]*

Thanks

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 09:07):

So I could be wrong cause I don't intimately know the details of tags, but to my understanding this is an open tag union, because it could have other values. There is a chance your color tag might also have Purple or Grey, but that can't be inferred from your code sample. Just because the result can only be 1 of 3 possible values does not mean a color must be one of those 3 possible values.

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 09:09):

If instead you were matching on a color like this:

when color is
    Red -> #do something
    Yellow -> #another thing
    Green -> # last thing

That would limit it to a closed union. In the when statement, you must deal with all the cases. Therefore, no other cases can exist in your color union.

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 09:11):

Of course, you could just add:
stoplightColor : [ Green, Red, Yellow ]
and that would also tell the compiler the type and force it to be a closed union. (not sure if that works in the repl though)

view this post on Zulip Loric Brevet (Feb 22 2022 at 10:40):

Thanks for your help, I have tried to adapt my example to use a when is, but the inferred tag is also open.

» something = Positive
…
… stoplightColor =
…     when something is
…         Positive -> Red
…         Negative -> Yellow
…         Equal -> Green
…
… stoplightColor

Red : [ Green, Red, Yellow ]*

Of course, if I change my implementation, other tags "could" be introduced, but with that specific definition, I am not sure I understand why Roc does not report a closed tag union, as all the return values are spelled out explicitly.

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 16:58):

So with that example, if you look at the type of something, it should be a closed tag union for [ Positive, Negative, Equal ], but stoplightColor is still in the same situation as before. Since it is only the result of an expression, there is no guarantee those are all of the possible values. So it is still open. The big difference is between being the result of a when and being the variable that the when matches on.

view this post on Zulip jan kili (Feb 22 2022 at 17:53):

I've been using Roc for a few months now, and I still don't really understand open vs closed tag unions. I think that inferred openness is a default behavior in order to make a def more usable (by default) as an argument to a later function that accepts a superset of those tags.

view this post on Zulip jan kili (Feb 22 2022 at 17:56):

Essentially, I think closed tag unions are rarely preferable to open ones, for values that you're passing around (unless their implementation for function arguments is different than I remember) but I don't want to hijack this topic lol

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:06):

yeah I think I need to iterate on how to teach this; I don't think I've found a great way to teach it yet

view this post on Zulip jan kili (Feb 22 2022 at 18:07):

For example (building on your last example), if stringify : [ Blue, Green, Red, Yellow ] -> Str then I think you can't do the function call stringify stoplightColor if stoplightColor was closed. However, you can do it since stoplightColor is open.

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:07):

so if we didn't have closed unions, we couldn't have exhaustiveness checking

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:07):

in other words, if we didn't have closed unions, all whens would need a _ -> branch
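The exhaustiveness point can be sketched with a toy model in Python. This is only an illustration of the idea, not how Roc's checker is implemented; `is_exhaustive` and its parameters are hypothetical names made up for this sketch.

```python
def is_exhaustive(known_tags, is_open, handled, has_catch_all=False):
    """Toy check: does a `when` cover every value of the union?

    known_tags    -- tags listed in the union's type
    is_open       -- True if the type ends in `*` (more tags may exist)
    handled       -- tags that have their own branch in the `when`
    has_catch_all -- True if the `when` has a `_ ->` branch
    """
    if has_catch_all:
        return True               # `_ ->` covers known and unknown tags alike
    if is_open:
        return False              # unknown tags may exist and nothing covers them
    return known_tags <= handled  # closed: every listed tag needs a branch


stoplight = {"Red", "Yellow", "Green"}

# Closed union, one branch per tag: no `_ ->` needed.
print(is_exhaustive(stoplight, False, {"Red", "Yellow", "Green"}))  # True

# A union that stays open can only be covered with a catch-all.
print(is_exhaustive(stoplight, True, {"Red", "Yellow", "Green"}))   # False

# Adding a tag to the closed union flags the old `when` as incomplete.
print(is_exhaustive(stoplight | {"Blue"}, False, {"Red", "Yellow", "Green"}))  # False
```

In this model, a closed union is what lets the check succeed without a catch-all, and it is also what makes a newly added tag show up as an error.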

view this post on Zulip jan kili (Feb 22 2022 at 18:09):

@Richard Feldman In @Loric Brevet's last example, why is stoplightColor inferred to be open?

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:11):

one way to think of the answer to that question:

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:11):

if I wrote stoplightColor = Red, then it would be open

view this post on Zulip jan kili (Feb 22 2022 at 18:11):

(yeah, good reduction, that confuses me equally haha)

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:12):

if I write stoplightColor = and then either Red or Green, then both Red and Green are open, so no matter what, I'm still assigning it to an open union

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:13):

also, if I wrote this:

stoplightColor =
    when something is
        Positive ->
            closedRed : [ Red ]
            closedRed = Red

            closedRed

        Negative -> Yellow
        Equal -> Green

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:13):

I'd get a type mismatch

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:13):

because closed unions can't grow (they're closed!)

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:13):

and all branches of a conditional have to be type-compatible

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:14):

so when you have multiple branches of a conditional that are all open unions, it's ok - they all union together to be a bigger open union

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:14):

but if any branch is a closed union, then all branches have to be exactly that closed union

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:14):

otherwise type mismatch

view this post on Zulip jan kili (Feb 22 2022 at 18:14):

(dang, going into dentist, this is more fun, bye)

view this post on Zulip Tommy Graves (Feb 22 2022 at 18:33):

I think what doesn't make sense is that a function which takes a closed record can't accept an open record, but a function that takes a closed tag union can seemingly accept an open tag union value, right?

Intuitively if I have a value of type [Red]* that means it could be Red or it could be any other tag at all. So how can I pass that into a function which takes [Red] -- if my value is Blue then it's incompatible with [Red]! Somehow this compiles:

main =
  f : [Red] -> Str
  f = \a ->
    when a is
      Red -> "something"

  x = Red

  f x

even though x is an open union?

In my head the explanation is "the compiler knows x is only ever Red even though its type is an open union" but I'm not sure what other parts of the type system work this way -- where just from looking at the types I can give an example where exhaustiveness checking would fail, but from looking at the actual implementation I know that can't happen?

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:36):

yeah, I wonder if the name "open union" is part of the problem :thinking:

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:36):

I wonder if there's a better name that might help

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:37):

like really the way the system works is that it has some very nice properties:

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:38):

but the way it has those nice properties requires this concept that doesn't seem to map neatly onto something that appears elsewhere in programming

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:38):

there's no "open unions are basically like ______" because they're not basically like anything I can think of :sweat_smile:

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:38):

they're a unique concept, as far as I'm aware

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:39):

I have this feeling that there's a good way to teach them that gives a good intuition for how they work - I'm just not sure what that is yet! :big_smile:

view this post on Zulip Tommy Graves (Feb 22 2022 at 18:42):

The good news is that what happens is actually intuitive (looking from the code snippet I gave before, I think it would be very surprising if it did not compile!). It's just when you actually inspect the type of a value that it gets confusing. I wonder if the type for x when x = Red should be displayed or formatted differently than [ Red ]* -- it's almost like there are three concepts (closed tag union, open tag union as a type annotation for a function parameter, and open tag union as the type of a value)

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:47):

yeah as long as you don't include any type annotations or talk about types, everything feels intuitive :laughing:

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:48):

which, to be fair, is another nice property of the system!

view this post on Zulip jan kili (Feb 22 2022 at 18:49):

I return with clean teeth to find my own views expressed better than I could express them! Wonderful.

view this post on Zulip jan kili (Feb 22 2022 at 18:50):

I also don't get what "growing" means in the context of "closed can't grow"

view this post on Zulip Tommy Graves (Feb 22 2022 at 18:51):

Yeah, I think words like "grow" and "accumulate" are tricky to understand in a language without mutation

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:53):

hrm yeah, what I mean is that the union can't combine with other unions

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:54):

like how in that conditional we have the tag unions [ Red ]*, [ Yellow ]* and [ Green ]*, and because they're each used in a different branch of the conditional, they combine to make the entire conditional be [ Red, Yellow, Green ]*

view this post on Zulip Richard Feldman (Feb 22 2022 at 18:54):

but closed unions in conditionals don't do that, they just give type mismatches

view this post on Zulip jan kili (Feb 22 2022 at 18:59):

I think we should reimagine the syntax and terminology from a script author's perspective, since it currently feels oriented toward compiler developers

view this post on Zulip jan kili (Feb 22 2022 at 19:02):

For example, maybe we'd find that the open/closed distinction is not useful in scripts, and then we could just let it be an under-the-hood compiler inferencing/checking implementation detail

view this post on Zulip jan kili (Feb 22 2022 at 19:03):

What problems in developer space were the concepts/features around open/closed tag unions created to solve?

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 19:19):

Yeah, from a user perspective, I feel like they are something like "restricted" and "expandable" unions.

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 19:20):

One with a restricted list of exact values, the other that could expand to contain anything.

view this post on Zulip Emi (Feb 22 2022 at 19:22):

exhaustive / nonexhaustive might work too, which is similar to language Rust uses for marking enums?

view this post on Zulip jan kili (Feb 22 2022 at 19:56):

(Should we move this reimagination discussion to another topic? Perhaps there is a parallel how-to discussion to be had here.)

view this post on Zulip Richard Feldman (Feb 22 2022 at 19:58):

I think it's fine to discuss here :shrug:

view this post on Zulip Richard Feldman (Feb 22 2022 at 19:58):

JanCVanB said:

maybe we'd find that the open/closed distinction is not useful in scripts, and then we could just let it be an under-the-hood compiler inferencing/checking implementation detail

we talked about this at some point, and I don't think it really works

view this post on Zulip Richard Feldman (Feb 22 2022 at 19:58):

or at least, not the way we'd been talking about it

view this post on Zulip Richard Feldman (Feb 22 2022 at 19:59):

there has to be a type variable there - that's really important

view this post on Zulip Richard Feldman (Feb 22 2022 at 19:59):

and sometimes it's not *

view this post on Zulip Richard Feldman (Feb 22 2022 at 19:59):

(although to be fair, that's pretty rare)

view this post on Zulip Richard Feldman (Feb 22 2022 at 19:59):

I don't think we should have language syntax or semantics vary by platform

view this post on Zulip Richard Feldman (Feb 22 2022 at 20:00):

Haskell has per-project syntax/semantics and it's a big pain point in the ecosystem

view this post on Zulip Richard Feldman (Feb 22 2022 at 20:03):

one problem with hiding it syntactically in the type - that is, making there be no observable distinction between the type signature of an open or closed tag union, except in the specific and rare case that there has to be a named type variable - is that you'd sometimes get surprising type mismatches

view this post on Zulip jan kili (Feb 22 2022 at 20:04):

Observations:

view this post on Zulip Richard Feldman (Feb 22 2022 at 20:04):

like a type rendered as [ A, B, C ] would sometimes be type-compatible with another type rendered as [ A, B ] and other times not, depending on whether they were secretly open or closed under the hood

view this post on Zulip Richard Feldman (Feb 22 2022 at 20:04):

that seems like it would be less confusing when it works, but super confusing when it didn't work

view this post on Zulip Richard Feldman (Feb 22 2022 at 20:05):

_ -> ... isn't just noisy and sometimes illogical, but also having it means you lose out on a great feature of tags

view this post on Zulip Richard Feldman (Feb 22 2022 at 20:05):

namely that if you add a new tag, you get type mismatches everywhere you need to account for it

view this post on Zulip Richard Feldman (Feb 22 2022 at 20:06):

that's a wonderful feature, and if you had _ -> everywhere, you wouldn't get that

view this post on Zulip jan kili (Feb 22 2022 at 20:07):

Could removing _ -> ... from Roc actually be a good thing?

view this post on Zulip Folkert de Vries (Feb 22 2022 at 20:11):

good luck matching on all the integers exhaustively

view this post on Zulip Folkert de Vries (Feb 22 2022 at 20:11):

or worse, strings

view this post on Zulip Tommy Graves (Feb 22 2022 at 20:12):

I was toying around with alternative type renderings, like:
for a parameter's type signature, display it as [ A, B, _ ] instead of [ A, B ]*
for a value's type, render it as [ A, B ]+ instead of [ A, B ]*

Those obscure that the * is a type variable, though, which I suppose is probably relevant?

view this post on Zulip Tommy Graves (Feb 22 2022 at 20:13):

[ A, B, _ ] has some intriguing value to me because it matches how you would pattern match the options but I'm not sure it actually helps with the conceptual problem we are talking about

view this post on Zulip jan kili (Feb 22 2022 at 20:14):

@Folkert de Vries

Could removing _ -> ... from Roc actually be a good thing?

... for tags only? Idk if that's possible...

view this post on Zulip Folkert de Vries (Feb 22 2022 at 20:15):

in theory, but in practice that will just be very annoying I think. Having a catch-all is very convenient

view this post on Zulip jan kili (Feb 22 2022 at 20:55):

New observation: _ -> ... in a tag-parsing when block only requires "openness" if the tag union isn't otherwise constrained.
Example of using a catch-all without requiring "openness":

light : [Red, Yellow, Green]
when light is
    Red -> 1
    _ -> 2

view this post on Zulip jan kili (Feb 22 2022 at 20:56):

What if using a tag catch-all for an unconstrained tag type was disallowed? Are there any use cases that would be hindered by that?

view this post on Zulip jan kili (Feb 22 2022 at 21:01):

lightParser = \light ->
    when light is
        Green -> "good"
        _ -> "oh no"
lightParser Yellow

Error: Unconstrained Tag Union: lightParser must have an annotation or drop its catch-all case.
This might make sense to me as a developer, but it would be a blemish on Roc's stellar type inference claims :/

view this post on Zulip jan kili (Feb 22 2022 at 21:02):

I want something to change, but I'll stop prematurely jumping to solutions now :stuck_out_tongue_wink:

view this post on Zulip Richard Feldman (Feb 22 2022 at 21:13):

honestly, I don't think removing catch-all would help here

view this post on Zulip Richard Feldman (Feb 22 2022 at 21:13):

whether or not there's a catch-all branch, it's still the case that we very much want (and indeed need) it to be possible to return different tags from different branches of a conditional

view this post on Zulip Richard Feldman (Feb 22 2022 at 21:13):

for example, this has to work:

if blah then
    True
else
    False

view this post on Zulip Richard Feldman (Feb 22 2022 at 21:14):

else has the same semantics as _ -> in when, and obviously we're not removing else :big_smile:

view this post on Zulip Richard Feldman (Feb 22 2022 at 21:14):

for that conditional to work, it has to be possible to have different tags in different branches

view this post on Zulip Richard Feldman (Feb 22 2022 at 21:14):

which in turn means those can't be inferred as [ True ] and [ False ]

view this post on Zulip Richard Feldman (Feb 22 2022 at 21:14):

or else that would be a type mismatch

view this post on Zulip Nicholas Cahill (Feb 22 2022 at 21:25):

I still feel kind of confused by this, in terms of types. It feels like [ Red ]* is treated like a subtype of [ Red ] and [ Red, Green ] while working out what returns from a branching statement, but we're telling the compiler to pretend [ Red ] isn't a subtype of [ Red, Green ] when trying to figure out what the return type of a branching thing is?

view this post on Zulip jan kili (Feb 22 2022 at 21:28):

Here's a weird train of thought, coming from someone who doesn't know type theory, set theory, category theory, or FP:

view this post on Zulip Nicholas Cahill (Feb 22 2022 at 21:39):

Also the idea that I can pass a [ Red ]* value to a function that needs [ Red ]

view this post on Zulip Folkert de Vries (Feb 22 2022 at 22:06):

thinking of this as subtyping is misleading I think. Our types work with unification, basically the idea that 2 types are the same if I can substitute variables in the one type to get to the other type

view this post on Zulip Folkert de Vries (Feb 22 2022 at 22:06):

so in the case of [ Red ]*, I can substitute * = [] to get to [ Red ]

view this post on Zulip Folkert de Vries (Feb 22 2022 at 22:07):

but in the case of [ Red ], there are no substitutions I can make to turn that into [ Red, Green ]
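The substitution idea can be illustrated with a small Python sketch. It assumes a deliberately simplified model in which a tag union is a set of tag names plus a flag for the trailing `*` variable; `TagUnion` and `substitution_for` are hypothetical names for this sketch, not part of the Roc compiler.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TagUnion:
    tags: FrozenSet[str]
    is_open: bool  # True models the trailing `*` type variable


def substitution_for(source: TagUnion, target: TagUnion) -> Optional[FrozenSet[str]]:
    """Return the tags to plug in for source's `*` so it becomes the closed
    `target`, or None when no such substitution exists."""
    if target.is_open:
        raise ValueError("this sketch only handles closed targets")
    if not source.is_open:
        return None  # a closed union has no variable to substitute
    if not source.tags <= target.tags:
        return None  # source already has a tag the target lacks
    return target.tags - source.tags


red_open = TagUnion(frozenset({"Red"}), is_open=True)          # [ Red ]*
red_closed = TagUnion(frozenset({"Red"}), is_open=False)       # [ Red ]
red_green = TagUnion(frozenset({"Red", "Green"}), is_open=False)  # [ Red, Green ]

print(substitution_for(red_open, red_closed))    # frozenset(), i.e. * = []
print(substitution_for(red_open, red_green))     # frozenset({'Green'})
print(substitution_for(red_closed, red_green))   # None
```

The last call returns None for the reason given above: with no variable in [ Red ], there is nothing to substitute.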

view this post on Zulip Richard Feldman (Feb 22 2022 at 22:32):

yeah the way the compiler works is: whenever more than one type needs to get "unified" into a single type (for example, the types of each branch of a conditional must get unified into a single type, because the conditional expression as a whole needs to have a single type), there are rules for how unification works

view this post on Zulip Richard Feldman (Feb 22 2022 at 22:33):

one of the rules is that if an open tag union unifies with another open tag union, you get a new open tag union containing all the tags of both

view this post on Zulip Richard Feldman (Feb 22 2022 at 22:34):

another of the rules is that if a closed tag union unifies with a closed tag union, they must have identical tags or else unification fails (which is what we call a type mismatch)

view this post on Zulip Richard Feldman (Feb 22 2022 at 22:35):

and one last rule is that whenever an open union and a closed union unify together, if the closed union contains all the tags in the open union, then they unify to the closed union...but if the open union has any tags the closed union doesn't have, then unification fails and it's a type mismatch

view this post on Zulip Richard Feldman (Feb 22 2022 at 22:35):

so those are the rules
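As a rough illustration, the three rules can be written down as a toy Python function. This is a sketch under simplifying assumptions (tags without payloads, a single anonymous `*` variable), and `TagUnion`, `tag_union`, `TypeMismatch`, and `unify` are hypothetical names, not the compiler's actual code.

```python
from dataclasses import dataclass
from functools import reduce
from typing import FrozenSet


@dataclass(frozen=True)
class TagUnion:
    tags: FrozenSet[str]
    is_open: bool  # True models the trailing `*`


class TypeMismatch(Exception):
    pass


def tag_union(*tags: str, is_open: bool = False) -> TagUnion:
    return TagUnion(frozenset(tags), is_open)


def unify(a: TagUnion, b: TagUnion) -> TagUnion:
    # Rule 1: open + open gives an open union with the tags of both.
    if a.is_open and b.is_open:
        return TagUnion(a.tags | b.tags, is_open=True)
    # Rule 2: closed + closed must have identical tags.
    if not a.is_open and not b.is_open:
        if a.tags != b.tags:
            raise TypeMismatch("closed unions differ")
        return a
    # Rule 3: open + closed gives the closed union, provided the closed
    # union already contains every tag of the open one.
    open_u, closed_u = (a, b) if a.is_open else (b, a)
    if not open_u.tags <= closed_u.tags:
        raise TypeMismatch("open union has a tag the closed union lacks")
    return closed_u


# Branches of a conditional: [ Red ]*, [ Yellow ]*, [ Green ]*
branches = [tag_union(t, is_open=True) for t in ("Red", "Yellow", "Green")]
combined = reduce(unify, branches)
print(sorted(combined.tags), combined.is_open)  # ['Green', 'Red', 'Yellow'] True

# Passing an open [ Red ]* value to a function that takes closed [ Red ]
passed = unify(tag_union("Red", is_open=True), tag_union("Red"))
print(sorted(passed.tags), passed.is_open)  # ['Red'] False

# A closed [ Red ] branch next to an open [ Yellow ]* branch
try:
    unify(tag_union("Red"), tag_union("Yellow", is_open=True))
except TypeMismatch as err:
    print("type mismatch:", err)
```

The three calls correspond to examples from earlier in the thread: the stoplight conditional, the `f x` call with `f : [Red] -> Str`, and the `closedRed` branch that produces a mismatch.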

view this post on Zulip Richard Feldman (Feb 22 2022 at 22:35):

the question is, is there a better way to explain them than just to lay them out like that?

view this post on Zulip Richard Feldman (Feb 22 2022 at 22:35):

(maybe there isn't!)

view this post on Zulip jan kili (Feb 22 2022 at 22:43):

Wow, those are super direct+clear explanations, which greatly help my understanding of the system implementation. Thanks, F&R!

view this post on Zulip jan kili (Feb 22 2022 at 22:46):

I think a lot of the confusion around tag unions, therefore, boils down to dissatisfaction with the unifier's level of pessimism/caution - for example, why can't two different closed tag unions be unified as a union of the two? The "uni-" (unify, union, union) vocabulary also reinforces that expectation.

view this post on Zulip Folkert de Vries (Feb 22 2022 at 22:54):

unification has a precise technical meaning here. If you can give me a substitution, then 2 types are "equal up to unification" which for us means they have an equivalent type

view this post on Zulip Folkert de Vries (Feb 22 2022 at 22:54):

there is just no substitution to give between [ Red ] and [ Red, Green ] because there are no variables to even substitute

view this post on Zulip jan kili (Feb 22 2022 at 22:55):

I imagine that union-y/optimistic unification might be a bad thing to do in some situations? I feel like this pessimistic unification is shifting some kind of complexity onto developers, leading them to make every tag union open, for fear of mismatches. If that's a good practice, then we should make it a syntactically (and terminologically) simpler choice.

view this post on Zulip Folkert de Vries (Feb 22 2022 at 22:56):

there is no real thought or value judgement in these rules: this is just how the formalism works. It's a bit like why 1 + 1 is not 11. That's just not what + does, even though it has its own logic

view this post on Zulip jan kili (Feb 22 2022 at 22:58):

We might be talking past each other here: I'm sure that tag unions are implemented in a theoretically sound manner with respect to compilers and type systems, but to Roc app developers they are marketed as a tool for domain modeling. They don't work as expected for domain modeling.

view this post on Zulip Folkert de Vries (Feb 22 2022 at 22:59):

sure, it's just that any system you come up with ultimately should work within this framework of substitutions

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:00):

also a more restrictive version of this (sum types) seems to work fine in elm/haskell/ocaml/...

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:01):

we already provide a bunch more flexibility

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:03):

I feel like with domain modeling you essentially always want closed unions. You are trying to capture all of the possibilities of each piece of your domain. As such, to merge two unions, you have to make a wrapper union

A : [ Foo, Bar ]
B : [ Baz, Bump ]
AorB : [ SomeA A, SomeB B ]

That would be merging A and B. It wouldn't make sense to do it otherwise.

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:04):

I don't think you want the super union of AB : [ Foo, Bar, Baz, Bump ]

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:04):

That elides information about the original unions they came from.

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:05):

right, and if that is what you want then you can write the functions to make it happen

view this post on Zulip jan kili (Feb 22 2022 at 23:12):

What should be done here? (Genuinely asking, as I'm still learning DDD.)

Animal : [ Cat, Dog, Lion, Tiger ]
Pet : [ Cat, Dog ]
breed : Animal, Animal -> List Animal
dog1 : [ Dog ]
dog2 : [ Dog ]

I want to do puppies = breed dog1 dog2, but I think that would generate a type mismatch because Pet can't be unified with Animal.

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:14):

yes

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:14):

you're trying to do subtyping

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:14):

why?

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:15):

One option:

Animal : [ Pet Pet, Wild Wild ]
Pet : [ Cat, Dog ]
Wild : [ Lion, Tiger ]

Another :

Animal : [ Cat, Dog, Lion, Tiger ]
isPet : Animal -> Bool

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:18):

Specifically, for the function:

dog1 : Animal
dog1 = Pet (Dog "whatever properties")

Then with the breed you would just match on the union. It will fail on an animal type mismatch. Returning a result with an error

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:19):

for a bit more context: this example looks exactly like the sort of thing you'd see as the first example of class hierarchies in object-oriented programming, and coming from the FP tradition that just does not make sense to me

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:20):

That's 100% true, but we still have to enable a way to model it. That model may 100% throw out the hierarchy or do things very differently, but if someone wants to solve a problem like this, there still needs to be a solution.

view this post on Zulip Folkert de Vries (Feb 22 2022 at 23:22):

yeah but then my question is what the shared behavior actually is here

view this post on Zulip jan kili (Feb 22 2022 at 23:29):

Side note, I'm unsure how to proceed because:

  1. I accidentally picked an OOP-like example, but my real point was that I think it's weird that tag functions reject inputs for being too-specific
  2. I want to learn FP and ditch my problematic OOP tendencies
  3. I want Roc to be welcoming to emigrants from OOP

:laughing: :shrug:

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:44):

I guess my question is why is dog1 a [ Dog ] in your example? Just make it a [ Dog ]* or Animal and everything works.

view this post on Zulip Brendan Hansknecht (Feb 22 2022 at 23:46):

Also, I am totally for trying to take a few fully baked OOP examples and then trying to convert them into idiomatic Roc. That would probably be a good mini tutorial repo.

view this post on Zulip jan kili (Feb 23 2022 at 00:57):

@Brendan Hansknecht

I guess my question is why is dog1 a [ Dog ] in your example? Just make it a [ Dog ]* or Animal and everything works.

In the context of domain modeling, is it always best to use closed tag unions for all type aliases and open tag unions for all value annotations? It seems that way, to maximize flexibility.

view this post on Zulip Richard Feldman (Feb 23 2022 at 00:59):

I'd hesitate to say "always" whenever it comes to domain modeling, but off the top of my head I can't think of a situation where it would be a good idea to use an open tag union for domain modeling :thinking:

view this post on Zulip Richard Feldman (Feb 23 2022 at 01:00):

that said, I think it's fine to annotate values as closed unions if you know that's how they're going to be used

view this post on Zulip Richard Feldman (Feb 23 2022 at 01:00):

for example, I think annotating something as Bool (which is a type alias for [ True, False ]) is better than annotating it as [ True ]* even if that's technically more flexible :big_smile:

view this post on Zulip Richard Feldman (Feb 23 2022 at 01:01):

I mean, let's be honest - you know that [ True ]* is gonna end up as a Bool

view this post on Zulip Richard Feldman (Feb 23 2022 at 01:01):

and if it doesn't, it's because I made a mistake and I'd rather the compiler told me about it!

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 01:48):

Okay yeah; I think describing these things as tag unions made me _really_ want to think of them as union types

view this post on Zulip Brendan Hansknecht (Feb 23 2022 at 01:58):

Are they somehow not union types? I am not sure I understand the comment. Can you define union types?

view this post on Zulip jan kili (Feb 23 2022 at 02:00):

They're unions with extra rules: closed unions can't be unioned further, and open unions are somewhat infinite

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:01):

A union type is a type that has its union members as subtypes, so you get variances in values and function constraints from them.

view this post on Zulip jan kili (Feb 23 2022 at 02:02):

Should the takeaway for app developers be "in your tag types, delete this asterisk to get stricter type checking by disallowing passage to more-general-purpose functions"? If so, maybe they should be renamed like this

A complementary syntax change for [ A, B, C ]* / [ A, B, C ] might be [ A, B, C ] / [[ A, B, C ]] or ( A, B, C ) / [ A, B, C ], to make the former simpler & gentler. (This ignores [ A, B ]c because I don't grasp its intuitions yet.)

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:05):

Like, Result = [ Ok Str, Err Str ] isn't a union of an Ok type and an Err type. Conceptually anyway. It's just its own type, and Ok Str isn't a subtype of it

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:07):

Like when I say Ok "okay okay", I didn't instantiate an Ok, I instantiated a Result

view this post on Zulip Ayaz Hafiz (Feb 23 2022 at 02:08):

It might be helpful to not try to think in terms of subtyping, because there is no subtyping in Roc

view this post on Zulip Brendan Hansknecht (Feb 23 2022 at 02:09):

Ok, so I get the result comment.
What is an example of a proper union type?

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:30):

Let's see. So if I was writing Elixir, I could do something like

def magnitude(x) when is_number(x) do
   x
end
def magnitude(x) when is_string(x) do
   String.length(x)
end

Then magnitude is a function that can take a string or an integer, no tags or anything like that.

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:30):

I really need to learn how to make things look nicer in Zulip...

view this post on Zulip jan kili (Feb 23 2022 at 02:31):

Replace these single-quotes with backticks:

'''
code
'''

view this post on Zulip Brendan Hansknecht (Feb 23 2022 at 02:34):

You also can add the language name after the first set of backticks to get syntax highlighting if the language is supported.

view this post on Zulip Brendan Hansknecht (Feb 23 2022 at 02:35):

An interesting example, that would have never crossed my mind as a union at all. More like an interesting form of overloading a function. But I guess my background in unions comes from c/c++ and rust.

view this post on Zulip Richard Feldman (Feb 23 2022 at 02:36):

I think the name "tag set" is interesting :thinking:

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:36):

Using these tagged types, it might look more like this:

def magnitude({:int, i} = x) do
    i
end
def magnitude({:string, s} = x) do
    String.length(s)
end

This input type would be something like [Int Int, String String]

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:37):

Oh yeah, knowing how to format stuff rules

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:39):

That example isn't quite perfect either, you'd want to use a struct to make it pickier

view this post on Zulip Brendan Hansknecht (Feb 23 2022 at 02:39):

Yeah, and in roc, it would be something like:

magnitude : [ Int I64, Str Str ] -> I64
magnitude = \x ->
    when x is
        Int i -> i
        Str s -> Str.len s

view this post on Zulip Richard Feldman (Feb 23 2022 at 02:40):

(incidentally, the reason they're called "tag unions" is in large part because in memory they are tagged unions)

view this post on Zulip Brendan Hansknecht (Feb 23 2022 at 02:41):

Definitely quite different from how the first example would actually compile and work, but similar to your second example, just with a single function and a match.

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:52):

Yeah. That Elixir example is maybe a bad one because it still IS kind of a union. I think in most languages you're forced to give something like [ Int I64, Str Str] a name and reference that name when instantiating it, so it's clearer that it doesn't necessarily have anything to do (in terms of not having any kind of subtype/supertype relationship ) with [ Int I64, Str Str, Bool Bool ]

view this post on Zulip Nicholas Cahill (Feb 23 2022 at 02:55):

Like a Rust Result has Ok and Err in it, but Ok and Err don't mean anything outside of the context of Result. They're really Result::Ok and Result::Err
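The point about named types can be sketched in Rust. This is only an illustration: `Small`, `Large`, and `magnitude` are hypothetical names, not anything from Roc or from the messages above. The two enums share variant names, yet they are unrelated types, so moving a value from one to the other takes an explicit conversion.

```rust
// Hypothetical nominal enums that mirror [ Int I64, Str Str ] and
// [ Int I64, Str Str, Bool Bool ]. Sharing variant names does not relate them.
#[derive(Debug, PartialEq)]
enum Small {
    Int(i64),
    Str(String),
}

#[derive(Debug, PartialEq)]
enum Large {
    Int(i64),
    Str(String),
    Bool(bool),
}

// Accepts only `Small`; a `Large::Int(1)` would be a type error here.
fn magnitude(x: &Small) -> i64 {
    match x {
        Small::Int(i) => *i,
        // String::len counts bytes, which is enough for this sketch.
        Small::Str(s) => s.len() as i64,
    }
}

// The "widening" that a structural union gives for free has to be spelled out.
impl From<Small> for Large {
    fn from(value: Small) -> Self {
        match value {
            Small::Int(i) => Large::Int(i),
            Small::Str(s) => Large::Str(s),
        }
    }
}
```

Here `Small::Int` and `Large::Int` are as different as `Result::Ok` and `Option::Some`; the variant only means something in the context of its enum.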

view this post on Zulip Johannes Maas (Feb 24 2022 at 11:47):

I get the feeling that there is a more defined question lurking beneath this discussion. I think it might be "open to whom?"

When returning tags from a function or a conditional, I know that no one can add additional tags. So while all the tags I use should combine, the result is a closed union.

When I use a tag union, I need to know if I am seeing all possible tags. If I own all the code, I'm good. What the compiler sees is all there is. But if I am exposing a function, I need to decide whether people can only use the tags I specified or whether I allow them to use additional ones.

So maybe there is some useful line to draw for internal and exposed types?

view this post on Zulip Loric Brevet (Feb 24 2022 at 15:27):

In the use case of a function, from a developer's point of view, I feel that we can also think differently depending on inputs/outputs.

To me, as a true beginner in Roc (this topic was the first message I posted on Zulip after digging into Roc for a few days), I can see two different definitions of open/closed for inputs and outputs.

For inputs, it is rather understandable for someone new to the language.

Inputs of a function:

From what I read from this discussion, the way it works for outputs is more because of the implementation details. The compiler has to have a type that combines all the possible tags of a pattern matching, and this results in an open tag union. So the difference between a returned open tag union and a closed tag union is more difficult to understand out of the box.

If I try to define them.

Outputs of a function:

To me, this is the last one that is hard to comprehend easily.

view this post on Zulip Folkert de Vries (Feb 24 2022 at 15:30):

right, it's a tag union with at least the tags that this function can return, but others can add to it

view this post on Zulip Folkert de Vries (Feb 24 2022 at 15:30):

this happens often when dealing with errors

view this post on Zulip Folkert de Vries (Feb 24 2022 at 15:31):

a particular function can only return (say) 3 different errors, but then you combine it with another function that can return 2 different ones

view this post on Zulip Johannes Maas (Feb 24 2022 at 15:52):

Loric phrased it better than I did: Does it help to automatically manage the openness of outputs and only bother the developer with that for inputs?

view this post on Zulip Loric Brevet (Feb 24 2022 at 15:53):

This example for errors makes sense.

But on that occasion, wouldn’t it be better to have a type alias that aliases to the full set of available errors (so a closed tag union), so that the function can return this specifically?

HttpError : [ NotFound, BadGateway, Unauthorized ]

toHttpError : [ NothingHere, NotAuthenticated ]* -> HttpError
toHttpError = \err ->
    when err is
        NothingHere -> NotFound
        NotAuthenticated -> Unauthorized
        _ -> NotFound

If toHttpError were not annotated, the compiler would still infer [ NotFound, Unauthorized ]* for its return type, but from a design perspective, annotating with the alias could make more sense?

view this post on Zulip Johannes Maas (Feb 24 2022 at 15:56):

I think the advantage of open tag unions is that it is quite bothersome to manage that alias. In Rust I tend to have to create many different error types and adding a new error variant is tedious.

In that case implicitly declaring the variants using an open union is much more ergonomic without an obvious downside.
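For comparison, here is a minimal Rust sketch of the bookkeeping described above. All names (`ParseError`, `RangeError`, `AppError`, `parse_percent`) are made up for illustration. Two functions each have their own small error enum, and combining them means declaring a third enum by hand plus one `From` impl per source error.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    NotANumber,
}

#[derive(Debug, PartialEq)]
enum RangeError {
    TooLarge,
}

// The merged error type must be declared and maintained manually.
#[derive(Debug, PartialEq)]
enum AppError {
    Parse(ParseError),
    Range(RangeError),
}

impl From<ParseError> for AppError {
    fn from(e: ParseError) -> Self {
        AppError::Parse(e)
    }
}

impl From<RangeError> for AppError {
    fn from(e: RangeError) -> Self {
        AppError::Range(e)
    }
}

fn parse(input: &str) -> Result<u32, ParseError> {
    if input.is_empty() {
        return Err(ParseError::Empty);
    }
    input.parse::<u32>().map_err(|_| ParseError::NotANumber)
}

fn check(n: u32) -> Result<u32, RangeError> {
    if n > 100 {
        Err(RangeError::TooLarge)
    } else {
        Ok(n)
    }
}

// `?` converts each error into AppError through the From impls above.
fn parse_percent(input: &str) -> Result<u32, AppError> {
    let n = parse(input)?;
    let n = check(n)?;
    Ok(n)
}
```

Adding a third fallible step means touching `AppError` and writing another `From` impl. With an open tag union, as described in this thread, the combined set of error tags is inferred instead.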

view this post on Zulip Johannes Maas (Feb 24 2022 at 15:57):

I'm wondering, what is a use case for accepting an open union in e. g. a function? Why would I want to handle external variants?

view this post on Zulip jan kili (Feb 24 2022 at 16:21):

I'm wondering the same thing. Having a catch-all case (_ ->) for tags requires that function to accept an open tag union, and this pattern is prevalent in the tutorial:

stoplightStr =
    when stoplightColor is
        Red -> "red"
        _ -> "not red"

view this post on Zulip jan kili (Feb 24 2022 at 16:22):

However, is this an anti-pattern? What are some real-world use cases where this convenience/flexibility leads to better code than explicitly handling every tag in the union?

view this post on Zulip jan kili (Feb 24 2022 at 16:23):

I started to come up with an example:

format = \error ->
    when error is
        Unauthorized -> { code: 401, message: "Unauthorized" }
        NotFound -> { code: 404, message: "Not Found" }
        BadGateway -> { code: 502, message: "Bad Gateway" }
        _ -> { code: 500, message: "Internal Server Error" }

but I don't see how this code wouldn't benefit from the addition of an InternalServerError or Other tag.

view this post on Zulip jan kili (Feb 24 2022 at 16:24):

Especially since handling multiple cases is concise:

stoplightStr =
    when stoplightColor is
        Red -> "red"
        Yellow | Green -> "not red"

view this post on Zulip jan kili (Feb 24 2022 at 16:27):

Actually, the catch-all case doesn't require open tag union input, nevermind:

» f : [ A, B, C ] -> Str
… f = \t ->
…     when t is
…         A -> "a"
…         _ -> "x"
…
… { a: f A, b: f B, c: f C }

{ a: "a", b: "x", c: "x" } : { a : Str, b : Str, c : Str }

»

view this post on Zulip jan kili (Feb 24 2022 at 16:29):

So, if _ -> can still be used to handle all remaining tags in a closed tag union, the use case in question (for open tag unions as function inputs) seems to shrink to (a) some kind of polymorphism and (b) parsing tag unions defined outside your codebase. These seem like they could be useful in a library, but what are some real-world examples?
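The same observation holds for closed enums in other languages. As a rough Rust analogue of the REPL session above (the enum `Tag` and function `f` are illustrative names), a wildcard arm on a fixed set of variants just covers whatever was not listed:

```rust
// A closed set of variants, mirroring [ A, B, C ].
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag {
    A,
    B,
    C,
}

fn f(t: Tag) -> &'static str {
    match t {
        Tag::A => "a",
        // Catch-all: covers B and C, and any variant added to Tag later.
        _ => "x",
    }
}
```

The trade-off is that if a `D` variant is added later, the compiler will not flag this match, whereas listing `Tag::B | Tag::C` explicitly would turn the new variant into a compile error.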

view this post on Zulip jan kili (Feb 24 2022 at 16:30):

On another note, after private messaging with @Ayaz Hafiz the other night, I feel like I understand the ideas behind open/closed tag unions much better now. Thanks, Ayaz :)

view this post on Zulip jan kili (Feb 24 2022 at 16:31):

I'm happy to share, but I don't want to steal Ayaz's thunder if a big post is coming soon

view this post on Zulip jan kili (Feb 24 2022 at 16:34):

Each day I'm feeling less sure about how to improve tag unions, because as I learn more about them I'm less eager to downgrade/nerf/gut their powerful flexibility... but they're still so confusing...

view this post on Zulip jan kili (Feb 24 2022 at 16:36):

I feel like a rioting peasant who has been invited into the castle to dine with the king, and now that I've gotten to know him I'm unsure about beheading him... :laughing:

view this post on Zulip Brendan Hansknecht (Feb 24 2022 at 16:49):

Frankly, I essentially just see open tag unions as a way to make type checking simpler, and to make more dynamic Roc programs easier to write.

If I was writing a large production grade app, I would probably use closed tag unions almost all of the time. Even with error types, if you use an open union you may end up getting multiple essentially identical variants. It's better to control and centralize that list and share an alias than to generate it from merging unions.

view this post on Zulip Brendan Hansknecht (Feb 24 2022 at 16:52):

Of course I can think of exceptions. For example, maybe an error returned from a library should be an open union so that it can be used more flexibly and get merged into the application's error type. But it also wouldn't harm an application much if it wasn't open. The application would just have to wrap or transform the library's union.

view this post on Zulip Loric Brevet (Feb 24 2022 at 17:11):

I think I have the same feeling @JanCVanB about understanding those tags better each day, and accepting this design as being correct. And yes there will be exceptions where open unions will make more sense. And maybe this will lead to some patterns that could become "idiomatic" in Roc in the future.

The real problem with all of that is the learning curve for beginners. I am sure plenty of people will have the same questions, so the answer might be to "simply" find the correct way to teach them, as @Richard Feldman said in a previous comment. But I am sure that, explained slowly and iteratively, it could quickly become second nature for newcomers.

I would look forward to reading a good blog post about it if anyone decides to take the plunge!

view this post on Zulip Tommy Graves (Feb 24 2022 at 17:13):

The good news is that we've recognized early on that this is a major point of confusion. I think by the time the language "goes public" there will be a very well rehearsed, compelling, and sensible way to explain them -- and I am pretty confident the terminology of open/closed will get replaced with something much more illuminating.

view this post on Zulip Ayaz Hafiz (Feb 24 2022 at 17:22):

I intend to write a more full description of why these kinds of tag unions (and extensible records) are useful and give intuitions for their behavior at some point. I think we have a few good partial explanations floating around, and there are good external resources too, they just need to be aggregated and exposed in a more accessible way.

In the meantime, here is a short note I wrote to try to help explain the difference between how Roc's tag unions work and how subtyping works, that I shared with @JanCVanB. It's not polished and may not be the most helpful in the world, but I hope someone can get something out of it, if they are wondering. https://gist.github.com/ayazhafiz/bfeb59736e746d150678bdabfb5226cd

view this post on Zulip jan kili (Feb 24 2022 at 18:06):

Thanks for that gist, Ayaz, it's great

view this post on Zulip jan kili (Feb 24 2022 at 18:07):

Making connections between [ A ]* and Int * is helpful

view this post on Zulip jan kili (Feb 24 2022 at 18:09):

I'm finding that there is no common app-level intuition for what * means - it means something different in tags vs records vs numbers. However, it always means the same thing from a compiler perspective - specialization.

view this post on Zulip Tommy Graves (Feb 24 2022 at 18:11):

That is an excellent explanation Ayaz!

view this post on Zulip Johannes Maas (Feb 28 2022 at 21:02):

I had some fun gathering my thoughts and working out a document. I think I managed to lay out a slight modification (allowing composition of closed tag unions) that might help with this problem: https://gist.github.com/j-maas/ed3d2811d808d0fa1386478575df928d

Feel free to point out any mistakes in that argumentation! :)


Last updated: Jul 06 2025 at 12:14 UTC