example use case for open tag union · beginners

Johannes Maas (Feb 24 2022 at 17:19):

I was following the discussion about open vs closed tag unions and started wondering what the use cases for open tag unions are.

To clarify, I understand how they are important for outputs like flexible error types where it's easy to return additional error variants.

I'm wondering about inputs. What's an example of a function that requires an open tag union as its input?

Johannes Maas (Feb 24 2022 at 22:29):

I just had the idea to reread the tutorial and the Roc for Elm document to see if the examples would help me. Unfortunately I could only find artificial examples like

example : [ Foo Str, Bar Bool ]* -> Bool
example =
  \tag ->
    when tag is
      Foo str -> Str.isEmpty str
      Bar bool -> bool
      _ -> False

So I'm still not sure what a real use case for a function accepting open tag unions might be. I'm starting to get the feeling that maybe we don't need explicit open and closed tags. But I haven't fully grasped it yet.

Richard Feldman (Feb 24 2022 at 22:31):

this is a common rite of passage along the way to discovering that we do need both :big_smile:

Richard Feldman (Feb 24 2022 at 22:32):

but I agree that there may not be a real use case for a function accepting open tag unions

Richard Feldman (Feb 24 2022 at 22:32):

Richard Feldman (Feb 24 2022 at 22:33):

but without open unions, chaining effects that can fail in different ways would be pretty unpleasant :sweat_smile:

jan kili (Feb 24 2022 at 22:43):

@Richard Feldman can you share an example of how an effect chain would use open unions?

Brendan Hansknecht (Feb 24 2022 at 22:47):

I have one example of wanting an open tag union, but as a result type. Think of a function that takes a lambda as an arg. The lambda would do something and then maybe return an error. The error type of the lambda would be an open tag union. The function as a whole would return either one of its own errors, or one of the errors that the lambda could return.

Brendan Hansknecht (Feb 24 2022 at 22:48):

So the lambda returns an open union that is merged with the overall function's error union. The function could not know this union ahead of time because it accepts any lambda.

Richard Feldman (Feb 24 2022 at 22:49):

Richard Feldman (Feb 24 2022 at 22:50):

so open unions mean that File.read, File.write, and Http.get can have different error types, and yet this still works

Richard Feldman (Feb 24 2022 at 22:51):

the type of task there is something like Task {} [ HttpErr Http.Err, FileReadErr File.ReadErr, FileWriteErr File.WriteErr ]*

Richard Feldman (Feb 24 2022 at 22:52):

File.read : Str -> Task Str [ FileReadErr File.ReadErr ]*
File.write : Str, Str -> Task Str [ FileWriteErr File.WriteErr ]*
Http.get : Str -> Task Str [ HttpErr Http.Err ]*
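
A hypothetical sketch of such a chain, assuming the three signatures listed here and an `await` with the type `Task a err, (a -> Task b err) -> Task b err`. The file names are made up for illustration.

```roc
# Hypothetical example: "url.txt" and "response.txt" are invented names.
# Each `await` requires both tasks to share a single `err` type.
task =
    await (File.read "url.txt") \url ->
        await (Http.get url) \response ->
            File.write "response.txt" response
```

Because each error union is open, the three `err` types can unify into one union that contains `FileReadErr`, `HttpErr`, and `FileWriteErr`. If they were closed, `[ FileReadErr File.ReadErr ]` and `[ HttpErr Http.Err ]` would be different types and the first `await` would be a type mismatch.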

Richard Feldman (Feb 24 2022 at 22:52):

alternatively, you could make them all have one gigantic error type called like SomethingWentWrong

Richard Feldman (Feb 24 2022 at 22:55):

then they'd be chainable, but when trying to recover from the error, you'd basically be stuck doing a when that handles any possible thing that could conceivably ever go wrong (including errors from types of I/O operations that aren't even happening here, because they'd all need to be included in that error union)

Richard Feldman (Feb 24 2022 at 22:56):

so like imagine in a language with exceptions, if you just had one Exception type and that was it

Richard Feldman (Feb 24 2022 at 22:57):

this is the original use case for having open unions in the language incidentally

jan kili (Feb 24 2022 at 23:06):

So in this example, is await the function that takes an open tag union as an input? I don't see why it can't take a closed tag union input.

Richard Feldman (Feb 24 2022 at 23:06):

await : Task a err, (a -> Task b err) -> Task b err

Richard Feldman (Feb 24 2022 at 23:06):

jan kili (Feb 24 2022 at 23:07):

Does that example have any function ~~with~~ that benefits from an open tag union input?

Richard Feldman (Feb 24 2022 at 23:07):

jan kili (Feb 24 2022 at 23:08):

jan kili (Feb 24 2022 at 23:09):

Ah, never mind, I see now that you weren't proposing this as an example of an OTU input.

Richard Feldman (Feb 24 2022 at 23:10):

jan kili (Feb 24 2022 at 23:11):

I'm wondering if anyone would notice if Roc started inferring function input tag unions as closed instead of open.

Richard Feldman (Feb 24 2022 at 23:13):

jan kili (Feb 24 2022 at 23:13):

Richard Feldman (Feb 24 2022 at 23:13):

Richard Feldman (Feb 24 2022 at 23:14):

jan kili (Feb 24 2022 at 23:14):

Richard Feldman (Feb 24 2022 at 23:14):

jan kili (Feb 24 2022 at 23:14):

Richard Feldman (Feb 24 2022 at 23:14):

Richard Feldman (Feb 24 2022 at 23:15):

like if we only have closed unions, then chaining tasks together would be a lot less pleasant!

Richard Feldman (Feb 24 2022 at 23:15):

also, if we only had closed unions, then other things would need to change - e.g. you'd have to declare the type up front before using one

jan kili (Feb 24 2022 at 23:15):

jan kili (Feb 24 2022 at 23:16):

jan kili (Feb 24 2022 at 23:17):

Richard Feldman (Feb 24 2022 at 23:18):

I've gotten burned by so many things where the pitch was "saves you keystrokes" and the fine print turned out to be "and will make you want to tear your hair out in a few months!" that I'm now deeply skeptical of anything where that's the pitch

jan kili (Feb 24 2022 at 23:18):

Richard Feldman (Feb 24 2022 at 23:19):

the reason I didn't want to use the await example in the tutorial is that it requires a big detour

Richard Feldman (Feb 24 2022 at 23:20):

but one potential approach is to teach an example platform, and then teach await without talking about types

Richard Feldman (Feb 24 2022 at 23:20):

so then by the time you get to open unions, I can just say "ok now let's look at await" which you're already familiar with from the earlier parts of the tutorial

jan kili (Feb 24 2022 at 23:30):

:face_palm: I just realized that tag union inputs for functions with no _ -> aren't inferred as open... they're inferred as closed. I somehow mis-assumed that, which is why I was asking about "would anyone notice"... whoops

» f = \t ->
…     when t is
…         A -> "a"
…         B -> "b"
… f

<function> : [ A, B ] -> Str

»

Ayaz Hafiz (Feb 24 2022 at 23:33):

They used to be inferred as open, but we have to infer them as closed or otherwise it's a soundness bug - then you could pass in a C, and we would compile the program but crash at runtime
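
A sketch of the problem, assuming the older behavior in which `f` would have been inferred as `[ A, B ]* -> Str`:

```roc
f = \t ->
    when t is
        A -> "a"
        B -> "b"

# With an open input type, C would be an acceptable argument,
# but the `when` has no branch that could handle it.
f C
```

With the input inferred as the closed `[ A, B ]`, the call `f C` is rejected at compile time instead.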

jan kili (Feb 24 2022 at 23:34):

That realization is what prompted me to try the REPL - "wait, what would happen if..."

jan kili (Feb 24 2022 at 23:41):

@Johannes Maas I still don't know of a real-world example for a function input OTU, but this is a minimal artificial example:

» f = \t ->
…     when t is
…         A -> "a"
…         _ -> "?"
… x = B
… f x

"?" : Str

»

jan kili (Feb 24 2022 at 23:42):

where f's type is inferred as f : [ A ]* -> Str regardless of the existence & usage of x

jan kili (Feb 24 2022 at 23:44):

Warning to anyone else going to the REPL to learn tag unions, I've discovered multiple tag-union-related bugs/gaps in the compiler that hinder experimental learning.

Ayaz Hafiz (Feb 24 2022 at 23:45):

Are these bugs more than the ones you sent to me? If so can you please also file issues for them :)

jan kili (Feb 24 2022 at 23:48):

Johannes Maas (Feb 25 2022 at 07:02):

I think I see now that you need OTUs for proper type inference. Because if you have a when with a catch-all but you don't give type annotation, you can't know from looking at that when what tags are put into it.

And it is difficult to hide that from developers because they need to know whether they can input just the explicit branches or whether there is a catch-all branch.

Which means that they need to concern themselves with input OTUs, even if in practice they wouldn't really want to ask for input OTUs.

So in a way, you wouldn't really need it in explicit type annotations, but it is necessary for type inference and thus for showing the type of something.

Johannes Maas (Feb 25 2022 at 11:25):

I have a feeling about it, and I think it boils down to: Do we need closed tag unions for outputs? I think not, because why prevent someone from piecing them together?

If for explicit type annotations we can always use closed inputs and we do not need closed outputs, then the only case where open unions are relevant is when we infer the type of a when with a catch-all. Then we need to say "I know the input can be one of these tags, but you could also give me any other".

In that case maybe we can never show whether a tag union is open or closed (that's quite naturally determined by whether it's an input or an output), except when we need to show the inferred type for a catch-all conditional where we need to admit that it could accept more tags than we can list.

Johannes Maas (Feb 25 2022 at 11:26):

Basically: If this holds, then the developer wouldn't need to worry about openness. It would just be something that sometimes shows up in inferred types to mark catch-alls.

jan kili (Feb 25 2022 at 12:22):

I agree with everything you just said except for the hiding :D what a wild ride of learning...

jan kili (Feb 25 2022 at 12:25):

But also I do see a use case for closed outputs, and it's best exemplified by something Richard mentioned: for something like Bool : [ True, False ] you might want a program to fail if it ever tried to parse a Bool with a function like parse : [ True, False, Maybe, Other ]* because "no, it can only be true or false!"

Richard Feldman (Feb 25 2022 at 12:39):

so today, let's say I write this function (not saying this is a good function to write, but it's certainly possible, so the compiler has to do something with it!)

blah : [ Foo Str, Bar ]a -> [ Foo Str, Bar, Baz ]a
blah = \fooOrBar ->
    when fooOrBar is
        Foo "" -> Bar
        Foo _ -> Foo "something"
        Bar -> Baz
        other -> other

let's say in this proposed design I don't give this function a type annotation, but I implement it and put it into the repl. What type should the compiler infer and display for me?

Richard Feldman (Feb 25 2022 at 12:41):

one answer could be that the compiler tracks the type variable (this is non-optional btw; even if the type variable is never displayed to the user, it must be there at least behind the scenes in order for the compiler to work properly) and just doesn't render it, so the compiler would say that this is the type:

blah : [ Foo Str, Bar ] -> [ Foo Str, Bar, Baz ]

Richard Feldman (Feb 25 2022 at 12:44):

myFunction : [ Foo Str, Bar, Something ] -> Str
myFunction = \arg ->
    answer : [ Foo Str, Bar, Baz ]
    answer = blah arg

    ...

Richard Feldman (Feb 25 2022 at 12:44):

Richard Feldman (Feb 25 2022 at 12:45):

because although blah allegedly (according to the type annotation the repl told me the compiler inferred for it) returns [ Foo Str, Bar, Baz ], and although I copy/pasted that exact type as the annotation for answer - which I got by calling blah! - those types don't line up because my annotation is missing the Something tag. (Because actually blah returns any extra tags you give it, even though it doesn't say it does)

Richard Feldman (Feb 25 2022 at 12:45):

Richard Feldman (Feb 25 2022 at 12:46):

when answer is
    Foo _ -> "foo"
    Bar -> "bar"
    Baz -> "baz"

Richard Feldman (Feb 25 2022 at 12:46):

so if we just hid the type variable from the user, I think we could end up with some very confusing situations

Richard Feldman (Feb 25 2022 at 12:46):

Richard Feldman (Feb 25 2022 at 12:47):

but the actual problem is that it was correctly inferring them and then hiding important information about them :sweat_smile:

Richard Feldman (Feb 25 2022 at 13:15):

a related problem with the "hide the type variables" idea is that if I had this code:

foo : [ A, B ]
foo = doStuff "blah"

bar : [ C, D ]
bar = doOtherStuff "blah"

if condition then
    foo
else
    bar

this might or might not type-check depending on whether the doStuff and doOtherStuff functions happen to be doing an exhaustive when on the values they return

Richard Feldman (Feb 25 2022 at 13:16):

because the type annotations [ A, B ] and [ C, D ] in this hypothetical design could mean either open or closed tag unions, so they wouldn't change even if the implementation of the function changed in a way that made this code invalid

Richard Feldman (Feb 25 2022 at 13:18):

because the compiler would be hiding information that's required to understand why certain things are happening!
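
For comparison, a sketch of the same shape with the openness written out explicitly. `doStuff`, `doOtherStuff`, and `condition` are the placeholder names from the snippet above, and this assumes both functions really do return open unions.

```roc
# Open annotations: the two branches can unify to a union
# containing A, B, C, and D, so the `if` would be expected to type-check.
foo : [ A, B ]*
foo = doStuff "blah"

bar : [ C, D ]*
bar = doOtherStuff "blah"

if condition then
    foo
else
    bar
```

With the `*` visible, a reader can tell from the annotations alone that the branches are compatible. If either annotation were the closed `[ A, B ]` or `[ C, D ]`, the two branches would have different types and the mismatch would be explained by what is on the page.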

Brendan Hansknecht (Feb 25 2022 at 15:01):

Sometimes you legitimately have a specific list that you want to return a value from. It would be a bug if someone added a new tag to the list. You would still want a closed union output. The stoplight can only be red, green, or yellow. It wouldn't make any sense if someone could add to that union. So a closed union output.
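
A minimal sketch of the stoplight case. The names `Stoplight` and `next` are hypothetical, chosen only for illustration.

```roc
# Closed union: exactly these three tags, nothing else.
Stoplight : [ Red, Yellow, Green ]

next : Stoplight -> Stoplight
next = \light ->
    when light is
        Red -> Green
        Green -> Yellow
        Yellow -> Red
```

Since the output is annotated as closed, returning some fourth tag from `next` would be a type mismatch rather than a silent extension of the union.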

Brendan Hansknecht (Feb 25 2022 at 15:04):

Also, if a function returns an open union, I believe that everything that depends on it will need a catch all case. That may not be desired.

Johannes Maas (Feb 25 2022 at 17:40):

Brendan Hansknecht (Feb 25 2022 at 17:42):

jan kili (Feb 25 2022 at 18:33):

Is this the most-polymorphic builtin/fundamental in Roc? I'm new to polymorphism, so that might be the primary source of my confusion.

Ayaz Hafiz (Feb 25 2022 at 18:44):

f : Bool -> [A, B]e
f = \b -> if b then A else B

when f True is
  A -> "is a"
  B -> "is b"

This will compile without needing a catch all case in the when expression because at the call site f True, we will infer that we need [A, B]e to be equal to [A, B]. And e will be specialized to [], so we will generate a specific version of f that has exactly the function signature Bool -> [A, B] for this use case.

Ayaz Hafiz (Feb 25 2022 at 18:45):

No, records behave in the same way, just "in the opposite direction". For example I can say {a: Str, b: Str}e as a type that can be specialized to be more specific than {a: Str, b: Str}, but must be at least as specific as that.
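
A small sketch of the record side of this. The function `fullName` and its field names are hypothetical.

```roc
# Open record: accepts any record that has at least these two fields.
fullName : { first : Str, last : Str }* -> Str
fullName = \person ->
    Str.concat person.first person.last

# The extra `born` field is allowed because the record type is open.
fullName { first: "Ada", last: "Lovelace", born: 1815 }
```

So an open tag union says "these tags, and possibly more", while an open record says "these fields, and possibly more".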

Brendan Hansknecht (Feb 25 2022 at 18:46):

jan kili (Feb 25 2022 at 18:47):

Yeah, maybe record polymorphism is more familiar to me from JSON/POJO experience

Ayaz Hafiz (Feb 25 2022 at 19:04):

Yes. * is just a special syntax for when there is no other type variable of the same name linked to each other. So {} -> [A]a and {} -> [A]* are the same thing, but [A]c -> [B]c cannot be replaced by [A]* -> [B]*, because the former says "the tags instantiated in c in the input are also shared in the output" while the latter would be equivalent to [A]c -> [B]d.
