Open tag unions · beginners · Zulip Chat Archive

I've read the tutorial's take on tag unions, and it says that after an if statement with two different tags as return values, the resulting type is an open tag union of those tags. Moreover, if you pattern match with when on an open tag union, it says you should handle all possible cases, which means having a _ -> something case.

So I was expecting this program to not typecheck properly, but it's currently working without any problem. Is it me not understanding tag unions?

Michał Łępicki (Oct 19 2022 at 17:43):

It's mentioned in the tutorial, look for text: "when you already have a value which is an open union, you have fewer requirements"

Brendan Hansknecht (Oct 19 2022 at 17:56):

The type of a by itself would be [ Err Str, Ok Str ]*. This enables it to merge with other tags if necessary.

In this specific case, the when clause requested that it be [ Err Str, Ok Str ]. So the open union is now restricted to being a closed union, and it type-checks.

x : Num a
x = 12
y : U8
y = 15
z = x + y

What type is x? Originally x could be any number type. Then it was added to y which is a U8. As such, x has to be a U8 as well.

Brendan Hansknecht (Oct 19 2022 at 17:57):

The same sort of use limiting the possible type is what is happening with that if and when statement.

Brendan Hansknecht (Oct 19 2022 at 18:01):

a = if 3 > 4 then Ok "ok" else Err "err"
b =
    when a is
        Ok _ -> "yes"
        Err _ -> "no"
        _ -> "other?"

a = if 3 > 4 then Ok "ok" else Err "err"
b =
    when a is
        Ok _ -> "yes"
        Err _ -> "no"
        LastConcrete -> "That other concrete tag"

Brendan Hansknecht (Oct 19 2022 at 18:05):

in the first, a would have the type [ Ok Str, Err Str ]*.
In the second, a would have the type [ Ok Str, Err Str, LastConcrete ].

Kristian Notari (Oct 20 2022 at 07:40):

Ok so basically the use of when is restricting the value I'm pattern matching on based on the cases I've written. But I don't get how a is being restricted when it's not restricted originally.

If I have [Ok Str, Err Str]* I can't for sure restrict it to be [Ok Str, Err Str] cause there could be possible different cases for it. And that's what a is in this context.

Is that the case where the type checker automatically restricts a to be a closed union [Ok Str, Err Str] even in an assignment? So whenever I use a it's a closed union since I've once used it as a closed union in a when inside b? Otherwise I'm not getting it.

Kristian Notari (Oct 20 2022 at 07:42):

I mean, the a b is referring to is the same a I've defined on the previous row, right? It's not like I'm using some untyped argument to pattern match on

Michał Łępicki (Oct 20 2022 at 10:20):

» a = \e -> if 3 > 4 then Ok "ok" else e
… b = \v -> when a v is
…             Ok "ok" -> "ok"
…             Err "err" -> "err"
… b

── UNSAFE PATTERN ──────────────────────────────────────────────────────────────

This when does not cover all the possibilities:

5│>      b = \v -> when a v is
6│>                  Ok "ok" -> "ok"
7│>                  Err "err" -> "err"

Other possibilities include:

    Err _
    Ok _

I would have to crash if I saw one of those! Add branches for them!

Kristian Notari (Oct 20 2022 at 10:22):

That's because you're working with unknown values given as input: both a and b use their arguments, which they don't know anything about, so the type checker safely assumes they're open.

Michał Łępicki (Oct 20 2022 at 11:24):

Right, that's a bad example because strings are not singleton types themselves, my bad.

Kristian Notari (Oct 20 2022 at 12:11):

Michał Łępicki (Oct 20 2022 at 12:18):

My understanding is: value a has an "open" union type, meaning it could be extended with some other tag so that another value has a different type. But in this case, when passed as an input to the when which expects a "closed" union type, it's fine because all possible tag variants in the a type are statically known and compatible. In my example (after fixing the when branches in the b function), the argument of function b gets narrowed to a closed tag union even though the a function has a wider, parameterized tag argument type.

Kristian Notari (Oct 20 2022 at 12:27):

I come from a TypeScript background. If I have a value a typed as string | number, which means it is either a string or a number but I don't know which, and I "pattern match" on it to explore all possible cases, I can't simply write a single case for string, because the type checker would complain that I'm missing a branch for the number case.

In my example, I have a typed as [Ok Str, Err Str]* which I understood being the same as "I don't know which type of tag union this is, due to the * in it could be anything, but for sure if I encounter an Ok or Err tag they have a single value of type Str".

So when I pattern match on the a value and I'm explicitly declaring only two branches covering the Ok and Err tags, I'm left with * which could mean a could be anything else other than Ok and Err and I'm not covering that case.
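For reference, the TypeScript behaviour described above can be sketched roughly as follows. The names describeValue and assertNever are made up for illustration; the point is only that TypeScript refuses the final line unless every member of string | number has been handled first.

```typescript
// Hypothetical helper: it only type-checks when the argument has been
// narrowed all the way down to `never`, i.e. every case was handled.
function assertNever(value: never): never {
  throw new Error(`Unhandled case: ${String(value)}`);
}

// Hypothetical function that "pattern matches" on a string | number value.
function describeValue(a: string | number): string {
  if (typeof a === "string") {
    return "string";
  }
  if (typeof a === "number") {
    return "number";
  }
  // If the `number` branch above were removed, `a` would still have type
  // `number` here and this call would be a compile-time error.
  return assertNever(a);
}
```

This is the expectation being carried over to Roc's [Ok Str, Err Str]* in the messages above.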

Kristian Notari (Oct 20 2022 at 12:28):

Kristian Notari (Oct 20 2022 at 12:29):

Maybe @Brendan Hansknecht can answer me based on my last messages, because I'm probably missing something important

Michał Łępicki (Oct 20 2022 at 12:30):

I think that's the difference, in roc this doesn't mean that this value might also have any other tag inside, I think it means that it is known that it's only either Ok Str or Err Str at this point, but it could be composed with other tags for a different value

Kristian Notari (Oct 20 2022 at 12:34):

I guess a type itself doesn't constrain how a value of said type could be composed with some other values/types. The * there should mean it can be other things other than Ok Str and Err Str, and I supposed you should cover that case in the when, as you do in the tutorial in this first example

Kristian Notari (Oct 20 2022 at 12:35):

Michał Łępicki (Oct 20 2022 at 12:53):

Kristian Notari (Oct 20 2022 at 12:59):

I'm guessing here, but after having reread the answer I've received above, I'm suspecting a thing:

Kristian Notari (Oct 20 2022 at 13:04):

The only thing left for me to understand is: why should it be possible to restrict some value type if it doesn't affect the original value type?

In my example, the two a used are actually the same a, the same value. If I'm using that in my b value and by doing so restricting it to only be [Ok Str, Err Str] (without *), my a value, outside b, keeps its initial type of [Ok Str, Err Str]*.

Kristian Notari (Oct 20 2022 at 13:04):

Ghislain (Oct 20 2022 at 13:15):

I thought I understood but I agree with you, what is the type of a in this example, and why it works?

a = if 3 > 4 then Ok "ok" else Err "err"
b = when a is
    Ok _ -> "yes"
    Err _ -> "no"

c = when a is
    Ok _ -> "yes"
    Err _ -> "no"
    _ -> "?"

Str.concat b c

# "nono" : Str

Kristian Notari (Oct 20 2022 at 13:18):

Yeah cause in a mutable environment (so not Roc), you could change a value before using it in b and if type is not restricted (but it's only restricted within b), you could assign any other tag to a, causing b to crash at runtime. That is not the case, due to immutability, but still I'm not getting it

Kristian Notari (Oct 20 2022 at 13:20):

a still is [Ok Str, Err Str]*, which is the more generic alternative of the two. But why does b type-check? It should not in your example. Is it correctly working? Have you tried it in the repl?

Ghislain (Oct 20 2022 at 13:20):

Kristian Notari (Oct 20 2022 at 13:23):

Maybe the third "catch all" case in c, since a is already restricted by b, is redundant. It could be omitted. Maybe that would be a warning in the future. That would explain why b works, but this also implies a type being restricted actually, even if it doesn't seem to

Ghislain (Oct 20 2022 at 13:24):

a : [Ok Str, Err Str]*

c = when a is
    Ok _ -> "yes"
    Err _ -> "no"
    _ -> "?"

The 3rd pattern is redundant:

10│      c = when a is
11│        Ok _ -> "yes"
12│        Err _ -> "no"
13│        _ -> "?"
           ^

Any value of this shape will be handled by a previous pattern, so this
one should be removed.

Kristian Notari (Oct 20 2022 at 13:26):

by the way I've encountered this right now while playing with it :grinning_face_with_smiling_eyes: :
image.png

Kristian Notari (Oct 20 2022 at 13:26):

Kristian Notari (Oct 20 2022 at 13:27):

Kristian Notari (Oct 20 2022 at 13:29):

image.png
this explains why it's working, but also implies we're not getting it right

Kristian Notari (Oct 20 2022 at 13:30):

Kristian Notari (Oct 20 2022 at 13:38):

Ghislain (Oct 20 2022 at 13:41):

expect
    b = if 3 > 4 then Ok "ok" else Err "err"

    b != b

b : [Err Str, Ok Str]a
b = Err "err"

it doesn't give [Err Str, Ok Str]*, though I don't exactly understand the difference yet

Kristian Notari (Oct 20 2022 at 13:43):

The difference should be naming, so that you can refer to "whatever more there is" as a when composing tag types, but if it's not used anywhere else it should basically be equivalent to saying * there, cause you're not forcing a to be something appearing somewhere else.

Kristian Notari (Oct 20 2022 at 13:44):

Kristian Notari (Oct 20 2022 at 13:54):

At least counterintuitively for me, what is happening here is that the * bit doesn't allow the value you're assigning to be anything else different than what the original tag union is. It's basically only useful when used as a type parameter in functions.

c : {} -> [Ok Str]*

You're forced to return a tag called Ok with a value Str. You can't return Err "err" from c

Kristian Notari (Oct 20 2022 at 13:55):

Brendan Hansknecht (Oct 20 2022 at 14:15):

I am not sure how helpful this will be, but something that helps me is to remember that when all things are said and done, open tags like [Ok Str, Err Str]* no longer exist. The final output machine code will only ever specify closed tag types and values.

a = if 3 > 4 then Ok "ok" else Err "err"
b = when a is
    Ok _ -> "yes"
    Err _ -> "no"

c = when a is
    Ok _ -> "yes"
    Err _ -> "no"
    _ -> "?"

Str.concat b c

# "nono" : Str

a is a tag that has 2 possible variants. The first variant, Err Str, has a tag of 0 and contains a Str. The second variant, Ok Str, has a tag of 1 and contains a Str. The physical layout in memory is essentially { value: Str, tag: bool }. Once types are finalized for the whole program and it is being turned to machine code, there is no possibility for a to be anything except Err Str or Ok Str.

Brendan Hansknecht (Oct 20 2022 at 14:17):

Kristian Notari (Oct 20 2022 at 14:18):

Yeah that is simple to reason about. What doesn't help is seeing a typed [Ok Str, Err Str]* from the compiler/type checker/repl. This way you're convincing me * could be used when declaring things as a placeholder for "this tag could be anything", which is not the case.

Kristian Notari (Oct 20 2022 at 14:19):

The * is only helpful when accepting something in a more open/wide way, as in function arguments.

Kristian Notari (Oct 20 2022 at 14:19):

Kristian Notari (Oct 20 2022 at 14:22):

Kristian Notari (Oct 20 2022 at 14:23):

Ghislain (Oct 20 2022 at 14:30):

@Brendan Hansknecht As I understand what you just said, a would still have the same type [Ok Str, Err Str]* (the when would have no responsibility over the a type)

Brendan Hansknecht (Oct 20 2022 at 14:31):

I think the main confusion about open tags comes from them acting differently depending on context. This is seen most easily with function arguments and return values.

^^ yeah, noting star mostly mattering in function arguments is very correct. Though it also matters in return types.

When handling a function argument, you always need to deal with the _ case if it is an open union. This is because a function parameter is not concrete. It is a value that will change on each call. In the end we still have to generate concrete types, but we can't analyze locally and do that.

f : [ Ok Str, Err Str ]* -> Str
f = \x ->
  when x is
    Ok str -> str
    Err str -> str
    _ -> "?"

c = Apple
f c

What is the concrete type of x, the actual value that is generated by the call f c?
It is a tag union with 3 variants: Apple, Err Str, and Ok Str.

So the in-hardware representation is essentially { value: Str, tag: U8 }, and in the case of Apple, the value is empty and the tag is 0.

Kristian Notari (Oct 20 2022 at 14:36):

I'm getting the why and how of open/closed tags now, but it's still confusing how * changes importance in different contexts

Kristian Notari (Oct 20 2022 at 14:37):

Kristian Notari (Oct 20 2022 at 14:38):

I get that when creating values directly without using functions it doesn't really make sense to use * in the first place; in fact the type checker errors out and advises you to narrow the type or generalize the value

Kristian Notari (Oct 20 2022 at 14:39):

Brendan Hansknecht (Oct 20 2022 at 14:41):

Now switching to the return case. An important part of a returned tag union is that it is allowed to grow. As such, you might start with [ Ok Str, Err Str ]* and end with [ Ok Str, Err Str, Apple ].

# let's just pretend this function is defined:
f : Str -> [Ok Str, Err Str]*

if someBool then
  f "yay!"
else
  Apple

This generates the same concrete tag as above. As such, the generated code from f has to be modified. If f were to just generate [Ok Str, Err Str], Err would have a tag of 0. This would end up leading to a conflict with Apple, which has a tag of 0 in the final tag. As such, the final output type of f is [ Apple, Err Str, Ok Str ]. Though it will only ever generate the Err or Ok case, it has to generate them with a new tag due to the tag union expanding to include Apple.
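The renumbering can be pictured with a small TypeScript sketch. It assumes tag ids are handed out in alphabetical order of the tag names, which matches the numbers quoted in this thread but is an assumption for illustration, not a documented compiler rule; assignTagIds is a hypothetical helper.

```typescript
// Assumption for illustration: tag ids follow the alphabetical order of the
// tag names, so adding a tag can shift the ids of the existing ones.
function assignTagIds(tagNames: string[]): Record<string, number> {
  const ids: Record<string, number> = {};
  tagNames
    .slice() // copy, so the caller's array is not reordered
    .sort()
    .forEach((name, index) => {
      ids[name] = index;
    });
  return ids;
}

// The closed union [Ok Str, Err Str]: Err is 0, Ok is 1.
const closedIds = assignTagIds(["Ok", "Err"]);

// After growing to include Apple: Apple is 0, Err is 1, Ok is 2.
const grownIds = assignTagIds(["Ok", "Err", "Apple"]);
```

Under that assumption Err moves from 0 to 1 once Apple joins the union, which is why the code generated for f depends on how its result is used.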

Brendan Hansknecht (Oct 20 2022 at 14:44):

So we talked about changing the name at some point to growing tag union instead of open. */[]* really means the current definition of this tag contains no variants, but later it can grow to contain variants. So [Ok, Err]* means: the current tag can either be an Ok or Err variant, with the capability to eventually grow to hold other variants.

Kristian Notari (Oct 20 2022 at 14:46):

Why should the writer of the function f be responsible for opening that for you? If the function f only returns Ok or Err it should say so in the signature as the return type. What I want to do with that return value after calling f should be the caller's responsibility. Is that done for performance reasons only?

Kristian Notari (Oct 20 2022 at 14:47):

I get the importance of preserving open tags when accepting them and when you need to return them (after modifying them for example, so they're linked to the argument type), but I'm not getting why I should bother as a function writer how callers will "union" tags in the future, after the call to my function

Brendan Hansknecht (Oct 20 2022 at 14:50):

If I define Color : [ Red, Green, Blue ]. And have a function f : Str -> Result Color [ NotAColor ]*, I am intentionally constraining the caller of this function. I am telling them that I know all possible colors. You can not take my color and merge it with some other list of colors. You only get red, green, or blue. At the same time, I am leaving the error type open to the user. I am telling them that I might give them a NotAColor error, but they might have tons of other errors they also want to handle.

Brendan Hansknecht (Oct 20 2022 at 14:53):

That enables my color to be passed into my other function draw : Shape, Position, Color -> Drawing. If instead I returned a [ Red, Green, Blue ]*, the user would have to match on it and then constrain the result to [ Red, Green, Blue ] before they could pass it to draw. I don't want that tag to expand. It is inconvenient to both myself and the user.

Brendan Hansknecht (Oct 20 2022 at 14:55):

Also, I guess it technically isn't important for performance, we could force all functions to always return open tags. That would not hurt performance, but it would make programming less convenient.

Brendan Hansknecht (Oct 20 2022 at 14:58):

Ghislain (Oct 20 2022 at 15:02):

Yes, I tried your code (hope that the function doesn't change the behavior, expect doesn't seem to like when in it)

fn = \v ->
    when v is
        Ok _ -> "yes"
        Err _ -> "no"
        LastConcrete -> "That other concrete tag"

expect
    a = if 3 > 4 then Ok "ok" else Err "err"

    (fn a) != (fn a)

a : [Err Str, Ok Str]a
a = Err "err"

Brendan Hansknecht (Oct 20 2022 at 15:07):

Ah, I guess when I talked up above I was also trying to give the "concrete" type that the actual compiled binary would see. Technically the type of a doesn't exactly change. a is a [Err Str, Ok Str]b (used b to avoid confusion below).

What does actually change is b. In your example, b is [ LastConcrete ], leading to a final "concrete" type of [Err Str, LastConcrete, Ok Str].

Kristian Notari (Oct 20 2022 at 15:09):

I get it, but that's not how it works in other languages and it's confusing me on the reasons why this is needed. If I say my f function returns a Color and the caller needs Color to use it in other functions, if it uses a color based on the value returned by:

# let's just pretend this function is defined
Color : [ Red, Green, Blue ]
f : Str -> Color

# I expect myColor to be [Purple]Color
myColor = if Bool.true then f "yay" else Purple

draw shape position myColor

Cause myColor is not assignable to Color. I don't need to "help" the caller or constrain the caller. The signature says that already.

Kristian Notari (Oct 20 2022 at 15:11):

While open/close concept on tags and records is useful when accepting stuff or when linking accepted stuff to returning stuff, in the case of f having the ability to constrain the tag union returned by the function so that the caller can't mess with it, it's confusing for me and I'm not getting why it's a feature

Brendan Hansknecht (Oct 20 2022 at 15:12):

Interesting. I come from lower level languages where the default is constrained and there are no other options. An enum is defined once and has no flexibility. As such, defaulting to constrained makes a lot of sense to me. Have never thought about it the way you are mentioning.

Brendan Hansknecht (Oct 20 2022 at 15:13):

What is a language that does what you mentioned and also has a static type system?

Michał Łępicki (Oct 20 2022 at 15:15):

I found that open unions can cause weird inference sometimes, so I appreciate that closed unions exist. Here I see no reason why I would need to handle A in bar, other than open unions weirdness:

» foo = \a ->
…   when a is
…     A x -> B x
…     y -> y
…
… bar =
…   when foo C is
…     B _ -> "ok"
…     C -> "ok"
… bar

── UNSAFE PATTERN ──────────────────────────────────────────────────────────────

This when does not cover all the possibilities:

10│>        when foo C is
11│>          B _ -> "ok"
12│>          C -> "ok"

Other possibilities include:

    A _
    _

I would have to crash if I saw one of those! Add branches for them!

Brendan Hansknecht (Oct 20 2022 at 15:16):

As an aside, open tags can have a cost in terms of generated assembly bloat and memory bloat. That said, without silly mistakes, I would expect this cost to generally be zero.

Michał Łępicki (Oct 20 2022 at 15:22):

(That said I also wouldn't be able to write this code with closed unions, and I didn't find other weird behavior yet)

Brendan Hansknecht (Oct 20 2022 at 15:27):

The weirdness is because the input tag and the output tag have to be the same type. This is because of y -> y (it is a no-op, we don't re-layout the tag). As such, if the input type can have an A variant, it means the output can theoretically have an A variant. Also, if the input can have any tag, so can the output.

It's kinda as if you wrote: f : Str -> [ Red, Green, Blue ], but f only ever returns the Green variant. Even though f will only ever be Green, you still need to match on Red and Blue when using the result of f.
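The same situation can be sketched in TypeScript. This is only an illustration: pickColor and toHex are made-up names, and the compile error mentioned in the comment assumes strict mode is enabled.

```typescript
type Color = "Red" | "Green" | "Blue";

// Hypothetical function: the signature promises any Color,
// but the body only ever returns "Green".
function pickColor(_name: string): Color {
  return "Green";
}

// The caller only sees the signature, so all three cases must be handled.
// Deleting the "Red" or "Blue" case is rejected in strict mode, even though
// pickColor never actually returns them.
function toHex(color: Color): string {
  switch (color) {
    case "Red":
      return "#ff0000";
    case "Green":
      return "#00ff00";
    case "Blue":
      return "#0000ff";
  }
}

const hex = toHex(pickColor("anything"));
```

The checker reasons from the declared type rather than from what the body happens to produce, which is the same reason bar above has to handle A.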

Michał Łępicki (Oct 20 2022 at 15:32):

I think e.g. Typescript might do better with examples like this because they do flow typing? (I think it's more work for the compiler but I don't know much about it)

Michał Łępicki (Oct 20 2022 at 15:36):

Following the logic: for A the return type is B, for []* \ A (anything but A) return type is the same, then combine the branches so the type is []* -> [B]* \ A
But that seems a bit offtopic already, sorry about that

Brendan Hansknecht (Oct 20 2022 at 15:38):

Kristian Notari (Oct 20 2022 at 15:55):

That's the background I'm coming from and that's why it sounds confusing reasoning about types as it is in Roc

Kristian Notari (Oct 20 2022 at 15:55):

Both for "excluding" subtypes from types and for the context-sensitive utility of *

Kristian Notari (Oct 20 2022 at 15:56):

But still, I'm not getting why not to go for something like typescript and rule out things like y -> y with better types or accept things like:

Apart from performance, I don't see the reasons why. Maybe they are good reasons, but I'm curious to know them

Brendan Hansknecht (Oct 20 2022 at 16:03):

At a minimum, it would make the compiler more complex and slower: it would have to figure out the underlying concrete type and the possible variants the type could take. So that is more information to collect and propagate. That said, I don't really know the answer. Hopefully someone with more type system knowledge can jump in and expand the explanation.

Ayaz Hafiz (Oct 21 2022 at 14:29):

This is going to be a long message, but I hope it will provide more context on how and why this works in Roc. It's a question we get pretty often and a story we've been trying to figure out how to make better, through documentation and/or language changes.

In rough, I'll try to provide an intuition for how these tags work, why they are useful in applications, and why things are this way - in particular, why not switch to subtyping? Then I'll show one experiment we are considering.

Intuition

Basic example, with TypeScript analogies

Okay, so the first thing to keep in mind is that Roc does not have subtyping. That means that a value of type A cannot be used where a value of type [A, B] is expected, unlike in e.g. TypeScript, where A can be passed to a function that expects A|B. We'll get to why Roc doesn't have subtyping later, but what this means, intuitively, is that if you try to use a tag union value given as the result of a function, your use must match the returned value exactly.

openFile : Str -> Result File [OpenFileError ...]
writeFile : File, Str -> Result {} [WriteFileError ...]

openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
  when openFile file is
    Ok f -> writeFile f content
    Err e -> Err e

This does not compile, because despite [OpenFileError ...] (and [WriteFileError ...]) "fitting" into [OpenFileError ..., WriteFileError ...] from a subtyping perspective, that is not how Roc does things - because these types do not match exactly, they are not equal. Again, it is a very reasonable question why things cannot be done that way, and I'll elaborate that later on.

So, where did things go wrong here? The key insight is that our functions openFile and writeFile enumerate the failure cases they might produce, but the context those failures are used in is determined only by the user of those functions! Roc enables authors to enumerate variant cases while providing callers flexibility in how they can use those cases via open tag unions.

I think it may be helpful to look at an analogous example in TypeScript here. We can write the following program:

interface OpenFileError { ... }
interface WriteFileError { ... }
type Result<Ok, Err> = ...

function openFile<T>(fileName: string): Result<File, OpenFileError | T> { ... }
function writeFile<T>(file: File, content: string): Result<File, WriteFileError | T> { ... }

I hope this might explain more of what's going on - as you can see, T is not useful for anything except for defining the type expected by the caller of that function! Now of course, this might seem totally useless in TypeScript, since OpenFileError always fits into OpenFileError | T, for any T. But I hope you can see that in cases like Roc's, where OpenFileError can be used as OpenFileError | T only if T = [], and otherwise the return value must be exactly OpenFileError | T, this can be useful.

openFile : Str -> Result File [OpenFileError ...]*
writeFile : File, Str -> Result {} [WriteFileError ...]*

openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
  when openFile file is
    Ok f -> writeFile f content
    Err e -> Err e

The program from before now compiles, because the Roc compiler will effectively see "oh, I need to create an openFile where * = WriteFileError", and generate a specialization of openFile that has exactly the type signature

openFile : Str -> Result File [OpenFileError ..., WriteFileError ...]

which will be the version used in openAndWrite. That's analogous to instantiating

function openFile<T>(fileName: string): Result<File, OpenFileError | T> { ... }

as

function openFile(fileName: string): Result<File, OpenFileError | WriteFileError> { ... }
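A runnable version of this analogy might look like the sketch below. Everything the thread elides with "..." is filled in with hypothetical stand-ins: the error shapes, the FileHandle type, the string-tagged Result, and the rule that an empty path or empty content fails are all assumptions made only so the example executes.

```typescript
// Hypothetical shapes; the thread elides the real definitions.
interface OpenFileError { tag: "OpenFileError"; path: string }
interface WriteFileError { tag: "WriteFileError"; path: string }
interface FileHandle { path: string }

type Result<Ok, Err> =
  | { kind: "ok"; value: Ok }
  | { kind: "err"; error: Err };

// T plays the role of Roc's `*`: the function never produces a T itself,
// it only lets the caller choose a wider error type.
function openFile<T = never>(path: string): Result<FileHandle, OpenFileError | T> {
  if (path === "") {
    return { kind: "err", error: { tag: "OpenFileError", path } };
  }
  return { kind: "ok", value: { path } };
}

function writeFile<T = never>(
  file: FileHandle,
  content: string
): Result<number, WriteFileError | T> {
  if (content === "") {
    return { kind: "err", error: { tag: "WriteFileError", path: file.path } };
  }
  // Pretend the number of characters written is the success value.
  return { kind: "ok", value: content.length };
}

function openAndWrite(
  path: string,
  content: string
): Result<number, OpenFileError | WriteFileError> {
  // Instantiate T explicitly, mirroring "* = WriteFileError" in the Roc version.
  const opened = openFile<WriteFileError>(path);
  if (opened.kind === "err") {
    return opened;
  }
  return writeFile<OpenFileError>(opened.value, content);
}
```

In TypeScript the explicit type arguments are optional because subtyping would accept the narrower error types anyway; in Roc the equivalent instantiation is what makes the two branches have exactly the same type.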

Non-function values can be used contextually

So, I hope the ethos of "allow users of a value to decide the context it can be used in" makes sense at this point. I'd like to show how this extends to literal/non-function values as well, since that often trips folks up.

I think this is more intuitive than open tag unions, but the idea is the same - what exactly n is, from a raw-bytes-on-the-machine perspective, isn't determined until you use it in a particular way that resolves it to a concrete type - for example, pass it to a function that expects a U64, or an F32, or something else.

And this means that you can use n in as many contexts as you like, because the definer of n gave you the freedom to do so! For example, the following works just fine:

n : Num *
n = 1

rcd : {a : U64, b : F32}
rcd = {a: n, b: n}

purple : [Purple]*
purple = Purple

Pastel : [Purple, Pink, Creme]
Deep : [Purple, DarkBlue, Indigo]

rcd : {pastel: Pastel, deep: Deep}
rcd = {pastel: purple, deep: purple}

You could think of purple : [Purple]* as the TypeScript type signature const purple<T>: Purple|<T>, if TypeScript allowed you to say something like that!

This might seem contrived, and usually, it is - much more so than allowing callers of functions to use return values in any context they like. It used to be important when Roc's booleans were not an opaque type, and were instead the tags [True, False]. You could write programs like

flag = True

if flag then foo {} else bar {}

Now, if the type of flag was only [True], it could not be used in the context [True, False] in the "if" statement. By making it [True]*, you add a type variable that tells the compiler "hey compiler, this is a True value, but allow people to use it in any context they like", and the compiler can then make that particular usage exactly [True, False].

Why can't * capture everything?

foo : {} -> [A]*
foo = \{} -> B

Maybe you already understand now why this does not type check, but if not, the key to remember here is that * does not add information to the type, and it does not mean "this tag can be anything else". It means, "allow this value to be used in any context that includes itself, or anything more". So, foo's API contract is "I produce an A, and you can use that A in any larger context". But the contract is a lie, because it actually produces a B!

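interface A {a: ""}
interface B {b: ""}

// Type error: {b: ""} is a B, which is not assignable to A | T.
function f<T>() : A | T {
    return {b: ""}
}

// Same problem with primitives: "" is not assignable to number | T.
function f<T>() : number | T {
    return ""
}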

Why doesn't Roc use subtyping?

I won't bore you with the details here, I (and I'm sure others) would be happy to elaborate more to anyone who is interested, but the TLDR is that Roc's model of compiling to efficient, unboxed representations of types would break down if Roc used subtyping à la TypeScript's subtyping. The easiest way to think about this is that with the program

openFile : Str -> Result File [OpenFileError ...]*
writeFile : File, Str -> Result {} [WriteFileError ...]*

openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
  when openFile file is
    Ok f -> writeFile f content
    Err e -> Err e

after the type checker figures out how the *s should be instantiated with a concrete type, based on the contextual usages in openAndWrite, we end up with the program

openFile : Str -> Result File [OpenFileError ..., WriteFileError ...]
writeFile : File, Str -> Result {} [OpenFileError ..., WriteFileError ...]

openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
  when openFile file is
    Ok f -> writeFile f content   # no conversion needed here!!
    Err e -> Err e  # no conversion needed here!!

And notice in the two branches, we don't need to do any conversions at all to pass the values through with the same types - since they are exactly the same type, they will have exactly the same underlying byte representation on the machine, and can be transparently passed through.

openFile : Str -> Result File [OpenFileError ...]
writeFile : File, Str -> Result {} [WriteFileError ...]

openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
  when openFile file is
    Ok f -> writeFile f content   # uh-oh, how do I convert?
    Err e -> Err e  # uh-oh, how do I convert?

Now, the types returned from openFile and writeFile are not the same types as they are used in openAndWrite, and so the underlying representation (as it is today) would not be the same. The Roc compiler would have only a few options here:

I hope that provided a baseline idea of why Roc doesn't use subtyping today, but happy to elaborate more.

Can tags be more intuitive?

Maybe! We'd like to think so. One idea is that we should always make tag unions open in "output position", i.e. when returned by a function, and always allow the user to determine the context they are used in. Here's a playground of what that would look like: https://ayazhafiz.com/plts/playground/cor/easy_tags, feedback welcome!

Ayaz Hafiz (Oct 21 2022 at 14:29):

Hit 9985 chars of the 10000 chars limit there :sweat_smile: and apparently it can't be edited. Edits:

Kristian Notari (Oct 21 2022 at 14:56):

I've understood the performance reasons behind, the typescript analogies (so how they relate with Roc code as you described) and the subtyping stuff. I'm not getting the lines like:

Kristian Notari (Oct 21 2022 at 15:01):

I mean, I'm not seeing why I, as a library developer, should choose to return closed tag unions, which basically limit what my caller could do with that value

Ayaz Hafiz (Oct 21 2022 at 15:03):

Yeah, I don’t think you ever would want to! That’s why open tag unions are useful, to let the caller decide how exactly they want to use it. And it’s the subject of the experiment I mentioned at the bottom, where we can make it so that you cannot say that a tag union is closed if it appears as the return value of a function.

Kristian Notari (Oct 21 2022 at 15:03):

Because, from what I've understood, if the return tags were to always be open, they are then implemented, after compilation, depending on how you've used those within your code. So if you only use those for what they are, without ever unioning them with other tags, performance wise that's the same as having closed tags in the first place.

Ayaz Hafiz (Oct 21 2022 at 15:04):

Yes exactly. Open tag unions are never a thing at runtime, just like “<T>” type parameters are never a thing at runtime in TS/JS. They exist solely to provide more flexibility in the API surface of programs.

Kristian Notari (Oct 21 2022 at 15:06):

Do you have any examples of closed tag unions being useful as return types? Like it's not something that bothers me that much, but having a feature in the language which is only partially useful depending on where you use it smells like something that could be described better using another approach. Not to sound offensive, I'm just trying to better know Roc cause I really want to try it out!!

Kristian Notari (Oct 21 2022 at 15:09):

By the way, this should be written in capital letters in the tutorial page :D if it is not already

Ayaz Hafiz (Oct 21 2022 at 15:14):

No you’re totally right. I don’t think it’s all that useful to have closed tag unions in output position. The only example i can think of is if you write a function

but i’ve talked about this before with some people, and it seems like you would never actually write such a thing in practice. Which means it may be better to have returned [A] always be open (ie what is [A]* today)

Kristian Notari (Oct 21 2022 at 15:37):

Ok so basically you're trying to make [A] be by default what you wrote as [A]* when used in output positions?

Ayaz Hafiz (Oct 21 2022 at 15:38):

Kristian Notari (Oct 21 2022 at 15:39):

a : [A]*
# or
a : [A]

Kristian Notari (Oct 21 2022 at 15:39):

Ayaz Hafiz (Oct 21 2022 at 15:40):

Kristian Notari (Oct 21 2022 at 15:42):

So basically the * becomes relevant only in typing function arguments when you don't want to be that strict?

Kristian Notari (Oct 21 2022 at 15:43):

It starts to smell like variance :D when used as input or output it basically defaults to the opposite direction

Kristian Notari (Oct 21 2022 at 15:46):

Would closed tags be impossible to describe at all in output positions or there would be something new like [A]! to explicitly force closeness?

Brendan Hansknecht (Oct 21 2022 at 15:48):

I still think that many things have limited scope. If something has a known limited scope, it should always be a closed union (whether argument or return type). A Bool should always be [True, False], I don't ever want a Bool to expand to include KindaTruthy.

Kristian Notari (Oct 21 2022 at 15:53):

From a definition perspective yes, but you should not limit the caller by defining it closed on the return type of your functions. Because the typechecker already checks what you have, if you decide to union [True, False] with KindaTruthy then boolean operations don't work anymore. It's not something that's hidden from your control. It doesn't affect anything

Brendan Hansknecht (Oct 21 2022 at 15:55):

I don't want to write code that enables the user to create something that is more bug prone. Whether intentional or not. If they want a KindaTruthyBool, they can explicitly define it. I get that it would be constrained elsewhere, but I think every location should be as constrained as makes reasonable sense. It makes reasonable sense to constrain a Bool type to never expand.

Brendan Hansknecht (Oct 21 2022 at 15:59):

n : Num *
n = 1

rcd : {a : U64, b : F32}
rcd = {a: n, b: n}

I am not sure if I find this super cool or really scary. It feels like enabling implicit casts in my mind. As in, the * in Num * should only be able to refer to one thing. If it was instead a type parameter T, it would either become U64 or F32. The fact it can become both feels like the type system lying. On the other hand, everything in Roc is immutable and it isn't like the type is changing from one line to the next, it is just being defined by each use. Super weird....not sure what to think of it.

Kristian Notari (Oct 21 2022 at 16:05):

It's not being two things at once, from what I've understood. It's just open to be concretely implemented by the compiler as the thing it's used for.

Kristian Notari (Oct 21 2022 at 16:07):

You can't be sure what's error prone, cause the use they make of what you return is solely up to them. What if you have the week workdays as tags and you return a closed tag union of [Monday, Tuesday, ..., Friday] and they want to use it as value for one of their "Schedule" values (tags) which also allows Weekends?

Kristian Notari (Oct 21 2022 at 16:11):

Moreover, Bool expansion can be useful. Consider the case where you want to describe (not saying it's the best way to do it, but still) how much someone agrees with you. You could go for:

Agreement : [True, False, InTheMiddle Num *]

So to have three cases, 100% (yes, True), 0% (no, False), and something in the middle with a percentage.

Brendan Hansknecht (Oct 21 2022 at 17:06):

You just defined a different type, not a Bool. It can have a nice fromBool method, but I don't think any Bool should implicitly convert to an Agreement

Brendan Hansknecht (Oct 21 2022 at 17:10):

Kristian Notari (Oct 21 2022 at 17:18):

# this is my type
Agreement : [True, False, InTheMiddle Num *]

# this is an external API that gives me some boolean value back
getTrueOrFalse : [True, False]

# this is my function which should do something then give back my agreement value
f = if something then getTrueOrFalse else InTheMiddle 42

There's no conversion involved, yet it does not type check. Why shouldn't it type check?

Brendan Hansknecht (Oct 21 2022 at 18:49):

# My type
# Note, neither Bool nor Agreement are open tags. They explicitly enumerate every state they can ever contain.
Agreement : [True, False, InTheMiddle Num *]

# Some function that tells me if a contract is trustworthy or not.
# Maybe even uses its own type instead of Bool. could be [ Trustworthy, Untrustworthy ]
# In this case, lets assume bool.
isTrustworthy : Contract -> Bool

# My function to get an agreement value
calcAgreement : Contract -> Agreement
calcAgreement = \contract ->
    if something then
        # in this case only the trustworthiness matters. No middle states
        # if we just return `isTrustworthy contract` here, I think that should be a bug.
        # it is an implicit conversion from `Bool` to `Agreement`.
        # they are not the same type. We need to be explicit with the conversion.
        isTrustworthy contract |> Agreement.fromBool
    else
        # we don't have enough info to use isTrustworthy. So actually do some agreeability calculation.
        InTheMiddle 42

I think it is very important to note what happens if isTrustworthy changes its API.
In my example, if isTrustworthy changes to anything other than a Bool, there will be a compiler error.
This means whoever changes isTrustworthy will be forced to consider my code (even if this is me in the future updating some external library).

If instead, we just return isTrustworthy contract directly because the two tags are merged, we might not get a compiler error when the api of isTrustworthy changes. With this small example, it is likely that either way we will get a compiler error, but with larger tags, it is possible that isTrustworthy could be modified such that it is still a subset of Agreement, but using the tags to mean something different.
For this example, which of course will be contrived due to these specific tags, what if isTrustworthy suddenly started returning InTheMiddle 423 where this doesn't mean we have 423 agreement, but instead, the contract is still in the middle of negotiation, so we don't know if it is trustworthy and 423 is some sort of contract legal code for what stage the contract is in. We just introduced a huge bug due to allowing the result from isTrustworthy to implicitly convert to an Agreement.
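The explicit-conversion argument above can be sketched in TypeScript as well. This is a hedged approximation of the Roc example, not a translation of any real API: `agreementFromBool` stands in for the hypothetical `Agreement.fromBool`, a `contractId` string stands in for `Contract`, and the `decisive` flag stands in for `something`.

```typescript
// A closed set of cases, mirroring Agreement : [True, False, InTheMiddle ...].
type Agreement =
  | { tag: "True" }
  | { tag: "False" }
  | { tag: "InTheMiddle"; percent: number };

// Hypothetical external API: it promises a boolean and nothing more.
function isTrustworthy(contractId: string): boolean {
  return contractId.startsWith("trusted-");
}

// The single, explicit conversion point (stand-in for Agreement.fromBool).
// If isTrustworthy ever stops returning a boolean, the call below stops type checking.
function agreementFromBool(b: boolean): Agreement {
  return b ? { tag: "True" } : { tag: "False" };
}

function calcAgreement(contractId: string, decisive: boolean): Agreement {
  if (decisive) {
    // Only trustworthiness matters here; convert explicitly.
    return agreementFromBool(isTrustworthy(contractId));
  }
  // Not enough information; fall back to a middle state.
  return { tag: "InTheMiddle", percent: 42 };
}
```

The design choice being argued for is that the conversion lives in one function, so it can be tested once instead of at every place where two tag sets might silently merge.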

Ayaz Hafiz (Oct 24 2022 at 01:18):

That is compelling, but I don't know if that particular problem is avoidable in general with a language that has anonymous structural types. Also, I feel like the payload of InTheMiddle should always be an opaque type/named structural type (e.g. record) in an API like this, so I wonder how likely this is to happen in practice.

Ayaz Hafiz (Oct 24 2022 at 01:19):

I think being able to accumulate tags returned from a function in any context you like, for free, is a huge upside for a language like Roc, which relies heavily on monad-like patterns, and we should try to prioritize making that seamless

Ayaz Hafiz (Oct 24 2022 at 01:19):

But, to Brendan's point, there are cases where you might accidentally do too much work based on the tags returned from a function

Ayaz Hafiz (Oct 24 2022 at 01:21):

get : Url -> Result Response [Http404, Http500, HttpTimeout]*

when get "https://roc-lang.org" is
  Err Http404 | Err Http500 | Err HttpTimeout -> ...
  Err HttpRedirect -> ... # useless branch!
  Ok l -> ...

Ayaz Hafiz (Oct 24 2022 at 01:22):

Ayaz Hafiz (Oct 24 2022 at 01:23):

however there are ways the compiler can help you out here, for example "bidirectional exhaustiveness checking", where we check that the branches possibly returned by the "when" condition (in this case get "https://roc-lang.org") are exhaustive or redundant relative to the branches that are matched in the when condition. So it's like normal exhaustiveness checking, but going the other way

Ayaz Hafiz (Oct 24 2022 at 01:24):

That would catch this case, helping you avoid some of the common pitfalls of these kinds of tag unions. I actually have a branch implementing that, and it works pretty well!

Ayaz Hafiz (Oct 24 2022 at 01:28):

I would be inclined to say "no", because that seems like it could make things as confusing as they are today for folks who do not grok the "open means usable in any context" definition of tag unions. But obviously I don't really know any of the right answers here

Brendan Hansknecht (Oct 24 2022 at 01:49):

Yeah, i definitely agree that my example is relatively unlikely. I am probably being overly defensive, but I am used to the explicit nature of enum like types. I probably need to play around with open tags a lot more to see the benefits. Currently my only use case for open tags is error types.

Question: If we defaulted to returning open unions from functions, would there be a way to explicitly close them?

Brendan Hansknecht (Oct 24 2022 at 01:50):

Ayaz Hafiz (Oct 24 2022 at 01:54):

That’s what Kristian’s question I quoted above is asking. I think the answer should be no, you can’t (otherwise it could be as confusing as the current state of things)

Brendan Hansknecht (Oct 24 2022 at 01:58):

Kristian Notari (Oct 24 2022 at 07:21):

Brendan Hansknecht said:
Your example is correct and I agree with you more separation between domains is a good thing in general. The problem is that even with a domain translator function like Agreement.fromBool you're always subject to change in other's APIs. What if isTrustworthy starts returning always true or always false? No compiler error, everything's fine, you notice later in production cause every agreement turns out to be True or False.

Ideally these kinds of corner cases are already covered by testing, because there's no way to defend your code 100% from changes to other APIs if the signature/type still matches.

Brendan Hansknecht (Oct 24 2022 at 14:38):

@Kristian Notari I am pretty sure you just described a different class of error. Yes, always true or always false is a bug that should be tested for. Unit tests should catch that.

On the other hand, Roc has a static type system. You should never need to write a test that a function actually returns a specific type. The type system should automatically catch all errors of this class. Only in a dynamic language would you need a test of that variety.

Brendan Hansknecht (Oct 24 2022 at 14:39):

Having strong guarantees that mean you don't even need a test is much better than writing many many tests and hoping you didn't miss an edge case.

Brendan Hansknecht (Oct 24 2022 at 14:41):

Also, isTrustworthy always returning false is not a change in its API. It still is returning a Bool, which is all that its API guarantees. Again, it is still likely a bug, but it is not related to the API.

Kristian Notari (Oct 24 2022 at 14:47):

This ^^ written by you is the same class of errors as the one I was telling you about with my example. Not a "change in the signature" but a change in the API behavior still worth noticing.

And this ^^ too would be checked only by tests. As long as the function you're using (isTrustworthy) or the function you're creating (calcAgreement) keeps its signature, everything can change, even after you add closed tags; it's just more uncomfortable to wrap/unwrap them every single time.

Kristian Notari (Oct 24 2022 at 14:48):

If you're always adhering to the type signatures of functions (so no compiler error) everything's fine, and what's left is for tests to check, if needed.

Kristian Notari (Oct 24 2022 at 14:49):

It doesn't really matter if tags are closed or not, if you can easily translate to and from tags with wrapping or with implicit tag merging

Brendan Hansknecht (Oct 24 2022 at 14:50):

Those two quotes both only happen if you don't have Agreement.fromBool and don't enforce explicit casts. With explicit casts, they are both impossible.

Kristian Notari (Oct 24 2022 at 14:51):

You're just moving the problem to the wrapping/unwrapping. What if Agreement.fromBool changes to return an InTheMiddle 42?

Kristian Notari (Oct 24 2022 at 14:51):

Brendan Hansknecht (Oct 24 2022 at 14:53):

You are greatly misrepresenting my point. The types make you safer and remove a class of problems. They help to scope issues and give you some guarantees. This does not mean they guarantee no bugs. This does not mean you don't have to test. An explicit cast can still have a bug, but it is much easier to test 1 explicit cast function than it is to test every single location that could have an implicit cast.

Kristian Notari (Oct 24 2022 at 14:55):

"The types make you safer and remove a class of problems" --> I hope we all agree on that one since we're here for Roc :grinning_face_with_smiling_eyes: and I agree too

Brendan Hansknecht (Oct 24 2022 at 14:56):

In this case, the types simply guarantee that isTrustworthy returns a Bool. I should never need to test whether or not isTrustworthy returns InTheMiddle 42.

Brendan Hansknecht (Oct 24 2022 at 14:57):

Kristian Notari (Oct 24 2022 at 15:01):

Yeah but just define a helper function that chains isTrustworthy with Agreement.fromBool. Does that fulfill your requirements for having fewer bugs? Because from calcAgreement you now call a function which returns an Agreement but you really don't know which one. The problem's still there

Brendan Hansknecht (Oct 24 2022 at 15:02):

I still think that is much better than the implicit cast. It is a single testable function instead of needing to test every location that could have an implicit cast.

Kristian Notari (Oct 24 2022 at 15:04):

Tags are great: once you have them there's not really a need for wrapping/unwrapping things if they just get merged easily as with TypeScript. You'd only need wrapping/unwrapping (the casting you were referring to) if you need to add data to them or you want them to have different names, to group them, or something similar (readability). It doesn't alter correctness

Brendan Hansknecht (Oct 24 2022 at 15:07):

Besides limited cases, like error tags, i have not seen uses where I would consider tags automatically merging beneficial. I would love to learn otherwise, but I have not seen examples where I would want to default to open tag unions. So I see open tags as a valuable feature in some cases, but not something I would expect to default to.

Kristian Notari (Oct 24 2022 at 15:18):

In my TypeScript experience, I use string literals as types (I mean everyone does). In the TypeScript type system "mystring" is a subtype of string, so you can precisely type string literals as if they were tags (no data attached, but you can work around that in other ways). There are parts of the domain which are focused on some bits and other parts focused on others. When you compose those parts, the resulting "type" should be the union of those "string literals" you're returning.

So in our example, the code which handles the true or false bit could be somewhere and somewhere else there could be the code that handles the InTheMiddle case. When you "join" those bits in your "upper level" code it's neither mandatory nor beneficial (unless you want to add other info, wrap them with different names or something similar which has nothing to do with correctness or bugs) to wrap them with other "strings" or other types.

type ContractTrustworthy = "true" | "false"
declare function isTrustworthy(contract: Contract): ContractTrustworthy

type Undecided = "in-the-middle" // ignore the number bit for a second
declare function makeInTheMiddle(contract: Contract): Undecided

type Agreement = ContractTrustworthy | Undecided
declare function calcAgreement(contract: Contract): Agreement

Kristian Notari (Oct 24 2022 at 15:19):

if (something) {
    return isTrustworthy(contract)
} else {
    return makeInTheMiddle(contract)
}

Kristian Notari (Oct 24 2022 at 15:21):

Types are automatically merged (that is, unioned) and you get an Agreement out of the two branches, which is the merging of both ContractTrustworthy and Undecided. It's common with errors, so errors merge with other errors giving you a list of reasons things could possibly fail for
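For anyone who wants to run the TypeScript example from the messages above, here is a self-contained sketch. The `declare function` signatures are replaced with small stand-in bodies, and the `Contract` fields (`signed`, `reviewed`) are assumptions made up for illustration; `reviewed` takes the place of `something`.

```typescript
type ContractTrustworthy = "true" | "false";
type Undecided = "in-the-middle"; // ignoring the number payload, as in the thread
type Agreement = ContractTrustworthy | Undecided;

// Hypothetical contract shape, only here so the example can run.
interface Contract {
  signed: boolean;
  reviewed: boolean;
}

function isTrustworthy(contract: Contract): ContractTrustworthy {
  return contract.signed ? "true" : "false";
}

function makeInTheMiddle(_contract: Contract): Undecided {
  return "in-the-middle";
}

// Each branch returns a different literal union; TypeScript accepts both
// because each one is assignable to Agreement. No wrapper is needed.
function calcAgreement(contract: Contract): Agreement {
  if (contract.reviewed) {
    return isTrustworthy(contract);
  } else {
    return makeInTheMiddle(contract);
  }
}
```

If the return annotation on `calcAgreement` were removed, TypeScript would infer the union of the two branch types on its own, which is the "automatic merging" being described.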

Kristian Notari (Oct 24 2022 at 15:22):

But there's nothing less/more error prone in doing it this way or with wrapping types

Kristian Notari (Oct 24 2022 at 15:23):

One can object that refactoring/readability/noticing future bugs can be easier/harder with one approach or the other (don't know which honestly), but that's pure developer friendliness, the two approaches don't change the program correctness underneath.

Kristian Notari (Oct 24 2022 at 15:25):

(here I'm using true and false as string literals instead of the types true and false which typescript has, boolean type, just to better adhere to the tags example in Roc)

Brendan Hansknecht (Oct 24 2022 at 15:34):

Yeah, the fundamental difference is just if you have to be explicit or not. That is all.

Agreement : [ True, False, InTheMiddle U64 ]

if something then
    isTrustworthy contract |> fromTrustworthy
else
    makeInTheMiddle |> fromInTheMiddle

Agreement : [ ExactAgreement ContractTrustworthy, FuzzyAgreement Undecided ]

if something then
    isTrustworthy contract |> ExactAgreement
else
    makeInTheMiddle |> FuzzyAgreement

Kristian Notari (Oct 24 2022 at 15:35):

Another example I can give on "consider tags automatically merging" is that from a mental model perspective (at least mine), they're unions of things. I assume you can always say "this OR that", in whatever language you like. And that's what you basically want, except you need a wrapping/casting/converting pass as you've just described before you actually can do that if you have closed tags.

Brendan Hansknecht (Oct 24 2022 at 15:36):

I think it has been shown in a lot of languages, that most of the time it is more preferable to be explicit because implicit casts lead to more bugs. As such, I prefer closed unions as the default. I understand the use of an open union. I think they are great for error tags, but errors are a special case where accumulating anything and everything makes a lot of sense.

Kristian Notari (Oct 24 2022 at 15:36):

Agreement : True OR False OR InTheMiddle U64

if something then
    isTrustworthy contract # True OR False
else # OR
    makeInTheMiddle # InTheMiddle U64

Kristian Notari (Oct 24 2022 at 15:37):

The fact I can't express this simple use case because some library author wrote their isTrustworthy function with closed tags is a no go personally. It shouldn't be his right to decide

Brendan Hansknecht (Oct 24 2022 at 15:38):

Also, I think it would be different if we had another way to define a tag as something like this:

Agreement: ContractTrustworthy | Undecided

In this case, I am explicitly saying that an Agreement is the merging of ContractTrustworthy and Undecided. In that case, you are kinda opting into implicit conversion. Roc does not currently support this, but if it did, I think it would have value.

Brendan Hansknecht (Oct 24 2022 at 15:39):

Kristian Notari (Oct 24 2022 at 15:39):

Agreement: []ContractTrustworthy

Kristian Notari (Oct 24 2022 at 15:39):

Kristian Notari (Oct 24 2022 at 15:41):

Brendan Hansknecht (Oct 24 2022 at 15:43):

I like closed unions as the default. I think open unions should only be used in special cases like error types. All that said, I definitely think there would be value in having the ability to define a type that explicitly merges multiple closed unions into a bigger union (still closed). This type would then enable the implicit cast use case, but at least it is documented in the type system by the new type being defined as the merging of multiple sub-unions.

Kristian Notari (Oct 24 2022 at 15:44):

The problem being: the function author should not be able to choose how the "closedness" of his tags affects my code

Kristian Notari (Oct 24 2022 at 15:44):

Kristian Notari (Oct 24 2022 at 15:45):

Cause you don't know what makes sense to close or open beforehand. Different context and domains should be able to decide on their own

Brendan Hansknecht (Oct 24 2022 at 15:45):

Which if you could define agreement as Agreement: ContractTrustworthy | Undecided, that would not be a problem. You just are explicitly changing your type to extend the closed ContractTrustworthy

Brendan Hansknecht (Oct 24 2022 at 15:46):

Instead of defining an arguably different type: Agreement : [True, False, InTheMiddle Num *] and then implicitly casting

Kristian Notari (Oct 24 2022 at 15:47):

Yeah with Roc "uniqueness of tag" philosophy I agree I should not recreate "True" tag and hope it will be considered the same as the function return type tag I'm calling, that's a thing

Brendan Hansknecht (Oct 24 2022 at 15:48):

Which I guess as you said could theoretically be written like: Agreement: [InTheMiddle Num *]ContractTrustworthy, assuming you are extending and not merging two different unions

Kristian Notari (Oct 24 2022 at 15:48):

But my Roc syntax before is not invented I'm pretty sure I've read it on the Tutorial. You could write Agreement : []ContractTrustworthy,Undecided or something similar and it just union them

Kristian Notari (Oct 24 2022 at 15:48):

Kristian Notari (Oct 24 2022 at 15:49):

The problem being, even if that's valid Roc syntax and we both agree you should be able to write your Agreement type like that, if the author of the isTrustworthy function says the return type is a CLOSED tag then you literally CAN'T merge it, even if your Agreement type says so, cause if/else must return the same type in both branches

Kristian Notari (Oct 24 2022 at 15:50):

and the Roc typechecker would argue it cannot "merge" the closed union tag returned by isTrustworthy

Kristian Notari (Oct 24 2022 at 15:50):

that's why they're looking at changing this behavior in Roc to "output types should have open tags only"

Brendan Hansknecht (Oct 24 2022 at 15:52):

Kristian Notari (Oct 24 2022 at 15:53):

Brendan Hansknecht (Oct 24 2022 at 15:53):

Kristian Notari (Oct 24 2022 at 15:53):

Kristian Notari (Oct 24 2022 at 15:54):

I'm feeling your worry for "my True tag being considered the same as someone else's True tag without anything explicitly written for it"

Kristian Notari (Oct 24 2022 at 15:55):

And I can agree with you it's not ideal, if the whole tag feature in the language seems to prioritize uniqueness

Kristian Notari (Oct 24 2022 at 15:57):

But my opinion is that considering the two True to be identical (so allowing what you call as "implicit casting") is better than having closed tags decided by function authors.

Brendan Hansknecht (Oct 24 2022 at 15:57):

We technically should be able to do this, the one disadvantage of it, is that it would have some sort of runtime cost. The compiler would just generate a conversion function that in the minimal case would just change an integer. In the worst case, might have to copy around a solid bit of data.

So returning open tags does have a performance advantage, but I think in most cases it would be minimal, and as long as we still had open tags, you would have an explicit way to work around it.

Kristian Notari (Oct 24 2022 at 15:59):

The problem still is: "is my True tag the same as the True tag defined in someone else's code?"

Brendan Hansknecht (Oct 24 2022 at 16:02):

One extra piece of information that probably actually argues for returning an open tag. If two unions have all the exact same tags with the same data, the compiler will consider them the exact same type unless they are opaque.

Agreement : [Agree, Disagree]

QuestionAnswer : [Agree, Disagree]

Since these two aliases happened to be 100% identical, they are the same type to the compiler.

Kristian Notari (Oct 24 2022 at 16:05):

Kristian Notari (Oct 24 2022 at 16:06):

I mean, if you really want to make things "closed" you just make it opaque I guess

Ayaz Hafiz (Oct 24 2022 at 16:06):

ContractTrustworthy a : [True, False]a
Undecided a : [Undecided U64]a

Combined : ContractTrustworthy(Undecided [])

I'm not sure I see how this helps though. You are still forced to do an explicit conversion if you want to convert ContractTrustworthy[] to Combined, at least today.

Kristian Notari (Oct 24 2022 at 16:08):

It was about considering Combined as "creatable" via tag values defined in the tag type it's defined from

Kristian Notari (Oct 24 2022 at 16:09):

Kristian Notari (Oct 24 2022 at 16:10):

Combined : [True, False, Undecided U64] # this should be its own definition
Combined : ContractTrustworthy(Undecided []) # this should be populated simply with ContractTrustworthy or Undecided tag values directly

Kristian Notari (Oct 24 2022 at 16:11):

Because in the first case I'm expecting to use MY OWN VALUES for Combined, while in the second example I'm expecting to use other tag values as values for Combined.

Ayaz Hafiz (Oct 24 2022 at 16:11):

I think I see what you're getting at, but those two definitions are the same today. The alias names Combined, ContractTrustworthy don't matter in measuring type equality; you have to use opaque types to make them distinct, as you mentioned.
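The point that alias names do not affect type equality has a close counterpart in TypeScript, sketched below. This is an analogy under stated assumptions, not a description of how Roc implements opaque types: the `Vote` type and `makeVote` function are hypothetical, and the `__brand` field is a common TypeScript idiom used here only to approximate the effect of an opaque type.

```typescript
// Two aliases with identical structure are the same type to the checker.
type Agreement = "Agree" | "Disagree";
type QuestionAnswer = "Agree" | "Disagree";

const a: Agreement = "Agree";
const q: QuestionAnswer = a; // accepted: the alias names play no role

// Approximating an opaque type: the brand makes the type distinct, so a bare
// "Agree" string is not a Vote and must go through makeVote.
type Vote = {
  readonly value: "Agree" | "Disagree";
  readonly __brand: "Vote";
};

function makeVote(value: "Agree" | "Disagree"): Vote {
  return { value, __brand: "Vote" };
}

const v: Vote = makeVote(q);
// const bad: Vote = "Agree"; // rejected by the type checker if uncommented
```

The branded version is the "closed" one: callers cannot build a `Vote` from look-alike tags by accident, which matches the suggestion in the thread to reach for opaque types when exactness matters.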

Brendan Hansknecht (Oct 24 2022 at 16:12):

Ah, I was hoping you could define Combined : ContractTrustworthy(Undecided []) -> allow implicit casts from the sub unions to the combined union.

Kristian Notari (Oct 24 2022 at 16:12):

Yeah now it makes sense, but before I was thinking tags were like enums, so each one with its own value, different from others even if called the same

Brendan Hansknecht (Oct 24 2022 at 16:12):

Like the compiler would create the casts for you since you wrote a unioned type.

Ayaz Hafiz (Oct 24 2022 at 16:13):

I agree re your previous point, I think if you really want things to be closed you should use an opaque type. In my mind tags are mostly for light-weight, contextual data that is unambiguous. If you want to encode domain-specific information that should be preserved faithfully, opaque types are a better choice.

I think we see this with Roc programs today already. Tags are mostly used only for collecting errors/effects, or for small, localized function calls that don't escape out anywhere (like List.walkUntil, which uses [Continue state, Break state]), and things that do need to escape further away are hidden behind an opaque type (e.g. Parser, File in the cli platform). It also is the case in the original motivating example for this discussion - Ok/Err tags.

Kristian Notari (Oct 24 2022 at 16:14):

Agreement : [Agree, Disagree]
QuestionAnswer : [Agree, Disagree]

remove the whole problem of "implicit/explicit" casting based on how tags are defined, if with direct values or if combined with existing values

Kristian Notari (Oct 24 2022 at 16:15):

Kristian Notari (Oct 24 2022 at 16:16):

Brendan Hansknecht (Oct 24 2022 at 16:17):

Kristian Notari (Oct 24 2022 at 16:18):

MyOpaqueTag = [A,B] # don't know how to write opaque tags yet sorry
MyOpenTag = [A,B]

Are these two equivalent? So, is the only discriminant for equality here the tag's name? If I have:

f : MyOpaqueTag
g : MyOpenTag

Brendan Hansknecht (Oct 24 2022 at 16:19):

Brendan Hansknecht (Oct 24 2022 at 16:20):

When using them in code, they are essentially wrapped tags that are wrapped with a special tag that is always going to be unique from other tags.

Brendan Hansknecht (Oct 24 2022 at 16:21):

Ayaz Hafiz (Oct 24 2022 at 16:25):

fromBytes : List U8, fmt -> Result val [Leftover (List U8)]DecodeError | val has Decoding, fmt has DecoderFormatting
fromBytes = \bytes, fmt ->
    when fromBytesPartial bytes fmt is
        { result, rest } ->
            if List.isEmpty rest then
                when result is
+                   result as finalResult -> finalResult
-                   Ok val -> Ok val
-                   Err TooShort -> Err TooShort
            else
                Err (Leftover rest)

Brendan Hansknecht (Oct 24 2022 at 16:26):

Anyway, I still think I like explicit casts by default, but I understand the sentiment and goals with being more open. Especially when thinking of tags as lightweight and temporary rather than exact and for the domain model. As was said, when you have a type that you want to be exact, you have opaque types.

Kristian Notari (Oct 24 2022 at 16:27):

A : [True, False]

Mine : A
Mine : [True, False]

mean two different things for Mine casting wise it's hard to easily explain. It's not what you would expect

Ayaz Hafiz (Oct 24 2022 at 16:28):

For implicit casting I am thinking of the underlying structural type, not based on the alias name. Relying on the alias name for structural aliases (not opaque types) has a lot of problems. But this is an aside, I don't want to take this thread off topic.

Brendan Hansknecht (Oct 24 2022 at 16:30):

Not sure if this is a can of worms and should just be pushed to implementation time or a different discussion, but how does syntax work if returned tags are always open?

myFunc : [ A, B] -> [ B, C] -> This would be closed tag to open tag?
myFunc : [ A, B]* -> [ B, C] -> This would be open tag to open tag?
myFunc : [ A, B]* -> [ B, C]* -> This would be invalid syntax because only function inputs can be explicitly labeled as open now?

Kristian Notari (Oct 24 2022 at 16:30):

Kristian Notari (Oct 24 2022 at 16:31):

Oh you mean, if my tag type has values [A,B] and I'm receiving a subtype set of those values I should just as them to my type?

Kristian Notari (Oct 24 2022 at 16:40):

But should that be explicit? I mean, if they are a subtype my opinion on this is that they should be converted implicitly. But again, this is not done due to performance reasons right? You have to explicitly tell the compiler you're converting them? With as keyword or something?

Brendan Hansknecht (Oct 24 2022 at 16:46):

I think this is talking about if the tags weren't open by default. Since the tags are closed, this is one way to implement the conversion instead of needing to write your own Agreement.fromBool. You would just use as Agreement. This should not be required in the case the functions return open tags by default.

Ayaz Hafiz (Oct 24 2022 at 16:52):

yeah the rules get messy. What you described is what I was thinking, and Richard and I previously talked about how that would extend to aliases and it seems correct there too. One thing that isn't clear to me is what should happen for tags behind opaque types, that seems trickier.

Ayaz Hafiz (Oct 24 2022 at 16:54):

I think it might be helpful to go back to the original motivation though, which is that @Kristian Notari initially found the behavior of open tag unions confusing. Do you feel like you have a better grasp on them now @Kristian Notari ?

If so, what do you think helped - the framing of "being able to use the tags in any context", or something else? And, if so, would changing how open/closed tag unions behaved help, in your opinion, or is this a documentation/teaching problem?

And if not, what do you continue to find challenging to understand about the behavior of open tag unions, given this thread?

Richard Feldman (Oct 24 2022 at 17:49):

Richard Feldman (Oct 24 2022 at 17:54):

also interesting: if it's the only way to do it (e.g. if we change it so that you can no longer have closed tag unions in the output position), then it becomes very clear that this is what you should do when you want things to be closed, because that becomes the only way to do it!

Kristian Notari (Oct 24 2022 at 19:12):

The confusing bits around tags (and more generally the * or "extensions/unions" in types, even for records) were about the framing. You've cleared my doubts with the sentence I've reported here already.

I don't find anything challenging anymore. Now I'm curious to see where these type system bits in Roc are going to lead, because from my perspective things that are structurally equal are equal when it comes to tags or records, even though one might like more control over how values of a specific type are made and accessed. I come from a TypeScript background, which explains why I'm more inclined to see everything as structural instead of named. I mean, I know of nominal type systems, but still. And I don't know how many name clashes with tags are going to happen, nor how well that plays out in actual development, because I have no experience with Roc whatsoever, so I'm willing to try.

One last thing, on the openness/closedness of types: since the "defaulting to open depending on input/output position" discussion we had, I've wondered whether this is one of those times where (going abstract here) a concept that covers A is being used to solve B. I don't know if that's the case, but when you adopt a concept and then need cases and branches to say "yes, but that thing in this context means another thing", maybe it's a clue that it's not the right approach, or at least not the best way to let the user approach it. I don't know; I really don't want to judge with so little experience in Roc. I'm just giving you my honest opinion on what I've been feeling while reading through the docs and writing with you here on this thread.

At the end of the day I just want to see Roc grow cause I was waiting for it these past months refreshing the website to see more talks from Richard that were not appearing and then I checked back and it was "available" so I'm already pleased a project like this (goals-wise) exists.


Why can't `*` capture everything?