I've read the tutorial's take on tag unions, and it says that after an if statement with two different tags as return values, the resulting type is an open tag union of those tags. Moreover, if you pattern match with when on an open tag union, it says you should handle all possible cases, which means having a _ -> something case.
So I was expecting this program not to typecheck, but it currently works without any problem. Am I misunderstanding tag unions?
It's mentioned in the tutorial, look for text: "when you already have a value which is an open union, you have fewer requirements"
The type of a by itself would be [ Err Str, Ok Str ]*. This enables it to merge with other tags if necessary.
In this specific case, the when clause requested that it be [ Err Str, Ok Str ]. So the open union is now restricted to a closed union, and it type-checks.
Here is a concrete integer example that may help:
x : Num a
x = 12
y : U8
y = 15
z = x + y
What type is x? Originally x could be any number type. Then it was added to y, which is a U8. As such, x has to be a U8 as well.
The same kind of narrowing, where a use limits the possible type, is what is happening with that if and when statement.
As another example, these are also valid:
a = if 3 > 4 then Ok "ok" else Err "err"
b =
    when a is
        Ok _ -> "yes"
        Err _ -> "no"
        _ -> "other?"

a = if 3 > 4 then Ok "ok" else Err "err"
b =
    when a is
        Ok _ -> "yes"
        Err _ -> "no"
        LastConcrete -> "That other concrete tag"
In the first, a would have the type [ Ok Str, Err Str ]*.
In the second, a would have the type [ Ok Str, Err Str, LastConcrete ].
OK, so basically the use of when is restricting the value I'm pattern matching on based on the cases I've written. But I don't get how a is being restricted when it's not restricted originally.
If I have [Ok Str, Err Str]*, I can't be sure I can restrict it to [Ok Str, Err Str], because there could be other possible cases for it. And that's what a is in this context.
Is this a case where the type checker automatically restricts a to be a closed union [Ok Str, Err Str], even at the assignment of a? So whenever I use a, it's a closed union because I've once used it as a closed union in a when inside b? Otherwise I'm not getting it.
I mean, the a that b is referring to is the same a I've defined on the previous row, right? It's not like I'm using some untyped argument to pattern match on.
I also realize I don't fully understand this, why is this an error for example?
» a = \e -> if 3 > 4 then Ok "ok" else e
… b = \v -> when a v is
… Ok "ok" -> "ok"
… Err "err" -> "err"
… b
── UNSAFE PATTERN ──────────────────────────────────────────────────────────────
This when does not cover all the possibilities:
5│> b = \v -> when a v is
6│> Ok "ok" -> "ok"
7│> Err "err" -> "err"
Other possibilities include:
Err _
Ok _
I would have to crash if I saw one of those! Add branches for them!
That's because you're working with unknown values given as input: both a and b use their arguments, which they know nothing about, so the type checker safely assumes they're open.
Right, that's a bad example because strings are not singleton types themselves, my bad.
But still, I'm not getting how my example type checks correctly.
My understanding is: value a has an "open" union type, meaning it could be extended with some other tag so that another value has a different type. But in this case, when passed as an input to the when, which expects a "closed" union type, it's fine because all possible tag variants in the type of a are statically known and compatible. In my example (after fixing the whens in the b function), the argument of function b gets narrowed to a closed tag union even though the a function has a wider parametrized tag argument type.
I come from a TypeScript background. If I have a value a typed as string | number, meaning it's either a string or a number but I don't know which, and I "pattern match" on it to explore all possible cases, I can't simply write a single case for string, because the type checker would complain that I'm missing a branch for the number case.
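For readers without a TypeScript background, the exhaustiveness behaviour described above can be sketched as follows. The function name describe and its messages are made up for illustration; the never assignment is a common TypeScript idiom for asking the compiler to prove that every case was handled.

```typescript
// "Pattern match" on string | number by narrowing with typeof.
function describe(a: string | number): string {
  if (typeof a === "string") {
    return `string of length ${a.length}`;
  }
  if (typeof a === "number") {
    return `number ${a.toFixed(1)}`;
  }
  // At this point `a` has been narrowed to `never`. If the number branch
  // above were deleted, this assignment would become a compile-time error.
  const unreachable: never = a;
  return unreachable;
}
```

This is the behaviour the question expects Roc to mirror for [Ok Str, Err Str]*, which is where the two type systems turn out to differ.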
In my example, I have a typed as [Ok Str, Err Str]*, which I understood to mean "I don't know which type of tag union this is; because of the * it could be anything, but for sure if I encounter an Ok or Err tag, it has a single value of type Str".
So when I pattern match on the a value and explicitly declare only two branches covering the Ok and Err tags, I'm left with *, which could mean a is something other than Ok and Err, and I'm not covering that case.
That's why I'm expecting a type check error if I'm not handling the * case.
Maybe @Brendan Hansknecht can answer me based on my last messages, because I'm probably missing something important
In my example, I have a typed as [Ok Str, Err Str]* which I understood being the same as "I don't know which type of tag union this is, due to the * in it could be anything, but for sure if I encounter an Ok or Err tag they have a single value of type Str".
I think that's the difference: in Roc this doesn't mean that this value might also have any other tag inside. I think it means it's known to be only either Ok Str or Err Str at this point, but it could be composed with other tags for a different value.
I guess a type itself doesn't constrain how a value of said type could be composed with some other values/types. The * there should mean it can be things other than Ok Str and Err Str, and I supposed you should cover that case in the when, as you do in the tutorial in this first example
Which I thought was exactly my case, so requiring a catch-all branch.
Good point, so it's different for concrete values vs function parameters
I'm guessing here, but after rereading the answer I received above, I suspect the following:
- in a when you should cover all cases, which includes the * case ("catch all")
- a has a type guessed by the type checker as [Ok Str, Err Str]* initially, but when I use it in the when expression inside my b value, since I'm defining only two cases, I'm actually restricting a to only be [Ok Str, Err Str] (so without *)
The only thing left for me to understand is: why should it be possible to restrict the type of a value if that doesn't affect the original value's type?
In my example, the two a used are actually the same a, the same value. If I'm using it in my b value, and by doing so restricting it to only be [Ok Str, Err Str] (without *), my a value outside b keeps its initial type of [Ok Str, Err Str]*.
While I was expecting it to be considered [Ok Str, Err Str] everywhere else.
I thought I understood, but I agree with you: what is the type of a in this example, and why does it work?
a = if 3 > 4 then Ok "ok" else Err "err"
b = when a is
Ok _ -> "yes"
Err _ -> "no"
c = when a is
Ok _ -> "yes"
Err _ -> "no"
_ -> "?"
Str.concat b c
# "nono" : Str
Yeah, because in a mutable environment (so not Roc), you could change the value of a before using it in b, and if the type is not restricted (or is only restricted within b), you could assign any other tag to a, causing b to crash at runtime. That is not the case here, due to immutability, but I'm still not getting it.
Ghislain said:
I thought I understood but I agree with you, what is the type of a in this example, and why it works?
a is still [Ok Str, Err Str]*, which is the more generic of the two alternatives. But why does b type check? It should not in your example. Is it working correctly? Have you tried it in the repl?
yes, it works
Maybe the third "catch all" case in c is redundant, since a is already restricted by b. It could be omitted. Maybe that would be a warning in the future. That would explain why b works, but it also implies that the type of a is actually being restricted, even if it doesn't seem to be.
Yeah, but I tried this:
a : [Ok Str, Err Str]*
c = when a is
Ok _ -> "yes"
Err _ -> "no"
_ -> "?"
and it prints the redundancy:
The 3rd pattern is redundant:
10│ c = when a is
11│ Ok _ -> "yes"
12│ Err _ -> "no"
13│ _ -> "?"
^
Any value of this shape will be handled by a previous pattern, so this
one should be removed.
by the way I've encountered this right now while playing with it :grinning_face_with_smiling_eyes: :
image.png
I guess you have to provide an implementation for a
Ghislain said:
and it prints the redundancy:
The 3rd pattern is redundant:
10│ c = when a is
11│ Ok _ -> "yes"
12│ Err _ -> "no"
13│ _ -> "?"
^
Any value of this shape will be handled by a previous pattern, so this one should be removed.
That's strange to me :(
image.png
this explains why it's working, but also implies we're not getting it right
the second point in the list
I'm not sure I should be able to type this:
image.png
Maybe something interesting is the output of this expect:
expect
b = if 3 > 4 then Ok "ok" else Err "err"
b != b
b : [Err Str, Ok Str]a
b = Err "err"
It doesn't give [Err Str, Ok Str]*, though I don't exactly understand the difference yet.
Ghislain said:
it doesn't give [Err Str, Ok Str]*, though I don't exactly understand the difference yet
The difference should be naming, so that you can refer to "whatever more there is" as a when composing tag types. But if it's not used anywhere else, it should basically be equivalent to saying * there, because you're not forcing a to be something that appears somewhere else.
It's like assigning * to a named type variable a that you can use at other points.
Ok this explains why it's working:
image.png
At least counterintuitively for me, what is happening here is that the * bit doesn't allow the value you're assigning to be anything different from what the original tag union is. It's basically only useful when used as a type parameter in functions.
Even if you use it in function return types as in:
c : {} -> [Ok Str]*
You're forced to return a tag called Ok with a value Str. You can't return Err "err" from c
That's what you would get:
image.png
I am not sure how helpful this will be, but something that helps me is to remember that when all is said and done, open tags like [Ok Str, Err Str]* no longer exist. The final output machine code will only ever specify closed tag types and values.
What is a in this example?
a = if 3 > 4 then Ok "ok" else Err "err"
b = when a is
Ok _ -> "yes"
Err _ -> "no"
c = when a is
Ok _ -> "yes"
Err _ -> "no"
_ -> "?"
Str.concat b c
# "nono" : Str
a is a tag that has 2 possible variants. The first variant, Err Str, has a tag of 0 and contains a Str. The second variant, Ok Str, has a tag of 1 and contains a Str. The physical layout in memory is essentially { value: Str, tag: bool }. Once types are finalized for the whole program and it is being turned into machine code, there is no possibility for a to be anything except Err Str or Ok Str.
As such, the _ -> "?" is literally impossible.
Yeah, that is simple to reason about. What doesn't help is seeing a typed as [Ok Str, Err Str]* by the compiler/type checker/repl. That way you're convincing me * could be used when declaring things as a placeholder for "this tag could be anything", which is not the case.
The * is only helpful when accepting something in a more open/wide way, as in function arguments.
As I'm understanding it
In fact this is an error:
image.png
And it's counterintuitive for me :)
Should this fail like this?
image.png
Brendan Hansknecht said:
in the first, a would have the type [ Ok Str, Err Str ]*.
In the second, a would have the type [ Ok Str, Err Str, LastConcrete ].
@Brendan Hansknecht As I understand what you just said, a would still have the same type [Ok Str, Err Str]* (the when would have no responsibility over the type of a)
I think the main confusion about open tags comes from them acting differently depending on context. This is seen most easily with function arguments and return values.
^^ yeah, noting that the star mostly matters in function arguments is very correct. Though it also matters in return types.
When handling a function argument, you always need to deal with the _ case if it is an open union. This is because a function parameter is not concrete. It is a value that will change on each call. In the end we still have to generate concrete types, but we can't analyze locally and do that.
When you see:
f : [ Ok Str, Err Str ]* -> Str
f = \x ->
when x is
Ok str -> str
Err str -> str
_ -> "?"
c = Apple
f c
What is the concrete type of x? That is, the actual value that is generated by the call f c.
It is a tag with 3 variants:
- Apple with tag 0 and no value
- Err with tag 1 and a value of Str
- Ok with tag 2 and a value of Str
So the in-hardware representation is essentially { value: Str, tag: U8 }, and in the case of Apple, the value is empty and the tag is 0.
I'm getting the why and how of open/closed tags now, but it's still confusing how * changes importance in different contexts
Have you looked at the repl examples I've shown above? here
This is counterintuitive at least :)
The same goes for List * or wherever you use *
I get that when creating values directly without using functions, it doesn't really make sense to use * in the first place; in fact the type checker errors out and advises you to narrow the type or generalize the value
but still, it's confusing :D
Now switching to the return case. An important part of a returned tag union is that it is allowed to grow. As such, you might start with [ Ok Str, Err Str ]* and end with [ Ok Str, Err Str, Apple ].
another example:
# let's just pretend this function is defined:
f : Str -> [Ok Str, Err Str]*
if someBool then
f "yay!"
else
Apple
This generates the same concrete tag as above. As such, the generated code from f has to be modified. If f were to just generate [Ok Str, Err Str], Err would have a tag of 0. This would end up leading to a conflict with Apple, which has a tag of 0 in the final tag. As such, the final output type of f is [ Apple, Err Str, Ok Str ]. Though it will only ever generate the Err or Ok case, it has to generate them with a new tag due to the tag union expanding to include Apple.
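The tag renumbering described above can be sketched in a few lines. TypeScript is used only because the thread already uses it for analogies. assignTags is a hypothetical helper, and numbering variants in alphabetical order is an assumption inferred from the tag numbers quoted in this thread, not a documented rule of the Roc compiler.

```typescript
// Assumption: each variant's runtime tag is its index after sorting the
// variant names alphabetically. This matches the numbers quoted in the
// thread but is not confirmed there.
function assignTags(variants: string[]): Record<string, number> {
  const tags: Record<string, number> = {};
  variants
    .slice()
    .sort()
    .forEach((name, index) => {
      tags[name] = index;
    });
  return tags;
}

// f on its own: [Ok Str, Err Str]
const closedTags = assignTags(["Ok", "Err"]);
// After the union grows to include Apple: [Apple, Err Str, Ok Str]
const grownTags = assignTags(["Ok", "Err", "Apple"]);
```

Under that assumption, Err moves from tag 0 to tag 1 once Apple joins the union, which is why the code generated for f cannot keep the numbering of the smaller union.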
Kristian Notari said:
Have you looked at the repl examples I've shown above? here
So we talked about changing the name at some point to growing tag union instead of open. */[]* really means the current definition of this tag contains no variants, but later it can grow to contain variants. So [Ok, Err]* means: the current tag can be either an Ok or Err variant, with the capability to eventually grow to hold other variants.
Brendan Hansknecht said:
Why should the writer of the function f be responsible for opening that for you? If the function f only returns Ok or Err, it should say so in the signature as the return type. What I want to do with that return value after calling f should be the caller's responsibility. Is that done for performance reasons only?
I get the importance of preserving open tags when accepting them and when you need to return them (after modifying them for example, so they're linked to the argument type), but I'm not getting why I should bother as a function writer how callers will "union" tags in the future, after the call to my function
If I define Color : [ Red, Green, Blue ] and have a function f : Str -> Result Color [ NotAColor ]*, I am intentionally constraining the caller of this function. I am telling them that I know all possible colors. You cannot take my color and merge it with some other list of colors. You only get red, green, or blue. At the same time, I am leaving the error type open to the user. I am telling them that I might give them a NotAColor error, but they might have tons of other errors they also want to handle.
That enables my color to be passed into my other function draw : Shape, Position, Color -> Drawing. If instead I returned a [ Red, Green, Blue ]*, the user would have to match on it and then constrain the result to [ Red, Green, Blue ] before they could pass it to draw. I don't want that tag to expand. It is inconvenient to both myself and the user.
Also, I guess it technically isn't important for performance, we could force all functions to always return open tags. That would not hurt performance, but it would make programming less convenient.
@Ghislain, can you be more specific with this comment: https://roc.zulipchat.com/#narrow/stream/231634-beginners/topic/Open.20tag.20unions/near/305162300
Yes, I tried your code (I hope that the function doesn't change the behavior; expect doesn't seem to like when in it)
fn = \v ->
when v is
Ok _ -> "yes"
Err _ -> "no"
LastConcrete -> "That other concrete tag"
expect
a = if 3 > 4 then Ok "ok" else Err "err"
(fn a) != (fn a)
and it gives:
a : [Err Str, Ok Str]a
a = Err "err"
Ah, I guess when I talked up above I was also trying to give the "concrete" type that the actual compiled binary would see. Technically the type of a doesn't exactly change. a is a [Err Str, Ok Str]b (used b to avoid confusion below)
What does actually change is b. In your example, b is [ LastConcrete ], leading to a final "concrete" type of [Err Str, LastConcrete, Ok Str].
Brendan Hansknecht said:
That enables my color to be passed into my other function draw : Shape, Position, Color -> Drawing. If instead I returned a [ Red, Green, Blue ]*, the user would have to match on it and then constrain the result to [ Red, Green, Blue ] before they could pass it to draw. I don't want that tag to expand. It is inconvenient to both myself and the user.
I get it, but that's not how it works in other languages, and I'm confused about why this is needed. Say my f function returns a Color and the caller needs a Color to use in other functions. If the caller builds a color from the value returned by f like this:
# let's just pretend this function is defined
Color : [ Red, Green, Blue ]
f : Str -> Color
# I expect myColor to be [Purple]Color
myColor = if Bool.true then f "yay" else Purple
That automatically gives them a type error when they try to call:
draw shape position myColor
Because myColor is not assignable to Color. I don't need to "help" the caller or constrain the caller. The signatures already say that.
While the open/closed concept on tags and records is useful when accepting stuff or when linking accepted stuff to returned stuff, in the case of f, having the ability to constrain the tag union returned by the function so that the caller can't mess with it is confusing to me, and I'm not getting why it's a feature.
Interesting. I come from lower level languages where the default is constrained and there are no other options. An enum is defined once and has no flexibility. As such, defaulting to constrained makes a lot of sense to me. Have never thought about it the way you are mentioning.
What is a language that does what you mentioned and also has a static type system?
I found that open unions can cause weird inference sometimes, so I appreciate that closed unions exist. Here I see no reason why I would need to handle A in bar, other than open unions weirdness:
» foo = \a ->
… when a is
… A x -> B x
… y -> y
…
… bar =
… when foo C is
… B _ -> "ok"
… C -> "ok"
… bar
── UNSAFE PATTERN ──────────────────────────────────────────────────────────────
This when does not cover all the possibilities:
10│> when foo C is
11│> B _ -> "ok"
12│> C -> "ok"
Other possibilities include:
A _
_
I would have to crash if I saw one of those! Add branches for them!
As an aside, open tags can have a cost in terms of generated assembly bloat and memory bloat, that said, without silly mistakes, I would expect this cost generally be zero.
Here I see no reason why I would need to handle A in bar, other than open unions weirdness
(That said I also wouldn't be able to write this code with closed unions, and I didn't find other weird behavior yet)
The weirdness is because the input tag and the output tag have to be the same type. This is because of y -> y (it is a no-op, we don't re-layout the tag). As such, if the input type can have an A variant, it means the output can theoretically have an A variant. Also, if the input can have any tag, so can the output.
It's kinda as if you wrote f : Str -> [ Red, Green, Blue ], but f only ever returns the Green variant. Even though the result of f will only ever be Green, you still need to match on Red and Blue when using the result of f.
I think e.g. Typescript might do better with examples like this because they do flow typing? (I think it's more work for the compiler but I don't know much about it)
Following the logic: for A the return type is B, for []* \ A (anything but A) the return type is the same, then combine the branches so the type is []* -> [B]* \ A
But that seems a bit offtopic already, sorry about that
Yeah, we don't do any "negative" types. I'm not exactly sure what to call them.
Where it can be any value except something
So it is reducing the scope of variants a variable could be.
Michał Łępicki said:
I think e.g. Typescript might do better with examples like this because they do flow typing? (I think it's more work for the compiler but I don't know much about it)
That's the background I'm coming from and that's why it sounds confusing reasoning about types as it is in Roc
Both for "excluding" subtypes from types and for the context-sensitive utility of *
But still, I'm not getting why not go for something like TypeScript and rule out things like y -> y with better types, or accept things like:
a : *
a = Ok "ok"
Apart from performance, I don't see the reasons why. Maybe they are good reasons, but I'm curious to know them
At a minimum, it would make the compiler more complex and slower, it would have to figure out the underlying concrete type and the possible variants the type could take. So that is more information to collect and propagate. That said, I don't really know the answer. Hopefully someone with more type system knowledge can jump in and expand the explanation.
This is going to be a long message, but I hope it will provide more context on how and why this works in Roc. It's a question we get pretty often and a story we've been trying to figure out how to make better, through documentation and/or language changes.
In rough, I'll try to provide an intuition for how these tags work, why they are useful in applications, and why things are this way - in particular, why not switch to subtyping? Then I'll show one experiment we are considering.
Okay, so the first thing to keep in mind is that Roc does not have subtyping. That means that a value of type A cannot be used where a value of type [A, B] is expected, unlike in e.g. TypeScript, where A can be passed to a function that expects A|B. We'll get to why Roc doesn't have subtyping later, but what this means, intuitively, is that if you try to use a tag union value given as the result of a function, your use must match the returned value exactly.
This means if I have a function
openFile : Str -> Result File [OpenFileError ...]
writeFile : File, Str -> Result {} [WriteFileError ...]
openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
when openFile file is
Ok f -> writeFile f content
Err e -> Err e
This does not compile, because despite [OpenFileError ...] (and [WriteFileError ...]) "fitting" into [OpenFileError ..., WriteFileError ...] from a subtyping perspective, that is not how Roc does things - because these types do not match exactly, they are not equal. Again, it is a very reasonable question why things cannot be done that way, and I'll elaborate on that later on.
So, where did things go wrong here? The key insight is that our functions openFile and writeFile enumerate the failure cases they might produce, but the context those failures are used in is determined only by the user of those functions! Roc enables authors to enumerate variant cases while providing callers flexibility in how they can use those cases via open tag unions.
I think it may be helpful to look at analogous example in TypeScript here. We can write the following program:
interface OpenFileError { ... }
interface WriteFileError { ... }
type Result<Ok, Err> = ...
function openFile<T>(fileName: string): Result<File, OpenFileError | T> { ... }
function writeFile<T>(file: File, content: string): Result<File, WriteFileError | T> { ... }
I hope this might explain more of what's going on - as you can see, T is not useful for anything except for defining the type expected by the caller of that function! Now of course, this might seem totally useless in TypeScript, since OpenFileError always fits into OpenFileError | T, for any T. But I hope you can see that in cases like Roc's, where OpenFileError can be used as OpenFileError | T only if T = [], and otherwise the return value must be exactly OpenFileError | T, this can be useful.
So, this now corresponds to the Roc program
openFile : Str -> Result File [OpenFileError ...]*
writeFile : File, Str -> Result {} [WriteFileError ...]*
where the TypeScript type parameter <T> is Roc's * here.
The usage
openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
when openFile file is
Ok f -> writeFile f content
Err e -> Err e
from before now compiles, because the Roc compiler will effectively see "oh, I need to create an openFile where * = WriteFileError", and generate a specialization of openFile that has exactly the type signature
openFile : Str -> Result File [OpenFileError ..., WriteFileError ...]
which will be the version used in openAndWrite. That's analogous to instantiating
function openFile<T>(fileName: string): Result<File, OpenFileError | T> { ... }
with a concrete T = WriteFileError to produce a function
function openFile(fileName: string): Result<File, OpenFileError | WriteFileError> { ... }
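A runnable version of the TypeScript analogy may make the instantiation step easier to see. This is only a sketch: the fields on the error interfaces, the FileHandle type, the shape of Result, and the failure conditions are all invented for illustration, since the thread elides them with "...".

```typescript
// Hypothetical shapes: the thread elides the real fields with "...".
interface OpenFileError { kind: "OpenFileError"; path: string }
interface WriteFileError { kind: "WriteFileError"; path: string }
interface FileHandle { path: string }

type Result<Ok, Err> =
  | { tag: "ok"; value: Ok }
  | { tag: "err"; error: Err };

// T stands in for Roc's `*`: openFile itself only produces OpenFileError,
// and the caller chooses which wider error union the result lives in.
function openFile<T = never>(path: string): Result<FileHandle, OpenFileError | T> {
  // Made-up failure condition, just to have an error path.
  if (path === "") {
    const error: OpenFileError = { kind: "OpenFileError", path };
    return { tag: "err", error };
  }
  return { tag: "ok", value: { path } };
}

function writeFile<T = never>(file: FileHandle, content: string): Result<null, WriteFileError | T> {
  // Made-up failure condition, just to have an error path.
  if (content.length > 10) {
    const error: WriteFileError = { kind: "WriteFileError", path: file.path };
    return { tag: "err", error };
  }
  return { tag: "ok", value: null };
}

function openAndWrite(path: string, content: string): Result<null, OpenFileError | WriteFileError> {
  // Instantiate each T so that both results share one error union.
  const opened = openFile<WriteFileError>(path);
  if (opened.tag === "err") {
    return opened;
  }
  return writeFile<OpenFileError>(opened.value, content);
}
```

In TypeScript the explicit type arguments are optional, because subtyping would accept the narrower error type anyway. They are spelled out here only to mirror the specialization the Roc compiler is described as performing.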
So, I hope the ethos of "allow users of a value decide the context it can be used in" makes sense at this point. I'd like to show how this extends to literal/non-function values as well, since that often trips folks up.
I think it's good to start with numbers.
n : Num *
n = 1
I think this is more intuitive than open tag unions, but the idea is the same - what exactly n is, from a raw-bytes-on-the-machine perspective, isn't determined until you use it in a particular way that resolves it to a concrete type - for example, pass it to a function that expects a U64, or an F32, or something else.
And this means that you can use n in as many contexts as you like, because the definer of n gave you the freedom to do so! For example, the following works just fine:
n : Num *
n = 1
rcd : {a : U64, b : F32}
rcd = {a: n, b: n}
The same principle extends to tag unions. I can do something like
purple : [Purple]*
purple = Purple
Pastel : [Purple, Pink, Creme]
Deep : [Purple, DarkBlue, Indigo]
rcd : {pastel: Pastel, deep: Deep}
rcd = {pastel: purple, deep: purple}
You could think of purple : [Purple]* as the TypeScript type signature const purple<T>: Purple|<T>, if TypeScript allowed you to say something like that!
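Since TypeScript has no generic constants, the closest runnable approximation is a generic function with no arguments. The sketch below assumes string literal types as stand-ins for Roc tags; the names Pastel, Deep and rcd follow the Roc example, while the function form of purple is an adaptation, not something from the thread.

```typescript
type Pastel = "Purple" | "Pink" | "Creme";
type Deep = "Purple" | "DarkBlue" | "Indigo";

// T emulates Roc's `*`: the value is always "Purple", and each use site
// picks the rest of the union it wants to live in.
function purple<T = never>(): "Purple" | T {
  return "Purple";
}

const rcd: { pastel: Pastel; deep: Deep } = {
  pastel: purple<"Pink" | "Creme">(),
  deep: purple<"DarkBlue" | "Indigo">(),
};
```

Because TypeScript has subtyping, a plain purple() call would also be accepted at both use sites. The explicit type arguments only make visible the choice that Roc's inference is described as making per use.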
This might seem contrived, and usually, it is - much more so than allowing callers of functions to use return values in any context they like. It used to be important when Roc's booleans were not an opaque type, and were instead the tags [True, False]. You could write programs like
flag = True
if flag then foo {} else bar {}
Now, if the type of flag was only [True], it could not be used in the context [True, False] in the "if" statement. By making it [True]*, you add a type variable that tells the compiler "hey compiler, this is a True value, but allow people to use it in any context they like", and the compiler can then make that particular usage exactly [True, False].
* capture everything?
Finally, let's take a look at the example
foo : {} -> [A]*
foo = \{} -> B
Maybe you already understand now why this does not type check, but if not, the key to remember here is that * does not add information to the type, and it does not mean "this tag can be anything else". It means, "allow this value to be used in any context that includes itself, or anything more". So, foo's API contract is "I produce an A, and you can use that A in any larger context". But the contract is a lie, because it actually produces a B!
This would be analogous to the TypeScript
interface A {a: ""}
interface B {b : ""}
function f<T>() : A | T {
return {b: ""}
}
which also does not typecheck, for the same reason. Or even more directly,
function f<T>() : number | T {
return ""
}
I won't bore you with the details here, I (and I'm sure others) would be happy to elaborate more to anyone who is interested, but the TLDR is that Roc's model of compiling to efficient, boxed representations of types would break down if Roc used subtyping a-la TypeScript's subtyping. The easiest way to think about this is that with the program
openFile : Str -> Result File [OpenFileError ...]*
writeFile : File, Str -> Result {} [WriteFileError ...]*
openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
when openFile file is
Ok f -> writeFile f content
Err e -> Err e
after the type checker figures out how the *s should be instantiated with a concrete type, based on the contextual usages in openAndWrite, we end up with the program
openFile : Str -> Result File [OpenFileError ..., WriteFileError ...]
writeFile : File, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
when openFile file is
Ok f -> writeFile f content # no conversion needed here!!
Err e -> Err e # no conversion needed here!!
And notice in the two branches, we don't need to do any conversions at all to pass the values through with the same types - since they are exactly the same type, they will have exactly the same underlying byte representation on the machine, and can be transparently passed through.
With how Roc compiles code, the same would not be true if we allowed
openFile : Str -> Result File [OpenFileError ...]
writeFile : File, Str -> Result {} [WriteFileError ...]
openAndWrite : Str, Str -> Result {} [OpenFileError ..., WriteFileError ...]
openAndWrite = \file, content ->
when openFile file is
Ok f -> writeFile f content # uh-oh, how do I convert?
Err e -> Err e # uh-oh, how do I convert?
Now, the types returned from openFile and writeFile are not the same types as they are used in openAndWrite, and so the underlying representation (as it is today) would not be the same. The Roc compiler would have only a few options here:
I hope that provided a baseline idea of why Roc doesn't use subtyping today, but happy to elaborate more.
Maybe! We'd like to think so. One idea is that we should always make tag unions open in "output position", i.e. when returned by a function, and always allow the user to determine the context they are used in. Here's a playground of what that would look like: https://ayazhafiz.com/plts/playground/cor/easy_tags, feedback welcome!
Hit 9985 chars of the 10000 chars limit there :sweat_smile: and apparently it can't be edited. Edits:
First of all, thanks for taking the time to answer my doubts.
I've understood the performance reasons behind, the typescript analogies (so how they relate with Roc code as you described) and the subtyping stuff. I'm not getting the lines like:
Ayaz Hafiz said:
So, where did things go wrong here? The key insight is that our functions openFile and writeFile enumerate the failure cases they might produce, but the context those failures are used in is determined only by the user of those functions! Roc enables authors to enumerate variant cases while providing callers flexibility in how they can use those cases via open tag unions.
where I assume there's also some "usability" reason behind the choice.
I mean, I'm not seeing why I, as a library developer, should choose to return closed tag unions, which basically limit what my caller could do with that value
Yeah, I don’t think you ever would want to! That’s why open tag unions are useful, to let the caller decide how exactly they want to use it. And it’s the subject of the experiment I mentioned at the bottom, where we can make it so that you cannot say that a tag union is closed if it appears as the return value of a function.
Because, from what I've understood, if the return tags were to always be open, they are then implemented, after compilation, depending on how you've used those within your code. So if you only use those for what they are, without ever unioning them with other tags, performance wise that's the same as having closed tags in the first place.
Yes exactly. Open tag unions are never a thing at runtime, just like “<T>” type parameters are never a thing at runtime in TS/JS. They exist solely to provide more flexibility in the API surface of programs.
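To make the TS comparison concrete, here is a minimal sketch (the `Result` shape and the `fail` helper are made up for illustration, they are not Roc or library APIs). The type parameter `E` plays a role similar to the `*` in an open tag union: it only exists for the type checker, and the context the value is used in decides what it ends up being.

```typescript
// A result whose error type is a parameter, fixed by each use site.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// `E` is erased when this compiles to JavaScript; at runtime there is
// only a plain object carrying a string.
function fail<E extends string>(error: E): Result<never, E> {
  return { ok: false, error };
}

// Inferred as Result<never, "NotFound">.
const a = fail("NotFound");

// The same value is accepted in a context that allows more error cases.
const b: Result<number, "NotFound" | "Timeout"> = a;

console.log(JSON.stringify(b));
```

Nothing about `"Timeout"` exists at runtime; the wider annotation on `b` only changes what the type checker will accept.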
Do you have any examples of closed tag unions being useful as return types? It's not something that bothers me that much, but having a feature in the language which is only useful depending on where you use it smells like something that could be described better using another approach. Not to sound offensive, I'm just trying to get to know Roc better because I really want to try it out!!
Ayaz Hafiz said:
[...] and it does not mean "this tag can be anything else". It means, "allow this value to be used in any context that includes itself, or anything more"[...]
By the way, this should be written in capital letters in the tutorial page :D if it is not already
No you’re totally right. I don’t think it’s all that useful to have closed tag unions in output position. The only example i can think of is if you write a function
id : [A] -> [A]
id = \x -> x
but i’ve talked about this before with some people, and it seems like you would never actually write such a thing in practice. Which means it may be better to have returned [A] always be open (ie what is [A]* today)
Ok so basically you're trying to convert [A]
to be by default as you wrote [A]*
when used in output positions?
Yes
And what about direct values? Like:
a : [A]*
# or
a : [A]
Are they the same as output positions for functions?
yes
So basically the *
becomes relevant only in typing function arguments when you don't want to be that strict?
It starts to smell like variance :D when used as input or output it basically defaults to the opposite direction
Would closed tags be impossible to describe at all in output positions, or would there be something new like [A]!
to explicitly force closedness?
I still think that many things have limited scope. If something has a known limited scope, it should always be a closed union (whether argument or return type). A Bool should always be [True, False]
, I don't ever want a Bool to expand to include KindaTruthy
.
From a definition perspective yes, but you should not limit the caller by defining it closed on the return type of your functions. Because the typechecker already checks what you have, if you decide to union [True, False]
with KindaTruthy
then boolean operations don't work anymore. It's not something that's hidden from your control. It doesn't affect anything
I don't want to write code that enables the user to create something that is more bug prone. Whether intentional or not. If they want a KindaTruthyBool
, they can explicitly define it. I get that it would be constrained elsewhere, but I think every location should be as constrained as makes reasonable sense. It makes reasonable sense to constrain a Bool
type to never expand.
the following works just fine:
n : Num *
n = 1
rcd : {a : U64, b : F32}
rcd = {a: n, b: n}
I am not sure if I find this super cool or really scary. It feels like enabling implicit casts in my mind. As in, the *
in Num *
should only be able to refer to one thing. If it was instead a type parameter T
, it would either become U64
or F32
. The fact it can become both feels like the type system lying. On the other hand, everything in Roc is immutable and it isn't like the type is changing from one line to the next, it is just being defined by each use. Super weird....not sure what to think of it.
It's not being two things at once, from what I've understood. It's just open to be concretely implemented by the compiler as the thing it's used for.
Brendan Hansknecht said:
I don't want to write code that enables the user to create something that is more bug prone. Whether intentional or not. If they want a
KindaTruthyBool
, they can explicitly define it. I get that it would be constrained elsewhere, but I think every location should be as constrained as makes reasonable sense. It makes reasonable sense to constrain a Bool
type to never expand.
You can't be sure what's error prone, because what they do with the value you return is solely up to them. What if you have the week's workdays as tags and you return a closed tag union of [Monday, Tuesday, ..., Friday]
and they want to use it as value for one of their "Schedule" values (tags) which also allows Weekends?
Moreover, Bool
expansion can be useful. Consider the case where you want to describe (not saying it's the best way to do it, but still) how much someone agrees with you. You could go for:
Agreement : [True, False, InTheMiddle Num *]
So to have three cases, 100% (yes, True), 0% (no, False), and something in the middle with a percentage.
You just defined a different type, not a Bool
. It can have a nice fromBool
method, but I don't think any Bool should implicitly convert to an Agreement
Implicit conversions are a common source of bugs.
You're not converting anything. If you have:
# this is my type
Agreement : [True, False, InTheMiddle Num *]
# this is an external API that gives me some boolean value back
getTrueOrFalse : [True, False]
# this is my function which should do something then give back my agreement value
f = if something then getTrueOrFalse else InTheMiddle 42
There's no conversion involved, yet it does not type check. Why shouldn't it type check?
I guess I just fundamentally see types and conversions differently than you do.
# My type
# Note, neither Bool nor Agreement are open tags. They explicitly enumerate every state they can ever contain.
Agreement : [True, False, InTheMiddle Num *]
# Some function that tells me if a contract is trustworthy or not.
# Maybe even uses its own type instead of Bool. could be [ Trustworthy, Untrustworthy ]
# In this case, lets assume bool.
isTrustworthy : Contract -> Bool
# My function to get an agreement value
calcAgreement : Contract -> Agreement
calcAgreement = \contract ->
if something then
# in this case only the trustworthiness matters. No middle states
# if we just return `isTrustworthy contract` here, I think that should be a bug.
# it is an implicit conversion from `Bool` to `Agreement`.
# they are not the same type. We need to be explicit with the conversion.
isTrustworthy contract |> Agreement.fromBool
else
# we don't have enough info to use isTrustworthy. So actually do some agreeability calculation.
InTheMiddle 42
I think it is very important to note, what happens if isTrustworthy
changes its API.
In my example, if isTrustworthy
changes to anything other than a Bool
, there will be a compiler error.
This means whoever changes isTrustworthy
will be forced to consider my code (even if this is me in the future updating some external library).
If instead, we just return isTrustworthy contract
directly because the two tags are merged, we might not get a compiler error when the api of isTrustworthy
changes. With this small example, it is likely that either way we will get a compiler error, but with larger tags, it is possible that isTrustworthy
could be modified such that it is still a subset of Agreement
, but using the tags to mean something different.
For this example, which of course will be contrived due to these specific tags, what if isTrustworthy
suddenly started returning InTheMiddle 423
where this doesn't mean we have 423
agreement, but instead, the contract is still in the middle of negotiation, so we don't know if it is trustworthy and 423
is some sort of contract legal code for what stage the contract is in. We just introduced a huge bug due to allowing the result from isTrustworthy
to implicitly convert to an Agreement
.
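A rough TypeScript sketch of the hazard described above, using tagged objects in place of Roc tags. All names (`Trust`, `fromTrust`, `calcImplicit`, `calcExplicit`) are hypothetical. With the explicit conversion, widening the upstream type should turn into a compile error in one place; with implicit merging, a new variant would flow straight through.

```typescript
type InTheMiddle = { tag: "InTheMiddle"; value: number };
type Bool = { tag: "True" } | { tag: "False" };
type Agreement = Bool | InTheMiddle;

// Upstream API. Imagine a later version widening `Trust` to
// `Bool | InTheMiddle`, where `value` is a contract stage code
// rather than an agreement percentage.
type Trust = Bool;

function isTrustworthy(contractId: number): Trust {
  return contractId % 2 === 0 ? { tag: "True" } : { tag: "False" };
}

// Explicit conversion: the `never` assignment stops compiling as soon
// as `Trust` gains a variant that is not handled here.
function fromTrust(t: Trust): Agreement {
  switch (t.tag) {
    case "True":
      return { tag: "True" };
    case "False":
      return { tag: "False" };
    default: {
      const unhandled: never = t;
      return unhandled;
    }
  }
}

// Implicit merging: this keeps compiling even if `Trust` is widened,
// because any subset of `Agreement` is accepted as-is.
function calcImplicit(contractId: number): Agreement {
  return isTrustworthy(contractId);
}

function calcExplicit(contractId: number): Agreement {
  return fromTrust(isTrustworthy(contractId));
}

console.log(calcImplicit(2).tag, calcExplicit(3).tag);
```

Both versions behave the same today; the difference only shows up when the upstream signature changes.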
That is compelling, but I don't know if that particular problem is avoidable in general with a language that has anonymous structural types. Also, I feel like the payload of InTheMiddle
should always be an opaque type/named structural type (e.g. record) in an API like this, so I wonder how likely this is to happen in practice.
I think being able to accumulate tags returned from a function in any context you like, for free, is a huge upside for a language like Roc, which relies heavily on monad-like patterns, so we should try to prioritize making that seamless
But, to Brendan's point, there are cases where you might accidentally do too much work based on the tags returned from a function
For example
get : Url -> Result Response [Http404, Http500, HttpTimeout]*
when get "https://roc-lang.org" is
Err Http404 | Err Http500 | Err HttpTimeout -> ...
Err HttpRedirect -> ... # useless branch!
Ok l -> ...
That second branch in the when
is useless, but it would be admitted currently
however there are ways the compiler can help you out here, for example "bidirectional exhaustiveness checking", where we check that the branches possibly returned by the "when" condition (in this case get "https://roc-lang.org"
) are exhaustive or redundant relative to the branches that are matched in the when condition. So it's like normal exhaustiveness checking, but going the other way
That would catch this case, helping you avoid some of the common pitfalls of these kinds of tag unions. I actually have a branch implementing that, and it works pretty well!
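For comparison, TypeScript's checker already complains about a `case` that can never match the type of the value being switched on, which is close in spirit to the redundancy half of the check described above. A minimal sketch with hypothetical names; the redundant branch is left commented out so the snippet compiles:

```typescript
type GetErr = "Http404" | "Http500" | "HttpTimeout";

function describe(err: GetErr): string {
  let message = "";
  switch (err) {
    case "Http404":
    case "Http500":
    case "HttpTimeout":
      message = "request failed: " + err;
      break;
    // case "HttpRedirect":
    //   Uncommenting this branch should be reported as a type error,
    //   because "HttpRedirect" is not one of the tags `err` can be.
    //   message = "redirected";
    //   break;
  }
  return message;
}

console.log(describe("Http404"));
```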
Kristian Notari said:
Would closed tags be impossible to describe at all in output positions, or would there be something new like
[A]!
to explicitly force closedness?
I would be inclined to say "no", because that seems like it could make things as confusing as they are today for folks who do not grok the "open means usable in any context" definition of tag unions. But obviously I don't really know any of the right answers here
Yeah, i definitely agree that my example is relatively unlikely. I am probably being overly defensive, but I am used to the explicit nature of enum like types. I probably need to play around with open tags a lot more to see the benefits. Currently my only use case for open tags is error types.
Question: If we defaulted to returning open unions from functions, would there be a way to explicitly close them?
I guess you could always just make them opaque?
That’s what Kristian’s question I quoted above is asking. I think the answer should be no, you can’t (otherwise it could be equally confusing as the current state of things)
Ok
Brendan Hansknecht said:
Your example is correct and I agree with you that more separation between domains is a good thing in general. The problem is that even with a domain translator function like Agreement.fromBool
you're always subject to changes in others' APIs. What if isTrustworthy
starts always returning true
or always false
? No compiler error, everything's fine, you notice later in production cause every agreement turns out to be True or False.
Ideally these kinds of corner cases are already covered by testing, because there's no way to defend your code 100% from changes to other APIs if the signature/type still matches.
@Kristian Notari I am pretty sure you just described a different class of error. Yes, always true or always false is a bug that should be tested for. Unit tests should catch that.
On the other hand, Roc has a static type system. You should never need to write a test that a function actually returns a specific type. The type system should automatically catch all errors of this class. Only in a dynamic language would you need a test of that variety.
Having strong guarantees that mean you don't even need a test is much better than writing many many tests and hoping you didn't miss an edge case.
Also, isTrustworthy
always returning false is not a change in its API. It is still returning a Bool, which is all that its API guarantees. Again, it is still likely a bug, but it is not related to the API.
Brendan Hansknecht said:
Having strong guarantees that mean you don't even need a test is much better than writing many many tests and hoping you didn't miss an edge case.
You need tests in both cases. It's not a different class of errors.
Brendan Hansknecht said:
With this small example, it is likely that either way we will get a compiler error, but with larger tags, it is possible that
isTrustworthy
could be modified such that it is still a subset of Agreement
, but using the tags to mean something different.
This ^^ written by you is the same class of errors as the one I was telling you about with my example. Not a "change in the signature" but a change in the API behavior still worth noticing.
You always have to check things.
Brendan Hansknecht said:
For this example, which of course will be contrived due to these specific tags, what if isTrustworthy suddenly started returning InTheMiddle 423 where this doesn't mean we have 423 agreement, but instead, the contract is still in the middle of negotiation, so we don't know if it is trustworthy and 423 is some sort of contract legal code for what stage the contract is in. We just introduced a huge bug due to allowing the result from isTrustworthy to implicitly convert to an Agreement.
And this ^^ too would be checked only by tests. As long as the function you're using (isTrustworthy
) or the function you're creating (calcAgreement
) keeps its signature, everything else can change, even after you add closed tags; it's just more uncomfortable to wrap/unwrap them every single time.
If you're always adhering to the type signatures of functions (so no compiler error) everything's fine, and what's left is for tests to check, if needed.
It doesn't really matter if tags are closed or not, if you can easily translate to and from tags with wrapping or with implicit tag merging
Those two quotes both only happen if you don't have Agreement.fromBool
and don't enforce explicit casts. With explicit casts, they are both impossible.
You're just moving the problem to the wrapping/unwrapping. What if Agreement.fromBool
changes to return an InTheMiddle 42
?
you can't prevent that from happening by just using types
You are greatly misrepresenting my point. The types make you safer and remove a class of problems. They help to scope issues and give you some guarantees. This does not mean they guarantee no bugs. This does not mean you don't have to test. An explicit cast can still have a bug, but it is much easier to test 1 explicit cast function than it is to test every single location that could have an implicit cast.
"The types make you safer and remove a class of problems" --> I hope we all agree on that one since we're here for Roc :grinning_face_with_smiling_eyes: and I agree too
In this case, the types simply guarantee that isTrustworthy returns a Bool. I should never need to test whether or not isTrustworthy
returns InTheMiddle 42
.
With implicit casts, you do need to test for that.
Yeah but just define a helper function that chains isTrustworthy
with Agreement.fromBool
. Does that fulfill your requirements for having fewer bugs? Because from calcAgreement
you now call a function which returns an Agreement
but you really don't know which one. The problem's still there
I still think that is much better than the implicit cast. It is a single testable function instead of needing to test every location that could have an implicit cast.
Tags are great. Once you have them there's really no need to wrap/unwrap things if they just get merged easily, as with TypeScript. You'd only need wrapping/unwrapping (the casting you were referring to) if you need to add data to them or you want them to have different names, to group them, or something similar (readability). It doesn't alter correctness
Besides limited cases, like error tags, i have not seen uses where I would consider tags automatically merging beneficial. I would love to learn otherwise, but I have not seen examples where I would want to default to open tag unions. So I see open tags as a valuable feature in some cases, but not something I would expect to default to.
In my typescript experience, I use string literals as types (I mean everyone does). In the typescript type system "mystring"
is a subtype of string
, so you can precisely type string literals as if they were tags (no data attached, but you can work around that in other ways). There are parts of the domain which are focused on some bits and other parts focused on others. When you compose those parts, the resulting "type" should be the union of those "string literals" you're returning.
So in our example, the code which handles the true or false bit could be somewhere and somewhere else there could be the code that handles the InTheMiddle case. When you "join" those bits in your "upper level" code it's neither mandatory nor beneficial (unless you want to add other info, wrap them with different names or something similar which has nothing to do with correctness or bugs) to wrap them with other "strings" or other types.
So in a typescript example:
type ContractTrustworthy = "true" | "false"
declare function isTrustworthy(contract: Contract): ContractTrustworthy
type Undecided = "in-the-middle" // ignore the number bit for a second
declare function makeInTheMiddle(contract: Contract): Undecided
type Agreement = ContractTrustworthy | Undecided
declare function calcAgreement(contract: Contract): Agreement
and then the body would be:
if (something) {
return isTrustworthy(contract)
} else {
return makeInTheMiddle(contract)
}
Types are automatically merged (that is, unioned) and you get an Agreement
out of the two branches, which is the merging of both ContractTrustworthy
and Undecided
. It's common with errors, so errors merge with other errors giving you a list of reasons things could possibly fail for
But there's nothing less/more error prone in doing it this way or with wrapping types
One can object that refactoring/readability/noticing future bugs can be easier/harder with one approach or the other (don't know which honestly), but that's pure developer friendliness; the two approaches don't change the program's correctness underneath.
(here I'm using true and false as string literals instead of the types true
and false
which typescript has, boolean type, just to better adhere to the tags example in Roc)
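A self-contained variant of the TypeScript snippet above that can actually be run. The function bodies and the shape of `Contract` are placeholders invented for this sketch; only the type structure follows the example.

```typescript
// Placeholder contract shape, made up for the sketch.
type Contract = { signed: boolean; disputed: boolean };

type ContractTrustworthy = "true" | "false";
type Undecided = "in-the-middle";
type Agreement = ContractTrustworthy | Undecided;

function isTrustworthy(contract: Contract): ContractTrustworthy {
  return contract.signed ? "true" : "false";
}

function makeInTheMiddle(_contract: Contract): Undecided {
  return "in-the-middle";
}

function calcAgreement(contract: Contract): Agreement {
  // Each branch returns a different type; their union is exactly
  // `Agreement`, so no wrapping or conversion function is needed.
  if (!contract.disputed) {
    return isTrustworthy(contract);
  } else {
    return makeInTheMiddle(contract);
  }
}

console.log(calcAgreement({ signed: true, disputed: false }));
console.log(calcAgreement({ signed: true, disputed: true }));
```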
Yeah, the fundamental difference is just if you have to be explicit or not. That is all.
With closed unions, you either have to wrap or convert.
convert:
Agreement : [ True, False, InTheMiddle U64 ]
if something then
isTrustworthy contract |> fromTrustworthy
else
makeInTheMiddle |> fromInTheMiddle
wrap:
Agreement : [ ExactAgreement ContractTrustworthy, FuzzyAgreement Undecided ]
if something then
isTrustworthy contract |> ExactAgreement
else
makeInTheMiddle |> FuzzyAgreement
Another example I can give on "consider tags automatically merging" is that from a mental model perspective (at least mine), they're unions of things. I assume you can always say "this OR that", in whatever language you like. And that's what you basically want, except that with closed tags you need a wrapping/casting/converting pass, as you've just described, before you can actually do that.
I think it has been shown in a lot of languages, that most of the time it is more preferable to be explicit because implicit casts lead to more bugs. As such, I prefer closed unions as the default. I understand the use of an open union. I think they are great for error tags, but errors are a special case where accumulating anything and everything makes a lot of sense.
From my mental model what I see is (not valid Roc code):
Agreement : True OR False OR InTheMiddle U64
if something then
isTrustworthy contract # True OR False
else # OR
makeInTheMiddle # InTheMiddle U64
The fact I can't express this simple use case because some library author wrote their isTrustworthy
function with closed tags is a no go personally. It shouldn't be his right to decide
Also, I think it would be different if we had another way to define a tag as something like this:
Agreement: ContractTrustworthy | Undecided
In this case, I am explicitly saying that an Agreement
is the merging of ContractTrustworthy
and Undecided
. In that case, you are kinda opting into implicit conversion. Roc does not currently support this, but if it did, I think it would have value.
^^ Yeah, essentially what you said with different wording.
You can actually do that in Roc from my understanding of the Tutorial:
Agreement: []ContractTrustworthy
I don't know how to union multiple ones, but you actually can
The same with records
Strange syntax IMO but working I guess :grinning:
So I guess, my statement has to shift.
I like closed unions as the default. I think open unions should only be used in special cases like error types. All that said, I definitely think there would be value in having the ability to define a type that explicitly merges multiple closed unions into a bigger union (still closed). This type would then enable the implicit cast use case, but at least it is documented in the type system by the new type being defined as the merging of multiple sub-unions.
The problem being: the function author should not be able to choose how his tags "closeness" affect my code
It should be a caller problem
Cause you don't know what makes sense to close or open beforehand. Different context and domains should be able to decide on their own
Which if you could define agreement as Agreement: ContractTrustworthy | Undecided
, that would not be a problem. You just are explicitly changing your type to extend the closed ContractTrustworthy
Instead of defining an arguable different type: Agreement : [True, False, InTheMiddle Num *]
and then implicitly casting
Yeah with Roc "uniqueness of tag" philosophy I agree I should not recreate "True" tag and hope it will be considered the same as the function return type tag I'm calling, that's a thing
Which I guess as you said could theoretically be written like: Agreement: [InTheMiddle Num *]ContractTrustworthy
, assuming you are extending and not merging two different unions
But my Roc syntax before is not invented I'm pretty sure I've read it on the Tutorial. You could write Agreement : []ContractTrustworthy,Undecided
or something similar and it just union them
yeah that
The problem being, even if that's valid Roc syntax and we both agree you should be able to write your Agreement
type like that, if the author of the isTrustworthy
function says the return type is a CLOSED tag then you literally CAN'T merge it, even if your Agreement
type says so, cause if/else must return the same type in both branches
and the Roc typechecker would argue it cannot "merge" the closed union tag returned by isTrustworthy
that's why they're thinking of changing this behavior in Roc to "output types should have open tags only"
So yeah, I guess my opinion is:
output types should have open tags only
currently strongly disagree
Closed tags should have some way to be unioned to fix this underlying issue
I am totally for that.
we're making progress here :rolling_on_the_floor_laughing:
now we disagree on different things
haha
than before
I'm feeling your worry for "my True
tag being considered the same as someone else's True
tag without anything explicitly written for it"
And I can agree with you it's not ideal, if the whole tag feature in the language seems to prioritize uniqueness
But my opinion is that considering the two True
to be identical (so allowing what you call as "implicit casting") is better than having closed tags decided by function authors.
Closed tags should have some way to be unioned to fix this underlying issue
We technically should be able to do this, the one disadvantage of it, is that it would have some sort of runtime cost. The compiler would just generate a conversion function that in the minimal case would just change an integer. In the worst case, might have to copy around a solid bit of data.
So returning open tags does have a performance advantage, but I think in most cases it would be minimal, and as long as we still had open tags, you would have an explicit way to work around it.
The problem still is: "is my True
tag the same as the True
tag defined in someone else's code?"
One extra piece of information that probably actually argues for returning an open tag. If two unions have all the exact same tags with the same data, the compiler will consider them the exact same type unless they are opaque.
Agreement : [Agree, Disagree]
QuestionAnswer : [Agree, Disagree]
Since these two aliases happened to be 100% identical, they are the same type to the compiler.
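TypeScript type aliases behave the same way, which may help if you're coming from there. A small sketch (the `flip` function is made up for illustration): two aliases with identical structure are interchangeable, because the alias name plays no part in type equality.

```typescript
type Agreement = "Agree" | "Disagree";
type QuestionAnswer = "Agree" | "Disagree";

function flip(a: Agreement): Agreement {
  return a === "Agree" ? "Disagree" : "Agree";
}

// A QuestionAnswer goes in where an Agreement is expected, and the
// Agreement that comes back is accepted as a QuestionAnswer.
const answer: QuestionAnswer = "Agree";
const flipped: QuestionAnswer = flip(answer);

console.log(flipped);
```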
Oh I thought they were considered as separate things right now
So Roc point is not that much about uniqueness as I thought
I mean, if you really want to make things "closed" you just make it opaque I guess
Closed tags should have some way to be unioned to fix this underlying issue
As Kristian mentioned you can do this today, if you have something like
ContractTrustworthy a : [True, False]a
Undecided a : [Undecided U64]a
Combined : ContractTrustworthy(Undecided [])
I'm not sure I see how this helps though. You are still forced to do an explicit conversion if you want to convert ContractTrustworthy[]
to Combined
, at least today.
It was about considering Combined
as "creatable" via tag values defined in the tag type it's defined from
without explicit conversion basically, based on how you define your tags
Combined : [True, False, Undecided U64] # this should be its own definition
Combined : ContractTrustworthy(Undecided []) # this should be populated simply with ContractTrustworthy or Undecided tag values directly
Because in the first case I'm expecting to use MY OWN VALUES for Combined
, while in the second example I'm expecting to use other tag values as values for Combined
.
I think I see what you're getting at, but those two definitions are the same today. The alias names Combined
, ContractTrustworthy
don't matter in measuring type equality; you have to use opaque types to make them distinct, as you mentioned.
Ah, I was hoping you could define Combined : ContractTrustworthy(Undecided [])
-> allow implicit casts from the sub unions to the combined union.
Yeah now it makes sense, but before I was thinking tags were like enums, so each one with its own value, different from others even if called the same
Like the compiler would create the casts for you since you wrote a unioned type.
I agree re your previous point, I think if you really want things to be closed you should use an opaque type. In my mind tags are mostly for light-weight, contextual data that is unambiguous. If you want to encode domain-specific information that should be preserved faithfully, opaque types are a better choice.
I think we see this with Roc programs today already. Tags are mostly used only for collecting errors/effects, or for small, localized function calls that don't escape out anywhere (like List.walkUntil
, which uses [Continue state, Break state]
), and things that do need to escape further away are hidden behind an opaque type (e.g. Parser
, File
in the cli platform). It also is the case in the original motivating example for this discussion - Ok/Err tags.
@Brendan Hansknecht the fact this leads to the same type being defined twice:
Agreement : [Agree, Disagree]
QuestionAnswer : [Agree, Disagree]
remove the whole problem of "implicit/explicit" casting based on how tags are defined, if with direct values or if combined with existing values
Brendan Hansknecht said:
Ah, I was hoping you could define
Combined : ContractTrustworthy(Undecided [])
-> allow implicit casts from the sub unions to the combined union.
so this doesn't matter anymore, it should not be a thing
Ayaz Hafiz said:
I think this answers everything, does it for you too @Brendan Hansknecht ?
If you want to encode domain-specific information that should be preserved faithfully, opaque types are a better choice.
I guess that is fair.
If that is the recommendation, I think it is reasonable
@Ayaz Hafiz what happens with the following definition:
MyOpaqueTag = [A,B] # don't know how to write opaque tags yet sorry
MyOpenTag = [A,B]
Are these two equivalent? So, is the only discriminant for equality here the tag's name? If I have:
f : MyOpaqueTag
g : MyOpenTag
Can I use f
and g
interchangeably?
no
The two opaque tags would be unique even if they have the exact same variants
And an opaque tag is unique from a non opaque tag
When using them in code, they are essentially wrapped tags that are wrapped with a special tag that is always going to be unique from other tags.
Also, definition of an opaque type is just :=
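As a loose analogy only (this is TypeScript, not how Roc implements opaque types), the "wrapped with a special unique tag" idea can be sketched with a class that has a private field, which TypeScript treats nominally. The names `Plain`, `wrap` and `unwrap` are invented for the sketch.

```typescript
type Plain = "A" | "B";

class MyOpaqueTag {
  // The private field makes this type nominal: no structurally
  // similar value is assignable to it.
  private readonly inner: Plain;

  private constructor(inner: Plain) {
    this.inner = inner;
  }

  // Explicit, named ways in and out of the opaque type.
  static wrap(p: Plain): MyOpaqueTag {
    return new MyOpaqueTag(p);
  }

  unwrap(): Plain {
    return this.inner;
  }
}

const g: Plain = "A";
const f: MyOpaqueTag = MyOpaqueTag.wrap(g);

// const bad: MyOpaqueTag = g; // rejected: a plain tag is not the opaque type
// const bad2: Plain = f;      // rejected: the opaque type is not a plain tag

const roundTrip: Plain = f.unwrap();
console.log(roundTrip);
```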
Brendan Hansknecht said:
Ah, I was hoping you could define
Combined : ContractTrustworthy(Undecided [])
-> allow implicit casts from the sub unions to the combined union.
fwiw if closed unions in output are kept around, I would like this too. For example we have code like this that does transformations of tags solely for the sake of upcasting: https://github.com/roc-lang/roc/blob/50fac9cc9e4eaee584c750cdfe8fd397458c3d83/crates/compiler/builtins/roc/Decode.roc#L89-L98. I think it would be nice to tell the compiler to automatically do the transformation for you via e.g.
fromBytes : List U8, fmt -> Result val [Leftover (List U8)]DecodeError | val has Decoding, fmt has DecoderFormatting
fromBytes = \bytes, fmt ->
when fromBytesPartial bytes fmt is
{ result, rest } ->
if List.isEmpty rest then
when result is
+ result as finalResult -> finalResult
- Ok val -> Ok val
- Err TooShort -> Err TooShort
else
Err (Leftover rest)
I have an experiment for that too :laughter_tears:
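For contrast, in a language with subtyping such as TypeScript, that upcast needs no re-matching at all. The sketch below is a toy, not the real `Decode` module: the `Result` shape, the one-digit "decoder" and the string input are all assumptions made to keep it short.

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
type DecodeError = "TooShort";
type Leftover = { tag: "Leftover"; rest: string };

// Toy decoder: reads one digit character and returns the remaining input.
function fromBytesPartial(input: string): { result: Result<number, DecodeError>; rest: string } {
  if (input.length === 0) {
    return { result: { ok: false, error: "TooShort" }, rest: "" };
  }
  return { result: { ok: true, value: Number(input.slice(0, 1)) }, rest: input.slice(1) };
}

function fromBytes(input: string): Result<number, DecodeError | Leftover> {
  const { result, rest } = fromBytesPartial(input);
  if (rest.length === 0) {
    // No per-branch rewrapping of Ok/Err: the narrower error type is
    // accepted where the wider one is expected.
    return result;
  }
  return { ok: false, error: { tag: "Leftover", rest } };
}

console.log(JSON.stringify(fromBytes("7")));
console.log(JSON.stringify(fromBytes("78")));
```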
Anyway, I still think I like explicit casts by default, but I understand the sentiment and goals with being more open. Especially when thinking of tags as lightweight and temporary rather than exact and for the domain model. As was said, when you have a type that you want to be exact, you have opaque types.
Yeah but having:
A : [True, False]
Mine : A
Mine : [True, False]
mean two different things for Mine
casting-wise is hard to explain. It's not what you would expect
For implicit casting I am thinking of the underlying structural type, not based on the alias name. Relying on the alias name for structural aliases (not opaque types) has a lot of problems. But this is an aside, I don't want to take this thread off topic.
Not sure if this is a can of worms and should just be pushed to implementation time or a different discussion, but how does syntax work if returned tags are always open?
myFunc : [ A, B] -> [ B, C]
-> This would be closed tag to open tag?
myFunc : [ A, B]* -> [ B, C]
-> This would be open tag to open tag?
myFunc : [ A, B]* -> [ B, C]*
-> This would be invalid syntax because only function inputs can be explicitly labeled as open now?
Ayaz Hafiz said:
what do you mean by "underlying structural type" ?
Oh you mean, if my tag type has values [A,B]
and I'm receiving a subtype set of those values I should just as
them to my type?
But should that be explicit? I mean, if they are a subtype my opinion on this is that they should be converted implicitly. But again, this is not done due to performance reasons right? You have to explicitly tell the compiler you're converting them? With as
keyword or something?
I think this is talking about if the tags weren't open by default. Since the tags are closed, this is one way to implement the conversion instead of needing to write your own Agreement.fromBool
. You would just use as Agreement
. This should not be required in the case the functions return open tags by default.
Not sure if this is a can of worms and should just be pushed to implementation time or a different discussion, but how does syntax work if returned tags are always open?
yeah the rules get messy. What you described is what I was thinking, and Richard and I previously talked about how that would extend to aliases and it seems correct there too. One thing that isn't clear to me is what should happen for tags behind opaque types, that seems trickier.
I think it might be helpful to go back to the original motivation though, which is that @Kristian Notari initially found the behavior of open tag unions confusing. Do you feel like you have a better grasp on them now @Kristian Notari ?
If so, what do you think helped - the framing of "being able to use the tags in any context", or something else? And, if so, would changing how open/closed tag unions behave help, in your opinion, or is this a documentation/teaching problem?
And if not, what do you continue to find challenging to understand about the behavior of open tag unions, given this thread?
Ayaz Hafiz said:
I think if you really want things to be closed you should use an opaque type.
interestingly, Bool now models this technique!
also interesting: if it's the only way to do it (e.g. if we change it so that you can no longer have closed tag unions in the output position), then it becomes very clear that this is what you should do when you want things to be closed, because that becomes the only way to do it!
Ayaz Hafiz said:
The confusing bits around tags (and more generally the * or "extensions/unions" in types, even for records) were about the framing. You've cleared my doubts with the sentence I've reported here already.
I don't find anything challenging anymore. Now I'm curious to see where these type system bits in Roc are going to lead, because from my perspective things that are structurally equal are equal, when it comes to thinking about tags or records, despite the fact one would like to have some more control over the making and accessing of values of a specific type. I come from a TypeScript background, which explains why I'm more inclined to see everything as structural instead of named. I mean, I know of nominal type systems but still. And I don't know how many name clashes with tags are going to happen, nor how that plays out in actual development, because I have no experience with Roc whatsoever, so I'm willing to try.
One last thing, for the open/closedness of types: since the "defaulting to open depending on input/output position" discussion we had, I've wondered whether this is one of those times where (going abstract here) a concept that covers A is being used to solve B. I don't know if that's the case, but when you adopt a concept and then you need cases and branches to say "yes but that thing in this context means another thing", maybe it's a clue that it's not the right approach, or at least not the best way to let the user approach it. Don't know, I really don't want to judge with so little experience in Roc, I'm just giving you a clear and honest opinion on what I have been feeling reading through the docs and writing with you here on this thread.
At the end of the day I just want to see Roc grow cause I was waiting for it these past months refreshing the website to see more talks from Richard that were not appearing and then I checked back and it was "available" so I'm already pleased a project like this (goals-wise) exists.
Last updated: Jul 06 2025 at 12:14 UTC