Since the tag union [Green, Blue] is a subset of [Red, Green, Blue], I expected the following code to work, but the compiler complains. Is this working as intended? If so, what's the rationale?
f: U8 -> [Red, Green, Blue]
f = |n| {
    match n {
        0 => Red
        _ => g(n)
    }
}
g: U8 -> [Green, Blue]
g = |n| {
    match n {
        1 => Green
        _ => Blue
    }
}
main! = |_args| {
    dbg f(2)
    Ok({})
}
Here's the error I get.
-- TYPE MISMATCH ---------------------------------
The second branch of this match does not match the previous branch :
┌─ /var/folders/74/7_g_vgn57nzgjh9mw5smp6rc0000gn/T/roc/debug-f081a87b/yFTm8aktdFZvhh9IG3uovOGfNvuQlGPk/main.roc:30:14
│
30 │ _ => g(n)
│ ^^^^
The second branch is:
[Blue, Green]
But the previous branch results in:
[Blue, Green, Red]
All branches in a match must have compatible types.
Note: You can wrap branches values in a tag to make them compatible.
To learn about tags, see <https://www.roc-lang.org/tutorial#tags>
Found 1 error(s) and 0 warning(s) for /var/folders/74/7_g_vgn57nzgjh9mw5smp6rc0000gn/T/roc/debug-f081a87b/yFTm8aktdFZvhh9IG3uovOGfNvuQlGPk/main.roc.
Of course the problem disappears if I make both unions open ([Red, Green, Blue, ..] and [Green, Blue, ..]), but I don't see why I need to do that in this case.
Off the top of my head, I would expect [Green, Blue] to need to be open, but not the other one.
Unfortunately, currently both need to be open.
Plus, I'm not sure why g() would have to return an open union. When writing this function, I might not know who is going to use it and how. It's weird to have to make it open just in case someone needs to use it and return a superset.
Oddly, the error disappears if I implement f like this:
f: U8 -> [Red, Green, Blue]
f = |n| {
    match n {
        0 => Red
        _ => {
            match g(n) {
                Green => Green
                Blue => Blue
            }
        }
    }
}
But this fails:
f: U8 -> [Red, Green, Blue]
f = |n| {
    match n {
        0 => Red
        _ => {
            match g(n) {
                x => x
            }
        }
    }
}
It's as if the Green and Blue returned by g are not the same as the Green and Blue returned by f. Pretty surprising tbh.
I feel like returned unions used to default to open.
Not sure what the state is in the Zig compiler.
Your odd example is not surprising to me. By being explicit you decoupled the types. In x => x both sides must be the same type. In Blue => Blue both sides can be totally different and unrelated types.
That used to be a common workaround for some unification issues.
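A rough way to picture the "decoupling" described above, sketched in Python rather than Roc. The two enum classes and the function names `passthrough` and `rebuild` are hypothetical stand-ins, and Python enums are nominal, so this only mirrors how two closed unions behave as distinct types. It is not how the Roc compiler represents tag unions.

```python
from enum import Enum


class GreenBlue(Enum):        # stands in for the closed union [Green, Blue]
    GREEN = "Green"
    BLUE = "Blue"


class RedGreenBlue(Enum):     # stands in for the closed union [Red, Green, Blue]
    RED = "Red"
    GREEN = "Green"
    BLUE = "Blue"


def passthrough(x: GreenBlue):
    # Like `x => x`: the result is the very same value, so it keeps the input's type.
    return x


def rebuild(x: GreenBlue) -> RedGreenBlue:
    # Like `Green => Green` / `Blue => Blue`: each branch constructs a new value,
    # so the output type is chosen independently of the input type.
    if x is GreenBlue.GREEN:
        return RedGreenBlue.GREEN
    return RedGreenBlue.BLUE


print(type(passthrough(GreenBlue.GREEN)).__name__)
print(type(rebuild(GreenBlue.GREEN)).__name__)
```

The first call still yields a value of the smaller type, while the second yields a value of the larger type because every branch built a fresh tag.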
I can see why [A, B] and [B, C] should be incompatible types, because neither is a subset of the other. But I don't see why [A, B] should not be compatible with [A, B, C] (but not the other way round). I guess I'm saying that the compiler should automatically cast [A, B] to [A, B, C] when needed. Other than the complexity of implementing that, is there a downside I'm missing?
After more thought, it's no different than the fact that U8 doesn't automatically get turned into U64 when needed.
But I can write a.to_u64(), whereas I don't see an easy way to promote a tag union to a superset. Perhaps tag unions could support something like to_superset(), so I could write g().to_superset() rather than match g() { Green => Green, Blue => Blue }.
Ok, I'm officially confused. Why doesn't the following code work?
main! = |_args| {
f : [Red, Green, ..]
f = Red
g : [Red, Green, Blue, ..]
g = f
dbg g
Ok({})
}
Is this the polarity thing that @Jared Ramirez has mentioned, which isn't implemented yet? it sounds familiar
so there is quite a bit here:
Aurélien Geron said:
I can see why [A, B] and [B, C] should be incompatible types, because neither is a subset of the other. But I don't see why [A, B] should not be compatible with [A, B, C] (but not the other way round). I guess I'm saying that the compiler should automatically cast [A, B] to [A, B, C] when needed. Other than the complexity of implementing that, is there a downside I'm missing?
in this case both are closed unions, so the type is saying “i contain exactly [A, B]”, so it can't be auto extended to also contain C. if the type was [A, B, ..], then it should be able to be extended
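The closed versus open distinction can be sketched with a toy unifier in Python. This is a textbook-style simplification, not the Roc compiler's algorithm: `TagUnion`, `unify`, and `union` are hypothetical names, and `is_open` simply models a trailing `..`. Note that this toy accepts an open [A, B, ..] against a closed [A, B, C], whereas the thread reports that the compiler at the time needed both sides to be open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagUnion:
    tags: frozenset
    is_open: bool   # True models a trailing `..`


def unify(a: TagUnion, b: TagUnion):
    """Return the merged union, or None when the two types are incompatible."""
    missing_from_a = b.tags - a.tags
    missing_from_b = a.tags - b.tags
    # A side can only gain tags it lacks if it is open.
    if missing_from_a and not a.is_open:
        return None
    if missing_from_b and not b.is_open:
        return None
    # The result stays open only if both sides were open.
    return TagUnion(a.tags | b.tags, a.is_open and b.is_open)


def union(*tags, is_open=False):
    return TagUnion(frozenset(tags), is_open)


# Closed [A, B] says "exactly A or B", so it cannot grow to include C.
print(unify(union("A", "B"), union("A", "B", "C")))                  # None
# Open [A, B, ..] can absorb C.
print(unify(union("A", "B", is_open=True), union("A", "B", "C")))
```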
Aurélien Geron said:
Ok, I'm officially confused. Why doesn't the following code work?
main! = |_args| {
    f : [Red, Green, ..]
    f = Red
    g : [Red, Green, Blue, ..]
    g = f
    dbg g
    Ok({})
}
below, i’m assuming the error is on g = f
this case is complicated, and it’s actually not related to polarity, but it’s about what’s called generalization/instantiation. when a type is generalized, it means that each place that uses the type gets instantiated to a fresh copy of it where all the lowercase letters in the type signature get fresh, local copies, that are inferred at the call site. then note that f: [A, B, ..] becomes [A, B, ..c] during compilation. so when f would be instantiated, c is instantiated to a “flex” variable.
but here’s the wrinkle: in the zig compiler, we decided ONLY to generalize/instantiate functions! so here f is NOT generalized. because of this, the implicit c is not generalized, and stays a “rigid” var. and rigid vars cannot be extended, so the compiler rejects g = f
i suspect that if you dropped the annotation from f, g = f would work
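The rigid versus flex story can be played out in a small Python model. Everything here is an assumption made for illustration: the `ext` field encodes the extension variable as "closed", "rigid", or "flex", and `instantiate`, `can_absorb`, and `unifies` are hypothetical helpers, not compiler functions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TagUnion:
    tags: frozenset
    ext: str   # "closed", "rigid" (an annotation's `..`), or "flex" (inferred)


def instantiate(ty, generalized):
    # Only generalized definitions get a fresh copy whose rigid variable turns flex.
    if generalized and ty.ext == "rigid":
        return replace(ty, ext="flex")
    return ty


def can_absorb(ty, missing):
    # Missing tags can only be added through a flex extension variable.
    return not missing or ty.ext == "flex"


def unifies(a, b):
    return can_absorb(a, b.tags - a.tags) and can_absorb(b, a.tags - b.tags)


f_annotated = TagUnion(frozenset({"Red", "Green"}), "rigid")          # f : [Red, Green, ..]
g_annotated = TagUnion(frozenset({"Red", "Green", "Blue"}), "rigid")  # g : [Red, Green, Blue, ..]
f_inferred = TagUnion(frozenset({"Red"}), "flex")                     # f = Red, no annotation

print(unifies(instantiate(f_annotated, generalized=False), g_annotated))  # False
print(unifies(instantiate(f_annotated, generalized=True), g_annotated))   # True
print(unifies(instantiate(f_inferred, generalized=False), g_annotated))   # True
```

In this model the annotated, non-generalized f is the only case rejected, which matches the explanation above: its extension variable stays rigid and cannot take on Blue.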
this begs a broader question about how to handle generalization in cases like this. we went this path of only generalizing functions because if you generalize everything, you can end up with some surprising performance issues. (see https://github.com/seanpm2001/Roc-Lang_RFCs/blob/main/0010-let-generalization-lets-not.md). but, the above not working is def confusing
I have thought about expanding the generalization strategy to allow some literals that we know are "harmless" (won't cause perf surprises) such as number literals, tags, records, tuples
the original proposal suggested number literals but I figured in the new compiler we should try the bare minimum (just lambdas) and see if there was demand in practice for the others
there's still the potential surprise of "if you change it to f = dbg Red then it will debug-print Red twice (once for f and once for g)"
but arguably that's good because it accurately informs you about what's happening at runtime
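The double debug-print can be sketched in Python by treating a generalized definition as a thunk that is re-run at each instantiation. That treatment is an assumption made for this illustration, and `dbg`, `debug_log`, and the two `run_*` functions are hypothetical names.

```python
debug_log = []


def dbg(value):
    # Record the value to mimic a debug print, then pass it through.
    debug_log.append(value)
    return value


def run_without_generalization():
    debug_log.clear()
    f = dbg("Red")      # evaluated once; the result is shared
    g = f
    return (f, g), list(debug_log)


def run_with_generalization():
    debug_log.clear()

    def f():            # a generalized def, modeled as a thunk
        return dbg("Red")

    f_value = f()       # one instantiation for f itself
    g = f()             # `g = f` instantiates f again and re-runs its body
    return (f_value, g), list(debug_log)


print(run_without_generalization()[1])   # ['Red']
print(run_with_generalization()[1])      # ['Red', 'Red']
```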
I guess I'm saying that the compiler should automatically cast [A, B] to [A, B, C] when needed. Other than the complexity of implementing that, is there a downside I'm missing?
This has been discussed a lot in the past. I wish I remembered it all. I think most people found that once functions assumed open tags for outputs, most of the problems went away and it was not too bad. I guess we could generically assume open unions, but I really want a compiler error to be reported if someone accidentally adds tags to my union.
yeah, that would be reasonable i think. but for complex tags with polymorphic payloads, the perf issue may still be possible
then also was thinking as i typed out the above that maybe if there’s a type sig we should always generalize, since from the user's perspective it's syntactically the same as defining a function, which does generalize. but maybe we still never generalize for let defs without annos.
yeah, that would be reasonable i think. but for complex tags with polymorphic payloads, the perf issue may still be possible
Yeah, could definitely explode memory. The one saving grace is that at least they are unified at compile time and it is not a cast that could be shifting around tons of data.
I dunno, is that bad though? :thinking:
I guess I'm just wondering if this is an actual perf footgun vs something that could happen theoretically but actually wouldn't in practice
like calling arbitrary functions can obviously explode in practice, like the example in Ayaz's writeup
but constructing polymorphic tags and structs feels much more hypothetical
as being a source of an actual real-world perf surprise
i played with it a bit & as expected:
f worksf : {} -> [Red, Green, ..] worksf : _ works& I re-read Ayaz's proposal & it actually explicitly disallows:
f : [Red, Green, ..]
saying it should error with a TYPE IS NOT POLYMORPHIC error.
thinking out loud here: in the case where we don't generalize (ie binding a let value like f), maybe instead of desugaring [A, B, ..] as [A, B, ..c] (rigid), we desugar it as [A, B, .._] (weak). that way, it can be refined based on its usage. but since functions are generalized & are instantiated at callsites anyways, they keep the existing semantics
that would make:
f : [Red, Green, ..]
f = Red
g : [Red, Green, Blue, ..]
g = f
work & have the following types:
f : [Red, Green, ..[Blue, ..x]]
g : [Red, Green, Blue, ..x]
x is the same exact type var here
oh yeah I forgot, I had an idea about Ayaz's proposal regarding tag unions
he talked about the problem with them being that they can break cross-module caching, which is a great point, and his idea was to prohibit exporting them
but I think an even easier idea is just to not generalize them at the top level (just like today)
and only allow local defs to generalize
which prevents the exposing problem while making cases like this one work, and also while keeping generalization rules decoupled from the concept of module boundaries
so here f would generalize, but the same def at top-level would not? would the top-level version error if you tried to annotate it with a rigid/..?
hm, I guess? (I hadn't thought about it)
maybe the rule could be that top-level values can't have unbound type variables unless they're functions? :thinking:
(with or without annotations)
Thanks everyone, that's very helpful. :folded_hands:
For anyone else wondering, here's Ayaz's proposal:
https://github.com/roc-lang/rfcs/blob/main/0011-union-refinement.md
oh actually the one that's relevant here is actually https://github.com/roc-lang/rfcs/blob/main/0010-let-generalization-lets-not.md - probably should have linked to it earlier, oops!
Oh right, thanks @Richard Feldman
Richard Feldman said:
I dunno, is that bad though? :thinking:
For most users and most cases probably not. Definitely hit some pain in rocci bird due to the equivalent when we still had tasks. Each capture was a slightly different layout and constantly was being shuffled around
I think this probably should not be decided by worrying about perf
It should worry about ergonomics
At least for the most part
And the question here is are stricter types that give stronger guarantees or more flexible types more ergonomic
I think I'm still in the camp that most of the time, stricter is the preferred answer and if you don't want stricter, you can just avoid adding type annotations and it will infer something relaxed
That said, I totally see the reverse argument of loose by default and if you want stricter, define a nominal type.
Last updated: Jun 16 2026 at 16:19 UTC