Stream: beginners

Topic: When exactly can tag unions merge?


view this post on Zulip Aurélien Geron (Jun 14 2026 at 07:35):

Since the tag union [Green, Blue] is a subset of [Red, Green, Blue], I expected the following code to work, but the compiler complains. Is this working as intended? If so, what's the rationale?

f: U8 -> [Red, Green, Blue]
f = |n| {
    match n {
        0 => Red
        _ => g(n)
    }
}

g: U8 -> [Green, Blue]
g = |n| {
    match n {
        1 => Green
        _ => Blue
    }
}

main! = |_args| {
    dbg f(2)
    Ok({})
}

Here's the error I get.

-- TYPE MISMATCH ---------------------------------

The second branch of this match does not match the previous branch :
   ┌─ /var/folders/74/7_g_vgn57nzgjh9mw5smp6rc0000gn/T/roc/debug-f081a87b/yFTm8aktdFZvhh9IG3uovOGfNvuQlGPk/main.roc:30:14
   │
30 │         _ => g(n)
   │              ^^^^

The second branch is:

    [Blue, Green]

But the previous branch results in:

    [Blue, Green, Red]

All branches in a match must have compatible types.
Note: You can wrap branches values in a tag to make them compatible.
To learn about tags, see <https://www.roc-lang.org/tutorial#tags>

Found 1 error(s) and 0 warning(s) for /var/folders/74/7_g_vgn57nzgjh9mw5smp6rc0000gn/T/roc/debug-f081a87b/yFTm8aktdFZvhh9IG3uovOGfNvuQlGPk/main.roc.

Of course the problem disappears if I make both unions open ([Red, Green, Blue, ..] and [Green, Blue, ..]), but I don't see why I need to do that in this case.

view this post on Zulip Brendan Hansknecht (Jun 14 2026 at 07:36):

Off the top of my head, I would expect [Green, Blue] to need to be open, but not the other one.

view this post on Zulip Aurélien Geron (Jun 14 2026 at 07:40):

Unfortunately, currently both need to be open.

Plus, I'm not sure why g() would have to return an open union. When writing this function, I might not know who is going to use it and how. It's weird to have to make it open just in case someone needs to use it and return a superset.

view this post on Zulip Aurélien Geron (Jun 14 2026 at 07:44):

Oddly, the error disappears if I implement f like this:

f: U8 -> [Red, Green, Blue]
f = |n| {
    match n {
        0 => Red
        _ => {
            match g(n) {
                Green => Green
                Blue => Blue
            }
        }
    }
}

But this fails:

f: U8 -> [Red, Green, Blue]
f = |n| {
    match n {
        0 => Red
        _ => {
            match g(n) {
                x => x
            }
        }
    }
}

It's as if the Green and Blue returned by g are not the same as the Green and Blue returned by f. Pretty surprising tbh.

view this post on Zulip Brendan Hansknecht (Jun 14 2026 at 07:44):

I feel like returned unions used to default to open.

view this post on Zulip Brendan Hansknecht (Jun 14 2026 at 07:45):

Not sure about the state in the Zig compiler.

view this post on Zulip Brendan Hansknecht (Jun 14 2026 at 07:46):

Your odd example is not surprising to me. By being explicit you decoupled the types. In x => x both sides must be the same type. In Blue => Blue both sides can be totally different and unrelated types.
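One way to picture the difference is the toy Python model below. It is only a sketch and not the Roc compiler's actual algorithm: the `Union` class and the `unify` helper are hypothetical names invented for illustration. It assumes that a bare tag expression such as `Green` starts out with a fresh, extensible type, whereas `x` carries the closed type returned by `g`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Union:
    """Toy tag union: a set of tag names plus an 'extensible' flag."""
    tags: frozenset
    is_open: bool = False  # True models a union that may still gain tags


def unify(a, b):
    """Return the merged union, or None if the two types are incompatible.

    Simplification: a side may only receive tags it lacks if it is open.
    """
    if (b.tags - a.tags) and not a.is_open:
        return None
    if (a.tags - b.tags) and not b.is_open:
        return None
    return Union(a.tags | b.tags, a.is_open and b.is_open)


f_result = Union(frozenset({"Red", "Green", "Blue"}))  # f's annotated return type
g_result = Union(frozenset({"Green", "Blue"}))         # g's annotated return type

# `x => x`: the branch value has exactly g's closed type.
x_branch = unify(g_result, f_result)

# `Green => Green`: the right-hand side is a new tag expression,
# modelled here as an open union containing only Green.
green_branch = unify(Union(frozenset({"Green"}), is_open=True), f_result)
```

In this model `x_branch` is `None` (a mismatch), while `green_branch` unifies to the same closed union as `f_result`, which mirrors the behaviour reported above.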

view this post on Zulip Brendan Hansknecht (Jun 14 2026 at 07:46):

That used to be a common workaround for some unification issues.

view this post on Zulip Aurélien Geron (Jun 14 2026 at 07:55):

I can see why [A, B] and [B, C] should be incompatible types, because neither is a subset of the other. But I don't see why [A, B] should not be compatible with [A, B, C] (but not the other way round). I guess I'm saying that the compiler should automatically cast [A, B] to [A, B, C] when needed. Other than the complexity of implementing that, is there a downside I'm missing?

view this post on Zulip Aurélien Geron (Jun 14 2026 at 08:26):

After more thought, it's no different than the fact that U8 doesn't automatically get turned into U64 when needed.

But I can write a.to_u64(), whereas I don't see an easy way to promote a tag union to a superset. Perhaps tag unions could support something like to_superset(), so I could write g().to_superset() rather than match g() { Green => Green, Blue => Blue }.

view this post on Zulip Aurélien Geron (Jun 14 2026 at 09:16):

Ok, I'm officially confused. Why doesn't the following code work?

main! = |_args| {
    f : [Red, Green, ..]
    f = Red

    g : [Red, Green, Blue, ..]
    g = f

    dbg g

    Ok({})
}

view this post on Zulip Luke Boswell (Jun 14 2026 at 09:38):

Is this the polarity thing that @Jared Ramirez has mentioned isn't implemented yet? It sounds familiar.

view this post on Zulip Jared Ramirez (Jun 14 2026 at 15:10):

so there is quite a bit here:

Aurélien Geron said:

I can see why [A, B] and [B, C] should be incompatible types, because neither is a subset of the other. But I don't see why [A, B] should not be compatible with [A, B, C] (but not the other way round). I guess I'm saying that the compiler should automatically cast [A, B] to [A, B, C] when needed. Other than the complexity of implementing that, is there a downside I'm missing?

in this case both are closed unions, so the type is saying “i contain exactly [A, B]”, so it can't be auto-extended to also contain C. if the type was [A, B, ..], then it should be able to be extended

Aurélien Geron said:

Ok, I'm officially confused. Why doesn't the following code work?

main! = |_args| {
    f : [Red, Green, ..]
    f = Red

    g : [Red, Green, Blue, ..]
    g = f

    dbg g

    Ok({})
}

below, i’m assuming the error is on g = f

this case is complicated, and it’s actually not related to polarity, but it’s about what’s called generalization/instantiation. when a type is generalized, each place that uses it gets a fresh instantiated copy, where all the lowercase type variables in the signature become fresh, local variables that are inferred at the use site. then note that f: [A, B, ..] becomes [A, B, ..c] during compilation. so if f were instantiated, c would be instantiated to a “flex” variable.

but here’s the wrinkle: in the zig compiler, we decided ONLY to generalize/instantiate functions! so here f is NOT generalized. because of this, the implicit c is not generalized, and stays a “rigid” var. and rigid vars cannot be extended, so the compiler rejects g = f
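The flex/rigid distinction can be sketched with a toy Python model. This is an illustration under simplifying assumptions, not the Zig compiler's implementation: `Ext`, `TagUnion`, and `unify` are hypothetical names, and the model only checks whether each side is able to absorb the tags it is missing (a real unifier would also unify the extension variables themselves).

```python
from dataclasses import dataclass
from enum import Enum


class Ext(Enum):
    CLOSED = "closed"  # [A, B]: exactly these tags
    FLEX = "flex"      # [A, B, ..c] where c is a fresh, still-solvable variable
    RIGID = "rigid"    # [A, B, ..c] where c is fixed by an un-generalized annotation


@dataclass(frozen=True)
class TagUnion:
    tags: frozenset
    ext: Ext


def unify(a, b):
    """Return True if both sides can absorb the tags they are missing."""
    for this, other in ((a, b), (b, a)):
        needs_more_tags = bool(other.tags - this.tags)
        # Only a flex extension variable can be solved to include new tags.
        if needs_more_tags and this.ext is not Ext.FLEX:
            return False
    return True


g = TagUnion(frozenset({"Red", "Green", "Blue"}), Ext.RIGID)

# f is a plain value, so it is not generalized and its `..` stays rigid.
f_rigid = TagUnion(frozenset({"Red", "Green"}), Ext.RIGID)
rejected = unify(f_rigid, g)

# If f were generalized, each use would get a flex copy instead.
f_flex = TagUnion(frozenset({"Red", "Green"}), Ext.FLEX)
accepted = unify(f_flex, g)

# Two closed unions with different tag sets never unify.
closed_mismatch = unify(
    TagUnion(frozenset({"A", "B"}), Ext.CLOSED),
    TagUnion(frozenset({"A", "B", "C"}), Ext.CLOSED),
)
```

Under these assumptions `rejected` is `False`, `accepted` is `True`, and `closed_mismatch` is `False`, matching the explanation that a rigid variable cannot be extended while a flex one can.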

i suspect that if you dropped the annotation from f, g = f would work

this raises a broader question about how to handle generalization in cases like this. we went this path of only generalizing functions because if you generalize everything, you can end up with some surprising performance issues. (see https://github.com/seanpm2001/Roc-Lang_RFCs/blob/main/0010-let-generalization-lets-not.md). but, the above not working is def confusing

view this post on Zulip Richard Feldman (Jun 14 2026 at 15:14):

I have thought about expanding the generalization strategy to allow some literals that we know are "harmless" (won't cause perf surprises) such as number literals, tags, records, tuples

view this post on Zulip Richard Feldman (Jun 14 2026 at 15:15):

the original proposal suggested number literals but I figured in the new compiler we should try the bare minimum (just lambdas) and see if there was demand in practice for the others

view this post on Zulip Richard Feldman (Jun 14 2026 at 15:16):

there's still the potential surprise of "if you change it to f = dbg Red then it will debug-print Red twice" (once for f and once for g)

view this post on Zulip Richard Feldman (Jun 14 2026 at 15:17):

but arguably that's good because it accurately informs you about what's happening at runtime

view this post on Zulip Brendan Hansknecht (Jun 14 2026 at 15:22):

I guess I'm saying that the compiler should automatically cast [A, B] to [A, B, C] when needed. Other than the complexity of implementing that, is there a downside I'm missing?

This has been discussed a lot in the past. I wish I remembered it all. I think most people found that once functions assumed open tags for outputs, most of the problems went away and it was not too bad. I guess we could generically assume open unions, but I really want a compiler error to be reported if someone accidentally adds tags to my union.

view this post on Zulip Jared Ramirez (Jun 14 2026 at 15:23):

yeah, that would be reasonable i think. but for complex tags with polymorphic payloads, the perf issue may still be possible

then also was thinking as i typed out the above that maybe if there’s a type sig we should always generalize, since from the user's perspective it's syntactically the same as defining a function, which does generalize. but maybe we still never generalize for let defs without annotations.

view this post on Zulip Brendan Hansknecht (Jun 14 2026 at 15:24):

yeah, that would be reasonable i think. but for complex tags with polymorphic payloads, the perf issue may still be possible

Yeah, could definitely explode memory. The one saving grace is that at least they are unified at compile time and it is not a cast that could be shifting around tons of data.

view this post on Zulip Richard Feldman (Jun 14 2026 at 16:33):

I dunno, is that bad though? :thinking:

view this post on Zulip Richard Feldman (Jun 14 2026 at 16:34):

I guess I'm just wondering if this is an actual perf footgun vs something that could happen theoretically but actually wouldn't in practice

view this post on Zulip Richard Feldman (Jun 14 2026 at 16:34):

like calling arbitrary functions can obviously explode in practice, like the example in Ayaz's writeup

view this post on Zulip Richard Feldman (Jun 14 2026 at 16:35):

but constructing polymorphic tags and structs feels much more hypothetical

view this post on Zulip Richard Feldman (Jun 14 2026 at 16:35):

as being a source of an actual real-world perf surprise

view this post on Zulip Jared Ramirez (Jun 14 2026 at 18:52):

i played with it a bit & as expected:

& I re-read Ayaz's proposal & it actually explicitly disallows:

f : [Red, Green, ..]

saying it should error with a TYPE IS NOT POLYMORPHIC error.

thinking out loud here: in the case where we don't generalize (i.e. binding a let value like f), maybe instead of desugaring [A, B, ..] as [A, B, ..c] (rigid), we desugar it as [A, B, .._] (weak). that way, it can be refined based on its usage. but since functions are generalized & are instantiated at call sites anyway, they keep the existing semantics

view this post on Zulip Jared Ramirez (Jun 14 2026 at 18:54):

that would make:

    f : [Red, Green, ..]
    f = Red

    g : [Red, Green, Blue, ..]
    g = f

work & have the following types:

view this post on Zulip Richard Feldman (Jun 14 2026 at 18:54):

oh yeah I forgot, I had an idea about Ayaz's proposal regarding tag unions

view this post on Zulip Richard Feldman (Jun 14 2026 at 18:55):

he talked about the problem with them being that they can break cross-module caching, which is a great point, and his idea was to prohibit exporting them

view this post on Zulip Richard Feldman (Jun 14 2026 at 18:55):

but I think an even easier idea is just to not generalize them at the top level (just like today)

view this post on Zulip Richard Feldman (Jun 14 2026 at 18:56):

and only allow local defs to generalize

view this post on Zulip Richard Feldman (Jun 14 2026 at 18:56):

which prevents the exposing problem while making cases like this one work, and also while keeping generalization rules decoupled from the concept of module boundaries

view this post on Zulip Jared Ramirez (Jun 14 2026 at 18:57):

so here f would generalize, but the same def at top-level would not? would the top-level version error if you tried to annotate it with a rigid/..?

view this post on Zulip Richard Feldman (Jun 14 2026 at 19:18):

hm, I guess? (I hadn't thought about it)

view this post on Zulip Richard Feldman (Jun 14 2026 at 19:20):

maybe the rule could be that top-level values can't have unbound type variables unless they're functions? :thinking:

view this post on Zulip Richard Feldman (Jun 14 2026 at 19:20):

(with or without annotations)

view this post on Zulip Aurélien Geron (Jun 14 2026 at 19:45):

Thanks everyone, that's very helpful. :folded_hands:
For anyone else wondering, here's Ayaz's proposal:
https://github.com/roc-lang/rfcs/blob/main/0011-union-refinement.md

view this post on Zulip Richard Feldman (Jun 14 2026 at 21:26):

oh actually the one that's relevant here is actually https://github.com/roc-lang/rfcs/blob/main/0010-let-generalization-lets-not.md - probably should have linked to it earlier, oops!

view this post on Zulip Aurélien Geron (Jun 14 2026 at 21:50):

Oh right, thanks @Richard Feldman

view this post on Zulip Brendan Hansknecht (Jun 15 2026 at 06:15):

Richard Feldman said:

I dunno, is that bad though? :thinking:

For most users and most cases probably not. Definitely hit some pain in rocci bird due to the equivalent when we still had tasks. Each capture was a slightly different layout and was constantly being shuffled around.

view this post on Zulip Brendan Hansknecht (Jun 15 2026 at 06:16):

I think this probably should not be decided worrying about perf

view this post on Zulip Brendan Hansknecht (Jun 15 2026 at 06:16):

It should worry about ergonomics

view this post on Zulip Brendan Hansknecht (Jun 15 2026 at 06:16):

At least for the most part

view this post on Zulip Brendan Hansknecht (Jun 15 2026 at 06:16):

And the question here is: are stricter types that give stronger guarantees more ergonomic, or are more flexible types?

view this post on Zulip Brendan Hansknecht (Jun 15 2026 at 06:17):

I think I'm still in the camp that most of the time, stricter is the preferred answer and if you don't want stricter, you can just avoid adding type annotations and it will infer something relaxed

view this post on Zulip Brendan Hansknecht (Jun 15 2026 at 06:18):

That said, I totally see the reverse argument of loose by default and if you want stricter, define a nominal type.


Last updated: Jun 16 2026 at 16:19 UTC