I've been thinking a bunch about https://roc.zulipchat.com/#narrow/channel/395097-compiler-development/topic/Alias.20analysis.20error.20across.20modules/near/438500819 and I think that despite my initial reservations, it's worth talking through what it could look like in Roc to change from "opaque types" to "nominal types" - specifically for tag unions.
for example, here's one sketch of a design idea:
* [...] := to no longer mean "opaque type" but instead "nominal tag union" - e.g. instead of Email := Str we might do Email := [Wrapper Str] (the usual implements stuff Just Works the same way it does today)
* [...] Email example, I could match on Email.Wrapper str -> ...
* [...] exposes [Email exposing [Wrapper], ...] - if they're exposed, then other modules can import the Email type and use Email.Wrapper, but otherwise they can't (and the type is opaque)
* import Email exposing [Email exposing Wrapper]
one thing this would immediately let us do is to say:
* Bool is now a nominal tag union, defined as Bool := [True, False]
* we automatically import its True and False tags, and expose them
* this means when you put True into the repl, it says True : Bool - and pattern matching etc still works the normal way
* it also means you can't construct a userspace tag named True anymore, or False
this would still let us customize how Bool encoding and decoding works, which was the original motivation for making it opaque
(we could consider doing the same to Result, which has also come up in the past)
going back to https://roc.zulipchat.com/#narrow/channel/395097-compiler-development/topic/Alias.20analysis.20error.20across.20modules/near/438500819 - currently, the best solution to this problem is to disallow recursive tag unions unless they're opaque. However, this does create a problem, which is that it's important that we have a rule that opaque types can't be sent to the host.
Besides the fact that the future replay feature can't work otherwise, on a fundamental level, signaling that a type is Opaque means "I am reserving the right to modify the internal implementation details of this thing, including its structure" - but modifying the internal structure of something you send to the host is undefined behavior; the host relies on statically knowing that type's layout, so if you say "I am going to feel free to change this thing's layout" while also sending it to the host, that would be among the biggest footguns imaginable. :sweat_smile:
and yet, we want to be able to send recursive types to the host!
nominal types would be the best fix for that. It would be a way to say "here is this thing's exact structure, it's not opaque, and yet it is nominal, so that bug can't happen."
anyway, that's the general shape of the idea. I'm curious what others think!
oh yeah I forgot to mention - nominal tag unions would also give us a way to have enumerations where you associate each tag with a number or a string or something like that
and similarly, nominal record types could be a way to specify serialization customizations, e.g. "this field should encode and decode as if it had this name instead of its actual name"
there might also be some path for better null handling in json there, not sure
This sounds great :grinning:
What if I want to send a roc dictionary to the host? :grinning_face_with_smiling_eyes: That is opaque but maybe something a host would want. I guess that would lock it to a specific version of roc potentially if we update the dict impl....still might be useful in a handful of cases especially where perf really matters. Though maybe in those cases, it is basic to control the data structures and unwrap them before passing to the host
Otherwise I don't really grok the tradeoffs between nominal and opaque, but nominal tags sound nice in general.
Besides the fact that the future replay feature can't work otherwise, on a fundamental level, signaling that a type is Opaque means "I am reserving the right to modify the internal implementation details of this thing, including its structure" - but modifying the internal structure of something you send to the host is undefined behavior; the host relies on statically knowing that type's layout, so if you say "I am going to feel free to change this thing's layout" while also sending it to the host, that would be among the biggest footguns imaginable. :sweat_smile:
I don't completely understand this point. I get that making a type opaque means reserving the right to make changes to the internal structure, but who is making that promise to whom? I'm trying and failing to come up with a scenario where this becomes a problem, here's the ones I considered:
If it's the platform author making the promise to application authors, then I don't think it matters. If as a platform author I change the internal implementation of an opaque type then I can also change the host code to match.
If it's an application author using that promise to abstract one part of the application from another, then I also don't see how it matters. The host cannot statically know application-defined types because the application is written and compiled later. From the host's perspective all application-defined types might as well be opaque.
The one scenario where I might see a problem is if a platform directly depends on a Roc library and generates glue for the structure of the library's opaque types in host code. Then if the library updates the structure of the opaque type the host will break. Forbidding sending opaque types to the host would prevent this, but refusing to generate glue for opaque types would too.
Brendan Hansknecht said:
What if I want to send a roc dictionary to the host?
oh yeah, builtins are always fine - the host knows their layout based on the Roc version
currently, the best solution to this problem is to disallow recursive tag unions unless they're opaque
What about anonymous error unions? For example:
tryFoo = \x ->
    result1 = try x ? FooErr
    result2 = try tryBar 123 ? FooSawBarErr
    "success"

tryBar = \y ->
    result1 = try y ? BarErr
    result2 = try tryFoo "abc" ? BarSawFooErr
    200

FooErrors : [FooErr, FooSawBarErr BarErrors]
BarErrors : [BarErr, BarSawFooErr FooErrors]
Would this be allowed?
those don't look recursive to me, unless I'm missing something, so should be fine
Whoops, naming
The point is that we currently let users "just propagate with try". It seems that without allowing for recursive tag unions, this wouldn't be allowed, and they'd have to figure out how to propagate mutually recursive errors
Maybe with a try tryFoo "abc" ? BarErrors.BarSawFooErr or equivalent
Which isn't that bad
:thinking: do mutually recursive errors come up in practice? I can't think of a scenario where that would come up
I agree, not a popular usage of tag unions
Jasper Woudenberg said:
The one scenario where I might see a problem is if a platform directly depends on a Roc library and generates glue for the structure of the library's opaque types in host code. Then if the library updates the structure of the opaque type the host will break. Forbidding sending opaque types to the host would prevent this, but refusing to generate glue for opaque types would too.
yeah this is the scenario that's a problem. I don't think forbidding glue generation is an ok solution because glue generation is optional (e.g. today most platform development is done without glue)
I also don't think it would be useful to have a rule of "you can use opaque types but only if they came from the platform, not from third party packages" - if you can only use opaque types that you have access to unwrap, then it's simpler to have the rule be "ok so just unwrap them before you send them"
yeah this is the scenario that's a problem. I don't think forbidding glue generation is an ok solution because glue generation is optional (e.g. today most platform development is done without glue)
Yeah, I'm doing that myself in Zig at the moment :sweat_smile:.
If it's platform development that benefits from the 'opaque types cannot be sent to the host' rule, is it then fair to say this proposal is trading application author convenience for platform author convenience, by not going with the 'recursive types need to be opaque' approach?
For a platform author not using glue to integrate with an opaque type from a library, they'd have to look at the library source code and find the opaque type implementation. I guess that sounds so iffy to me I can't really imagine a host author doing it by accident. I wonder if a "don't do that!" would be enough, considering how many other opportunities platform authors have to mess up given how low they are in the stack.
oh the bigger problem with the third party thing is that version ranges mean you can't possibly know what layout you're getting
because opaque types can change their internal structure as a nonbreaking change
so let's say I'm a platform author, I depend on v1.0.0 of a third party package, I look at its layout and code my host against that
then they release 1.0.1 which has a different internal structure, the application selects that, and now we have UB
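To make that version-range scenario concrete, here is a minimal sketch in Rust (not Roc). Everything in it is hypothetical: the two encode functions stand in for how an imaginary third-party package might lay out the same opaque type in 1.0.0 and 1.0.1, and host_read_len stands in for host code written against the 1.0.0 layout. It simulates the mismatch with plain byte buffers, so the sketch itself has no undefined behavior; a real host reading raw memory would.

```rust
// Hypothetical layouts for one opaque type across two releases of an
// imaginary third-party package. None of these names come from a real library.

/// Layout in "1.0.0": pointer first, then length (8 bytes each, little-endian).
fn encode_v1_0_0(ptr: u64, len: u64) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&ptr.to_le_bytes());
    out.extend_from_slice(&len.to_le_bytes());
    out
}

/// Layout in "1.0.1": length first, then pointer, plus a new flags field.
/// From the package's point of view this is a nonbreaking change, because
/// the type is opaque.
fn encode_v1_0_1(ptr: u64, len: u64, flags: u64) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(&ptr.to_le_bytes());
    out.extend_from_slice(&flags.to_le_bytes());
    out
}

/// Host code written against "1.0.0": it assumes the length lives at bytes 8..16.
fn host_read_len(value: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&value[8..16]);
    u64::from_le_bytes(buf)
}

fn main() {
    let old = encode_v1_0_0(0x1000, 5);
    let new = encode_v1_0_1(0x1000, 5, 0);
    // With the layout the host was coded against, it reads the real length.
    println!("1.0.0: host sees len = {}", host_read_len(&old));
    // With the new layout, the same host code reads the pointer as the length.
    println!("1.0.1: host sees len = {}", host_read_len(&new));
}
```

The host code never changed, but once the application resolves to 1.0.1 it silently misreads the value.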
Sure, that makes sense. I guess I'm wondering how likely it is someone will try binding host code to the implementation of an opaque type from an external library in the first place, and so how much we're willing to sacrifice application development to prevent it.
I say that because writing C-bindings between languages feels like an advanced skill, much more so than the rule of thumb "don't integrate with the private implementation details of external code", so I have trouble imagining someone trying this without understanding the problems with it. I may well be biased by the order in which I learned things myself, though :sweat_smile:.
As a note, I would not be surprised if recursive tags become semi common in the future of roc. In essentially every language that has errors instead of exceptions, some form of library is written that creates deeply nested errors to essentially build up context/a stack trace. This almost always eventually hits some form of mutual recursion.
This could easily happen in roc if at every call that could fail a user simply wraps the error with more context and returns
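As a rough illustration of the pattern described here, this is how it tends to look in Rust (an analog only, not a claim about how Roc would behave). The type and function names are hypothetical and mirror the tryFoo/tryBar example earlier in the thread. Each function wraps the other's error with its own context, and the two error types end up referring to each other.

```rust
// Hypothetical error types: each one can wrap the other, so the pair is
// mutually recursive. Rust needs Box here to give the types a finite size.
#[derive(Debug, PartialEq)]
enum FooError {
    FooFailed,
    FooSawBar(Box<BarError>),
}

#[derive(Debug, PartialEq)]
enum BarError {
    BarFailed,
    BarSawFoo(Box<FooError>),
}

// Toy functions that call each other. The recursion always bottoms out in a
// failure at depth 0, so the interesting part is how the error gets wrapped
// on the way back up.
fn try_foo(depth: u32) -> Result<&'static str, FooError> {
    if depth == 0 {
        return Err(FooError::FooFailed);
    }
    try_bar(depth - 1).map_err(|e| FooError::FooSawBar(Box::new(e)))?;
    Ok("success")
}

fn try_bar(depth: u32) -> Result<u32, BarError> {
    if depth == 0 {
        return Err(BarError::BarFailed);
    }
    try_foo(depth - 1).map_err(|e| BarError::BarSawFoo(Box::new(e)))?;
    Ok(200)
}

fn main() {
    // Each level of the call chain adds one layer of context to the error.
    println!("{:?}", try_foo(2));
}
```

Here try_foo(2) fails with FooSawBar(BarSawFoo(FooFailed)), which is the kind of nested context chain being discussed.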
Also, I'm a bit confused by some of the initial statements, is this fully replacing opaque types or is it specific to tag unions? Cause you also mention maybe adding nominal records? Just don't fully understand the scope here
it's a pretty vague idea; I'm not sure either! :big_smile:
this isn't at the stage of a proposal or anything, just trying to feel out the general idea
Brendan Hansknecht said:
As a note, I would not be surprised if recursive tags become semi common in the future of roc. In essentially every language that has errors instead of exceptions, some form of library is written that creates deeply nested errors to essentially build up context/a stack trace. This almost always eventually hits some form of mutual recursion.
but none of those languages have polymorphic sum types; is there still demand for that in practice if you do? (I hope not, but I guess we'll see!)
Sounds reasonable to try
It seems to me that adding nominal tag unions would effectively add nominal records at the same time because you could define a tag union with a single tag and a record as the payload. We can already do something similar with anonymous unions today but it’s not quite the same because you don’t need access to the specific constructor to make one.
This seems like a good solution to the problem and I think it would be nice to have nominal tag unions at times. That being said, it’s unfortunate that it would increase the size of the language. I could see the presence of nominal and structural unions being very confusing for beginners.
It could also introduce more decision points where you have to ask yourself which tool to use. Maybe this wouldn’t be much of an issue if it is explicitly communicated that you should always use structural unions unless you have a specific need for nominal ones.
yeah maybe terminology like "fixed" vs "flexible" might help? :thinking:
might also be more confusing haha
or like "anonymous records"/"anonymous tag unions" vs "named records"/"named tag unions"
Anonymous/named sounds promising to me, but i think the presence of both kinds of unions will be somewhat confusing regardless
More confusing than opaque tag unions?
Given the :thinking:, I'll clarify a bit. I don't think this makes the language any more complex. It is just swapping out opaque tags for nominal tags. Slightly different tradeoffs, but no more complexity or confusion.
Right now if you want to use a tag union there's only one choice. You could choose to wrap that tag union in an opaque type if that is desired but it's a separate concept. With this design, there would now be two ways to use tag unions:
Color : [Red, Blue]
Color := [Red, Blue]
The difference between these two choices is not obvious. They can both be used to solve similar problems, but they each have different consequences. This seems more confusing than the current situation.
I could also see a world where users familiar with other languages with nominal tag unions like Elm or Haskell might not realize that Roc has structural unions and default to always explicitly declaring nominal unions with :=. That might not be too difficult of an anti-pattern to correct, but I think it would happen with this approach.
I think you are just pointing out that opaque tags today are a pain to use and don't really work. It has been requested multiple times to make them more flexible. We have opaque tags today. Look at bool. They just suck to use. I think long term, eventually something would give in that system as well.
So I would still argue it is not really more complex in practice
Also, I think it would be totally fine if someone only wants to use nominal tags. I think it would even be a reasonable best practice
Apart from error style tags that accumulate tons of adhoc variants, I think that nominal tags are more type safe and that is beneficial
Well right now because you can't use an opaque tag union like a tag union normally there's never really the need to make a decision about which you should use. But this change will introduce that decision which seems more complex to me.
I don't have a better solution to propose and I don't think this is enough of a reason not to go with nominal types, it's just unfortunate that it could introduce some confusion.
one idea is that we could just give them a separate name - so like "tag union" means what it does today, and the idea is that "it's a union of tags, and the union can grow on the fly" and then separately we have a concept of "enum" (for example, to use Rust's terminology) which is a hardcoded enumeration of alternatives
rather than using the terminology of "nominal" and "structural"
so then we could say like
the declaration syntax could use a keyword to reinforce that, e.g.
enum Bool [True, False]
or maybe another way to explain it could be in terms of the nominal types, e.g.
* Roc has enums and structs
* Records are anonymous structs
* Tag unions are anonymous enums
I think we would need to remove the idea of closed tags completely from user space for that framing to feel cohesive
Cause it sounds like enums are just closed tags
Of course we would still need closed tags in general to avoid requiring _ -> in pattern matching
Richard Feldman said:
one thing this would immediately let us do is to say:
* Bool is now a nominal tag union, defined as Bool := [True, False]
* we automatically import its True and False tags, and expose them
* this means when you put True into the repl, it says True : Bool - and pattern matching etc still works the normal way
* it also means you can't construct a userspace tag named True anymore, or False
I don't really like this. I like the fact that you can just use any capitalized identifier as a tag. There are situations in which one might want to use True and False as tags, e.g. when you're handling some kind of external language or data representation that has true and false as possible values.
I think it is an upside that you can't name your tags True or False. You know it is either one of the two values, not maybe the Roc True, but actually, it is an open tag union that in certain code paths is Truh (typo), or Undefined (example: because you model something like a dynamic language value with your tag).
I like the idea of removing the concept of closed tags in favor of enums for the public (tutorial and general explanations), like how Task is fading to the background with the current purity inference proposal
Fritz Psiorz said:
Richard Feldman said:
one thing this would immediately let us do is to say:
* Bool is now a nominal tag union, defined as Bool := [True, False]
* we automatically import its True and False tags, and expose them
* this means when you put True into the repl, it says True : Bool - and pattern matching etc still works the normal way
* it also means you can't construct a userspace tag named True anymore, or False
I don't really like this. I like the fact that you can just use any capitalized identifier as a tag. There are situations in which one might want to use True and False as tags, e.g. when you're handling some kind of external language or data representation that has true and false as possible values.
we could allow opting out of that, e.g.
import Bool exposing [Bool]
I guess in general we could allow you to import builtin modules with different exposing settings in case you (for some reason) really want to choose names that builtin modules reserve
but if I'm being honest, I think demand for that in practice would be close to zero
especially because the workaround is so easy: just name them True_ and False_ or something, just like how people work around reserved record field names like if
one idea is that we could just give them a separate name - so like "tag union" means what it does today, and the idea is that "it's a union of tags, and the union can grow on the fly" and then separately we have a concept of "enum" (for example, to use Rust's terminology) which is a hardcoded enumeration of alternatives
Calling only one of the kinds of unions tag unions and the other enums feels odd to me given that they are both tagged unions. I like the idea of calling them named tag unions and anonymous tag unions more.
I think avoiding nominal/structural as the primary way of communicating the differences is a good idea. I suspect these are less familiar terms than anonymous/named and there seems to be a fair amount of confusion about what they mean.
or maybe another way to explain it could be in terms of the nominal types, e.g.
* Roc has enums and structs
* Records are anonymous structs
* Tag unions are anonymous enums
It seems like in Elm people almost always use structural records rather than nominal ones. If that is the case, maybe there isn't a need to have separate names for structural and named records.
what about "custom tag union"?
that name suggests what the default is: you have normal tag unions, and then when you want to customize them beyond the defaults they give you, you switch to a custom tag union
and a custom one isn't compatible with the normal ones (which means it can't grow automatically) because, well, it's custom!
that is, it's not the same as them anymore - which was the whole goal anyway
Smooth idea. Would records also rename to "custom structs"?
assuming we want nominal versions of both, I think I'd go "tag union / custom tag union" and "record / custom record"
keeping with the theme of "you can make a custom version if you want it to work differently from the default"
Ooh yeah I like that! Nice that then the default can be described just as a tag union and there doesn’t have to be any other modifier like anonymous or structural.
Does this mean you can't define a type alias for an open tag union or open record?
No you could definitely still define type aliases for anything
To use a custom (nominal) tag union you will have to declare it explicitly and its tags will be associated with it exclusively. Defining a type alias just serves as a shorthand to refer to a type rather than creating a new, distinct type.
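For readers who know Rust, the alias-versus-new-type distinction can be sketched like this. It is a loose analog, not Roc semantics, since Rust enums are always nominal and Rust has no structural tag unions. The names Color, Shade, and Paint are made up for the example.

```rust
// Hypothetical types for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red,
    Blue,
}

// An alias is just another name for Color; it does not create a new type.
type Shade = Color;

// A separately declared enum is a distinct type, even though its variants
// have the same names as Color's.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Paint {
    Red,
    Blue,
}

fn describe(c: Color) -> &'static str {
    match c {
        Color::Red => "red",
        Color::Blue => "blue",
    }
}

fn paint_name(p: Paint) -> &'static str {
    match p {
        Paint::Red => "red",
        Paint::Blue => "blue",
    }
}

fn main() {
    let s: Shade = Color::Blue;
    // Accepted: Shade and Color are the same type.
    println!("{}", describe(s));
    // describe(Paint::Red) would be rejected by the compiler, because Paint's
    // tags belong to Paint exclusively.
    println!("{}", paint_name(Paint::Red));
}
```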
Last updated: Jun 16 2026 at 16:19 UTC