Stream: ideas

Topic: non-exhaustive tag unions


timotree (Apr 21 2024 at 00:41):

If a library uses a tag union to represent possible error conditions, it would be a breaking change to add a new error condition since it could break exhaustive when expressions downstream. Has there been thought put into allowing libraries to define future-extensible error types?

Could open tag unions somehow help with this? In Rust, there is a #[non_exhaustive] feature for enums which requires that pattern matching on that enum always have a wildcard case, much like open tag unions in Roc. When I was first learning Roc, I assumed that you could use open tag unions to emulate Rust's non-exhaustive enums, but it turns out that's just not how they currently work. For example:

tryStuff : Str -> Result Str [Err1, Err2]*
tryStuff = \s -> Ok s

useStuff =
    when tryStuff "hello" is
        Ok s -> s
        Err Err1 -> "err1 :("
        Err Err2 -> "err2 :(("
        # _ -> "other"
        # ^ this typechecks without a wildcard branch
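
For comparison, here is a minimal sketch of the Rust feature mentioned above. The names `StuffError` and `describe` are made up for illustration and mirror the Roc example. One caveat: `#[non_exhaustive]` only forces the wildcard arm on code in *other* crates, so inside a single file like this the compiler does not require it.

```rust
// Hypothetical library error type; names mirror the Roc example above.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StuffError {
    Err1,
    Err2,
}

// Downstream-style match. From another crate, leaving out the `_` arm would be
// a compile error because of #[non_exhaustive]. Inside the defining crate the
// attribute has no effect, so the arm is kept here purely for illustration
// (hence the allow, which silences the "unreachable pattern" warning).
#[allow(unreachable_patterns)]
pub fn describe(err: StuffError) -> &'static str {
    match err {
        StuffError::Err1 => "err1 :(",
        StuffError::Err2 => "err2 :((",
        _ => "other",
    }
}
```

Because downstream matches must already contain a catch-all arm, the library can add an `Err3` variant later without breaking them.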

timotree (Apr 21 2024 at 00:47):

Maybe a good syntax for this would be to use opaques

FutureError := []

tryStuff : Str -> Result Str [Err1, Err2]FutureError
tryStuff = \s -> Ok s

Surprisingly, this already typechecks today! (but not with the desired behavior)

Brendan Hansknecht (Apr 21 2024 at 00:59):

This fundamentally is to allow adding error tags (and thus changing the API) without a major version bump due to it being a breaking change?

For the case of errors, what is the real advantage here? Wouldn't it be better to force the user to update their exhaustive pattern match when you add a new error type?

I think the only use of non-exhaustive that I have thought was good in Rust was for platform specifications. It is a case where the list is ever-changing with new definitions and essentially no user actually supports all platforms, so nearly every match had a _ -> equivalent anyway.

timotree (Apr 21 2024 at 01:19):

I suppose I should give a motivating example.

Right now basic-cli has an IOError tag union with a boatload of tags:

IOError : [
    NotFound,
    PermissionDenied,
    ConnectionRefused,
    ConnectionReset,
    HostUnreachable,
    NetworkUnreachable,
    ConnectionAborted,
    NotConnected,
    AddrInUse,
    AddrNotAvailable,
    NetworkDown,
    BrokenPipe,
    AlreadyExists,
    WouldBlock,
    NotADirectory,
    IsADirectory,
    DirectoryNotEmpty,
    ReadOnlyFilesystem,
    FilesystemLoop,
    StaleNetworkFileHandle,
    InvalidInput,
    InvalidData,
    TimedOut,
    WriteZero,
    StorageFull,
    NotSeekable,
    FilesystemQuotaExceeded,
    FileTooLarge,
    ResourceBusy,
    ExecutableFileBusy,
    Deadlock,
    CrossesDevices,
    TooManyLinks,
    InvalidFilename,
    ArgumentListTooLong,
    Interrupted,
    Unsupported,
    UnexpectedEof,
    OutOfMemory,
    Other,
]

If in the future basic-cli gains support for a new operating system with new kinds of errors, it would be nice if this tag union could be extended without a major version bump.

Brendan Hansknecht said:

For the case of errors, what is the real advantage here? Wouldn't it be better to force the user to update their exhaustive pattern match when you add a new error type?

Arguably it makes sense for applications to include a wildcard branch when matching on IO errors, because that branch is meaningful: It's saying "here's what I want to do if I get an unexpected error because I'm running on some future operating system I don't know about yet"

timotree (Apr 21 2024 at 01:19):

Brendan Hansknecht said:

This fundamentally is to allow adding error tags (and thus changing the API) without a major version bump due to it being a breaking change?

Yep!

Brendan Hansknecht (Apr 21 2024 at 01:34):

Ah yep. IOError definitely qualifies as well. It falls into the system-dependent, ever-growing kind of variant list, where it is quite unlikely that all cases will be handled.

Brendan Hansknecht (Apr 21 2024 at 01:35):

Aside: Probably would be nice to split that into many different sub errors if possible. Like you can get DirectoryNotEmpty when loading a file.

Richard Feldman (Apr 21 2024 at 18:34):

we could do something like:

IoError : [
    NotFound,
    Others,
    Etc,
][..]

Richard Feldman (Apr 21 2024 at 18:35):

so the [..] would be like an "unmatchable union" or something

Richard Feldman (Apr 21 2024 at 18:38):

and I guess that would be something you could only express in a type annotation, which would be fine because it's always okay for type annotations to make a type more restrictive (e.g. to introduce a compiler error) than what would be inferred from the implementation

Richard Feldman (Apr 21 2024 at 18:38):

so we'd never infer [..] but you could add it to an annotation if desired

timotree (Apr 21 2024 at 18:40):

I like the idea of having a distinguished "unmatchable union". I guess the other question is what happens when you pattern match on IoError

timotree (Apr 21 2024 at 18:44):

IoError : [NotFound, PermissionDenied][..]

myErrorMessage : IoError -> Str
myErrorMessage = \err ->
    when err is
        NotFound -> "We were unable to find the requested resource"
        PermissionDenied -> "We did not have permission to perform the requested action"
        otherwise -> "We encountered an unexpected error"

If this is the syntax, what is the type of otherwise?

Luke Boswell (Apr 21 2024 at 18:48):

Could you use _ -> "We encountered an unexpected error"?

timotree (Apr 21 2024 at 18:51):

Hmm.. maybe then what's special about [..] is that you're allowed to pattern-match it with _ but not an identifier name?

Richard Feldman (Apr 21 2024 at 18:55):

what would be special about it is only that you'd get a compiler error if you didn't have a catch-all branch in your pattern match, such as _ -> or otherwise ->

Richard Feldman (Apr 21 2024 at 18:56):

at least today, the type of otherwise would be the type of the entire union, which is what its type would always be

timotree (Apr 21 2024 at 18:57):

Oh wait maybe my question doesn't make any sense. For some reason I thought that pattern matching on tag unions eliminated the tags you wrote cases for from the catch-all type

Richard Feldman (Apr 21 2024 at 18:57):

we've talked about doing that but haven't done it yet

Richard Feldman (Apr 21 2024 at 18:57):

if we did, its type could potentially be []

Richard Feldman (Apr 21 2024 at 18:58):

which would ordinarily give an "unnecessary branch" warning, but of course in the case of [..] it wouldn't

witoldsz (Apr 21 2024 at 19:56):

Why is tryStuff : Str -> Result Str [Err1, Err2]* not working as an open union in the example of the initial question?

https://www.roc-lang.org/tutorial#open-and-closed-tag-unions

Isaac Van Doren (Apr 21 2024 at 21:16):

I can see a scenario where a library exposes a non-exhaustive tag union and then in a consuming application, there isn't any reasonable default behavior, so the app author chooses to crash instead:

when val is
    Foo -> ...
    Bar -> ...
    _ -> crash "this doesn't make sense"

I'm not very fond of this because it means that when any tag is added the app will crash instead of emitting a type error.

I like that right now if you write a when is with no wildcard, you know that if any new tag is added you will be forced to handle it. I don't like that this would take away that control from the author.

In the IoError example, there are already so many tags that I would guess almost all applications will use a wildcard and won't be broken by adding a new tag.

I get that this means there would be more breaking changes for semver purposes, but it doesn't seem worth it to take away this level of control from the app author.

Isaac Van Doren (Apr 21 2024 at 21:20):

We could remove this warning message in the case where the final branch is a wildcard and all cases have already been handled so that app authors can opt in to the non-exhaustive behavior but are not forced.

── REDUNDANT PATTERN in gob.roc ────────────────────────────────────────────────

The 2nd pattern is redundant:

41│      when x is
42│          Foo -> "foo"
43│          _ -> "bar"
             ^

Any value of this shape will be handled by a previous pattern, so this
one should be removed.

timotree (Apr 21 2024 at 23:12):

witoldsz said:

Why is tryStuff : Str -> Result Str [Err1, Err2]* not working as an open union in the example of the initial question?

When I was first learning Roc I thought that open union was the non-exhaustive union I'm talking about, but it turns out I misunderstood. The flexibility to extend the open union goes to each caller of tryStuff, not tryStuff itself. You can test this yourself. If you try returning Err Other from tryStuff you'll get an error.

Brendan Hansknecht (Apr 21 2024 at 23:15):

To be more specific here, based on the use in the caller of tryStuff, * maps to nothing. As such, the returned value from tryStuff is restricted to only [Err1, Err2]. Essentially the * maps to nothing and goes away.

Brendan Hansknecht (Apr 21 2024 at 23:16):

* gets filled in at compile time. As such, it doesn't guarantee exhaustive or not.
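
A rough way to picture this in Rust terms is to treat the `*` as a type parameter that each caller picks. This is only an analogy for the explanation above, not a description of how Roc implements open unions, and every name in it (`StuffError`, `Ext`, `try_stuff`, `use_stuff`) is hypothetical. Because the caller chooses the extension, the function body cannot produce an extra tag, and a caller that picks an empty extension can match exhaustively without a meaningful wildcard.

```rust
use std::convert::Infallible;

// `Ext` plays the role of the `*` in `[Err1, Err2]*` (analogy only).
#[derive(Debug)]
pub enum StuffError<Ext> {
    Err1,
    Err2,
    Other(Ext),
}

// The function is generic over `Ext`, so it cannot construct `Other(..)`:
// it has no way to make a value of a type it does not know.
pub fn try_stuff<Ext>(s: &str) -> Result<String, StuffError<Ext>> {
    match s {
        "" => Err(StuffError::Err1),
        "?" => Err(StuffError::Err2),
        _ => Ok(s.to_string()),
    }
}

// This caller fills `Ext` with an uninhabited type, so the extension
// "maps to nothing" and the `Other` arm can never actually run.
pub fn use_stuff(s: &str) -> String {
    let result: Result<String, StuffError<Infallible>> = try_stuff(s);
    match result {
        Ok(s) => s,
        Err(StuffError::Err1) => "err1 :(".to_string(),
        Err(StuffError::Err2) => "err2 :((".to_string(),
        Err(StuffError::Other(never)) => match never {},
    }
}
```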

timotree (Apr 21 2024 at 23:26):

Isaac Van Doren said:

I like that right now if you write a when is with no wildcard, you know that if any new tag is added you will be forced to handle it. I don't like that this would take away that control from the author.

That's a good point. I hadn't thought about the way this feature could take away control from downstream devs.

timotree (Apr 21 2024 at 23:27):

It is important to keep in mind that non-exhaustiveness would be something that libraries would opt into, so I guess we're imagining a scenario where a library author thinks it would be a good idea to make a union non-exhaustive, but then as a downstream dev I want to override that decision. Maybe that is just a sign of bad library design?

Richard Feldman (Apr 21 2024 at 23:28):

well in the case of OS errors specifically, there's not much the package author can do if a new type of OS error comes up

Richard Feldman (Apr 21 2024 at 23:30):

either they lump the new error under UnknownError (which presumably happens by default), in which case users of the package can't match on it (because it's just this generic UnknownError) or else they add a variant for it and now it's a breaking change

Richard Feldman (Apr 21 2024 at 23:31):

they could also offer the error as an opaque blob of bytes that the application can decode, which means new error types can be added as a nonbreaking change, but now you don't even get programmatic help seeing what the possible errors are; that just has to be in documentation or something

Brendan Hansknecht (Apr 21 2024 at 23:48):

Yeah, this is why Rust has one io error variant that is just an error code

Brendan Hansknecht (Apr 21 2024 at 23:48):

It is the fallback

Brendan Hansknecht (Apr 21 2024 at 23:48):

Just a number

timotree (Apr 21 2024 at 23:50):

io::ErrorKind in Rust doesn't have a number variant. Are you talking about io::Error::raw_os_error()?

Brendan Hansknecht (Apr 21 2024 at 23:53):

Doesn't io::Error give a number in the case where the kind is Other?

Brendan Hansknecht (Apr 21 2024 at 23:54):

Ah yeah, probably the raw os error

Brendan Hansknecht (Apr 21 2024 at 23:55):

Cause you get an io::Error from the result. If it has a kind of other, you can grab the raw int
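
The fallback described here can be sketched with the Rust standard library roughly as follows. The `describe` helper and its messages are made up for illustration. `io::Error::kind()` and `io::Error::raw_os_error()` are real std APIs, and `io::ErrorKind` is itself marked `#[non_exhaustive]`, so the wildcard arm is mandatory for code outside std.

```rust
use std::io;

// Hypothetical helper: handle the kinds we care about, then fall back to the
// raw OS error code (if there is one) for everything else.
pub fn describe(err: &io::Error) -> String {
    match err.kind() {
        io::ErrorKind::NotFound => "not found".to_string(),
        io::ErrorKind::PermissionDenied => "permission denied".to_string(),
        // Required: io::ErrorKind is non-exhaustive, so new kinds may appear.
        _ => match err.raw_os_error() {
            Some(code) => format!("unhandled OS error code {code}"),
            // Errors built with io::Error::new carry no OS code.
            None => "unhandled error without an OS code".to_string(),
        },
    }
}
```

Note that `raw_os_error()` is available for any error that came from the OS, not only for those whose kind is `Other`.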

Brendan Hansknecht (Apr 21 2024 at 23:56):

Also, I feel like as a user, non-exhaustive is such a niche use case that it may often be better to force a non-tag union representation or something that works around the issue.

Brendan Hansknecht (Apr 21 2024 at 23:57):

It is super frustrating when something is non-exhaustive, you want to match all cases, but you can't. Then you update and miss something and hit an unhandled case

Brendan Hansknecht (Apr 21 2024 at 23:58):

I personally prefer the breaking update and forced handling essentially all the time. Let the user opt into non-exhaustively matching.

Brendan Hansknecht (Apr 22 2024 at 00:01):

But maybe I am biased from Google and living in a world of everything always being at head. There is no versioning. Just update everything when you change something that should be exhaustive. Breaking changes are easier when they are the norm and you can just update everything

Richard Feldman (Apr 22 2024 at 00:44):

Brendan Hansknecht said:

I personally prefer the breaking update and forced handling essentially all the time. Let the user opt into non-exhaustively matching.

this is definitely a reasonable design

Richard Feldman (Apr 22 2024 at 00:45):

it's really difficult for Rust to adopt it because they want to have super strong backwards compatibility and also OS operations wrapped in the stdlib

Richard Feldman (Apr 22 2024 at 00:45):

but platforms don't have that problem; you can do a breaking release of your platform to accommodate new errors and applications can just update

Richard Feldman (Apr 22 2024 at 00:46):

same with (future) platform-agnostic effectful packages in the ecosystem

Richard Feldman (Apr 22 2024 at 00:46):

so probably the best way to proceed is to leave things as they are, and then revisit if there's demand in practice

timotree (Apr 22 2024 at 00:48):

I agree that any solution here should wait until there's clear demand, but I thought it would be fun to explore the design space a bit!

timotree (Apr 22 2024 at 02:17):

Brendan Hansknecht said:

Aside: Probably would be nice to split that into many different sub errors if possible. Like you can get DirectoryNotEmpty when loading a file.

I think part of the rationale for combining them all into one big error union is that any narrowing down of the error possibilities for a given operation would have to be target-specific. For example, it doesn't seem like you should get NetworkUnreachable from loading a file, but maybe your target has some networked filesystem. Or it doesn't seem like you should get DirectoryNotEmpty from opening a file, but maybe your target has some weird kinds of files that execute an rm -r command on some other directory when you open them.

The biggest downside in my experience is that it makes me rely more on the documentation and less on the types. If I want to intelligently handle some IO errors, I've got to look at the documentation for the operation I'm performing to see what each error is known to mean on the targets that are currently supported. If there's some error which is not mentioned in the documentation, then I'm left wondering whether the documentation is incomplete or that error is never expected to occur on the operating systems I'm targeting.

One way to design the library to address both this problem and the future compatibility problem would be to have an opaque error type for each operation, e.g. FileOpenError, and then have functions which categorize the error on a given set of targets. So if I think my code will only be deployed on Linux, I can call FileOpenError.linux2024 : FileOpenError -> [NotFound, PermissionDenied, Unexpected], which lets me handle all of the cases which can be expected to occur on Linux as of 2024, plus one Unexpected case for when my program is in fact not running on that kind of target and the error is none of the above.
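
A minimal sketch of that design, written in Rust rather than Roc, might look like the following. It assumes the opaque error wraps a raw OS error code, which is only one possible representation. The type and function names follow the message above, while `from_raw_os_code` and the choice of codes (ENOENT = 2 and EACCES = 13 on Linux) are illustrative assumptions.

```rust
/// Hypothetical opaque error for a file-open operation. The field is private,
/// so callers can only inspect the error through categorizing functions.
#[derive(Debug, Clone, Copy)]
pub struct FileOpenError {
    raw_os_code: i32,
}

/// The cases a caller could expect on Linux as of 2024 (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Linux2024 {
    NotFound,
    PermissionDenied,
    Unexpected,
}

impl FileOpenError {
    /// Assumed constructor; a real library would build this internally.
    pub fn from_raw_os_code(raw_os_code: i32) -> Self {
        FileOpenError { raw_os_code }
    }

    /// Categorize the error for one fixed set of targets. The returned enum is
    /// closed, so callers can match on it exhaustively without a wildcard.
    pub fn linux2024(self) -> Linux2024 {
        match self.raw_os_code {
            2 => Linux2024::NotFound,          // ENOENT on Linux
            13 => Linux2024::PermissionDenied, // EACCES on Linux
            _ => Linux2024::Unexpected,
        }
    }
}
```

Supporting a new target or a new error would then mean adding another categorizing function (say, a hypothetical `linux2030`) rather than changing the return type of an existing one, so existing exhaustive matches keep compiling.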


Last updated: Jun 16 2026 at 16:19 UTC