`Task _ {}` kinda sucks · compiler development

Stream: compiler development

Topic: `Task _ {}` kinda sucks

Brendan Hansknecht (Oct 06 2024 at 16:23):

So recently, we realized that open tags were leading to a solid number of ffi bugs. These became apparent when we switched over to Task as a builtin. Anytime we made a task like Task someData [], it led to ffi issues. The big problem is that [] was an open tag union. It would eventually specialize to SomeErrorTag. That led to the ffi changing depending on the error type used by an application.

This was the root of all kinds of strange bugs. As such, we switched basic-cli and other platforms over to using {} as the no-error type. This solved the immediate ffi problem by requiring that the platform author manually deal with the issue.

This has two main downsides:

1. Even if a task can't fail, we are generating unnecessary result wrappings in the platform api, often adding in a crash in case the {} error case is ever somehow hit.
2. We are leaving the fix up to the platform author, who may not even know there is a problem.

I think we should more directly fix this. I also don't think it will be too hard:

1. Ban any sort of type variables in the task type for a hosted function. No Task Str *. Those clearly have undefined ffi types.
2. If we run into a [] in the ok or err case of a Task, automatically map it in a way that ensures it will never expand.

2 is pretty easy to do. This is an example of doing it manually for the error case:

# Hosted generated function
stdoutLine : Str -> Task {} []

# Wrapping function:
line : Str -> Task {} []
line = \str ->
    (Ok x) = stdoutLine str |> Task.result!
    Task.ok x

We just have to automatically generate the equivalent of that wrapping function. Preferably when we implement it, we inline the representation instead of actually calling Task.result.

This enables users to just write Task _ [] and have it automatically work. It will never expand accidentally and break ffi with the platform.

Aside: I think we also have to ban Task [] [], but we can allow both Task [] _ and Task _ [], where _ is a proper type that makes it possible to instantiate the task.

Thoughts?

Brendan Hansknecht (Oct 06 2024 at 16:25):

One ffi note that is important for platform authors:

Task {} [] will just be void return type.
Task Something [] will just be Something return type.
Task Something SomeErr will be Result Something SomeErr return type.

Richard Feldman (Oct 06 2024 at 16:39):

I think it would be simpler to ban [] in hosted types altogether

Richard Feldman (Oct 06 2024 at 16:39):

in the purity inference world, I don't think anyone would even notice :big_smile:

Richard Feldman (Oct 06 2024 at 16:40):

because the only reason it comes up a lot now is that Task _ [] comes up a lot

Richard Feldman (Oct 06 2024 at 16:40):

but in the purity inference world, Task Foo [] just becomes Foo

Brendan Hansknecht (Oct 06 2024 at 16:40):

If purity inference is coming soon, I agree. If not, I think we should fix this

Richard Feldman (Oct 06 2024 at 16:41):

yeah Agus already has the type-checking part almost done! :smiley:

Richard Feldman (Oct 06 2024 at 16:41):

hm, although shouldn't we still run into problems with non-empty tag unions? :thinking:

Brendan Hansknecht (Oct 06 2024 at 16:42):

oh! This is much farther along than I realized.

Richard Feldman (Oct 06 2024 at 16:42):

like if I send [Foo, Bar] to the host, there are circumstances where that can unify with [Foo, Bar, Baz]

Brendan Hansknecht (Oct 06 2024 at 16:42):

haha...yeah...

Richard Feldman (Oct 06 2024 at 16:42):

which could change layouts
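
To make the layout concern concrete, here is a minimal host-side sketch in Rust. It assumes tag discriminants are assigned by sorting tag names alphabetically, so widening [Foo, Bar] to [Foo, Bar, Baz] would move Foo from 1 to 2. The names HostView, AppView, and host_decode are hypothetical and not part of any Roc API.

```rust
// Discriminants the host was compiled against: [Bar, Foo].
#[allow(dead_code)]
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum HostView {
    Bar = 0,
    Foo = 1,
}

// Assumed discriminants after the union unified with [Bar, Baz, Foo] in the app.
#[allow(dead_code)]
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum AppView {
    Bar = 0,
    Baz = 1,
    Foo = 2,
}

// How a host built for the two-tag union interprets a raw discriminant byte.
fn host_decode(byte: u8) -> Option<HostView> {
    match byte {
        0 => Some(HostView::Bar),
        1 => Some(HostView::Foo),
        _ => None,
    }
}

fn main() {
    // The app sends Baz, but the host reads it as Foo.
    println!("{:?}", host_decode(AppView::Baz as u8));
    // The app sends Foo, which the host cannot decode at all.
    println!("{:?}", host_decode(AppView::Foo as u8));
}
```

Under that assumption, neither side crashes at the boundary: the host silently misreads one tag and has no mapping for another, which is why the union has to be closed before it reaches the host.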

Brendan Hansknecht (Oct 06 2024 at 16:42):

So maybe we need a more holistic solution for any tags sent to the host

Richard Feldman (Oct 06 2024 at 16:42):

yeah like explicitly saying [Foo, Bar, Baz][] - or inferring it that way I guess

Richard Feldman (Oct 06 2024 at 16:43):

like basically treating any tag union types in hosted signatures as closed

Brendan Hansknecht (Oct 06 2024 at 16:43):

Really we want host tags to be automatically closed, but I also think it would be preferable to automatically map them back to open.

Brendan Hansknecht (Oct 06 2024 at 16:43):

Or at least make it trivial to do so?

Brendan Hansknecht (Oct 06 2024 at 16:43):

Cause all error tags will want to be open at the end of the day

Richard Feldman (Oct 06 2024 at 16:44):

oh true

Richard Feldman (Oct 06 2024 at 16:44):

Brendan Hansknecht (Oct 06 2024 at 16:44):

terrible code in basic-cli to work around this

Err : [
    EndOfFile,
    BrokenPipe,
    UnexpectedEof,
    InvalidInput,
    OutOfMemory,
    Interrupted,
    Unsupported,
    Other Str,
]

handleErr = \err ->
    when err is
        e if e == "EOF" -> StdinErr EndOfFile
        e if e == "ErrorKind::BrokenPipe" -> StdinErr BrokenPipe
        e if e == "ErrorKind::UnexpectedEof" -> StdinErr UnexpectedEof
        e if e == "ErrorKind::InvalidInput" -> StdinErr InvalidInput
        e if e == "ErrorKind::OutOfMemory" -> StdinErr OutOfMemory
        e if e == "ErrorKind::Interrupted" -> StdinErr Interrupted
        e if e == "ErrorKind::Unsupported" -> StdinErr Unsupported
        str -> StdinErr (Other str)

line : Task Str [StdinErr Err]
line =
    PlatformTasks.stdinLine
    |> Task.mapErr handleErr

Richard Feldman (Oct 06 2024 at 16:46):

so, setting aside ergonomics, I think there is only one way to do this correctly

Brendan Hansknecht (Oct 06 2024 at 16:46):

To work around this today in roc would require returning the open tag and then mapping every tag field (potentially recursively) with a when task |> Task.result! is.

Richard Feldman (Oct 06 2024 at 16:46):

actually nm I can think of two ways to do it correctly

Brendan Hansknecht (Oct 06 2024 at 16:46):

> I think there is only one way to do this correctly

Force all type variables in the hosted api to be empty?

Brendan Hansknecht (Oct 06 2024 at 16:47):

So open tags are forced closed and Task Str err has no err case?

Richard Feldman (Oct 06 2024 at 16:47):

so one way to do it correctly is to have a big Error union which represents literally every possible error the host might see

Richard Feldman (Oct 06 2024 at 16:47):

and then all hosted functions use that as their Error types, and we have a rule that all tag unions in hosted types are closed unions, but that's okay because you've literally enumerated every possible one

Richard Feldman (Oct 06 2024 at 16:48):

then as the platform author you write wrappers around these that just expose the specific errors that can happen for a particular operation (again, setting aside ergonomics - this would at least work)

Brendan Hansknecht (Oct 06 2024 at 16:48):

What happens when a user wants to wrap or modify an error? Task.mapErr ExtraContext?

Richard Feldman (Oct 06 2024 at 16:48):

and those are open unions, and those are what get exposed to application authors

Richard Feldman (Oct 06 2024 at 16:49):

application author experience is unchanged here

Richard Feldman (Oct 06 2024 at 16:49):

this is an extra step for the platform author to take for the sake of layout correctness in the host

Brendan Hansknecht (Oct 06 2024 at 16:49):

I'm not sure I follow this? How are we opening the union such that the application author experience is unchanged?

Richard Feldman (Oct 06 2024 at 16:50):

the platform author is using Task.mapErr

Brendan Hansknecht (Oct 06 2024 at 16:50):

Ok, then why do we need one giant Error type?

Richard Feldman (Oct 06 2024 at 16:50):

for layout reasons

Richard Feldman (Oct 06 2024 at 16:51):

for a hosted function, the host needs to know statically what the layout of that tag union is

Richard Feldman (Oct 06 2024 at 16:51):

in order to know what the layout of that Result is

Richard Feldman (Oct 06 2024 at 16:51):

and that's only knowable statically if the tag union is closed

Richard Feldman (Oct 06 2024 at 16:51):

(and if any closures inside it are boxed, which is already a rule we separately need)
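
As a rough illustration of why the Result layout depends on the error union being closed, the Rust sketch below compares the size of a Result-like type for a closed error union against one that has picked up an extra tag with a payload. HostResult, ClosedErr, and WidenedErr are hypothetical stand-ins, and Rust's repr(u8) enum layout is used here only as an analogy for a tagged union, not as a description of Roc's actual layout rules.

```rust
use std::mem::size_of;

// The error union the host was compiled against.
#[allow(dead_code)]
#[repr(u8)]
enum ClosedErr {
    NotFound,
    Denied,
}

// The same union after an application added a tag carrying a payload.
#[allow(dead_code)]
#[repr(u8)]
enum WidenedErr {
    NotFound,
    Denied,
    Other([u8; 24]),
}

// A tagged union standing in for the Result returned across the boundary.
#[allow(dead_code)]
#[repr(u8)]
enum HostResult<T, E> {
    Err(E),
    Ok(T),
}

fn closed_size() -> usize {
    size_of::<HostResult<u64, ClosedErr>>()
}

fn widened_size() -> usize {
    size_of::<HostResult<u64, WidenedErr>>()
}

fn main() {
    println!(
        "closed: {} bytes, widened: {} bytes",
        closed_size(),
        widened_size()
    );
}
```

A host compiled against the smaller size would reserve too little space for the widened value, so the size has to be fixed before the host is built.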

Brendan Hansknecht (Oct 06 2024 at 16:52):

Task.mapErr will deal with any layout issues by mapping to an open union.

task1 : Str -> Task {} [SomeErr1]
task2 : Str -> Task {} [SomeErr2]

exposedTask1 = \str -> Task.mapErr (task1 str) \SomeErr1 -> SomeErr1
exposedTask2 = \str -> Task.mapErr (task2 str) \SomeErr2 -> SomeErr2

Richard Feldman (Oct 06 2024 at 16:52):

oh, I see

Richard Feldman (Oct 06 2024 at 16:52):

sure, that would do the same thing

Brendan Hansknecht (Oct 06 2024 at 16:54):

But yeah, ignoring ergonomics, I think we just need to ban all type variables. This includes the type variable in open tags.

Brendan Hansknecht (Oct 06 2024 at 16:55):

** and ban unboxed closures as you mentioned

Richard Feldman (Oct 06 2024 at 16:55):

well named type variables are useful

Richard Feldman (Oct 06 2024 at 16:55):

but those need to compile to opaque pointers

Richard Feldman (Oct 06 2024 at 16:56):

like that's what we should be doing instead of Model ideally

Brendan Hansknecht (Oct 06 2024 at 16:56):

Ah, yeah, you have to special case some type variables, but those have to be in a container.

Brendan Hansknecht (Oct 06 2024 at 16:57):

Box model, List model, etc.

Brendan Hansknecht (Oct 06 2024 at 16:57):

And this is due to the container having a known layout.

Brendan Hansknecht (Oct 06 2024 at 16:57):

But you can't do Result Str err
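
The container point can be sketched in Rust, using Box and Vec as rough analogues of Roc's Box and List (an assumption made for illustration; the Roc layouts themselves are not shown here). The helper names are hypothetical. A container's own size does not depend on the element type, while a Result holding the value inline changes size with it.

```rust
use std::mem::size_of;

// Size of a heap pointer to T: independent of T for sized types.
fn box_size<T>() -> usize {
    size_of::<Box<T>>()
}

// Size of a growable list header: independent of the element type.
fn vec_size<T>() -> usize {
    size_of::<Vec<T>>()
}

// Size of a Result that stores the error inline: depends on E.
fn result_size<E>() -> usize {
    size_of::<Result<String, E>>()
}

fn main() {
    println!("Box:    {} vs {}", box_size::<u8>(), box_size::<[u8; 1024]>());
    println!("Vec:    {} vs {}", vec_size::<u8>(), vec_size::<[u8; 1024]>());
    println!("Result: {} vs {}", result_size::<u8>(), result_size::<[u8; 64]>());
}
```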

Richard Feldman (Oct 06 2024 at 16:58):

yeah I think that should be fine

Richard Feldman (Oct 06 2024 at 16:59):

like Box model instead of the current Model for initializing webserver global state...you'd want that to be heap-allocated anyway so it would be passed by pointer to all the different request handlers

Brendan Hansknecht (Oct 06 2024 at 17:00):

A larger record is passed by reference anyway. So Model is still better.

Brendan Hansknecht (Oct 06 2024 at 17:00):

No need to keep unboxing and reboxing

Brendan Hansknecht (Oct 06 2024 at 17:00):

But yeah Box model is simpler

Brendan Hansknecht (Oct 06 2024 at 17:02):

Cause Box model in roc probably leads to a copy. It will be copied to the stack so it can be used as model without the Box in the roc application.

Brendan Hansknecht (Oct 06 2024 at 17:05):

We can fix this probably with a builtin though. Something that keeps the box alive, but for large records that will be passed by pointer anyway, will just pass in the pointer to the box.

Richard Feldman (Oct 06 2024 at 17:05):

random thought, but we could have a Box.leak : a -> Box a which gives you...

Richard Feldman (Oct 06 2024 at 17:05):

yeah, something like that haha

Brendan Hansknecht (Oct 06 2024 at 17:06):

I wonder how often large records in roc are mutated in place vs make a new copy.....

Brendan Hansknecht (Oct 06 2024 at 17:07):

Anyway, this is a far side tangent at this point.

Brendan Hansknecht (Oct 06 2024 at 17:09):

For the original topic of this thread:

It sounds like purity inference is actually pretty close. So we should just wait for that. On top of that, we have some known changes around restricting types that get passed to/from the host. That will fix any of these annoying bugs, but will not fix the ergonomics (luckily, the ergonomics are only a platform problem and shouldn't generally affect applications if platforms are implemented well).

Richard Feldman (Oct 06 2024 at 17:10):

agreed!

Richard Feldman (Oct 06 2024 at 17:11):

we could do the "make all the unions in hosted types be closed automatically" change anytime

Last updated: Jul 26 2025 at 12:14 UTC