sending unsized values to the host · compiler development

Stream: compiler development

Topic: sending unsized values to the host

Richard Feldman (Aug 24 2024 at 23:31):

building on the discussion with @Ayaz Hafiz @Folkert de Vries from the meetup today - I think probably the way forward is the thing we talked about earlier, namely:

require that anything sent to the host has a size that the host can know statically
this means if you want to send a closure or type variable, the platform module has to explicitly Box it (and we just pay the perf penalty for now, since obviously boxing needs to be supported regardless)
you get a compile-time error if you're trying to send anything with statically unknown size (statically unknown to the host, that is - e.g. a closure or type variable, or a data structure that contains either of those)

Richard Feldman (Aug 24 2024 at 23:33):

assuming that's the design, then I think it should be possible for the platform to say to the application that the application needs to provide something that looks like this:

{
    init : Task state [],
    handleRequest : state, Request -> Task Response [],
}

Richard Feldman (Aug 24 2024 at 23:33):

and I thought about how unification could work and I think it might actually be kinda straightforward

Richard Feldman (Aug 24 2024 at 23:34):

basically in constraint gen, create a Function constraint where :point_up: is the argument type - e.g. pretend the function has this type:

{
    init : Task state [],
    handleRequest : state, Request -> Task Response [],
} -> {}

Richard Feldman (Aug 24 2024 at 23:34):

and then create another constraint as if we're calling the function passing main as the argument

Richard Feldman (Aug 24 2024 at 23:35):

so if I have a main with this type:

{
    init : Task MySharedServerState [],
    handleRequest : MySharedServerState, Request -> Task Response [],
}

...it should successfully unify with that "argument"

Richard Feldman (Aug 24 2024 at 23:38):

then from there we could add constraints similar to the ones we have today for the app-defined concrete types (e.g. Model) except that they're set to be Equal to the type variables in this "function"

Richard Feldman (Aug 24 2024 at 23:38):

so then after solving finishes, we have resolved Variables for the type variables, just like we do today

Richard Feldman (Aug 24 2024 at 23:38):

but without the application author having to declare them in a special way

Richard Feldman (Aug 24 2024 at 23:38):

and then from there everything can proceed the same way it does today

Richard Feldman (Aug 24 2024 at 23:39):

unless I'm missing something, it seems like that should work - but of course I could always be missing something! :big_smile:

Brendan Hansknecht (Aug 25 2024 at 00:01):

I don't follow the type checking part. What is the the function solving? Type checking for basic webserver works today and it uses those types.

Brendan Hansknecht (Aug 25 2024 at 00:03):

It has an explicit function to go from state to Box state for sending to the platform.

Brendan Hansknecht (Aug 25 2024 at 00:04):

Also, boxing is fine, but I prefer the current system of telling the platform the size of things (sadly doesn't really work with nested data).... So boxing is probably more reasonable I guess.

Ayaz Hafiz (Aug 25 2024 at 00:09):

this all sounds good but i don’t follow how it solves the problem of different value locations in structs? unless you’re not intending to solve that here

Richard Feldman (Aug 25 2024 at 00:12):

Ayaz Hafiz said:

this all sounds good but i don’t follow how it solves the problem of different value locations in structs? unless you’re not intending to solve that here

basically disallow anything that would move around (so, unboxed closures and unboxed type variables)

Richard Feldman (Aug 25 2024 at 00:13):

if they have to be in a Box, then the host statically knows how big that is

Richard Feldman (Aug 25 2024 at 00:13):

and that's recursively true, e.g. you can't send anything that contains an unboxed closure or type variable

Richard Feldman (Aug 25 2024 at 00:14):

so unless I'm missing something, that means any struct that goes to the host will have a fixed field order that never varies by application

Richard Feldman (Aug 25 2024 at 00:14):

and it's up to the platform to figure out how to make that happen

Richard Feldman (Aug 25 2024 at 00:14):

the idea being that long-term we at least need this to work, so for now we can make it a restriction and make it work

Richard Feldman (Aug 25 2024 at 00:14):

and then in the future if we find a way to relax the restriction, then cool

Richard Feldman (Aug 25 2024 at 00:15):

maybe it ends up being important, maybe this restriction is fine indefinitely, but either way it's a path forward that unblocks effect interpreters among other things :big_smile:

Richard Feldman (Aug 25 2024 at 00:18):

Brendan Hansknecht said:

I don't follow the type checking part. What is the the function solving? Type checking for basic webserver works today and it uses those types.
It has an explicit function to go from state to Box state for sending to the platform.

sorry, relevant distinction from the conversation that I didn't say explicitly here - the idea is that instead of having this concept of Model - that is, a special concrete type that the platform specifies which the application has to provide - instead the platform asks the application for literally a record with lowercase type variables like state in it

Richard Feldman (Aug 25 2024 at 00:21):

the Model design is more learning curve for application authors (compared to the docs being literally "application author, give me a normal record of this type and we're all set") and it blocks a future design where we just have one entrypoint and the app doesn't have to specify what it's sending the platform (which currently is needed to tell the platform about the app's Model type)

Luke Boswell (Aug 25 2024 at 00:29):

I know it's a little off topic, but is this a big change? Any chance we could roll it into Builtin Task change? that would save another round of releases and that's the only change that requires coordination with the platforms I'm tracking for a while.

Richard Feldman (Aug 25 2024 at 00:31):

I think this should be separate

Richard Feldman (Aug 25 2024 at 00:32):

it might not end up being a big code change, but the design space is a minefield and we've had plenty of "go partway down the road before realizing it's not going to work out...and then start over" in this area :big_smile:

Brendan Hansknecht (Aug 25 2024 at 00:32):

Oh, got it... Ok

Luke Boswell (Aug 25 2024 at 00:34):

No worries, I'm just thinking about the work to coordinate releases and upgrade platforms. It's not a problem, just thought it might be easy to include it while we are at it.

Brendan Hansknecht (Aug 25 2024 at 00:34):

Aside, I was originally thinking this was related to the bug in contributing with tags passed to the host expanding

Richard Feldman (Aug 25 2024 at 00:37):

ah no, unrelated!

Ayaz Hafiz (Aug 25 2024 at 05:36):

I think there are still a couple unresolved things here. One is that the state type is unsized until the application is compiled, right? So unless that is boxed as well, you need to somehow manage potential field reorders. Another is that in order to do this, you must make sure that any reachable values (closures, etc) are boxed - and so you need to perform an analysis somehow to determine how "far" in the program you must box values accordingly.

Richard Feldman (Aug 25 2024 at 10:53):

yeah I agree with both points: state would have to be boxed, and we'd need to do some special pass over the final type of the platform-host boundary to generate that error - but since it would just be for one type signature in the entire program, I think it should be pretty quick?

Ayaz Hafiz (Aug 25 2024 at 15:51):

well it's not just one final pass, you must make sure that everything in the program that can reach a value exposed to the host is correctly boxed, right?

Ayaz Hafiz (Aug 25 2024 at 15:52):

or else have some layer between the app and the host that walks the type and boxes it appropriately

Brendan Hansknecht (Aug 25 2024 at 17:47):

I feel like we can just ban the dynamically sized types in the platform api and force the platform author to deal with it as they see fit. Avoid requiring some sort of automatic boxing pass and either have the platform author expose the Box to the end users or box it themselves

Brendan Hansknecht (Aug 25 2024 at 17:48):

In most cases, it will just be shallow boxing of a single type

Brendan Hansknecht (Aug 25 2024 at 17:52):

For:

{
    init : Task state [],
    handleRequest : state, Request -> Task Response [],
}

If we allow exposing more than one value to the platform, the final api would be:

init : Task (Box state) []

handleRequest : (Box state), Request -> Task Response []

:point_up: requires being able to expose multiple values to the host, but it is by far the simplest host api and we should definitely support it in the long term.

Sadly, today, we can only expose a single value to the host.
So today, that api would need to be:

{
    init : Task (Box state) [],
    handleRequest : Box (Box state, Request -> Task Response []),
}

Brendan Hansknecht (Aug 25 2024 at 17:54):

Oh, though, I think we also need to box all of the tasks cause they are technically functions:

So the required api today to avoid all unsized types going to the platform would be

{
    init : Box (Task (Box state) []),
    handleRequest : Box (Box state, Request -> Box (Task Response [])),
}

To get this api, we would just ban unboxed functions and unboxed type variables from being passed to the platform api.

Ayaz Hafiz (Aug 25 2024 at 18:17):

sure, but that’s not a trivial task

Ayaz Hafiz (Aug 25 2024 at 18:17):

like you need to make sure that every value that is potentially reachable is boxed

Brendan Hansknecht (Aug 25 2024 at 18:22):

Is this only a hard problem cause types aren't required to be specified? Like if we require a type for the platform api, doesn't this just become a single traversal of the type?

Brendan Hansknecht (Aug 25 2024 at 18:24):

I'm probably thinking too naively, but I think we could require that the platform api is type in a way such that this becomes relatively easy to deal with.

Ayaz Hafiz (Aug 25 2024 at 21:37):

if you have some closure f deep down in your call stack that makes its way into a data structure given to the host, you need to know to somehow box that closure. for example

hostExposed = { value: 1, f: makeF {} }

makeF = if Bool.true then make1 {} else make2 {}

make1 = \{} ->
  v = 1
  \u -> u + v

make2 = \u -> u

here you have to know that make1 needs to be boxed, or somehow box it before you give it to the host

Richard Feldman (Aug 25 2024 at 22:24):

so the algorithm I was thinking of wouldn't allow functions passed to the host to return unboxed functions.

specifically, the idea would be that in this design the platform sends a total of exactly 1 value to the host (so, "one entrypoint") but of course it can be a record, tuple, etc. That 1 value's type may contain only the following (after expanding type aliases):

Builtin opaque types (Str, List, Box, numbers, etc.) - if any userspace opaque type is encountered, it's an error.
Collections of any of these allowed types (e.g. records, tuples, lists, etc.) with the exception of [], which is disallowed based on other discussions
Lowercase type variables if and only if they are inside a Box
Functions, with the following additional restrictions:
- the function may only accept and return one of these allowed types
- each of its arguments and return types is checked to make sure that if it contains a function anywhere inside its type, that function is inside a Box

Richard Feldman (Aug 25 2024 at 22:25):

so in this design, the above example would give an error because makeF returns an unboxed function

Richard Feldman (Aug 25 2024 at 22:27):

edit: the following is incorrect; I missed something, which Brendan pointed out below

note that it's drawing a distinction between whether a function happens to close over anything (since that can only be knowable after the application has been compiled, and the whole point of this is to get a layout that's statically knowable regardless of what the application's code is), but rather it's crudely assuming that:

if it's a top-level function (that is, provided directly in the record or something; not returned by another function) then of course it wouldn't have been able to capture anything
all other functions might capture, so require that they are boxed

Richard Feldman (Aug 25 2024 at 22:28):

I think this could be done in one pass because the recursive "verify that this type is allowed" function can take one argument for whether any encountered functions must be boxed, and then when you recurse on function args and return value, you just set that to true

Brendan Hansknecht (Aug 25 2024 at 22:33):

Top level can still capture in roc:

x =
  myCapture = callSomething 123
  \abc ->
       ...

Brendan Hansknecht (Aug 25 2024 at 22:33):

x is a top level function with a capture. Thus needs to be boxed

Richard Feldman (Aug 25 2024 at 22:34):

ahh good point, I forgot about that

Richard Feldman (Aug 25 2024 at 22:35):

I guess it's fine if all functions have to be boxed since the host will just hold onto those forever anyway

Ayaz Hafiz (Aug 26 2024 at 00:05):

so you have to annotate makeF as returning a box explicitly?

Richard Feldman (Aug 26 2024 at 00:10):

not just annotate it, literally use the Box builtin to wrap it! :big_smile:

Brendan Hansknecht (Aug 26 2024 at 00:21):

Basically, roc enforces the types, but the platform author has to actually ensure things are boxed explicitly.

So like the platform might accept a makeF from the app, but it would expose a wrapped version to the underlying host:

hostExposed : { value: I64, f: Box (I64 -> I64) }
hostExposed = { value: appExposed.value, f: Box.box appExposed.f }

appExposed : { value: I64, f: I64 -> I64 }
appExposed = { value: 1, f: makeF {} }

makeF : {} -> (I64 -> I64)
makeF = \{} -> if Bool.true then make1 {} else make2

make1 : {} -> (I64 -> I64)
make1 = \{} ->
  v = 1
  \u -> u + v

make2 : I64 -> I64
make2 = \u -> u

Ayaz Hafiz (Aug 26 2024 at 00:31):

i wonder if there’s a simpler solution for the developer than making them write a function that walks the data structure and maps the values to boxed values. if the app has a host-opaque type then you have to always box that on the app side somehow

Brendan Hansknecht (Aug 26 2024 at 00:38):

I think for most constructions the amount of nesting here will be pretty shallow. So the required boxing will be minimal and easy to do in the platform. No need to do it on the app side I think.

Brendan Hansknecht (Aug 26 2024 at 00:39):

Like going from this app api:

{
    init : Task state [],
    handleRequest : state, Request -> Task Response [],
}

To this platform host api:

{
    init : Box (Task (Box state) []),
    handleRequest : Box (Box state, Request -> Box (Task Response [])),
}

is not too bad.

Brendan Hansknecht (Aug 26 2024 at 00:39):

hostExposed = {
    init: Box.box (app.init |> Task.map Box.box),
    handleRequest:
        Box.box \boxedState, request ->
            Box.unbox boxedState
            |> app.handleRequest request
            |> Box.box
}

Brendan Hansknecht (Aug 26 2024 at 00:43):

So none of the boxing is exposed to the app author at all.

Brendan Hansknecht (Aug 26 2024 at 00:44):

That said, for some complex tree based apis like an action state setup, probably would want to expose boxing to the application author. Otherwise, the entire data structure before being passed to the platform has to be walked with many pieces getting boxed.

Richard Feldman (Aug 26 2024 at 00:48):

so my reason for liking this proposed design - for now, at least - is that:

no matter what we do in the future, all of this has to work at a minimum - e.g. there isn't a conceivable design where you couldn't send boxed functions to the host - so there's no wasted effort in getting it working in this way
assuming my proposed algorithm is as straightforward to implement as it seems to be, it doesn't seem to be too much effort to get working
it would unblock effect interpreters as well as letting us remove the Model concept (which would unblock other things)
in the future, we can always relax the restrictions as nonbreaking changes

Brendan Hansknecht (Aug 26 2024 at 00:52):

it would unblock effect interpreters

That's awesome! Is this something that could be tested today?

Richard Feldman (Aug 26 2024 at 01:16):

Folkert and I tried to get it working, but ran into errors related to lambda sets - which Ayaz's proof-of-concept (which I believe Sam has either started implementing in the compiler or is planning to soon!) should solve

Sam Mohr (Aug 26 2024 at 06:23):

Richard Feldman said:

Folkert and I tried to get it working, but ran into errors related to lambda sets - which Ayaz's proof-of-concept (which I believe Sam has either started implementing in the compiler or is planning to soon!) should solve

I'm still reading through our current impl and comparing to Ayaz's new proof of concept. I'll be making a big push over labor day weekend since I have no plans the whole weekend!

Oskar Hahn (Sep 05 2024 at 16:16):

I am not sure, if I understand this new concept. @Agus Zubiaga explained here that a Model would be a type variable. It would be possible to write something like updateModel : model -> Task model [].

Is this correct?

How would this typecheck? For example if you have something like:

Model : U64
updateModel : model -> Task model []

How would it be possible, that updateModel "foo" would throw a type error?

My current problem is, that I want to do something like

Model: a
updateModel : Model-> Task {} []

There is currently no syntax in roc to express that. Will that be possible with this change? (It does not matter for me, if I have to use a Box).

Brendan Hansknecht (Sep 05 2024 at 18:41):

To my understanding, the goal is to allow this in the platform api with the host:

updateModel : Box model -> Task (Box model) []

Both type variables and functions will be allowed to be passed to the host as along as they are boxed.

Brendan Hansknecht (Sep 05 2024 at 18:42):

You then could expose something like this to the app:

update : model -> Task model []

Your platform updateModel function would just deal with all of the boxing.

Brendan Hansknecht (Sep 05 2024 at 18:43):

Whatever the user picks for the model would just work in this case.

Brendan Hansknecht (Sep 05 2024 at 18:43):

If you need multiple functions with the same model type variable, you would need to use a tuple or record to link the type variables together.

Brendan Hansknecht (Sep 05 2024 at 18:44):

For example. You might expose this to the app:

Funcs model: {
    init: {} -> Task model []
    update : model -> Task model []
}

Brendan Hansknecht (Sep 05 2024 at 18:45):

The paired host type with boxing would be:

Funcs model: {
    init: Box ({} -> Task (Box model) [])
    update : Box (Box model -> Task (Box model) [])
}

All the boxing avoids the host dealing with any unsized types.

Brendan Hansknecht (Sep 05 2024 at 18:45):

Does that make sense @Oskar Hahn?

Oskar Hahn (Sep 05 2024 at 19:25):

Ahh. Thank you. That makes sens. I think I missed the part, that this is only possible in the platform. (Of cause it is)

Is it correct, that with this change, Tasks can be defined anywhere in the platform, not only in the hosted module? So the hosted modules will not be needed anymore?

And what do you think is the timeframe? More like builtin tasks (around halve a year) or more like effect interpreter (around two years).

Brendan Hansknecht (Sep 05 2024 at 21:11):

Tasks can be defined anywhere in the platform, not only in the hosted module?

I think hosted will still be needed. I think removing it would be an orthogonal problem to this.

And what do you think is the timeframe? More like builtin tasks (around halve a year) or more like effect interpreter (around two years).

I would assume more like builtin task, but I'm really not sure.

Last updated: Jul 26 2025 at 12:14 UTC