I thought of a future way we could reduce the size of binaries like hello world on basic-cli:
the basic idea is that we have roc generate a function which runs the effect interpreter switch for the host.
So for example, we say:
That is no different than what we have today with the legacy linker from a function garbage collection standpoint.
sure, but it would work with the effect interpreter
which the current effect interpreter design would not :big_smile:
There are a few core missing pieces:
Oh wow that's a great idea :light_bulb:
I'm assuming someday we could make the surgical linker do this DCE
doesn't seem that tricky, right? Write down all the symbols we define, go through and see which ones actually get used; all the ones that never get used are dead and can be eliminated
(maybe we'd only want to do that in --optimize)
And the transitive calls
:thinking: transitive calls?
I don't think it would work. Generally that information is gone in the final executable. So even though we know the locations of a few symbols, we don't really have the information to remove them (at a minimum, it is very hard to shift anything around because all relocations in the exe are already resolved). On top of that, in any system where we are having the host expose symbols to a shared library, we are already doing tricks just to get the host to keep the symbols around. So they will all look like they are being used by the host itself anyway.
Also, if roc builds the state machine, suddenly you lose the ability to make it an async rust state machine which is one of the main gains of the new system.
I implemented this for Wasm. I trace the full call graph of what is used and eliminate everything else.
You have to do the full call graph because functions call other functions. Starting with what you want to keep ends up being more efficient I think.
The tricky part is indirect calls. If you have any of those, it's tricky to be confident whether they're called or not.
And I believe we are planning to use those for closures.
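The call-graph tracing described above can be sketched roughly as follows. This is a minimal illustration, not the actual Wasm implementation: the `Function` struct, the `address_taken` flag, and `live_functions` are all hypothetical names. It starts from the functions we want to keep, follows direct calls transitively, and conservatively keeps anything whose address is taken, since an indirect call might reach it.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical, simplified view of one function in a linked module.
pub struct Function {
    /// Names of the functions this one calls directly.
    pub direct_calls: Vec<&'static str>,
    /// True if the function's address is taken somewhere (e.g. for a closure),
    /// so an indirect call could reach it.
    pub address_taken: bool,
}

/// Returns every function that must be kept, starting from `roots`.
/// Anything not in the returned set is dead and could be eliminated.
pub fn live_functions(
    module: &HashMap<&'static str, Function>,
    roots: &[&'static str],
) -> HashSet<&'static str> {
    let mut live: HashSet<&'static str> = HashSet::new();
    let mut queue: VecDeque<&'static str> = VecDeque::new();

    // Seed with the roots plus, conservatively, every address-taken function:
    // we cannot prove an indirect call never reaches those.
    let seeds = roots.iter().copied().chain(
        module
            .iter()
            .filter(|(_, f)| f.address_taken)
            .map(|(name, _)| *name),
    );
    for name in seeds {
        if module.contains_key(name) && live.insert(name) {
            queue.push_back(name);
        }
    }

    // Breadth-first walk over direct calls to collect transitive callees.
    while let Some(name) = queue.pop_front() {
        for &callee in &module[name].direct_calls {
            if module.contains_key(callee) && live.insert(callee) {
                queue.push_back(callee);
            }
        }
    }
    live
}
```

Treating every address-taken function as a root is the simplest safe choice; it is also why indirect calls hurt, because each one keeps its whole subtree of callees alive.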
Brendan Hansknecht said:
Also, if roc builds the state machine, suddenly you lose the ability to make it an async rust state machine which is one of the main gains of the new system.
I think that part can work fine - we can have roc expect the host-implemented effect function to receive a callback Roc closure and a pointer to the loop state (which the switch function would have received from the host, and can pass along), so it knows what callback to run once the async effect is done
You can't call an async function in a function called by roc code
Rust really needs the whole stack back down to the root to build it correctly
So rust has to own the switching function
oh you mean literally using Rust's async keyword
At least that is what I remember from really trying to make it work before with the current effect system.
as opposed to like directly using io_uring in a host that happens to be implemented in Rust
Being able to use hyper with nonblocking async http requests for example
Even if you can call into io_uring, you would have to make a blocking call to io_uring in the rust effect. There would be no way to let something else run on the same thread like you get with async rust
hm, I wonder if there's some way we could make the DCE happen
it feels like if we have the right information in the host and give the right app usage information to the surgical linker, with their powers combined it seems like it should be possible
like an alternative idea (that sounds kind of ridiculous, but potentially possible) is to do some partial interpreting of the actual instructions in the host, based on knowledge of what discriminants mainForHost could possibly return, and then eliminating branches which follow from jumps on comparisons that we know will never pass
that wouldn't solve the "these symbols look like they're used for other reasons" problems, but if those functions are only called from eliminated branches, then their usages go to zero and they can be eliminated
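As a toy model of that idea (not real instruction-level interpretation), suppose the host's switch has already been recovered as a list of arms, each keyed by a discriminant and listing the functions it calls. Everything here is a hypothetical sketch: `Arm`, `removable_after_pruning`, and the `other_uses` map (usages from outside the switch) are illustrative names only. Arms whose discriminant can never be returned are dropped, and a function becomes removable once its usage count reaches zero.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical model of one arm of the host's effect switch.
pub struct Arm {
    pub discriminant: u8,
    /// Functions called directly from this arm.
    pub calls: Vec<&'static str>,
}

/// Given the discriminants the app can actually return, drop the other arms
/// and report which functions are left with zero usages.
/// `other_uses` counts usages from outside the switch (assumed known).
pub fn removable_after_pruning(
    arms: &[Arm],
    possible: &BTreeSet<u8>,
    other_uses: &BTreeMap<&'static str, usize>,
) -> BTreeSet<&'static str> {
    let mut uses: BTreeMap<&'static str, usize> = BTreeMap::new();
    for arm in arms {
        let survives = possible.contains(&arm.discriminant);
        for &callee in &arm.calls {
            // Start each function's count from its usages outside the switch.
            let count = uses
                .entry(callee)
                .or_insert_with(|| other_uses.get(callee).copied().unwrap_or(0));
            // Only arms that can still be reached contribute a usage.
            if survives {
                *count += 1;
            }
        }
    }
    uses.into_iter()
        .filter(|(_, n)| *n == 0)
        .map(|(name, _)| name)
        .collect()
}
```

This only looks one level deep; a fuller version would feed the result into a transitive call-graph walk so that callees of the removed functions can go away too.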
Two thoughts:
maybe instead of writing our own full linker we could fork existing ones like lld (especially if we'd only be using them in --optimize)
That said, even if we have every single function in its own section, without relocation information, we have to disassemble the entire application and understand all indirect calls to have any hope of being able to remove code. On top of that, any resolved relocation will likely be a pain to move, so we will hit a wall in terms of trying to actually move around the code to shrink the binary.
I don't think there would be any indirect calls in this part of the program
Yeah, I don't think we can do this with the surgical linker. I think it really needs to be done with a standard linker at the cost of link time performance.
I don't think there would be any indirect calls in this part of the program
We have to get the full transitive list of calls and dependencies. For example, in the hyper case, we want to remove as much of tokio, the async stack, web stuff, tls, etc. as possible (of course we can't remove all of it).
hm yeah that's a good point
If we can't tell if a function is called by some random indirect call elsewhere in the app, we can't remove it.
even if you never do http, tokio has to spin up
because that's structurally part of the host program no matter what
yeah
I think if people want binary optimization that cares about a few megabytes, they have a number of options without us dealing with this:
Like today, we could have a small version of basic cli if we just make basic_cli_no_web, or basic_cli_ureq that forces always-blocking io and doesn't pull in an async runtime (apparently ureq isn't really smaller than tokio + hyper for this), or probably basic_cli_zig.
given feature flags are a thing in rust, it wouldn't be hard to cut multiple versions of basic cli that clip off the largest effects if wanted but still guarantee the exact same api for roc. Can even have a friendly panic message:
basic_cli_no_web does not support web requests, but you called an http effect.
Please switch to `basic_cli_with_web` if you want to use http effects
"link to basic_cli_with_web"
Of course, that doesn't have to be at runtime, basic-cli-no-web could cut a release that fully removes all things http even from the roc side and then when someone asks about http calls, we could point them to the other platform.
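A rough sketch of the runtime variant using Cargo feature flags might look like this. The `http` feature name, the `send_request` function, and the `NO_WEB_MESSAGE` constant are assumptions made for illustration, not basic-cli's actual code. The same function exists in both builds, so the API seen from roc stays identical; only the body changes.

```rust
/// Friendly message shown by the build that has no HTTP support.
pub const NO_WEB_MESSAGE: &str = "basic_cli_no_web does not support web requests, \
but you called an http effect.\n\
Please switch to `basic_cli_with_web` if you want to use http effects";

/// Build with the (assumed) `http` feature: the real effect would live here,
/// and this is the only build that pulls in the async runtime and HTTP client.
#[cfg(feature = "http")]
pub fn send_request(url: &str) -> Result<String, String> {
    // Placeholder: this sketch does not wire up an actual HTTP client.
    Err(format!("HTTP client not wired up in this sketch: {url}"))
}

/// Build without the `http` feature: same signature, so the platform API is
/// unchanged, but the heavy dependencies are never compiled in.
#[cfg(not(feature = "http"))]
pub fn send_request(_url: &str) -> Result<String, String> {
    panic!("{}", NO_WEB_MESSAGE);
}
```

The HTTP dependencies would also need to be marked `optional = true` in `Cargo.toml` and tied to the feature for the size savings to materialize.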
Last updated: Jun 16 2026 at 16:19 UTC