hot loading · ideas · Zulip Chat Archive

Stream: ideas

Topic: hot loading

Richard Feldman (Dec 15 2023 at 20:40):

branching off from #ideas > server platform that can upgrade itself, I realized it's actually surprisingly conceptually straightforward to implement hot code loading in Roc today, and would continue to be so as long as we didn't introduce Stored (which I've recently been doubting we should introduce) or Random.generate (which I've also recently been doubting we should introduce)

Richard Feldman (Dec 15 2023 at 20:43):

without those two language features, all state lives in the host, and if a Roc application changes, then conceptually all that the host needs to do is:

1. have a pointer to mainForHost and always call mainForHost using that pointer
2. when a hot code load happens, dlopen a new Roc application (compiled as a dylib)
3. change the mainForHost pointer to point to the new one, done.

conceptually, that's all that would need to happen
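
A minimal host-side sketch of the three steps above, in C. The entry-point signature, the symbol name "mainForHost", and the stand-in `builtin_main_for_host` are assumptions made for illustration; a real platform defines its own ABI and symbol names.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical entry-point signature; a real platform defines its own. */
typedef int (*main_for_host_fn)(int);

/* Stand-in for the app code that is linked into the binary at build time. */
static int builtin_main_for_host(int input) { return input + 1; }

/* The host only ever calls the app through this pointer. */
static main_for_host_fn current_main = builtin_main_for_host;
static void *current_handle = NULL; /* NULL while no dylib has been loaded */

static void report_dl_error(void) {
    const char *msg = dlerror();
    fprintf(stderr, "hot load failed: %s\n", msg ? msg : "unknown error");
}

/* dlopen the new dylib, look up its entry point, then swap the pointer.
   Returns 0 on success, or -1 (leaving the old pointer in place) on failure. */
static int hot_load(const char *dylib_path) {
    void *handle = dlopen(dylib_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        report_dl_error();
        return -1;
    }
    /* "mainForHost" is an assumed symbol name for this sketch. */
    void *sym = dlsym(handle, "mainForHost");
    if (sym == NULL) {
        report_dl_error();
        dlclose(handle);
        return -1;
    }
    void *old_handle = current_handle;
    current_main = (main_for_host_fn)sym;
    current_handle = handle;
    if (old_handle != NULL) {
        dlclose(old_handle); /* unload the version that was just replaced */
    }
    return 0;
}
```

Every call site would then use `current_main(x)` rather than calling the app directly. This sketch is single-threaded; a host that calls the app from several threads would need an atomic pointer or a lock around the swap. On older glibc versions it may need to be linked with `-ldl`.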

Richard Feldman (Dec 15 2023 at 20:45):

hosts could also do specific things to make this more graceful, e.g. a webserver host could have a mainForHost pointer per request handler so that requests already in flight could keep using the old one, and then only new incoming requests would use the new one (and then reference-count the old one so it could be unloaded once it's done)
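
The reference-counting idea could look roughly like the sketch below. All names (`AppVersion`, `request_begin`, `install_version`, and so on) are hypothetical, and the `unloaded` flag stands in for the `dlclose` call a real host would make.

```c
#include <stddef.h>

typedef int (*main_for_host_fn)(int);

/* One loaded version of the app. */
typedef struct {
    main_for_host_fn entry; /* this version's mainForHost */
    int refcount;           /* 1 while current, plus 1 per in-flight request */
    int unloaded;           /* set once the last reference is dropped */
} AppVersion;

static AppVersion *current_version = NULL;

static void version_release(AppVersion *v) {
    if (--v->refcount == 0) {
        /* A real host would dlclose this version's handle here. */
        v->unloaded = 1;
    }
}

/* A request pins whichever version is current when it starts. */
static AppVersion *request_begin(void) {
    current_version->refcount++;
    return current_version;
}

static void request_end(AppVersion *pinned) {
    version_release(pinned);
}

/* Hot load: new requests see `next`. The old version drops its
   "is current" reference and unloads once its requests have drained. */
static void install_version(AppVersion *next) {
    AppVersion *old = current_version;
    next->refcount = 1;
    next->unloaded = 0;
    current_version = next;
    if (old != NULL) {
        version_release(old);
    }
}
```

A request handler would call `request_begin()` once, use the returned version's `entry` for the whole request, and call `request_end()` when done. The counters here are plain ints, so a multi-threaded server would need atomics or a lock.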

Richard Feldman (Dec 15 2023 at 20:49):

we could also have roc dev automatically open a socket (whose location could be exposed to the host via roc__hot_load_socket() or something), and then there could be a very simple protocol where the host listens for updates on that socket and then when the path to a new compiled app comes across the socket, the host loads it
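
One possible shape for the "very simple protocol", sketched in C: each update is a single newline-terminated path. The framing and the function name `read_update_path` are assumptions; the thread does not specify a wire format.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads one newline-terminated path from `fd` into `out`.
   Returns the path length on success, or -1 on a read error, on end of
   stream before a newline arrives, or if the path does not fit in `out`. */
static ssize_t read_update_path(int fd, char *out, size_t cap) {
    size_t len = 0;
    if (cap == 0) {
        return -1;
    }
    for (;;) {
        char c;
        ssize_t n = read(fd, &c, 1);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, retry */
            }
            return -1;
        }
        if (n == 0) {
            return -1; /* peer closed before sending a full line */
        }
        if (c == '\n') {
            out[len] = '\0';
            return (ssize_t)len;
        }
        if (len + 1 >= cap) {
            return -1; /* no room left for this byte plus the terminator */
        }
        out[len++] = c;
    }
}
```

The host would call this in a loop on the connected socket and pass each path it receives to its loader. Reading one byte at a time keeps the sketch short; a real host would likely buffer.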

Richard Feldman (Dec 15 2023 at 20:50):

that could work across processes, but also we could have an option for a TCP socket which accepts the entire byte contents of the new app, so that you could hot load directly into production if you wanted to

Richard Feldman (Dec 15 2023 at 20:51):

I think the only scenario where hot loading could fail would be if you had some type variable that went to the host (e.g. model for application state) and that changed between builds; at that point, the one the host has currently stored would be type-incompatible with the new one, so hot loading wouldn't be possible

Richard Feldman (Dec 15 2023 at 20:51):

so if you tried that, roc dev would have to give an error like "hot loading not possible because type variable foo changed" or whatever
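
To illustrate the compatibility check, here is a sketch that compares a fingerprint of the type the host is holding against the one in the new build. The `HostType` struct and the idea of a `layout_hash` are hypothetical; the thread only says that roc dev would report an error, not how it would detect the change.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical description of one type variable the app hands to the host. */
typedef struct {
    const char *type_var; /* name of the type variable, e.g. "model" */
    uint64_t layout_hash; /* assumed fingerprint of its concrete type */
} HostType;

/* Returns 1 if `incoming` may replace `running`. Otherwise returns 0 and
   writes an explanation into `err`. */
static int can_hot_load(const HostType *running, const HostType *incoming,
                        char *err, size_t err_cap) {
    if (running->layout_hash == incoming->layout_hash) {
        return 1;
    }
    snprintf(err, err_cap,
             "hot loading not possible because type variable %s changed",
             running->type_var);
    return 0;
}
```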

Richard Feldman (Dec 15 2023 at 20:52):

and of course if you actually changed the host it wouldn't be possible to hot load that in this idea

Richard Feldman (Dec 15 2023 at 20:52):

but this seems like a good argument against Stored in particular, because this strategy would not work anymore if the compiled app had global state in it

Brendan Hansknecht (Dec 15 2023 at 21:56):

I like this a lot

Richard Feldman (Dec 15 2023 at 21:57):

yeah a thing I really like about it is that we could end up with roc dev just doing all of this behind the scenes for you automatically

Richard Feldman (Dec 15 2023 at 21:58):

like you just get hot code loading on your web server, you make code changes and they're instantly live on your localhost

Richard Feldman (Dec 15 2023 at 21:58):

like you get in interpreted languages

Richard Feldman (Dec 15 2023 at 22:00):

except you also get type errors etc, because it's automatically doing a watch too

Brendan Hansknecht (Dec 15 2023 at 22:04):

Probably need to change how we package platforms. Probably want them to package the non-preprocessed host (it depends on a shared library and can be used with roc dev). Then after download, we preprocess the host and that would be used with any surgical linking flows

Brendan Hansknecht (Dec 15 2023 at 22:04):

To be fair, I think we want to make this change anyway. Cause it decouples packaging from surgical linker versions.

Richard Feldman (Dec 15 2023 at 22:05):

seems reasonable! :+1:

Richard Feldman (Dec 15 2023 at 22:06):

a thing that's cool about this design is that we can still output a static binary and hot load new app code at runtime

Richard Feldman (Dec 15 2023 at 22:06):

the original static app code just hangs around not being used

Richard Feldman (Dec 15 2023 at 22:07):

could theoretically try to get fancy and ship a dylib for the initial app, but the simplicity and convenience of a static binary outweighs being able to do one more dlclose by a lot imo :stuck_out_tongue:

Richard Feldman (Dec 15 2023 at 22:08):

a funny side benefit of this is that it helps resolve some other design questions (Stored and Random.generate) that I've been on the fence about

Richard Feldman (Dec 15 2023 at 22:09):

seems like if we're doing this then the bar for those to be worth it would be raised enough that we'd need more use cases to reconsider them, but knowing what we do today, we shouldn't do them

Brendan Hansknecht (Dec 15 2023 at 22:42):

Hmm...though not sure how hot code reloading would mix with the surgical linker in this case....hmm

Brendan Hansknecht (Dec 15 2023 at 22:43):

Cause surgical linker expects a builtin dynamic lib dependency, not a runtime dlopen

Brendan Hansknecht (Dec 15 2023 at 22:43):

I don't think we could surgically link a binary that uses dlopen

Brendan Hansknecht (Dec 15 2023 at 22:44):

And making platforms release two versions, one with dlopen, and one for surgical linking sounds like a hassle.

Brendan Hansknecht (Dec 15 2023 at 22:44):

Maybe we will need smarter patching of a shared library or something like that instead

Richard Feldman (Dec 15 2023 at 22:56):

the surgical linker would see the same thing as today

Richard Feldman (Dec 15 2023 at 22:56):

in this idea at least

Richard Feldman (Dec 15 2023 at 22:56):

the dlopen would be extra, in addition to the stuff we do today

Richard Feldman (Dec 15 2023 at 22:57):

so the original (static) mainForHost would always stay loaded, it would just stop getting called

Richard Feldman (Dec 15 2023 at 22:57):

and then all the subsequent dlopened mainForHosts would get unloaded whenever they got replaced by a newer one

Brendan Hansknecht (Dec 15 2023 at 23:23):

So the platform has an if on every call to roc? That switches between the dlopen and static version?

Brendan Hansknecht (Dec 15 2023 at 23:25):

Also, may need to use dlmopen to avoid symbol conflicts from multiple versions of the app being loaded at once, but that is minor.

Richard Feldman (Dec 15 2023 at 23:26):

no, it has a global function pointer that's initialized to &mainForHost

Richard Feldman (Dec 15 2023 at 23:34):

that was my thinking at least :big_smile:

Brendan Hansknecht (Dec 15 2023 at 23:39):

Oh fair

Richard Feldman (Dec 16 2023 at 01:26):

Brendan Hansknecht said:

Also, may need to use dlmopen to avoid symbol conflicts from multiple versions of the app being loaded at once, but that is minor.

hm that's a good point - dlmopen is only available on Linux, so might need to generate unique symbols when hot loading so normal dlopen works :big_smile:
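
A sketch of what "unique symbols" could mean on the host side: each hot-loaded build exports its entry point under a name that includes a build number, and the host computes that name before looking it up. The `mainForHost_<n>` naming scheme is purely an assumption for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the per-build symbol name (e.g. "mainForHost_42") into `out`.
   Returns 0 on success, or -1 if the name does not fit in `cap` bytes. */
static int versioned_symbol(char *out, size_t cap, unsigned build_id) {
    int n = snprintf(out, cap, "mainForHost_%u", build_id);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

The host would pass the resulting name to `dlsym` on the handle returned by a plain `dlopen`, so no two loaded builds export the same symbol.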

Kevin Gillette (Dec 23 2023 at 00:37):

Richard Feldman said:

that could work across processes, but also we could have an option for a TCP socket which accepts the entire byte contents of the new app, so that you could hot load directly into production if you wanted to

That part is the thing I think is risky from a security perspective (code injection in production). It's one thing to give users the tools to piece this together themselves, but if we offer this as a core tooling feature specifically mentioning production, I think we would need to offer a bit to conveniently gate that access, such as code signing and such

Richard Feldman (Dec 23 2023 at 00:38):

:thinking: how would code signing work in this context?

Kevin Gillette (Dec 23 2023 at 00:40):

No idea! That's part of my point :wink: There are definite and profound security implications, so it'd be important to design in security from the start

Kevin Gillette (Dec 23 2023 at 00:41):

At the basic level, scp'ing a binary to the server is a lot more secure than accepting arbitrary code over tcp

Kevin Gillette (Dec 23 2023 at 00:42):

code signing could provide at least proof-of-author

Alexander Pyattaev (Jan 05 2024 at 09:11):

We have tested all of that in the plugins experiment, and it seems to work out. The key showstoppers there are that glue cannot produce bindings for dynamic runtime linking at the moment, and Roc "libraries" cannot expose more than one function to the platform without crazy contortions with a function returning a record of functions.
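
For readers unfamiliar with the "record of functions" workaround, this is roughly how it looks from the host's side in C: the library exposes a single function, and that function returns a struct of function pointers. The names `AppExports`, `app_exports`, `init`, and `update` are hypothetical and are not taken from the plugins experiment.

```c
/* The "record of functions" as the host would see it. */
typedef struct {
    int (*init)(void);
    int (*update)(int state, int event);
} AppExports;

/* Stand-ins for functions implemented inside the library. */
static int app_init(void) { return 0; }
static int app_update(int state, int event) { return state + event; }

/* The single function the library exposes to the platform. */
AppExports app_exports(void) {
    AppExports exports = { app_init, app_update };
    return exports;
}
```

The host resolves only `app_exports`, calls it once, and reaches everything else through the returned struct, for example `exports.update(exports.init(), 5)`.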

Last updated: Jul 23 2026 at 13:15 UTC