Grokking platforms · beginners · Zulip Chat Archive

Stream: beginners

Topic: Grokking platforms

Erwin Kuhn (Dec 09 2021 at 17:40):

Here's a thread to gather my questions as I attempt to write a "platform author starter pack" (#ideas > Platform author starter pack). Hopefully it will be helpful to others as well.

To kick things off: is it possible to export a host function from a file other than the main host? It would be nice to have a separation between the necessary low-level functions like roc_alloc and the effects written by platform author.

Right now, I'm trying in Rust, based on the CLI platform: I create an effects.rs module alongside lib.rs, move roc_fx_getLine and roc_fx_putLine there and add mod effects to lib.rs. The example fails with an ld error:

ld: /tmp/roc_appDgDJun.o: in function `roc_fx_getLine_fastcc_wrapper':
builtins-host:(.text+0x1765): undefined reference to `roc_fx_getLine'

I think the functions are exported on the Rust side: if I leave the original functions in lib.rs, the Rust compiler tells me there's a name conflict. However, I inspected the symbols of host.o with readelf and could not find roc_fx_getLine when I moved it into a separate module.

I'm no expert on the Rust build process, so I may be doing something incorrectly there!

The same question would also be interesting for C & Zig hosts

Erwin Kuhn (Dec 09 2021 at 17:43):

Sample code:

// platform/src/effects.rs
use roc_std::RocStr;

#[no_mangle]
pub extern "C" fn roc_fx_getLine() -> RocStr {
    use std::io::{self, BufRead};

    let stdin = io::stdin();
    let line1 = stdin.lock().lines().next().unwrap().unwrap();

    RocStr::from_slice(line1.as_bytes())
}
// roc_fx_putLine below

Brendan Hansknecht (Dec 09 2021 at 17:46):

To kick things off: is it possible to export a host function from a file other than the main host? It would be nice to have a separation between the necessary low-level functions like roc_alloc and the effects written by platform author.

Yes. They just use the C FFI, so all that matters is the name. The linker doesn't care where a symbol came from as long as the name is right. You will probably just need to import them in the main file or something similar to make sure Rust/Zig/etc. properly expose them.

Brendan Hansknecht (Dec 09 2021 at 17:47):

Need to look into your exact issue to remember what is missing on the rust side to make sure that is exposed.

Brendan Hansknecht (Dec 09 2021 at 17:54):

Looks to be that rust is dead code eliminating them

Brendan Hansknecht (Dec 09 2021 at 18:03):

for example, adding this to the main file:

pub fn force_import() {
    effects::roc_fx_putLine(effects::roc_fx_getLine());
}

will fix it.

Brendan Hansknecht (Dec 09 2021 at 18:03):

I am not actually sure the correct rust solution....

Erwin Kuhn (Dec 09 2021 at 18:05):

Yes, you're right, that works!

Erwin Kuhn (Dec 09 2021 at 18:11):

Trying to find a cleaner way to do this: it seems like Rust is eliminating the use of the effects module itself, since no_mangle should otherwise prevent dead code elimination for the functions.

Brendan Hansknecht (Dec 09 2021 at 18:13):

Yes. Looks like you only need to use one of the functions in the module and it will load them all
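A hedged sketch of the same workaround that references the module without calling the effects (calling roc_fx_getLine would block on stdin). The effect name roc_fx_demo_double and the usize return value are hypothetical, and whether merely taking a function's address is enough for Roc's build, rather than calling it as in the snippet above, is an assumption that has not been checked here.

```rust
// lib.rs (sketch): effects live in their own module, like effects.rs above.
mod effects {
    // Hypothetical effect. A real one would be named after the Roc effect
    // (e.g. roc_fx_getLine) and would use roc_std types.
    #[no_mangle]
    pub extern "C" fn roc_fx_demo_double(x: i32) -> i32 {
        x * 2
    }
}

/// Reference one item from `effects` in the crate root so the module is
/// reachable from here. Only the function's address is taken; nothing is
/// called.
pub fn force_import() -> usize {
    let f: extern "C" fn(i32) -> i32 = effects::roc_fx_demo_double;
    f as usize
}
```

If this turns out not to be enough, the calling version shown earlier in the thread is the one that was confirmed to work.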

Erwin Kuhn (Dec 09 2021 at 23:28):

Does anyone know more about the build process for platforms? I tried reproducing the issue in a minimal Rust library and the compiler correctly keeps externally exported functions from another module, even if nothing from that module is used in the main code

Erwin Kuhn (Dec 09 2021 at 23:30):

So I suspect there must be something within Roc's build process that leads to that problem. Especially since the linker error is triggered from some temporary roc_appXXX.o object file, created during the process

Erwin Kuhn (Dec 10 2021 at 01:09):

Brendan Hansknecht (Dec 10 2021 at 01:32):

Don't have time to write out a full answer rn, but this link may help. It is where we call cargo:
https://github.com/rtfeldman/roc/blob/aab601366ec33affc888c6992209cb028b2c52d1/compiler/build/src/link.rs#L446

Brendan Hansknecht (Dec 10 2021 at 04:12):

So I suspect there must be something within Roc's build process that leads to that problem. Especially since the linker error is triggered from some temporary roc_appXXX.o object file, created during the process

This is totally possible, but the error shouldn't really relate to roc_appXXX.o. Essentially, Rust is building a static lib. We are building an object file that depends on the lib. It is failing when we try to link. That means that the Rust static lib is not exporting the symbol we expect to link against.

Brendan Hansknecht (Dec 10 2021 at 04:23):

In Rust hosts, any exposed function that takes a RocStr as argument has to call core::mem::forget on it, to "not mess with the ref count". I guess this is to avoid Rust automatically dropping the string and freeing its memory? Will there be a way to avoid doing this?

I think that we should be able to modify the drop function and some ownership pieces to fix this, but I am not completely sure. With the current setup, Rust thinks that it is taking ownership of the string we passed in while Roc thinks that it still has ownership. Either we need to increase the refcount before calling a Rust function, or we need Rust to not think it owns (and needs to free) the data. Should definitely be possible to clean up, but needs to be looked into more. Specifically around FFI and ownership.

In the False interpreter example, when the host opens a file, it wants to return a BufReader<File> to Roc code. To do so, the BufReader is boxed and a U64 is returned to Roc code, then passed back to the Rust host to read bytes.
Is this an idiomatic pattern for platform authors: return pointers as opaque U64 handles from the host, and wrap them in a nice abstraction in the platform Roc code?

I would guess that this will become an idiomatic pattern. For complex types that Roc will never use directly, it is simply cleaner to just pass a pointer around. It also stops Roc from messing with the type at all. That being said, if we wanted Roc to be able to mess with the type, it would likely just be passed as a struct/record. I think it will be use case dependent, but the opaque type is really ergonomic when done right.
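A minimal sketch of the opaque-handle pattern described above, under some assumptions: the function names are hypothetical, an in-memory Cursor stands in for the File used by the False example so the snippet is self-contained, and pointers are assumed to fit in a u64.

```rust
use std::io::{BufReader, Cursor, Read};

/// Hypothetical host-side resource. The False example boxes a
/// BufReader<File>; a Cursor keeps this sketch free of file I/O.
type Reader = BufReader<Cursor<Vec<u8>>>;

/// Box the reader and hand its address out as an opaque u64 handle.
pub fn open_reader(contents: &[u8]) -> u64 {
    let reader: Reader = BufReader::new(Cursor::new(contents.to_vec()));
    Box::into_raw(Box::new(reader)) as u64
}

/// Read one byte through the handle. Returns -1 at end of input or on error.
///
/// # Safety
/// `handle` must come from `open_reader` and must not have been closed.
pub unsafe fn read_byte(handle: u64) -> i32 {
    let reader = unsafe { &mut *(handle as *mut Reader) };
    let mut buf = [0u8; 1];
    match reader.read(&mut buf) {
        Ok(1) => i32::from(buf[0]),
        _ => -1,
    }
}

/// Turn the handle back into a Box so the reader is dropped and freed.
///
/// # Safety
/// `handle` must come from `open_reader` and must be closed only once.
pub unsafe fn close_reader(handle: u64) {
    drop(unsafe { Box::from_raw(handle as *mut Reader) });
}
```

On the Roc side the U64 would stay hidden behind a platform-provided type, so app authors never see the raw handle.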

What is the fx keyword in Roc code exactly? Currently, I've seen the use of fx.Effect when defining effects in the host package config, as well as mainForHost : SomeType as Fx

Not sure I have a good answer. It is our type for managing side effects and purity. Basically it wraps host-related interactions to ensure purity in Roc. @Richard Feldman or @Folkert de Vries might have a better technical answer to this.

What does the first pair of braces stand for in the requires line in Package-Config.roc?
In the only example I've found where it's used (benchmarks example), the generated mainForHost has a different name roc__mainForHost_1_exposed_generic - is that related?

Don't know. Again, probably @Richard Feldman or @Folkert de Vries know.
Though I think I know a bit about roc__mainForHost_1_exposed_generic: it is related to the types being passed by the function and whether it needs to be generic for returning more complex types. In this case specifically, I think it returns a closure that could have dynamic size.

Brendan Hansknecht (Dec 10 2021 at 04:23):

Erwin Kuhn said:

Does anyone know more about the build process for platforms? I tried reproducing the issue in a minimal Rust library and the compiler correctly keeps externally exported functions from another module, even if nothing from that module is used in the main code

Can you share this code so I can double check some things?

Richard Feldman (Dec 10 2021 at 13:03):

In Rust hosts, any exposed function that takes a RocStr as argument has to call core::mem::forget on it, to "not mess with the ref count". I guess this is to avoid Rust automatically dropping the string and freeing its memory? Will there be a way to avoid doing this?

I think we can avoid this by having all extern host functions take a ManuallyDrop<RocStr> (or ManuallyDrop<RocList<Whatever>> etc.) - that will automatically tell Rust not to invoke drop on it.

If we get in the habit of doing that, then hopefully it will look weird to see an extern function in a host that doesn't have ManuallyDrop around all of its (non-Copy) arguments, which would be less error-prone than calling forget I suspect.
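To make the difference concrete, here is a small sketch comparing the three signatures. Tracked is a hypothetical stand-in for RocStr (any non-Copy type with a Drop impl would do), and the DROPS counter exists only so the effect can be observed.

```rust
use std::mem::{self, ManuallyDrop};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times `Tracked::drop` has run.
pub static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for RocStr: a non-Copy type with a Drop impl.
#[repr(C)]
pub struct Tracked {
    pub id: u32,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Plain by-value argument: Rust drops `s` when the function returns.
pub extern "C" fn takes_owned(s: Tracked) -> u32 {
    s.id
}

/// Current style: the author has to remember to call `forget`.
pub extern "C" fn takes_and_forgets(s: Tracked) -> u32 {
    let id = s.id;
    mem::forget(s);
    id
}

/// Proposed style: the signature itself says "do not drop".
pub extern "C" fn takes_manually_drop(s: ManuallyDrop<Tracked>) -> u32 {
    s.id
}
```

Only takes_owned bumps the counter. The other two leave the value alone, but with ManuallyDrop the intent is visible in the signature instead of being buried in the body.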

Erwin Kuhn (Dec 10 2021 at 14:11):

Regarding RocStr etc...: couldn't we build the manual drop into the type, to ensure ref count safety? If I'm not mistaken, once you wrap something into ManuallyDrop, Rust hands it off completely.

Here's a simple example that builds a slice of custom structs, while taking them out of the Rust drop order: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=ed86e540e0a17e1aa8b78dfe1da076c0

Folkert de Vries (Dec 10 2021 at 14:21):

well the "automatic drop" variant is also nice sometimes, e.g. in our code gen tests

Erwin Kuhn (Dec 10 2021 at 14:24):

I think it could be interesting to improve the platform authoring experience in Rust though, as long as it doesn't cause problems elsewhere

Erwin Kuhn (Dec 10 2021 at 14:24):

(but I think roc_std is only meant for platform code right?)

Erwin Kuhn (Dec 10 2021 at 14:28):

Brendan Hansknecht said:

Can you share this code so I can double check some things?

Sure! I'll fork the repo, so that you can plug into the CLI example right away. I'll be able to do that in a few hours

Richard Feldman (Dec 10 2021 at 14:44):

one thing I'm not sure about is whether platform authors might end up wanting to use roc_std data structures as intermediate values, in which case they'd want Drop to work as normal.

Richard Feldman (Dec 10 2021 at 14:45):

it hasn't come up yet, but maybe it will? I'm really not sure!

Erwin Kuhn (Dec 11 2021 at 00:10):

Yes, I assumed types like RocStr or RocList were meant purely for interoperability when getting / returning values from Roc -- but that may be only a subset of the goal of roc_std!

Erwin Kuhn (Dec 11 2021 at 00:17):

My thinking was along the lines of: would it be possible for platform authors to not have to think about how Roc talks to their host language (at least not too much)?

Brendan Hansknecht (Dec 11 2021 at 00:18):

To be fair, this is mostly a rust problem. Wouldn't be an issue in zig or c/c++

Brendan Hansknecht (Dec 11 2021 at 00:18):

Since rust cares about ownership and such.

Brendan Hansknecht (Dec 11 2021 at 00:19):

But maybe I am wrong

Erwin Kuhn (Dec 11 2021 at 00:19):

Yes I think that's totally fair!

Erwin Kuhn (Dec 11 2021 at 00:19):

I'm more used to Rust, hence my questions around it

Brendan Hansknecht (Dec 11 2021 at 00:20):

Totally fair

Lucas Rosa (Dec 11 2021 at 00:21):

I think in zig you can just make an extern struct for most of that

Lucas Rosa (Dec 11 2021 at 00:22):

it seems the platform author definitely needs to be aware of that, it's slightly the point I think. For app authors not at all

Lucas Rosa (Dec 11 2021 at 00:23):

but I might have misunderstood the question

Lucas Rosa (Dec 11 2021 at 00:24):

by "that" I meant, the general low-level shape of RocList and RocStr vs. [1, 2, 3] and "some str"

Brendan Hansknecht (Dec 11 2021 at 00:24):

I think it would be a good general guide (at least for now) to have all of the rust platforms use ManuallyDrop<T> for all of the _fx_ functions in order to remove the forgets and such. It would be a reasonable standard that makes the ownership clear.

Brendan Hansknecht (Dec 11 2021 at 00:25):

Not sure the case for data passed into roc.

Lucas Rosa (Dec 11 2021 at 00:25):

do we plan on having like packages for Rust/C/Zig for these data types that platform authors can just import and use?

Erwin Kuhn (Dec 11 2021 at 00:26):

I think that's the goal of roc_std in Rust, right? Mostly due to the trickeries of Rust's ownership model, but still

Lucas Rosa (Dec 11 2021 at 00:26):

and maybe also like convenient functions for some ops on them

Brendan Hansknecht (Dec 11 2021 at 00:26):

roc_std was made because the compiler is written in rust

Brendan Hansknecht (Dec 11 2021 at 00:26):

Other languages, I think the plan was to not directly support

Lucas Rosa (Dec 11 2021 at 00:27):

right, I'm imagining a world where the compiler doesn't execute the platforms build command which is where we are heading eventually

Brendan Hansknecht (Dec 11 2021 at 00:27):

As in, probably someone who knows roc really well will maintain the version for c or zig or rust, but they will not be 100% roc official. That plan may change in the future

Lucas Rosa (Dec 11 2021 at 00:28):

fair enough, too many langs to maintain them all

Brendan Hansknecht (Dec 11 2021 at 00:28):

I would guess roc will officially support a small subset and contrib of some sort will support other languages

Lucas Rosa (Dec 11 2021 at 00:28):

but we could have a rust one as a base for people to reference when doing it for other lang platforms

Lucas Rosa (Dec 11 2021 at 00:29):

it was really cool to see it done in swift and then have it run on iOS a few weeks ago

Lucas Rosa (Dec 11 2021 at 00:30):

there's Odin, Nim, D, C++ etc. so no way we would have the capacity to maintain all those

Lucas Rosa (Dec 11 2021 at 00:31):

maybe if there was a spec that we could output to json and then other people could code gen that stuff

Brendan Hansknecht (Dec 11 2021 at 00:35):

So question for @Richard Feldman or @Folkert de Vries: Does Roc essentially assume that anything passed into an _fx_ function will never be freed? Like is it guaranteed that Roc will later decrement the refcount for that and free it?

Lucas Rosa (Dec 11 2021 at 00:36):

I think so

Lucas Rosa (Dec 11 2021 at 00:36):

oh well actually, would that ruin referential transparency?

Lucas Rosa (Dec 11 2021 at 00:37):

or at least how does that affect the in-place optimizations?

Lucas Rosa (Dec 11 2021 at 00:38):

wait nvm, I read what you said again, I think my questions don't make sense

Erwin Kuhn (Dec 11 2021 at 00:39):

Mmh, host code can retain a reference to some list, right? I think Roc doesn't increase the ref count when passing the list to host code, so it could end up doing in-place mutations of that list

Lucas Rosa (Dec 11 2021 at 00:40):

that's what I'm wondering, I haven't touched any ref counting code, it's the part I know the least about

Erwin Kuhn (Dec 11 2021 at 00:45):

Actually, the problem of ManuallyDrop in Rust is different from ref counting if I understand it correctly (it's about freeing memory), so let me take back that assumption :sweat_smile:

Brendan Hansknecht (Dec 11 2021 at 00:51):

yeah, this is a weird set of tradeoffs:

current method (don't touch refcount):

Definitely faster. The refcount is in memory and slow to load/update
host can safely retain a copy of a list by updating the refcount while running the function
host can just look at the refcount. If it is 1, do in-place mutations
Some locations require ManuallyDrop

option 2 (always increment refcount when passing to host if referenced again in roc):

probably safer and likely less bug prone in most host languages. (most languages have some form of scope based lifetimes and drop/destructors/etc)
Requires touching the refcount in more locations, which is slow, but probably not a big deal since most locations will likely touch in memory data anyway.
In place mutation gets more confusing on the host. If the refcount is 2, that is the host and a location in the app. Should be safe. If the refcount is 1, that would mean that the host is expected to mutate in place and then return a new list/str/etc.

Brendan Hansknecht (Dec 11 2021 at 00:52):

Also ManuallyDrop will correctly deal with reference counting/intentionally avoid it. The drop function of RocStr checks the reference count before potentially calling dealloc.
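A toy illustration of that interaction, not roc_std's actual implementation: ToyStr is hypothetical, and its count and "freed" flag live outside the value purely so they can be inspected after a drop.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

/// Toy stand-in for a refcounted Roc value. `count` and `freed` are kept
/// outside the value so the example can observe them after a drop.
pub struct ToyStr<'a> {
    pub count: &'a Cell<usize>,
    pub freed: &'a Cell<bool>,
}

impl Drop for ToyStr<'_> {
    fn drop(&mut self) {
        // Check the count first: free only if this was the last reference,
        // otherwise just decrement.
        if self.count.get() <= 1 {
            self.count.set(0);
            self.freed.set(true);
        } else {
            self.count.set(self.count.get() - 1);
        }
    }
}

/// Host function taking the value directly: Drop runs when it returns.
pub fn host_by_value(_s: ToyStr<'_>) {}

/// Host function that opts out of Drop: count and memory are left alone.
pub fn host_manually_drop(_s: ManuallyDrop<ToyStr<'_>>) {}
```

With a count of 1 held by Roc, host_by_value ends up "freeing" data Roc still expects to own, while host_manually_drop leaves both the count and the data untouched.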

Erwin Kuhn (Dec 11 2021 at 01:01):

So currently, the host cannot distinguish between 1) the object still has 1 reference in Roc and 2) the object has no more reference in Roc, after being passed to the host?

Brendan Hansknecht (Dec 11 2021 at 01:03):

Currently Roc always has 1 or more references.

Erwin Kuhn (Dec 11 2021 at 01:17):

In that case, if Roc passes an object to the host, that is never referenced again in Roc afterwards, the reference count will go to 0 at the end of the scope containing the host call and the object will be cleaned up?

Brendan Hansknecht (Dec 11 2021 at 01:22):

That sounds correct to me.

Brendan Hansknecht (Dec 11 2021 at 01:23):

It actually will likely be the next statement after the call to the host. That statement will likely be to decrement the refcount(and maybe free)

Erwin Kuhn (Dec 11 2021 at 02:02):

OK so the Drop implementation for RocStr is never meant to be run in Rust host code then, since it either decrements the ref count or frees the memory, which interferes with Roc's memory management

Erwin Kuhn (Dec 11 2021 at 02:03):

Would it be interesting to have a custom data type for objects passed from Roc to the host, that exposes the ref count to platform authors? Ex: allow incrementing the ref count, to signal to Roc that you're keeping a reference within the host

Erwin Kuhn (Dec 11 2021 at 02:03):

That would encode the contract of "we don't increase the ref count when passing something to the host"
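One way such a type could look, as a rough sketch only: ToyAlloc, FromRoc and Retained are hypothetical names, the count is a plain Cell rather than Roc's real refcount encoding, and Rust lifetimes stand in for a contract that a real FFI boundary could not check.

```rust
use std::cell::Cell;

/// Toy shared allocation: only the count matters for this sketch.
pub struct ToyAlloc {
    pub refcount: Cell<usize>,
}

/// What the host receives from Roc. It has no Drop impl, so the count never
/// changes implicitly.
pub struct FromRoc<'a> {
    alloc: &'a ToyAlloc,
}

/// What the host gets when it explicitly keeps a reference.
pub struct Retained<'a> {
    alloc: &'a ToyAlloc,
}

impl<'a> FromRoc<'a> {
    pub fn new(alloc: &'a ToyAlloc) -> Self {
        FromRoc { alloc }
    }

    /// True when nobody else can observe the data.
    pub fn is_unique(&self) -> bool {
        self.alloc.refcount.get() == 1
    }

    /// Signal that the host keeps a reference: increment the count.
    pub fn retain(&self) -> Retained<'a> {
        self.alloc.refcount.set(self.alloc.refcount.get() + 1);
        Retained { alloc: self.alloc }
    }
}

impl Drop for Retained<'_> {
    /// Giving up a retained reference decrements the count again.
    fn drop(&mut self) {
        self.alloc.refcount.set(self.alloc.refcount.get() - 1);
    }
}
```

Dropping a FromRoc does nothing, while every retain is paired with a decrement when the Retained value goes away.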

Lucas Rosa (Dec 11 2021 at 02:38):

that could be tricky because we can’t necessarily assume a particular language for the platform. Rust isn’t necessarily first class for platform building just the C ABI is

Brendan Hansknecht (Dec 11 2021 at 02:42):

I think this question would just be for the rust platform, and it is really a question of what the semantics should be for roc_std. So I think it is a valid question and doable. I mean the refcount is already there and being accessed by the current forms of RocStr and etc.

Erwin Kuhn (Dec 11 2021 at 09:29):

Putting Rust aside, anyone writing a platform will have to work with the ref count, if they want to keep a reference to some object within the host, or perform in-place mutations for performance, right?

Erwin Kuhn (Dec 11 2021 at 09:35):

While in the long-run, Roc aims to be agnostic to the platform language and those types should be part of an external roc_std for each lang, would it be interesting to start providing those for, say, C, Rust and Zig, to start experimenting and see if those abstractions feel right for platform authors?

Erwin Kuhn (Dec 11 2021 at 09:38):

It may also be helpful for discussions around the design of platforms, since it's been mentioned that that part of the language will likely evolve a lot

Erwin Kuhn (Dec 11 2021 at 10:42):

Maybe the more general question is: how should platform authors think about Roc's reference counting?

Brendan Hansknecht (Dec 11 2021 at 18:03):

Erwin Kuhn said:

Putting Rust aside, anyone writing a platform will have to work with the ref count, if they want to keep a reference to some object within the host, or perform in-place mutations for performance, right?

The entire type is exposed. Exactly what Roc sees internally can be used externally. So I guess I was just trying to point out that this is a per language api question rather than a Roc design question. The design of roc types is pretty concrete at this point. Though we do have a few minor optimization changes to make.

would it be interesting to start providing those for, say, C, Rust and Zig, to start experimenting and see if those abstractions feel right for platform authors?

100%. Though would probably swap out C for C++ (personal preference and more automatic memory management). I overall think that it won't be too complex of a task to make these apis and make them function ergonomically. I think the rust roc_std is probably most of the way there for the types it exposes, but would be good to make a test platform in multiple languages that heavily exercises interactions with roc types and refcounting.

how should platform authors think about Roc's reference counting?

In the current state (to my understanding):

All roc interactions run in a single thread, so atomics are not needed. (multi-threading dealt with by the host)
The reference count (though in a weird format) is just a number stored with the memory to be freed.
Assuming the host doesn't change it, the number is the number of locations in roc that could view the data.
Roc will automatically increment and decrement it as needed based on scopes and references.
If Roc sees a refcount of 1, it is allowed to do in-place mutations.
When passing data to the host, the refcount is not changed (this may not be 100% true, it may follow the same rules as roc function calls and scoping).
If the host sees a refcount of 1, it is allowed to do in-place mutations.
If the host wants to keep a reference to the memory, it must increment the refcount (and decrement it later, maybe freeing data).
Otherwise, the host should just ignore the refcount.

I think that is the rough overview of the semantics from a host perspective.
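The host-side rules above can be sketched with a toy model. ToyList and the host_* functions are hypothetical, and a plain usize stands in for the real refcount, which is stored with the heap data in its own format.

```rust
/// Toy model of a refcounted Roc list. The real refcount lives next to the
/// heap data in its own encoding; a plain usize stands in for it here.
pub struct ToyList {
    pub refcount: usize,
    pub items: Vec<i64>,
}

/// Append in place when the list is unique (refcount 1). Otherwise leave the
/// shared list untouched and return a fresh copy with its own count of 1.
pub fn host_append(list: &mut ToyList, value: i64) -> Option<ToyList> {
    if list.refcount == 1 {
        list.items.push(value);
        None
    } else {
        let mut items = list.items.clone();
        items.push(value);
        Some(ToyList { refcount: 1, items })
    }
}

/// The host wants to keep a reference past the call: increment.
pub fn host_retain(list: &mut ToyList) {
    list.refcount += 1;
}

/// The host is done with its reference: decrement, and report whether the
/// data should now be freed.
pub fn host_release(list: &mut ToyList) -> bool {
    list.refcount -= 1;
    list.refcount == 0
}
```

A host that neither retains nor mutates never needs to call any of these, which matches the last rule: otherwise, just ignore the refcount.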

Erwin Kuhn (Dec 13 2021 at 18:54):

Thanks for all the clarifications @Brendan Hansknecht ! This helps a lot.

Based on this + some digging in the compiler, I think I have enough for a small write-up. You also gave me an idea for an alternative API to Roc strings & lists in Rust. I'll start playing around with those this week!

The result will likely be full of mistakes, but hopefully a good basis for further questions :big_smile:

Brendan Hansknecht (Dec 13 2021 at 19:12):

The result will likely be full of mistakes, but hopefully a good basis for further questions

Sounds like a good starting point for most projects.

Tankor Smash (Feb 09 2022 at 23:08):

@Erwin Kuhn did you post the writeup somewhere? it'd be fun to read

Last updated: Jul 26 2025 at 12:14 UTC