Brendan Hansknecht said:
Personally, if a user opts into it, I would prefer to just allow roc to load a shared library as a platform extension.
some random thoughts on this topic
supposing I'm an application author and I want to use BLAS/LAPACK, or sqlite, or something like that in my CLI app or web server...it seems fine for me to be able to say "hey I'm going to opt into bringing this into my code base, even though it's not coming from the platform and could potentially have security problems etc"
and really, at that point, if I'm just sort of trusting the dependency to "follow the rules" regarding memory (e.g. no copying like we talked about in #ideas > wasm Task-based ffi?) then I guess it's about the same to allow pure functions in FFI
in the sense that if I'm trusting this third-party code to run unrestricted in my own process, and then trusting the pointers it gives back, the consequences of what can go wrong are essentially unlimited, so "it said it was pure but it actually wasn't" is kind of a drop in the bucket at that point
obviously for something like this, platforms would have to be able to say that this sort of thing is or isn't allowed, e.g. so that safe-script can actually continue to be safe :stuck_out_tongue:
and plugins and wasm can still work, etc. etc.
one thing I really dislike about this area is that although it opens up some new and potentially positive user experiences, it will absolutely introduce new categories of negative user experiences
like today, if I get a segfault in my roc app, there are a grand total of two possible explanations:
same with weird memory corruption issues or anything like that
in this world, I now have N potential explanations, where N is the number of dependencies I have on third-party FFI code
and the more ergonomic it is to use those dependencies, the more they'll get used, the more those bugs will come up, and the harder it will be for people to track them down
And the less incentive to write nice Roc packages
right, that too
another problem with dylibs specifically is that the default distribution story is absolutely rancid
I think there is a nice balance with WASM. It's a great escape hatch for, I have a lot of expensive legacy code... I can now wrap it and use it. But I still get to work with the nice Roc ecosystem
And to do that with WASM, you need a shim, which requires some API design and thought into how to do it well
I still remember earlier in my career when I'd install a package and see linker errors when my program started up and having absolutely no idea how to solve it...so I'd hunt down some cursory explanation on the internet, and get it working on my computer, then it wouldn't work on my teammate's computer, then it wouldn't work on our server etc.
and then after getting all of that working, eventually there would be a server upgrade and it would stop working again
I just really do not want to...I'll use the word inflict because that's how I feel about it...that user experience on end users
obviously there are some great uses for dylibs in terms of loading plugins and so forth
but just for general like "I want access to this code that's written in another language" the user experience is so hostile
Roc's goal is to be high level... arbitrary FFI explodes the complexity out in other ways too. Does a library/package author now need to support all current and future roc targets? compiling to macos-aarch64, linux-aarch64 ...
I think it honestly might be best as an escape hatch that is not allowed in the package ecosystem.
Like you can go manually download the roc-tf-shim, compile it to a shared library and extend basic-cli with it, but you will never get any sort of shared-library-dependent code from the package ecosystem.
At that point, why not just fork the platform?
that's an option, but it creates an incentive to end up with READMEs like "you can't install this as a normal Roc package, so to install it, here's what you do: copy these .roc files, and then install this dylib..."
Like we need friction to the level of a user writing performance (or dependency) critical code that they might use once or twice.
If you need high performance and that level of flexibility, you can always make a custom platform
why not just fork the platform?
I think it is about the need for extensibility in various languages without forking
And a normal user could do
Like it basically allows for opt-in platform composability at a higher cost in setup and friction.
I'm confused though... I thought you suggested we can have these, but they can't be made into a package.
Or do you mean, they are explicitly not permitted on the centralised index?
Cause as a library author, I would love to make roc-ml that wraps some machine learning framework and anyone can use with any platform if they really need ml.
they are explicitly not permitted on the centralised index?
Yeah. Can't be on the central index and roc will never just load one. Always requires more specific opt in and setup.
So a user will be 100% clear what they are getting into
Brendan Hansknecht said:
they are explicitly not permitted on the centralised index?
Yeah. Can't be on the central index and roc will never just load one. Always requires more specific opt in and setup.
fwiw I know from Elm that if you do this, someone will build a competing centralized index that allows this
In a future world where WASM gets super powers like SIMD etc... could you make a roc-ml using WASM?
I think wasm already supports simd
Oh I see. You are talking about just wrapping an existing framework. It would have to already support WASM
Also talking about doing something effectful. An ml library will at a minimum use the gpu if one is available (arguably effectful, but for roc, definitely an effect).
setting aside other considerations for a moment, how would allocations work in that world?
like let's say my platform is nea
I bring in something that uses malloc for all of its allocations
how does that work?
It uses malloc and nea has fewer guarantees.
haha I think it means nea doesn't work anymore :sweat_smile:
Why?
Just set nea not to use 100% of ram and malloc can live in the last x%
Or it means that nea doesn't support these extensions. Which is also fine
yeah the whole point of nea is that each request gets a fixed amount of memory, and OOM can't affect the others etc.
yeah it probably wouldn't be able to support that
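(A minimal sketch of the "each request gets a fixed amount of memory" idea, in Python purely for illustration. `RequestArena` and its methods are hypothetical names, not nea's actual code.)

```python
class RequestArena:
    """Fixed-size bump allocator owned by one request (hypothetical, not nea's code)."""

    def __init__(self, budget_bytes: int) -> None:
        self._buf = bytearray(budget_bytes)  # the request's entire memory budget
        self._used = 0

    def alloc(self, size: int) -> memoryview:
        # Refuse anything that would exceed this request's budget.
        if size < 0 or self._used + size > len(self._buf):
            raise MemoryError("request budget exhausted")
        start = self._used
        self._used += size
        return memoryview(self._buf)[start:start + size]

    @property
    def remaining(self) -> int:
        return len(self._buf) - self._used


request_a = RequestArena(1024)
request_b = RequestArena(1024)

request_a.alloc(1000)
try:
    request_a.alloc(100)  # over budget: only request A fails
    a_failed = False
except MemoryError:
    a_failed = True

request_b.alloc(512)  # request B is unaffected by A running out
```

A dependency that calls malloc directly never goes through `alloc`, so the budget cannot account for it. That is the guarantee that would be lost.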
Doesn't WASM have the same problem though? Or would it be using an interpreter that mallocs from the platform just the same.
Yeah, I don't think this needs to work with every platform. As was mentioned before, any platform should be able to opt out.
Doesn't WASM have the same problem though?
No, we can force wasm through roc alloc. Or allocate a static buffer for it. That said, it might jump the minimum memory footprint of each request.
at this point though, what would be the advantage of making something more first-class compared to the status quo where #ideas > Shared Library FFI Packages is already possible?
Having roc generate the tasks and types for calls would be huge for reducing errors. Theoretically the shims could even get glue support the same as platforms. Also, it unties it from platforms which is nice. Oh, and libffi is a pain to work with, having a static api known at compile time is simply way nicer.
what if we made glue work for any module, not just platform modules?
like if you pass it an ordinary interface module, it works on whatever's exported
It would have to generate different code for ffi modules unless roc is dispatching the effects or somehow enriching the types the platform sends over.
Cause everything has extra indirection with ffi
but the glue script itself could take care of that, right?
at least with runtime ffi
as long as it has access to the roc types like normal
Just noting it is different from generic platform glue if using #ideas > Shared Library FFI Packages. If we supported it directly, it would be the same as normal platform glue.
hmm... This is probably gonna fall apart with effect interpreters. Cause you would need ffi calls to interact with the host state machine. Preferably they would be async somehow. Otherwise, they just block the async effect interpreter.
yeah that's part of why I like the idea of having this be more of a userspace thing that platform authors can implement (since they already can implement whatever they want)
like for example they can choose to open things in a subprocess with shared memory
to avoid blocking their own process's threads
or offer 2 different primitives, one of which runs in the current process and the other of which is in a separate process, etc.
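(A rough sketch of the "don't block your own threads" option, using Python's asyncio as a stand-in for an async host. `blocking_ffi_call` and `run_ffi_off_loop` are hypothetical names; a real platform would make the equivalent choice in its own language, and a subprocess with shared memory would be the heavier variant.)

```python
import asyncio
import time


def blocking_ffi_call(x: int) -> int:
    """Stand-in for a foreign call that blocks the calling thread."""
    time.sleep(0.05)
    return x * 2


async def run_ffi_off_loop(x: int) -> tuple[int, int]:
    loop = asyncio.get_running_loop()
    ticks = 0

    async def other_effects() -> None:
        # Other work the interpreter should keep making progress on.
        nonlocal ticks
        for _ in range(3):
            await asyncio.sleep(0.001)
            ticks += 1

    background = asyncio.create_task(other_effects())
    # Hand the blocking call to a worker thread instead of running it inline.
    result = await loop.run_in_executor(None, blocking_ffi_call, x)
    await background
    return result, ticks


result, ticks = asyncio.run(run_ffi_off_loop(21))
```

The event loop keeps servicing `other_effects` while the worker thread sits in the blocking call, which is the property an async effect interpreter would want to keep.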
If you're loading an arbitrary dylib, it could make a syscall too though, right? spawn children etc
yeah, that's expected.
The goal here is to be able to load a dll that can run compute on the gpu for example.
so then a generic library could exist that enables most platforms to be able to access gpu compute
That is at least my simple motivating example.
yeah, the relevant part to me is that this is not a new language feature or anything
it's just a thing that all platforms can innately do because they're written in languages that can open dylibs :big_smile:
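(For a concrete picture of "languages that can open dylibs", here is a small sketch using Python's ctypes, assuming a POSIX system where `ctypes.CDLL(None)` exposes the symbols already loaded into the process. A platform written in Rust, Zig, or C would use its own dlopen-style API instead.)

```python
import ctypes

# dlopen(NULL): symbols already loaded into this process, including libc on POSIX.
libc = ctypes.CDLL(None)

# Declare the foreign signature by hand; nothing checks it against the real one.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"roc")
```

The signature is asserted by the caller rather than verified, so a wrong declaration corrupts memory instead of failing to compile. That is the trust problem discussed above.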
I think it's a very relevant distinction if it's a first-class thing in the language compared to something that a given platform can choose to offer, or not, in userspace
I think the biggest pain is that dlls are much nicer to interact with if you know the api at compile time. Like even if we are using libffi, it would be preferable to send a roc type spec to the platform so it can set up the call properly without extra boxing or anything
for example, the fact that there are versions that are varying degrees of safe
it's already innately the case that you have to trust your platform
because they can do arbitrary code execution
Brendan Hansknecht said:
I think the biggest pain is that dlls are much nicer to interact with if you know the api at compile time. Like even if we are using libffi, it would be preferable to send a roc type spec to the platform so it can set up the call properly without extra boxing or anything
but can't glue do that already?
No, the platform is precompiled. It won't get ffi glue to understand the types.
hm, ok so what would a language feature version of this look like?
like let's say you want to wrap a ml library in a dylib and make it available to any application author whose platform offers support for running dylibs without safety checks
what would the language feature be that facilitates that?
So still with a platform specific set of ffi primitives and trying to add as little to the language as possible, but not make it painful to write the ml dylib
hm, so where would the foreign types be specified?
I think what would be wanted would be:
The ability to write a hosted module for dylibs that can do glue generation and can create a type spec for each generated function to pass to the platform.
I think the rest could be orchestrated in userland.
Oh, also would want a way to pass something dynamic to the platform that roc would guarantee matches the type spec.
Need to think about this more, but roughly something like that.
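(To make "a type spec for each generated function" plus "something dynamic that matches the type spec" a bit more concrete, a hedged Python sketch. `FnSpec`, `PRIMITIVES`, and `matches` are hypothetical names, and the runtime check only stands in for a guarantee that the roc compiler would make statically in the idea above.)

```python
from dataclasses import dataclass

# Hypothetical mapping from a few Roc-like type names to Python types.
PRIMITIVES = {"I64": int, "F64": float, "Str": str}


@dataclass(frozen=True)
class FnSpec:
    """Type spec for one foreign function, created once and sent to the platform."""
    symbol: str
    arg_types: tuple[str, ...]
    ret_type: str


def matches(spec: FnSpec, args: tuple) -> bool:
    """True if the dynamic args have exactly the shape the spec promises."""
    if len(args) != len(spec.arg_types):
        return False
    return all(type(v) is PRIMITIVES[t] for v, t in zip(args, spec.arg_types))


scale_spec = FnSpec(symbol="scale", arg_types=("F64", "I64"), ret_type="F64")
```

With this, `matches(scale_spec, (1.5, 2))` holds while `matches(scale_spec, (1, 2))` does not, so the platform could set up the call from the spec alone without inspecting boxed values.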
hm, yeah would be helpful to see a concrete design I think!
I'll think about it for sure.
There's a chance it could be done with the new encode and allowing glue to generate for a hosted file directly.
Brendan Hansknecht said:
I think it honestly might be best as an escape hatch that is not allowed in the package ecosystem.
this very much
Ok, so I have been looking at libffi more and thinking about this.
Note: I am thinking from an effect interpreter world view.
As mentioned above, libffi is really the only option for runtime configured ffi. For a good experience, we need to generate their primitives or something that can easily be translated into their primitives.
Fundamentally, libffi has two primitives:
1. void **, because it doesn't know any of the argument types (essentially a list of pointers to the arguments).
2. ffi_type, which is essentially an enum for primitive types. For more complex types, it is a nested structure where complex types reference the primitive types that build them up. This preferably is generated once per ffi function and then reused from that point forward.
The void ** for the args and a void * for the return type are needed on every single call.
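(A sketch of the two pieces, using Python's ctypes, which is itself built on libffi. The prototype is built once and reused, and each call gets a fresh array of untyped argument pointers. `native_add` is a stand-in for a symbol loaded from a dylib, and `call_via_arg_pointers` is a hypothetical helper, not a libffi function.)

```python
import ctypes

# Built once per foreign function and reused, like a prepared ffi_type description.
ADD_PROTO = ctypes.CFUNCTYPE(ctypes.c_int32, ctypes.c_int32, ctypes.c_int32)
ADD_ARG_TYPES = (ctypes.c_int32, ctypes.c_int32)

# Stand-in for a symbol from a dylib: a C-callable pointer backed by Python code.
native_add = ADD_PROTO(lambda a, b: a + b)


def call_via_arg_pointers(func, arg_types, arg_ptrs):
    """Per call: read each argument through its pointer, then invoke func."""
    args = [
        ctypes.cast(ptr, ctypes.POINTER(ty)).contents
        for ptr, ty in zip(arg_ptrs, arg_types)
    ]
    return func(*args)


a = ctypes.c_int32(2)
b = ctypes.c_int32(40)
# The void**-shaped part: one untyped pointer per argument.
arg_ptrs = (ctypes.c_void_p * 2)(ctypes.addressof(a), ctypes.addressof(b))
total = call_via_arg_pointers(native_add, ADD_ARG_TYPES, arg_ptrs)
```

The pointers carry no type information of their own. Only the separately supplied type description makes them readable, which is why that description has to come from somewhere trustworthy.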
None of these primitives can be generated directly in roc. So they must either be constructed at runtime by some sort of tagged structure passed to the host, be generated by some form of glue (only viable for ffi_type), or generated by the compiler itself.
What would the gold standard be? Not saying this should be what roc does, but I generally want to explore what would be the best experience while still constraining to going through the host for all effects. That way, for example, an async host can choose to run ffi calls on a blocking thread (I currently don't have any idea how we could make it work with async).
I think the best experience would be:
1. A hosted-ffi module type that supports glue. The glue generates the roc types, the effect function prototypes, and the ffi_type constant for the function (this could alternatively be generated directly in the roc binary when loading an ffi function, but it would have to be created by the compiler, not roc code).
2. hosted-ffi does not generate Task. Instead it generates FfiTask. FfiTask can be passed to the platform. When it is passed to the platform, it is passed as a tuple of args, a slot for the return type, and a void ** pointing to each individual arg in the tuple. The platform can pass that directly off to libffi and is free from any sort of complex type wrangling. Since the platform is in control, it can spawn it in a subprocess, thread, or just run it directly. After completion of ffi, control is returned to roc and roc will deal with cleaning up the args and moving the result type to the correct place.
Note on 3: With something like stored, where a function can load and store from the platform via a unique key, roc wouldn't need to be involved otherwise. That said, for stored to work, we need a way for an individual function to keep state of unique keys. I think that ffi state could easily become a big hassle to manage without some sort of builtin state management.
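(A hedged sketch of what the FfiTask layout from item 2 could look like, written with Python's ctypes so the memory layout is explicit. The struct names, fields, and the `scale` call are all hypothetical; nothing here is an existing Roc or libffi API.)

```python
import ctypes


class ScaleArgs(ctypes.Structure):
    """The 'tuple of args' for one hypothetical call: scale(factor, value)."""
    _fields_ = [("factor", ctypes.c_double), ("value", ctypes.c_double)]


class FfiTask(ctypes.Structure):
    """Hypothetical layout: args tuple, return slot, and void** into the tuple."""
    _fields_ = [
        ("args", ScaleArgs),
        ("ret", ctypes.c_double),
        ("arg_ptrs", ctypes.c_void_p * 2),
    ]


def make_task(factor: float, value: float) -> FfiTask:
    task = FfiTask(args=ScaleArgs(factor, value))
    base = ctypes.addressof(task) + FfiTask.args.offset
    # Point at each field of the tuple in place; nothing is boxed or copied.
    task.arg_ptrs[0] = base + ScaleArgs.factor.offset
    task.arg_ptrs[1] = base + ScaleArgs.value.offset
    return task


def platform_run(task: FfiTask) -> None:
    """Platform side: works from pointers and a return slot only."""
    def read(ptr: int) -> float:
        # Knowing the args are doubles stands in for the ffi_type description.
        return ctypes.cast(ptr, ctypes.POINTER(ctypes.c_double)).contents.value

    task.ret = read(task.arg_ptrs[0]) * read(task.arg_ptrs[1])


task = make_task(1.5, 4.0)
platform_run(task)
```

Because `arg_ptrs` points into the task itself, the task must not be moved or copied while the call is in flight. A real design would need to state who owns that memory and for how long.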
Obviously, we don't have to match gold, but I think it gives a good idea of the pieces required to make this nice. In the worst case, we have what I made for #ideas > Shared Library FFI Packages: everything is always boxed, the user app has to manage all ffi state, the ffi package author has to write their own glue, and no types are guaranteed to be set correctly.
We can pick some or all of the things above to make this picture nicer. I really think 1 and 2 would be huge. 1 makes it much more type safe and allows for easy ffi package dev. 2 enables fast and efficient ffi without a bunch of extra platform complexity. It also shouldn't be too hard to generate a tuple and some pointers. For 3, I'm sure there must be a smart way to manage state. I mean we have to solve fast state management in general for things like webservers. Maybe the state is pipelined around in a smart way. Maybe we allow for generation of unique keys and stored such that state can be local. Either way, I think some sort of solution will emerge.
Last updated: Jun 16 2026 at 16:19 UTC