Stream: ideas

Topic: wasm Task-based ffi?


view this post on Zulip Richard Feldman (Jul 31 2024 at 00:17):

something that just occurred to me: I can't think of a reason why Roc couldn't support a Task-based WebAssembly FFI :thinking:

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:18):

for example (just making things up here) we could have a module type called wasm and in its header it specifies a .wasm file which it wraps

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:22):

Like instead of requiring a zig/c/rust shim from js to roc?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:23):

basically as a way to take platform-agnostic code that's written in another language and call it from Roc applications without having to get the platform involved, or use dylibs

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:23):

and share them in packages etc.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:24):

there might be security concerns I'm missing though, e.g. around memory access

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:25):

So is the external library we are wanting compiled and packaged into a WASM library?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:25):

yeah

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:26):

SIMDjson might be an interesting example of that

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:26):

And then platforms can receive Tasks from roc saying "load this 'someCbutNowWasm.wasm' library, and call X passing Y"

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:26):

yeah something like that

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:26):

or just like function pointers

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:27):

hm, it might be too difficult to validate things though

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:27):

I don't see why this wouldn't work

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:27):

like if it says it returns a string, verifying that it's valid UTF-8
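A minimal sketch of that kind of check, written in Python rather than Roc and not tied to any real wasm runtime. It assumes the guest hands back an (offset, length) pair into its linear memory; `read_guest_string` is a hypothetical helper name.

```python
def read_guest_string(memory: bytearray, offset: int, length: int) -> str:
    """Copy `length` bytes out of guest memory and validate them as UTF-8."""
    # Reject ranges that fall outside the guest's memory.
    if offset < 0 or length < 0 or offset + length > len(memory):
        raise ValueError("guest returned an out-of-bounds string")
    # Copy first, so the validated bytes cannot change afterwards.
    raw = bytes(memory[offset:offset + length])
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as err:
        raise ValueError("guest returned invalid UTF-8") from err
```

The copy happens before validation on purpose: validating bytes that the guest can still modify would prove nothing.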

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:27):

or that reference counts were done correctly

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:27):

If we are just passing standard Roc Types back and forth across the host boundary, and these are translated into types WASM understands

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:28):

yeah I'm just thinking about what a malicious actor could do in the package ecosystem

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:28):

Or I guess you could have WASM-enabled hosts that support roc packages which include WASM binaries

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:29):

like if it's all .roc files, there are certain exploits that aren't possible, so the question becomes - if there are now .wasm files too, is there some way we can maintain that guarantee?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:29):

that you can install and run any roc package and it can't access other parts of the process's memory space, for example

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:29):

The host/platform still controls everything at that boundary... so unless things can bust out of WASM runtimes I don't see how this could be an issue

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:30):

^^ that being said, I'm no security expert

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:30):

Just early thoughts about this

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:33):

Oh, this is for adding tasks to a package while trying to avoid adding general ffi to roc

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:33):

yeah

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:34):

I assume they'd need to be tasks, but maybe that's not a correct assumption either :laughing:

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:34):

So this would enable someone to write some C, compile it to wasm, and then call it as wasm

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:34):

right

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:35):

And I assume this is freestanding, not wasi? So no file io or anything?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:35):

kinda - I think something like WIT where the wasm file says "here are the operations I require"

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:36):

which we could then wrap as module params

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:36):

and they'd slot in nicely

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:36):

e.g. if a wasm file says "I need to be able to write to a file" then that operation can be provided in the normal Roc way (module params) just like anything else
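A rough Python sketch of that idea, with the guest modeled as a plain function. All names here (`instantiate`, `guest_save_report`, `write_file`) are hypothetical; this is not Roc's module-params syntax or a real wasm API. The point is only that the guest sees exactly the operations it declared, and nothing else the caller happens to have.

```python
def instantiate(declared, provided):
    """Build the import table the guest will see: exactly what it declared."""
    missing = set(declared) - set(provided)
    if missing:
        raise KeyError(f"missing required operations: {sorted(missing)}")
    # Anything extra in `provided` is deliberately not passed through.
    return {name: provided[name] for name in declared}


def guest_save_report(imports, text):
    """Stand-in for a wasm export that needs the ability to write a file."""
    imports["write_file"]("report.txt", text.encode("utf-8"))


# The caller decides what "write a file" means; here it is an in-memory dict.
written = {}
imports = instantiate(
    {"write_file"},
    {
        "write_file": lambda path, data: written.__setitem__(path, data),
        "delete_file": lambda path: None,  # available to the caller, never exposed
    },
)
guest_save_report(imports, "hello")
```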

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:37):

If they can do arbitrary system ffi via wasi (even if explicitly specified via wit), why limit to wasm?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:37):

hm, I don't understand the question :sweat_smile:

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:37):

limit to wasm as opposed to what?

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:37):

Feels like a case where we should just allow wrapping a native dynamic library as tasks without any interaction with the platform.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:38):

wasm can't do syscalls

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:38):

that's the only reason it's a possibility haha

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:38):

"wasm can't do syscalls"

That's exactly what wasi is for?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:38):

ok I misunderstood earlier then

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:38):

And wit is for giving access to specific parts of wasi.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:39):

I guess I mean "freestanding wasm" and not "freestanding wasi"

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:39):

like the type of wasm you could run in a browser, where you have to provide it with everything and it doesn't know how to do anything natively

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:39):

other than basic CPU operations and memory stuff

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:39):

Ok yeah, that makes more sense. Was very confused by the mention of wit where you can do package wasi:filesystem;.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:40):

yeah sorry, I meant something like WIT but not exactly that

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:40):

The package author includes a WIT file describing the interface for their WASM module, and we may be able to use that to generate the interface on the Roc side?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:40):

yeah something like that

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:41):

Also, yes, would still need to be tasks. Cause wasm can use globals among other things that could be very unsafe in roc.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:41):

yeah for sure if it uses stateful things like globals

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:42):

but I could imagine a scenario where you basically call the wasm function in its own isolated sandbox (e.g. give it its own memory arena and don't let it see anything else) and then don't maintain any state in between invocations
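A small Python simulation of that scenario, under the assumption that the guest's only state lives in the memory it is handed. `call_isolated` and `guest_count_calls` are made-up names, and a bytearray stands in for wasm linear memory.

```python
def call_isolated(guest_fn, arg: bytes, arena_size: int = 64 * 1024) -> bytes:
    """Run guest_fn against a fresh arena and copy the result out."""
    if len(arg) > arena_size:
        raise ValueError("argument does not fit in the arena")
    arena = bytearray(arena_size)        # fresh, zeroed memory for this call only
    arena[:len(arg)] = arg               # copy the argument in
    offset, length = guest_fn(arena, len(arg))
    if offset < 0 or length < 0 or offset + length > arena_size:
        raise ValueError("guest returned an out-of-bounds result")
    return bytes(arena[offset:offset + length])  # copy out; the arena is then dropped


def guest_count_calls(arena, arg_len):
    """A guest that tries to keep a call counter in the last byte of its memory."""
    arena[-1] += 1
    arena[arg_len] = arena[-1]           # append the counter to the result
    return 0, arg_len + 1


first = call_isolated(guest_count_calls, b"ab")
second = call_isolated(guest_count_calls, b"ab")
```

Because every call starts from a zeroed arena, the counter never gets past 1, so `first` and `second` are identical: the guest behaves like a pure function from the caller's point of view.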

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:43):

and then we could either do that, or maintain state in between calls, depending on whether you specified to call it as a Task or not

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:43):

So I would say:

Extending platforms with wasm modules that get called by roc should be doable.
By making the wasm freestanding, it won't have any access to ffi.
Will need to use Task due to being able to hold onto state.
Calling into it will of course have all of the memory copying gripes. That said, memory returned from wasm should be possible to directly reference.
We could even use a wasm interpreter that jits/compiles to native to get really solid perf.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:44):

yeah I'd ideally want to compile it all ahead of time

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:44):

so it just ends up in the binary and there's no wasm runtime in the compiled binary

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:44):

I don't know of a reason why that wouldn't be possible

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:45):

I believe Binaryen loads .wasm files into LLVM IR in order to run LLVM optimization passes on it and then output another .wasm file

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:45):

Yeah, should be doable. So during roc compilation time would compile the wasm and setup the memory restrictions and such.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:45):

but if that's possible, then it's also possible to load it into LLVM IR and then emit machine instructions

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:45):

right

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:46):

"load it into LLVM IR and then emit machine instructions"

Just leaves an open question if it becomes less safe/has an easier time escaping the interpreter (cause it is compiled away)

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:46):

but yeah, all sounds doable.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:48):

I think passing data in should be straightforward theoretically (it all has to be copied, which is unfortunate, but I don't see a way around that)

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:49):

I guess theoretically it could be possible to do some Morphic-esque analysis of like "this value is only ever going to be passed into wasm" and then do the roc_alloc equivalent directly into the memory wasm will be given access to, but I dunno about that :laughing:

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:49):

in practice, that is

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:49):

anyway, so assuming copying bytes in, and then copying bytes back out...

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:49):

we'd need some way to verify the bytes being copied out

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:50):

"and then copying bytes back out..."

This isn't necessary

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:50):

Or at least shouldn't be

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:50):

it is if we're maintaining state, right?

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:50):

otherwise the next time you call into wasm it could have stored a pointer into what it gave back last time

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:50):

Oh, to stop wasm from storing a version of a list and then mutating it in place.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:50):

and then modify some distant part of the program

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:50):

yeah

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:51):

attempt 1

.wit "world" inside a roc package describes the interface for the bundled WASM binary

world simple-world {
    // Importing an addition function from the host environment
    import add: func(a: s32, b: s32) -> s32;

    // Exporting a multiplication function from the WASM module
    export multiply: func(a: s32, b: s32) -> s32;
}

Roc then uses this WIT file to generate and provide a Task-based interface

module SimplePackage {
    # this is a module parameter that is required to instantiate this package
    # I'm not sure if we have types in the syntax for module params
    add : { a : I32, b : I32 } -> Task I32 *
} [
    multiply,
]

multiply : { a : I32, b : I32 } -> Task I32 *

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:51):

This feels so defensive, but I get the goals.

Personally, if a user opts into it, I would prefer to just allow roc to load a shared library as a platform extension.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:52):

It can be unsafe, cause it is at the platform level.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:52):

Anyway, yeah, for wasm that all sounds good. Lots of copies, but otherwise just a task-based interface and it should be fine.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:52):

Actually, only fine if we recursively copy everything in and out.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:53):

Like a list of strings would need to copy the list and every string into and out of wasm

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:53):

right

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:53):

To guarantee wasm doesn't do anything evil
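To make the "copy the list and every string" point concrete, here is a Python sketch of flattening a list of strings into one buffer and rebuilding a fresh list from it. The wire layout (little-endian u32 count, then a u32 length plus UTF-8 bytes per string) is an assumption chosen for illustration, not something Roc or wasm prescribes.

```python
import struct


def encode_str_list(items):
    """Flatten a list of strings: u32 count, then (u32 length, UTF-8 bytes) per item."""
    out = bytearray(struct.pack("<I", len(items)))
    for s in items:
        data = s.encode("utf-8")
        out += struct.pack("<I", len(data)) + data
    return bytes(out)


def decode_str_list(buf):
    """Rebuild a fresh list of strings, checking bounds and UTF-8 along the way."""
    (count,) = struct.unpack_from("<I", buf, 0)
    pos, items = 4, []
    for _ in range(count):
        (n,) = struct.unpack_from("<I", buf, pos)  # raises struct.error if truncated
        pos += 4
        if pos + n > len(buf):
            raise ValueError("string runs past the end of the buffer")
        items.append(bytes(buf[pos:pos + n]).decode("utf-8"))
        pos += n
    return items
```

Every string is visited and copied in both directions, so the cost grows with the total size of the data rather than the size of the outer list alone.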

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:53):

so a potentially very valuable use of this could be math functions that don't do heap things anyway

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:54):

like is there a BLAS/LAPACK compiled to wasm anywhere? :thinking:

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:54):

eh, most math functions that matter are for multidimensional arrays. So lots of data

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:54):

hm, large ones?

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:56):

I mean I guess for some game programming stuff it would help. But for most stuff blas is used for, they tend to be at least medium sized. So all the copies could really hurt.

Like I don't think it would work for a generic blas wrapper. But it would probably work if you made a full blas simulation function with many operations and exposed it as one effect.

view this post on Zulip Richard Feldman (Jul 31 2024 at 00:56):

interesting

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:57):

Why do we need to copy the data? The host is still in control of the information which is sandboxed inside the WASM runtime

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:57):

I think the copies would get too expensive if you have 2 copies for every single matrix add, multiply, etc.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 00:58):

"Why do we need to copy the data?"

Have to copy in to allow wasm to see it, cause it is sandboxed

Have to copy out to stop wasm from holding a reference and mutating it later, such that roc sees random changes to data that is supposed to be constant.
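The copy-out hazard can be shown with a small Python simulation. `StatefulGuest` is a hypothetical stand-in for a wasm module that keeps its memory between calls; a `memoryview` plays the role of a zero-copy reference into that memory.

```python
class StatefulGuest:
    """Stands in for a wasm module whose memory persists between calls."""

    def __init__(self):
        self.memory = bytearray(16)

    def make_list(self):
        self.memory[0:3] = b"\x01\x02\x03"
        return 0, 3                      # (offset, length) of the result

    def do_something_else(self):
        self.memory[0] = 0xFF            # a later call overwrites the old result


guest = StatefulGuest()
off, n = guest.make_list()
shared = memoryview(guest.memory)[off:off + n]   # zero-copy: aliases guest memory
copied = bytes(guest.memory[off:off + n])        # copy-out: independent of the guest

guest.do_something_else()
```

After the second call, `shared[0]` reads 0xFF even though the host never touched it, while `copied` still holds the original three bytes. That silent change is what the copy-out step prevents.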

view this post on Zulip Luke Boswell (Jul 31 2024 at 00:59):

Can you allocate the RocList into an arena, pass that into WASM to modify and then when WASM returns you know it cannot do anything more so it's safe to pass back to roc

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:00):

"you know it cannot do anything more"

This is only true if wasm has no state from call to call

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:00):

Maybe we can restrict the wasm to no globals?

view this post on Zulip Richard Feldman (Jul 31 2024 at 01:00):

I don't know enough about wasm's memory model to be sure if this would work, but maybe in the wasm interop wrapper you could opt into some restrictions that gain performance without sacrificing security, specifically:

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:00):

Then only pure functions would exist.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:01):

Yeah, I think wasm without globals and some extra checks could go a long way. Still definitely wouldn't be safe, but we could at a minimum just do a copy for uniqueness before handing off to wasm.

view this post on Zulip Richard Feldman (Jul 31 2024 at 01:03):

yeah a relevant question is - given the security requirements, what use cases are left that would be useful in practice?

view this post on Zulip Richard Feldman (Jul 31 2024 at 01:03):

I guess a possible answer in general is "a thing that at least works, and then in the future it can be rewritten in Roc to be faster because it doesn't have the security overhead"

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:05):

Maybe if you have a big function that is written in C or something and it's been verified or is trusted and you don't want to rewrite it.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:05):

I think the real issue is that it likely would be hard to use existing libraries (especially if no globals/state). So you would be writing raw c/rust/zig for the wasm. Definitely could be used to speed up some computations, but a much bigger lift to create a library for it. Probably can't just import blas/lapack/eigen/tf/etc and build for wasm with no globals and a thin type shim for roc lists.

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:06):

WASM is already pretty restricted crossing the host boundary. So I wonder if the copy in and out is really that bad?

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:06):

So I would label it as a potential gain, but personally, I would turn to raw ffi in a platform with a shared library calling into blas before I would use something like this. But I definitely could be really biased.

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:07):

Noting the massive potential boost to the ecosystem from being able to use code that is written in any language that compiles to WASM

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:08):

"So I wonder if the copy in and out is really that bad?"

Really depends on use case and how small of a chunk of code each function is. As I mentioned above, calling into a large function that will take a lot of time anyway is probably fine. Calling into wasm for individual ops probably is too costly.

And I think for this to be really nice, you would want to call into it for each individual op.

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:08):

"for each individual op."

What do you mean by this? ... as in each function call?

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:09):

Like I would want to expose the roc-wasm-matrix library that has all of the matrix operations super fast in wasm. Then the end user can make individual calls to add and sub and matmul. But that would be 2 copies for every matrix add.

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:09):

But you could now have a Task to load data into WASM, and then the calls could be instructions to operate on that data, and another to eventually get the data back out right?

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:10):

So you only need to have Two expensive copies, one in and one out

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:13):

So wasm returns a handle back to roc and roc works with the handle until it needs the data back out.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:13):

So basically, treat it like we treat file today.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:14):

So 100% has to be task cause we have to allow wasm to hold state, but as long as we delay returning state, it should be safe from scary mutation and mostly copy free.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:18):

Yeah, that sounds doable.

x = WasmMatrix.createMatrix! someNumList [12, 22]
y = WasmMatrix.createMatrix! someNumList2 [22, 36]

x = WasmMatrix.mulF32! x 7.2
x = WasmMatrix.subInt! x 3
z = WasmMatrix.matmul! x y
out = WasmMatrix.extractMatrix! z
...
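A Python sketch of what could sit behind calls like these. `MatrixGuest` is a hypothetical stand-in for the stateful wasm side; it assumes every operation returns a new handle instead of mutating in place, and that freeing is an explicit call. Neither choice is settled in this thread.

```python
class MatrixGuest:
    """Stands in for a stateful wasm module that owns data behind integer handles."""

    def __init__(self):
        self._next = 0
        self._store = {}
        self.copies = 0                  # counts copies across the host/guest boundary

    def _put(self, data):
        handle = self._next
        self._next += 1
        self._store[handle] = data
        return handle

    def create(self, values):
        self.copies += 1                 # copy in
        return self._put(list(values))

    def mul_scalar(self, handle, k):
        # Works on guest-owned data; only a handle crosses the boundary.
        return self._put([v * k for v in self._store[handle]])

    def sub_scalar(self, handle, k):
        return self._put([v - k for v in self._store[handle]])

    def extract(self, handle):
        self.copies += 1                 # copy out
        return list(self._store[handle])

    def free(self, handle):
        del self._store[handle]


guest = MatrixGuest()
x = guest.create([1.0, 2.0, 3.0])
x2 = guest.mul_scalar(x, 2.0)
x3 = guest.sub_scalar(x2, 1.0)
out = guest.extract(x3)
```

However many operations run in between, only `create` and `extract` copy data, so this chain costs two copies in total. With fresh handles per operation, `x` and `x2` are different values and `x` still refers to the original data.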

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:18):

Not exactly sure how x and y will get freed, but sounds feasible.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:19):

Also, probably still will really confuse users depending on if it is inplace or not:

x = WasmMatrix.createMatrix! someNumList [12, 22]
x2 = WasmMatrix.mulF32! x 7.2

Is x equal to x2?

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:23):

Richard Feldman said:

but if that's possible, then it's also possible to load it into LLVM IR and then emit machine instructions

I might be misunderstanding here. But the mental model I had in my head, is that on the Roc side it's just Tasks and an abstract/opaque interface.

The host exposes some standard set of calls to roc to work with WASM modules, and roc uses those to instruct the host what to call and what arguments to provide etc.

So the package WASM binary is dynamically linked/loaded by the host. It can be cached in .cache/roc along with the other .roc files when roc is running an app, or, when building an executable, it is expected to be available in a subdirectory like /wasm-packages or in a path from an environment variable at runtime.

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:26):

The idea is that roc will compile it into the binary at roc app compilation time. Nothing done on the platform side.

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:26):

Oh interesting...

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:27):

So the host may never even know

view this post on Zulip Brendan Hansknecht (Jul 31 2024 at 01:27):

A package would include a task module and a .wasm file. Roc would hopefully compile the wasm to sandboxed native code. Could also just embed the entire .wasm file into the binary with a wasm interpreter as well.

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:28):

Lol, imagine a roc app compiled to WASM, that is using a WASM package that is running inside a WASM interpreter

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:28):

And all of this is running inside a WASM interpreter

view this post on Zulip Richard Feldman (Jul 31 2024 at 01:29):

presumably if we know that the compilation target is wasm, we could make use of that knowledge to avoid silliness :big_smile:

view this post on Zulip Luke Boswell (Jul 31 2024 at 01:30):

I wonder how transitive dependencies would be handled? Like, we would only need to have one WASM binary per package, and the whole app will only use one version for each package.


Last updated: Jun 16 2026 at 16:19 UTC