Stream: compiler development

Topic: zig compiler - libroc exploration


view this post on Zulip Luke Boswell (Feb 02 2025 at 23:04):

I would like to throw some ideas around for libroc -- specifically for the use case of embedding roc.

I've discussed some of these use cases previously, but for the sake of discussion -- let's say I'm building a roc playground. I'd like to build a WASM module that runs in the browser. It accepts roc source code, the name of a top-level ident (e.g. main!), and any arguments. The playground then uses libroc to parse, typecheck, and interpret the program. The platform is simple and the only effect available is print!, which the playground uses to append to a textarea.

From this example I have a few questions...

Does this use case sound reasonable? Is there anything obvious here I'm missing?

view this post on Zulip Luke Boswell (Feb 02 2025 at 23:09):

There is a lot here that is new to me, so I would appreciate any feedback. I may be off on a random tangent hallucinating ways of doing things.

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:11):

hm, so I just realized that actually libroc might be the wrong way to think about this :thinking:

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:11):

I don't think we'd want to produce a single library for this

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:12):

rather, I think we'd want to have the platform be able to build a custom library which includes both the interpreter and the platform-specific entrypoint functions

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:13):

hmm

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:14):

actually maybe it doesn't matter after all, nm it can work either way

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:19):

I assume it would be a zig/c library that can compile to wasm (or any other target). It also would be exposed by the roc executable if you load it as a shared library (assuming we can make it work).

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:19):

I am assuming it will be how the shim works. The shim will load the roc compiler as a shared library and launch the interpreter. So it will directly use libroc (I guess that forces it to be a cffi library)

view this post on Zulip Luke Boswell (Feb 02 2025 at 23:20):

Lol, is this the library version of inception?

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:41):

It's a really nice technique to bundle roc with libroc assuming it works (I feel like I have seen this before, but not sure it works on all platforms)

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:41):

ok so let's say there's a libroc which exposes a C function which:

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:43):

accepts the binary contents of the app module to run (e.g. the bytes found in main.roc, or if it's wasm, some bytes in memory)

Not this. It will get the file data by calling roc_load which will be in the struct of function pointers with the allocator. Cause it will need roc_load anyway to load other files.

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:44):

and then it also accepts a function pointer which takes a path and returns the source bytes associated with that path, or an error if they couldn't be read

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:45):

yeah :point_up: seems necessary to allow loading other modules

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:45):

and I guess we could say give me a starting point path and I'll go look it up in there, but kinda seems like unnecessary indirection

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:45):

Yeah, just need the function, shouldn't need main.roc cause you can use the function to get main.roc

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:45):

but that works too, sure

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:45):

and I guess we could say give me a starting point path and I'll go look it up in there, but kinda seems like unnecessary indirection

I think we need this for the shim

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:46):

Cause the shim will get compiled once, but main.roc source may change between calls
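
The point about the loader callback can be sketched in a few lines. This is only a model of the idea, not libroc's API: `make_in_memory_loader` and `interpret_once` are hypothetical names, and the real interface would be a C function pointer rather than a Python callable. It shows why passing a path plus a loader lets a shim that was compiled once pick up a changed main.roc on the next call.

```python
from typing import Callable, Dict

# Hypothetical loader signature: path in, source bytes out, raising if unreadable.
RocLoad = Callable[[str], bytes]


def make_in_memory_loader(files: Dict[str, bytes]) -> RocLoad:
    """Build a loader backed by a dict, e.g. for a browser playground with no filesystem."""
    def roc_load(path: str) -> bytes:
        try:
            return files[path]
        except KeyError:
            raise FileNotFoundError(path) from None
    return roc_load


def interpret_once(main_path: str, roc_load: RocLoad) -> bytes:
    """Stand-in for one interpreter run: main.roc is fetched through the callback
    on every call, so the caller never hands over the source bytes directly."""
    return roc_load(main_path)
```

Because the source is looked up on each call, editing the entry in `files` between two `interpret_once` calls changes what the second call sees, without rebuilding anything.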

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:46):

ah sure

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:46):

ok fair enough!

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:47):

anyway, so then the other piece of this is getting help from glue to correctly translate the Roc args and return value to/from the host language

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:48):

aside: I think we decided elsewhere that we were going to always have the Roc functions accept a single arg from the host as a pointer, and then tuple them up if desired, in order to simplify the ABI - right?

I would rather fix c abi, but either is ultimately fine.

For libroc I think it should take a list of tags to specify the types and a list of pointers to specify the args.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:48):

The shim would deal with filling in that information (maps standard cffi like we use with llvm to this interpreter form). Otherwise, the platform author is required to fill in the info if they want to use libroc directly.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:50):

Taking types as a list separate from the actual args avoids the nesting problem where you have to box everything. Instead it can use the flat representation, but have a nested spec that explains the underlying type layout.
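
A rough sketch of "flat bytes plus a nested spec" follows. The tag names, the tuple-based spec, and the `decode` helper are all assumptions made for illustration, and the layout is packed little-endian with no alignment padding, which a real Roc layout would not necessarily match.

```python
import struct

# Hypothetical scalar type tags; the thread does not define a concrete set.
U8, U16, U32, I64, F64 = "u8", "u16", "u32", "i64", "f64"
_FORMATS = {U8: "B", U16: "H", U32: "I", I64: "q", F64: "d"}


def flatten_spec(spec):
    """Turn a nested spec (tuples of tags) into a flat list of scalar tags."""
    if isinstance(spec, tuple):
        flat = []
        for field in spec:
            flat.extend(flatten_spec(field))
        return flat
    return [spec]


def decode(spec, buf):
    """Read flat, packed bytes back into a nested value by following the spec."""
    fmt = "<" + "".join(_FORMATS[tag] for tag in flatten_spec(spec))
    scalars = iter(struct.unpack(fmt, buf))

    def build(node):
        if isinstance(node, tuple):
            return tuple(build(field) for field in node)
        return next(scalars)

    return build(spec)
```

For example, with the spec `(U8, (U32, U16))` the argument bytes stay one contiguous buffer, and only the spec carries the nesting, so nothing has to be boxed to describe the inner record.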

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:55):

Brendan Hansknecht said:

aside: I think we decided elsewhere that we were going to always have the Roc functions accept a single arg from the host as a pointer, and then tuple them up if desired, in order to simplify the ABI - right?

I would rather fix c abi, but either is ultimately fine.

we can always do that later and relax the restriction, but this is astronomically easier to make correct :big_smile:

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:56):

Brendan Hansknecht said:

For libroc I think it should take a list of tags to specify the types and a list of pointers to specify the args.

hm, so what's the benefit of this compared to using glue to just generate the correct calls? :thinking:

view this post on Zulip Richard Feldman (Feb 02 2025 at 23:56):

a downside is the runtime validation on every call

view this post on Zulip Anthony Bullard (Feb 03 2025 at 00:11):

Would libroc allow for it to be fully embeddable? I.e., could the control be inverted?

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 00:28):

Yes, that is exactly the plan. Fully embedded and control inverted

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 00:28):

So not necessarily any glue when using libroc

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 00:30):

In my mind, using libroc directly should be as nice as using embedded python or Lua interpreters.

view this post on Zulip Anthony Bullard (Feb 03 2025 at 00:32):

That would be awesome, would make it easier to make my Love2D for Roc port...

view this post on Zulip Anthony Bullard (Feb 03 2025 at 00:33):

Been doing Love2D with my daughter for a month, and while I don't mind Lua, I like Roc a lot more. :-)

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 00:35):

In a perfect world for libroc, I don't even need a main.roc. I can load any module and call into it directly and even run a string of roc code directly.

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:57):

hm, I don't think that works

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:57):

it's more common than not for a module to refer to package shorthands like cli.

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:57):

to know what those resolve to, you have to have loaded a main.roc

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 00:58):

Assuming we want to eventually enable this:

In my mind, using libroc directly should be as nice as using embedded python or Lua interpreters.

I think something more dynamic is required

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:58):

so that could only possibly work in the specific scenario where I'm loading a module which only imports other local modules and none of them ever tries to import from any package whatsoever

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 00:58):

Though maybe it needs to be at the package boundary, not sure

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 00:59):

Like I should be able to load some random roc library and call code in it

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:59):

it seems like in practice ~100% of use cases for this will want a package

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:59):

sure, that's fine

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:59):

well, except for the restrictions we have on host-exposed functions :sweat_smile:

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:59):

like closures have to be boxed

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:00):

I assume all closures in the interpreter will be boxed, so that should be fine

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:00):

hmmm interesting

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:01):

I think they have to be cause we won't have run any form of specialization

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:01):

yeah for sure, I'm just trying to think of the layout implications

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:01):

in the non-libroc case

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:01):

I'm gonna put it on the other thread haha

view this post on Zulip Luke Boswell (Feb 03 2025 at 01:02):

I'm very glad I brought this whole topic up... :smiley:

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:03):

yeah I guess loading any package or app should work

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:03):

and then once you've loaded it, you can call anything in any of its exposed modules

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:04):

but I still don't think that affects this:

Richard Feldman said:

in other words, libroc can expose a function which is exactly the same interface as :point_up: except for 3 extra arguments:

  1. The path to main.roc
  2. The function to go from a path to a .roc file to its source bytes
  3. The name of the entrypoint function within main.roc that I want to call

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:04):

for reference, the :point_up: was referring to:

Brendan Hansknecht said:

Normal platforms only see a single interface. That interface is:

Platform -> Roc standard FFI

  1. A pointer to write the return data to
  2. A record of function pointers (only allocator functions and roc_load)
  3. N pointers, one for each arg.

That is all they see period. Anything libroc is an implementation detail and not exposed to the platform.
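
To make the two interfaces above easier to compare, here is a toy model of the call shape. Everything in it is hypothetical: `RocOps`, `libroc_call`, and the field names are invented for illustration, the real thing would be a C ABI with raw pointers, and the "evaluation" just echoes the first argument. Following the earlier point that roc_load lives in the record of function pointers, the loader is not passed as a separate argument here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RocOps:
    """Stand-in for the record of function pointers (allocator functions plus roc_load)."""
    alloc: Callable[[int], bytearray]
    dealloc: Callable[[bytearray], None]
    roc_load: Callable[[str], bytes]


def libroc_call(main_path: str, entrypoint: str,
                ret: bytearray, ops: RocOps, args: List[bytes]) -> None:
    """Hypothetical libroc entry point: return slot, ops record, and arg pointers,
    plus the path to main.roc and the name of the entrypoint to call."""
    source = ops.roc_load(main_path)       # main.roc is fetched through the callback
    if entrypoint.encode() not in source:  # toy check; a real interpreter would parse and typecheck
        raise LookupError(entrypoint)
    scratch = ops.alloc(len(ret))          # memory comes from the host's allocator
    n = min(len(ret), len(args[0]))
    scratch[:n] = args[0][:n]              # toy "evaluation": echo the first argument
    ret[:] = scratch                       # result is written into the caller's return slot
    ops.dealloc(scratch)
```

A host would build one `RocOps`, reuse it across calls, and read the result out of `ret` afterwards; a normal platform would only ever see the last three parameters.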

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:04):

Via libroc, I should be able to call a function with a type variable. So that requires specifying the type somehow

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:05):

ahh interesting

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:05):

ok yeah that's something hosts can't do

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:05):

but seems reasonable to do when loading a package or something at runtime

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:05):

:100:

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:05):

cool, that makes sense to me then! :thumbs_up:

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:06):

btw I do think in general that if I'm embedding Roc into a larger program, I'm going to want to use glue to generate the bindings anyway

view this post on Zulip Richard Feldman (Feb 03 2025 at 01:06):

just because that makes it easier to get the types right

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:07):

That's fair, though the interpreter has to get types right somehow without glue. So it can't be that bad to use

view this post on Zulip Luke Boswell (Feb 03 2025 at 01:12):

Brendan Hansknecht said:

Via libroc, I should be able to call a function with a type variable. So that requires specifying the type somehow

Is this definitely something we want to support? Would this be used for building a REPL or a similar thing around libroc?

I thought the "interface" of a roc program was defined in the platform's main.roc file with the exposed entry-points. (and anything crossing the roc-host boundary has a fixed known size and concrete type)

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:14):

If we don't support it, I don't think there is much of a point to supporting the libroc use case. Just use the standard flow instead.

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:14):

When embedding Lua or python, one of the huge gains is the dynamic ability to interact with anything

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:17):

Python makes this possible by making everything a pyobject. That encodes all of the type info.

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:19):

I think something similar will be needed for the repl flow. At any breakpoint in the repl, I should be able to query a variable for all methods it has and then call one. That call might have a type variable in it.
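
As a small illustration of the pyobject-style idea, the sketch below has every value carry its own type tag so a host or REPL can ask what it can do and then call it. `TaggedValue`, the `METHODS` table, and the method names are made up for this example and are not Roc builtins or a proposed libroc API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class TaggedValue:
    """A value that carries its own type tag, loosely like a PyObject."""
    type_name: str
    payload: Any


# Hypothetical method table keyed by type tag, then by method name.
METHODS: Dict[str, Dict[str, Callable[[TaggedValue], TaggedValue]]] = {
    "Str": {
        "count_bytes": lambda v: TaggedValue("U64", len(v.payload.encode("utf-8"))),
        "to_upper": lambda v: TaggedValue("Str", v.payload.upper()),
    },
    "U64": {
        "to_str": lambda v: TaggedValue("Str", str(v.payload)),
    },
}


def methods_of(value: TaggedValue) -> List[str]:
    """List the methods available for a value, as a REPL might at a breakpoint."""
    return sorted(METHODS.get(value.type_name, {}))


def call_method(value: TaggedValue, name: str) -> TaggedValue:
    """Dispatch on the runtime tag; the result is tagged too, so the host can inspect it."""
    try:
        method = METHODS[value.type_name][name]
    except KeyError:
        raise AttributeError(f"{value.type_name} has no method {name}") from None
    return method(value)
```

The cost Richard mentions earlier shows up here as well: every call goes through a runtime lookup on the tag instead of a binding that glue generated ahead of time.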

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:20):

I should be able to return from the repl to the platform at any point. Then the platform should be able to do something with the object I return (whatever type it may be).

view this post on Zulip Brendan Hansknecht (Feb 03 2025 at 01:20):

I highly suggest playing around with embedded Lua or python. There is a lot of flexibility (though often also verbosity in generating objects of specific tagged types)

view this post on Zulip Luke Boswell (Feb 04 2025 at 09:53):

I thought I'd make a PR to get some feedback
https://github.com/roc-lang/roc/pull/7575

view this post on Zulip Brendan Hansknecht (Feb 04 2025 at 16:51):

I personally wouldn't set up a libroc now. I think libroc will be tailored around the interpreter and cut out a lot of the rest of the compiler. So don't really want random stuff going in now before we know exactly what it needs.

view this post on Zulip Brendan Hansknecht (Feb 04 2025 at 16:51):

Especially given libroc is more an experimental idea than something we know will work out.

view this post on Zulip Brendan Hansknecht (Feb 04 2025 at 16:52):

Probably will naturally get set up when trying to hook up the first platform to the interpreter, and that will probably lead to first a static config and then a lot of learnings

view this post on Zulip Luke Boswell (Feb 04 2025 at 19:50):

@Brendan Hansknecht said

If we make a full featured lib roc, we should probably make main.zig strictly build roc via the same interfaces as libroc. That ensures they stay in sync.

That said, I'm not sold we want a full featured lib roc. I think we likely want a super small shim lib roc that only has the ability to interact with the interpreter.

Is there any reason we wouldn't want a full featured libroc?

I assumed we would implement the cli, repl, formatter, LSP etc using it.

view this post on Zulip Brendan Hansknecht (Feb 04 2025 at 20:01):

In my mind, Libroc is a c library. Those would all just be zig libraries and part of the regular code base (with only the exception being the lsp I guess).

view this post on Zulip Brendan Hansknecht (Feb 04 2025 at 20:02):

Even for the lsp, it would not use Libroc with the proposed plan. It would work like glue where the compiler is the platform

view this post on Zulip Brendan Hansknecht (Feb 04 2025 at 20:02):

And the compiler loads a shared library that is the lsp (or runs it via the interpreter)

view this post on Zulip Luke Boswell (Feb 04 2025 at 22:11):

Libroc is a c library

Even if it's just a super simple implementation. I was thinking of making a platform/host example of fully embedding roc using rust/zig.

I was thinking I could start on things like the playground, or LSP, even if most of it is stubbed out... so we can get a feeling for how it will all come together in future.


Last updated: Jul 06 2025 at 12:14 UTC