contributing to the zig rewrite · contributing

Stream: contributing

Topic: contributing to the zig rewrite

Norbert Hajagos (Feb 10 2025 at 15:28):

Hi! I want to contribute to the zig rewrite, but I also understand the foundations need to be laid first. With my limited zig and compiler experience I won't be the one doing that (though I have started doing ziglings, which is super fun). I think others have a similar feeling. But I can't just sit on my hands, so I thought I could do:

- implement tail call elimination (know that part of the current compiler). Might be early for that :sweat_smile:
- serialization of the different IRs to do snapshot testing
- extending tokenization
- anything of smaller size where no specialized knowledge is needed, and moving slow isn't a blocker for others.

If you can think of something, let me know.

Sam Mohr (Feb 10 2025 at 16:34):

I think working on function specialization would be a good candidate! All you need to do is, for every function in each module:

- use the function set already calculated for you during function solving to replace function args of higher-order functions with a tag union
- where said function args are called in the function body, replace them with a when expression that calls the original functions of the function set using the data in the function set

Sam Mohr (Feb 10 2025 at 16:34):

And that's it
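The two steps Sam describes can be sketched in miniature. This is only an illustration in Python, not the Zig compiler's code: the names `AddN`, `Double`, `add_n`, `double`, and `map_list_after` are hypothetical, and the `if`/`elif` chain stands in for the Roc `when` expression.

```python
from dataclasses import dataclass
from typing import List, Union


# The "original functions" of the function set. Captured values are
# passed explicitly as leading arguments (assumption for this sketch).
def add_n(n: int, x: int) -> int:
    return x + n


def double(x: int) -> int:
    return x * 2


# Tag union that replaces the function argument: one variant per member
# of the function set, carrying that member's captured data.
@dataclass(frozen=True)
class AddN:
    n: int


@dataclass(frozen=True)
class Double:
    pass


FnSet = Union[AddN, Double]


# Before specialization: a higher-order function taking a real function.
def map_list_before(f, xs: List[int]) -> List[int]:
    return [f(x) for x in xs]


# After specialization: the argument is a tag, and the call site becomes
# a branch over the tags that calls the original function directly.
def map_list_after(f: FnSet, xs: List[int]) -> List[int]:
    out = []
    for x in xs:
        if isinstance(f, AddN):
            out.append(add_n(f.n, x))
        elif isinstance(f, Double):
            out.append(double(x))
        else:
            raise TypeError(f"tag not in function set: {f!r}")
    return out
```

Calling `map_list_after(AddN(3), [1, 2, 3])` gives the same result as `map_list_before(lambda x: add_n(3, x), [1, 2, 3])`, but no function value is passed around at runtime.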

Sam Mohr (Feb 10 2025 at 16:34):

I think tail call elimination can wait for now, but we'll definitely call on you when we get there

Sam Mohr (Feb 10 2025 at 16:35):

Serialization of IRs would also be pretty helpful!

Sam Mohr (Feb 10 2025 at 16:36):

For stuff that's non-blocking, you could definitely figure out the error reporting stuff. Until that's implemented, we'll just be debug printing all errors to the user.

Brendan Hansknecht (Feb 10 2025 at 17:16):

Sam Mohr said:

Serialization of IRs would also be pretty helpful!

Also, if you are willing to write a generator or parser for an IR, that also would be really helpful for fuzzing.

Sam Mohr (Feb 10 2025 at 17:18):

The current plan seems to just be serializing to and from S-expressions

Sam Mohr (Feb 10 2025 at 17:18):

So it shouldn't be too complicated
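As a rough idea of what serializing to and from S-expressions involves, here is a minimal round-trip sketch in Python. It assumes a toy IR made of nested lists whose atoms are integers or bare symbols without spaces or parentheses; the real IRs and their textual format are not specified in this thread, and the function names are hypothetical.

```python
def to_sexpr(value) -> str:
    """Serialize a nested list of ints and symbol strings to text."""
    if isinstance(value, list):
        return "(" + " ".join(to_sexpr(v) for v in value) + ")"
    return str(value)


def tokenize(text: str):
    """Split text into parens and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens, pos=0):
    """Parse one expression starting at pos; return (value, next_pos)."""
    tok = tokens[pos]
    if tok == "(":
        items = []
        pos += 1
        while tokens[pos] != ")":
            item, pos = parse(tokens, pos)
            items.append(item)
        return items, pos + 1  # skip the closing paren
    if tok == ")":
        raise ValueError("unexpected ')'")
    try:
        return int(tok), pos + 1  # atoms that look like ints become ints
    except ValueError:
        return tok, pos + 1  # everything else is a symbol


def from_sexpr(text: str):
    """Parse exactly one S-expression from text."""
    tokens = tokenize(text)
    value, pos = parse(tokens)
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return value


ir = ["let", "x", ["add", 1, 2]]
text = to_sexpr(ir)  # "(let x (add 1 2))"
assert from_sexpr(text) == ir
```

The round-trip property checked on the last line is also the kind of invariant a snapshot test or fuzzer could assert. One limitation of this sketch: a symbol spelled like a number would come back as an int.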

Norbert Hajagos (Feb 10 2025 at 18:12):

Both are exciting! Specialization isn't something I'm familiar with. Didn't even consider how higher-order functions work. Since I had a great experience contributing to the roc std lib and discovering its inner workings, I would like to do that with specialization as well.

I've sat an unhealthy amount today, so I can't be working on anything. Tomorrow I'll check the rust specialization code and where it should go in the zig codebase.

Sam Mohr (Feb 10 2025 at 18:25):

I wouldn't look at the Rust code very much, it's gonna look very different in the Zig compiler

Jared Ramirez (Feb 10 2025 at 18:28):

I was thinking this exact same thing @Norbert Hajagos!

I would love to contribute to the rewrite, but was similarly uncertain about where I could be helpful. Since Norbert is gonna take specialization, maybe I could look into writing a parser/generator for an IR?

Sam Mohr (Feb 10 2025 at 18:37):

You could do that, but there are actually 2 similarly scoped compiler stages unassigned

Sam Mohr (Feb 10 2025 at 18:38):

First is function lifting, second is function solving

Sam Mohr (Feb 10 2025 at 18:38):

https://github.com/roc-lang/rfcs/blob/ayaz/compile-with-lambda-sets/0102-compiling-lambda-sets.md#function_lift

Sam Mohr (Feb 10 2025 at 18:40):

The IR parser and generator would also be helpful!

Norbert Hajagos (Feb 10 2025 at 19:11):

Huh... If the rust implementation isn't that relevant, do you have any good resource on this topic? Not looking for a paper, just basic stuff like why are we doing this?

Sam Mohr (Feb 10 2025 at 19:11):

That doc I just linked is the holy grail

Sam Mohr (Feb 10 2025 at 19:12):

Long story short, we are currently trying to move from typechecked IR to codegen-ready IR in a single, pretty complex pass

Sam Mohr (Feb 10 2025 at 19:12):

And our ability to compile lambda sets right now is just broken

Sam Mohr (Feb 10 2025 at 19:14):

where a lambda set is the set of values dynamically captured by a closure at runtime

Sam Mohr (Feb 10 2025 at 19:15):

So this multi-stage approach is all built around a consistent, robust design that transforms any possible Roc code into just concretely-typed functions and values, so we don't have to worry about lambda sets anymore

Norbert Hajagos (Feb 10 2025 at 19:22):

Got it!

Isaac Van Doren (Feb 10 2025 at 19:32):

I’m interested in doing function lifting!

Jared Ramirez (Feb 10 2025 at 19:40):

That leaves function-solving then, which I can start looking into.

That doc is super helpful!

Sam Mohr (Feb 10 2025 at 19:46):

Holy crap, you guys are my heroes!

Sam Mohr (Feb 10 2025 at 19:47):

That leaves every part of the pipeline assigned except for statement lowering and refcounting

Sam Mohr (Feb 10 2025 at 19:47):

An FYI for those volunteers here, those two stages are mostly gonna be copying from Rust and adapting to the Zig array-based IR storage

Sam Mohr (Feb 10 2025 at 19:48):

But that'll require reading a good bit more code

Sam Mohr (Feb 10 2025 at 19:49):

The lambda set stages you've all volunteered for just need you to understand that document

Brendan Hansknecht (Feb 10 2025 at 19:52):

Just a note

Brendan Hansknecht (Feb 10 2025 at 19:52):

Lower should be trivial

Brendan Hansknecht (Feb 10 2025 at 19:52):

Basic refcounting should be pretty easy

Brendan Hansknecht (Feb 10 2025 at 19:52):

The important refcounting optimizations likely are more complex to port, but pretty fundamental to a number of perf cases

Sam Mohr (Feb 10 2025 at 19:52):

A heads up to @Isaac Van Doren if you end up picking up function lifting, the one piece I'm not sure how to implement yet in function lifting is how to handle destructures, especially top-level ones. You'll need to figure out how to handle

{ x, y } =
    if condition then
        { x: 123, y: |n| n + 1 }
    else
        { x: 456, y: |n| n - 1 }

Sam Mohr (Feb 10 2025 at 19:53):

So feel free to ask around or maybe Ayaz in particular if you can't figure that out

Isaac Van Doren (Feb 10 2025 at 19:56):

Alright, thanks for the heads up :+1:

Jared Ramirez (Feb 11 2025 at 01:57):

When setting up a super basic test for function solving, I ran into a slew of zig compile errors originating in base modules. Is this a known issue?

I have a PR up with fixes to these in case it's helpful!

Sam Mohr (Feb 11 2025 at 01:59):

Zig only compiles what's being used, so those errors must've gone unnoticed until now because that code wasn't being called

Niclas Ahden (Feb 11 2025 at 12:23):

Sam Mohr said:

That leaves every part of the pipeline assigned except for statement lowering and refcounting

I don't know compilers or zig (yet), but I'll take a look at lowering :ok:

Norbert Hajagos (Feb 11 2025 at 19:57):

That was a fruitful discussion. It's great to see others taking the leap!

Wizard ish (Feb 11 2025 at 22:39):

is parsing Roc itself mentioned here, or is that for later (just asking because looking at Zig comptime it seems like it could be used to create quite a powerful parser...)

Sam Mohr (Feb 11 2025 at 22:56):

I didn't mention parsing as a volunteering option because it's being handled by two people already

Sam Mohr (Feb 11 2025 at 22:56):

Joshua and Anthony

Sam Mohr (Feb 11 2025 at 22:58):

You're right that Zig's comptime could be used to do some crazy stuff for parsing

Brendan Hansknecht (Feb 11 2025 at 23:55):

Wizard ish said:

is parsing Roc itself mentioned here, or is that for later (just asking because looking at Zig comptime it seems like it could be used to create quite a powerful parser...)

If you want to experiment with comptime and faster parsing techniques, feel free to. Once we have the fuzzer up and running, and more robust tests, we can look into trying more advanced techniques. Just want to make sure we have a stable test suite first. That said, no guarantee we will use the techniques if they make the parser too complex or hard to understand. But if it interests you, I still advise exploring.

Anthony Bullard (Feb 12 2025 at 00:21):

I think experiments are interesting, but I want to stress that we want to keep the compiler readable and easy to hack on. So let's be mindful of that before committing to more extreme forms of Wizardry here :wink: Obviously HUGE improvements would be great, but I'd like to see how performant this straight SoA/DOD approach is without (much) magic. Parsing is also like 5% of the compute time in a compile task - reducing algorithmic complexity in the build phase is the most impactful work that can be done on overall compiler performance
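For readers unfamiliar with the SoA (structure-of-arrays) style mentioned here and the "array-based IR storage" mentioned earlier in the thread, the general idea can be sketched as follows. This is a Python illustration under assumed names (`ExprStore`, `tags`, `lhs`, `rhs`), not the actual layout used in the Zig compiler: nodes live in parallel arrays and refer to each other by index rather than by pointer.

```python
from typing import List


class ExprStore:
    """Toy expression storage: one array per field, nodes are indices."""

    def __init__(self) -> None:
        self.tags: List[str] = []  # node kind: "int", "add", or "mul"
        self.lhs: List[int] = []   # literal value for "int", else child index
        self.rhs: List[int] = []   # child index, or -1 when unused

    def add(self, tag: str, lhs: int = -1, rhs: int = -1) -> int:
        """Append a node to every column and return its index."""
        self.tags.append(tag)
        self.lhs.append(lhs)
        self.rhs.append(rhs)
        return len(self.tags) - 1

    def int_lit(self, value: int) -> int:
        return self.add("int", value)

    def binop(self, tag: str, left: int, right: int) -> int:
        return self.add(tag, left, right)

    def eval(self, idx: int) -> int:
        """Evaluate the expression rooted at node idx."""
        tag = self.tags[idx]
        if tag == "int":
            return self.lhs[idx]
        a = self.eval(self.lhs[idx])
        b = self.eval(self.rhs[idx])
        if tag == "add":
            return a + b
        if tag == "mul":
            return a * b
        raise ValueError(f"unknown tag: {tag}")


store = ExprStore()
total = store.binop("add", store.int_lit(1), store.int_lit(2))
root = store.binop("mul", total, store.int_lit(3))  # (1 + 2) * 3
assert store.eval(root) == 9
```

Because every node is just an index into flat arrays, a pass that only needs the tags can scan one contiguous column, which is the usual motivation given for data-oriented designs like this.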

Last updated: Aug 17 2025 at 12:14 UTC