Stream: compiler development

Topic: zig 16


view this post on Zulip Luke Boswell (May 29 2026 at 06:21):

I'd like to talk about our Zig 16 upgrade plans... specifically this PR https://github.com/roc-lang/roc/pull/9341

I've been working on that, and have rebased it a few times against main. I'd take a punt and estimate it's more than 90% done, as of the last time we updated it. I've had it passing zig build minici locally on all three of Windows, macOS, and Linux multiple times, but never got a full CI run green.

Before today I was aiming for after the MIR changes landed (which they have now) ... but I'm now starting to think we should hold off on zig 16 upgrade for a while longer.

There are at least a few critical PRs in the pipeline that I could see an argument for landing first.

Today I started planning the upgrade for roc-wasm4 platform and realised we never landed the WASM Surgical Linking PR. The surgical linker is necessary so roc build can produce a wasm module (without adding another linker wasm-ld or some other linking approach). I can't remember why I paused work on that but I think it was blocked for some reason.

Also we are close to landing the LLVM backend.

So I feel like if we wait a few more weeks (rough estimate) we could be in a position where we have more of the platforms (including wasm targets) and the associated test apps and examples in a working state to help identify regressions in Roc.

I guess the alternative approach is to pause work in progress and focus on landing the Zig 16 upgrade. The other PRs in flight are smaller, so they'll be easier to upgrade.

Anyone have any thoughts or advice on how to navigate this?

view this post on Zulip Anton (May 29 2026 at 12:37):

So I feel like if we wait a few more weeks (rough estimate) we could be in a position where we have more of the platforms (including wasm targets) and the associated test apps and examples in a working state to help identify regressions in Roc.

We had a lot of memory bugs when upgrading last time, so waiting seems like the best approach.

view this post on Zulip Richard Feldman (May 29 2026 at 13:04):

I'd actually kinda prefer to just land it sooner

view this post on Zulip Richard Feldman (May 29 2026 at 13:04):

because the more stale it gets, the longer it'll take to land

view this post on Zulip Richard Feldman (May 29 2026 at 13:04):

and right now is kind of a nice time to have regressions on the new compiler bc nobody is relying on it yet :smile:

view this post on Zulip Richard Feldman (May 29 2026 at 13:05):

now that LLVM has landed, I'm fine pausing landing things until Zig 16 is in

view this post on Zulip Richard Feldman (May 29 2026 at 13:05):

especially because my Codex rate limits reset tomorrow :joy:

view this post on Zulip Luke Boswell (May 30 2026 at 00:35):

Ok, let's hold off merging anything into main for a little. I'll try and have zig 16 ready this weekend.

view this post on Zulip Luke Boswell (May 30 2026 at 12:03):

I've made good progress so far, pushed a few commits. I've updated Linux and that was passing minici, now switched across to Windows and working through issues there. Back to zig build test passing... now onto the full zig build minici before I switch to macos and see if there are any issues there too.

view this post on Zulip Luke Boswell (May 30 2026 at 12:16):

@Anton or @Richard Feldman would you mind looking at the CI issues? I can look at them tomorrow when I get up, but I'm guessing it's a few minor git workflow configuration things

view this post on Zulip Anton (May 30 2026 at 12:46):

I will :)

view this post on Zulip Anton (May 30 2026 at 18:23):

I have fixes locally for the typos and tracy issues, I spent a bunch of time looking at the best way to fix all the llvm: FAIL 'LinkFailed', claude is working on implementing our latest plan now :)

view this post on Zulip Anton (May 30 2026 at 18:23):

I will continue tomorrow.

view this post on Zulip Luke Boswell (May 30 2026 at 20:38):

Are you able to push any fixes you have? I can continue with it soon.

view this post on Zulip Luke Boswell (May 31 2026 at 01:21):

Anton said:

I have fixes locally for the typos and tracy issues, I spent a bunch of time looking at the best way to fix all the llvm: FAIL 'LinkFailed', claude is working on implementing our latest plan now :)

As a second line of effort here -- I'm going to try rebasing my surgical linker branch onto zig-16. I found and fixed some flaky LLVM build issues there which I think may be related to this. So I assume it will port across and may help with zig 16 CI.

view this post on Zulip Luke Boswell (May 31 2026 at 02:28):

Screenshot 2026-05-31 at 12.28.01.png

This looks relevant... switching to our embedded LLD instead of using cc

view this post on Zulip Luke Boswell (May 31 2026 at 02:31):

@Richard Feldman this looks like a root cause for our very slow LLVM tests too

view this post on Zulip Richard Feldman (May 31 2026 at 02:37):

oh yeh we shouldn't be doing that haha

view this post on Zulip Richard Feldman (May 31 2026 at 02:37):

we should be using lld on all targets for this

view this post on Zulip Richard Feldman (May 31 2026 at 02:37):

not just macOS

view this post on Zulip Luke Boswell (May 31 2026 at 03:09):

Does anyone run Nix here and can help with the Zig 16 branch.... need to bump the flake and re-generate the lock

The zig-16 branch is fully migrated to Zig 0.16 everywhere except the nix CI leg. build.zig:2539/:2554 use the 0.16-only std.Io.Dir API, but src/flake.nix:47 still pins zig = pkgs.zig_0_15, so the nix shell runs the build with 0.15 → compile error.

Recommended fix

Two parts:

1. src/flake.nix:47: pkgs.zig_0_15 → pkgs.zig_0_16
2. Regenerate src/flake.lock: the locked nixpkgs is from 2025-10-15, which predates the 0.16.0 release and has no zig_0_16. Bump it:
   cd src && nix flake update nixpkgs

Then commit both src/flake.nix and src/flake.lock.

view this post on Zulip Luke Boswell (May 31 2026 at 06:30):

I spoke with Richard about making LLVM backend in eval tests opt-in and switching that on for only one CI workflow. I've added that in my surgical linker branch (which I've rebased onto zig-16) and it significantly speeds up CI.
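
For illustration only, an opt-in switch of this kind could be sketched as below. This is a Python sketch, not the mechanism used on the branch: the environment variable ROC_EVAL_LLVM and the backend names are hypothetical.

```python
import os

# Hypothetical backend names; the real eval test harness may use different ones.
DEFAULT_BACKENDS = ["interpreter", "dev"]
OPT_IN_BACKENDS = ["llvm"]


def select_backends(env=None):
    """Return the eval backends to run, adding LLVM only when opted in."""
    env = os.environ if env is None else env
    backends = list(DEFAULT_BACKENDS)
    # ROC_EVAL_LLVM is an assumed variable name that one CI workflow would set.
    if env.get("ROC_EVAL_LLVM", "") == "1":
        backends.extend(OPT_IN_BACKENDS)
    return backends
```

With a switch like this, every workflow runs the default backends, and only the one workflow that sets the variable pays for the LLVM eval suite.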

view this post on Zulip Niclas Ahden (May 31 2026 at 10:21):

@Luke Boswell https://github.com/roc-lang/roc/pull/9494 hope that helps!

view this post on Zulip Luke Boswell (May 31 2026 at 10:25):

Thank you @Niclas Ahden

view this post on Zulip Niclas Ahden (May 31 2026 at 10:29):

Thank YOU for pushing this forward! :bow: :smiley:

view this post on Zulip Anton (May 31 2026 at 18:05):

Luke Boswell said:

I spoke with Richard about making LLVM backend in eval tests opt-in and switching that on for only one CI workflow. I've added that in my surgical linker branch (which I've rebased onto zig-16) and it significantly speeds up CI.

Do we still have enough other tests that run with llvm on all operating systems?

view this post on Zulip Anton (May 31 2026 at 18:59):

Doing something similar to PR#9473 may also speed up those eval tests a lot

view this post on Zulip Luke Boswell (Jun 01 2026 at 00:09):

Anton said:

Do we still have enough other tests that run with llvm on all operating systems?

I think running the LLVM eval tests on a single machine is acceptable for this specific narrow case.

The eval tests run the common compiler pipeline down to LIR, and the non-LLVM eval backends still run on the OS matrix. That means OS-specific bugs during LIR lowering will be caught there.

The LLVM-specific eval tests are focussed on testing LIR -> LLVM bitcode. In MonoLlvmCodeGen.zig, there isn't much OS-specific lowering. So I think running the whole LLVM eval suite on every OS would be overkill and it's also very expensive.

view this post on Zulip Richard Feldman (Jun 01 2026 at 00:29):

yeah if there's any difference it's probably an LLVM bug :sweat_smile:

view this post on Zulip Richard Feldman (Jun 01 2026 at 00:29):

but we could, just to be extra safe, start doing like "after every main merge, kick this off and tell us if it breaks" more thorough runs

view this post on Zulip Luke Boswell (Jun 01 2026 at 00:32):

Interesting idea you gave me... what if, for individual PRs, you opted into the specific CI or tests that are relevant for your PR? Then after the PR merges we kick off a larger "run all the things" run (which is much slower, and so may actually be running against multiple PRs that landed since the last run).
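
A rough sketch of the selection logic this implies, in Python and independent of any particular CI system. The path prefixes and suite names other than minici are hypothetical, and how this would be wired into GH actions is left open.

```python
# Hypothetical mapping from source path prefixes to the extra suites they need.
SUITES_BY_PREFIX = {
    "src/backend/llvm/": {"llvm-eval"},
    "src/linker/": {"wasm-link"},
}
# Suites every PR runs; "minici" is the only name taken from this thread.
ALWAYS = {"minici"}
# Everything, for the slower run after a merge to main.
FULL = {"minici", "llvm-eval", "wasm-link", "nix"}


def suites_for_pr(changed_paths):
    """Pick the suites relevant to a PR from the paths it touches."""
    suites = set(ALWAYS)
    for path in changed_paths:
        for prefix, extra in SUITES_BY_PREFIX.items():
            if path.startswith(prefix):
                suites |= extra
    return sorted(suites)


def suites_for_main():
    """After a merge, run all suites regardless of what changed."""
    return sorted(FULL)
```

Because the post-merge run covers everything, a suite skipped on a PR would still get exercised, just later and possibly against several merged PRs at once.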

view this post on Zulip Luke Boswell (Jun 01 2026 at 00:32):

I'm not sure if something like that is even possible with GH actions... just a thought though

view this post on Zulip Richard Feldman (Jun 01 2026 at 02:07):

yeah I don't think it is unfortunately

view this post on Zulip Anton (Jun 01 2026 at 10:58):

I think it is possible but it requires a non-trivial amount of custom scaffolding.

view this post on Zulip Anton (Jun 01 2026 at 11:01):

In the future I would also like to set up some CI stats tracking so that we can identify which workflows fail very rarely and just run them once a day.
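
As a minimal sketch of that kind of tracking, assuming run history is available as (workflow name, passed) pairs. The thresholds and the function name are illustrative assumptions, not an existing tool.

```python
from collections import defaultdict


def rarely_failing(runs, max_failure_rate=0.01, min_runs=50):
    """Return workflows whose observed failure rate is at or below the threshold.

    `runs` is an iterable of (workflow_name, passed) pairs. Workflows with
    fewer than `min_runs` recorded runs are skipped, since there is too
    little data to call them reliable.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for name, passed in runs:
        totals[name] += 1
        if not passed:
            failures[name] += 1
    return sorted(
        name
        for name, total in totals.items()
        if total >= min_runs and failures[name] / total <= max_failure_rate
    )
```

Workflows returned here would be the candidates for moving to a once-a-day schedule.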

view this post on Zulip Richard Feldman (Jun 01 2026 at 21:37):

I'm gonna merge this tonight even though we still have a CI failure on Windows (might be a flake? I'm not sure) and also one on Nix (should be a quick fix for anyone with a nix machine - just need to run a command to regenerate some hashes).

there are a bunch of PRs stacked up that I want to start landing, and I don't think it's worth continuing to block those on 0.16 when it's working aside from those two issues! :smile:

view this post on Zulip Richard Feldman (Jun 02 2026 at 00:07):

ok this is merged now! A few ci steps will fail for now, but that's an ok tradeoff to accept I think :smile:


Last updated: Jun 16 2026 at 16:19 UTC