Stream: compiler development

Topic: upgrade-llvm-zig


view this post on Zulip Luke Boswell (Jul 24 2024 at 05:55):

I managed to learn just enough nix to update the flakes and get a dev shell set up with the correct dependencies for llvm17 and zig12, I pushed it to the llvm17-zig12 branch.

view this post on Zulip Luke Boswell (Jul 24 2024 at 05:59):

Actually... maybe not. I'm close though

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:17):

Should we go straight for llvm18 and zig13?

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:17):

They are both out, right?

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:17):

Yeah, I looked at that. The only thing I wasn't 100% on was inkwell

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:18):

If we pin to tag 0.4.0 then it looks like that includes 17, but 18 is still sitting in a CI branch and I'm guessing not ready

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:18):

inkwell github says it supports llvm 18

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:19):

I know it's in the README, but I couldn't make cargo happy. It was saying it didn't exist

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:19):

Ah, I see, that is master. 0.4.0 is only to llvm 17

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:21):

I'm currently fumbling around trying to understand the differences between rust's llvm-sys, and why it needs a very specific binary to check the llvm version. That binary isn't provided in the newer llvmPackages_18.clangUseLLVM

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:21):

Yeah, looks like they have everything updated for 18, but haven't cut a release yet

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:22):

Honestly, I would just pin the release to a git commit for now, update to llvm18 and avoid the need for two separate updates.

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:22):

Or fork and tag in a fork. I know we've done this before, not sure the exact mechanism though.

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:24):

I know the zig upgrade to 12/13 was quite painless before the recent refcount changes. I can motor through the bulk of that, but there's some low level stuff I can't fix up. One example was we are using the overflow flag from a builtin (I think) and it's no longer available in the later versions.

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:24):

I spent a bit of time trying to update our builtins for the roc-wasm4 stuff, and that was the point I got stuck and deferred for a later day.

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:25):

I figure, first stop is just getting our nix dependencies happy and working from there.

I don't have any experience with this kind of upgrade... just learning as I go

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:30):

Can we just track master maybe, and as we get closer we can pin to a specific tag, or maybe a later release?

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:35):

From the llvm-sys crate docs

llvm-sys requires a copy of llvm-config corresponding to the desired version of LLVM to build: llvm-config allows it to probe what libraries need to be linked and what compiler options are required.

Binary distributions of LLVM (including the official release packages) generally do not include a copy of llvm-config, making them unsuited to use for building programs with llvm-sys. Known exceptions (that do include a copy of llvm-config) include:
* Official Debian/Ubuntu packages from apt.llvm.org
* Arch Linux's llvm package

If a suitable binary package is not available for your platform, compiling from source is usually the best option. See Compiling LLVM in this document for details.

How would we work around this if nix doesn't provide that binary?
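
For reference, the probe that llvm-sys relies on can be approximated outside of cargo. The Python below is only a rough sketch, assuming the usual `<prefix>/bin/llvm-config` layout and the `LLVM_SYS_180_PREFIX` variable named in the build error further down this thread. The helper names (`llvm_config_path`, `parse_major`, `check_llvm`) are made up for illustration and are not part of llvm-sys or Roc.

```python
import os
import re
import subprocess
from pathlib import Path


def llvm_config_path(prefix: str) -> Path:
    """Where llvm-config is assumed to live under an LLVM install prefix."""
    return Path(prefix) / "bin" / "llvm-config"


def parse_major(version_output: str) -> int:
    """Extract the major version from `llvm-config --version` output such as '18.1.8'."""
    match = re.match(r"\s*(\d+)\.", version_output)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_output!r}")
    return int(match.group(1))


def check_llvm(prefix: str, wanted_major: int = 18) -> bool:
    """Return True if <prefix>/bin/llvm-config exists and reports the wanted major version."""
    exe = llvm_config_path(prefix)
    if not exe.is_file():
        return False
    result = subprocess.run(
        [str(exe), "--version"], capture_output=True, text=True, check=True
    )
    return parse_major(result.stdout) == wanted_major


if __name__ == "__main__":
    # An unset or empty prefix is treated as "not usable".
    prefix = os.environ.get("LLVM_SYS_180_PREFIX", "")
    print("LLVM 18 usable:", bool(prefix) and check_llvm(prefix))
```

Running this inside the dev shell is a quick way to see whether the nix output that was picked actually ships `llvm-config`.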

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:37):

Can we just track master maybe, and as we get closer we can pin to a specific tag, or maybe a later release?

Yeah, I think that's fine. At a minimum it's fine for local testing

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:38):

How would we work around this if nix doesn't provide that binary?

Should be part of the llvm nix package I believe

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:41):

nvm, I was using the wrong nix package and output. It's llvmPackages_18.libllvm.dev not llvmPackages_18.clangUseLLVM.out

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:41):

Great, now I'm through to actual roc code issues.

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:45):

There isn't a nix package for zig 13 yet. But I'll just leave it as 12 for now as they are very similar, and it should be really easy to upgrade when there is a release.

view this post on Zulip Brendan Hansknecht (Jul 24 2024 at 06:45):

Ah, then I guess you have to use llvm17. I think our version of llvm has to match zig's... though it may be fine if our version of llvm is newer than zig's, because llvm can probably load old llvm ir.

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:46):

Also pushed to a new branch that is less confusing https://github.com/roc-lang/roc/tree/upgrade-llvm-zig

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:46):

I forgot about that

view this post on Zulip Luke Boswell (Jul 24 2024 at 06:53):

Ok, based on my 2 minutes research... We have zig producing LLVM bitcode. Zig 12 produces LLVM 17.0.6, so I think we should keep them paired just in case.

view this post on Zulip Luke Boswell (Jul 24 2024 at 07:00):

Cause llvm probably can load old llvm ir.

does this help? https://releases.llvm.org/18.1.0/docs/ReleaseNotes.html#changes-to-the-llvm-ir

view this post on Zulip Luke Boswell (Jul 24 2024 at 07:01):

I just searched all of the builtins bitcode .ll files and none of these are used. So maybe we should be fine.
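
A search like this is easy to script so it can be re-run after each zig or LLVM bump. The following is a minimal sketch, not the command that was actually used: `find_in_ll_files` is a hypothetical helper, and the search strings are placeholders that would need to be replaced with the constructs listed in the LLVM release notes.

```python
import tempfile
from pathlib import Path


def find_in_ll_files(root: Path, needles: list[str]) -> dict[str, list[tuple[str, int]]]:
    """Map each search string to the (file, line number) pairs where it occurs in *.ll files."""
    hits: dict[str, list[tuple[str, int]]] = {needle: [] for needle in needles}
    for path in sorted(root.rglob("*.ll")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for needle in needles:
                if needle in line:
                    hits[needle].append((str(path), lineno))
    return hits


if __name__ == "__main__":
    # Tiny self-contained demo on a throwaway directory; the needles are placeholders.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "builtins-host.ll"
        sample.write_text("define i32 @f() {\n  ret i32 0\n}\n")
        print(find_in_ll_files(Path(tmp), ["HYPOTHETICAL_REMOVED_CONSTRUCT", "ret i32"]))
```

An empty list for every construct of interest matches the "none of these are used" result above.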

view this post on Zulip Luke Boswell (Jul 24 2024 at 07:57):

Ok, so back to a happy place. :tada:

I've got to the following point;

Now I'm up to cargo build crashing at the builtins, which is because of the obvious changes between zig versions.

view this post on Zulip Luke Boswell (Jul 24 2024 at 07:58):

@John Murray -- it's been a while, but wondering if you would be able to skim through my flake changes and let me know if there is anything I've probably broken?? https://github.com/roc-lang/roc/tree/upgrade-llvm-zig

view this post on Zulip Anton (Jul 24 2024 at 09:48):

zig 13 is on the latest nix unstable which we can easily use by updating to a more recent commit here. It's just called zig, not zig_0_13. I'd prefer to use that over the overlay because it's less complex.

view this post on Zulip Luke Boswell (Jul 24 2024 at 12:02):

Ok, sounds good.

What is the difference between a release like 24.05 and unstable? Is it like tracking main but we can still pin to a specific commit? With the intention to switch to a release later when it's available.

view this post on Zulip Anton (Jul 24 2024 at 12:07):

Is it like tracking main but we can still pin to a specific commit?

Yes, exactly. I think the main benefit of using e.g. 24.05 is better caching and build times.

view this post on Zulip Luke Boswell (Jul 26 2024 at 01:51):

@Anton -- I had a crack at switching back to using the unstable branch. I had a few issues though. For some reason that llvm_18 package doesn't include the same folder structure. It doesn't include a /lib, and it's missing lld. I've just left it using zig-overlay for now.

view this post on Zulip Basile Henry (Jul 26 2024 at 02:48):

It has multiple outputs, I would try llvm_18.dev or llvm_18.lib

view this post on Zulip Luke Boswell (Jul 28 2024 at 00:42):

Quick update on this.

To get the compiler compiling, all we needed to do was comment out a few calls to different LLVM optimisation passes, as these are no longer exposed in the new Inkwell API. I ran the gen-llvm tests and had only 1 failure. Our examples seem to run fine. But I think we should investigate what impact the API change has, and if we need to do something differently to turn optimisations on, or if it's just on by default now.

The next issue with this change is that zig has changed its cli arguments, and the --mod and --deps arguments are no longer available. We used these to pass in the zig builtins "glue" code for each of the platform's hosts.

Thanks to @Ryan Barth who helped me come up with a really simple way to work around this issue. I had been exploring using zig packages, build scripts and all kinds of things, but thankfully these aren't necessary.

We can simply use a bash script and copy the zig src files into the platform directories. There doesn't need to be any big changes, and this also plays nicely with the refactor host PR. We can gitignore the copied files, and when the script re-runs it will overwrite these.

So I'm just working on making this script and having it called once before the tests are run. Then we should be able to run the full test suite and see if anything breaks.
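
The copy step itself is small. The sketch below shows the idea in Python rather than bash and is not the actual script from the branch: `copy_glue`, the `glue` destination folder and the directory names in the demo are all assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_glue(glue_dir: Path, platform_dirs: list[Path], dest_name: str = "glue") -> list[Path]:
    """Copy every .zig file in glue_dir into <platform>/<dest_name>/, overwriting older copies."""
    copied: list[Path] = []
    for platform in platform_dirs:
        dest = platform / dest_name
        dest.mkdir(parents=True, exist_ok=True)
        for src in sorted(glue_dir.glob("*.zig")):
            # shutil.copy2 overwrites an existing file, so re-running refreshes the copies.
            copied.append(Path(shutil.copy2(src, dest / src.name)))
    return copied


if __name__ == "__main__":
    # Demo on a throwaway tree; real paths would point at the builtins and test platforms.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        glue_src = root / "builtins_src"
        glue_src.mkdir()
        (glue_src / "glue.zig").write_text("// placeholder glue code\n")
        for copied_file in copy_glue(glue_src, [root / "platform_a", root / "platform_b"]):
            print(copied_file.relative_to(root))
```

Because the copies are regenerated on every run, the destination folder can be listed in `.gitignore` as described above.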

view this post on Zulip Luke Boswell (Jul 28 2024 at 07:29):

Ok, so I think I have all of the fiddly and tedious bits out of the way. We've upgraded the test platforms, zig builtins etc.

I've left the CI parts to work through with Anton later, as we may want to do things differently -- but that should be really straightforward I think.

I'm only seeing 1 test failure when running gen-test-llvm, num_to_str_f32 -- I'm thinking this will be a simple fix, but haven't really looked at the IR or anything yet. (not that I know what I'm looking for :sweat_smile: )

---- gen_num::num_to_str_f32 stdout ----
thread 'gen_num::num_to_str_f32' panicked at crates/compiler/test_gen/src/helpers/llvm.rs:575:13:
assertion `left == right` failed: LLVM test failed
  left: "340282350000000000000000000000000000000"
 right: "340282346638528860000000000000000000000"


failures:
    gen_num::num_to_str_f32

test result: FAILED. 1293 passed; 1 failed; 17 ignored; 0 measured; 0 filtered out; finished in 612.91s

The only other tests that are failing now are in roc_glue, which all look to be related to the same issue. I think it's related to zig builtin linking -- maybe zig changed the way it does things, or something about removing dead code.

$ ./target/debug/roc build crates/glue/tests/fixtures/basic-record/app.roc
🔨 Rebuilding platform...
Undefined symbols for architecture arm64:
  "_roc_getppid", referenced from:
      _roc_builtins.utils.expect_failed_start_shared_file in roc_appNDMFxA.o
      _roc_builtins.utils.read_env_shared_buffer in roc_appNDMFxA.o
  "_roc_mmap", referenced from:
      _roc_builtins.utils.expect_failed_start_shared_file in roc_appNDMFxA.o
      _roc_builtins.utils.read_env_shared_buffer in roc_appNDMFxA.o
  "_roc_shm_open", referenced from:
      _roc_builtins.utils.expect_failed_start_shared_file in roc_appNDMFxA.o
      _roc_builtins.utils.read_env_shared_buffer in roc_appNDMFxA.o
ld: symbol(s) not found for architecture arm64
crates/glue/tests/fixtures/basic-record/app: is already signed
thread 'main' panicked at crates/compiler/build/src/program.rs:1031:17:
not yet implemented: gracefully handle `ld` (or `zig` in the case of wasm with --optimize) returning exit code Some(1)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

view this post on Zulip Luke Boswell (Jul 28 2024 at 07:34):

Also, I would appreciate assistance from anyone with rust or LLVM knowledge to look at the changes in crates/compiler/gen_llvm/src/llvm/build.rs and help me investigate what the changes mean.

I think this is the relevant part of the release notes.

view this post on Zulip Brendan Hansknecht (Jul 28 2024 at 07:39):

I bet that zig changed their float printing algorithm and how it rounds

view this post on Zulip Brendan Hansknecht (Jul 28 2024 at 07:39):

So I bet the test case is incorrect now.

view this post on Zulip Luke Boswell (Jul 28 2024 at 10:34):

What a guess. Looks like you're correct :smile:

0.11.0

$ zig build run
All your 340282346638528860000000000000000000000 are belong to us.
Run `zig build test` to run the tests.

0.13.0

(nix:nix-shell-env) 192-168-1-103:zig-test-13 luke$ zig build-exe src/main.zig && ./main
All your 340282350000000000000000000000000000000 are belong to us.

view this post on Zulip Brendan Hansknecht (Jul 28 2024 at 16:29):

I work with floats a lot in ml... that just looked like the minimum-length float string that would still parse back to the original value
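
The "shortest string that still parses back" idea can be checked directly. This is a small sketch under the assumption that the value in the failing test is the largest finite f32; it brute-forces the digit count with Python's `struct` module and is not how zig's formatter is implemented. The function names are invented for the example.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (an f64) to the nearest f32 and return it widened back to f64."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_trips(text: str, value: float) -> bool:
    """True if parsing `text` and rounding to f32 gives back exactly `value`."""
    try:
        return to_f32(float(text)) == value
    except OverflowError:
        # struct refuses values that would round to infinity as an f32.
        return False


def shortest_f32_digits(value: float) -> str:
    """Shortest scientific-notation string that round-trips to the f32 `value`."""
    for precision in range(0, 9):  # up to 9 significant digits is enough for any f32
        candidate = f"{value:.{precision}e}"
        if round_trips(candidate, value):
            return candidate
    return f"{value:.8e}"


if __name__ == "__main__":
    f32_max = to_f32(3.4028234663852886e38)
    print(shortest_f32_digits(f32_max))          # 8 significant digits: 3.4028235e+38
    print(round_trips("3.4028235e38", f32_max))  # the short form parses back to the same f32
```

Both the long and the short renderings denote the same f32, so the newer output is the more compact of two correct answers.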

view this post on Zulip Anton (Jul 29 2024 at 09:25):

Also, I would appreciate assistance from anyone with rust or LLVM knowledge to look at the changes

I would try to find other rust projects that use inkwell and that have already performed this update to the new pass manager.

view this post on Zulip Luke Boswell (Jul 30 2024 at 19:43):

Thanks to @Folkert de Vries we're down to one issue blocking a run at CI now I think -- just the "Undefined symbols for architecture arm64" thing above which I haven't looked into yet.

We got the optimisation passes back (or at least hacked in) and working for llvm, and fixed the gen_num::num_to_str_f32 test.

view this post on Zulip Luke Boswell (Jul 31 2024 at 04:50):

Ok, 100% tests are :check: passing on my mac :tada:

view this post on Zulip Luke Boswell (Jul 31 2024 at 04:52):

I might give CI a run just to see if everything passes there too.

Still need to clean up the LLVM stuff, mostly ripping out the unused things, and also fixing up the pointer types which are giving us a warning now -- but should be pretty straightforward.

view this post on Zulip Luke Boswell (Jul 31 2024 at 04:54):

We are still using the zig and zls overlays... I couldn't get the nixpkgs thing working when I tried. I might defer changing that back to someone who knows what they are doing.

view this post on Zulip Luke Boswell (Jul 31 2024 at 04:55):

Also, there are documentation and various references to the old LLVM 16 which need to be updated

view this post on Zulip Luke Boswell (Jul 31 2024 at 05:13):

Ok, CI fails with this. Which I think means I've done something wrong in nix/builder.nix

$ nix build
warning: Git tree '/Users/luke/Documents/GitHub/roc' is dirty
warning: input 'rust-overlay' has an override for a non-existent input 'flake-utils'
error: builder for '/nix/store/gi3j8vs8qq54zy5vv0zji6h2l6cdrqsv-roc_cli-0.0.1.drv' failed with exit code 101;
       last 25 log lines:
       >    Compiling roc_bitcode_bc v0.0.1 (/private/tmp/nix-build-roc_cli-0.0.1.drv-0/source/crates/compiler/builtins/bitcode/bc)
       >    Compiling packed_struct v0.10.1
       >    Compiling inkwell v0.4.0 (https://github.com/TheDan64/inkwell?rev=89e06af#89e06af9)
       > error: No suitable version of LLVM was found system-wide or pointed
       >               to by LLVM_SYS_180_PREFIX.
       >

nix develop works for me, so it's just a nix setup thing.

view this post on Zulip Luke Boswell (Jul 31 2024 at 05:35):

I think I (actually Claude) found a solution.

view this post on Zulip Brendan Hansknecht (Aug 01 2024 at 03:59):

Awesome thing about the llvm 18 upgrade. Nice perf gain:

Summary
  ./cc-fluxsort ran
    1.20 ± 0.02 times faster than ./roc-builtinsort-new
    1.36 ± 0.02 times faster than ./cc-quadsort
    1.64 ± 0.02 times faster than ./roc-builtinsort-old

Not quite up to speed with raw c++ on m1 mac, but now in the same ballpark.

view this post on Zulip Folkert de Vries (Aug 01 2024 at 07:38):

I think we also just run more passes now maybe? anyway that is really nice

view this post on Zulip Luke Boswell (Aug 04 2024 at 00:31):

(deleted)

view this post on Zulip Luke Boswell (Aug 04 2024 at 03:36):

Update with this PR.

It's passing all tests on macos in the nix shell for me locally. I've kicked off CI to see if it passes there too.

Our CI machines don't have ZIG 13 and LLVM 18 installed, so I expect the non-nix machines to fail.

I've cleaned up all of the LLVM issues, and updated the stray references to LLVM 16 -- there's some work here to upgrade the CI machines and build scripts. I don't think I can make progress on this part without Anton.

There is one blocking issue on linux that needs further investigation. Unfortunately, I don't think I can make much progress to investigate this. It looks to be related to absolute relocations in the surgical linker. @Brendan Hansknecht and I thought we fixed it by adding an llvm globaldce pass, but it's still an issue.

___________
The roc command:

  "/home/lb-dev/Documents/Github/roc/target/release/roc --max-threads=1 /home/lb-dev/Documents/Github/roc/examples/helloWorld.roc --"

had unexpected stderr:

  The surgical linker currently has issue #3609 and would fail linking your app.
Please use `--linker=legacy` to avoid the issue for now.

___________

TLDR - we're really close, one linux surgical linker issue and then just CI admin to go. :fingers_crossed:

view this post on Zulip Luke Boswell (Aug 04 2024 at 10:20):

New linux specific issue

thread 'cli_run::expects_dev_and_test' panicked at crates/cli/tests/cli_run.rs:153:13:

___________
The roc command:

  "/home/lb-dev/Documents/Github/roc/target/debug/roc dev --max-threads=1 /home/lb-dev/Documents/Github/roc/crates/cli/tests/expects/expects.roc --"

had unexpected stderr:

  🔨 Rebuilding platform...
thread 220007 panic: cast causes pointer to be null
/nix/store/5yk32f31879lfsnyv0yhl0af0v2dz9dz-zig-0.13.0/lib/zig/std/start.zig:497:40: 0x55555560ee46 in main (host)
    return callMainWithArgs(@as(usize, @intCast(c_argc)), @as([*][*:0]u8, @ptrCast(c_argv)), envp);
                                       ^
???:?:?: 0x7ffff7df214d in ??? (libc.so.6)
Unwind information for `libc.so.6:0x7ffff7df214d` was not available, trace may be incomplete


___________

view this post on Zulip Luke Boswell (Aug 04 2024 at 10:58):

I think I found the issue -- hard to know until I get the changes across to my linux machine and run a full test suite. But basically zig is stricter now with exporting symbols I think -- and the ABI of those symbols. So you can't have a host with signature pub fn main() !void { any longer, it needs to be pub export fn main() u8 to link correctly with the roc side.

view this post on Zulip Luke Boswell (Aug 05 2024 at 04:18):

I've got most of the CI tests passing on macos/linux now.

I'm down to the wasm repl tests and we've got a bunch of errors like

thread '<unnamed>' panicked at crates/repl_wasm/src/repl.rs:265:86:
called `Result::unwrap()` on an `Err` value: ParseError { offset: 469912, message: "Unknown relocation type 0x b" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I haven't figured out where this is coming from.

My best guess is that zig 13 behaves differently -- and the way we are using zig to build the test platform, get wasi-libc, and link things together needs to be investigated.

I've tested the cli repl manually and it looks like there is no issue. I haven't figured out how to spool up a wasm repl to test that manually.

I can repro the test failures locally in nix using bash crates/repl_test/test_wasm.sh.

view this post on Zulip Anton (Aug 05 2024 at 08:31):

Lots of progress :tada: I'll try to install the CI dependencies today

view this post on Zulip Anton (Aug 05 2024 at 17:11):

zig 13 and llvm 18 have been installed on all machines

view this post on Zulip Luke Boswell (Sep 05 2024 at 02:34):

Run curl.exe -f -L -O -H "Authorization: token ***" https://github.com/roc-lang/llvm-package-windows/releases/download/v18.1.8/LLVM-18.1.8-win64.7z
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 404

@Anton -- when you get some time, can you please load LLVM 18 for windows onto roc-lang/llvm-package-windows.

view this post on Zulip Luke Boswell (Sep 05 2024 at 03:01):

I commented all the jobs out in CI manager (GH workflows) except the nix-linux-x64 and nix-macos-apple-silicon jobs. I just want to see if we can get this PR to pass all the tests on those first and add the others back once we know it's working correctly.

view this post on Zulip Luke Boswell (Sep 05 2024 at 03:10):

I've spent some time looking at this PR today. I merged main, and ran through the tests and pushed to see how it runs against CI.

My conclusion is that we should wait until after the build-host PR lands before trying to land this. There are issues that affect both, and fixing them over there first, and then merging will be much simpler.

view this post on Zulip Anton (Sep 06 2024 at 09:32):

can you please load LLVM 18 for windows onto roc-lang/llvm-package-windows

Done

view this post on Zulip Luke Boswell (Oct 08 2024 at 08:09):

Just merged main, and ran a full test using nix on macos and linux with both passing.

$ nix develop
$ cargo test --release

view this post on Zulip Luke Boswell (Nov 12 2024 at 22:08):

Ok, merged main in... let's see how CI fares. It's looking pretty good :smiley:

view this post on Zulip Luke Boswell (Nov 12 2024 at 22:12):

We talked about making a zig package in https://roc.zulipchat.com/#narrow/channel/304902-show-and-tell/topic/Advent.20of.20Code.20platform/near/481816535 @Jasper Woudenberg

I think this is a great idea, though we can probably live with my workaround for the purpose of landing this change. Essentially, I've made a rust crate that copies the zig glue files into each of the test platforms once per test run. It works, but it would be nice to use the zig ecosystem properly and clean up these zig test platforms (and also improve the zig platform dev experience).

view this post on Zulip Luke Boswell (Nov 12 2024 at 23:20):

Everything is :check: for me locally on my mac.

I think all that's left to land this is some code review, minor cleanup, and fixing the GH actions (making sure the zig 13, llvm deps are correct etc).

view this post on Zulip Anton (Nov 13 2024 at 09:52):

I'll check it out :)

view this post on Zulip Luke Boswell (Nov 30 2024 at 00:39):

Ok, I've got this PR :check: on both apple silicon macos, and x64 linux running in Nix. I think we are on the home stretch. :smiley:

view this post on Zulip Luke Boswell (Dec 12 2024 at 00:26):

@Brendan Hansknecht we need to merge the latest changes.

I haven't got around to investigating why the nix CI's are failing tests now. I thought they were good.

I'm expecting the only failure that should remain is CI manager / start-ubuntu-x86-64-tests / test zig, rust, wasm... that looks to be failing with this which needs investigation.

+ zig test -target wasm32-wasi-musl -O ReleaseFast src/main.zig --test-cmd ../../../../target/release/roc_wasm_interp --test-cmd-bin
thread 'main' panicked at crates/wasm_interp/src/wasi.rs:142:26:
not yet implemented: WASI fd_fdstat_get([I32(2), I32(16777168)])

view this post on Zulip Luke Boswell (Dec 12 2024 at 00:27):

I'm still messing around with basic-cli and basic-webserver, just wanted to give you an update

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 00:30):

Did we leave in some sort of debug printing that might call fd_fdstat_get?

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 00:31):

Accidentally compiling it into the zig bitcode for wasm?

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 00:32):

Oh, also looks like we have special cased stdout for fd_fdstat_get, but don't have anything for stderr. So if it isn't an accidental extra debug print, looks easy to add the functionality.

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 00:38):

don't see any debug prints. Let me try to push a fix

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 01:39):

Ok. Pushed a handful of changes. I'm hopeful

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 02:09):

Benchmarks are failing in a way that doesn't make sense to me: https://github.com/roc-lang/roc/actions/runs/12288409183/job/34292167154?pr=6921

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 02:09):

It is failing in a function that no longer exists on the branch somehow:

cannot find `glue.zig`. Check the source code in find_zig_glue_path() to show all the paths I tried.
Location: crates/compiler/build/src/link.rs:95:5

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 02:11):

oh, it's in bench-folder-main...that makes sense

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 02:12):

So now :fingers_crossed:

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 03:16):

@Anton any idea for https://github.com/roc-lang/roc/actions/runs/12288516921/job/34292459513?pr=6921

  zig build ir -Drelease=true failed with:

    ir
  +- install generated to builtins-host.ll
     +- zig build-obj builtins-host ReleaseFast native-macos failure
  error: error: CacheUnavailable

looks to only happen for nix macos on CI. Note, zig cache dir is now .zig-cache

view this post on Zulip Luke Boswell (Dec 12 2024 at 03:49):

Normally I just restart the runs when this happens and it fixes itself

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 03:49):

I've seen it multiple times in a row now, but hopefully goes away

view this post on Zulip Luke Boswell (Dec 12 2024 at 03:50):

Ah that's a pain

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 04:17):

Yeah, seems to be hit 100% of the time on the mac builder during nix-build

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 04:28):

Also, I'm hopeful that's soon to be the last error...but hitting relocation issues in wasm and benchmarking issue due to glue.zig being removed. I think I have a fix for both, but hard to say.

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 06:13):

I have been poking at this for a while and am not having luck.

nix-build on apple silicon is failing with: https://github.com/roc-lang/roc/actions/runs/12290625251/job/34298069385?pr=6921

   zig build ir -Drelease=true failed with:

    ir
  +- install generated to builtins-host.ll
     +- zig build-obj builtins-host ReleaseFast native-macos failure
  error: error: CacheUnavailable

I can't repro locally. Probably related to https://github.com/ziglang/zig/issues/20501


the benchmark job is failing due to main being unable to find glue.zig: https://github.com/roc-lang/roc/actions/runs/12290625251/job/34298069754?pr=6921

I get that we removed the file, but there are multiple search paths in find glue and I'm pretty sure it should be finding the file.


Lastly, wasm_test_platform.wasm is broken with the new version of zig building it. I have no idea why, but it is now emitting new types of relocations we don't support. I tried toggling PIC related flags and thought I had figured it out, but it is still broken. I can repro this one locally, but have not found a fix.

https://github.com/roc-lang/roc/actions/runs/12290625251/job/34298070201?pr=6921


Overall, feels close, but it is hard to tell if these issues will be minor flag changes or major drags to figure out.

view this post on Zulip Brendan Hansknecht (Dec 12 2024 at 06:56):

I think this will fix the benchmarking issue. It needs to land in main such that the main reference loaded for benchmarking is correct:
https://github.com/roc-lang/roc/pull/7349

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 00:52):

So ended up not fully fixing the benchmarking issue. Realized that the current failure is due to all benchmarks running in the context of the new nix flake. This means that all benchmarks are trying to run with zig 0.13.0. Given main is only at zig 0.11.0, this breaks. I don't think this is worth fixing now though (just think it is too much hassle for something that only matters on every zig upgrade). So I think once the other two issues are fixed, we should land this PR with the benchmark failing.

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 00:54):

@Anton when you get the chance can you look at the macos failure with the unavailable cache? I feel like you will be most likely to figure out how to workaround that.

view this post on Zulip Luke Boswell (Dec 13 2024 at 01:11):

I'm digging into gen-wasm issue now. :fingers_crossed:

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 01:51):

I think gen wasm should be fixed now

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 01:51):

So if we can fix the mac cache issue, I think we can land

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:10):

Figured out the mac cache issue :tada:

That said, turns out that wasm isn't fully fixed. Still has 1 test failing after the previous fix.

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:17):

Which test?

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:18):

linking_with_dce

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:21):

reverting the platform change fixes it

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:22):

So :fingers_crossed: that this next ci run passes everything but benchmarking

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:23):

I thought I had to change that for zig 13, maybe that was only required in another platform -- or a specific OS/Arch

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:23):

I mean... I think it should be required

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:24):

Actually, I guess not. Cause zig is controlling the main. There is no c main needed, right?

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:24):

So I think for anything wasi, this should be fine

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:24):

I guess for anything called into by js, it is probably required to do things differently

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:27):

I suspect it may be something like we build wasm_linking_test_host.zig using zig cli into an object and not a static library.

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:28):

in this case, we build it into an exe

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:28):

so that is probably why main() !void is allowed

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:30):

Yeah, something like that. It's building into an object file build-obj, and then into an exe using build-exe in two steps. I suspect where we build a static library zig doesn't like the !void

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:31):

Digging into that stuff was hella confusing, the combination of different options, exporting vs not exporting, static library vs executable, zig 0.11 vs zig 0.13.

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 03:31):

makes sense. Expects to follow c-abi cause static lib is not expected to have a main

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:33):

All the wasm tests pass for me locally.

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:35):

I'm going to read through all the changes and see if we've missed anything obvious

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:53):

I haven't found anything blocking...

there was one TODO left where we verify the LLVM IR, should we measure the impact on performance?

we need to verify/test some of the building from source README changes, like is the llvm-18 package available on fedora etc

The sort -> insertion change in crates/compiler/builtins/benchmark-dec.zig may have broken things, but I guess we can fix this in a follow up and also resolve the issue preventing the benchmark CI runner from completing.

view this post on Zulip Luke Boswell (Dec 13 2024 at 03:54):

Also, just making notes here to summarise as that PR is so massive my browser struggles to find things

view this post on Zulip Luke Boswell (Dec 13 2024 at 04:10):

We did it :tada: :smiley:

Screenshot 2024-12-13 at 15.09.49.png

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 04:10):

there was one TODO left where we verify the LLVM IR, should we measure the impact on performance?

I think it is fine to leave in the todo for now. I added more color in the PR.

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 04:10):

let's merge this!

view this post on Zulip Luke Boswell (Dec 13 2024 at 04:11):

Screenshot 2024-12-13 at 15.11.18.png

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 04:12):

:tada:

view this post on Zulip Ryan Barth (Dec 13 2024 at 04:13):

Looks like it is time to brush off my static Roc branch

view this post on Zulip Luke Boswell (Dec 13 2024 at 04:14):

Don't need to ask me twice :smiley:

Thank you @Brendan Hansknecht @Anton @Sam Mohr @Folkert de Vries for all the work to land this.

For anyone interested, this change ;

view this post on Zulip Luke Boswell (Dec 13 2024 at 04:26):

We also may have broken some things or made a typo in the building from source documentation, so if you are able to do that (especially any non-nix users on different os/arch's) and can report any success/failures, that would be really helpful.

Also apologies in advance @Anton if our scripts for making a release are broken at all, I'm pretty sure they're all good, but I couldn't fully check them. I figure we can give it a test run with the prerelease for the new PI basic-cli

view this post on Zulip Sam Mohr (Dec 13 2024 at 05:01):

That static linking could be really nice for weird deployments!

view this post on Zulip Sam Mohr (Dec 13 2024 at 05:04):

I'm thankful to the people that did real work on this and made me look like I did something :sweat_smile:

view this post on Zulip Luke Boswell (Dec 13 2024 at 05:05):

Moral support counts :blush:

view this post on Zulip Oskar Hahn (Dec 13 2024 at 07:15):

Was the inline expect change part of this PR? Or how is it possible to statically link a roc binary with an inline expect?

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 07:18):

inline expects are not emitted in output object files or libraries

view this post on Zulip Brendan Hansknecht (Dec 13 2024 at 07:18):

They are only emitted when running with roc main.roc or roc dev main.roc

view this post on Zulip Ryan Barth (Dec 13 2024 at 07:54):

Luke Boswell said:

We also may have broken some things or made a typo in the building from source documentation, so if you are able to do that (especially any non-nix users on different os/arch's) and can report any success/failures, that would be really helpful.

Linux x86_64 checking in (Arch BTW). Successfully compiled roc with llvm 18.1.8 and zig 0.13.0 :sign_of_the_horns: . Got a mostly passing test suite. It hung with

test cli_tests::test_platform_simple_zig::module_params_multiline_pattern has been running for over 60 seconds

I also had a few

test cli_tests::test_platform_basic_cli::combine_tasks_with_record_builder ... ignored, broken when running in nix CI, TODO replace with a zig test platform

despite not using nix.


Last updated: Jul 06 2025 at 12:14 UTC