If any of you were getting "No space left on device" errors for apple silicon test, that should now be resolved.
@Anton I just got one for https://github.com/roc-lang/roc/pull/5983. They ran ~8 hours ago with the last commit from main being 2afd9ca0a9b846fc4127b3d7fb55c521c6ae9ff9, which was done on Nov 20.
Here are the two failed runs: devtools macos and nix macos apple silicon.
I re-merged main into the PR branch, so the workflows are waiting for approval again, but idk if that fixes anything
I've approved the run, I think merging main should fix it, I very recently added a clean up step that should prevent "No space left on device".
The apple silicon CI server is not picking up jobs, I'm investigating...
Should be fixed now
Cool! Do I need to merge main, or do anything?
No, your jobs are in the queue and will be started automatically
maybe we should add some paths-ignore to our workflows so we don't run them when only .md files change, e.g.
on:
  push:
    branches:
      - main
    paths-ignore:
      - '**/*.md'
  pull_request:
    branches:
      - main
    paths-ignore:
      - '**/*.md'
we might need to make an exception for a workflow that rebuilds the website, since we have .md files which go into that
paths-ignore will not work unfortunately, but a more convoluted solution using if: should be possible. I'll try to look at that this week.
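For illustration, here is a minimal Rust sketch of the decision an if:-based solution would have to make: skip the tests only when every changed file is Markdown. The function names and the example paths are hypothetical, and this is not how the roc workflows actually implement it.

```rust
/// Returns true when every changed path is a Markdown file.
/// An empty change list returns false, so CI still runs when the
/// change set could not be determined (an assumption of this sketch).
fn only_markdown_changed(changed_paths: &[&str]) -> bool {
    !changed_paths.is_empty()
        && changed_paths
            .iter()
            .all(|path| path.to_ascii_lowercase().ends_with(".md"))
}

/// Hypothetical gate: run the test workflows unless the change is docs-only.
fn should_run_tests(changed_paths: &[&str]) -> bool {
    !only_markdown_changed(changed_paths)
}
```

A workflow that rebuilds the website would need its own gate, since .md files feed into it.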
Is it possible to also not run CI on draft PRs? I just cancelled CI for something that is WIP, but I wanted to push it to a PR to record my progress.
Apparently this could work
on:
  push:
    branches:
      - main
  pull_request:
    branches: [main]
    paths:
      - "**"
      - "!/*.md"
      - "!/**.md"
    types:
      - ready_for_review
But you have to create them as Draft and then click "Ready for Review".... hmmm, I'm not good with this stuff
GitHub's blog post, from when they first announced the feature, shows a screenshot of how to do it. It still looks the same. Once you create it, the "ready for review" button appears near the bottom of the PR page.
Apparently this could work
I think there are some problems with that approach. Including [skip ci] in your commit message is a simple and effective way.
For knowledge, what are the problems with the approach Luke listed?
push is only triggered after the PR is merged, which would be too late :p
If there are only md file changes, required checks (github settings) would not be completed, because of this issue. If you have only md changes, we could then run tests with the "ready for review" button but newcomers will then press this too early. I'm also not sure if anybody but the author can trigger "ready for review". Lots of people don't have CI privileges so "ready for review" will not actually start CI.
But I will hopefully be able to prevent unnecessary runs with some changes to CI tomorrow.
I've got a working prototype of the smarter orchestration, I will set it up in full next week
Screenshot_20231125_201542.png
That looks really great!
CI issues with the failing static_site_gen test have been resolved :)
You can use the "update branch" button to get the fix on your branch.
macos-11 is deprecated by github CI and will soon be removed so I'm going to remove it from all our workflows.
Looks like we have a common issue with CI missing xcrun on the x86-64 macOS machine
Yeah that will be due to the upgrade to macos 12, I'll check it out.
The xcrun issue has been fixed, I'm doing a full test run now to see if any issues come up
Test run succeeded :)
You may hit this issue when running CI on macos #7380:
error: failed to run custom build command for `roc_bitcode_bc v0.0.1 (/Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc)`
note: To improve backtraces for build dependencies, set the CARGO_PROFILE_RELEASE-WITH-LTO_BUILD_OVERRIDE_DEBUG=true environment variable to enable debug information generation.
Caused by:
process didn't exit successfully: `/Users/m1ci/actions-runner2/_work/roc/roc/target/release-with-lto/build/roc_bitcode_bc-8ac377685705d80e/build-script-build` (exit status: 1)
--- stdout
cargo:rerun-if-changed=build.rs
Compiling host ir to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-host.ll
Compiling 64-bit bitcode to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-host.bc
Compiling host ir to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-wasm32.ll
Compiling 64-bit bitcode to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-wasm32.bc
--- stderr
An internal compiler expectation was broken.
This is definitely a compiler bug.
Please file an issue here: <https://github.com/roc-lang/roc/issues/new/choose>
zig build ir-wasm32 -Drelease=true failed with:
I'm investigating it now
I've disconnected the macos apple silicon CI server so I don't have to debug this in my garage
That server is back up, the bug did not want to reproduce for me, I'm just going to add a retry workaround...
It is interesting to see the amount of flaky errors we accumulated in builtins/bitcode/build.rs:
error_str.contains("FileNotFound")
|| error_str.contains("unable to save cached ZIR code")
|| error_str.contains("LLVM failed to emit asm")
|| error_str.contains("ir-wasm32 transitive failure")
Perhaps they share the same parallelism weirdness
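As a side note, the chained contains checks above can be written as a single table lookup. This is only a sketch: the four marker strings are the ones quoted above, while the constant and function names are hypothetical and not taken from build.rs.

```rust
/// Substrings quoted in the chat as known flaky failures of the
/// builtins bitcode build.
const FLAKY_MARKERS: [&str; 4] = [
    "FileNotFound",
    "unable to save cached ZIR code",
    "LLVM failed to emit asm",
    "ir-wasm32 transitive failure",
];

/// Returns true if the error text contains any known flaky marker.
fn is_known_flaky(error_str: &str) -> bool {
    FLAKY_MARKERS
        .iter()
        .any(|marker| error_str.contains(*marker))
}
```

Keeping the markers in one list makes it easy to see how many have accumulated and to add or remove one.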
I've seen a bunch more failures with builtins bitcode on macos, I'm working on a new workaround
Hopefully we can fix the root at some point
Heads up: CI is broken on main for nix apple silicon, I'll check it out
This was due to 723e35f (PR#7424), I'm going to revert it
Fixed in #7435
Is it just me or does it look like sometimes it takes a workflow over 1.5h to just compile roc, and sometimes it is almost instant. Is this some caching issue perhaps that then leads to some flakiness? On the other hand, I also occasionally see the following
error: failed to run custom build command for `roc_bitcode_bc v0.0.1 (/Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc)`
note: To improve backtraces for build dependencies, set the CARGO_PROFILE_RELEASE_BUILD_OVERRIDE_DEBUG=true environment variable to enable debug information generation.
Caused by:
process didn't exit successfully: `/Users/m1ci/actions-runner2/_work/roc/roc/target/release/build/roc_bitcode_bc-01dfdbf045b35be2/build-script-build` (exit status: 1)
--- stdout
cargo:rerun-if-changed=build.rs
Compiling host ir to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-host.ll
Compiling 64-bit bitcode to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-host.bc
Compiling host ir to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-wasm32.ll
Compiling 64-bit bitcode to: /Users/m1ci/actions-runner2/_work/roc/roc/crates/compiler/builtins/bitcode/bc/../zig-out/builtins-wasm32.bc
--- stderr
An internal compiler expectation was broken.
This is definitely a compiler bug.
Please file an issue here: <https://github.com/roc-lang/roc/issues/new/choose>
zig build ir-wasm32 -Drelease=true failed with:
error: Unexpected
Location: crates/compiler/builtins/bitcode/bc/build.rs:115:21
it takes a workflow over 1.5h to just compile roc
Does it get stuck, or does it actually finish in that time?
The bitcode/bc/build.rs issue is caused by multithreading but I don't know much more than that
Is this some caching issue
It's not due to a cache that we use for CI fyi, just seems to be a problematic interaction with the rust and zig build processes
We used to have a bandaid solution for this problem but it stopped working since the llvm 18 zig 13 upgrade
Anton said:
We used to have a bandaid solution for this problem but it stopped working since the llvm 18 zig 13 upgrade
Interesting, and what was the bandaid solution?
Anton said:
it takes a workflow over 1.5h to just compile roc
Does it get stuck, or does it actually finish in that time?
I've seen it do both: finish and not finish in time
Interesting, and what was the bandaid solution?
Retry the failing command up to 10 times
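For readers who have not seen this kind of bandaid, here is a minimal sketch of a bounded retry in Rust. It is not the code from build.rs: `retry_up_to` and `run_once` are hypothetical helpers, and the retry condition is passed in by the caller.

```rust
use std::process::Command;

/// Calls `attempt` until it succeeds, `should_retry` rejects the error,
/// or `max_attempts` is reached. Always runs at least once.
fn retry_up_to<T, E>(
    max_attempts: u32,
    should_retry: impl Fn(&E) -> bool,
    mut attempt: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut n = 1;
    loop {
        match attempt(n) {
            Ok(value) => return Ok(value),
            // Retry only when attempts remain and the error looks flaky.
            Err(err) if n < max_attempts && should_retry(&err) => n += 1,
            Err(err) => return Err(err),
        }
    }
}

/// Hypothetical helper: runs a command once, returning stdout on success
/// and stderr (or the spawn error) on failure.
fn run_once(program: &str, args: &[&str]) -> Result<String, String> {
    let output = Command::new(program)
        .args(args)
        .output()
        .map_err(|e| format!("failed to start {program}: {e}"))?;
    if output.status.success() {
        Ok(String::from_utf8_lossy(&output.stdout).into_owned())
    } else {
        Err(String::from_utf8_lossy(&output.stderr).into_owned())
    }
}
```

A call along the lines of `retry_up_to(10, |e: &String| e.contains("FileNotFound"), |_| run_once("zig", &["build", "ir-wasm32", "-Drelease=true"]))` would retry the zig command from the log above up to 10 times. A retry like this only hides the underlying race, it does not fix it.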
Ahh classic. FWIW I've seen a similar solution used in the past https://github.com/rust-lang/rust/pull/40422/files Interestingly, last I checked this bit of code is still present in the rustc implementation.
Argh, looks like clippy is hanging on the M1 https://github.com/roc-lang/roc/actions/runs/12608939154/job/35144084934?pr=7455
I'm going to look at the flaky CI issues now
We've picked up a new issue with nix-linux-x86-64-tests:
test cli_tests::test_platform_effects_zig::effectful_form has been running for over 60 seconds
Looking at it now
Fixed in PR#7475
I'm going to disconnect the macos x64 CI machine for easier debugging of a build failure.
Fixed :)
If you get the error below, update your branch with latest main:
Run zig version
/Users/m1ci/actions-runner2/_work/_temp/f59e70e6-faef-46ea-a406-0a88e0decf65.sh: line 1: zig: command not found
Error: Process completed with exit code 127.
The new CI workflow is on main :)
.github/workflows/ci_zig.yml is called if there are changes to the src folder, build.zig or build.zig.zon; modify the two lists here if you want to alter that change detection. The old workflows are not called if the changes are only to new compiler files.
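To make the change detection rule concrete, here is a small Rust sketch of the routing as described in this message. It is an illustration only: the real detection lives in the workflow files, and `Plan`, `plan_workflows` and the example paths are hypothetical.

```rust
/// Returns true if `path` belongs to the new compiler, per the rule above:
/// the `src` folder, `build.zig`, or `build.zig.zon`.
fn is_new_compiler_path(path: &str) -> bool {
    path.starts_with("src/") || path == "build.zig" || path == "build.zig.zon"
}

/// Which workflow groups a set of changed files should trigger.
#[derive(Debug, PartialEq)]
struct Plan {
    run_zig_ci: bool,
    run_old_workflows: bool,
}

/// The zig CI runs if any new compiler file changed; the old workflows
/// run if any file outside the new compiler changed.
fn plan_workflows(changed: &[&str]) -> Plan {
    Plan {
        run_zig_ci: changed.iter().any(|p| is_new_compiler_path(p)),
        run_old_workflows: changed.iter().any(|p| !is_new_compiler_path(p)),
    }
}
```

Under this reading, a PR touching only `src/` files runs just the zig CI, while a PR that also touches any other file still triggers the old workflows.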
If you want to add additional CI checks for the new compiler they can be added here.
I did a bunch of testing, but I may still have missed something, so feel free to mention me if you think something's off.
As a note, I think you can just zig build test instead of doing any sort of direct exe running or oa based checks
I'm going to disconnect macos x64 CI for investigation again, it's hitting the same issue as before. If your changes are limited to the new compiler files, CI should still be able to complete.
macos x64 CI is back up but I'm still trying workarounds
Going to do some CI maintenance, this should not affect zig compiler workflows, those all use github CI machines
Done
Looks like a recent github image runner update broke something on our windows-2022 tests, I'm looking at it now
It's only on my PR but it doesn't make any sense given the changes :thinking:
Rust or zig? Also, have a link?
zig
https://github.com/roc-lang/roc/actions/runs/13658753462/job/38188514556
It may be due to random CI machines, some that have the update and some that don't
No, they're the same version :sweat_smile:
I'm just going to open a new PR, this one is haunted
Best I can guess is zig cache issue and corrupted download. Worst case, try adding:
rm -rf .zig-cache zig-out $ZIG_LOCAL_CACHE_DIR
...
Actually, might need to nuke the global cache.
Not sure it would make a difference, but I know that the mlugg zig cache thing sets some cache folders (I assume it does this cause it saves the folders).
Uhu, that could be a likely cause, it passed in the new PR, so I think we're good now but we know where to look if we see it again
CI is now calling old rust workflows when it doesn't need to :(
I'll try to fix it tomorrow.
If you are referring to this PR specifically, there is a zig file from the old compiler that got reformatted: https://github.com/roc-lang/roc/pull/7672
So CI may be working as expected.
It's not just that PR. I've had to force merge a couple today that shouldn't have run the old workflows
I wanted the changes in main so I could merge for my PR and avoid more conflicts, so I didn't wait for Anton to look at it.
more-sexprs ran full ci due to editing .gitignore
improve-zig-comments looks like it may have been a path filters bug/issue...
Oh, found it: https://github.com/roc-lang/roc/pull/7683
it removed the predicate-quantifier
Fix: https://github.com/roc-lang/roc/pull/7685
Aside, we may want to go through and explicitly ignore CI on certain files/folders like .gitignore. Would just take expanding the !file list in that filter.
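A rough Rust sketch of why dropping predicate-quantifier mattered, assuming the filter treats a file as matching when it satisfies at least one rule by default and all rules when set to every. The `Rule` type is a deliberately tiny, hypothetical stand-in for glob patterns, and the paths are examples only.

```rust
/// Hypothetical stand-in for a glob pattern (no real glob syntax).
enum Rule<'a> {
    /// Matches any path (stand-in for `**`).
    Any,
    /// Matches paths that do NOT start with this prefix (stand-in for `!src/**`).
    NotUnder(&'a str),
    /// Matches paths that are NOT exactly this file (stand-in for `!.gitignore`).
    NotFile(&'a str),
}

impl Rule<'_> {
    fn matches(&self, path: &str) -> bool {
        match self {
            Rule::Any => true,
            Rule::NotUnder(prefix) => !path.starts_with(*prefix),
            Rule::NotFile(name) => path != *name,
        }
    }
}

/// How the rules are combined for a single file.
#[derive(Clone, Copy)]
enum Quantifier {
    /// A file matches if at least one rule matches.
    Some,
    /// A file matches only if all rules match.
    Every,
}

/// Returns true if any changed file matches the rule list.
fn filter_triggers(changed: &[&str], rules: &[Rule<'_>], quantifier: Quantifier) -> bool {
    changed.iter().any(|path| match quantifier {
        Quantifier::Some => rules.iter().any(|r| r.matches(path)),
        Quantifier::Every => rules.iter().all(|r| r.matches(path)),
    })
}
```

With the rules `[Any, NotUnder("src/"), NotFile("build.zig")]`, a change to a file under `src/` does not trigger under `Every` but does under `Some`, which matches the symptom of the old workflows running when they didn't need to. Adding `NotFile(".gitignore")` to the list would likewise stop a .gitignore-only change from triggering under `Every`.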
Brendan Hansknecht said:
Yeah, I removed it because I spotted a warning in CI saying that predicate-quantifier was not a valid option, but now I no longer see it :shrug:
Yeah, the warning was a bug that got fixed last week
They added the feature, but forgot to document it, so GitHub didn't know it existed
Oh, that makes sense :)
I may have finally found a workaround for the nix apple silicon workflow failures (old compiler), it ran successfully 3 times in a row :)
Last updated: Jul 05 2025 at 12:14 UTC