Stream: compiler development

Topic: zig 0.11


view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 04:53):

I started looking at zig 0.11 again. I think we have been using the allocator api wrong (at least in tests and probably elsewhere). We have been using alloc and destroy. In reality, alloc is meant to be used with free and destroy is meant to be used with create. The difference between the apis is that alloc/free are for arrays and create/destroy are for single elements.

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 04:55):

The new annoyance is that we need to correctly use alloc and free. Sadly, free requires a slice or array pointer as the input. Both of these contain the size information. So now, for our dealloc function to work, we need to track the size and use it to generate a correct array before freeing.

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 04:58):

So either, we need to change the allocator to allocate extra bytes and store the size of the allocation on the heap, or we need to always pass the size information into roc_dealloc.
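As a rough illustration of the first option (allocating extra bytes and keeping the size on the heap), here is a minimal Rust sketch. It is not the allocator used by Roc or by the Zig builtins; the names `alloc_with_header`, `stored_size`, `dealloc_with_header`, and the fixed 16-byte header and alignment are all assumptions made for the example.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical constants: a 16-byte header keeps the user data 16-byte aligned.
const HEADER: usize = 16;
const ALIGN: usize = 16;

/// Allocates `size` usable bytes and records `size` in a hidden header
/// placed just before the returned pointer. Returns null on failure.
unsafe fn alloc_with_header(size: usize) -> *mut u8 {
    let total = match size.checked_add(HEADER) {
        Some(t) => t,
        None => return std::ptr::null_mut(),
    };
    let layout = match Layout::from_size_align(total, ALIGN) {
        Ok(l) => l,
        Err(_) => return std::ptr::null_mut(),
    };
    let base = alloc(layout);
    if base.is_null() {
        return base;
    }
    // Store the requested size at the very start of the allocation.
    (base as *mut usize).write(size);
    base.add(HEADER)
}

/// Reads the size back out of the header.
unsafe fn stored_size(ptr: *mut u8) -> usize {
    (ptr.sub(HEADER) as *const usize).read()
}

/// Frees the allocation without the caller passing a size:
/// the size is recovered from the header instead.
unsafe fn dealloc_with_header(ptr: *mut u8) {
    let base = ptr.sub(HEADER);
    let size = (base as *const usize).read();
    let layout = Layout::from_size_align_unchecked(size + HEADER, ALIGN);
    dealloc(base, layout);
}

fn main() {
    unsafe {
        let p = alloc_with_header(100);
        assert!(!p.is_null());
        assert_eq!(stored_size(p), 100);
        dealloc_with_header(p);
    }
}
```

The cost of this approach is the extra header bytes on every allocation plus one extra memory read on free, which is the trade-off weighed in the rest of the thread.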

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 05:00):

Not necessarily surprising, just inconvenient. To be fair, I think this is the same issue with the rust allocator api. Which is why we tend to just call malloc and free directly in rust platforms.

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 05:02):

given our only dynamic data structures are built on lists and strings (which know their exact size), I think it may be worth revisiting if we should pass the size into roc_dealloc. We already have to pass the size into roc_realloc

view this post on Zulip Brian Carroll (Sep 14 2023 at 06:19):

Yeah I've often thought we should pass the size to roc_dealloc. The current signature feels like it's modelled specifically on libc free. It forces the allocator to remember the size. But it's easy for us to generate code to load the size.
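A minimal sketch of what "pass the size to dealloc" could look like, written in Rust. The function name `dealloc_sized`, the `ByteList` type, and the fixed alignment of 8 are hypothetical and only for illustration; this is not the actual `roc_dealloc` signature.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Hypothetical alignment used for every allocation in this sketch.
const ALIGNMENT: usize = 8;

/// Hypothetical sized dealloc hook: the caller supplies size and alignment,
/// so the allocator does not have to remember them.
unsafe fn dealloc_sized(ptr: *mut u8, size: usize, alignment: usize) {
    let layout = Layout::from_size_align(size, alignment).expect("invalid layout");
    dealloc(ptr, layout);
}

/// A list-like value that already knows its own capacity,
/// so it can hand the size back when it is freed.
struct ByteList {
    ptr: *mut u8,
    len: usize,
    capacity: usize,
}

impl ByteList {
    fn with_capacity(capacity: usize) -> Self {
        assert!(capacity > 0, "zero-sized allocations are not handled here");
        let layout = Layout::from_size_align(capacity, ALIGNMENT).expect("invalid layout");
        let ptr = unsafe { alloc(layout) };
        assert!(!ptr.is_null(), "allocation failed");
        ByteList { ptr, len: 0, capacity }
    }

    fn push(&mut self, byte: u8) {
        assert!(self.len < self.capacity, "this sketch does not grow");
        unsafe { self.ptr.add(self.len).write(byte) };
        self.len += 1;
    }
}

impl Drop for ByteList {
    fn drop(&mut self) {
        // The size comes from the value itself, not from the allocator.
        unsafe { dealloc_sized(self.ptr, self.capacity, ALIGNMENT) };
    }
}

fn main() {
    let mut list = ByteList::with_capacity(32);
    list.push(7);
    assert_eq!(list.len, 1);
}
```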

view this post on Zulip Richard Feldman (Sep 14 2023 at 10:31):

I think there was a reason we couldn't always have access to the size when calling free - and I think it was seamless slices?

view this post on Zulip Richard Feldman (Sep 14 2023 at 10:32):

but I also think we wanted to start storing capacity on the heap, which would address that

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 16:56):

So update the list/str representation to store the capacity on the heap when going from 1 to 2 RC. Then when freeing (RC 1 to 0), if it is a proper list, just use the capacity stored on the stack. If it is a seamless slice, use the capacity on the heap.

Then update roc_dealloc to also take a size?

view this post on Zulip Richard Feldman (Sep 14 2023 at 17:52):

I think we might as well always store capacity on the heap, because we have to leave room for it either way

view this post on Zulip Richard Feldman (Sep 14 2023 at 17:52):

because it's at the start of the allocation rather than the end

view this post on Zulip Richard Feldman (Sep 14 2023 at 17:53):

I forget if we talked about storing it at the end, but that's actually interesting if we only did it when RC > 1 because it could mean smaller heap allocation sizes if your thing is never shared :thinking:

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 21:05):

But then you would have to reallocate on sharing if you don't have capacity?

view this post on Zulip Richard Feldman (Sep 14 2023 at 21:10):

yep

view this post on Zulip Richard Feldman (Sep 14 2023 at 21:12):

oh wait, if it's a slice it wouldn't know where to find that capacity bc it wouldn't know the original length or the original capacity ahead of time

view this post on Zulip Richard Feldman (Sep 14 2023 at 21:13):

so I guess it would have to be next to the refcount
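To make the "next to the refcount" idea concrete, here is a toy Rust model. It is only a sketch under assumptions: the two-word header order `[capacity, refcount]`, the `Allocation` and `SliceView` types, and the index-based addressing are invented for the example and do not reflect Roc's real list or string layout.

```rust
/// Words reserved in front of the elements: [capacity, refcount].
const HEADER_WORDS: usize = 2;

/// A toy heap allocation: header words followed by element words.
struct Allocation {
    words: Vec<usize>,
}

impl Allocation {
    fn new(capacity: usize) -> Self {
        let mut words = vec![0; HEADER_WORDS + capacity];
        words[0] = capacity; // capacity sits right before the refcount
        words[1] = 1; // refcount
        Allocation { words }
    }
}

/// A slice view. It knows where the refcount lives (it must, to decrement it)
/// and its own window, but not the original length or capacity.
struct SliceView {
    refcount_index: usize,
    start: usize,
    len: usize,
}

/// Because the capacity is at a fixed offset from the refcount,
/// a slice can find it without knowing the original length.
fn capacity_for(alloc: &Allocation, view: &SliceView) -> usize {
    alloc.words[view.refcount_index - 1]
}

/// Total bytes that would be handed to a sized dealloc.
fn bytes_to_free(alloc: &Allocation, view: &SliceView) -> usize {
    (HEADER_WORDS + capacity_for(alloc, view)) * std::mem::size_of::<usize>()
}

/// The elements visible through the slice.
fn elements<'a>(alloc: &'a Allocation, view: &SliceView) -> &'a [usize] {
    let begin = HEADER_WORDS + view.start;
    &alloc.words[begin..begin + view.len]
}

fn main() {
    let alloc = Allocation::new(8);
    let view = SliceView { refcount_index: 1, start: 3, len: 2 };
    assert_eq!(capacity_for(&alloc, &view), 8);
    assert_eq!(elements(&alloc, &view).len(), 2);
    println!("bytes to free: {}", bytes_to_free(&alloc, &view));
}
```

A capacity stored after the last element would need the original capacity to locate it, which is exactly the information a slice lacks.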

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 21:36):

Good catch

view this post on Zulip Brendan Hansknecht (Sep 14 2023 at 21:37):

Richard Feldman said:

I think we might as well always store capacity on the heap, because we have to leave room for it either way

Oh also, the reason I didn't say to always store it is to avoid wasting trips to memory. Only write it when there is a chance it will be needed

view this post on Zulip Brendan Hansknecht (Sep 21 2023 at 23:04):

So I changed the allocator stuff for the testing allocator in the standard library instead of trying to update the slice representation. Want to take a serious look at switching to zig 0.11 and llvm 16 with the easy path. Avoid us falling farther and farther behind on versioning. So for now, fine changing the allocator impl or switching to malloc and free if necessary.

view this post on Zulip Brendan Hansknecht (Sep 21 2023 at 23:05):

A lot of gen tests that call zig functions that format numbers to strings are stack overflowing. Really unsure as to why. Did some testing directly in zig and the equivalent tests pass just fine. Not really sure what I am missing at the moment.

view this post on Zulip Luke Boswell (Sep 21 2023 at 23:11):

Interesting. Could it be something in the test setup on the rust side? Folkert mentioned that he is suspicious of that part of our code which might be causing issues with windows. Something about rust dropping things that were allocated in zig.

view this post on Zulip Luke Boswell (Sep 21 2023 at 23:17):

Folkert de Vries said:

we know the cause of these issues right?

view this post on Zulip Brendan Hansknecht (Sep 21 2023 at 23:22):

I don't think this is the same. This is apparently only on m1 Mac, not on x86. And it is a stack overflow that looks to be happening in the bufPrint function in zig.

view this post on Zulip Brendan Hansknecht (Sep 21 2023 at 23:27):

Also, apparently on m1 macs, dec is now [2 x i64] %arg, but on x86, it is i64 %hi, i64 %lo...that's just kinda inconvenient.
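For readers unfamiliar with the two shapes, the same 128-bit value can be viewed either as a pair of separate 64-bit arguments or as a two-element array. The Rust sketch below only illustrates that arithmetic; the helper names are hypothetical, and the element order chosen in `as_pair` (low half first) is an assumption, not something stated in the thread.

```rust
/// Splits a 128-bit value into (hi, lo) 64-bit halves.
fn split_i128(value: i128) -> (i64, i64) {
    let hi = (value >> 64) as i64; // arithmetic shift keeps the sign in `hi`
    let lo = value as i64; // truncates to the low 64 bits
    (hi, lo)
}

/// Rebuilds the 128-bit value from its halves.
fn join_i128(hi: i64, lo: i64) -> i128 {
    // `lo` goes through u64 first so it is zero-extended, not sign-extended.
    ((hi as i128) << 64) | (lo as u64 as i128)
}

/// The same halves packed as a two-element array.
/// Assumption for this sketch: index 0 holds the low half.
fn as_pair(value: i128) -> [i64; 2] {
    let (hi, lo) = split_i128(value);
    [lo, hi]
}

fn main() {
    let value: i128 = -123_456_789_012_345_678_901_234_567_890;
    let (hi, lo) = split_i128(value);
    assert_eq!(join_i128(hi, lo), value);
    let pair = as_pair(value);
    assert_eq!(join_i128(pair[1], pair[0]), value);
}
```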

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 00:11):

@Ayaz Hafiz related to your type erasure stuff, so maybe you have an idea. Do you know how we can get the type of the refcounter here. It is a pointer to a function, but we need to know the type of the underlying function, which is no longer stored with pointers in llvm anymore.

old code

Current wrong testing code. I had the test just using the fn_val that was passed in, but it doesn't have the right function signature based on the generated llvm ir and is failing some capture tests.

view this post on Zulip Ayaz Hafiz (Sep 22 2023 at 01:43):

i’m not 100% sure i follow. what is the difference between the two codes you shared you’re trying to reconcile? is one compiling with llvm opaque pointers?

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 02:29):

The old code could just use build_call which under the hood would understand the function pointer type and make a indirect call. The new version has to make the indirect call via build_indirect_call. That requires us to specify the function type since it is no longer possible to just grab it from the pointer. So yes, the second one is with opaque pointers and I think I am feeding the wrong function type to it, but I am not sure where I could get the correct one from.
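The underlying constraint can be shown with a plain Rust analogy, without any LLVM API: once a function pointer is stored as an untyped address, the call site has to supply the signature itself. This is only an analogy; `OpaqueFn`, `inc_refcount`, and `call_as_i64_to_i64` are hypothetical names, and nothing here is the compiler's actual code.

```rust
/// An "opaque" function pointer: just an address, with no signature attached.
#[derive(Clone, Copy)]
struct OpaqueFn(*const ());

/// A stand-in for some generated helper function.
fn inc_refcount(x: i64) -> i64 {
    x + 1
}

/// The caller must state the function type explicitly.
/// Supplying a type that does not match the real function is undefined behavior,
/// which is the same class of mistake as passing the wrong function type
/// to an indirect call.
unsafe fn call_as_i64_to_i64(f: OpaqueFn, arg: i64) -> i64 {
    let typed: fn(i64) -> i64 = std::mem::transmute(f.0);
    typed(arg)
}

fn main() {
    // Erase the type, keeping only the address.
    let opaque = OpaqueFn(inc_refcount as fn(i64) -> i64 as *const ());
    let result = unsafe { call_as_i64_to_i64(opaque, 41) };
    assert_eq!(result, 42);
}
```

In practice this means the function type has to be tracked alongside the pointer at the point where the pointer is created, since it can no longer be recovered later.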

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 05:14):

nvm, figured it out. tests are passing now.

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 05:35):

Also the tests that are stack overflowing are only doing so on my m1 mac. Somehow it is caused by the memcpy implementation or by how it is being called.

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 05:36):

maybe I updated it to zig 0.11 wrong and it infinitely loops somehow or maybe it is exposed incorrectly.

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 05:49):

Also, fun failing test case: tan 1f64 on m1 mac: 1.5574077246549023. Same thing on x86 linux: 1.5574077246549020. Off by the lowest bit.

The answers rounded opposite directions and thus are now failing on one device or the other. Interestingly, this fails starting with the newer version of zig/llvm, but was passing in the past.
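One common way to make such a test robust is to compare floats by their distance in units in the last place (ULPs) rather than by exact equality. The Rust sketch below is an illustration only, not the fix used on the branch; `ulp_distance` and `approx_eq_ulps` are hypothetical helpers and they deliberately handle only finite, positive values.

```rust
/// Distance in ULPs between two finite, positive f64 values.
/// For positive floats, the bit patterns are ordered the same way as the values.
fn ulp_distance(a: f64, b: f64) -> u64 {
    assert!(a.is_finite() && b.is_finite() && a > 0.0 && b > 0.0);
    a.to_bits().abs_diff(b.to_bits())
}

/// True when the two values are within `max_ulps` representable steps of each other.
fn approx_eq_ulps(a: f64, b: f64, max_ulps: u64) -> bool {
    ulp_distance(a, b) <= max_ulps
}

fn main() {
    // The two results quoted in the message above.
    let m1_mac = 1.5574077246549023_f64;
    let x86_linux = 1.5574077246549020_f64;

    // They are different doubles, so an exact comparison fails on one platform.
    assert!(m1_mac != x86_linux);
    // A small ULP tolerance accepts both.
    assert!(approx_eq_ulps(m1_mac, x86_linux, 2));
    println!("ULP distance: {}", ulp_distance(m1_mac, x86_linux));
}
```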

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 06:05):

Figured out the memcpy issue as well. The musl fallback memcpy was being optimized such that it was calling memcpy recursively. Definitely should long term look into disabling the promote to memcpy optimization here. But was able to work around it overall.
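The workaround used on the branch is not shown in the thread. As one possible illustration of the problem, a hand-written copy loop can be turned back into a memcpy call by the optimizer; writing the loop with volatile accesses is one way to discourage that, since optimizers generally leave volatile loads and stores alone. The Rust sketch below is an assumption-laden example with a hypothetical function name, not the musl code or the actual fix.

```rust
use std::ptr;

/// Byte-by-byte copy using volatile reads and writes.
/// The volatile accesses are intended to stop the compiler from recognizing
/// the loop as a memcpy idiom and replacing it with a call to memcpy
/// (which would recurse if this function were itself the memcpy implementation).
///
/// Safety: `src` and `dest` must each be valid for `len` bytes and must not overlap.
unsafe fn copy_bytes_volatile(dest: *mut u8, src: *const u8, len: usize) {
    let mut i = 0;
    while i < len {
        let byte = ptr::read_volatile(src.add(i));
        ptr::write_volatile(dest.add(i), byte);
        i += 1;
    }
}

fn main() {
    let src = [1u8, 2, 3, 4, 5];
    let mut dest = [0u8; 5];
    unsafe { copy_bytes_volatile(dest.as_mut_ptr(), src.as_ptr(), src.len()) };
    assert_eq!(dest, src);
}
```

A volatile loop gives up vectorization, so disabling the memcpy promotion for that one function, as suggested above, is likely the better long-term option.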

view this post on Zulip Anton (Sep 22 2023 at 09:44):

Awesome work Brendan!

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 15:57):

This branch is actually turning out better than I expected.

On x86 linux
cargo test on has only one failure. It is related to the dev backend and is probably due to incorrect cabi to a builtin (looks to be with returning a Result of a Dec).
cargo test is passing

On m1 mac,
3 cli run tests related to parsing are failing (not sure why yet, still a memcpy issue) And
Just 1 gen test due to float rounding.

wasm... is really broken right now (both dev and llvm, probably cabi and wasm-ld changes).

But given essentially all cli run tests are working, I am pretty positive that we should be able to fix up these bugs and actually finally merge an update to zig and llvm. :tada:

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:29):

Oh wow, excellent stuff!

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:29):

If I remember correctly, lots of things changed on the wasm side in Zig.

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:29):

I think they fixed a load of CABI problems that we had workarounds for

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:30):

Those were originally meant to be 0.10 but didn't make it... I think

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:30):

anyway I had a branch for this at some point and I got the Wasm stuff working

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:31):

I'll dig around and see if I can find something useful!

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:33):

This is the PR I was thinking of. https://github.com/roc-lang/roc/pull/3856

view this post on Zulip Brian Carroll (Sep 22 2023 at 16:34):

There are lots of comments and I haven't gone through them to remember what's relevant or not

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 16:36):

I'll start looking at the commits and see if I can pull in the wasm related changes.

view this post on Zulip Brendan Hansknecht (Sep 22 2023 at 17:29):

Cherrypicking worked for the most part, dev wasm is mostly happy. It has just a couple of failures left:

failures:
    gen_list::list_range_length_overflow
    gen_list::release_excess_capacity
    gen_list::release_excess_capacity_empty
    gen_list::release_excess_capacity_with_len
    gen_refcount::non_nullable_unwrapped_alignment_8
    gen_refcount::union_recursive_inc
    wasm_linking::test_linking_with_dce
    wasm_linking::test_linking_without_dce

After those, I think all that is left should be llvm wasm updates. It has the wrong abi now in many cases.

view this post on Zulip Brian Carroll (Sep 22 2023 at 20:23):

Oh cool! I'm glad those commits finally got used, a year later, and worked!

view this post on Zulip Brian Carroll (Sep 22 2023 at 20:32):

I don't recognise those particular gen_list tests, they might be "new" since then.
The refcount stuff involves that module in mono, code_gen_help or whatever I called it, and at the very lowest level it does call Zig builtins to do the dec or inc operation.
The wasm_linking tests, as you would guess, are for the Wasm surgical linking functionality. I guess that must be some change in the ABI where we link to builtins.

view this post on Zulip Richard Feldman (Sep 22 2023 at 23:20):

yooooooo this is so great!!!

view this post on Zulip Luke Boswell (Sep 23 2023 at 00:57):

If you have any issues with windows I am happy to assist where I can. I know Folkert and I discovered some differences between register use between zig versions which shouldn't be too hard to fix.

view this post on Zulip Luke Boswell (Sep 23 2023 at 00:58):

We left a note where to change it on zig version update.

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 01:26):

Thanks. will definitely need help there. Though I think I have a computer that can boot windows, I don't know if it works and it definitely does not have roc setup.

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 01:26):

Currently getting everything else working first.

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 01:34):

Just got the llvm wasm tests passing. Hopefully that covers the full abi.

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 01:54):

So it looks like we are down to:

view this post on Zulip Luke Boswell (Sep 23 2023 at 02:55):

Which branch are you working on?

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 03:08):

I think it is zig-11-llvm-16. Not at home to double check the name rn

view this post on Zulip Luke Boswell (Sep 23 2023 at 04:32):

I had forgotten how much wrangling it was to get LLVM set up correctly... :sweat_smile:

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 04:44):

Also, I think I am getting to a point where the last few tests will be much harder for me to fix cause I don't know enough about them. I will definitely try to tinker with them, but I may need others to take over for some subsets of tests. The gen dev failures seem to be related to lambda sets, which I am not really familiar with. The wasm failures that are left are so far not obvious to me.

view this post on Zulip Luke Boswell (Sep 23 2023 at 05:03):

@Anton I'm updating roc-lang/llvm-package-windows to build 16.0.4 for Windows

view this post on Zulip Anton (Sep 23 2023 at 13:59):

I've built 16.0.6 for windows because that's also the version nix is using
https://github.com/roc-lang/llvm-package-windows/releases/tag/v16.0.6

view this post on Zulip Luke Boswell (Sep 23 2023 at 21:51):

Thank you Anton, I'll test using that.

view this post on Zulip Luke Boswell (Sep 23 2023 at 22:24):

Any idea what might be causing this issue?

error: failed to run custom build command for `roc_bitcode v0.0.1 (C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode)`

Caused by:
  process didn't exit successfully: `C:\Users\bosyl\Documents\GitHub\roc\target\release\build\roc_bitcode-1e59ee36b356bf6b\build-script-build` (exit code: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  Compiling zig object `object` to: C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-out\builtins-host.obj

  --- stderr
  An internal compiler expectation was broken.
  This is definitely a compiler bug.
  Please file an issue here: https://github.com/roc-lang/roc/issues/new/choose
  thread 'main' panicked at 'zig build object -Drelease=true failed with:

    error(mingw): clang exited with code 1 and stderr: error: unknown target triple 'x86-unknown-windows-gnu', please use -triple or -arch

  error(mingw): clang exited with code 1 and stderr: error: unknown target triple 'x86-unknown-windows-gnu', please use -triple or -arch

  error(mingw): clang exited with code 1 and stderr: error: unknown target triple 'x86-unknown-windows-gnu', please use -triple or -arch

  error: unable to generate DLL import .lib file for advapi32: ClangPreprocessorFailed

  ', crates\compiler\builtins\bitcode\build.rs:188:21
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

view this post on Zulip Folkert de Vries (Sep 23 2023 at 22:30):

See if that target is still in zig targets

view this post on Zulip Luke Boswell (Sep 23 2023 at 22:49):

I'm pretty sure it isn't, here is the output from zig targets on my windows machine using zig version 0.11.0.

view this post on Zulip Luke Boswell (Sep 23 2023 at 22:50):

But I don't see any triples in the output... so I'm not sure

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 22:51):

x86_64-windows-gnu

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 22:51):

From the output

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 22:51):

But this seems like an issue with forwarding that info to clang rather than directly from zig

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 22:52):

Though maybe we are feeding in the unknown and need to remove it

view this post on Zulip Luke Boswell (Sep 23 2023 at 22:56):

I can definitely see arch contains x86_64, os contains windows, and abi contains gnu, also libc contains x86-windows-gnu so it should be all ok

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 22:57):

Yeah, but that is not the zig side

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 22:57):

It is failing when calling mingw clang

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 22:58):

So we probably need to see what that expects and that it is the right version

view this post on Zulip Luke Boswell (Sep 23 2023 at 22:59):

Here is all of the error it gives. I think I left some important details out before

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 23:08):

I still think it is an issue on the clang side or on the zig passing info to clang. Cause I think this error is that the builtins are failing to compile. Though not really sure why it needs clang at all

view this post on Zulip Luke Boswell (Sep 23 2023 at 23:08):

It should be using the 16.0.6 version @Anton built and I've added an environment variable LLVM_SYS_160_PREFIX pointing to it right?

view this post on Zulip Luke Boswell (Sep 23 2023 at 23:09):

Does zig use clang? maybe it has a different version compared to llvm we are using?

view this post on Zulip Luke Boswell (Sep 23 2023 at 23:11):

In crates\compiler\builtins\bitcode\build.zig this might be related?

// TODO zig 0.9 can generate .bc directly, switch to that when it is released!
fn generateLlvmIrFile( ...

view this post on Zulip Brendan Hansknecht (Sep 23 2023 at 23:20):

I updated that file but didn't pay attention to comments, so that may be stale now

view this post on Zulip Luke Boswell (Sep 23 2023 at 23:25):

I'm not very familiar with zig build but it has a .use_llvm = true option which makes me think it is using llvm -- which is clang right?

view this post on Zulip Luke Boswell (Sep 23 2023 at 23:43):

Pretty sure this is an issue locally with my zig setup

view this post on Zulip Luke Boswell (Sep 23 2023 at 23:48):

I can also reproduce the error with

PS C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode> zig build
error(mingw): clang exited with code 1 and stderr: error: unknown target triple 'x86-unknown-windows-gnu', please use -triple or -arch

Semantic Analysis [1742] error(mingw): clang exited with code 1 and stderr: error: unknown target triple 'x86-unknown-windows-gnu', please use -triple or -arch

Semantic Analysis [5018] error(mingw): clang exited with code 1 and stderr: error: unknown target triple 'x86-unknown-windows-gnu', please use -triple or -arch

error: unable to generate DLL import .lib file for advapi32: ClangPreprocessorFailed

view this post on Zulip Luke Boswell (Sep 24 2023 at 00:11):

Figured out a workaround. I had been downloading zig binaries from the zig downloads page, which was giving me the above error. I installed using the scoop package manager and it is now building zig ok.

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:12):

Interesting

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:12):

So use llvm is because they use the llvm backend instead of the new zig backends

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:12):

So not related to clang, but related to llvm the same way that roc uses llvm

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:13):

We need it to generate bc and llvm ir files.

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:13):

Or at least I assume we need it

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:13):

Haven't tested otherwise

view this post on Zulip Luke Boswell (Sep 24 2023 at 00:14):

Getting a different error now

error: failed to run custom build command for `roc_bitcode v0.0.1 (C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode)`

Caused by:
  process didn't exit successfully: `C:\Users\bosyl\Documents\GitHub\roc\target\release\build\roc_bitcode-1e59ee36b356bf6b\build-script-build` (exit code: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  Compiling zig object `object` to: C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-out\builtins-host.obj
  Moving zig object `object` to: C:\Users\bosyl\Documents\GitHub\roc\target\release\build\roc_bitcode-be7313a61ad78acf\out\builtins-host.obj

  --- stderr
  An internal compiler expectation was broken.
  This is definitely a compiler bug.
  Please file an issue here: https://github.com/roc-lang/roc/issues/new/choose
  thread 'main' panicked at 'Failed to copy object file C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-out\builtins-host.obj to C:\Users\bosyl\Documents\GitHub\roc\target\release\build\roc_bitcode-be7313a61ad78acf\out\builtins-host.obj: Os { code: 2, kind: NotFound, message: "The system cannot find the file specified." }', crates\compiler\builtins\bitcode\build.rs:85:9
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
error: failed to run custom build command for `roc_bitcode v0.0.1 (C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode)`

Caused by:
  process didn't exit successfully: `C:\Users\bosyl\Documents\GitHub\roc\target\release\build\roc_bitcode-1e59ee36b356bf6b\build-script-build` (exit code: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  Compiling zig object `object` to: C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-out\builtins-host.obj

  --- stderr
  An internal compiler expectation was broken.
  This is definitely a compiler bug.
  Please file an issue here: https://github.com/roc-lang/roc/issues/new/choose
  thread 'main' panicked at 'zig build object -Drelease=true failed with:

    install generated to builtins-host.o: error: unable to update file from 'C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-cache\o\bd7e129e78e30bd9f9a6e12129f6341d\builtins-host.obj' to 'C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-out\builtins-host.o': AccessDenied
  Build Summary: 1/3 steps succeeded; 1 failed (disable with --summary none)
  object transitive failure
  +- install generated to builtins-host.o failure
  error: the following build command failed with exit code 1:
  C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-cache\o\574ac17d089f96475a6e004e49db08e1\build.exe C:\Users\bosyl\scoop\apps\zig\current\zig.exe C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode C:\Users\bosyl\Documents\GitHub\roc\crates\compiler\builtins\bitcode\zig-cache C:\Users\bosyl\AppData\Local\zig object -Drelease=true

  ', crates\compiler\builtins\bitcode\build.rs:188:21
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:15):

Oh, do we need to change .o to .obj? I thought I already had that.

view this post on Zulip Luke Boswell (Sep 24 2023 at 00:16):

Is this issue related?

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:18):

Oh, actually yeah, it is failing to write the file to the output directory

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 00:18):

Though also, it is trying to name it wrong

view this post on Zulip Luke Boswell (Sep 24 2023 at 00:38):

As a workaround I changed from builtins-host.obj to builtins-host.o in crates\compiler\builtins\bitcode\build.rs and was able to finish building roc using cargo build --release --locked with no further issues. :tada:

view this post on Zulip Luke Boswell (Sep 24 2023 at 00:39):

I'm guessing we need zig to write the host as .obj when compiling on windows. There is the below, which I think is related, but not sure why it isn't working

var suffix =
        if (target.os_tag == .windows)
        "obj"
    else
        "o";

view this post on Zulip Luke Boswell (Sep 24 2023 at 00:47):

Adding the debug print below...

// Targets
const host_target = b.standardTargetOptions(.{
    .default_target = CrossTarget{
        .cpu_model = .baseline,
    },
});

std.log.debug("{?}", .{host_target});

Gives debug: zig.CrossTarget{ .. other stuff, .os_tag = null, ...}. However we use target.os_tag in generateObjectFile to determine the suffix.

view this post on Zulip Luke Boswell (Sep 24 2023 at 00:49):

From the docs, os_tag: ?Target.Os.Tag = null, null means native.
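The pitfall described here (an optional OS tag where "no value" means "native") can be sketched in Rust as follows. The fix that was actually pushed is not shown in the thread; the `Os` enum, `native_os`, and `object_suffix` below are hypothetical and only illustrate resolving the optional tag before branching on it.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Os {
    Windows,
    Linux,
    MacOs,
}

/// The OS this program was compiled for.
/// Simplification for the sketch: anything that is not Windows or macOS counts as Linux.
fn native_os() -> Os {
    if cfg!(target_os = "windows") {
        Os::Windows
    } else if cfg!(target_os = "macos") {
        Os::MacOs
    } else {
        Os::Linux
    }
}

/// Picks the object-file suffix. `None` means "native", so it must be
/// resolved to a concrete OS first; comparing `None` against Windows
/// directly would always pick "o", even on a Windows host.
fn object_suffix(os_tag: Option<Os>, native: Os) -> &'static str {
    match os_tag.unwrap_or(native) {
        Os::Windows => "obj",
        Os::Linux | Os::MacOs => "o",
    }
}

fn main() {
    println!("native suffix: {}", object_suffix(None, native_os()));
    assert_eq!(object_suffix(None, Os::Windows), "obj");
    assert_eq!(object_suffix(Some(Os::Linux), Os::Windows), "o");
}
```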

view this post on Zulip Luke Boswell (Sep 24 2023 at 01:16):

Pushed a fix for this, now builds without any issues. Running the other tests now.

view this post on Zulip Luke Boswell (Sep 24 2023 at 01:19):

All the llvm backend tests look like they are failing with the same issue error: unrecognized parameter: '--strip'

--- STDERR:              test_gen::test_gen gen_compare::list_neq_nested ---
error: unrecognized parameter: '--strip'
thread 'gen_compare::list_neq_nested' panicked at '
___________
Linking command failed with status ExitStatus(ExitStatus(1)):

  Child { stdin: None, stdout: None, stderr: None, .. }
___________
', crates\compiler\build\src\link.rs:1305:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

view this post on Zulip Luke Boswell (Sep 24 2023 at 01:25):

All the Zig tests are still passing on Windows

view this post on Zulip Luke Boswell (Sep 24 2023 at 01:36):

Fixed the linking issue for Windows by removing the "--strip" option.

view this post on Zulip Luke Boswell (Sep 24 2023 at 02:08):

I updated the github workflows for Windows to reference the correct dependencies.

view this post on Zulip Luke Boswell (Sep 24 2023 at 02:10):

Almost all the CI tests for windows pass; however there are a few cli_run tests that are now failing with the same issue I don't know the address of the libc.memcpy_decision function

---- cli_run::hello_gui stdout ----
thread 'cli_run::hello_gui' panicked at '
___________
The roc command:

  "C:\\Users\\bosyl\\Documents\\GitHub\\roc\\target\\debug\\roc.exe build --optimize C:\\Users\\bosyl\\Documents\\GitHub\\roc\\examples\\gui\\helloBROKEN.roc --"

had unexpected stderr:

  🔨 Rebuilding platform...
I don't know the address of the libc.memcpy_decision function! this may cause segfaults
I don't know the address of the libc.memcpy_decision function! this may cause segfaults

___________
', crates\cli\tests\cli_run.rs:148:13

view this post on Zulip Luke Boswell (Sep 24 2023 at 04:01):

I found this 0.10.0/release-notes which may be related and explain why we are seeing this issue now.

Started consolidating libc functions that codegen depends on into compiler-rt (#7265)
Compiler backends want to emit calls to e.g. memcpy or sqrt. These are typically provided by libc, but they might not be in the case of freestanding, or depending on which third party libc is linked against.
Therefore Zig has started to provide all possible symbols with weak linkage, meaning that these symbols will be overridden by libc if provided. This means the only necessary runtime library that Zig objects need to be linked against is compiler-rt, no matter the target.

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 04:26):

Oh, that's really cool. Maybe that means for some libc functions we can just depend on zig and not provide our own if we want to cut out libc.

view this post on Zulip Brendan Hansknecht (Sep 24 2023 at 04:27):

Though that depends on if they do highly optimized libc impls and if we want those

view this post on Zulip Luke Boswell (Sep 28 2023 at 01:40):

@Folkert de Vries I was working on Windows fixes for this branch with @Brendan Hansknecht and we made some good progress fixing some LLVM backend issues in this commit. We were able to find a fix for the dec issues but we weren't sure of the best way to generalise it. Brendan can probably explain this much better than I. Also our assumption that .reloc is the final section for relocations doesn't hold, so that is another issue. I am happy to work on this with you when you are back online and have some free time. :smile:

view this post on Zulip Brendan Hansknecht (Sep 28 2023 at 20:41):

This PR feels so close. I think we are down to just dev wasm bugs.

view this post on Zulip Agus Zubiaga (Sep 29 2023 at 00:34):

I updated to macOS Sonoma and I can't build roc anymore because zig 0.9.1 doesn't seem to support it. I naively started to update bitcode/build.zig to work with zig 0.11, but it looks like it wasn't that easy and you folks are already on it :smile:

view this post on Zulip Agus Zubiaga (Sep 29 2023 at 00:35):

I should pay more attention to Zulip

view this post on Zulip Brendan Hansknecht (Sep 29 2023 at 05:25):

Figured out a little bit with the wasm tests. Somehow, calling roc_alloc is corrupting memory. It ends up messing up bytes in random places (for example changing bytes in a constant string). That then propagates to the output and thus we have broken tests.

view this post on Zulip Brendan Hansknecht (Sep 29 2023 at 05:36):

Maybe we are linking wrong and it is leading to malloc scratch memory overlapping with some roc constants? @Brian Carroll any thoughts?

view this post on Zulip Brendan Hansknecht (Sep 29 2023 at 05:39):

I guess this also points to: first fix any linking related bugs, then turn back to look at other tests. Cause fixing the linker may end up fixing everything else.

view this post on Zulip Brian Carroll (Sep 29 2023 at 06:42):

Yeah I agree. Everything else is way more confusing when there are linking bugs so it's worth fixing them first. Dev Wasm only uses its own linking. I wonder if there's some new kind of relocation used with roc_alloc that we don't support yet...

view this post on Zulip Brian Carroll (Sep 29 2023 at 06:47):

There is just one memory block, all with the same permissions. So yeah it's possible in theory for the stack to overflow into constants. There is no segmentation fault if that happens.

Our memory organisation is:
Constants are at low addresses.
Then the stack, which grows downward.
Then the heap, which grows upwards when we extend the size of our total memory.
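The layout described above can be modelled in a few lines of Rust. This is a sketch with made-up names and addresses (`MemoryLayout`, `constants_end`, `stack_top`), not the Wasm backend's real data structures; it only shows why a stack that grows downward can silently run into the constants.

```rust
/// A toy model of the single linear memory block.
/// Addresses increase from constants, to stack, to heap.
struct MemoryLayout {
    constants_end: u32, // first address past the constants
    stack_top: u32,     // the stack starts here and grows downward
    memory_size: u32,   // total size; the heap lives in [stack_top, memory_size)
}

impl MemoryLayout {
    /// The heap begins where the stack region ends.
    fn heap_base(&self) -> u32 {
        self.stack_top
    }

    /// True if the stack pointer has dropped into the constants.
    /// Nothing in the memory model traps on this; it has to be checked explicitly.
    fn stack_overflowed(&self, stack_pointer: u32) -> bool {
        stack_pointer < self.constants_end
    }

    /// Bytes the stack can still grow before it reaches the constants.
    fn stack_bytes_free(&self, stack_pointer: u32) -> u32 {
        stack_pointer.saturating_sub(self.constants_end)
    }

    /// The heap grows upward by extending the total memory size.
    fn grow_heap(&mut self, extra_bytes: u32) {
        self.memory_size += extra_bytes;
    }
}

fn main() {
    let mut layout = MemoryLayout {
        constants_end: 1024,
        stack_top: 1024 + 65536,
        memory_size: 131072,
    };
    assert!(!layout.stack_overflowed(2000));
    assert!(layout.stack_overflowed(1000)); // writes here would clobber constants
    layout.grow_heap(65536);
    println!("heap is [{}, {})", layout.heap_base(), layout.memory_size);
}
```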

view this post on Zulip Brian Carroll (Sep 29 2023 at 11:49):

There's a source file called linking.rs and I think I put in comments with URLs of relevant docs.

view this post on Zulip Brian Carroll (Sep 29 2023 at 11:51):

Also copied and pasted some relevant parts of the docs

view this post on Zulip Luke Boswell (Oct 04 2023 at 05:57):

Unrelated to the current bugs but I just noticed the following comment in crates/compiler/build/src/link.rs, might be worth removing in this PR?

// some examples need the compiler-rt in the app object file.
// but including it on windows causes weird crashes, at least
// when we use zig 0.9. It looks like zig 0.10 is going to fix
// this problem for us, so this is a temporary workaround
if !target.contains("windows") {
    zig_cmd.args([
        // include the zig runtime
        "-fcompiler-rt",
    ]);
}

view this post on Zulip Brian Carroll (Oct 04 2023 at 06:46):

I would say we should first find out if we still need the code. Try deleting and see what happens. But we might want to get everything else green first so that if it goes red, we know it's because of this.
Then if we still need the code, we update the comment to reflect the current reality.
If we don't need the code, we delete code and comment.

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 01:05):

valgrind: the 'impossible' happened

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 01:06):

When valgrind itself crashes and doesn't know why... the "impossible" happened

view this post on Zulip Agus Zubiaga (Oct 13 2023 at 01:07):

#goals

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 01:07):

@Anton can you double check that valgrind is fully updated on the ci server?

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 01:08):

Whichever server would run this test: https://github.com/roc-lang/roc/actions/runs/6502213723/job/17660877316?pr=5851

view this post on Zulip Luke Boswell (Oct 13 2023 at 01:18):

I wonder if that's one of the tests that was failing for the aarch64 dev backend stuff; it might be related.

view this post on Zulip Luke Boswell (Oct 13 2023 at 01:42):

I checked my results and it doesn't look like it was one of the aarch64 test failures

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 02:02):

So trying to dig into the wasm issues currently. Somehow we are getting memory corruption. It happens when calling malloc.

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 02:04):

At least that is my understanding currently

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 03:53):

I assume this is just the manifestation of a linking bug, but will be interesting to dig into

view this post on Zulip Anton (Oct 13 2023 at 08:17):

Anton can you double check that valgrind is fully updated on the ci server?

I've upgraded valgrind on the CI machine from 3.18 to 3.21 and restarted the job.

view this post on Zulip Anton (Oct 13 2023 at 09:38):

The error is very similar with the new valgrind but it did add names to the stacktrace instead of ???:

host stacktrace:
==167378==    at 0x5819CDAA: getUChar (guest_amd64_toIR.c:524)
==167378==    by 0x5819CDAA: dis_ESC_NONE (guest_amd64_toIR.c:19967)
==167378==    by 0x581BEA29: disInstr_AMD64_WRK (guest_amd64_toIR.c:32515)
==167378==    by 0x581BF2D3: disInstr_AMD64 (guest_amd64_toIR.c:32683)
==167378==    by 0x5814A176: disassemble_basic_block_till_stop (guest_generic_bb_to_IR.c:956)
==167378==    by 0x5814B2C2: bb_to_IR (guest_generic_bb_to_IR.c:1365)
==167378==    by 0x5812EE72: LibVEX_FrontEnd (main_main.c:583)
==167378==    by 0x5812F7E0: LibVEX_Translate (main_main.c:1235)
==167378==    by 0x5805B0D2: vgPlain_translate (m_translate.c:1831)
==167378==    by 0x580A0679: handle_tt_miss (scheduler.c:1136)
==167378==    by 0x580A0679: vgPlain_scheduler (scheduler.c:1526)
==167378==    by 0x580E6A0D: thread_wrapper (syswrap-linux.c:102)
==167378==    by 0x580E6A0D: run_a_thread_NORETURN (syswrap-linux.c:155)

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 14:39):

So we are crashing valgrind on that machine, but not on any other machines...fun

view this post on Zulip Richard Feldman (Oct 13 2023 at 14:48):

what are the other differences between that machine and the others? OS version? CPU architecture?

view this post on Zulip Brendan Hansknecht (Oct 13 2023 at 15:15):

@Anton should know, I think he mentioned that it compiles for the specific cpu instead of generic x86 for example

view this post on Zulip Anton (Oct 13 2023 at 15:16):

That one runs without nix, so I believe some dependency difference eventually results in this behavior. I'll do some digging.

view this post on Zulip Anton (Oct 13 2023 at 15:24):

I'm also going to do an apt update && upgrade on that server, you never know...

view this post on Zulip Luke Boswell (Oct 14 2023 at 19:30):

@Anton for the bus error issue. I do know of an issue in the zig platform in roc_panic, the arg should be a *RocStr not *anyopaque. Maybe that is the issue here? I found this when working on my zig platform, but forgot to submit a PR for it as it seemed really minor.

view this post on Zulip Luke Boswell (Oct 14 2023 at 19:37):

Here is a roc_panic that works well, sorry on my phone LINK

view this post on Zulip Luke Boswell (Oct 15 2023 at 04:28):

I've been looking at the *RocStr thing and I cannot reproduce the issue I was having with the zig platform. I'm not sure how a pointer to [*:0]const u8 and a pointer to RocStr can both print out correctly.

pub const RocStr = extern struct {
    str_bytes: ?[*]u8,
    str_len: usize,
    str_capacity: usize,
};

The issue I saw was that roc_panic would print out garbage, but I am not seeing that on the current example at crates/cli_testing_examples/expects/expects.roc, so :man_shrugging:

view this post on Zulip Brendan Hansknecht (Oct 15 2023 at 04:44):

If the error is a small string, they would have the same format
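
A minimal sketch of why that happens, written in Rust rather than Zig. It assumes the small-string layout stores the text inline in the struct's own bytes, with the final byte holding the length plus a high-bit tag; that detail and the helpers `small_roc_str`, `big_roc_str` and `read_as_c_str` are illustrative assumptions, not the platform's actual code.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Same field layout as the RocStr struct quoted above (three machine words).
#[allow(dead_code)]
#[repr(C)]
struct RocStr {
    str_bytes: *mut u8,
    str_len: usize,
    str_capacity: usize,
}

const WORD: usize = std::mem::size_of::<usize>();
const SIZE: usize = std::mem::size_of::<RocStr>();

/// Read the i-th machine word out of a raw byte buffer.
fn word(raw: &[u8], i: usize) -> usize {
    let mut w = [0u8; WORD];
    w.copy_from_slice(&raw[i * WORD..(i + 1) * WORD]);
    usize::from_ne_bytes(w)
}

/// Build a small string: the text lives in the struct's own bytes.
/// Assumption: the last byte is the length with its high bit set as a tag.
fn small_roc_str(text: &str) -> RocStr {
    assert!(text.len() < SIZE - 1, "leave room for a zero byte and the tag byte");
    let mut raw = [0u8; SIZE];
    raw[..text.len()].copy_from_slice(text.as_bytes());
    raw[SIZE - 1] = text.len() as u8 | 0x80;
    RocStr {
        str_bytes: word(&raw, 0) as *mut u8, // never dereferenced
        str_len: word(&raw, 1),
        str_capacity: word(&raw, 2),
    }
}

/// Build a big string: the struct only points at heap bytes owned by `text`.
fn big_roc_str(text: &mut Vec<u8>) -> RocStr {
    RocStr {
        str_bytes: text.as_mut_ptr(),
        str_len: text.len(),
        str_capacity: text.capacity(),
    }
}

/// Treat a pointer to a RocStr as a pointer to a NUL-terminated C string,
/// the way a roc_panic expecting `[*:0]const u8` would.
/// Safety: the struct's bytes must contain a zero byte.
unsafe fn read_as_c_str(ptr: *const RocStr) -> String {
    unsafe { CStr::from_ptr(ptr as *const c_char) }
        .to_string_lossy()
        .into_owned()
}
```

Under these assumptions, reading a small string's struct as a C string yields the original text, because the inline bytes are followed by zero padding. Reading a big string's struct the same way yields the raw bytes of the pointer and length fields instead.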

view this post on Zulip Luke Boswell (Oct 15 2023 at 04:45):

Ah that makes sense. Can confirm that it does mess things up with a large string. I'll make the changes and push to the zig-11 branch, and maybe that will help with our bug.

view this post on Zulip Anton (Oct 15 2023 at 09:25):

I've edited the bus error issue (#5904) to clarify things. For using dbg with roc dev we use two "connected" processes. If you don't set up the two processes correctly, like when manually running the executable produced by dev, you also get a bus error. So I think something in this code is going wrong on the zig-11-llvm-16 branch on the CI machine.

view this post on Zulip Richard Feldman (Oct 22 2023 at 14:34):

this might get fixed by replacing the current dbg implementation with one that uses Inspect (once that's auto-derived, which is WIP on a branch but has bugs and doesn't work yet)

view this post on Zulip Richard Feldman (Oct 22 2023 at 14:35):

at that point we'd no longer have the "two connected processes" design (which has been basically the entire reason we've had so many dbg issues)

view this post on Zulip Luke Boswell (Oct 24 2023 at 04:31):

It looks like all the tests now pass on CI, awesome work :tada: this will make a big difference I think.

view this post on Zulip Brendan Hansknecht (Oct 24 2023 at 04:37):

Oh wow...yeah. We pushed off 2 issues for a later time (related to dbg rework and a specific wasm issue).

view this post on Zulip Brendan Hansknecht (Oct 24 2023 at 04:37):

That said, I think we are ready to merge

view this post on Zulip Brendan Hansknecht (Oct 24 2023 at 04:38):

We also may eventually need to update debugir as well, but again, that has no need to block this branch

view this post on Zulip Brendan Hansknecht (Oct 24 2023 at 04:40):

Given this is a bigger change @Richard Feldman, if you want to give a review to approve the merge, that would be great. #5851

Otherwise, if you have any concerns or blockers, please comment to say so.

view this post on Zulip Brendan Hansknecht (Oct 24 2023 at 04:42):

Also, thanks @Anton, @Brian Carroll and @Folkert de Vries!

You all helped fix various bugs on this branch to make it a reality.

view this post on Zulip Brian Carroll (Oct 24 2023 at 05:23):

Woohoo! This is great! Well done to you @Brendan Hansknecht for taking on this huge piece of work and pushing through it to the end!

view this post on Zulip Richard Feldman (Oct 25 2023 at 18:50):

this is now merged!!! :heart_eyes: :heart_eyes: :heart_eyes:

view this post on Zulip Richard Feldman (Oct 25 2023 at 18:50):

thank you so much to everyone who helped make this happen!

view this post on Zulip Brendan Hansknecht (Oct 25 2023 at 19:23):

Yay!!!

view this post on Zulip Agus Zubiaga (Oct 25 2023 at 19:29):

woohoo I can build on Sonoma :smile:

view this post on Zulip Richard Feldman (Oct 25 2023 at 19:38):

cc @Joshua Warner I think this was also blocking you on macOS?

view this post on Zulip Joshua Warner (Oct 25 2023 at 19:38):

Woot!

view this post on Zulip Joshua Warner (Oct 25 2023 at 19:39):

Been very busy recently with the process of buying and managing renovations on a house, but will look soon.

view this post on Zulip Richard Feldman (Oct 25 2023 at 19:52):

whoa, awesome - congrats on the house! :smiley:


Last updated: Jul 06 2025 at 12:14 UTC