Stream: beginners

Topic: troubleshooting trunk tests


Emi (Feb 20 2022 at 19:51):

Howdy! I'm having some trouble getting tests to complete on trunk (4990fb3). I'm running with earthly +test-all with Docker v20.10.12 on Fedora and getting several tests failing with the message:

          +test-rust *failed* | valgrind stderr was: "runtime: vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0x7C 0x48 0x28 0xD 0x2D 0x69 0x1 0x0
          +test-rust *failed* | vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
          +test-rust *failed* | vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
          +test-rust *failed* | vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
          +test-rust *failed* | ==12150== valgrind: Unrecognised instruction at address 0x10e809.
          +test-rust *failed* | ==12150== Your program just tried to execute an instruction that Valgrind
          +test-rust *failed* | ==12150== did not recognise.  There are two possible reasons for this.
          +test-rust *failed* | ==12150== 1. Your program has a bug and erroneously jumped to a non-code
          +test-rust *failed* | ==12150==    location.  If you are running Memcheck and you just saw a
          +test-rust *failed* | ==12150==    warning about a bad jump, it's probably your program's fault.
          +test-rust *failed* | ==12150== 2. The instruction is legitimate but Valgrind doesn't handle it,
          +test-rust *failed* | ==12150==    i.e. it's Valgrind's fault.  If you think this is the case or
          +test-rust *failed* | ==12150==    you are not sure, please let us know and we'll try to fix it.
          +test-rust *failed* | ==12150== Either way, Valgrind will now raise a SIGILL signal which will
          +test-rust *failed* | ==12150== probably kill your program.
          +test-rust *failed* | "', cli/tests/cli_run.rs:153:17
          +test-rust *failed* | stack backtrace:
          +test-rust *failed* |    0: rust_begin_unwind
          +test-rust *failed* |              at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panicking.rs:517:5
          +test-rust *failed* |    1: std::panicking::begin_panic_fmt
          +test-rust *failed* |              at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/std/src/panicking.rs:460:5
          +test-rust *failed* |    2: cli_run::cli_run::check_output_with_stdin
          +test-rust *failed* |    3: core::ops::function::FnOnce::call_once
          +test-rust *failed* |    4: serial_test::serial_core
          +test-rust *failed* |    5: core::ops::function::FnOnce::call_once
          +test-rust *failed* |              at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/core/src/ops/function.rs:227:

Emi (Feb 20 2022 at 19:51):

any tips on how to deal with this?

Brendan Hansknecht (Feb 20 2022 at 19:54):

I think that most people don't actually run tests with earthly; it is mostly used for CI.

Brendan Hansknecht (Feb 20 2022 at 19:54):

As for the actual issue, what CPU/architecture are you running on?

Emi (Feb 20 2022 at 19:54):

This is an intel x86_64 (11th Gen Intel(R) Core(TM) i3-1115G4)

Emi (Feb 20 2022 at 19:55):

also good to know! is cargo test a better way to do it?

Emi (Feb 20 2022 at 19:55):

I had been running earthly because that's what's in CONTRIBUTING.md

Brendan Hansknecht (Feb 20 2022 at 20:05):

cargo test should be good

Emi (Feb 20 2022 at 20:06):

thanks!

Emi (Feb 20 2022 at 20:23):

I'm actually getting the same error when running tests outside of Docker? The command is cargo test cli_run

Emi (Feb 20 2022 at 20:24):

wait no slightly different

Emi (Feb 20 2022 at 20:24):

might be able to debug this one on my own, hold on

Emi (Feb 20 2022 at 20:29):

okay no, I'm out of my league here, sorry

Emi (Feb 20 2022 at 20:29):

just running the zig test (to minimize output), I get this error message:

---- cli_run::hello_zig stdout ----
thread 'cli_run::hello_zig' panicked at '`valgrind` exited with no exit code. valgrind stdout was: ""

valgrind stderr was: "vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFE 0x48 0x6F 0x5 0xD6 0xDC 0x4 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==106615== valgrind: Unrecognised instruction at address 0x114c88.
==106615== Your program just tried to execute an instruction that Valgrind
==106615== did not recognise.  There are two possible reasons for this.
==106615== 1. Your program has a bug and erroneously jumped to a non-code
==106615==    location.  If you are running Memcheck and you just saw a
==106615==    warning about a bad jump, it's probably your program's fault.
==106615== 2. The instruction is legitimate but Valgrind doesn't handle it,
==106615==    i.e. it's Valgrind's fault.  If you think this is the case or
==106615==    you are not sure, please let us know and we'll try to fix it.
==106615== Either way, Valgrind will now raise a SIGILL signal which will
==106615== probably kill your program.
"', cli/tests/cli_run.rs:153:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Emi (Feb 20 2022 at 20:30):

the binaries produced seem to run fine on their own
(when I run the hello-world binary, I get the expected output)

╰─$ cargo run examples/hello-zig/Hello.roc
    Finished dev [unoptimized + debuginfo] target(s) in 0.14s
     Running `target/debug/roc examples/hello-zig/Hello.roc`
🔨 Rebuilding host... Done!
Hello, World!
runtime: 0.020ms

Emi (Feb 20 2022 at 20:31):

but running it with valgrind produces this output

==107064== Memcheck, a memory error detector
==107064== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==107064== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==107064== Command: examples/hello-zig/hello-world
==107064==
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFE 0x48 0x6F 0x5 0xD6 0xDC 0x4 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==107064== valgrind: Unrecognised instruction at address 0x114c88.
==107064==    at 0x114C88: std.debug.attachSegfaultHandler (debug.zig:1763)
==107064==    by 0x113428: std.debug.maybeEnableSegfaultHandler (debug.zig:1748)
==107064==    by 0x111AD2: std.start.callMainWithArgs (start.zig:366)
==107064==    by 0x111882: main (start.zig:383)
==107064== Your program just tried to execute an instruction that Valgrind
==107064== did not recognise.  There are two possible reasons for this.
==107064== 1. Your program has a bug and erroneously jumped to a non-code
==107064==    location.  If you are running Memcheck and you just saw a
==107064==    warning about a bad jump, it's probably your program's fault.
==107064== 2. The instruction is legitimate but Valgrind doesn't handle it,
==107064==    i.e. it's Valgrind's fault.  If you think this is the case or
==107064==    you are not sure, please let us know and we'll try to fix it.
==107064== Either way, Valgrind will now raise a SIGILL signal which will
==107064== probably kill your program.
==107064==
==107064== Process terminating with default action of signal 4 (SIGILL): dumping core
==107064==  Illegal opcode at address 0x114C88
==107064==    at 0x114C88: std.debug.attachSegfaultHandler (debug.zig:1763)
==107064==    by 0x113428: std.debug.maybeEnableSegfaultHandler (debug.zig:1748)
==107064==    by 0x111AD2: std.start.callMainWithArgs (start.zig:366)
==107064==    by 0x111882: main (start.zig:383)
==107064==
==107064== HEAP SUMMARY:
==107064==     in use at exit: 0 bytes in 0 blocks
==107064==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==107064==
==107064== All heap blocks were freed -- no leaks are possible
==107064==
==107064== For lists of detected and suppressed errors, rerun with: -s
==107064== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
fish: Job 1, 'valgrind examples/hello-zig/hel…' terminated by signal SIGILL (Illegal instruction)

Emi (Feb 20 2022 at 20:32):

valgrind is valgrind-3.18.1 from the Fedora repos

Folkert de Vries (Feb 20 2022 at 20:40):

that kinda looks like a zig issue then

Folkert de Vries (Feb 20 2022 at 20:41):

how does hello-rust do?

Brendan Hansknecht (Feb 20 2022 at 21:59):

This likely means that zig is generating an instruction that valgrind doesn't know about. That has happened to me in the past, but it is pretty rare. It is totally possible, though, especially with a CPU target of native/host.
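
The byte dump in the valgrind output gives a hint about which kind of instruction is involved. Both failing sequences in this thread start with 0x62, which in 64-bit mode is the EVEX prefix used by AVX-512 encodings. The Python sketch below is not part of the Roc or Valgrind tooling, and the function name decode_evex is made up for illustration. It only reads a few prefix fields and does not attempt a full disassembly.

```python
def decode_evex(raw: str) -> dict:
    """Extract a few EVEX prefix fields from a valgrind byte dump.

    `raw` is the space-separated hex list that valgrind prints after
    "unhandled instruction bytes:".
    """
    data = [int(tok, 16) for tok in raw.split()]
    if len(data) < 5 or data[0] != 0x62:
        raise ValueError("does not start with the 0x62 EVEX prefix byte")
    p0, p1, p2, opcode = data[1:5]
    return {
        "opcode_map": p0 & 0b11,   # 1 = 0F, 2 = 0F38, 3 = 0F3A
        "w": (p1 >> 7) & 1,        # operand-size / opcode extension bit
        "pp": p1 & 0b11,           # implied prefix: 0 none, 1 = 66, 2 = F3, 3 = F2
        # L'L field: 0 -> 128-bit, 1 -> 256-bit, 2 -> 512-bit vectors
        "vector_bits": {0: 128, 1: 256, 2: 512}.get((p2 >> 5) & 0b11),
        "opcode": opcode,
    }


# The two dumps quoted earlier in this thread.
for dump in (
    "0x62 0xF1 0x7C 0x48 0x28 0xD 0x2D 0x69 0x1 0x0",
    "0x62 0xF1 0xFE 0x48 0x6F 0x5 0xD6 0xDC 0x4 0x0",
):
    print(decode_evex(dump))
```

Both dumps decode to a 512-bit vector length with opcodes 0x28 and 0x6F in the 0F map, which belong to the vector move family. That would be consistent with the compiler emitting AVX-512 moves for a native CPU target on a machine that supports them.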

Brendan Hansknecht (Feb 20 2022 at 22:00):

That may mean they need a newer version of valgrind to run the tests, or need to tell zig to target an older CPU.
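
One way to check whether the CPU target matters on a given machine is to see which AVX-512 features the host advertises. The sketch below is a hypothetical helper (avx512_flags is not an existing tool) that assumes Linux, where /proc/cpuinfo lists feature names such as avx512f on its "flags" line.

```python
from pathlib import Path


def avx512_flags(cpuinfo_text: str) -> list[str]:
    """Return the sorted AVX-512 feature flags found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Every core repeats the same list, so the first match is enough.
            return sorted(f for f in value.split() if f.startswith("avx512"))
    return []


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; absent on macOS and Windows
    if path.exists():
        print(avx512_flags(path.read_text()) or "no AVX-512 flags reported")
```

If the list is non-empty, a compiler targeting the native CPU is free to use those instructions, which would explain a valgrind build without AVX-512 support rejecting the resulting binary.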

Folkert de Vries (Feb 20 2022 at 22:04):

3.18.1 seems to be the most recent https://valgrind.org/downloads/current.html

Brian Carroll (Feb 20 2022 at 22:44):

We're still on Zig 0.8.1 rather than the latest 0.9.1, right? What's your zig version, @Emi ?

Emi (Feb 20 2022 at 22:44):

Oh yes! It's just the zig examples, I don't know how I missed that, thank you! I'm running zig 0.8.1, but I'll try messing around with valgrind versions and seeing if I can figure out how to target different cpus, like Brendan said

Brian Carroll (Feb 20 2022 at 22:46):

Also just FYI, I've never run valgrind on this project but I'm a regular contributor!

Emi (Feb 20 2022 at 22:46):

oh fair! what do you do for tests? just skip the valgrind ones?

Folkert de Vries (Feb 20 2022 at 22:51):

a strategy is to just run the tests that work and have CI do valgrind
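
As a rough illustration of that strategy, the Rust test harness accepts --skip filters after the -- separator. The sketch below builds such a command line in Python. The helper name cargo_test_command is made up, and it assumes that the valgrind-backed tests are the ones matching cli_run, as the cargo test cli_run run earlier in this thread suggests.

```python
import shlex


def cargo_test_command(skip: list[str]) -> list[str]:
    """Build a `cargo test` invocation that skips tests matching each pattern."""
    cmd = ["cargo", "test", "--"]
    for pattern in skip:
        # `--skip` is handled by the test harness, hence it goes after `--`.
        cmd += ["--skip", pattern]
    return cmd


print(shlex.join(cargo_test_command(["cli_run"])))
# prints: cargo test -- --skip cli_run
```

The resulting list can be passed to subprocess.run from the repository root, or the printed line can be pasted into a shell.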

Folkert de Vries (Feb 20 2022 at 22:51):

depending on what you do, it's probably very rare you break the valgrind tests

Emi (Feb 20 2022 at 22:52):

gotcha, thanks!

Folkert de Vries (Feb 20 2022 at 22:55):

so something you could do is have valgrind just do nothing (e.g. make a script with the name "valgrind" that just calls the binary)
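
A minimal sketch of such a stand-in, written in Python, could look like the following. It assumes the file is saved as an executable named valgrind somewhere earlier on PATH than the real one, and that the tests only need the program's own output and exit code. If the test harness reads valgrind's own report, this stand-in would not satisfy it. The helper name target_command is made up for illustration.

```python
#!/usr/bin/env python3
import os
import sys


def target_command(args: list[str]) -> list[str]:
    """Drop leading valgrind options (anything starting with '-')."""
    i = 0
    while i < len(args) and args[i].startswith("-"):
        i += 1
    # Everything from the first non-option onward is the program and its args.
    return args[i:]


if __name__ == "__main__" and len(sys.argv) > 1:
    cmd = target_command(sys.argv[1:])
    if not cmd:
        sys.exit("valgrind stand-in: no program given")
    # Replace this process with the program so that its output and exit
    # status pass through unchanged.
    os.execvp(cmd[0], cmd)
```

Because it skips valgrind entirely, it provides no memory checking at all. It only gets the affected tests past the unrecognised-instruction failure locally, leaving the real valgrind run to CI.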

Folkert de Vries (Feb 20 2022 at 22:55):

if the problem persists, of course; it might also be fixed when we upgrade to zig 0.9.1

Emi (Feb 20 2022 at 23:10):

I think I'll probably do something just like that

Emi (Feb 20 2022 at 23:15):

thanks everyone for the help!

Brendan Hansknecht (Feb 20 2022 at 23:27):

Related to this: should we change the contributing docs to say earthly is mostly for CI and that contributors can just do cargo test?

Anton (Feb 21 2022 at 08:25):

I made a PR for this

Aaron (Feb 22 2022 at 20:03):

this worked for me on OSX 10.15.7

brew tap LouisBrunner/valgrind
brew install --HEAD LouisBrunner/valgrind/valgrind

Brendan Hansknecht (Feb 22 2022 at 22:42):

Yeah, if you have an older macOS and it is x86, that does work.


Last updated: Jul 06 2025 at 12:14 UTC