The upgrade to Rust 1.67 somehow breaks the Wasm REPL. We get an "invalid instruction byte" error (from the OpCode type). I don't see why incorrect Wasm would be produced now.
I ran wasm-objdump on the compiler wasm blob (i.e. the WebAssembly that the Rust compiler generates from our code base) and it does not report any problems. I'd assume it would error out when it hits some invalid instruction.
the basic_1_field_f64_record
test fails with the following backtrace / invalid instruction byte
706d29 GETLOCAL 0 [I32(2ac7f0)]
706d2b I32CONST 0 [I32(2ac7f0), I32(0)]
706d2d GETLOCAL 2 [I32(2ac7f0), I32(0), I32(c)]
706d2f CALL 5420 args=[I32(2ac7f0), I32(0), I32(c)] return_type=Some(I32)
708df0 BLOCK 64 []
708df2 GETLOCAL 2 [I32(c)]
708df4 I32CONST 33 [I32(c), I32(21)]
708df6 I32LTU [I32(1)]
thread 'tests::basic_1_field_f64_record' panicked at 'invalid instruction byte 0xfc', crates/wasm_module/src/opcodes.rs:373:21
This is confusing because when I dump the wasm that is executed to a file, the program at those memory locations is very different
706d2f: 10 ac aa 80 80 00 | call 5420 <memset>
708df0: 41 01 | i32.const 1
708df2: 6a | i32.add
708df3: 10 a7 aa 80 80 00 | call 5415
708df9: 1a | drop
708dfa: 41 00 | i32.const 0
708dfc: 21 03 | local.set 3
708deb func[5420] <memset>:
708dec: 03 7f | local[0..2] type=i32
708dee: 01 7e | local[3] type=i64
708df0: 02 40 | block
708df2: 20 02 | local.get 2
708df4: 41 21 | i32.const 33
708df6: 49 | i32.lt_u
So something gets messed up in our Wasm codegen somewhere, and for some reason a Rust compiler upgrade surfaces the problem? Is there some unsoundness, or maybe the Wasm blob that rustc generates has a subtly different structure now?
I'll continue investigating (I hope a diff between what main generates and what the 1.67 branch generates proves helpful)
(cc @Brian Carroll)
Sounds to me like a problem in the interpreter
I assume this is using our own interpreter, right? The dump looks like the one I wrote.
So internally the program counter is an index into the code section of the binary, not an index into the full Wasm binary. When we print out an error dump we adjust it to display it relative to the full binary, in an attempt to match wasm-objdump
. There have been bugs there before and there might be some remaining.
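For concreteness, the adjustment is roughly this (a tiny sketch with made-up names, not the real code):

// Made-up names: internally the program counter indexes into the code
// section's bytes; for the error dump we add the section's offset within
// the full .wasm file so the addresses line up with wasm-objdump.
fn display_address(code_section_offset: usize, program_counter: usize) -> usize {
    code_section_offset + program_counter
}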
So let's assume we can't rely on the code addresses in the error dump. I would search the dump for the instruction CALL 5420
and find a place where the surrounding code matches the dump.
Also might be worth running it in some 3rd party Wasm interpreter if you have one installed.
I'd assume it would error out when it hits some invalid instruction.
Yes that's right
It's possible that the interpreter is miscalculating the number of bytes in some instruction and therefore losing its place in the instruction stream.
And maybe the previous Rust version didn't use that instruction for some reason.
Is there some easy way to export the program so it is runnable with an external interpreter? I'm using the WasmModule serialize right now, but it looks like that is not quite a complete program.
> wasmtime run --invoke entrypoint_from_test full-new.wasm
Error: failed to run main module `full-new.wasm`
Caused by:
0: failed to instantiate "full-new.wasm"
1: unknown import: `env::test_run_app` has not been defined
Based on the code, JavaScript normally provides this function.
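One way around it might be to embed wasmtime and stub the import out ourselves. A rough, untested sketch; the test_run_app signature and the entrypoint's result type are guesses here:

// Rough, untested sketch using the wasmtime crate; the test_run_app
// signature and the entrypoint's result type are assumptions.
use wasmtime::{Engine, Linker, Module, Store};

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    let module = Module::from_file(&engine, "full-new.wasm")?;
    let mut store = Store::new(&engine, ());

    let mut linker = Linker::new(&engine);
    // Dummy stand-in for the function that JavaScript normally provides.
    linker.func_wrap("env", "test_run_app", |_: i32, _: i32| -> i32 { 0 })?;

    let instance = linker.instantiate(&mut store, &module)?;
    let entry = instance.get_typed_func::<(), i32>(&mut store, "entrypoint_from_test")?;
    println!("result: {}", entry.call(&mut store, ())?);
    Ok(())
}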
things seem correct up to the call
706d29 GETLOCAL 0 [I32(2ac7f0)]
706d2b I32CONST 0 [I32(2ac7f0), I32(0)]
706d2d GETLOCAL 2 [I32(2ac7f0), I32(0), I32(c)]
706d2f CALL 5420 args=[I32(2ac7f0), I32(0), I32(c)] return_type=Some(I32)
seems to match (also the offset)
706d42: 20 00 | local.get 0
706d44: 41 00 | i32.const 0
706d46: 20 02 | local.get 2
706d48: 10 ac aa 80 80 00 | call 5420
so our interpreter seems to think that function[5420] is in a different place than it actually is, right? because in practice it is here
708e04 func[5420]:
708e05: 03 7f | local[0..2] type=i32
708e07: 01 7e | local[3] type=i64
708e09: 02 40 | block
but our interpreter thinks it is here
708df0 BLOCK 64 []
708df2 GETLOCAL 2 [I32(c)]
708df4 I32CONST 33 [I32(c), I32(21)]
708df6 I32LTU [I32(1)]
it is off by 20 bytes. Could it be relevant that that is how many bytes of locals we have here (3 * 4 + 8)?
I'm pretty sure this is where the program counter is set to an incorrect value:
self.program_counter = module.code.function_offsets[internal_fn_index] as usize;
the offset this gives (corrected by the section offset) is somewhere in the middle of a function
Alright, found the problem. There are some (new?) Wasm instructions that our interpreter does not have yet
specifically memory.fill
https://webassembly.github.io/spec/core/binary/instructions.html#memory-instructions
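(0xFC is the prefix byte for that "miscellaneous" instruction group, which matches the invalid instruction byte 0xfc in the panic.) Roughly what handling it involves; placeholder names, not our real interpreter code:

// Sketch with placeholder names. memory.fill is encoded as the prefix byte
// 0xFC, a LEB128 sub-opcode (11), and a memory index (0 for now). It pops a
// size, a fill value and a destination address, and fills that many bytes
// of linear memory.
fn execute_fc_prefixed(
    code: &[u8],
    pc: &mut usize, // assumed to already point past the 0xFC byte
    stack: &mut Vec<i32>,
    memory: &mut [u8],
) -> Result<(), String> {
    match read_u32_leb(code, pc) {
        11 => {
            // memory.fill
            let _mem_idx = read_u32_leb(code, pc);
            let n = stack.pop().unwrap() as usize;
            let value = stack.pop().unwrap() as u8;
            let dest = stack.pop().unwrap() as usize;
            memory[dest..dest + n].fill(value);
            Ok(())
        }
        other => Err(format!("unsupported 0xFC sub-opcode {other}")),
    }
}

// Minimal LEB128 u32 decoder, just enough for the sketch.
fn read_u32_leb(code: &[u8], pc: &mut usize) -> u32 {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = code[*pc];
        *pc += 1;
        result |= ((byte & 0x7F) as u32) << shift;
        if byte & 0x80 == 0 {
            return result;
        }
        shift += 7;
    }
}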
it is off by 20 bytes. Could it be relevant that that is how many bytes of locals we have here (3 * 4 + 8)?
Yes that could be relevant for sure. It depends specifically what you mean by "our interpreter thinks it is here"! Expected behaviour is that it reads the local variable declarations, allocates space for them, then resumes executing instructions at the next byte. So if you are just dumping the program counter it sounds like normal behaviour.
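Roughly what I mean, as a sketch with made-up names (the local-group counts are read as single bytes here for brevity; they are really LEB128 u32s):

// The function body starts with a vector of local groups, e.g. `03 7f` =
// three i32 locals, `01 7e` = one i64 local. The interpreter consumes those
// bytes before it reaches the first instruction, so a dumped program counter
// can legitimately sit a few bytes past the function's declared start.
fn read_local_declarations(code: &[u8], pc: &mut usize, locals: &mut Vec<u64>) {
    let group_count = code[*pc] as usize;
    *pc += 1;
    for _ in 0..group_count {
        let count = code[*pc] as usize;
        let _value_type = code[*pc + 1]; // 0x7F = i32, 0x7E = i64, ...
        *pc += 2;
        // locals start zero-initialised regardless of type
        locals.extend(std::iter::repeat(0).take(count));
    }
    // `pc` now points at the first real instruction of the body
}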
Yeah I just implemented the Wasm MVP, didn't add any later features.
It might be possible to tell rustc
what version of Wasm you're targeting
I've never tried that or looked it up or anything, just saying it might be possible :shrug:
memory.fill
sounds like it might be easy to implement though!
Though of course we could then immediately run into another unsupported instruction!
I got things working now
needed some new wasi stubs too for env variables
PR: https://github.com/roc-lang/roc/pull/5651. Quite the debugging journey. It turns out the original error was correct; it was just an instruction that we did not implement. Also, there should be a panic in such cases, but we just get UB. That is unfortunate, because the safe alternative is _much_ slower. So, just good to know for next time, I guess; I don't really want to make things slower.
Hmm the main match
produces a Result
with an error enum and eventually panics. We could try that. Or is that already what you did?
well, we do this
impl From<u8> for OpCode {
    fn from(x: u8) -> Self {
        unsafe { std::mem::transmute(x) }
    }
}
and putting in an invalid value here (for which there is no OpCode
variant) is just immediately UB
When you wrote it, you did some testing and it did not cause issues immediately, and this branch here would catch some errors:
// Catch-all in case of an invalid cast from u8 to OpCode while parsing binary
// (rustc keeps this code, I verified in Compiler Explorer)
#[allow(unreachable_patterns)]
_ => return Err(format!("Unknown Wasm instruction 0x{:02x}", op as u8)),
but that function is not in all paths. I think the Rust compiler is now stricter about that conversion than when the code was first written, which means that an invalid conversion by that unsafe code now produces code that (on my system) just crashes (it hits an illegal instruction, but it's UB, so it could do anything)
But any other conversion mechanism (certainly a match, which I tried today; I believe we also tried a lookup table at the time?) is much, much slower than this unsafe cast.
And because we control the input for our interpreter, I think it is OK to cut this corner. But for the future: when new instructions are added, our Wasm interpreter will break.
Oh yes I remember more clearly now, thanks :+1:
Hmm I don't _think_ we tried defining a static array of Option<OpCode>
and looking it up.
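Something like this, roughly; the variants and byte values here are illustrative, not our real OpCode definition:

#[derive(Clone, Copy, Debug)]
#[repr(u8)]
enum OpCode {
    Unreachable = 0x00,
    Nop = 0x01,
    Block = 0x02,
    // ...
}

// One 256-entry table built at compile time; an invalid byte maps to None
// instead of being transmuted into a nonexistent variant (which is UB).
static OPCODE_TABLE: [Option<OpCode>; 256] = build_table();

const fn build_table() -> [Option<OpCode>; 256] {
    let mut table = [None; 256];
    table[0x00] = Some(OpCode::Unreachable);
    table[0x01] = Some(OpCode::Nop);
    table[0x02] = Some(OpCode::Block);
    // ... fill in the rest of the supported opcodes
    table
}

fn decode(byte: u8) -> Result<OpCode, String> {
    OPCODE_TABLE[byte as usize]
        .ok_or_else(|| format!("Unknown Wasm instruction 0x{byte:02x}"))
}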
we can test that separately, it doesn't really matter for the current fix, which is the final blocker for the 1.67 upgrade.
so a review on that PR would be nice
I tried out this code, which looks pretty good https://godbolt.org/z/Mra3d56YY
I took a wasm binary (the roc compiler blob) and filtered out all invalid instruction bytes. Then I have two loops summing the instruction byte values. Here are the runtimes
[benches/conversion.rs:14] input.len() = 8084815
unsafe cast time: [10.461 ms 10.479 ms 10.499 ms]
change: [-0.0001% +0.2507% +0.4915%] (p = 0.05 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
lookup table time: [12.507 ms 12.524 ms 12.543 ms]
change: [-0.1140% +0.0861% +0.2867%] (p = 0.42 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
[benches/conversion.rs:43] x = Some(
386028267,
)
[benches/conversion.rs:43] y = Some(
386028267,
)
So, it's slower, but not that much. Not sure what to do with that, but now we have some actual numbers.
I guess the byte-to-opcode conversion really isn't the bottleneck, given that we can do it 8 million times in ~10ms. That becoming 12ms is not a big slowdown, and other parts of the interpreter loop are much slower than that.
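For reference, the benchmark was shaped roughly like this (criterion; not the exact benches/conversion.rs code — the input path is made up, and OpCode / OPCODE_TABLE stand in for the real definitions, like the lookup-table sketch above):

use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_conversion(c: &mut Criterion) {
    // the compiler wasm blob, pre-filtered down to valid opcode bytes
    let input: Vec<u8> = std::fs::read("filtered_opcodes.bin").unwrap();

    c.bench_function("unsafe cast", |b| {
        b.iter(|| {
            let mut sum = 0u64;
            for &byte in black_box(&input) {
                let op: OpCode = byte.into(); // the transmute-based From impl
                sum += op as u64;
            }
            sum
        })
    });

    c.bench_function("lookup table", |b| {
        b.iter(|| {
            let mut sum = 0u64;
            for &byte in black_box(&input) {
                sum += OPCODE_TABLE[byte as usize].unwrap() as u64;
            }
            sum
        })
    });
}

criterion_group!(benches, bench_conversion);
criterion_main!(benches);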
OK so the change gives us better errors for unsupported instructions and doesn't have much of a performance penalty. Sounds great.
hmm, do we accept this? this is in debug mode, so I guess it's fine? 30 seconds delta
Executed in 415.85 secs fish external
usr time 48.30 mins 797.39 millis 48.29 mins
sys time 0.13 mins 154.94 millis 0.13 mins
________________________________________________________
Executed in 384.09 secs fish external
usr time 46.50 mins 819.00 micros 46.50 mins
sys time 0.08 mins 0.00 micros 0.08 mins
So this is still the test that sums up instruction bytes, right? Hard to make a call based on that alone. I was assuming that we'd see little difference in some of our real use cases like the tests for the Wasm REPL.
ah, no, this is the wasm repl test run on my machine
but in debug mode
OK, when we did analysis on this kind of thing before, we used release so that it would be representative of CI
oh, wait it is release actually
those tests are really slow then