https://github.com/roc-lang/roc/pull/7514 is currently failing CI due to one of the mono tests failing. Unfortunately, it seems mac aarch64 outputs different mono than the rest of the targets. It seems there's some off-by-one difference in `Derived_gen` identifiers that only shows up in one example. If we fix it for mac aarch64, it breaks the other targets; if we fix the other targets, it breaks for mac aarch64.
Is there host/arch-specific logic in the mono pass? I searched for any code with `#[cfg(.*target_os = "macos"` and nothing jumped out at me, but that's about as far as I knew to look.
the CI run where mac aarch64 fails on mono mismatch: https://github.com/roc-lang/roc/actions/runs/12778134753/job/35620405661
the CI run where everything else fails on mono mismatch: https://github.com/roc-lang/roc/actions/runs/12778510523/job/35623205009
Nix calls the architecture darwin
Maybe Rust does the same?
But yeah, it's very rare for us to do something for Macs, it's more often Unix filtering
nah, no results with `cfg\(.*darwin`, it looks like `target_os = "macos"` is the rust cfg used in roc to conditionally do things for mac.
just because I didn't see anything doesn't mean someone with a better understanding of mono or preceding passes wouldn't. I don't have a good idea how/where `Derived_gen` identifiers are even created or where that number comes from in eg `#Derived_gen.22`
`Derived_gen.21` does not exist in https://github.com/roc-lang/roc/blob/144b02939d8ac04b32ae1a932ddc6f722adef01e/crates/compiler/test_mono/generated/inspect_derived_tag_one_field_string.txt
That seems like a bug
Skipping a number seems very suspicious
I can investigate on Friday if you don't have a mac @shua
Thanks, I do not have a mac. I'll still keep reading mono source just for the heck of it
just to sanity check: "apple-silicon" would follow "aarch64" codepaths right? not, for instance, "aarch32"
that's correct
I have not figured this out yet but will continue tomorrow @shua
Anton said:
Skipping a number seems very suspicious
Is skipping a number suspicious only for the `#Derived_gen` module or for all? I ask because, eg `#Derived.1` and `#Derived.3` are in that output but `#Derived.2` is nowhere to be found.
best I've managed to dig: `Str.replace_each_help` (or `Str.61` in the non-prettified output)

The `#Derived_gen.20` and `#Derived_gen.21` idents are (on my linux machine) generated in `apply_trmc` for the two recursive functions `Str.first_match_help`/`Str.63` and `Str.replace_each_help`/`Str.61` respectively. Whatever fishy is happening has to be happening after `Str.first_match_help` arg symbols are generated but before `Str.replace_each_help` arg symbols are generated.

`apply_trmc` is looping over procs, and holding a `mut` reference to the `IdentIds` struct which generates `#Derived_gen` idents, so I can't see how anything else could be incrementing the counter there; it must be something in the `apply_trmc` loop.

I've added a `dbg!(_)` to the current branch to hopefully get enough info from a failing test run to tell what might be different between mac and linux.
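Roughly the shape of what's going on, with made-up types and numbers (this is not the actual roc code, just a sketch of why the `#Derived_gen` numbers depend on the order the procs are visited):

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for IdentIds and the proc map: every fresh ident
// bumps a shared counter, so whichever proc happens to be visited first
// claims the lower #Derived_gen number.
struct IdentIds {
    next: u32,
}

impl IdentIds {
    fn gen_unique(&mut self) -> String {
        let id = self.next;
        self.next += 1;
        format!("#Derived_gen.{id}")
    }
}

fn main() {
    let mut procs: HashMap<&str, ()> = HashMap::new();
    procs.insert("Str.first_match_help", ());
    procs.insert("Str.replace_each_help", ());

    let mut ident_ids = IdentIds { next: 20 };

    // The equivalent of apply_trmc looping over procs while holding a
    // &mut IdentIds: if the iteration order of `procs` differs between
    // platforms, the same proc ends up with a different #Derived_gen ident.
    for (name, _) in &procs {
        println!("{name} got {}", ident_ids.gen_unique());
    }
}
```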
Thanks @Luke Boswell for the mac output. It looks like on mac, the order in which procs are processed is changed. So `Str.matches_at_help` is processed, creates `#Derived_gen.21`, then throws the result away, then `Str.replace_each_help` is processed. On linux that order is switched, and `Str.matches_at_help` creates `#Derived_gen.25`, but since it's thrown away, you don't see that in the output.
That is the only difference in the order of how procs are processed between the two platforms for the `test_mono` `inspect_derived_tag_one_field_string` test.
Now to figure out why the order changes.
Glad you found that. Sounds puzzling
Yeah, that's some great info
trying to find any info on whether `HashMap` with `wyhash::WyHash` is stable across cpu/os targets or not. There's an implicit requirement in our code that the procs are iterated in the same order, but we're just calling `HashMap::values`, which I don't think gives any guarantees in terms of ordering.
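To make the ordering point concrete, here's a small standalone sketch (plain std, illustrative keys, not roc code): with identical insertions, `values()`/`keys()` order is a function of the hasher, so a hasher that behaves differently on another target would reorder the procs.

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::BuildHasherDefault;

// Same keys inserted in the same order into two maps that differ only in
// their hasher: the two iteration orders typically differ from each other,
// and neither one is the insertion order.
fn main() {
    let keys = ["Str.61", "Str.63", "Str.64", "Str.70", "Str.85"];

    // Deterministic hasher (DefaultHasher with fixed keys).
    let mut a: HashMap<&str, usize, BuildHasherDefault<DefaultHasher>> =
        HashMap::default();
    // std's default RandomState, seeded randomly per map.
    let mut b: HashMap<&str, usize> = HashMap::new();

    for (i, k) in keys.iter().copied().enumerate() {
        a.insert(k, i);
        b.insert(k, i);
    }

    println!("a: {:?}", a.keys().collect::<Vec<_>>());
    println!("b: {:?}", b.keys().collect::<Vec<_>>());
}
```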
I think @Brendan Hansknecht knows all about that
we're just calling `HashMap::values`
Can you link to this line of code @shua?
We could try sorting them and see if it fixes our issue.
Anton said:
we're just calling `HashMap::values`
Can you link to this line of code @shua?
Is it here?
We could try sorting them and see if it fixes our issue.
Sorting comes with a perf cost; maybe we should just ignore diffs where only the numbers are different? That seems pretty reasonable
that would hide bugs where there are symbol collisions
Ok
i think sorting is okay, can revisit it if it becomes a big problem or do it only in tests
A lot of `Eq`, `Ord`, `PartialOrd` impls would have to be added all over the place to be able to sort. `WyHash` allows passing a seed; I'm looking into that now
we could try using indexmap or btreemap, something that does guarantee a deterministic iteration order (insertion order for indexmap, key order for btreemap), which should be the same across platforms I guess?
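For reference, a minimal sketch of the indexmap option (assumes the `indexmap` crate as a dependency; the keys here are made up):

```rust
use indexmap::IndexMap;

// IndexMap still hashes for O(1) lookups, but it stores entries in a Vec and
// iterates in insertion order, so iteration is identical on every platform
// as long as the insertions are.
fn main() {
    let mut procs: IndexMap<&str, u32> = IndexMap::new();
    procs.insert("Str.63", 0);
    procs.insert("Str.61", 1);
    procs.insert("Str.64", 2);

    // Always prints Str.63, Str.61, Str.64 -- the insertion order.
    for (name, id) in &procs {
        println!("{name} -> {id}");
    }
}
```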
Anton said:
A lot of `Eq`, `Ord`, `PartialOrd` impls would have to be added all over the place to be able to sort. `WyHash` allows passing a seed; I'm looking into that now
We could `sort_by_key` to avoid those bounds (as long as you can get some sortable key from the keys or values)
^at least, `Vec::sort_by_key` exists, not sure if that's what yous were thinking
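Something along these lines, maybe (illustrative key/value types, not the real proc types):

```rust
use std::collections::HashMap;

// The sort_by_key idea: collect the entries into a Vec and sort by something
// that is already Ord (e.g. a symbol name/number), so the proc values
// themselves never need Ord/PartialOrd impls.
fn main() {
    let mut procs: HashMap<(&str, u32), &str> = HashMap::new();
    procs.insert(("Str.replace_each_help", 61), "proc body");
    procs.insert(("Str.first_match_help", 63), "proc body");

    let mut entries: Vec<_> = procs.into_iter().collect();
    // Only the key needs to be sortable; the values are untouched.
    entries.sort_by_key(|&((name, id), _)| (name, id));

    for ((name, id), _) in entries {
        println!("{name} ({id})");
    }
}
```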
Yeah, if you need stable insertion order, I would switch to index map
Wyhash generally uses the same hash for given bit widths, but it definitely differs on 32bit and could differ in other cases.
All that said, feels quite surprising if this is the very first time we hit an ordering difference, but it's possible
Alright, my sort is pretty hacky but that fixed it. I'll push my code later tonight and will make an issue to switch to an index map.
Fix in PR#7532
Index map issue: #7531
Brendan Hansknecht said:
All that said, feels quite surprising if this is the very first time we hit an ordering difference, but it's possible
It's strange to me that something as wide-ranging as hashing differences would only show up by swapping two elements in a single test run. I would expect to be seeing a bunch more things out of order
I changed the dbg's in my PR, because I want to check whether the difference is 1. in how the keys are hashed, or 2. in the order the procs are inserted. If the problem is 1, then something like indexmap would help, but if it's 2 then indexmap would not fix this issue and I'd have to look further back in the flow to see where they diverge.
Specifically, my ask is someone with an arm mac please pull the latest from that branch, run `cargo test -p test_mono inspect_derived_tag_one -- --nocapture >dbg-mac.out 2>&1`, and then send me the `dbg-mac.out` file.
edit: `2>&1` was in the wrong place
gotta sign off soon but feel free to add a :eyes: to that comment if you're doing it so two people don't duplicate work
ha, comment was ninja'd by @Luke Boswell, thanks for looking into this with me!
ah dang, I got my shell args out of order, can you run `cargo test -p test_mono inspect_derived_tag_one -- --nocapture >dbg-mac.out 2>&1` instead?
oh weird, the dbg info is identical, so I can't find any difference in how the keys are hashed, nor any difference in the insertion order. Still intrigued, but this hopefully gives more evidence that using something like indexmap to store the procedures would fix this issue, because it guarantees iteration in insertion order (and the insertion order is identical).
Intriguing. You would expect the keys would need to hash differently for the iteration order to change. That is assuming:
1. the hasher (and its seed) is the same
2. the insertion order is the same
3. the map's capacity/resize history is the same
So I'm still quite surprised something is causing a difference here. Cause I think all 3 of those points should be the same between machines.
as an aside, this is making me feel great about the design decision to have Roc's `Dict` iterate in insertion order :big_smile:
Yeah, I think you really need to either do insertion order or randomized (like in go). I guess what absl does is also kinda ok. Randomized in debug builds and arbitrary but consistent in release builds.
Also, randomized doesn't have to be truly random. Can just be starting at a random point in the middle and then looping back to the front.
Rust's default `HashMap` uses `RandomState`, which initializes with a seed from system randomness, but we override it in roc to be `WyHash::default()`, which starts with the same seed (the 0 value) every time. Maybe it would be a good idea to seed our `WyHash` instances with random numbers to flush out these incorrect assumptions on iteration order?
.. trying that now to see how many tests fail
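For the record, the experiment looks roughly like this (a sketch, assuming the wyhash crate's `WyHash::with_seed` constructor; the `RandomSeededWyHash`/`SeededMap` names are made up, not roc's actual `MutMap`):

```rust
use std::collections::HashMap;
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;
use wyhash::WyHash;

// Every map built with this gets a fresh random seed (unlike
// WyHash::default(), which always starts from seed 0), so any hidden
// reliance on a fixed iteration order shows up as failing tests quickly.
struct RandomSeededWyHash {
    seed: u64,
}

impl Default for RandomSeededWyHash {
    fn default() -> Self {
        // Borrow std's RandomState as a cheap source of a random u64.
        let seed = RandomState::new().hash_one(0u64);
        RandomSeededWyHash { seed }
    }
}

impl BuildHasher for RandomSeededWyHash {
    type Hasher = WyHash;

    fn build_hasher(&self) -> WyHash {
        WyHash::with_seed(self.seed)
    }
}

// A MutMap-style alias using the seeded hasher (name is illustrative).
type SeededMap<K, V> = HashMap<K, V, RandomSeededWyHash>;

fn main() {
    let mut m: SeededMap<&str, u32> = SeededMap::default();
    m.insert("Str.61", 1);
    m.insert("Str.63", 2);
    m.insert("Str.64", 3);
    // The order printed here can change from run to run.
    println!("{:?}", m.keys().collect::<Vec<_>>());
}
```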
spicy, we start hitting a couple `unreachable!`'s in inc_dec and mono/src/borrow.rs if we make `MutMap` use `std::collections::hash_map::RandomState`
So I guess there is an assumption that all `MutMap`s iterate in the same order (if inserted in the same order I guess)?
I'm going to try making a single seed that all `MutMap`s' `WyHash` state starts with, but which isn't the default 0. That should hopefully give fewer errors, but might still give some if there's an implicit assumption that iteration is the same across multiple executions.
also found this nice comment, which was wrong at the time. `WyHash`'s default value, unlike `RandomState`, is not random but the zero value.
that was an interesting chase, but it looks like another part of the code (the lambda set interner) is under the assumption that maps, etc _should_ hash consistently with the default `BuildHasher`. https://github.com/roc-lang/roc/blob/a8def1a974a6dacdcd18656e66114880f4e92b59/crates/compiler/mono/src/layout/intern.rs#L585-L591
So we cannot change `BuildHasher` to be something like `RandomState` (ie each separately created `RandomState` builds differently behaving hashers), because that breaks some assumptions in the lambda set interner, which expects every call to `BuildHasher::hash_one` to be the same for the same input.
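A tiny illustration of that assumption (standalone, using std's `hash_one`; the string being hashed is made up):

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, BuildHasherDefault};
use wyhash::WyHash;

// With a zero-seeded deterministic hasher, hashing the same value twice
// (even via two separate builders) gives the same u64, so a hash stored
// earlier can be compared against one computed later. With RandomState,
// two separately created builders disagree, which is why swapping it in
// breaks code that relies on stable hashes.
fn main() {
    let det = BuildHasherDefault::<WyHash>::default();
    let h1 = det.hash_one("LambdaSet([Str])");
    let h2 = BuildHasherDefault::<WyHash>::default().hash_one("LambdaSet([Str])");
    assert_eq!(h1, h2); // deterministic: same input, same hash, every time

    let r1 = RandomState::new().hash_one("LambdaSet([Str])");
    let r2 = RandomState::new().hash_one("LambdaSet([Str])");
    // Almost certainly different: each RandomState has its own random keys.
    println!("{h1:x} == {h2:x}, but {r1:x} != {r2:x} (usually)");
}
```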
doesn't explain why mac aarch64 was swapping elements, but it was an interesting dive.
This has been a really interesting read. Definitely seems that we have a solid amount of code with implicit assumptions around hashing
honourable mention to: calling `.values().enumerate()` https://github.com/roc-lang/roc/blob/a8def1a974a6dacdcd18656e66114880f4e92b59/crates/compiler/mono/src/borrow.rs#L403-L415 and then using `.values().nth(i)` later https://github.com/roc-lang/roc/blob/a8def1a974a6dacdcd18656e66114880f4e92b59/crates/compiler/mono/src/borrow.rs#L146
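A toy version of why that pattern only works under a stable iteration order (illustrative keys, not the actual borrow.rs code):

```rust
use std::collections::HashMap;

// Remembering an index from one `.values().enumerate()` pass and using
// `.values().nth(i)` later is only sound if every iteration of the map
// yields the values in exactly the same order.
fn main() {
    let mut procs: HashMap<&str, &str> = HashMap::new();
    procs.insert("Str.61", "replace_each_help");
    procs.insert("Str.63", "first_match_help");

    // First pass: record the position of the proc we care about.
    let (index, _) = procs
        .values()
        .enumerate()
        .find(|&(_, v)| *v == "replace_each_help")
        .unwrap();

    // Later: fetch "the same" proc by position. This only holds because the
    // map was not touched in between, so repeated iteration of the unchanged
    // table happens to yield the same order within one process.
    let again = procs.values().nth(index).unwrap();
    assert_eq!(*again, "replace_each_help");
}
```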
I think the last thing I want to look into for this is how many tests fail with a different global seed on program start, where before it was effectively a different seed for every hashmap. I expect a bunch of test_mono tests to fail, which could maybe be fixed by replacing `MutMap` usage with `VecMap` or something else that guarantees consistent iteration.
Yeah, I think we should switch over to indexmap for current uses of mutmap.
Vecmap won't scale for large maps since lookups are a linear scan
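For anyone following along, a minimal VecMap-style sketch of the tradeoff (this is not roc's actual `VecMap`, just the general shape): insertion-order iteration for free, but every lookup walks the whole Vec.

```rust
// A toy map backed by a Vec: deterministic, insertion-order iteration on
// every platform, at the cost of O(n) lookups.
struct VecMap<K, V> {
    entries: Vec<(K, V)>,
}

impl<K: PartialEq, V> VecMap<K, V> {
    fn new() -> Self {
        VecMap { entries: Vec::new() }
    }

    fn insert(&mut self, key: K, value: V) {
        // Linear scan to see if the key already exists.
        match self.entries.iter().position(|(k, _)| *k == key) {
            Some(i) => self.entries[i].1 = value,
            None => self.entries.push((key, value)),
        }
    }

    // O(n): walks the entries until it finds the key.
    fn get(&self, key: &K) -> Option<&V> {
        self.entries.iter().find(|(k, _)| k == key).map(|(_, v)| v)
    }
}

fn main() {
    let mut m = VecMap::new();
    m.insert("Str.63", "first_match_help");
    m.insert("Str.61", "replace_each_help");

    // Iterates in insertion order...
    for (k, v) in &m.entries {
        println!("{k} -> {v}");
    }
    // ...but each lookup scans the Vec.
    assert_eq!(m.get(&"Str.61"), Some(&"replace_each_help"));
}
```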
Brendan Hansknecht said:
Yeah, I think we should switch over to indexmap for current uses of mutmap.
Issue is at #7531