https://github.com/roc-lang/roc/pull/7514 is currently failing CI due to one of the mono tests failing. Unfortunately, it seems mac aarch64 outputs different mono than the rest of the targets. It seems there's some off-by-one difference in `Derived_gen` identifiers that only shows up in one example. If we fix it for mac aarch64, it breaks the other targets; if we fix the other targets, it breaks for mac aarch64.
Is there host/arch-specific logic in the mono pass? I searched for any code with `#[cfg(.*target_os = "macos"` and nothing jumped out at me, but that's about as far as I knew to look.
the CI run where mac aarch64 fails on mono mismatch: https://github.com/roc-lang/roc/actions/runs/12778134753/job/35620405661
the CI run where everything else fails on mono mismatch: https://github.com/roc-lang/roc/actions/runs/12778510523/job/35623205009
Nix calls the architecture darwin
Maybe Rust does the same?
But yeah, it's very rare for us to do something for Macs, it's more often Unix filtering
nah, no results with `cfg\(.*darwin`, it looks like `target_os = "macos"` is the rust cfg used in roc to conditionally do things for mac.
just because I didn't see anything doesn't mean someone with a better understanding of mono or preceding passes wouldn't. I don't have a good idea how/where `Derived_gen` identifiers are even created or where that number comes from in eg `#Derived_gen.22`
`Derived_gen.21` does not exist in https://github.com/roc-lang/roc/blob/144b02939d8ac04b32ae1a932ddc6f722adef01e/crates/compiler/test_mono/generated/inspect_derived_tag_one_field_string.txt
That seems like a bug
Skipping a number seems very suspicious
I can investigate on Friday if you don't have a mac @shua
Thanks, I do not have a mac. I'll still keep reading mono source just for the heck of it
just to sanity check: "apple-silicon" would follow "aarch64" codepaths right? not, for instance, "aarch32"
that's correct
I have not figured this out yet but will continue tomorrow @shua
Anton said:
Skipping a number seems very suspicious
Is skipping a number suspicious only for the `#Derived_gen` module or for all? I ask because, eg `#Derived.1` and `#Derived.3` are in that output but `#Derived.2` is nowhere to be found.
best I've managed to dig: `Str.replace_each_help` (or `Str.61` in the non-prettified output)

The `#Derived_gen.20` and `#Derived_gen.21` idents are (on my linux machine) generated in `apply_trmc` for the two recursive functions `Str.first_match_help`/`Str.63` and `Str.replace_each_help`/`Str.61` respectively. Whatever fishy is happening has to be happening after `Str.first_match_help` arg symbols are generated but before `Str.replace_each_help` arg symbols are generated.

`apply_trmc` is looping over procs, and holding a `mut` reference to the `IdentIds` struct which generates `#Derived_gen` idents, so I can't see how anything else could be incrementing the counter there; it must be something in the `apply_trmc` loop.

I've added a `dbg!(_)` to the current branch to hopefully get enough info from a failing test run to tell what might be different between mac and linux.
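Roughly the shape of what's going on, with made-up types and numbers (this is not the actual roc code, just a sketch of why the `#Derived_gen` numbers depend on the order the procs are visited):

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for IdentIds and the proc map: every fresh ident
// bumps a shared counter, so whichever proc happens to be visited first
// claims the lower #Derived_gen number.
struct IdentIds {
    next: u32,
}

impl IdentIds {
    fn gen_unique(&mut self) -> String {
        let id = self.next;
        self.next += 1;
        format!("#Derived_gen.{id}")
    }
}

fn main() {
    let mut procs: HashMap<&str, ()> = HashMap::new();
    procs.insert("Str.first_match_help", ());
    procs.insert("Str.replace_each_help", ());

    let mut ident_ids = IdentIds { next: 20 };

    // The equivalent of apply_trmc looping over procs while holding a
    // &mut IdentIds: if the iteration order of `procs` differs between
    // platforms, the same proc ends up with a different #Derived_gen ident.
    for (name, _) in &procs {
        println!("{name} got {}", ident_ids.gen_unique());
    }
}
```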
Thanks @Luke Boswell for the mac output. It looks like on mac, the order in which procs are processed is changed. So `Str.matches_at_help` is processed, creates `#Derived_gen.21`, then throws the result away, then `Str.replace_each_help` is processed. On linux that order is switched, and `Str.matches_at_help` creates `#Derived_gen.25`, but since it's thrown away, you don't see that in the output.
That is the only difference in the order of how procs are processed between the two platforms for the `test_mono` `inspect_derived_tag_one_field_string` test.
Now to figure out why the order changes.
Glad you found that. Sounds puzzling
Yeah, that's some great info
trying to find any info on whether `HashMap` with `wyhash::WyHash` is stable across cpu/os targets or not. There's an implicit requirement in our code that the procs are iterated in the same order, but we're just calling `HashMap::values`, which I don't think gives any guarantees in terms of ordering.
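To make the ordering point concrete, here's a small standalone sketch (plain std, illustrative keys, not roc code): with identical insertions, `values()`/`keys()` order is a function of the hasher, so a hasher that behaves differently on another target would reorder the procs.

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::BuildHasherDefault;

// Same keys inserted in the same order into two maps that differ only in
// their hasher: the two iteration orders typically differ from each other,
// and neither one is the insertion order.
fn main() {
    let keys = ["Str.61", "Str.63", "Str.64", "Str.70", "Str.85"];

    // Deterministic hasher (DefaultHasher with fixed keys).
    let mut a: HashMap<&str, usize, BuildHasherDefault<DefaultHasher>> =
        HashMap::default();
    // std's default RandomState, seeded randomly per map.
    let mut b: HashMap<&str, usize> = HashMap::new();

    for (i, k) in keys.iter().copied().enumerate() {
        a.insert(k, i);
        b.insert(k, i);
    }

    println!("a: {:?}", a.keys().collect::<Vec<_>>());
    println!("b: {:?}", b.keys().collect::<Vec<_>>());
}
```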
I think @Brendan Hansknecht knows all about that
we're just calling `HashMap::values`
Can you link to this line of code @shua?
We could try sorting them and see if it fixes our issue.
Anton said:
we're just calling `HashMap::values`
Can you link to this line of code @shua?
Is it here?
We could try sorting them and see if it fixes our issue.
Sorting comes with a perf cost; maybe we should just ignore diffs where only the numbers are different? That seems pretty reasonable
that would hide bugs where there are symbol collisions
Ok
i think sorting is okay, can revisit it if it becomes a big problem or do it only in tests
A lot of `Eq`, `Ord`, `PartialOrd` impls would have to be added all over the place to be able to sort. `WyHash` allows passing a seed; I'm looking into that now
we could try using indexmap or btreemap, something that does guarantee a deterministic iteration order (insertion order for indexmap, key order for btreemap), which should be the same across platforms I guess?
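For reference, a minimal sketch of the indexmap option (assumes the `indexmap` crate as a dependency; the keys here are made up):

```rust
use indexmap::IndexMap;

// IndexMap still hashes for O(1) lookups, but it stores entries in a Vec and
// iterates in insertion order, so iteration is identical on every platform
// as long as the insertions are.
fn main() {
    let mut procs: IndexMap<&str, u32> = IndexMap::new();
    procs.insert("Str.63", 0);
    procs.insert("Str.61", 1);
    procs.insert("Str.64", 2);

    // Always prints Str.63, Str.61, Str.64 -- the insertion order.
    for (name, id) in &procs {
        println!("{name} -> {id}");
    }
}
```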
Anton said:
A lot of `Eq`, `Ord`, `PartialOrd` impls would have to be added all over the place to be able to sort. `WyHash` allows passing a seed; I'm looking into that now
We could `sort_by_key` to avoid those bounds (as long as you can get some sortable key from the keys or values)
^at least, `Vec::sort_by_key` exists, not sure if that's what yous were thinking
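Something along these lines, maybe (illustrative key/value types, not the real proc types):

```rust
use std::collections::HashMap;

// The sort_by_key idea: collect the entries into a Vec and sort by something
// that is already Ord (e.g. a symbol name/number), so the proc values
// themselves never need Ord/PartialOrd impls.
fn main() {
    let mut procs: HashMap<(&str, u32), &str> = HashMap::new();
    procs.insert(("Str.replace_each_help", 61), "proc body");
    procs.insert(("Str.first_match_help", 63), "proc body");

    let mut entries: Vec<_> = procs.into_iter().collect();
    // Only the key needs to be sortable; the values are untouched.
    entries.sort_by_key(|&((name, id), _)| (name, id));

    for ((name, id), _) in entries {
        println!("{name} ({id})");
    }
}
```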
Yeah, if you need stable insertion order, I would switch to index map
Wyhash generally uses the same hash for given bit widths, but it definitely differs on 32bit and could differ in other cases.
All that said, feels quite surprising if this is the very first time we hit an ordering difference, but it's possible
Alright, my sort is pretty hacky but that fixed it. I'll push my code later tonight and will make an issue to switch to an index map.
Fix in PR#7532
Index map issue: #7531
Brendan Hansknecht said:
All that said, feels quite surprising if this is the very first time we hit an ordering difference, but it's possible
It's strange to me that something as wide-ranging as hashing differences would only show up by swapping two elements in a single test run. I would expect to be seeing a bunch more things out of order
I changed the dbg's in my PR, because I want to check whether the difference is 1. in how the keys are hashed, or 2. in the order the procs are inserted. If the problem is 1, then something like indexmap would help, but if it's 2 then indexmap would not fix this issue and I'd have to look further back in the flow to see where they diverge.
Specifically, my ask is someone with an arm mac please pull the latest from that branch, run `cargo test -p test_mono inspect_derived_tag_one -- --nocapture >dbg-mac.out 2>&1`, and then send me the `dbg-mac.out` file.
edit: `2>&1` was in the wrong place
gotta sign off soon but feel free to add a :eyes: to that comment if you're doing it so two people don't duplicate work
ha, comment was ninja'd by @Luke Boswell, thanks for looking into this with me!
ah dang, I got my shell args out of order, can you run `cargo test -p test_mono inspect_derived_tag_one -- --nocapture >dbg-mac.out 2>&1` instead?
oh weird, the dbg info is identical, so I can't find any difference in how the keys are hashed, nor any difference in the insertion order. Still intrigued, but this hopefully gives more evidence that using something like indexmap to store the procedures would fix this issue, because it guarantees iteration in insertion order (and the insertion order is identical).
Intriguing. You would expect the keys would need to hash differently for the iteration order to change. That is assuming:
1. the hasher (and its seed) is the same
2. the insertion order is the same
3. the map's capacity/resize history is the same
So I'm still quite surprised something is causing a difference here. Cause I think all 3 of those points should be the same between machines.
as an aside, this is making me feel great about the design decision to have Roc's `Dict` iterate in insertion order :big_smile:
Yeah, I think you really need to either do insertion order or randomized (like in go). I guess what absl does is also kinda ok. Randomized in debug builds and arbitrary but consistent in release builds.
Also, randomized doesn't have to be truly random. Can just be starting at a random point in the middle and then looping back to the front.
Rust's default `HashMap` uses `RandomState`, which initializes with a seed from system randomness, but we override it in roc to be `WyHash::default()`, which starts with the same seed (the 0 value) every time. Maybe it would be a good idea to seed our `WyHash` instances with random numbers to flush out these incorrect assumptions on iteration order?
.. trying that now to see how many tests fail
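For the record, the experiment looks roughly like this (a sketch, assuming the wyhash crate's `WyHash::with_seed` constructor; the `RandomSeededWyHash`/`SeededMap` names are made up, not roc's actual `MutMap`):

```rust
use std::collections::HashMap;
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;
use wyhash::WyHash;

// Every map built with this gets a fresh random seed (unlike
// WyHash::default(), which always starts from seed 0), so any hidden
// reliance on a fixed iteration order shows up as failing tests quickly.
struct RandomSeededWyHash {
    seed: u64,
}

impl Default for RandomSeededWyHash {
    fn default() -> Self {
        // Borrow std's RandomState as a cheap source of a random u64.
        let seed = RandomState::new().hash_one(0u64);
        RandomSeededWyHash { seed }
    }
}

impl BuildHasher for RandomSeededWyHash {
    type Hasher = WyHash;

    fn build_hasher(&self) -> WyHash {
        WyHash::with_seed(self.seed)
    }
}

// A MutMap-style alias using the seeded hasher (name is illustrative).
type SeededMap<K, V> = HashMap<K, V, RandomSeededWyHash>;

fn main() {
    let mut m: SeededMap<&str, u32> = SeededMap::default();
    m.insert("Str.61", 1);
    m.insert("Str.63", 2);
    m.insert("Str.64", 3);
    // The order printed here can change from run to run.
    println!("{:?}", m.keys().collect::<Vec<_>>());
}
```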
spicy, we start hitting a couple `unreachable!`'s in inc_dec and mono/src/borrow.rs if we make `MutMap` use `std::collections::hash_map::RandomState`
So I guess there is an assumption that all `MutMap`s iterate in the same order (if inserted in the same order I guess)?
I'm going to try making a single seed that all `MutMap`s' `WyHash` state starts with, but which isn't the default 0. That should hopefully give fewer errors, but might still give some if there's an implicit assumption that iteration is the same across multiple executions.
also found this nice comment, which was wrong at the time. `WyHash`'s default value, unlike `RandomState`, is not random but the zero value.
that was an interesting chase, but it looks like another part of the code (the lambda set interner) is under the assumption that maps, etc _should_ hash consistently with the default `BuildHasher`. https://github.com/roc-lang/roc/blob/a8def1a974a6dacdcd18656e66114880f4e92b59/crates/compiler/mono/src/layout/intern.rs#L585-L591
So we cannot change `BuildHasher` to be something like `RandomState` (ie each separately created `RandomState` builds differently behaving hashers), because that breaks some assumptions in the lambda set interner, which expects every call to `BuildHasher::hash_one` to be the same for the same input.
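A tiny illustration of that assumption (standalone, using std's `hash_one`; the string being hashed is made up):

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, BuildHasherDefault};
use wyhash::WyHash;

// With a zero-seeded deterministic hasher, hashing the same value twice
// (even via two separate builders) gives the same u64, so a hash stored
// earlier can be compared against one computed later. With RandomState,
// two separately created builders disagree, which is why swapping it in
// breaks code that relies on stable hashes.
fn main() {
    let det = BuildHasherDefault::<WyHash>::default();
    let h1 = det.hash_one("LambdaSet([Str])");
    let h2 = BuildHasherDefault::<WyHash>::default().hash_one("LambdaSet([Str])");
    assert_eq!(h1, h2); // deterministic: same input, same hash, every time

    let r1 = RandomState::new().hash_one("LambdaSet([Str])");
    let r2 = RandomState::new().hash_one("LambdaSet([Str])");
    // Almost certainly different: each RandomState has its own random keys.
    println!("{h1:x} == {h2:x}, but {r1:x} != {r2:x} (usually)");
}
```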
doesn't explain why mac aarch64 was swapping elements, but it was an interesting dive.
This has been a really interesting read. Definitely seems that we have a solid amount of code with implicit assumptions around hashing
honourable mention to: calling `.values().enumerate()` https://github.com/roc-lang/roc/blob/a8def1a974a6dacdcd18656e66114880f4e92b59/crates/compiler/mono/src/borrow.rs#L403-L415 and then using `.values().nth(i)` later https://github.com/roc-lang/roc/blob/a8def1a974a6dacdcd18656e66114880f4e92b59/crates/compiler/mono/src/borrow.rs#L146
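A toy version of why that pattern only works under a stable iteration order (illustrative keys, not the actual borrow.rs code):

```rust
use std::collections::HashMap;

// Remembering an index from one `.values().enumerate()` pass and using
// `.values().nth(i)` later is only sound if every iteration of the map
// yields the values in exactly the same order.
fn main() {
    let mut procs: HashMap<&str, &str> = HashMap::new();
    procs.insert("Str.61", "replace_each_help");
    procs.insert("Str.63", "first_match_help");

    // First pass: record the position of the proc we care about.
    let (index, _) = procs
        .values()
        .enumerate()
        .find(|&(_, v)| *v == "replace_each_help")
        .unwrap();

    // Later: fetch "the same" proc by position. This only holds because the
    // map was not touched in between, so repeated iteration of the unchanged
    // table happens to yield the same order within one process.
    let again = procs.values().nth(index).unwrap();
    assert_eq!(*again, "replace_each_help");
}
```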
I think the last thing I want to look into for this is how many tests fail with a different global seed on program start, where before it was effectively a different seed for every hashmap. I expect a bunch of test_mono tests to fail, which could maybe be fixed by replacing `MutMap` usage with `VecMap` or something else that guarantees consistent iteration.
Yeah, I think we should switch over to indexmap for current uses of mutmap.
Vecmap won't scale for large maps since lookups are a linear scan
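For anyone following along, a minimal VecMap-style sketch of the tradeoff (this is not roc's actual `VecMap`, just the general shape): insertion-order iteration for free, but every lookup walks the whole Vec.

```rust
// A toy map backed by a Vec: deterministic, insertion-order iteration on
// every platform, at the cost of O(n) lookups.
struct VecMap<K, V> {
    entries: Vec<(K, V)>,
}

impl<K: PartialEq, V> VecMap<K, V> {
    fn new() -> Self {
        VecMap { entries: Vec::new() }
    }

    fn insert(&mut self, key: K, value: V) {
        // Linear scan to see if the key already exists.
        match self.entries.iter().position(|(k, _)| *k == key) {
            Some(i) => self.entries[i].1 = value,
            None => self.entries.push((key, value)),
        }
    }

    // O(n): walks the entries until it finds the key.
    fn get(&self, key: &K) -> Option<&V> {
        self.entries.iter().find(|(k, _)| k == key).map(|(_, v)| v)
    }
}

fn main() {
    let mut m = VecMap::new();
    m.insert("Str.63", "first_match_help");
    m.insert("Str.61", "replace_each_help");

    // Iterates in insertion order...
    for (k, v) in &m.entries {
        println!("{k} -> {v}");
    }
    // ...but each lookup scans the Vec.
    assert_eq!(m.get(&"Str.61"), Some(&"replace_each_help"));
}
```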
Brendan Hansknecht said:
Yeah, I think we should switch over to indexmap for current uses of mutmap.
Issue is at #7531