Custom equality · beginners · Zulip Chat Archive

Stream: beginners

Topic: Custom equality

Martin Stewart (Mar 11 2022 at 21:30):

Brendan Hansknecht said:

If a user makes a custom equality method, would it work with ==

Continuing in a new thread:
What kinds of use cases do people have in mind for overriding ==? I think often instead of custom equality it's better to normalize the data so that the default == works. For example a rational value type defined as { numerator : Int, denominator : Int } has a constructor that makes sure the numerator and denominator are lowest common multiples so that 4/6 and 2/3 are equal.

The only situation I've personally found where custom equality is needed is with HashDict and HashSet since you'd want two dicts with the same contents to be equal even if the insertion order is different and normalizing the data isn't possible if the keys aren't comparable.

Are there other use cases for custom equality where just normalizing the data isn't possible?

jan kili (Mar 11 2022 at 21:33):

Maybe in cases where equivalency is enough?
black == veryDarkGray
Idk, maybe using == for equivalency is an anti-pattern.

Richard Feldman (Mar 11 2022 at 21:35):

What kinds of use cases do people have in mind for overriding ==?

basically just custom collections

Richard Feldman (Mar 11 2022 at 21:35):

e.g. a red-black tree, where insertion order results in a different internal structure but not a different ordering for traversals or lookups

Richard Feldman (Mar 11 2022 at 21:36):

normalizing would make equality a lot more expensive

Martin Stewart (Mar 11 2022 at 21:38):

Are there a lot of different collections that people will need? For example, if HashDict, RedBlackTree, and maybe 2-3 more data structures are what people need, maybe it would make more sense to have those as part of a standard library?

Martin Stewart (Mar 11 2022 at 21:39):

JanCVanB said:

Maybe in cases where equivalency is enough?
black == veryDarkGray
Idk, maybe using == for equivalency is an anti-pattern.

I mean no offense but this is one of the reasons I get nervous when I hear there's custom equality :sweat_smile:

Richard Feldman (Mar 11 2022 at 21:40):

people can mess up equality regardless of whether there's custom equality

Richard Feldman (Mar 11 2022 at 21:41):

for example, making a custom red-black tree in Elm results in == returning false sometimes for trees that are identical in every observable way :sweat_smile:

Richard Feldman (Mar 11 2022 at 21:41):

I think equality is just innately susceptible to bugs from userspace

Martin Stewart (Mar 11 2022 at 21:45):

I'm guessing you agree though that there's more chance of getting things wrong by allowing custom equality than there's ways to get things wrong if it's not allowed? Deciding whether to include custom equality is a question of if there's enough use cases to make it still worthwhile.

Richard Feldman (Mar 11 2022 at 21:46):

I'm actually not sure honestly

Richard Feldman (Mar 11 2022 at 21:46):

like I've used Rust and Java, both of which have custom equality, and I haven't seen it be a problem in practice

Richard Feldman (Mar 11 2022 at 21:47):

but I have seen in Elm and JS it (very rarely) come up that == works in a surprising way because there's no custom equality, and you just have to know to call a .equals function or something instead

Richard Feldman (Mar 11 2022 at 21:48):

separately, in this case it would make the language more complex to disallow it

Richard Feldman (Mar 11 2022 at 21:48):

because by default, adding abilities to things is just a thing you can do

Richard Feldman (Mar 11 2022 at 21:49):

so we'd have to make a separate compiler rule where if you tried to do it for equality (or hashing, since equality and hashing have to stay in sync, so if you can't do one, you shouldn't be allowed to do the other) then you got a compiler error

Martin Stewart (Mar 11 2022 at 21:52):

Sorry this is a bit off topic but how do you ensure equality and hashing stay in sync if there's custom equality? I'm assuming you mean that x == y implies hash x == hash y?

Brendan Hansknecht (Mar 11 2022 at 21:53):

That should be true. Though I guess it may not be 100% guaranteed.

Brendan Hansknecht (Mar 11 2022 at 21:54):

Probably an implementation bug when not true.

Richard Feldman (Mar 11 2022 at 21:56):

how do you ensure equality and hashing stay in sync if there's custom equality?

it's not possible in the general case

Richard Feldman (Mar 11 2022 at 21:56):

also something that I haven't seen being a problem in practice in Java and Rust, although it theoretically could be!

Brendan Hansknecht (Mar 11 2022 at 21:57):

I have seen it in practice, generally when someone adds a record to a struct and fails to update the hashing function.

Martin Stewart (Mar 11 2022 at 21:57):

If someone to overrides == without overriding hash and then insert something into a HashDict, isn't it possible for that value to become unreachable? Maybe I'm misremembering how HashDict works but I vaguely remember this being something I had to keep in mind when overriding equality in C#

Richard Feldman (Mar 11 2022 at 21:58):

it's also not quite the case that x == y should imply hash x == hash y but rather the other way around, hash x == hash y should imply x == y

edit: nm, other way around :laughing:

Brendan Hansknecht (Mar 11 2022 at 21:58):

I think your backwards, Richard

Brendan Hansknecht (Mar 11 2022 at 21:58):

Hash collisions happen

Richard Feldman (Mar 11 2022 at 21:59):

oh right, oops haha

Brendan Hansknecht (Mar 11 2022 at 21:59):

So you still have to check for equality even if hashes match

Richard Feldman (Mar 11 2022 at 21:59):

yeah of course haha

Richard Feldman (Mar 11 2022 at 21:59):

dunno why I was thinking that

Richard Feldman (Mar 11 2022 at 22:00):

@Martin Stewart an important thing to keep in mind is that I think in practice, overriding default equality would be something done extremely extremely rarely in Roc

Richard Feldman (Mar 11 2022 at 22:00):

because equality is automatically derived for records, tag unions, etc.

Richard Feldman (Mar 11 2022 at 22:00):

as is hashing

Richard Feldman (Mar 11 2022 at 22:01):

so I don't know what might motivate someone to go out of their way to override that implementation, aside from making a custom collection :thinking:

Richard Feldman (Mar 11 2022 at 22:01):

and of course the docs for both the equality and hashing abilities would emphasize "if you're overriding this, you must override the other one too!"

Richard Feldman (Mar 11 2022 at 22:02):

heh, in fact we could even have a warning for overriding one and not the other

Richard Feldman (Mar 11 2022 at 22:02):

for what it's worth, for a long time I had this concern too and was like "we're never gonna have custom equality in Roc, people will just mess it up!"

Richard Feldman (Mar 11 2022 at 22:03):

but I had trouble reconciling that in my head with the knowledge that I haven't seen it actually be a problem in practice with languages that support it

Richard Feldman (Mar 11 2022 at 22:03):

kinda similar to list overflow panicking

Richard Feldman (Mar 11 2022 at 22:04):

like theoretically List.concat could panic if you added two gigantic lists together, so you could say it ought to return a Result to be safe, but in practice it's not really a thing that comes up in enough real-world cases to be worth the cost of having it return Result

Martin Stewart (Mar 11 2022 at 22:10):

Fair enough. I guess you're not a fan of the approach of adding data structures that really do need this to the standard library or only allowing for certain packages to use it? (I don't blame you if you don't like that second approach, it seems prone to controversy :sweat_smile: )

I have one last concern about custom equality making it harder to support a elm-review type of tool but I'm having trouble giving a concrete example of why that would be the case. Maybe @Jeroen Engels has input there.

Richard Feldman (Mar 11 2022 at 22:12):

I guess you're not a fan of the approach of adding data structures that really do need this to the standard library or only allowing for certain packages to use it?

yeah I definitely wouldn't want to do the "only certain packages can use it" and I think there's a pretty long tail of useful custom data structures that matter for certain use cases to run fast :big_smile:

Jeroen Engels (Mar 11 2022 at 22:22):

Brendan Hansknecht said:

I have seen it in practice, generally when someone adds a record to a struct and fails to update the hashing function.

This kind of error would be my biggest worry for when comparing two elements of the same type. Allowing to override == for 2 different types sounds like opening Pandora's box to me. It would be quite nice if the compiler/language provided automatic ways of making sure that the implementation is correct and reaonsable, but that sounds super hard.

But if we consider that this is always used well (big if?) in order to make two things that are in any otherwise observable way considered equal, I can't think of any issues off the top of my head.

Richard Feldman (Mar 11 2022 at 22:31):

Allowing to override == for 2 different types sounds like opening Pandora's box to me.

oh yeah that wouldn't happen - it'll always be isEq : a, a -> Bool

Richard Feldman (Mar 11 2022 at 22:31):

types have to be the same

Richard Feldman (Mar 11 2022 at 22:33):

It would be quite nice if the compiler/language provided automatic ways of making sure that the implementation is correct and reaonsable, but that sounds super hard.

it's probably pretty easy to write a property-based test for it

Martin Stewart (Mar 11 2022 at 22:35):

Maybe even some kind of roc-review rule :smiley:

Richard Feldman (Mar 11 2022 at 22:38):

that's a sweet idea!

Pit Capitain (Mar 11 2022 at 22:39):

If someone to overrides == without overriding hash and then insert something into a HashDict, isn't it possible for that value to become unreachable?

That shouldn't be a problem (besides potentially suboptimal performance, because of more hash collisions)

Jeroen Engels (Mar 11 2022 at 23:16):

Richard Feldman said:

It would be quite nice if the compiler/language provided automatic ways of making sure that the implementation is correct and reaonsable, but that sounds super hard.

it's probably pretty easy to write a property-based test for it

Sure, but that property-based test would have to be written by the user right? If the way the type is built (like trees) impacts the data structure, I can imagine even that not catching everything. But yeah, good docs and tooling assistance would help quite a bit here.

Jeroen Engels (Mar 11 2022 at 23:18):

Would it be possible to make something == something return False by overriding the equal function to always return False? Or would there be a cheap reference check (===) always anyway?
If that's possible, would there be use-cases ever?

Derek Gustafson (Mar 11 2022 at 23:22):

You keep saying 'override', but I don't think that's what's going on here.
When you define a new type, is when you would define a custom equality

Jeroen Engels (Mar 11 2022 at 23:32):

Doesn't defining a custom equality equal overriding the == operation for that type?
(I don't know the original context of the discussion, sorry)

Derek Gustafson (Mar 11 2022 at 23:39):

I don't think there necesarrily is an equality that automatically exists.

Derek Gustafson (Mar 11 2022 at 23:40):

If you define a custom tag, especially if it's an opaque type, I wouldn't expect equality to work without defining how

Richard Feldman (Mar 11 2022 at 23:52):

I don't think there necesarrily is an equality that automatically exists.

there is, actually! :smiley:

Derek Gustafson (Mar 11 2022 at 23:53):

I stand corrected

Last updated: Aug 17 2025 at 12:14 UTC