supporting NaN/Infinity/-Infinity? · ideas

Stream: ideas

Topic: supporting NaN/Infinity/-Infinity?

Richard Feldman (Mar 14 2022 at 01:31):

so here's an interesting question: should Roc support NaN, Infinity, and -Infinity?

with integers, arithmetic panics on overflow. This means that it has an extra conditional (which is cheap in isolation, but which can prevent certain LLVM optimizations that are a bigger deal), but you can opt out of it by using something like Num.addWrap - which wraps on overflow. This is important for things like hashing functions, but also it can be used as a performance optimization if you're very confident that overflow won't happen, because it skips the conditional check. (The CPU has a single "add with wrapping overflow" instruction but not a single "add with panicking overflow" instruction, because it doesn't have a concept of panicking.)

Floating-point numbers ordinarily overflow to Infinity or -Infinity. Currently, Roc's design is to add a conditional which detects when this happens and turns it into a panic instead - just like we do with integers. However, there is no equivalent of "wrapping overflow" for floats, because there's no single CPU instruction for wrapping overflow of floats (so it couldn't be a performance optimization; the single CPU instruction versions of things overflow to either -Infinity or Infinity).

This means that there are some high-performance floating-point math operations in Roc that just can't be done without sacrificing some performance due to intermediate conditionals checking for overflow. There's a similar situation with operations like sqrt and division, both of which can produce NaN (division by zero usually produces either Infinity or -Infinity, but specifically 0/0 returns NaN).

Infinity and -Infinity introduce several edge cases. For example, how should Num.toStr print them? "Infinity" and "-Infinity"? "∞" and "-∞"? And then how should it parse them when converting from a Str to a number? If it's the written word "infinity" should it have to be capitalized? Also, now Roc programmers potentially need to do defensive isFinite checks in various places where they wouldn't need to if we ruled out Infinity and -Infinity in the language by always either panicking or having a function return Result if either of them could happen.

NaN is an even bigger problem, because NaN is defined at the hardware level to be not equal to itself. This means that putting NaN into a Set or using it as a key in a Dict unavoidably causes strange behavior - like if you do Dict.insert x y where x is NaN, the dictionary's length will increase by 1, but Dict.get x will report that x is not in the dictionary.

The problem is that if we don't allow Roc values to become NaN, Infinity, or -Infinity, then there are some floating-point math operations that just can never be as fast as they would be in other languages. Allowing just Infinity and -Infinity might sound appealing, because they're less troublesome than NaN, but that would mean that only division and sqrt are slower than in other languages, when all the other floating-point operations can be just as fast as C. That would be strange.

Richard Feldman (Mar 14 2022 at 01:38):

there are a variety of possible designs here.

For example, in theory we could have the compiler treat bit patterns of NaN, Infinity, and -Infinity in floating-point numbers as Result values with Err tags. This would allow for a function like Num.addFloatUnchecked : Float a, Float a -> Result (Float a) [ InvalidFloat ]* which could compile to a single instruction. Then when pattern matching on that exact Result type, we could compile to a single CPU instruction of "is infinite"

However, even assuming this turned out to be doable, it still raises a problem: if I want to chain together several floating-point operations without handling the Result in between, how do I do that? Do we instead make it Num.addFloatUnchecked : Result (Float a) [ InvalidFloat ], Result (Float a) [ InvalidFloat ] -> Result (Float a) [ InvalidFloat ]? That sounds very clunky!

But maybe the idea is salvageable if we instead had something like a RawFloat a opaque type which wrapped a float type and let you do raw floating point operations on it, and then have a toFloat : RawFloat a -> Result (Float a) [ Infinity, MinusInfinity, NaN ]* function to turn it back into a regular number. Is that too much special-casing just for fast float math though?

Richard Feldman (Mar 14 2022 at 01:40):

anyway, I'd love to hear anyone's thoughts on this topic!

Richard Feldman (Mar 14 2022 at 01:58):

another option is to do something like what Rust does: it distinguishes between PartialEq and Eq - most types have both traits, but floats have PartialEq and not Eq because of NaN. Rust's == operator only requires PartialEq, but hashmaps require full Eq, so by default in Rust you can't sort floats or use them as keys in hashmaps.

Richard Feldman (Mar 14 2022 at 02:32):

another potential idea: define equality on floats to be x == y || (isNaN x && isNaN y) - so basically use a couple of extra instructions to redefine NaN to be effectively equal to itself, so that it doesn't break collections or sorting...and then offer a separate float-specific function that does single-instruction equality (e.g. where if both arguments are NaN it returns false)

Kevin Gillette (Mar 14 2022 at 06:12):

I was rather pleased to have read in other Roc materials that Infinities and NaNs do not occur in practice. When reading that other material, I inferred that tags would be used to represent those special cases when they were needed (i.e. for "checked" floating-point arithmetic operations), such as the Result form you speak of. I don't believe the default behavior should be for infinities or NaN to happen "by accident," since this kind of in-band signalling has the same problems as null and other sentinel values.

Regarding sqrt performance, is it not the case that only negative numbers produce an imaginary result (i.e. undefined/NaN when done using floats)? If so, sqrt could be a builtin that has an input validity check at the beginning, but is otherwise annotated to not spend extra instructions checking divisors. For division performance in general, the impact of optimization is only felt over many operations (either performing the same operation on a list, or performing many operations in sequence on the same data in registers). In the latter case (a sequence of operations to a single value), problematic inputs can often be ruled out prior to starting the transformations.

Richard Feldman said:

Do we instead make it Num.addFloatUnchecked : Result (Float a) [ InvalidFloat ], Result (Float a) [ InvalidFloat ] -> Result (Float a) [ InvalidFloat ]? That sounds very clunky!

That does sound clunky, yes. Also, why is it called addFloatUnchecked, given that the inputs (and output) look very much like "checked," aka "tagged" values?

But maybe the idea is salvageable if we instead had something like a RawFloat a opaque type which wrapped a float type and let you do raw floating point operations on it, and then have a toFloat : RawFloat a -> Result (Float a) [ Infinity, MinusInfinity, NaN ]* function to turn it back into a regular number. Is that too much special-casing just for fast float math though?

In terms of special casing, would this be something added to the language now, which we could plausibly optimize later (iow, are we taking the C++ route of adding keywords and other features that become defunct or get repurposed, all just so that we can have better performance right now at the expense of language simplicity over time)?

At least for me, opaque numeric types are not as compelling as a richer set of operations (though I'm not at all consistent on this: for non-numerics, I've felt in some cases that opaque types are better than having more operations). I guess it comes down to: what would we do with "regular floats" that we couldn't do with the opaque type? Presumably the opaque type could be compared, be an input to arithmetic, stringified (if we chose to add a function for it), rounded to an int, etc. At that point, it seems like we'd be adding a raw type just to allow infinities and NaN, but not because we want infinities and NaN, but only because it's faster to allow them. We could achieve the same effect with Num.isNaN: Float a -> Bool and [something like] Num.isInfinity: Float a -> [ Nope, Positive, Negative ], or even Num.checkFloat : Num a -> Result (Num a) [ PlusInfinity, MinusInfinity, NaN ]; when the compiler detects the use of these functions, similar to memory fencing/synchronization, it can generate code that elides more expensive panic-checks/normalization in each operation in exchange for a final one-time check. In other words, the presence of the function call permits the compiler to generate fast float computations which are still, in the end, safe, and the absence of any of these function calls results in the compiler generating code which can produce panics after each operation. This approach is somewhat magical, however, and would need to be thoroughly documented.

That said, I would generally like to see both checked and unchecked operations (with both available, either can be the default). There is a distinction between "unchecked for speed" and "unchecked because I don't care" (usually "saturation arithmetic," i.e. for integers not wrapping but also not panicking; just remaining at the extreme upon which it lands). Both of these flavors of unchecked solve quite different needs. I tend to care about saturation arithmetic more than implicit panics (if there's a way for me to generate my own panics, I can do that while handling checked results). However, I don't tend to write math-intensive code: others will have the opposite need.

I do like where the [ Infinity, MinusInfinity, NaN]tag union is going, though it'd be nice to see balanced terms: PosInfinity paired with NegInfinity, or PlusInfinity paired with MinusInfinity. I could easily mis-infer that Infinity means "some infinity, either positive or negative."

PartialEq doesn't seem compelling to me (avoided complexity if we guarantee that Roc floats cannot represent these special values).

Richard Feldman said:

another potential idea: define equality on floats to be x == y || (isNaN x && isNaN y) - so basically use a couple of extra instructions to redefine NaN to be effectively equal to itself, so that it doesn't break collections or sorting...and then offer a separate float-specific function that does single-instruction equality (e.g. where if both arguments are NaN it returns false)

I'd love to see what people doing a lot of math-oriented programming think, but I believe that while NaN != NaN is confusing to beginners, NaN == NaN would be equally confusing to non-beginners. We side-step this confusion, again, by prohibiting floats from being NaN, and it'll not be all that surprising if the tag NaN is equal to itself (at least if it's clear that NaN it's a tag rather than a specially named float constant).

Martin Stewart (Mar 14 2022 at 10:18):

What if phantom types where used to indicate if a Float was safe or not?

For example:

add: Float *, Float * -> Float Unsafe  #We can end up with infinity so we force the phantom type to become Unsafe
sub: Float *, Float * -> Float *  #This can't become infinity or NaN so we let Float keep whatever type var it already had
check: Float * -> Result (Float Safe) [ FloatMightBeInfiniteOrNan ]

someMath : Float * -> Result (Float Safe) [ FloatMightBeInfiniteOrNan  ]
someMath a = a |> add 5 |> sub 2 |> check

Advantages:

If you see Float Safe you know for certain it's safe
No runtime overhead
No need to handle many chained results. You only check when you feel the need to.

Disadvantages:

Extra type variable for all Floats
Probably won't work with +, *, /since now you need a type signature that doesn't match Num a -> Num a -> Num a
This doesn't address the NaN /= NaN issue. I think this should be seperately handled so that in Roc NaN == NaN even if this comes at a performance cost.

Folkert de Vries (Mar 14 2022 at 14:05):

random other suggestion: F32/F64 follow the spec, but Dec gets the good, ergonomic semantics

Folkert de Vries (Mar 14 2022 at 14:06):

since Dec is supposed to become the default, that would mean we mostly won't have issues related to inf/nan, certainly beginners won't

Richard Feldman (Mar 14 2022 at 14:08):

oh yeah I'd want to do that regardless! :big_smile:

Richard Feldman (Mar 14 2022 at 14:08):

this is really just about actual floating-point numbers, not Dec

Folkert de Vries (Mar 14 2022 at 14:10):

that makes the downsides smaller I think

Folkert de Vries (Mar 14 2022 at 14:11):

except that because of interop, F32/F64 might still be very common

Richard Feldman (Mar 15 2022 at 11:59):

I forgot about some -0 edge cases: https://stackoverflow.com/questions/5095968/does-float-have-a-negative-zero-0f#5096205

so even without NaN in the picture, it's actually not true that x == y implies f x == f y when it comes to floats, unless you want to define -0 to be unequal to 0...which definitely seems like it would cause bugs :sweat_smile:

Richard Feldman (Mar 15 2022 at 12:22):

:thinking: here's an interesting idea: what if comparing NaNs for equality panics?

Richard Feldman (Mar 15 2022 at 12:22):

instead of returning either True or False

Richard Feldman (Mar 15 2022 at 12:24):

that way:

you never get bugs with Dict or Set
you never get bugs with sorting
you also never get answers to computations that diverge from the IEEE-754 floating-point standard that hardware manufacturers all follow

Richard Feldman (Mar 15 2022 at 12:25):

I could also see an argument for panicking when trying to toStr a NaN

Richard Feldman (Mar 15 2022 at 12:26):

after all, NaN was designed to be unequal to itself in order to be a self-propagating error

Richard Feldman (Mar 15 2022 at 12:26):

when the only primitives they could specify were CPU primitives, the idea was "if you try to do anything useful with this thing, it absolutely refuses to be useful - all operations return either NaN or false, even equality with itself"

Kevin Gillette (Mar 15 2022 at 14:28):

I've seen how concessions in a type system early on can lead to awkwardness later on, as has happened as Go is about to launch generics: interface values are comparable based on the combination of runtime type and runtime value, but since interface values may contain non-comparable types, comparison can panic at runtime. This has lead Go's comparable ability to permit panics, which is is pretty surprising, undesirable, and merely a consequence of an earlier language choice.

In Roc, if our choices with NaNs are to 1) make them always equal to each other, 2) make them never equal to each other, or 3) make them panic at runtime when compared, then each of those seem like they will surprise a notable portion of the user base. NaNs themselves are inherently surprising. If we're discussing this to save an extra instruction in the worst case, then it doesn't seem worth complexifying the language by either declaring that Eq can panic or alternatively that anything anywhere can panic, while also complexifying the language to allow a case where x < y == y < x

To be confident in our software, when there are traps in a language, they should be surrounded by large swaths of safe, flat terrain. If something so fundamental as equality can panic, then my view of the safety and trustability of a language goes down.

It just seems that, for the standard numeric types, NaN and probably infinities should be excluded in favor of zero/Result and saturated finites (which are already so large that they're usually excellent stand-ins for "infinity" anyways).

If we exclude those special in-band values (in favor of Result tags or normalization), then we can be sure they won't be inputs to further arithmetic operations, and thus we can save that extra instruction in many cases anyways since relatively few operations produce NaN, and many of those require NaN or an infinity as input.

I'd also recommend that if we introduce additional NaN/infinity-allowing float types, we give them a name including the word "Unsafe". If we call them "fast" or "standard", people will choose them first, almost never for any noticeable gain (most applications do not need even competitive floating point performance), and that choice would be to the detriment to the quality of their software.

Derek Gustafson (Mar 15 2022 at 14:44):

Something else we might want to consider: Could the compiler recognize Results around numeric operations, and then unbox them so there is a single infinity/nan check after a bunch of computations finish?

Derek Gustafson (Mar 15 2022 at 14:44):

Would Result.map make that unboxing easier?

Brendan Hansknecht (Mar 15 2022 at 14:55):

If something so fundamental as equality can panic, then my view of the safety and trustability of a language goes down.

To be fair, so can addition and subtraction, so it may fit the languages to make the default equality panic but also have a special float equality that deals with Nan and such

Brendan Hansknecht (Mar 15 2022 at 14:57):

Something like == panics but Num.eqUnsafe sets is false with Nan

Brendan Hansknecht (Mar 15 2022 at 15:00):

And the compiler definitely could optimize result float in general. It could make it not have a tag but consider the tag to be the part of the float that specifies if it is Nan or Inf. Optimizing a chain of operations should work as well, but probably would need to be explicit depended on how the result is used after it is checked.

Richard Feldman (Mar 15 2022 at 15:06):

an annoying thing I just realized about having == work differently is that <= and >= would too, since otherwise x <= x || x >= x would evaluate to False if x was NaN

Richard Feldman (Mar 15 2022 at 15:18):

NaN and probably infinities should be excluded in favor of zero/Result and saturated finites (which are already so large that they're usually excellent stand-ins for "infinity" anyways).

I think it's okay if Num.add on floats goes to Infinity/-Infinity, or if it panics, but I'm not okay with it saturating (add silently returning the highest non-Infinity float value). That would mean it's paying an additional runtime cost (a conditional on every add) in order to silently and intentionally accumulate increasingly wrong answers, which I'd definitely rather not do!

we haven't really talked about the "what if we prevent NaN, Infinity, and -Infinity through panics" design, so let's do that

Richard Feldman (Mar 15 2022 at 15:24):

doing that means:

==, <=, and >= all work as expected
-0 with atan2 is still surprising, considering -0 == 0 but using each of those with atan2 is not equivalent, so arguably atan2 should also be special-cased to turn arguments of -0 into 0.
All operations that would be modified to support this (e.g. float arithmetic, atan2, possibly ==, <=, and >= depending on how other design questions are answered) would also need corresponding functions that give similar answers in a single CPU instruction. I'm not okay with saying "Roc is permanently slower at floating-point operations than C and there's no way you can get around that." The design has to support a way that an application author can access every individual floating point CPU operation with no overhead.

Richard Feldman (Mar 15 2022 at 15:25):

(we have the same thing with ints, e.g. Num.addWrap : Int a, Int a -> Int a is a single-cpu-instruction version of Num.add for integers; unlike Num.add, it doesn't have an extra conditional that panics on overflow)

Richard Feldman (Mar 15 2022 at 15:25):

it's that last requirement that makes floats much trickier than ints!

Richard Feldman (Mar 15 2022 at 15:27):

because whereas ints do things like silently wrapping on overflow, floats silently turn into erroneous values that automatically propagate.

Richard Feldman (Mar 15 2022 at 15:28):

so just supporting addUnsafe : Float a, Float a -> Float a means that now == has to deal with Infinity and -Infinity at a minimum

Richard Feldman (Mar 15 2022 at 15:28):

supporting divUnsafe : Float a, Float a -> Float a or sqrtUnsafe means it has to deal with NaN

Richard Feldman (Mar 15 2022 at 15:31):

the only way to make the unsafe arithmetic ops single-instruction and not do that would be to have a type like RawFloat

Richard Feldman (Mar 15 2022 at 15:31):

which could be an opaque type that doesn't support equality

Richard Feldman (Mar 15 2022 at 15:32):

but of course then it either couldn't support <= or >=, or else we'd have to be okay with those having the (x <= x || x >= x) == False semantics

Richard Feldman (Mar 15 2022 at 15:33):

but at that point, we're giving up "Roc can be as fast as C at floating-point operations"

Richard Feldman (Mar 15 2022 at 15:35):

unless we say that this special RawFloat type can't use the common infix operators for comparison, but rather has to be using things like RawFloat.isEq and RawFloat.isLte

Richard Feldman (Mar 15 2022 at 15:38):

...at which point we have a design where:

You use Float normally, store it in your data structures, etc. Equality works intuitively.
If you have a hot loop or something that's really performance-sensitive, you convert to a RawFloat and do all your math with that, then convert back to Float at the end.
When using RawFloat you miss out on some infix operators.

Richard Feldman (Mar 15 2022 at 15:38):

so this maybe sounds appealing in terms of guarantees, but imagine yourself as someone who uses floats all the time

Richard Feldman (Mar 15 2022 at 15:39):

like you're doing 3D graphics stuff, or physics simulations

Richard Feldman (Mar 15 2022 at 15:40):

is this giving you a good experience with Roc? I doubt it. Your program is doing tons of floating-point math, which means either you're constantly doing conversions to and from RawFloat, or else you're storing RawFloats directly in your data structures, meaning basic stuff like == is no longer supported across your program, or else you're paying a performance penalty everywhere.

Richard Feldman (Mar 15 2022 at 15:42):

when you consider that Dec exists in Roc to give you a pleasant, predictable experience with fractional numbers, and the main reason that F32 and F64 are in the language at all are for performance reasons...is it really a good design to make it cumbersome to get the best performance out of them?

Anton (Mar 15 2022 at 15:43):

meaning basic stuff like == is no longer supported across your program

Would that really be a big problem? It is often advised to not check floats for equality because of binary representation issues.

Richard Feldman (Mar 15 2022 at 15:43):

I think it would be a big deal for tests

Richard Feldman (Mar 15 2022 at 15:43):

you can no longer do like entity1 == entity2

Richard Feldman (Mar 15 2022 at 15:44):

if either of them contains a RawFloat

Richard Feldman (Mar 15 2022 at 16:41):

to summarize alternative designs that preserve x == x implies f x == f x across builtins (except in cases of panics), while still supporting single-instruction versions of all floating-point operations:

Redefine equality `NaN` to be equal to itself, modify `div`, `mod`, and `atan2` to treat `-0` as equivalent to `0`

Change ==, <=, and >= to special-case NaN
Change div, mod, and atan2 to special-case -0 to work the same way as 0 (e.g. 3/-0 == Infinity, not -Infinity)
Offer alternative single-instruction versions of all the above operations

Redefine all operations that could produce `NaN` to panic instead of producing `NaN`, modify `atan2` to treat `-0` as equivalent to `0`, use `RawFloat` for fastest operations

Leave ==, <=, and >= alone, and assume that a Floatvalue will never contain NaN. This implies assuming that a platform will never mistakenly send a NaN over from the host. I worry a bit that this would be an easy mistake for a host author to make, but I'm open to it possibly being the right design.
Change div to panic on division by 0
Change atan2 to special-case -0 to work the same way as 0
Offer alternative single-instruction versions of all the above modified operations. Since other Float operations are relying on the assumption that they will never contain NaN, this essentially requires having a separate type like RawFloat for all the single-instruction operations, with a conversion from RawFloat to Float that returns a Result in case the RawFloat was NaN

Brendan Hansknecht (Mar 15 2022 at 17:11):

you can no longer do like entity1 == entity2

For floats that is explicitly frowned upon even in tests. So not sure it is a loss.

Richard Feldman (Mar 15 2022 at 18:02):

ok, so in that vein, here's another potential design:

Don't have floats support the `Equating` or `Ordering` abilities, modify `div`, `mod`, and `atan2` to treat `-0` as equivalent to `0`

Attempting to use ==, <=, and >= on floats would be a type mismatch, as would attempting to use List.sort on any list containing a float. Floats could not be used in Sets or as keys of Dicts.
Change div, mod, and atan2 to special-case -0 to work the same way as 0 (e.g. 3/-0 == Infinity, not -Infinity)
Offer alternative single-instruction versions of all the above operations.
Someone can make to make a third-party SafeFloat (or equivalent) which could use Result to make sure it only ever consists of finite floats, and then it could support Equating and Ordering safely.

Richard Feldman (Mar 15 2022 at 18:15):

:point_up: is similar to what Rust does, except they have PartialEq and PartialOrd which == and <= and >= use, and basically what Partial___ means is "equality except NaN is weird" - so in Rust:

== supports floats, but if you use it with two NaNs it returns false - same with <= and >=
you can't use floats for sorting, or put them in sets or dictionaries

Richard Feldman (Mar 15 2022 at 18:16):

I'd rather not go down the PartialEq road though

Richard Feldman (Mar 15 2022 at 18:23):

I think the thing that draws me to the "redefine NaN to be equal to itself" design is that it has these characteristics:

Floats basically Just Work the same way as everything else, except they don't burn you in ways they typically would.
If you want to performance optimize a hot loop, there are a couple of extra functions you can opt into to squeeze out a few extra arithmetic instructions.

Richard Feldman (Mar 15 2022 at 18:23):

those downsides seem less bad than the other designs' downsides

Richard Feldman (Mar 15 2022 at 18:27):

I'm having a hard time coming up with a way in which it's less error-prone to have NaN be unequal to itself (like the spec defines it to be), which is a legendary gotcha for floats, and the "panic whenever we'd get a NaN so we don't have to worry about handling them" design has the downside of needing a totally separate float type and API for people to learn about if they want maximally fast float operations, which is likely what they want since if they didn't care that much about performance, they'd probably be using Dec anyway

Derek Gustafson (Mar 15 2022 at 18:31):

Floats not supporting ordering feels like a bad idea

Derek Gustafson (Mar 15 2022 at 18:32):

If I have a list of floats, I expect to be able to sort them

Anton (Mar 15 2022 at 19:30):

We could provide sort functions that use the different recommended ways to compare floats, see the section "Know what you’re doing" here. We could then also provide the necessary background information on which algorithm to choose.

I feel that it is actually a good thing to not be able to call List.sort for these RawFloat, because it will make you aware of how comparing floats can go wrong. If we do allow List.sort to be called with floats, it is guaranteed to produce hard to debug strange behavior for some people. If Roc is used at scale this will eventually go wrong in a critical situation. Floating point issues have led to loss of life before..

Derek Gustafson (Mar 15 2022 at 19:37):

I'm okay with making a test of equality difficult/impossible for floats. That is error prone.
But lessThan is reasonable for floats. And lots of the sorts of numeric algorithms that are the work horse of high efficiency code depend on that sort of comparison

Derek Gustafson (Mar 15 2022 at 19:38):

I mean, quick sort IS Roc's major selling point right now. Do we really want to undermine that by saying only for integers?

Richard Feldman (Mar 15 2022 at 22:04):

Floating point issues have led to loss of life before

I didn't know about this! Reading through the article, apparently it was specifically overflow when converting from f64 to i16:

The greater values of BH caused a data conversion from a 64-bit floating point number to a 16-bit signed integer value to overflow

so I guess in this case that wouldn't have made a difference, but still - fair point in general about floating point inaccuracies causing problems

Richard Feldman (Mar 15 2022 at 22:09):

I feel that it is actually a good thing to not be able to call List.sort for these RawFloat, because it will make you aware of how comparing floats can go wrong.

another angle to consider: what if == isn't supported for floats, and instead there's only a comparison function? e.g. you give it two floats and an epsilon, and it compares them to see if they're no more than that far apart?

(I assume - but haven't verified - that if you give it 0, then after inlining, LLVM would optimize it to a single equality check)

Richard Feldman (Mar 15 2022 at 22:11):

or perhaps use the ULP algorithm mentioned in that article might be a nicer default - e.g. call it Num.isApproxEq : Float a, Float a -> Bool

Richard Feldman (Mar 15 2022 at 22:13):

so Num.isApproxEq (0.1 + 0.2) 0.3 would return True

Richard Feldman (Mar 15 2022 at 22:13):

the "make it most ergonomic to reach for the recommended and less error-prone strategy" angle appeals to me about that design!

Richard Feldman (Mar 15 2022 at 22:17):

hm, I guess another argument for trying that approach first is that if the ergonomics prove too painful in practice, it's easier to relax the constraints into one of the other designs later :thinking:

Richard Feldman (Mar 15 2022 at 22:17):

by defining equality (somehow) on floats

Brendan Hansknecht (Mar 15 2022 at 22:18):

Do we want an epsilon for isApproxEq? Or maybe also have a version where you could specify an epsilon?

Richard Feldman (Mar 15 2022 at 22:19):

yeah I was thinking have two different versions

Richard Feldman (Mar 15 2022 at 22:19):

isApproxEq and like isWithin or something

Brendan Hansknecht (Mar 15 2022 at 22:20):

Yeah, I like that naming

Richard Feldman (Mar 15 2022 at 22:20):

should floats support Hashing by default?

Richard Feldman (Mar 15 2022 at 22:20):

if they're not gonna support Ordering or Equating

Richard Feldman (Mar 15 2022 at 22:20):

(I actually have no idea what the tradeoffs are there!)

Brendan Hansknecht (Mar 15 2022 at 22:20):

I guess it doesn't make sense if they don't have Equating

Richard Feldman (Mar 15 2022 at 22:21):

yeah I mean on the one hand it wouldn't be very useful

Richard Feldman (Mar 15 2022 at 22:21):

but on the other hand...it should work fine? haha

Richard Feldman (Mar 15 2022 at 22:21):

I mean you can hash a bunch of bits, no problem :stuck_out_tongue:

Brendan Hansknecht (Mar 15 2022 at 22:21):

Sure, but it wouldn't let you use them as a key of aDict or Set

Brendan Hansknecht (Mar 15 2022 at 22:21):

you also need Equating for that

Richard Feldman (Mar 15 2022 at 22:22):

right

Richard Feldman (Mar 15 2022 at 22:22):

so is there any harm in having them support Hashing?

Richard Feldman (Mar 15 2022 at 22:23):

since Dict and Set already have to require both Hashing and Equating anyway

Brendan Hansknecht (Mar 15 2022 at 22:25):

I mean probably not, but I personally think Hashing should probably include Equating.

Brendan Hansknecht (Mar 15 2022 at 22:25):

if a == b then hash(a) == hash(b)

Brendan Hansknecht (Mar 15 2022 at 22:25):

That feels that an invariant of hashing. So feels weird to have hashing were that doesn't makes sense.

Derek Gustafson (Mar 15 2022 at 22:26):

Are we committed to floats not supporting Ordering?
I still think that's going to be a problem for a lot of users

Richard Feldman (Mar 15 2022 at 22:27):

I'm not sure why we would support Ordering but not Equating - the reasons to exclude one seem to apply to the other, yeah?

Brendan Hansknecht (Mar 15 2022 at 22:27):

Quantiles and histograms require Ordering.

Brendan Hansknecht (Mar 15 2022 at 22:27):

I think we should support Ordering

Brendan Hansknecht (Mar 15 2022 at 22:28):

Doesn't the minimal form of ordering only need <. So it shouldn't need Equating?

Derek Gustafson (Mar 15 2022 at 22:28):

Sorting a list of floats feels foundational

Brendan Hansknecht (Mar 15 2022 at 22:29):

Or at least some sort of PartialOrdering so you can still sort?

Derek Gustafson (Mar 15 2022 at 22:30):

And people aren't going to be surprised if 2 numbers that differ in the 12th decimal place end up next to each other in a list

Richard Feldman (Mar 15 2022 at 22:33):

ordering needs ==

Brendan Hansknecht (Mar 15 2022 at 22:33):

Why?

Richard Feldman (Mar 15 2022 at 22:34):

Order is defined as [ Lt, Gt, Eq ]

Brendan Hansknecht (Mar 15 2022 at 22:34):

But I can order numbers without ever equating them. So I don't think it needs Eq

Richard Feldman (Mar 15 2022 at 22:35):

hmm

Richard Feldman (Mar 15 2022 at 22:35):

does merge sort need it? Or need it in order to be fast?

Richard Feldman (Mar 15 2022 at 22:36):

I'm assuming that's what we'd use for List.sort

Brendan Hansknecht (Mar 15 2022 at 22:36):

Hmm...though this is still problematic. Cause someone can do Eq = \x, y -> !(Lt x y) && !(Gt x y)

Derek Gustafson (Mar 15 2022 at 22:37):

My assumption is that merge sort can be written using just < or just >

Brendan Hansknecht (Mar 15 2022 at 22:37):

So Ordering/PartialOrdering/etc would need to be just Lt or just be Gt to avoid someone deriving Eq

Richard Feldman (Mar 15 2022 at 22:38):

they can't add an ability to floats after the fact

Richard Feldman (Mar 15 2022 at 22:38):

but with or without Ordering, there should still be an operation for comparing for greater than or less than

Richard Feldman (Mar 15 2022 at 22:39):

just not gte/lte

Brendan Hansknecht (Mar 15 2022 at 22:40):

pdquicksort which is the standard sorting algorithm for a number of languages definitely requires Eq to be as fast as possible.

Richard Feldman (Mar 15 2022 at 22:40):

does merge sort run faster if it can use equality?

Richard Feldman (Mar 15 2022 at 22:41):

I suppose we could always have sort and sortSlower with the latter requiring only Ordering and the former requiring Equating too :big_smile:

Brendan Hansknecht (Mar 15 2022 at 22:42):

Not to my knowledge, but I would have to double check on that.

Brendan Hansknecht (Mar 15 2022 at 22:49):

Actually, I guess all sorting algorithms that support floats in other languages have to work without depending on Eq. Otherwise, they would break if there is a NaN in the list. In the case of Pdqsort, it just uses Eq for a potential optimization where many of the elements in the list are equal. I would assume that is pretty unlikely with a list of floats. So maybe that optimization doesn't matter for floats.

Richard Feldman (Mar 16 2022 at 00:38):

so in most languages that have an Ordering equivalent, there's a way to say something like compare : a, a -> [ Lt, Gt, Eq ] | a supports Ordering, Equating

Richard Feldman (Mar 16 2022 at 00:39):

which is normally what get used for sorting

Richard Feldman (Mar 16 2022 at 00:39):

we could replace that with:

isLt : a, a -> Bool | a supports Ordering
isGt : a, a -> Bool | a supports Ordering

Richard Feldman (Mar 16 2022 at 00:41):

and then you could implement compare (or an equivalent) in terms of those, since if !(x < y || x > y) then they must be equal

Richard Feldman (Mar 16 2022 at 00:41):

or, put another way, you can implement isNotEq using those

Richard Feldman (Mar 16 2022 at 00:42):

if we defined Ordering that way, then we could implement these:

List.sort : List elem -> List elem | elem supports Ordering, Equating
List.sortSlower : List elem -> List elem | elem supports Ordering

Richard Feldman (Mar 16 2022 at 00:43):

but then there would be no way to have the Ordering ability include compare : a, a -> [ Lt, Gt, Eq ] | a supports Ordering, Equating

Richard Feldman (Mar 16 2022 at 00:43):

we'd have to make a separate ability for that

Richard Feldman (Mar 16 2022 at 00:43):

so what I'm wondering is: is compare actually necessary?

Richard Feldman (Mar 16 2022 at 00:44):

like for integers I'd assume that LLVM could probably optimize something like !(x < y || x > y) into ==

Richard Feldman (Mar 16 2022 at 00:44):

but what about for more complicated types?

Richard Feldman (Mar 16 2022 at 00:45):

basically: would accommodating sorting floats in this way make other operations slower?

Richard Feldman (Mar 16 2022 at 00:47):

ah, yes it would

Richard Feldman (Mar 16 2022 at 00:47):

here's an example: say I have a record with a bunch of info in it, and instead of having compare I have less than, greater than, and equals

Richard Feldman (Mar 16 2022 at 00:48):

in order to check equals, I may have to traverse every field in the record to make sure they're all structurally equal

Richard Feldman (Mar 16 2022 at 00:48):

and to determine less than, I might have to do the same

Richard Feldman (Mar 16 2022 at 00:49):

if the first several records I check happen to be equal, so I keep going looking for a tiebreaker

Richard Feldman (Mar 16 2022 at 00:50):

so if there's compare, I definitely only traverse the fields once, but if there isn't, I may do up to 2x as much traversing

Richard Feldman (Mar 16 2022 at 00:54):

so to sum up, that means that one of the following has to be true: either...

Floats don't support Ordering, so there's no way to sort them
Floats support Ordering, but Ordering doesn't include compare (which it couldn't, because compare requires equality), which means all sorting operations on compound datatypes like records slow down unavoidably
There are two different ordering abilities, e.g. equivalents of PartialOrd and Ord in Rust - one is basically just for floats, and everything but floats has neither (e.g. functions) or both (e.g. every other builtin besides floats). Those basically exist just so that there can be two List.sort functions, one of which works on floats (but slower), and the other of which works on everything else

Richard Feldman (Mar 16 2022 at 00:54):

between those three options, I gotta say - disallowing sorting floats seems most appealing :sweat_smile:

Richard Feldman (Mar 16 2022 at 00:55):

especially considering someone could always make a SafeFloat wrapper around Float which uses Result to verify that it's not a NaN, and then implement Equating and Ordering on that, and then sort as normal

Richard Feldman (Mar 16 2022 at 01:13):

huh, this is surprising: https://godbolt.org/z/fKqYrP7oh

LLVM optimizes !(x < y || x > y) to fewer instructions than a regular ==

(but it's a SSE instruction, so I guess maybe the downside is more register traffic?)

Ayaz Hafiz (Mar 16 2022 at 01:13):

in my mind it’s important to be able to sort floats as other comments have mentioned. not doing so would either exclude a large class of numerical mathematics from being done in Roc or the emergence of a ecosystem library that allows you to compare floats and it would likely become a de facto standard, i’m not sure that’s any better?

at the least, i think it would make sense to be able to define a strict preorder over the floats (that is, be able to answer the question “is m<n?”) and not strict equality

Richard Feldman (Mar 16 2022 at 01:15):

define a strict preorder over the floats

I think that's a reasonable idea, but specifically how should that work?

Richard Feldman (Mar 16 2022 at 01:15):

PartialOrd and Ord equivalents?

Richard Feldman (Mar 16 2022 at 01:16):

or defining Equating for floats?

Ayaz Hafiz (Mar 16 2022 at 01:18):

I’m not sure there’s an equivalent in rust

basically you would only be able to define the method “lessThan: a, a -> Bool | a has …” (apologies for formatting, on phone)

Richard Feldman (Mar 16 2022 at 01:18):

right, sorry - I should probably stop saying PartialOrd

Richard Feldman (Mar 16 2022 at 01:18):

let's call it LtGt for now

Richard Feldman (Mar 16 2022 at 01:18):

it just supports less than and greater than, and that's it

Richard Feldman (Mar 16 2022 at 01:19):

so, specifically the design would be:

LtGt is
   isLt : a, a -> Bool | a supports LtGt
   isGt : a, a -> Bool | a supports LtGt

Ordering supports LtGt, Equating, is
    compare : a, a -> Order | a supports Ordering

List.sort : List elem -> List elem | elem supports Ordering
List.sortSlower : List elem -> List elem | elem supports LtGt

Ayaz Hafiz (Mar 16 2022 at 01:20):

Partial orders support less than or equals to as well, are you intentionally excluding that? If so then that’s just a strict preorder where you can only compare things in an order but there’s no notion of equality

Richard Feldman (Mar 16 2022 at 01:20):

how would you check less than or equal if you can't do equals? :thinking:

Richard Feldman (Mar 16 2022 at 01:20):

NaN <= NaN is false

Richard Feldman (Mar 16 2022 at 01:21):

right, the point is that if floats have NaN, then any operations they do involving ==, <=, or >= will work differently from how those operations work on everything else in the language

Richard Feldman (Mar 16 2022 at 01:25):

other designs previously discussed include things like:

just let floats work like they do in most languages, where sometimes you just get bizarre behavior when NaNs happen :shrug: - this isn't the same thing, but emotionally it's always kinda felt to me like "NaNs are UB, good luck!"
redefine NaN to be equal to itself, despite what the IEEE-754 spec says
have two separate float types, one of which uses checks to guarantee no NaNs at the cost of performance, and that one gets to support equating, ordering, hashing, etc.

Brendan Hansknecht (Mar 16 2022 at 01:26):

"NaNs are UB, good luck!"

Are they? They have a clear spec, right? It is just a weird spec.

Richard Feldman (Mar 16 2022 at 01:26):

yeah I mean emotionally they feel that way

Richard Feldman (Mar 16 2022 at 01:26):

in the sense of like

Richard Feldman (Mar 16 2022 at 01:26):

"you're not gonna get an error, but weird stuff is gonna happen, so just...don't let it happen, ok?"

Richard Feldman (Mar 16 2022 at 01:28):

so I think of them similarly in terms of "I hope this doesn't happen, because if it does, the symptoms will be weird and possibly distant from where the original problem occurred"

Richard Feldman (Mar 16 2022 at 01:28):

like dictionaries/sets having lengths longer than the number of items I can get out of them

Richard Feldman (Mar 16 2022 at 01:28):

etc

Richard Feldman (Mar 16 2022 at 01:30):

so to continue with the LtGt design, a thing I don't like about it is that LtGt (pretend it has a better name) is literally only going to be used by floats. Every other type in the language will almost certainly have either Ordering (which implies LtGt) or neither LtGt nor Ordering

Richard Feldman (Mar 16 2022 at 01:31):

but that means everyone who wants to implement Ordering between now and forever needs to also learn about and implement LtGt, and that is literally only so that floats can be sorted

Richard Feldman (Mar 16 2022 at 01:32):

ok, here's a weird idea that might be awesome :stuck_out_tongue:

Richard Feldman (Mar 16 2022 at 01:33):

what if we have this?

Ordering is
    compare : a, a -> [ Lt, Gt, Eq ] | a supports Ordering

Richard Feldman (Mar 16 2022 at 01:33):

so like

Richard Feldman (Mar 16 2022 at 01:33):

we don't actually say you have to support equating, but you are required to say whether two things are equal

Richard Feldman (Mar 16 2022 at 01:33):

now technically

Derek Gustafson (Mar 16 2022 at 01:33):

And floats just don't return Eq

Derek Gustafson (Mar 16 2022 at 01:33):

I like it

Richard Feldman (Mar 16 2022 at 01:34):

actually no, floats would :joy:

Richard Feldman (Mar 16 2022 at 01:34):

that's one of the ways in which it's weird

Richard Feldman (Mar 16 2022 at 01:34):

so floats wouldn't support == because it's error prone

Richard Feldman (Mar 16 2022 at 01:34):

but if you really wanted to do float equality, you could do compare myFloat == Eq

Richard Feldman (Mar 16 2022 at 01:35):

and that would in turn mean List.sort would work optimally for all types, including floats

Richard Feldman (Mar 16 2022 at 01:36):

heh, although in that case, NaNs would have to return Eq

Richard Feldman (Mar 16 2022 at 01:37):

I mean if they have to return one of those 3 options, it's certainly not going to go well if they return Gt or Lt!

Brendan Hansknecht (Mar 16 2022 at 01:37):

LLVM optimizes !(x < y || x > y) to fewer instructions than a regular ==

Just dug into this a bit. == is longer because it has to check the parity flag. Thus it uses a different instruction than !(x < y || x > y). I assume this is due to handling NaN

Richard Feldman (Mar 16 2022 at 01:38):

interesting, thanks for looking into it!

Richard Feldman (Mar 16 2022 at 01:39):

ha, I just realized something else: it can't be true that x == y implies f x == f y for any language that supports both floats and serializing them to bytes

Richard Feldman (Mar 16 2022 at 01:39):

because 0 and -0 are equal but have different bit patterns

Richard Feldman (Mar 16 2022 at 01:40):

so if f is "serialize my argument to bytes" and x is 0 when y is -0 then they'll be different

Richard Feldman (Mar 16 2022 at 01:40):

so I guess we're already priced into giving up that invariant for floats :sweat_smile:

Richard Feldman (Mar 16 2022 at 01:44):

interestingly, this design feels less weird if you choose different names than Lt,Gt, and Eq

Richard Feldman (Mar 16 2022 at 01:44):

like what if it's Order : [ Before, Same, After ] or something like that

Richard Feldman (Mar 16 2022 at 01:44):

like it's about their relative orders, not a notion of equality

Richard Feldman (Mar 16 2022 at 01:44):

(which is exactly how it would be used in sorting)

Richard Feldman (Mar 16 2022 at 01:45):

now we're not saying that NaN is equal to NaN, but rather we're saying that if I have two NaNs next to each other, they should stay in the same positions instead of needing to being reordered

Richard Feldman (Mar 16 2022 at 01:46):

and of course the way that would be implemented in all other cases except for floats would be using ==

Richard Feldman (Mar 16 2022 at 01:46):

but the point is that you don't need equality to be able to answer the question "should I reorder these two floats or keep them where they are?"

Derek Gustafson (Mar 16 2022 at 01:48):

I'm not sure I have the same reaction to renaming to Before, Same, After, but I don't immediately see a problem with it.

Derek Gustafson (Mar 16 2022 at 01:49):

Probably I'm just more okay with Lt, Gt, Eq than other people are

Derek Gustafson (Mar 16 2022 at 01:49):

And I don't have a problem with that name change

Brendan Hansknecht (Mar 16 2022 at 01:53):

A note on cmp in rust. It doesn't work on floats. They just have partial_cmp which returns an option of an ordering.

Brendan Hansknecht (Mar 16 2022 at 01:54):

They also have total_cmp on nightly with an explicit nan ordering

Richard Feldman (Mar 16 2022 at 01:54):

ok so naming aside, the fundamental semantics of this design proposal would be:

Floats don't support Equating, so you can't use them with == ,<=, or >=, or use them in sets or dictionaries
Floats do support Ordering, and their compare implementation treats NaNs as equal to themselves - so you can use them in List.sort
there are Float.isApproxEq and Float.isWithin for epsilon-based comparisons, or if you really need to use float == you can do compare x y == Eq or compare x y == Same or whatever it ends up being called.

Richard Feldman (Mar 16 2022 at 01:55):

~~I just verified that LLVM optimizes compare x y == Eq all the way down to == https://godbolt.org/z/oTj35qjsq~~ nm, forgot to order the conditionals to account for NaN

Brendan Hansknecht (Mar 16 2022 at 01:55):

now we're not saying that NaN is equal to NaN, but rather we're saying that if I have two NaNs next to each other, they should stay in the same positions instead of needing to being reordered

Brendan Hansknecht (Mar 16 2022 at 01:55):

I like the rewording

Brendan Hansknecht (Mar 16 2022 at 01:59):

I mean it is the same code but reworded, right?

Richard Feldman (Mar 16 2022 at 02:00):

kinda?

Brendan Hansknecht (Mar 16 2022 at 02:00):

To get a 0 you need to not be < and to not be >

Richard Feldman (Mar 16 2022 at 02:00):

compare x y == Same definitely would give you the equivalent of x == y || (isNaN x && isNaN y)

Richard Feldman (Mar 16 2022 at 02:00):

right

Richard Feldman (Mar 16 2022 at 02:01):

but it wouldn't give you the equivalent of doing an ordinary CPU-level float equality comparison, which would return false if both are NaN

Richard Feldman (Mar 16 2022 at 02:01):

I'm mainly surprised that there's a single SSE instruction for comparing them in a way that returns true if both are NaN!

Brendan Hansknecht (Mar 16 2022 at 02:02):

it's not the same instruction ucomisd vs cmpeqsd

Richard Feldman (Mar 16 2022 at 02:02):

right

Richard Feldman (Mar 16 2022 at 02:02):

and there may be a slight perf difference between the two

Brendan Hansknecht (Mar 16 2022 at 02:02):

ucomisd enables NaN == NaN

Richard Feldman (Mar 16 2022 at 02:02):

I'm just surprised there's this level of hardware support at all!

Richard Feldman (Mar 16 2022 at 02:02):

TIL :smiley:

Jared Cone (Mar 16 2022 at 03:38):

I've always liked sort predicates that just return true if the pair is in the correct order. Are there sort algorithms that need to distinguish between before, equal, and after?

Richard Feldman (Mar 16 2022 at 03:38):

yeah we talked about it earlier and apparently there are some important ones whose performance relies on being able to check for equality

Brendan Hansknecht (Mar 16 2022 at 07:26):

I don't think saying "relies on" is fair. I think it just enables more shortcuts/optimizations.

Brendan Hansknecht (Mar 16 2022 at 07:27):

Also, I am kinda stunned that I have apparently never sorted floats in rust. Apparently, you have to define a custom comparison and use sort_by to sort floats in rust.

Brendan Hansknecht (Mar 16 2022 at 07:30):

Hmm... I guess that is also an option for roc. Don't support Ordering on floats and just force people to define the comparison operator deciding how they do or don't want to deal with NaN.

Richard Feldman (Mar 16 2022 at 11:11):

it is an option! However, I think it seems reasonable to define a default Ordering that reliably sorts NaNs to the front. I can't really think of any downsides to doing that (although maybe they're there and I'm not thinking of them!) and people can always opt into List.sortWith to give a custom comparator that's faster but doesn't handle NaNs as well.

Brendan Hansknecht (Mar 16 2022 at 15:26):

Yeah, soundd fine. Either front or back.

Richard Feldman (Mar 17 2022 at 04:15):

observation: Num.nan < 0 would be False, and so would Num.nan > 0, and yet Num.nan is not 0

Richard Feldman (Mar 17 2022 at 04:16):

so that means if you do this:

if x > y then
    ...
else if x < y then
    ...
else # x must be == y
    ...

that comment is incorrect if these are floats!

Richard Feldman (Mar 17 2022 at 04:16):

because either x or y (or both) could be NaN

Richard Feldman (Mar 17 2022 at 04:16):

in which case relying on the assumption that they're equal in that else branch could lead to bugs

Richard Feldman (Mar 17 2022 at 04:17):

we were just running into this in practice last night when trying to write a sorting function that handles NaN properly :sweat_smile:

Richard Feldman (Mar 17 2022 at 04:17):

is this an argument that < and > on floats shouldn't be allowed either? :scream:

Richard Feldman (Mar 17 2022 at 04:18):

it seems like it would be too hard to use them if there are no comparison functions at all

Richard Feldman (Mar 17 2022 at 04:18):

and I guess even isApproxEq and isWithin can return weird answers with NaNs too

Kevin Gillette (Mar 17 2022 at 04:30):

Richard Feldman said:

another angle to consider: what if == isn't supported for floats, and instead there's only a comparison function? e.g. you give it two floats and an epsilon, and it compares them to see if they're no more than that far apart?

An epsilon approach does seem better for "equal enough" checks, yet having skimmed the "know what you're doing" article, I wouldn't trust myself or the average programmer to pick a good epsilon except in cases in which they should be using Dec instead (i.e. in financial cases, like checking if two values are within 0.01, exclusive, of each other).

Regarding float equality (not counting RawFloat, since it sounds like people using that "know what they're doing" and should be able to do what they want), I like the idea of F32 and F64 being ordered (< and >) but not directly comparable for equality (which means also no <= or >=). A List F64 would thus theoretically be sortable (since sorting strictly only needs <).

I agree that sortable lists of floats make sense, but float comparison should be avoided. That also would mean floats can't be used as Dict keys (assuming Hash implies Eq) nor Set elements, both of which are generally regarded as quite inadvisable. Dec wouldn't have these restrictions, being fixed-point, though Dec use as a Dict key or Set element would also be suspect, though not nearly to the same degree as a float.

Kevin Gillette (Mar 17 2022 at 04:32):

Richard Feldman said:

so is there any harm in having them support Hashing?

In what situations would hashability be useful on values that cannot be compared for equality?

Richard Feldman (Mar 17 2022 at 04:35):

only use case I can think of is if you're trying to come up with a salt or a random seed or something, using a record that has a float in it

Kevin Gillette (Mar 17 2022 at 04:36):

Richard Feldman said:

Order is defined as [ Lt, Gt, Eq ]

This isn't fundamental, just an historical convention. For values with a "total ordering," you can derive x == y from !(x < y) && !(y < x). The compiler can also infer that floats have equality for comparison and such even if the language itself chose to prevent programmer-accessible equality to prevent common floating point blunders.

Brendan Hansknecht (Mar 17 2022 at 04:36):

hmm. I think in that case we should let the user just turn the float to bytes rather than hash it.

Richard Feldman (Mar 17 2022 at 04:38):

good point about bytes - we need to support that anyway!

Brendan Hansknecht (Mar 17 2022 at 04:38):

So just realized something. If you put a float in any record or container otherwise, would Roc just remove == from that version of the container. So like List U32 has ==, but List F32 doesn't have ==. Same with a dictionary where the values are floats?

So as a user, I can compare my model for equality, but the moment I add a float, I can no longer do that and the code won't compiler. Arguably, that is probably a good idea since the user needs to decide what they mean when they say float equality, but it feels really strange to remove a functionality when someone expands their model.

Richard Feldman (Mar 17 2022 at 04:38):

yep

Richard Feldman (Mar 17 2022 at 04:39):

to me that's the biggest risk of "disallow equality on floats" not being nice in practice

Brendan Hansknecht (Mar 17 2022 at 04:39):

Sounds like an alien concept, but I feel like it actually does make a lot of sense.

Richard Feldman (Mar 17 2022 at 04:39):

is the effect it has on collections containing floats, including records that have a single float field

Richard Feldman (Mar 17 2022 at 04:40):

yeah - at the same time, a nice property of the design is that if we don't like it, we can introduce float equality (by some definition) without breaking existing code

Brendan Hansknecht (Mar 17 2022 at 04:40):

But on my custom struct, I could define has Eq even if it contains a float and just set the float to use Num.isApproxEq or something and get == back for the struct as a whole, right?

Richard Feldman (Mar 17 2022 at 04:41):

if you make it an opaque type, yeah

Brendan Hansknecht (Mar 17 2022 at 04:41):

Wait, do only opaque types get abilities?

Richard Feldman (Mar 17 2022 at 04:44):

yeah that's essential

Richard Feldman (Mar 17 2022 at 04:44):

records are anonymous

Richard Feldman (Mar 17 2022 at 04:45):

if you said like "the type alias User gets this ability," since type aliases are interchangeable with what they alias, that would be equivalent to saying "all records with the same shape as User have this ability now"

Richard Feldman (Mar 17 2022 at 04:45):

which would lead to a big mess pretty quickly :sweat_smile:

Kevin Gillette (Mar 17 2022 at 04:46):

Richard Feldman said:

There are two different ordering abilities, e.g. equivalents of PartialOrd and Ord in Rust - one is basically just for floats, and everything but floats has neither (e.g. functions) or both (e.g. every other builtin besides floats). Those basically exist just so that there can be two List.sort functions, one of which works on floats (but slower), and the other of which works on everything else

It sounds like Rust includes/implies Eq in Ord as a convenience. The mathematical notion of true partial ordering does not seem like it'd show up frequently in programming, except for abstract mathematical programming, or perhaps anything dealing with value lattices (constraint solving languages). I'm not sure the distinction in Rust implies anything fundamental is going on: Roc could have Eq and Ord as non-overlapping abilities (Roc's Ord could be identical to Rust's PartialOrd, yet just be called "Ord").

I don't see why there'd be a speed difference. Roc would know that machine code can and should be generated to compare floats with <= or == if doing so is faster, e.g. for sorting (equality checks may indeed save a few sorting swaps, though it's very much dependent on the input); those operations may just be excluded from F32/F64 in the Roc language to prevent mistakes when programming at a high level, encouraging isWithin instead.

Brendan Hansknecht (Mar 17 2022 at 04:48):

if you said like "the type alias User gets this ability," since type aliases are interchangeable with what they alias, that would be equivalent to saying "all records with the same shape as User have this ability now"

Totally makes sense. I guess I am gonna be making really large roc modules so that everything can have access to the opaque type internals :rolling_on_the_floor_laughing:. For things that I really wanted to be regular records, but need to be opaque types for abilities.

Richard Feldman (Mar 17 2022 at 04:50):

well you can also always expose like toRaw / fromRaw or something like that

Richard Feldman (Mar 17 2022 at 04:50):

or vice versa, if you just want the opaque type for certain functions, you can expose operations on the raw type and convert to opaque just-in-time (e.g. for serialization if you want the default Encode)

Richard Feldman (Mar 17 2022 at 04:51):

since opaque types can be defined in any scope, you can even do that in the middle of a function if you want

Brendan Hansknecht (Mar 17 2022 at 04:51):

hmm... didn't think about that.

Brendan Hansknecht (Mar 17 2022 at 04:51):

Ok, this seems like less of a concern than I originally thought.

Richard Feldman (Mar 17 2022 at 04:51):

also, records and tag unions will get all the default abilities for free

Richard Feldman (Mar 17 2022 at 04:52):

so you'd only need to do that if you wanted to override the default ones or add additional ones

Brendan Hansknecht (Mar 17 2022 at 04:55):

I don't see why there'd be a speed difference.

PDQsort which rust uses by default specifically checks if the current pivot is equal to the parent pivot, if so, it can do a smarter partitioning such that it only has to recurse on one of the two children lists after partitioning instead of both of them. Only matters for lists where there are multiple of the same item. Probably less common with floats, but still a potential missed optimization that can cut out large chunks of a quick sort when there are duplicates.

Update: It is also used to avoid bounds checking apparently, so that actually makes it potentially very broken if there is a NaN in the list.

Kevin Gillette (Mar 17 2022 at 04:57):

Richard Feldman said:

we don't actually say you have to support equating, but you are required to say whether two things are equal

I don't get it.

In any case, LtGt could really just be:

Ordering is
    isLessThan: a, a -> Bool | a supports Ordering

Gt is, in a sense, a lie with any value for which we're not defining ==, since while you can say "it's less than, or it's not less than," the "not less than" part means "greater or equal," not "greater than", and in this scenario, we don't want to say anything in terms of equality. And so when you're left with the presence of absence of the property of being less than, it's really just a bool. We could of course use isGreaterThan instead (it's an equivalent capability), we can't have [Lt, Gt], and would instead have [Lt, Gt, WeWouldRatherNotSay]

Richard Feldman (Mar 17 2022 at 04:59):

Kevin Gillette said:

Richard Feldman said:

we don't actually say you have to support equating, but you are required to say whether two things are equal

I don't get it.

later on I revised that to be "you are required to say whether they sort to the same position"

Kevin Gillette (Mar 17 2022 at 04:59):

Richard Feldman said:

because 0 and -0 are equal but have different bit patterns

They have different bit patterns, but I had thought that IEEE754 hardware instructions specially treat those bit patterns as "equal" (assuming you don't reinterpret the memory location as being a U64, for example).

Richard Feldman (Mar 17 2022 at 04:59):

usually yes, but not in the case of division, mod, or atan2

Richard Feldman (Mar 17 2022 at 05:00):

those all give different (as in actually unequal) answers for some input combinations where the only difference is 0 vs -0

Kevin Gillette (Mar 17 2022 at 05:08):

Derek Gustafson said:

Probably I'm just more okay with Lt, Gt, Eq than other people are

If we allow float equality, then we don't lose anything relative to other [common] languages, in that floats will be just as tricky/error prone as in those other languages. But the wider programming community won't fault Roc for the choice of using floats conventionally because the wider community won't notice the choice not made. But Roc also won't gain anything either. We can't really take away equality once we give it to floats (at least after a 1.0 release, presumably, without a major/breaking version release), but we can add equality to floats later, if we decide to, without breaking anyone in the process. In the best case, we are right to prohibit direct equality checks in the language, and Roc float-handling code will be safer than elsewhere; in the worst case, we might gain some evidence/experience that demonstrates we need float equality, and we'd have lost nothing in the process (thus the worst case has no downside). In that worst case, we could change Ord to imply Eq via inferring that x == y is equivalent to !(x < y) && !(y < x), which should be trivial for LLVM to optimize quite well in most cases, though most of the types implementing Ord would probably explicitly implement Eq anyways either way.

Kevin Gillette (Mar 17 2022 at 05:11):

Richard Feldman said:

to me that's the biggest risk of "disallow equality on floats" not being nice in practice

I think we can redirect that towards "you probably meant to use Dec? use binary floats for speed, not precision" ?

Brendan Hansknecht (Mar 17 2022 at 05:15):

In that worst case, we could change Ord to imply Eq via inferring that x == y is equivalent to !(x < y) && !(y < x), which should be trivial for LLVM to optimize quite well in most cases, though most of the types implementing Ord would probably explicitly implement Eq anyways either way.

This is explicitly not true for floats. They will not generate the same instructions. So I don't think we can change Ord to just imply that in the worst case, but I am maybe missing some part of what you mean here.

Kevin Gillette (Mar 17 2022 at 05:19):

Richard Feldman said:

those all give different (as in actually unequal) answers for some input combinations where the only difference is 0 vs -0

How much of that is historical C/pre-C computing lineage? How much of it is "intrinsic, valuable mathematical law?" I'm hesitant to accept that we must accept the precise existing edge case behaviors as-is for atan2, etc (or even give them those same names). This could also be a RawFloat vs F32/F64 distinction. I suspect any instructions we'd be saving by letting -0 have distinct behavior (and for which Roc probably wouldn't let you construct -0 via any literal) are instructions that, with code sufficiently robust and aware of ieee754 idiosyncrasies, would just be reinvested back into explicit -0 normalization checks anyways, except in certain performance-critical use-cases (which would use RawFloat).

Kevin Gillette (Mar 17 2022 at 05:23):

Brendan Hansknecht said:

In that worst case, we could change Ord to imply Eq via inferring that x == y is equivalent to !(x < y) && !(y < x), which should be trivial for LLVM to optimize quite well in most cases, though most of the types implementing Ord would probably explicitly implement Eq anyways either way.

This is explicitly not true for floats. They will not generate the same instructions. So I don't think we can change Ord to just imply that in the worst case, but I am maybe missing some part of what you mean here.

I'm still dreaming of Roc's core floats precluding at least NaN. At that point, I believe the property would hold for floats as well, no (since, iiuc, infinity is equal to "itself," as is negative infinity equal to "itself" (or at least it's universally less controversial to make that leap compared to making NaN equal to itself).

Brendan Hansknecht (Mar 17 2022 at 05:24):

Ah, I see. In that case, I think yes.

Brendan Hansknecht (Mar 17 2022 at 05:26):

I think that is definitely interesting to explore, what if our spec says that F32 and F64 have undefined behavior if they are ever NaN or -0. Then do everything in the fastest way possible assuming that is the case. Of course for operations that could return NaN or -0, we would have to make them return a result type.

Brendan Hansknecht (Mar 17 2022 at 05:26):

Though as a side note, -0 is actually arguably useful. It if the floating point additive inverse. x + -0 == x with floats. x + 0 does not always equal x. (have we mentioned that floats are horrible).

Kevin Gillette (Mar 17 2022 at 05:31):

@Richard Feldman where have we landed on NaN thus far? Is it just the domain of RawFloat? Is it sufficiently valuable in non-RawFloat to be worth its downsides (and all the deliberating about how to handle the special cases of == around NaN)?

I believe I see your earlier point about saturating to the most extreme finite being problematic in some ways for float, but am not convinced that "compounds error" more than a somewhat arbitrary-seeming boundary (from a non-expert observer) in which finite values cease to be finite, and then stay non-finite ever after (infinity could be considered "maximum error"). I do see the argument that saturating at finite, and then dividing, subtracting, or otherwise beginning to reduce towards zero again would certainly compound error, whereas float-infinity would make the error clear, though a Result with an Infinity tag could achieve that in a way that has better support from the type system.

Brendan Hansknecht (Mar 17 2022 at 05:36):

Another thought, would it be reasonable to treat floats like integers? If we overflow an addition or multiplication(aka get inifinty), we panic. Of course there would be a checked version of the op that returns an infinity tag that someone could use. Could build a safe, consistent, and reasonably fast implementation for Roc. And you could use saturated math if you actually want infinity to be valid.

Brendan Hansknecht (Mar 17 2022 at 05:37):

Still would need specific nan check on division and things like that similar to ints as well.

Kevin Gillette (Mar 17 2022 at 05:38):

Brendan Hansknecht said:

Though as a side note, -0 is actually arguably useful. It if the floating point additive inverse. x + -0 == x with floats. x + 0 does not always equal x. (have we mentioned that floats are horrible).

Another reason, in my mind, to exclude float equality than a reason to make -0 allowed, though I don't understand the specifics of the point you made. Most compilers would strength-reduce optimize x + 0 and x + -0 into just x, right (assuming the language holds them to be equivalent to x)? Thus to achieve that oddity, you'd need to arrange for it to happen at runtime rather than directly using the exact expressions you listed. Also, since most languages do not permit -0 literals (they just get rewritten at compile time into the bit pattern for "normal" 0), you'd need to construct -0 at runtime also by some means (such as low-level sign manipulation).

Even then, iiuc, -0 isn't mathematically distinct from 0, and thus I presume the utility of -0 is in some kind of in-band signalling or sentinel value approach, which is pretty error prone (whereas tags are the general FP answer, moving the signalling out-of-band, such that the distinction is visible to the type system).

Kevin Gillette (Mar 17 2022 at 05:41):

Regarding infinity and NaN, does Dec have these concepts? What does Dec do when you overflow it or produce values that are non mathematically defined? RawFloat aside (since by definition that's a special case type for performing operations that may have special values), if Dec doesn't have infinity or NaN, and since Dec and F32/F64 are all Frac, should we not have them all have equivalent behaviors and properties (merely just having different notions of precision)?

Brendan Hansknecht (Mar 17 2022 at 05:42):

I honestly don't know all of the details of -0, but someone who deals with floats and quantization and all the fun numerics of ML told me that using 0 over -0 as the additive inverse can have notable accuracy impacts on some ML models. So, I just know that it can matter.

Brendan Hansknecht (Mar 17 2022 at 05:45):

does Dec have these concepts?

What does Dec do when you overflow it or produce values that are non mathematically defined?

I think wrapping overflow currently because no one has added the code to make it do otherwise. I think it should panic since internally it is just an I128 and those panic.

should we not have them all have equivalent behaviors and properties (merely just having different notions of precision)?

I don't think so. Dec is much more sound. It really isn't fractional at all. It is just an I128 that happens to have a unit of 10^-18

Kevin Gillette (Mar 17 2022 at 06:04):

Brendan Hansknecht said:

should we not have them all have equivalent behaviors and properties (merely just having different notions of precision)?

I don't think so. Dec is much more sound. It really isn't fractional at all. It is just an I128 that happens to have a unit of 10^-18

Isn't that an implementation detail though? We could consider F32 and F64 to be really buggy fixed-point number implementations also (i.e. a unit of 2^-1022). You could implement Dec with at a uniform minimum of 10^-18 precision throughout its supported value range, with a ieee754 strategy, iiuc, if you invested enough bits (and added some mandatory rounding rules), not that you'd want to solve it that way.

Brendan Hansknecht (Mar 17 2022 at 06:07):

Dec is base 10 and it is fixed point. I think those both make it much more intuitive and usable than either F32 or F64. Sure the specifics of 10^-18 are an implementation detail, but with dec, I know that base 10 math will work until I have too many decimal places. 0.1+0.9 will for sure equal 1 and things of that nature. That is not the case with floats.

Kevin Gillette (Mar 17 2022 at 06:08):

By that I mean that we probably don't want people to actually, day to day, think of Dec as "just an integer semantically scaled down." We'd rather have them think of it as a fractional number that is slower, yes, but has much more reliable precision and representational semantics, and is a better fit for many use cases compared to F32/F64.

Brendan Hansknecht (Mar 17 2022 at 06:09):

For sure

Brendan Hansknecht (Mar 17 2022 at 06:09):

I just think that due to its unique properties we probably don't want to box it up and restrict it like we do F32 and F64.

Brendan Hansknecht (Mar 17 2022 at 06:10):

For example, I think it is reasonable for Dec to support Eq. Especially since it is specifically made to work with things like monetary values. I should be able to say 1.50 dollars == 1.50 dollars with a Dec.

Kevin Gillette (Mar 17 2022 at 06:11):

Brendan Hansknecht said:

I just think that due to its unique properties we probably don't want to box it up and restrict it like we do F32 and F64.

Yeah, I meant more "if F32/F64 have NaN and infinities, doesn't that mean we should give Dec NaN and infinities too," to which we'd probably answer "no, we don't want that," to which I'd reply "so why do we actually want F32/F64 to have those special values too?"

Kevin Gillette (Mar 17 2022 at 06:15):

Yeah, for Dec, equality is clearly free of the same caveats that floats have. Python, for example, actually does heuristic rounding of numbers like 10.3300000000001 into 10.33 when displaying the value (even though the internal value still has the long tail minutiae), and thus comparing 10.3300000000001 to 10.29999999999999999 might display as 10.33 == 10.33 yet be false, whereas with Dec, a value is represented solely in its true semantic form (presuming you're representing it in decimal notation), thus if values look equal, then, without exception, they are equal.

Brendan Hansknecht (Mar 17 2022 at 06:16):

Ah, I see what you are getting at. We could make it so that at least by default float and double essentially acted like a Dec on these sort of issues. I definitely think that would be a reasonable approach.

Side note: I don't remember if we also wanted Dec to check for underflows, but I think we did. If we treated floats that way as well, they would underflow all the time and be pretty useless.

Brendan Hansknecht (Mar 17 2022 at 06:18):

Yeah, most smart float printing algorithms now print the shortest number of of characters version of a float that is still less than the next float and greater than the previous float. I believe that python uses one of those algorithms. So if they print the same, they should be equal, but maybe they use a worse algorthm that does more rounding and breaks that invariant.

Kevin Gillette (Mar 17 2022 at 06:19):

By underflow, do you mean go more negative than the negative most value (that's sometimes called underflow when talking about two's complement ints, iiuc), or do you [probably] mean dividing (or multiplying by a fraction) enough that the result can't be precisely represented? If the precision-loss case, I believe that's reasonable to panic or return a failed result, depending, just as would happen for exceeding the largest representable value.

Brendan Hansknecht (Mar 17 2022 at 06:20):

yeah, precision loss

Richard Feldman (Mar 17 2022 at 20:06):

Brendan Hansknecht said:

x + -0 == x with floats. x + 0 does not always equal x.

whoa, that's news to me! Do you happen to have any links on that? Search results seem unhelpful so far. :sweat_smile:

Richard Feldman (Mar 17 2022 at 20:08):

regarding some earlier questions: one important guideline I want to follow for Roc's numeric builtins is that Roc's performance ceiling should be as high as C's except where that would require UB or memory unsafety

Richard Feldman (Mar 17 2022 at 20:09):

so for example, division by integers is undefined behavior in C. I don't want any UB in Roc, so we take the hit of always doing an extra instruction there, to have it panic on division by 0.

Richard Feldman (Mar 17 2022 at 20:09):

but I don't like the idea of the design being "there is no way in Roc to do division on floating-point numbers without it being slower than what the hardware is capable of"

Richard Feldman (Mar 17 2022 at 20:11):

one big reason for this is the ecosystem. A really important part of Roc's design is that the only way to run code that's less safe than Roc code (e.g. C, C++, Zig, or Rust code) is via a platform's host

Richard Feldman (Mar 17 2022 at 20:11):

and the only way to do it there is via asynchronous effects, which are a really inconvenient way to do math

Richard Feldman (Mar 17 2022 at 20:12):

so if Roc is slower at math than what's possible in C, we don't have the escape hatch that other languages have (e.g. Python) of "it's fine, just call C synchronously"

Richard Feldman (Mar 17 2022 at 20:12):

therefore it's important that arithmetic performance not be a limitation!

Richard Feldman (Mar 17 2022 at 20:14):

this has an important implication: if Roc is going to support (regardless of whether it's RawFloat or Float or whatever) the maximum performance ceiling, it means there will be NaNs in Roc somewhere. That is not avoidable.

Richard Feldman (Mar 17 2022 at 20:15):

so I don't think we can have a design where we always rule out NaNs and don't have to think about them

Richard Feldman (Mar 17 2022 at 20:15):

we have to deal with them

Richard Feldman (Mar 17 2022 at 20:22):

so then there's the question of "should we have Float and RawFloat or just Float?"

Richard Feldman (Mar 17 2022 at 20:23):

I think if the general guidance is "use Dec over float unless you need float for performance," then it's weird to have the guidance also be "use Float over RawFloat unless you need RawFloat for performance"

Richard Feldman (Mar 17 2022 at 20:24):

like the use case for "faster than Dec but not as fast as possible, but more reliable than the fastest-possible thing, but still less reliable than Dec"

Richard Feldman (Mar 17 2022 at 20:24):

seems very limited :sweat_smile:

Richard Feldman (Mar 17 2022 at 20:25):

to me it seems simpler to be like "use Dec unless it doesn't meet your performance needs, in which case use Float - which is as fast as the CPU will let you go - and prepare to deal with imprecision, NaN, etc."

Richard Feldman (Mar 17 2022 at 20:25):

so, big-picture, that's what I'm thinking :point_up:

Brendan Hansknecht (Mar 17 2022 at 21:07):

Makes sense overall, but I think it still makes sense to expose safe floats. The main reason is that I think it will be common to want the speed improvement but not want the foot gun. I would need to look back at the Dec speed numbers, but I feel that it will be the case that safe Float is 90% the speed of RawFloat while being close to as safe as Dec except for precision loss.

Richard Feldman (Mar 17 2022 at 22:03):

two potential ways we could go about that:

Have a builtin safe float
Let it be implemented in userspace as a separate package, since you can implement custom numeric types via Abilities

thoughts on those?

Brendan Hansknecht (Mar 17 2022 at 22:11):

I think that should be pick on what we believe the default should be. Fast with gotchas or Safe with escape hatches for speed.

Also, I would assume any safe float package implemented in user space would be super niche and not used. Those libraries could be made in any language, but you don't seem them. Instead float code generally works until it has a bad input and triggers something crazy and is broken.

Richard Feldman (Mar 17 2022 at 22:41):

while being close to as safe as Dec except for precision loss.

so something I've talked about with Ayaz is wanting to have a compiler warning for division by 0

Richard Feldman (Mar 17 2022 at 22:43):

where the basic rule would be that if you're dividing by a non-constant, then you get a warning unless:

you did expect thatDenominator != 0 before the division
the division is happening inside an if thatDenominator != 0 then (or an else with the if checking to see that it is 0)

and then have something similar for sqrt and negative numbers

Richard Feldman (Mar 17 2022 at 22:43):

at that point, it seems like floats in general would be "close to as safe as Dec without precision loss," yeah?

Richard Feldman (Mar 17 2022 at 22:44):

because sqrt and 0/0 are the only ways to get NaN I'm aware of

Richard Feldman (Mar 17 2022 at 22:45):

the expect would tell you in either development builds or tests if it was ever wrong, and the conditional would rule it out at runtime even in optimized builds

Richard Feldman (Mar 17 2022 at 22:47):

you could still get a NaN from the platform or from a Roc package, but if that ever happens it's almost certainly going to be due to a bug in the platform or package, and this would only be one of lots of ways bugs in third-party code could lead to bugs in your program

Richard Feldman (Mar 17 2022 at 23:15):

another related thought is that we could have some debug tooling for letting you debug where a NaN, Infinity, or -Infinity came from. Specifically what we could do is something like expect, where in debug builds (in this case only ones being run through the editor) we add an extra instruction to have every float builtin function check for NaN before returning; if it encounters one, it lets the editor know (e.g. by pushing the current stack trace onto a global List, similar to what expect does, which the editor could then traverse later to report these)

Richard Feldman (Mar 17 2022 at 23:16):

I talked with Mason Remaley (full-time indie game dev; he made Way of Rhea) and he does a ton of stuff with floats

Richard Feldman (Mar 17 2022 at 23:17):

one of the things he mentioned is that whenever he encounters a NaN, the time-consuming thing is tracking down where it came from, and it usually involves going around and littering a bunch of "if NaN here then report it" checks around his code base

Richard Feldman (Mar 17 2022 at 23:18):

but if we were to make that process quick and easy, especially considering the replay features we already want to do for debugging in the editor...how bad would it be if the occasional NaN came up?

Richard Feldman (Mar 17 2022 at 23:19):

basically here I'm trying to answer the question "if we made the experience as good as possible when you're going for Maximum Speed," how much demand would really remain for there to be a whole separate API for Almost Maximum speed?

Richard Feldman (Mar 18 2022 at 00:40):

hmmmm, just had a new realization that's cause for revisiting an earlier design!

Richard Feldman (Mar 18 2022 at 00:40):

we established earlier that the check for "are these two floats equal, while treating NaN as equal to itself?" actually runs faster than the typical "are these two floats equal, while treating NaN as unequal to itself?" check

Richard Feldman (Mar 18 2022 at 00:41):

I just tested it out, and it turns out there are also single-instruction comparison ops for <= and >= which treat NaN as equal to itself

Richard Feldman (Mar 18 2022 at 00:42):

the only catch is that they give the opposite answer when NaN is compared with a non-NaN

Richard Feldman (Mar 18 2022 at 00:43):

e.g. traditionally x <= NaN and x >= NaN both return false, but with this instruction they both return true instead

Richard Feldman (Mar 18 2022 at 00:45):

hm, but that would be weird considering x < NaN and x > NaN would still return false, and the only way to get them to return true in a single instruction would be to use the instruction that flips the truth on those as well, which would mean NaN < NaN would return true - which clearly isn't desirable

Richard Feldman (Mar 18 2022 at 01:00):

so I guess the idea would be:

NaN is equal to itself
all comparisons compile to single CPU instructions, so they are naturally as fast as possible (and in fact float equality is even faster than in most languages, strangely, which we might also see if we benchmark the other comparisons)
Comparing things to NaN is even weirder than in most languages, because although it's still the case that both x < NaN is false and x > NaN is false (even when x != NaN), which is contradictory, we now also have the additional contradiction of x <= NaN is true (even though x < NaN is false), and same with >=

Richard Feldman (Mar 18 2022 at 01:09):

so with that in mind, maybe the question of whether to disallow equality on floats really comes down to "do we want to disallow it for precision reasons alone?"

Richard Feldman (Mar 18 2022 at 01:09):

I could see arguments either way!

Richard Feldman (Mar 18 2022 at 01:27):

hmmmm, something else I just realized: compare : Float a, Float a -> Order might not be enough to sort maximally fast

Richard Feldman (Mar 18 2022 at 01:28):

we might actually need equivalents of <, <=, and ==

Richard Feldman (Mar 18 2022 at 01:28):

for the sorting algorithm to be as fast as possible

Richard Feldman (Mar 18 2022 at 01:32):

if that's true, then means one of these three things must be true:

floats don't support Ordering, or
floats support Equating, or
floats support Ordering but not Equating, and sorting is slower than it needs to be across the whole language, not just whenever we're sorting floats

Richard Feldman (Mar 18 2022 at 01:38):

actually that's probably fine, because we can cheat in the List.sort implementation :stuck_out_tongue:

Richard Feldman (Mar 18 2022 at 01:38):

and make monomorphized calls to List.sort that use floats use different operations that would give the same answer as if it were using Order, while not actually using it, so that it runs faster

Richard Feldman (Mar 18 2022 at 01:40):

if we did that, it would resolve the question of "should the default Ordering for floats sort all NaNs to the front or to the back?" nicely, because now we know about single-CPU-instruction operations that do both < and == comparisons on floats that can sort NaNs to one end or the other

Richard Feldman (Mar 18 2022 at 01:41):

which means that even if the default compare for floats isn't maximally fast, the default List.sort on floats still can be!

Kevin Gillette (Mar 18 2022 at 04:12):

Richard Feldman said:

I think if the general guidance is "use Dec over float unless you need float for performance," then it's weird to have the guidance also be "use Float over RawFloat unless you need RawFloat for performance"

That's sold me on your point. Coming from a billing systems background, I have been thinking of this in terms of "use Dec when you need precision, but use Floats otherwise," when I should be thinking of Dec as the default except when I really don't care about precision (when approximations with compounding error is fine in exchange for speed).

Given that, I don't see the benefit of RawFloat, since Float is already the "raw float" of Dec. At that point I also don't really care as much about NaN or Infinities being present either: they already can't happen with Dec, and Dec is what I would be using in most cases. It still may be nice to have checked operations available for use with Float, but which are not used by default.

Kevin Gillette (Mar 18 2022 at 04:20):

Richard Feldman said:

Comparing things to NaN is even weirder than in most languages, because although it's still the case that both x < NaN is false and x > NaN is false (even when x != NaN), which is contradictory, we now also have the additional contradiction of x <= NaN is true (even though x < NaN is false), and same with >=

iiuc, this logically implies that x is equal to NaN. Consider:

x < NaN => false
x > NaN => false

At this point, if we don't have equality defined on floats, then the above result is just that "nothing can be said about the relationship of x to NaN." iow, floats would not have a total ordering (the finite values and infinities would have a total ordering relative to each other, but NaN simply could not be ordered relative to the other possible values).

Now we further say:

x <= NaN => true
x >= NaN => true

At this point we are saying something about equality, and, the only possible answer is that every x is logically equal to NaN.

That would be pretty unusual. While technically not UB, it would take long enough to explain and reason about that it might just be easier for most Roc programmers to consider it magical and undefined.

We can side-step that by making it impossible to construct a NaN value directly (there may be an isNaN function, but no literal or constant expression which the compiler would permit to _directly_ produce NaN, thus we wouldn't need to worry about any of the above expressions actually occurring, except with dynamic values.

Brendan Hansknecht (Mar 18 2022 at 15:33):

I personally think we should just label NaN as UB. Make it such that Roc can never generate NaN and say that it is a bug if a platform passes a NaN float into Roc. Then just design everything else with that in mind.

I think we should still allow for Infinity though. Offer checked and unchecked versions of ops that might generate it.

Brendan Hansknecht (Mar 18 2022 at 15:34):

I think that we should use all of the faster comparisons that Richard mentioned, not because we are thinking about behavior and NaN, but because we are ignoring NaN and can go faster.

Brendan Hansknecht (Mar 18 2022 at 15:35):

Also, if Roc can't generate NaN no need to ever track it down on the Roc part of the code.

Brendan Hansknecht (Mar 18 2022 at 15:37):

Side note: a NaN check and jump can theoretically be as be as cheap as a panic that is never take for addition overflow....which would actually probably make it even cheaper than a overflow cause float instructions take longer to run.

Richard Feldman (Mar 18 2022 at 16:12):

Make it such that Roc can never generate NaN

so this means that Roc can never have single-instruction float division, modulo, or sqrt?

Folkert de Vries (Mar 18 2022 at 16:13):

can you use asserts to rule out the NaN case?

Richard Feldman (Mar 18 2022 at 16:14):

asserts still require extra runtime instructions

Richard Feldman (Mar 18 2022 at 16:14):

conditionals specifically

Folkert de Vries (Mar 18 2022 at 16:14):

and use those strategically so the optimizer can, in many cases, figure out that the fast division can be used

Folkert de Vries (Mar 18 2022 at 16:15):

so do bounds checks

Richard Feldman (Mar 18 2022 at 16:15):

yeah I think these are very comparable to bounds checks! :thumbs_up:

Folkert de Vries (Mar 18 2022 at 16:15):

and that is a cost we accept and hope that the optimizer removes them

Richard Feldman (Mar 18 2022 at 16:15):

right, but we do that because the alternative is memory unsafety :big_smile:

Richard Feldman (Mar 18 2022 at 16:15):

and also we do it with int division because the alternative is UB

Richard Feldman (Mar 18 2022 at 16:16):

but with float division the alternative is less severe

Richard Feldman (Mar 18 2022 at 16:16):

(potentially returning NaN)

Richard Feldman (Mar 18 2022 at 16:16):

I'm very open to the idea that maybe the answer is that we should always pay the cost btw!

Richard Feldman (Mar 18 2022 at 16:16):

I just want to acknowledge it as a serious downside

Brendan Hansknecht (Mar 18 2022 at 16:19):

Richard Feldman said:

so this means that Roc can never have single-instruction float division, modulo, or sqrt?

Do we have those for integers? I thought division had to always be checked in roc.

Richard Feldman (Mar 18 2022 at 16:23):

that's the way it's currently implemented, but I've considered that design relatively experimental and not necessarily what we'd want to do long-term

Richard Feldman (Mar 18 2022 at 16:25):

the current design rules out NaNs unless you get them from the host (which I'm generally okay with; we also have to trust pointers we get from the host will not be null. I'd say it's more likely that the host would accidentally give us a NaN than a null pointer, but then again it's also probably way less common for us to get a float from the host that the host actually knew was a float, as opposed to something we gave the host - as part of our app state - and that the host gave back to us unmodified.)

Richard Feldman (Mar 18 2022 at 16:25):

but it comes at the cost of making it be impossible to have single-instruction division, modulo, or sqrt

Richard Feldman (Mar 18 2022 at 16:26):

which on the one hand is presumably a bigger deal in some domains than others - but on the other hand, the domains that are more likely to consider them a downside (games, physics simulations) are the ones that are most likely to want to use floats for performance in the first place!

Brendan Hansknecht (Mar 18 2022 at 16:28):

I would say that at least for now we follow along with what integers are doing. We can always add an unsafe escape hatch later.

Richard Feldman (Mar 18 2022 at 16:30):

We can always add an unsafe escape hatch later.

I think if we don't think that through, it will cause all sorts of problems :grimacing:

Richard Feldman (Mar 18 2022 at 16:31):

I think it would be very easy to accidentally commit to a design that has really nasty error-prone scenarios if NaNs happen if we design everything with the assumption that NaNs will never enter the system

Richard Feldman (Mar 18 2022 at 16:31):

and everything works great until we reintroduce the possibility of NaNs later, and then everything falls apart

Richard Feldman (Mar 18 2022 at 16:32):

so I'd really rather talk through the implications of them up front, and either commit to never having them or else plan to allow them and design accordingly!

Richard Feldman (Mar 18 2022 at 16:37):

I think a reasonable question to ask is: let's say we assume NaNs can't happen, but infinity and -infinity are still allowed...do we still want to disallow float equality (which includes disallowing using them in sets or as keys of dictionaries) just because of precision concerns?

Martin Stewart (Mar 18 2022 at 16:37):

Can we just implement our own sane floats purely in software and then lobby for CPU manufacturers to add hardware acceleration? :sweat_smile:

Richard Feldman (Mar 18 2022 at 16:37):

and if we've already disallowed float equality, how troublesome even are NaNs anymore?

Richard Feldman (Mar 18 2022 at 16:38):

like they have weird properties with x < NaN and x > NaN returning the same answer...but what specific bugs would that lead to?

Richard Feldman (Mar 18 2022 at 16:47):

and are we more worried about those bugs than about the impossibility of getting single-instruction div/mod/sqrt even in the hottest of hot loops in a game?

Richard Feldman (Mar 18 2022 at 16:57):

another potential design angle we haven't talked about: there's a way you can tell the CPU "from now on, whenever a floating point error happens, like a NaN, set this flag and never unset it again until I explicitly clear it"

Richard Feldman (Mar 18 2022 at 16:57):

so we could use that to do something like: when the host calls into Roc, we set that flag, and then we do a bunch of stuff, and then right before we return to the host, we check to see if any floating point errors happened, and then panic if so

Richard Feldman (Mar 18 2022 at 16:58):

the big downside there would be that we'd have no way of knowing where the error actually happened

Richard Feldman (Mar 18 2022 at 16:58):

just "at some point, something was gonna return NaN, so we exploded after the fact with no stack trace"

Richard Feldman (Mar 18 2022 at 17:00):

there's also SIGFPE, which actually seems very promising except for the "behavior varies by hardware" part

Richard Feldman (Mar 18 2022 at 17:00):

I guess I should really look into that more

Richard Feldman (Mar 18 2022 at 17:02):

like this is on Solaris, and looks exciting: https://docs.oracle.com/cd/E88353_01/html/E37843/sigfpe-3c.html

Richard Feldman (Mar 18 2022 at 17:03):

it would basically be "host runs 1 instruction to tell the CPU to do a SIGFPE whenever a floating-point operation would overflow, underflow, or produce a NaN, then host sets up a SIGFPE handler which can be handled using the same strategy as roc_panic"

Richard Feldman (Mar 18 2022 at 17:05):

and I read somewhere else that it can also happen for integer overflow/underflow and division by 0, although I think if we wanted to rely on that, I'm not sure whether we might need to tell LLVM to treat integer division as "not UB"

Richard Feldman (Mar 18 2022 at 17:05):

for optimization purposes

Richard Feldman (Mar 18 2022 at 17:05):

but maybe it's fine if it does? :thinking:

Brendan Hansknecht (Mar 18 2022 at 17:16):

another potential design angle we haven't talked about: there's a way you can tell the CPU "from now on, whenever a floating point error happens, like a NaN, set this flag and never unset it again until I explicitly clear it"

That would remove single instruction float operations as well (need instructions to update the flag). On top of that, it would not tell you were the error was (something a panic theoretically could do)

Also means you could wait hours for a computation to run in roc and then get a panic right at the end when you thought it would be done. Instead of getting the panic early right when it happened.

Brendan Hansknecht (Mar 18 2022 at 17:16):

like they have weird properties with x < NaN and x > NaN returning the same answer...but what specific bugs would that lead to?

Cause UB with most sorting algorthms. More specifically the issue tends to be that x < NaN and NaN < x return the same answer.

Brendan Hansknecht (Mar 18 2022 at 17:17):

WIth SIGFPE -> "or an inexact computation"

That sounds like it would happen all the time.

Brendan Hansknecht (Mar 18 2022 at 17:19):

For example, anything to do with 1/3 would almost always cause inexact computation.

Kevin Gillette (Mar 18 2022 at 17:27):

so this means that Roc can never have single-instruction float division, modulo, or sqrt?

It appears there are multiple instructions for sqrt (or maybe multiple instruction precision variants), at least on EMT64, with a wide range of clock cycle costs. Which would we use/offer? Multiple of them?

If making NaN impossible from floats generated by Roc ends up being a goal, we can amortize the cost pretty heavily, for example if there is a long chain of float operations which can produce NaN anywhere within it, and since NaN propagates, we can just do a NaN check at the end and have a panic with an imprecise location ("something in lines 10 to 35 produced a NaN"). Given that tail call optimization already produces imprecise traces, this doesn't seem too bad (you don't if it was the first or millionth iteration that had the issue).

Since Roc is pure, when a NaN is detected at the end, you can also reapply the span of possibly NaN'ing operations to whatever was in the stack frame or a register copy, but this time with NaN checks after each suspect operation to find the precise panic point. That replay is not fast, but the initial computation was, and since you're about to crash Roc anyways, speed no longer matters.

Alternatively you could have the panic be defined as occurring on the first data read in a non-arithmetic context (when doing a comparison to find the lesser of two floats, or stringifying, or converting to a Dec). If you do an explicit isNaN check, you can prevent the panic, but otherwise the panic will occur.

Brendan Hansknecht (Mar 18 2022 at 17:30):

An extra note, apparently SIGFPE doesn't catch floating point division by zero. It only catches integer division by zero.

Brendan Hansknecht (Mar 18 2022 at 17:35):

If making NaN impossible from floats generated by Roc ends up being a goal, we can amortize the cost pretty heavily, for example if there is a long chain of float operations which can produce NaN anywhere within it, and since NaN propagates, we can just do a NaN check at the end and have a panic with an imprecise location ("something in lines 10 to 35 produced a NaN"). Given that tail call optimization already produces imprecise traces, this doesn't seem too bad (you don't if it was the first or millionth iteration that had the issue).

I totally agree with the general sentiment, but only a few ops can generate NaN I think it would be better to just check at those locations and never check elsewhere. Then most float operations and comparisons don't have to be slowed down at all. Of course we can still optimize a chain of NaN checking ops into a chain of ops followed by a single NaN check.

Brendan Hansknecht (Mar 18 2022 at 17:41):

~~Haha, I just realized something. Float division can't generate NaN without a NaN input. It generates Inf or -Inf when dividing by 0.~~

Brian Carroll (Mar 18 2022 at 17:43):

What about 0/0?

Brian Carroll (Mar 18 2022 at 17:45):

I just tried this in Rust and it printed "NaN"

fn main() {
    println!("0/0 = {}", 0.0/0.0);
}

Brendan Hansknecht (Mar 18 2022 at 17:46):

ah. missed that.

Brendan Hansknecht (Mar 18 2022 at 17:47):

So only 0.0 and -0.0 as both numerator and denominator can generate NaN with float divisions.

Kevin Gillette (Mar 18 2022 at 18:02):

I think it would be better to just check at those locations and never check elsewhere.

@Brendan Hansknecht I'm assuming we'd never NaN-check runs of instructions that don't involve any NaN-producing instructions.

Brendan Hansknecht (Mar 18 2022 at 18:06):

Sure, but you want to check at the use once we reach non-math operations. That is what I mean by elsewere. That could be in a totally different function. On top of that, it may lead to needing to check in many more locations rather than just at the source.

Would require essentially having an internal maybe nan type and dealing with propogation and checking at many sites. I think this would be a lot messier than you imagine it. It gets a lot worse due to not knowing what will or won't be inlined.

Kevin Gillette (Mar 18 2022 at 18:06):

And an imprecise panic is probably good enough if the panic can come with info you can put right into the Roc editor/debugger to reproduce, and thus the developer tooling can refine an imprecise panic into something exact.

Kevin Gillette (Mar 18 2022 at 18:09):

I think it's problematic for functions to check their inputs for NaN, because then every float accepting function would need to do so, and the panic tells you little. I believe it'd be better for functions that return or could pass NaN to a subcall to do the NaN check at the call/return boundary

Kevin Gillette (Mar 18 2022 at 18:10):

And even then, I'm just talking about loose semantics, not what the compiler will generate

Brendan Hansknecht (Mar 18 2022 at 18:10):

I think it's problematic for functions to check their inputs for NaN

Why would they do that?

Richard Feldman (Mar 18 2022 at 18:11):

That would remove single instruction float operations as well (need instructions to update the flag)

for what it's worth, I think you can set the flag once at the start of the program (e.g. in the host) and that's it - I don't think it's 1 instruction per float op, I think it's one instruction ever

Richard Feldman (Mar 18 2022 at 18:12):

but I also don't like that design direction anyway :big_smile:

Brendan Hansknecht (Mar 18 2022 at 18:14):

I think you can set the flag once at the start of the program

Yeah, I didn't realize you were talking about SIGFPE when you said that. I thought you meant tracking whether or not we hit NaN with a global flag variable.

On that note, can the roc compiler set an interrupt at all? That sounds like it would only be allowed by the platform. The interupt definition would be os/platform specific

Richard Feldman (Mar 18 2022 at 18:17):

Brendan Hansknecht said:

WIth SIGFPE -> "or an inexact computation"

That sounds like it would happen all the time.

it looks like you can specify which ones you want it to signal on: https://linux.die.net/man/3/feenableexcept

Richard Feldman (Mar 18 2022 at 18:22):

Brendan Hansknecht said:

On that note, can the roc compiler set an interrupt at all? That sounds like it would only be allowed by the platform. The interupt definition would be os/platform specific

yeah I think hosts would have to set it

Richard Feldman (Mar 18 2022 at 18:22):

bc they need to enable it

Brendan Hansknecht (Mar 18 2022 at 18:25):

So if we went with the SIGFPE route, every host would be expected to decide how/if roc handles 0/0, sqrt(-1), etc?

Richard Feldman (Mar 18 2022 at 18:26):

yeah in the same way that they specify how panic is handled

Richard Feldman (Mar 18 2022 at 18:26):

so the requirement would be "the handler must be set, and it must halt execution"

Richard Feldman (Mar 18 2022 at 18:26):

just like roc_panic

Richard Feldman (Mar 18 2022 at 18:27):

so from the app author's perspective it's indistinguishable from a panic, since the host decides what to do in both cases

Brendan Hansknecht (Mar 18 2022 at 18:27):

Sure, but panic is different. We call panic. We don't call an interrupt.

Brendan Hansknecht (Mar 18 2022 at 18:28):

Like with panic, I don't think returning would work. you would be returning to garbage code between functions and probably crash. With an interrupt that is not the case.

Richard Feldman (Mar 18 2022 at 18:29):

sure

Richard Feldman (Mar 18 2022 at 18:29):

is the concern that host authors wouldn't adhere to that specification?

Richard Feldman (Mar 18 2022 at 18:29):

and would do something like "div by 0 silently returns 0" instead?

Brendan Hansknecht (Mar 18 2022 at 18:30):

I think plenty of host authors would just not add the interrupt at all and have "normal" floats. At that point, there is no point in making roc care about floats in a special way at all.

Brendan Hansknecht (Mar 18 2022 at 18:31):

They would be exactly as unsafe as any other language

Richard Feldman (Mar 18 2022 at 18:31):

in debug builds we could test for the handler being set

Richard Feldman (Mar 18 2022 at 18:31):

like whenever the host calls main we could have it start with a check for that

Brendan Hansknecht (Mar 18 2022 at 18:32):

How? If the interrupt is os/platform specific? That was the point of having the host enable it in the first place

Richard Feldman (Mar 18 2022 at 18:43):

hm, true

Richard Feldman (Mar 18 2022 at 18:45):

one idea is that we could make it opt-in, and if you don't opt into it, then we insert panics by default

Richard Feldman (Mar 18 2022 at 18:46):

so there'd be some way in the platform module that you'd specify "I will set up my own handling of floating point and integer errors" and if you don't opt into that, then we generate conditionals and panics

Brendan Hansknecht (Mar 18 2022 at 18:46):

Oh, maybe a worse problem. If the host ever generates a NaN, it would also call the interrupt handler. So they may need to toggle it with every call into Roc. If the interrupt can't be set in a thread/co-routine/fiber/etc specific way, that may lead to issues with multithreading and NaN in the host.

Richard Feldman (Mar 18 2022 at 18:50):

Brendan Hansknecht said:

Oh, maybe a worse problem. If the host ever generates a NaN, it would also call the interrupt handler. So they may need to toggle it with every call into Roc. If the interrupt can't be set in a thread/co-routine/fiber/etc specific way, that may lead to issues with multithreading and NaN in the host.

I think that's true, yeah - e.g. https://stackoverflow.com/questions/63095616/feenableexcept-not-permanent

Richard Feldman (Mar 18 2022 at 18:51):

http://www.tin.org/bin/man.cgi?section=3&topic=feenableexcept says all of these operations are thread-safe (under "attributes")

Brendan Hansknecht (Mar 18 2022 at 18:54):

Yes, but co-routines and fibers share a single thread pool. They can get swapped out at random times and replaced by another one. This means that even if it is thread safe you will run into problems.

Richard Feldman (Mar 18 2022 at 18:54):

also on Windows - https://docs.microsoft.com/en-us/windows/win32/dxtecharts/top-issues-for-windows-titles?redirectedfrom=MSDN#manipulation-of-the-floating-point-control-word mentions that it's thread-safe, while also noting that Direct3D will set its own settings until you tell it not to

Richard Feldman (Mar 18 2022 at 18:55):

Brendan Hansknecht said:

Yes, but co-routines and fibers share a single thread pool. They can get swapped out at random times and replaced by another one. This means that even if it is thread safe you will run into problems.

would that be true in Roc though? like the host would be in charge of knowing when that happened

Richard Feldman (Mar 18 2022 at 18:55):

and it would know exactly when it starts running a single-threaded Roc function vs not

Brendan Hansknecht (Mar 18 2022 at 18:56):

Currently all Roc functions are thread safe to my knowledge

Richard Feldman (Mar 18 2022 at 18:56):

they actually aren't because refcounts aren't atomic

Richard Feldman (Mar 18 2022 at 18:56):

that's something I definitely think should be platform-configured in the future

Richard Feldman (Mar 18 2022 at 18:56):

bc they're slower if they have to use the atomic operations (e.g. Arc vs Rc in Rust)

Richard Feldman (Mar 18 2022 at 18:57):

so if you know the Roc code will only ever be running on a single thread, it will run faster if we generate non-atomic refcount operations only

Brendan Hansknecht (Mar 18 2022 at 18:57):

Ah, fair, but I think this is still different. That is about making sure the data you hand to roc isn't be accessed by multiple threads. This is about not calling a function that might trigger SIGFPE while a roc function is potentially suspended on the same thread.

Richard Feldman (Mar 18 2022 at 18:58):

:thinking: would that be a problem though?

Richard Feldman (Mar 18 2022 at 18:59):

like I can suspend the Roc function, change the float settings back to "don't interrupt for NaN", and then change them back before resuming a Roc function

Brendan Hansknecht (Mar 18 2022 at 19:00):

My example situation is a game that uses fibers for multiprocessing. It calls all roc functions with data that they know has been safely loaded to the thread (so no roc refcount issues). It enables SIGFPE and calls into roc. While roc is running, the fiber is context switched and a new c++ fiber is running. The new fiber triggers SIGFPE even though it thought it was disabled.

Brendan Hansknecht (Mar 18 2022 at 19:00):

The other fiber would have no idea that a roc function was even running

Brendan Hansknecht (Mar 18 2022 at 19:01):

The fiber engine is just part of a library and doesn't concern itself with things like SIGFPE That would just increase context switch time (I think this is the actual case in most fiber libraries).

Richard Feldman (Mar 18 2022 at 19:01):

sure

Richard Feldman (Mar 18 2022 at 19:01):

I guess that's an argument for making this be platform-configured

Richard Feldman (Mar 18 2022 at 19:02):

like maybe the game platform actually wants to enforce that NaNs never come up, and is ok with panicking!

Richard Feldman (Mar 18 2022 at 19:03):

since games are an example of a use case where it would be desirable to have the fastest possible math operations in Roc

Brendan Hansknecht (Mar 18 2022 at 19:04):

like maybe the game platform actually wants to enforce that NaNs never come up, and is ok with panicking!

I would guess most actual games would just assume NaN would never come up and would not want panics either.

Richard Feldman (Mar 18 2022 at 19:04):

but if that's what they want, then the signal should be fine, yeah?

Richard Feldman (Mar 18 2022 at 19:04):

like they pay one instruction at the beginning of the program, and then nothing else ever happens

Brendan Hansknecht (Mar 18 2022 at 19:05):

No, they don't want the signal either. They want neither

Richard Feldman (Mar 18 2022 at 19:05):

they want actual NaN?

Richard Feldman (Mar 18 2022 at 19:05):

and integer division by 0 to return 0? (since that's UB otherwise)

Richard Feldman (Mar 18 2022 at 19:07):

interestingly, it looks like doing things this way actually enables more LLVM optimizations than doing a conditional panic!

from the LLVM LangRef:

The number and order of floating-point exceptions is NOT guaranteed. For example, a series of FP operations that each may raise exceptions may be vectorized into a single instruction that raises each unique exception a single time.

Brendan Hansknecht (Mar 18 2022 at 19:07):

Not necessarily, but for many values NaN may just cause something silly for a frame and things will move on.

Richard Feldman (Mar 18 2022 at 19:07):

that's very true - that is something Mason mentioned

Richard Feldman (Mar 18 2022 at 19:07):

things looking weird in the game can be more desirable than having the game halt

Brendan Hansknecht (Mar 18 2022 at 19:09):

But yeah, if the NaN gets into important game state, the game will probably eventually break in some way if not crash.

Richard Feldman (Mar 18 2022 at 19:09):

also interesting: LLVM can do additional optimizations if you tell it to assume there won't be NaNs:

nnan No NaNs - Allow optimizations to assume the arguments and result are not NaN. If an argument is a nan, or the result would be a nan, it produces a poison value instead.

Richard Feldman (Mar 18 2022 at 19:09):

same thing with ninf for no infinities

Brendan Hansknecht (Mar 18 2022 at 19:15):

Those seem useful depending on what we end up with

Richard Feldman (Mar 18 2022 at 19:16):

agreed!

Brendan Hansknecht (Mar 18 2022 at 19:16):

Another minor note with: SIGFPE. IIUC, it will be slow to toggle (kernel call). So it will require Roc functions to do more work for them to be worth calling into if it is toggled with every call.

Richard Feldman (Mar 18 2022 at 19:17):

fair point!

Richard Feldman (Mar 18 2022 at 19:17):

honestly I think most hosts would just set it and then ignore it forever

Richard Feldman (Mar 18 2022 at 19:17):

like they'd just assume it would never come up in the host code

Richard Feldman (Mar 18 2022 at 19:20):

I have no idea how safe of an assumption that is for games

Richard Feldman (Mar 18 2022 at 19:21):

e.g. might non-Roc code outside the GPU (shaders etc) do those things? No idea!

Richard Feldman (Mar 18 2022 at 19:21):

one interesting thing about this is that if we try it out as the default, we can get a lot of signal pretty quickly on whether it's a problem in practice

Brendan Hansknecht (Mar 18 2022 at 19:24):

honestly I think most hosts would just set it and then ignore it forever

Yeah, it hopefully will work fine that way, but it does fill the default of what every programmer expects in their language.

Richard Feldman (Mar 18 2022 at 19:24):

right

Richard Feldman (Mar 18 2022 at 19:25):

to be honest, just thinking about the types of things hosts will be doing, it seems kinda unlikely to me that they'd be writing code which would rely on that behavior being different

Richard Feldman (Mar 18 2022 at 19:25):

unless they're using libraries which do, of course

Richard Feldman (Mar 18 2022 at 19:26):

but I don't know what types of libraries might be doing that - e.g. I kinda hope a physics library or something like that is not producing NaNs or inf/-inf internally and then recovering from it? :sweat_smile:

Brendan Hansknecht (Mar 18 2022 at 19:26):

idk. I feel like it might be semi-common in certain code areas to run a bunch of ops and then check for nan.

Richard Feldman (Mar 18 2022 at 19:26):

sure, but they'd be expecting it wouldn't be NaN, right?

Richard Feldman (Mar 18 2022 at 19:26):

and the check would be defensive

Brendan Hansknecht (Mar 18 2022 at 19:26):

Oh, also, inf and -inf are definitely more common in random libraries.

Richard Feldman (Mar 18 2022 at 19:27):

yeah I've heard of people (e.g. game devs) intentionally making use of inf and -inf

Brendan Hansknecht (Mar 18 2022 at 19:27):

So those are probably more likely to be problematic than nan

Richard Feldman (Mar 18 2022 at 19:28):

I wonder what would happen if we went all the way to essentially enabling -ffast-math

Brendan Hansknecht (Mar 18 2022 at 19:28):

we would lose referential transparency across multiple compilations/cpus

Richard Feldman (Mar 18 2022 at 19:28):

just go all the way in on "floats are for speed, not for accuracy - and wow will they be speedy and inaccurate!"

Richard Feldman (Mar 18 2022 at 19:28):

whoa, seriously?

Richard Feldman (Mar 18 2022 at 19:29):

well that's no good

Richard Feldman (Mar 18 2022 at 19:30):

that reminds me: Mason mentioned that apparently it's a common problem in game dev that sin and cos and some other trig values give different answers depending on which libc you're compiling with

Brendan Hansknecht (Mar 18 2022 at 19:30):

I have to think about that more. We definitely at least get the case were you could change one function and then it would modify the result of another function. Cause changing one function may stop it from inlining which may stop it from reordering instructions which would lead to different results.

Richard Feldman (Mar 18 2022 at 19:31):

something about people preferring to use the software implementations than hardware ones bc they give better answers or something

Brendan Hansknecht (Mar 18 2022 at 19:31):

But yeah, I think llvm could generate faster instructions on one cpu with certain extensions that are not guaranteed to return the exact same results

Richard Feldman (Mar 18 2022 at 19:32):

yeah that makes sense

Brendan Hansknecht (Mar 18 2022 at 19:32):

Of course if we don't have float equality, it would be hard to observe the behaviour.

Richard Feldman (Mar 18 2022 at 19:32):

well, a use case Mason brought up is saved replays in games

Richard Feldman (Mar 18 2022 at 19:32):

where you want to save all the user inputs and then replay them on a different machine to watch the game

Richard Feldman (Mar 18 2022 at 19:33):

for that to work, everything has to be completely reproducible - otherwise one tiny difference can have a cascading effect - e.g. one hitbox detection doesn't quite pass, like it did on the original machine, and from then on you're watching a totally different replay :sweat_smile:

Brendan Hansknecht (Mar 18 2022 at 19:33):

something about people preferring to use the software implementations than hardware ones bc they give better answers or something

Yeah, they are not very fast and not very accurate. I feel like I read an article before of a software based implementation that was essentially the same speed and more accurate at the cost of more memory.

Richard Feldman (Mar 18 2022 at 19:34):

yeah I think we should draw the line at "same code always gives the same results" if possible

Richard Feldman (Mar 18 2022 at 19:34):

I actually don't know how possible that is, to be fair

Richard Feldman (Mar 18 2022 at 19:34):

like I think I may have heard of different CPUs giving slightly different answers for the exact same floating point ops

Richard Feldman (Mar 18 2022 at 19:35):

but I don't have anything I can point to to confirm or refute that!

Brendan Hansknecht (Mar 18 2022 at 19:35):

I mean it is easy if everyone buys the exact same computer down to the last chip and they don't have any bugs.

Brendan Hansknecht (Mar 18 2022 at 19:36):

CPUs of the same model even, definitely will have bugs due to the creation process. Bad cpus that didn't get caught or that have results that are close enough to correct.

Brendan Hansknecht (Mar 18 2022 at 19:38):

It is rare, but it definitely happens. Including the super fun case of one cpu core having broken encryption hardware such that it can encrypt and decrypt it's own data, but other cores on the same cpu would fail to decrypt data from that core.

Richard Feldman (Mar 18 2022 at 19:47):

yiiikes

Richard Feldman (Mar 18 2022 at 19:48):

well so maybe that's not a guarantee we can (or perhaps should) even rely on anyway

Richard Feldman (Mar 18 2022 at 19:48):

if it's just asking for trouble to tell people they can count on it

Richard Feldman (Mar 18 2022 at 19:48):

"trouble" being bugs in this case

Brendan Hansknecht (Mar 18 2022 at 19:48):

I think in practice it is a reasonable guarentee, but at scale it does have exceptions.

Richard Feldman (Mar 18 2022 at 19:49):

well for something like replay it could matetr

Richard Feldman (Mar 18 2022 at 19:49):

*matter

Brendan Hansknecht (Mar 18 2022 at 19:49):

These type of bugs take mega corporations to notice.

Richard Feldman (Mar 18 2022 at 19:49):

like I wouldn't want to tell people "build your game in Roc and saved replays will Just Work!" and then they occasionally don't :grimacing:

Brendan Hansknecht (Mar 18 2022 at 19:51):

It would only happen when share a replay between multiple computers, at least one of the cpus having a bug (super rare) and the bug matter to the replay(probably also quite rare).

Richard Feldman (Mar 18 2022 at 19:51):

that's fair

Brendan Hansknecht (Mar 18 2022 at 19:51):

I don't think roc would get blamed here.

Brendan Hansknecht (Mar 18 2022 at 19:52):

It would be the hardware bug that is to blame.

Richard Feldman (Mar 18 2022 at 19:52):

yeah, I suppose "this works unless your hardware has a bug" is in general the strongest guarantee any piece of software can give about anything :laughing:

Brendan Hansknecht (Mar 18 2022 at 19:53):

So I would just assume that these type of CPU bugs don't exist for our purposes.

Brendan Hansknecht (Mar 18 2022 at 19:53):

I think it would just muddle design and aspiration too much

Richard Feldman (Mar 18 2022 at 19:54):

I'm sold!

Richard Feldman (Mar 18 2022 at 19:55):

so that suggests we should be choosy about which LLVM optimizations we enable

Richard Feldman (Mar 18 2022 at 19:55):

and stick to the ones that wouldn't give different answers on different CPUs (assuming no hardware bugs)

Brendan Hansknecht (Mar 18 2022 at 19:59):

Anyway, for the float discussion, I feel that we have touched on a ton of points. I think that it would be best to pick one of the more controversial solutions and then test it to see how it turns out in practice. I think the main candidates are:

Require the host to set SIGFPE with a handler (probably only for a specific subset of the signal? cause we don't want it to trigger on integer division by zero?) (check wasm exposes SIGFPE)
Make an explicit RawFloat for anything unsafe. Limit Float to safe things and assume it isn't NaN
Make NaN checks an explicit part of the language like division by zero with integers. Have a specific function to opt out of the NaN check for performance
Make anything that would return NaN just call panic
Other? A mix?

The later 2 if they end up being too slow to be acceptable, could be optimized to group multiple operations before doing the NaN check. I also have just talked about NaN, but this may also apply to Inf and -Inf if we want.

Kevin Gillette (Mar 18 2022 at 20:42):

@Richard Feldman a real-world game/simulation replay would likely have a "state checkpoint" concept (like video i-frames) and/or a high level annotation ("the collision occurred") anyways, to both make auditing easier at negligible extra cost (and storage _is_ cheap), as well as to allow fast "seeking" through the replay, so that you don't need to recompute the state from the very beginning just to see what happens one hour in.

It's a nice thought that we _could_ do replays that way, but I doubt that's how we'd realistically do them in practice, and having current state represented merely as the result of an unbounded sequence of float operations seems precarious in the best case (it's essentially unstructured analog computing that happens to have a paper trail).

Kevin Gillette (Mar 18 2022 at 20:45):

Even if you're debugging, you probably just want to isolate changes since some reference state, not since the beginning of the program. Bugs tend to repeat, and so there's not much difference between "since program start" and "since reference state" besides memory required and the degree of debugging step-through tedium involved

Richard Feldman (Mar 18 2022 at 20:56):

it appears that wasm doesn't support float exceptions but it does support trapping on integer overflow and division by 0

Richard Feldman (Mar 18 2022 at 20:59):

this to me actually makes an interesting case for using SIGFPE on integer division

Richard Feldman (Mar 18 2022 at 20:59):

because otherwise it is literally impossible to have single-instruction integer division or modulo without also having UB

Richard Feldman (Mar 18 2022 at 21:00):

that is the only possible way to support it, and it turns out wasm supports it already

Richard Feldman (Mar 18 2022 at 21:00):

so I kinda think we should just do that, and have division no longer return a Result

Richard Feldman (Mar 18 2022 at 21:00):

(separately, I've also gotten feedback that it would be nicer for division by 0 to panic than returning a Result, especially if we had the "check for div by 0" lint mentioned earlier)

Richard Feldman (Mar 18 2022 at 21:03):

now knowing that wasm doesn't support SIGFPE for floats makes for a tricky tradeoff.

if we use SIGFPE on non-wasm hosts, then we can get better performance than is otherwise possible, because we can safely enable LLVM optimizations that assume no NaNs and no +Inf/-Inf
however, if we rely on that invariant for API design, then when we compile to wasm we have to insert conditional panics to make the behavior match

Richard Feldman (Mar 18 2022 at 21:04):

which reduces to: either we make wasm slower at float ops or we make every other target slower at float ops

Richard Feldman (Mar 18 2022 at 21:05):

"make wasm slower" seems better to me for two reasons: one, wasm already has baseline performance overhead, and if you really need the performance to be maximally fast, you probably want to go native instead of wasm anyway; and two, it's possible that wasm might add support for SIGFPE on float exceptions in the future, since they clearly are already handling SIGFPE for integer division by 0!

Richard Feldman (Mar 18 2022 at 21:06):

so to this question:

Brendan Hansknecht said:

Anyway, for the float discussion, I feel that we have touched on a ton of points. I think that it would be best to pick one of the more controversial solutions and then test it to see how it turns out in practice. I think the main candidates are:

Require the host to set SIGFPE with a handler (probably only for a specific subset of the signal? cause we don't want it to trigger on integer division by zero?) (check wasm exposes SIGFPE)

Make anything that would return NaN just call panic

I think we should try out those two :point_up:

that is, require SIGFPE handling in hosts, for both floats and integer division by 0, and then generate corresponding panics for wasm only

Richard Feldman (Mar 18 2022 at 21:07):

that design would give us:

maximum performance outside wasm, for both floats and integers
no error-prone NaN cases to deal with; they just do not come up

Richard Feldman (Mar 18 2022 at 21:09):

the downside would be:

host authors need to set up the SIGFPE handler on non-wasm targets
unless wasm gets support for floating-point exceptions, wasm division, modulo, and sqrt operations will always have an extra conditional check, and will panic if that check fails

Richard Feldman (Mar 18 2022 at 21:10):

also, if it turns out that requiring it to be enabled is to onerous for hosts in practice, or if it's common for people to forget to set it up, leading to UB, we can introduce a way (possibly default) to have panics get generated instead even outside wasm

Richard Feldman (Mar 18 2022 at 21:12):

one potential way we could make it as easy as possible for hosts to set up is that maybe we could make the signature of roc_panic compatible with the signal handler, such that there would be a single line of boilerplate we could give host authors to run which sets up roc_panic as the handler for all those exceptions

Richard Feldman (Mar 18 2022 at 21:13):

an exciting part about trying this out is that if it works, Roc could actually be not only less error-prone (no NaNs to worry about!) than something like C++, but roc --optimize could actually legit beat clang -O3 at certain benchmarks because we'd have the same (zero) overhead on our arithmetic, but we'd be enabling more LLVM float optimizations by default!

Richard Feldman (Mar 18 2022 at 21:14):

(obviously clang lets you opt into those too, so it's not like Roc would be actually faster, but even "faster by default" for math-heavy workloads would be pretty exciting!)

Richard Feldman (Mar 18 2022 at 21:16):

assuming we go with that strategy, the case for whether or not floats should support Equating would really come down to how worried we are about precision

Richard Feldman (Mar 18 2022 at 21:17):

I was talking to Ian MacKenzie about his use cases (lots of geometry stuff), and it came up that Num.isZero actually covers a lot of the cases you'd want to use exact float equality for anyway

Richard Feldman (Mar 18 2022 at 21:18):

he did come up with an example where you do want to test for exact float equality, which when testing serialization/deserialization code (e.g. for a protobuf implementation or something) - it's pretty important to verify that exactly what you serialized is what gets deserialized afterwards

Richard Feldman (Mar 18 2022 at 21:18):

but of course in cases like that you can always do the ol' !(x < y || x > y)

Richard Feldman (Mar 18 2022 at 21:18):

or something like that

Richard Feldman (Mar 18 2022 at 21:18):

or Num.isZero (x - y)

Kevin Gillette (Mar 18 2022 at 21:21):

And I believe the discussion about equality truly has been regarding ==. We can provide a named function for equality that is compiled into a single instruction, just making it clear that it should only be used by those that know what they're doing.

And yeah, I don't see a compelling case for floats as Dict keys or set elems. If someone really needs that, they can easily enough make their own type that implements Equating

Brendan Hansknecht (Mar 18 2022 at 21:29):

he did come up with an example where you do want to test for exact float equality, which when testing serialization/deserialization code (e.g. for a protobuf implementation or something) - it's pretty important to verify that exactly what you serialized is what gets deserialized afterwards

Could also use some sort of bitwise comparison or make an exceptions for tests.

Richard Feldman (Mar 18 2022 at 21:33):

true! Like converting both the expected and actual floats to bytes and comparing those

Brendan Hansknecht (Mar 18 2022 at 21:34):

yep

Brendan Hansknecht (Mar 18 2022 at 21:34):

Or i guess subtract them and check Num.isZero

Richard Feldman (Mar 18 2022 at 21:42):

Mason gave a good example of inf/-inf being useful in game dev:

For example an easy way to make an immovable object is to make its mass infinite
Or if I’m comparing a bunch of objects distances I’ll initialize best result to + or - infinity

Brendan Hansknecht (Mar 18 2022 at 21:46):

I think that should still work even with SIGFPE. I think you can still use infinity in sound ways.

Brendan Hansknecht (Mar 18 2022 at 21:46):

Just would need to give roc a way to generate it. Cause overflowing to generate it would trigger SIGFPE

Richard Feldman (Mar 18 2022 at 21:48):

Screenshot_20220318-174738.png

Richard Feldman (Mar 18 2022 at 21:48):

yeah I think that's reasonable

Richard Feldman (Mar 18 2022 at 21:49):

we could just have Num.infinity and Num.negInfinity

Richard Feldman (Mar 18 2022 at 22:21):

from this article:

To generate a trap, a program must change the execution state of the process using the fp_trap subroutine and enable the exception to be trapped using the fp_enable or fp_enable_all subroutine.

Changing the execution state of the program may slow performance because floating-point trapping causes the process to execute in serial mode.

anyone know what "serial mode" refers to here? 🤨

Richard Feldman (Mar 18 2022 at 22:22):

not sure if that's something specific to IBM's AIX operating system

Brendan Hansknecht (Mar 18 2022 at 22:22):

I have never heard of it before.

Richard Feldman (Mar 18 2022 at 22:25):

yeah all the other search results I can find seem to be IBM-specific

Kevin Gillette (Mar 18 2022 at 22:33):

we could just have Num.infinity and Num.negInfinity

Perhaps

Num.floatInfinity : Num -> Float
Num.floatInfinity = \sign -> ...

floatInfinity, because just infinity suggests (to me) that it's available to number types in general, including Dec, and accepting a sign because that's more convenient than branching in many cases (yet can still be used with branching if desired for code style)

Brian Carroll (Mar 19 2022 at 00:19):

~~But Dec does have infinity, and it has NaN too~~. Isn't this what were using? https://en.m.wikipedia.org/wiki/Decimal128_floating-point_format

Richard Feldman (Mar 19 2022 at 00:31):

nope! We're using a 128-bit fixed point representation

Richard Feldman (Mar 19 2022 at 00:31):

it doesn't have infinity or nan

Richard Feldman (Mar 19 2022 at 03:05):

cool, TIL every basic FP arithmetic operation can produce NaN:

-Infinity + Infinity == NaN
Infinity - Infinity == NaN
Infinity * 0 == NaN
0 / 0 == NaN

Richard Feldman (Mar 19 2022 at 03:06):

also not only sqrt of negative but also logof a negative

Richard Feldman (Mar 19 2022 at 03:06):

source: https://www.agner.org/optimize/nan_propagation.pdf

Richard Feldman (Mar 19 2022 at 03:06):

(plus I tried them, and...yup)

Richard Feldman (Mar 19 2022 at 03:22):

so I'm finding an argument to not do commit to the SIGFPE design, which is that I'm not actually sure if it's gonna be available indefinitely :grimacing:

Richard Feldman (Mar 19 2022 at 03:22):

it seems like Apple is not even bothering to support it in their libc for M1s

Richard Feldman (Mar 19 2022 at 03:23):

I've found people on the Internet asking how to do it and the only one response was an Apple framework dev saying "we don't support that for M1 yet but you can set these flags manually" and then the guy replied that he did set the flags manually and they were ignored, and no response, and that was a year ago :sweat_smile:

Richard Feldman (Mar 19 2022 at 03:23):

and I've been unable to get it working on my M1 even just using C

Richard Feldman (Mar 19 2022 at 03:25):

plus that Agner Fog article basically explains how hardware people don't want it to be a thing because it overcomplicates things in the presence of CPU parallelism, which makes me concerned that future chips might drop support for it altogether and encourage OSes to fall back on software emulation for anything relying on it

Richard Feldman (Mar 19 2022 at 03:27):

unfortunately, that same article revealing that plus, minus, times, and division all can produce NaN makes some of these alternatives less appealing:

Brendan Hansknecht said:

Anyway, for the float discussion, I feel that we have touched on a ton of points. I think that it would be best to pick one of the more controversial solutions and then test it to see how it turns out in practice. I think the main candidates are:

~~Require the host to set SIGFPE with a handler (probably only for a specific subset of the signal? cause we don't want it to trigger on integer division by zero?) (check wasm exposes SIGFPE)~~

Make an explicit RawFloat for anything unsafe. Limit Float to safe things and assume it isn't NaN

Make NaN checks an explicit part of the language like division by zero with integers. Have a specific function to opt out of the NaN check for performance

Make anything that would return NaN just call panic

Other? A mix?

The later 2 if they end up being too slow to be acceptable, could be optimized to group multiple operations before doing the NaN check. I also have just talked about NaN, but this may also apply to Inf and -Inf if we want.

Kevin Gillette (Mar 19 2022 at 03:28):

@Richard Feldman so if SGIFPE has an unclear future, would anything in https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/supporting.20NaN.2FInfinity.2F-Infinity.3F/near/275836457 work?

Richard Feldman (Mar 19 2022 at 03:34):

Kevin Gillette said:

If making NaN impossible from floats generated by Roc ends up being a goal, we can amortize the cost pretty heavily, for example if there is a long chain of float operations which can produce NaN anywhere within it, and since NaN propagates, we can just do a NaN check at the end and have a panic with an imprecise location ("something in lines 10 to 35 produced a NaN").

assuming we could have the compiler reliably detect this, I'm not sure how much of a difference it would make. In a lot of cases, the chains are pretty short but they happen inside a hot loop. It's also common for the result inside the hot loop to be setting values inside structs (in Roc's case, the example would be a List.map over some records which hold floats, and then the compiler would emit in-place mutations of those records) so if we waited to the end of the loop to detect if any NaNs happened, we'd have to iterate over the whole loop again

Richard Feldman (Mar 19 2022 at 03:35):

or, alternatively, do a check on each iteration of the loop and write to a local variable if anything is NaN - but at that point we might as well just panic if the check ever passes

Richard Feldman (Mar 19 2022 at 03:38):

in general, I still think if we're going to panic on things, we also have to offer non-panicking versions so that we aren't introducing a cap on maximum possible performance

Richard Feldman (Mar 19 2022 at 03:38):

but the SIGFPE design is the only one that allows that while ruling out NaNs

Richard Feldman (Mar 19 2022 at 03:39):

so I guess the summary is, we have to do one of two things: either

the SIGFPE design, or
do not attempt to rule out NaNs, and design things with them in mind

Richard Feldman (Mar 19 2022 at 03:41):

we definitely can't treat NaNs as UB while still allowing them to be the result of every arithmetic operation, since that would mean every floating-point arithmetic operation in Roc would become a potential source of UB

Richard Feldman (Mar 19 2022 at 03:42):

we also can't rule out infinity or -infinity altogether (again except for the SIGFPE design) because overflow causes that, and requiring that all floating-point arithmetic operations do a check-and-panic would make them all more expensive and also defeat LLVM optimizations for all of them

Richard Feldman (Mar 19 2022 at 03:43):

so it's really seeming more and more like the most viable option is:

do not attempt to rule out NaNs, and design things with them in mind

Kevin Gillette (Mar 19 2022 at 03:50):

Richard Feldman said:

It's also common for the result inside the hot loop to be setting values inside structs (in Roc's case, the example would be a List.map over some records which hold floats, and then the compiler would emit in-place mutations of those records) so if we waited to the end of the loop to detect if any NaNs happened, we'd have to iterate over the whole loop again

I see. I'm guessing this is one of those places where the list has runtime-determined size or is large enough that it can't be fully unrolled. Thus we're likely talking about memory writes rather than just register operations, and memory is slow. I know on some architectures you can treat memory reads and writes as asynchronous operations, but I don't think that's common. In any case, if so, you could start the memory write of the current element, start the memory read of the next element, and while those are happening, do the NaN check against a register.

On architectures where branch prediction is common (x86-64), the same effect might be achievable transparently: Intel execution pipelines, for example, may proceed upon the assumption that the panic branch will not execute, because it hasn't on any of the last iterations, and thus proceed with pre-fetch, making some of the post-mutation reads "free"

Brendan Hansknecht (Mar 19 2022 at 03:56):

Essentially all modern architectures should do this, though it still better to put reads as early as possible instead of after NaN checks where possible.

Kevin Gillette (Mar 19 2022 at 03:57):

Brendan Hansknecht said:

Essentially all modern architectures should do this, though it still better to put reads as early as possible instead of after NaN checks where possible.

Please elaborate on "put reads as early as possible"

Brendan Hansknecht (Mar 19 2022 at 03:58):

Richard Feldman said:

cool, TIL every basic FP arithmetic operation can produce NaN:

-Infinity + Infinity == NaN

Infinity - Infinity == NaN

Infinity * 0 == NaN

0 / 0 == NaN

This is the real killer. Especially since we have already stated that inifinity is useful and we want it.

Brendan Hansknecht (Mar 19 2022 at 04:02):

Please elaborate on "put reads as early as possible"

I mean that it is always best to put memory loading/reading instructions as early in the instruction stream as possible. Given that will at minimum take a few cycles on modern CPUs, you want the pre-fetching to start as soon as possible. So if you need to load some piece of memory into a register for your next floating point instruction, it would be best to put it before the NaN check of the previous instruction.

This all being said, in many cases it will make little to no difference because the CPU will already be looking at 4+ instructions at a time and potentially looking a few hundred instructions into the future.

Brendan Hansknecht (Mar 19 2022 at 04:04):

I was trying to point out that on essentially all modern hardware you should consider:

I know on some architectures you can treat memory reads and writes as asynchronous operations, but I don't think that's common

to be true even if it isn't represented in the instruction stream.

Brendan Hansknecht (Mar 19 2022 at 04:09):

Also, TIL apparently the apple M1 has a larger reorder buffer than most x86_64 cpus by about 2x . It has 630 slots.

Just imagine the fact that the CPU could be executing an instruction that is 630 instructions after the memory load that your program is currently stalled on.

To top off the crazy numbers, an M1 mac can have 130 loads in flight along with 107 stores. Talk about being able to do run things super far into the future.

Kevin Gillette (Mar 19 2022 at 04:13):

Richard Feldman said:

so it's really seeming more and more like the most viable option is:

do not attempt to rule out NaNs, and design things with them in mind

Having reflected upon the discussion so far, it seems like a pretty good compromise to not have RawFloat, and instead just ensure that panics happen as a result of _producing_ NaN or, perhaps, overflow to infinity. I don't believe it's important to panic immediately in either case, as long as the panic is roughly in the right place (at least all but the last stack frame should be accurate). It'd be an open question as to how Roc handles NaN given to it from the platform.

When exceptional performance is needed, perhaps unchecked arithmetic functions can be provided. Perhaps also the NaN panic implementation can be provided by the platform (with a default/reasonable implementation provided by Roc itself is the platform does not override). I don't know how well optimizations across the platform boundary work, but if the platform provides a no-op nan-check-and-panic function, then perhaps it would optimize away, thus having no runtime cost for that platform.

Brendan Hansknecht (Mar 19 2022 at 04:15):

I don't know how well optimizations across the platform boundary work

They don't exist, but theoretically could with something like LTO and a solid amount of hassle.

Richard Feldman (Mar 19 2022 at 04:18):

and instead just ensure that panics happen as a result of _producing_ NaN

When exceptional performance is needed, perhaps unchecked arithmetic functions can be provided.

putting these two together, what we're saying is:

we'll assume that NaNs don't happen
well also let you opt into letting them happen

which reduces to "we'll treat NaNs as UB" :sweat_smile:

Kevin Gillette (Mar 19 2022 at 04:22):

Brendan Hansknecht said:

Given that will at minimum take a few cycles on modern CPUs, you want the pre-fetching to start as soon as possible.

iiuc, memory reads (at least those that need to go to main memory) take a lot more than a few cycles to complete.

Brendan Hansknecht said:

I was trying to point out that on essentially all modern hardware you should consider:

I know on some architectures you can treat memory reads and writes as asynchronous operations, but I don't think that's common

to be true even if it isn't represented in the instruction stream.

...

Just imagine the fact that the CPU could be executing an instruction that is 630 instructions after the memory load that your program is currently stalled on.

This would suggest that the described common case of short sequences of float operations per memory location would mean that an extra instruction (per memory location in a list) may be negligible because of pipelining, instruction reordering, etc.

Brendan Hansknecht (Mar 19 2022 at 04:26):

iiuc, memory reads (at least those that need to go to main memory) take a lot more than a few cycles to complete.

L1 cache...probably about 4 cycles
Main memmory...100-200ish cycles

Kevin Gillette (Mar 19 2022 at 04:29):

Richard Feldman said:

well also let you opt into letting them happen

which reduces to "we'll treat NaNs as UB" :sweat_smile:

It's not UB in the C/C++ sense, where the compiler is allowed to do truly silly things based on arbitrary [mis]assumptions (like access out-of-bounds memory). The extent of this UB is that NaN would propagate, and presumably anyone opting into unchecked operations for better performance will check for NaN before they make decisions based on the result of the computation.

Brendan Hansknecht (Mar 19 2022 at 04:29):

This would suggest that the described common case of short sequences of float operations per memory location would mean that an extra instruction (per memory location in a list) may be negligible because of pipelining, instruction reordering, etc.

It does help a ton, but a lot of times chains of dependencies really stop it from actually executing very far ahead. Like the counter value of a loop for example.

On top of that, there is a large horizon issue. If you have extra instructions in the hot loop, it will take extra longer to read through the instructions in the hot loop. So there is a chance the important load instruction is just outside of the current window and is delayed by a cycle or 2. And that happens on every loop iterations

Richard Feldman (Mar 19 2022 at 04:33):

presumably anyone opting into unchecked operations for better performance will check for NaN before they make decisions based on the result of the computation.

I think this reduces to basically everyone doing game dev though!

considering addition, subtraction, multiplication, division, sqrt, and log can all produce NaN from non-NaN arguments...people would be opting into the better-performance version all the time!

Kevin Gillette (Mar 19 2022 at 04:38):

Richard Feldman said:

we definitely can't treat NaNs as UB while still allowing them to be the result of every arithmetic operation, since that would mean every floating-point arithmetic operation in Roc would become a potential source of UB

I suspect this really applies to spans of floating point operations, not individual operations. For example, given a sequence of 10 float operations that is converted to an integer and used as a list index, if a NaN is produced anywhere in the span (even in the first instruction), that really doesn't matter until the integer conversion.

Kevin Gillette (Mar 19 2022 at 04:39):

Richard Feldman said:

presumably anyone opting into unchecked operations for better performance will check for NaN before they make decisions based on the result of the computation.

I think this reduces to basically everyone doing game dev though!

considering addition, subtraction, multiplication, division, sqrt, and log can all produce NaN from non-NaN arguments...people would be opting into the better-performance version all the time!

That brings us back then to "use Dec unless you know you need Float, and float has propagating NaNs and infinities, just like most other languages."

Richard Feldman (Mar 19 2022 at 04:46):

yeah I'm thinking that design is now the most promising, after exploring SIGFPE more. :thumbs_up:

to recap the design from the end of yesterday:

Have a compiler warning when dividing by a non-constant, unless you either use a conditional or an expect to confirm you're aware of it and are confident it's not zero. (I think we should do this regardless; it benefits integers too.)
Don't have floats support Equating, which means they don't get ==, <=, or >=, and that they can't be used as keys in hashmaps.
This in turn means the NaN != NaN downside should primarily come up in practice when sorting, and it seems possible to provide default sorting for floats that takes this into account without any runtime performance cost.

with those in place, the remaining NaN "gotchas" are that x < NaN == NaN < x even when x is not NaN. This primarily could cause problems in a sorting algorithm, although most sorting algorithms on collections would likely Order, which would explicitly sort NaNs consistently.

Richard Feldman (Mar 19 2022 at 04:46):

I wish there were some way to implement < in a single instruction such that x < NaN == NaN < x was false if x wasn't NaN, but unfortunately I suspect there's no such instruction

Brendan Hansknecht (Mar 19 2022 at 05:01):

an expect to confirm you're aware of it and are confident it's not zero.

Does expect do anything at runtime? Or is it just a modification at compile time? If just compile time, we might want to change the word to assume. Most people associate expect with test failures and panicking.

Richard Feldman (Mar 19 2022 at 05:04):

expect fails tests, and will report (but not panic) an error in dev builds

Richard Feldman (Mar 19 2022 at 05:04):

it does nothing in optimized builds

Richard Feldman (Mar 19 2022 at 05:04):

assume feels like it might be a weird word to use when writing tests :thinking:

Kevin Gillette (Mar 19 2022 at 05:08):

assume sounds to me like a compiler hint: "Roc, don't check this thing for safety because you can assume it's safe"

Brendan Hansknecht (Mar 19 2022 at 05:20):

Maybe I am mis-interpreting what you mean here:

Have a compiler warning when dividing by a non-constant, unless you either use a conditional or an expect.

This sound like expect would be used in regular code to cause roc to remove a safety conditional.

Brendan Hansknecht (Mar 19 2022 at 05:23):

Sounds like I can do:

expect (x != 0)
y =  3 / x

instead of:

y = when Num.div 3 x is
    Ok v -> v
    Err DivByZero -> #idk do something with errors

Brendan Hansknecht (Mar 19 2022 at 05:25):

If you meant that you can do what I said above. That is where I am suggesting it should be:

assume (x != 0) # I know what I am doing, skip the safety
y = 3 / x

Kevin Gillette (Mar 19 2022 at 13:14):

That kind of approach would be intriguing UB though, since that last example means "if x is zero, you'd normally panic because it's my fault, but in this case don't panic because it won't happen, and if it does actually happen, it's doubly my fault."

It's certainly a nicer way to introduce UB than the classic C/C++ way of the compiler implicitly making that assumption when asked to optimize, and it's certainly a more limited form of UB than C/C++ compilers may produce (it's actually just NaN propagation in Roc's case).

That said, coding assumptions are frequently incorrect in my experience: even if the assumption completely holds at the time the code was initially written, some later change in a distant part of the code will allow zeros to propagate into x.

Per Richard Feldman's point, even an "assume for speed" should probably become a panicking assert in debug builds.

Kevin Gillette (Mar 19 2022 at 13:20):

I'd be concerned that an "assume for speed" would get misused, i.e. someone may add a dozen irrelevant assume lines to something trying to get more speed ("there, that should do it"), without actually telling the compiler anything useful, and without having done any profiling.

If allowed as a language feature, the expressions should be constrained somehow, so that you can't do:

x, y, z ->
    assume (z != 0)
    x / y

Kevin Gillette (Mar 19 2022 at 13:27):

@Richard Feldman why exactly are we considering omission of NaN checks to be undefined behavior? Is it because LLVM may itself produce true UB when NaN is assumed to not occur (such as hypothetical jmps based on clever float bit patterns that go to unrelated areas in the code)? Or is it because we're considering NaN propagation to be UB? or something else?

Richard Feldman (Mar 19 2022 at 13:33):

here's how I'm thinking about things:

If we say "let's assume NaNs won't happen" then the practical consequence of that is not memory unsafety, but rather that we don't design APIs with NaNs in mind. An example of this is sorting - if we assume NaNs won't happen, then if a NaN gets into a list, and that gets sorted, what happens is then undefined behavior in that we have intentionally chosen not to think about it - that is, to define it. It'll be a surprise for whoever is unfortunate enough to encounter it!

If we know NaNs can actually happen (meaning we don't do the SIGFPE thing), then NaNs need to be considered in every floating-point API design (like sorting), and have their behavior defined - even if other constraints mean that the defined behavior in some cases is "somewhat error-prone, but making it less error-prone would have unacceptable performance characteristics."

Richard Feldman (Mar 19 2022 at 13:35):

Brendan Hansknecht said:

Have a compiler warning when dividing by a non-constant, unless you either use a conditional or an expect.

This sound like expect would be used in regular code to cause roc to remove a safety conditional.

it wouldn't be a safety conditional, it's just a lint reminding you to consider division by zero.

the thing that's error-prone about div by 0 is that it's easy to forget to think about it. If you do forget, the consequence is (at worst) an inf/-inf/NaN, not memory unsafety. So it's there to help you, but if you say "I know what I'm doing," it's not like the program will segfault or anything

Kevin Gillette (Mar 19 2022 at 14:12):

Okay, I see what you mean with sort. So that doesn't seem to be an issue with < itself, for example, but just how we're using it.

Perhaps we could just define NaN as being less than all values so that x < y == y > x will always hold to be true. Right now it only doesn't hold because of NaN, and that's pretty unfortunate. We've called into question the value of NaN being unequal to itself (which we're sidestepping with the consideration of having no float equality), leaving us just with "unorderable NaN," which seems like plainly unnecessary complexity (no value gained for the pain it inflicts.

Such a change would make < and > cost 2-3 instructions instead of one in the worst case (when the compiler can't prove that NaN is not involved), but it seems that performance is more important for arithmetic than comparisons (since in Roc you already can't do x + (y > z) as you can in C). It also means that < and > have a better language-level definition.

When speed or comparisons-as-math are needed, we can provide functions for the C style behavior of having numeric comparisons returning 0 for false and 1 for true, and these can be defined as following ieee754 NaN rules.

Richard Feldman (Mar 19 2022 at 14:51):

is that the way to give people who need floats the best experience? :thinking:

Richard Feldman (Mar 19 2022 at 14:52):

obviously x < NaN == NaN < x is unintuitive, but can we think of some specific realistic scenarios where this would cause a bug?

Richard Feldman (Mar 19 2022 at 14:53):

I don't think we should automatically sacrifice performance or ergonomics just because a bug is theoretically possible

Richard Feldman (Mar 19 2022 at 14:53):

those considerations have to be balanced against how likely the bug actually is to occur in practice, and what the consequences would be if it did occur!

Richard Feldman (Mar 19 2022 at 14:54):

and especially in the context of who would be using the feature in question

Richard Feldman (Mar 19 2022 at 14:55):

like I think it's very safe to assume that if I'm a game programmer, my default preference is that < works normally on non-NaNs, and that it doesn't have any extra performance cost

Richard Feldman (Mar 19 2022 at 14:56):

what's the specific thing that sacrificing either of those would save me from? :thinking:

Brendan Hansknecht (Mar 19 2022 at 15:36):

obviously x < NaN == NaN < x is unintuitive, but can we think of some specific realistic scenarios where this would cause a bug?

Would this but lead to a bug in most sorting algorithms?

Brendan Hansknecht (Mar 19 2022 at 15:40):

it wouldn't be a safety conditional, it's just a lint reminding you to consider division by zero.

the thing that's error-prone about div by 0 is that it's easy to forget to think about it. If you do forget, the consequence is (at worst) an inf/-inf/NaN, not memory unsafety. So it's there to help you, but if you say "I know what I'm doing," it's not like the program will segfault or anything

I still don't understand how expect would be used outside of tests. Can you give an example?

Richard Feldman (Mar 19 2022 at 15:40):

so if we have a default List.sort for be floats, and a default compare for floats, and they both sort NaNs to the edge...what would be the situation where someone writes their own sorting algorithm using < instead of compare?

Brendan Hansknecht (Mar 19 2022 at 15:41):

I see. So we are defining the implementation of NaN comparison such that they sort to the edge. Just only if you use the compare function, not if you use <

Richard Feldman (Mar 19 2022 at 16:00):

Brendan Hansknecht said:

I still don't understand how expect would be used outside of tests. Can you give an example?

sure! Let's say I'm running a development build and I hit expect 1 > 2 - it would log that an expectation failed (to stderr if I'm running in a terminal, or to the editor if running in the editor), but the program would continue running. So I'd be notified that my expectation didn't hold, but it wouldn't block execution from continuing - kinda like a dbg! behind an if in Rust.

in an --optimize build, all the expects would do nothing; they'd just get removed.

Brendan Hansknecht (Mar 19 2022 at 17:35):

Have a compiler warning when dividing by a non-constant, unless you either use a conditional or an expect.

So unless I have expect (x != 0) I will get a compiler warning when typing 3 / x?

Richard Feldman (Mar 19 2022 at 18:13):

yeah, unless x is assigned to a nonzero literal, in which case we know for sure it's fine

Richard Feldman (Mar 19 2022 at 18:15):

or you use Num.divChecked which would check for div by 0 and return a Result

Richard Feldman (Mar 19 2022 at 18:15):

and wouldn't have a warning

Richard Feldman (Mar 19 2022 at 18:17):

also it would presumably need to be expect Num.isNotZero x since != won't be supported on floats!

Brendan Hansknecht (Mar 19 2022 at 18:17):

so in the integer case, does the expect actually make any sense. Won't the division by zero panic anyway?

Brendan Hansknecht (Mar 19 2022 at 18:17):

Is it just to make sure people explicitly choose either to use divChecked or add an expect?

Brendan Hansknecht (Mar 19 2022 at 18:18):

or and a conditional around the div I guess

Richard Feldman (Mar 19 2022 at 19:39):

yeah exactly

Richard Feldman (Mar 19 2022 at 19:40):

divChecked would just be easier for the compiler to verify than a conditional

Brendan Hansknecht (Mar 19 2022 at 20:24):

So in a really math heavy function, I could put a bunch of expects at the beginning and then just write normal math code. That sounds nice.

Brendan Hansknecht (Mar 19 2022 at 20:26):

Would love if this could expand to things like list bounds checking. So that I can remove the checking from hot loops in optimized builds, but not sure if that would be too unsafe. Frankly some sort of assert that was still around in an optimized build would likely also be useful just to get llvm to put the conditional where I want it and potentially group them.

Brendan Hansknecht (Mar 19 2022 at 20:26):

but this is all quite a different topic from floats.

Kevin Gillette (Mar 19 2022 at 20:27):

Regarding bounds checking, aren't compilers pretty good at eliminating bounds checks when it's safe to do so?

Brendan Hansknecht (Mar 19 2022 at 20:34):

Decently, but in writing wyhash in Roc, I have run into a number of cases where I can't get certain hot loop bounds checks to get removed. I at best can get it to merge multiple bounds checks into 1 per loop iteration, but that still has noticeable overhead.

Richard Feldman (Mar 19 2022 at 21:04):

yeah that's definitely an interesting idea!

Richard Feldman (Mar 19 2022 at 21:05):

I don't think we should make it possible to eliminate the bounds checks entirely - potentially introducing memory unsafety at runtime is a bright line I don't want to cross - but making it possible to extract them outside loops sounds like an interesting idea to explore for sure

Last updated: Jul 23 2026 at 13:15 UTC

Stream: ideas

Topic: supporting NaN/Infinity/-Infinity?

Richard Feldman (Mar 14 2022 at 01:31):

Richard Feldman (Mar 14 2022 at 01:38):

Richard Feldman (Mar 14 2022 at 01:40):

Richard Feldman (Mar 14 2022 at 01:58):

Richard Feldman (Mar 14 2022 at 02:32):

Kevin Gillette (Mar 14 2022 at 06:12):

Martin Stewart (Mar 14 2022 at 10:18):

Folkert de Vries (Mar 14 2022 at 14:05):

Folkert de Vries (Mar 14 2022 at 14:06):

Richard Feldman (Mar 14 2022 at 14:08):

Richard Feldman (Mar 14 2022 at 14:08):

Folkert de Vries (Mar 14 2022 at 14:10):

Folkert de Vries (Mar 14 2022 at 14:11):

Richard Feldman (Mar 15 2022 at 11:59):

Richard Feldman (Mar 15 2022 at 12:22):

Richard Feldman (Mar 15 2022 at 12:22):

Richard Feldman (Mar 15 2022 at 12:24):

Richard Feldman (Mar 15 2022 at 12:25):

Richard Feldman (Mar 15 2022 at 12:26):

Richard Feldman (Mar 15 2022 at 12:26):

Kevin Gillette (Mar 15 2022 at 14:28):

Derek Gustafson (Mar 15 2022 at 14:44):

Derek Gustafson (Mar 15 2022 at 14:44):

Brendan Hansknecht (Mar 15 2022 at 14:55):

Brendan Hansknecht (Mar 15 2022 at 14:57):

Brendan Hansknecht (Mar 15 2022 at 15:00):

Richard Feldman (Mar 15 2022 at 15:06):

Richard Feldman (Mar 15 2022 at 15:18):

Richard Feldman (Mar 15 2022 at 15:24):

Richard Feldman (Mar 15 2022 at 15:25):

Richard Feldman (Mar 15 2022 at 15:25):

Richard Feldman (Mar 15 2022 at 15:27):

Richard Feldman (Mar 15 2022 at 15:28):

Richard Feldman (Mar 15 2022 at 15:28):

Richard Feldman (Mar 15 2022 at 15:31):

Richard Feldman (Mar 15 2022 at 15:31):

Richard Feldman (Mar 15 2022 at 15:32):

Richard Feldman (Mar 15 2022 at 15:33):

Richard Feldman (Mar 15 2022 at 15:35):

Richard Feldman (Mar 15 2022 at 15:38):

Richard Feldman (Mar 15 2022 at 15:38):

Richard Feldman (Mar 15 2022 at 15:39):

Richard Feldman (Mar 15 2022 at 15:40):

Richard Feldman (Mar 15 2022 at 15:42):

Anton (Mar 15 2022 at 15:43):

Richard Feldman (Mar 15 2022 at 15:43):

Richard Feldman (Mar 15 2022 at 15:43):

Richard Feldman (Mar 15 2022 at 15:44):

Richard Feldman (Mar 15 2022 at 16:41):

Redefine equality NaN to be equal to itself, modify div, mod, and atan2 to treat -0 as equivalent to 0

Redefine all operations that could produce NaN to panic instead of producing NaN, modify atan2 to treat -0 as equivalent to 0, use RawFloat for fastest operations

Brendan Hansknecht (Mar 15 2022 at 17:11):

Richard Feldman (Mar 15 2022 at 18:02):

Don't have floats support the Equating or Ordering abilities, modify div, mod, and atan2 to treat -0 as equivalent to 0

Richard Feldman (Mar 15 2022 at 18:15):

Richard Feldman (Mar 15 2022 at 18:16):

Richard Feldman (Mar 15 2022 at 18:23):

Richard Feldman (Mar 15 2022 at 18:23):

Richard Feldman (Mar 15 2022 at 18:27):

Derek Gustafson (Mar 15 2022 at 18:31):

Derek Gustafson (Mar 15 2022 at 18:32):

Anton (Mar 15 2022 at 19:30):

Derek Gustafson (Mar 15 2022 at 19:37):

Derek Gustafson (Mar 15 2022 at 19:38):

Richard Feldman (Mar 15 2022 at 22:04):

Richard Feldman (Mar 15 2022 at 22:09):

Richard Feldman (Mar 15 2022 at 22:11):

Richard Feldman (Mar 15 2022 at 22:13):

Richard Feldman (Mar 15 2022 at 22:13):

Richard Feldman (Mar 15 2022 at 22:17):

Richard Feldman (Mar 15 2022 at 22:17):

Brendan Hansknecht (Mar 15 2022 at 22:18):

Richard Feldman (Mar 15 2022 at 22:19):

Richard Feldman (Mar 15 2022 at 22:19):

Brendan Hansknecht (Mar 15 2022 at 22:20):

Richard Feldman (Mar 15 2022 at 22:20):

Richard Feldman (Mar 15 2022 at 22:20):

Richard Feldman (Mar 15 2022 at 22:20):

Redefine equality `NaN` to be equal to itself, modify `div`, `mod`, and `atan2` to treat `-0` as equivalent to `0`

Redefine all operations that could produce `NaN` to panic instead of producing `NaN`, modify `atan2` to treat `-0` as equivalent to `0`, use `RawFloat` for fastest operations

Don't have floats support the `Equating` or `Ordering` abilities, modify `div`, `mod`, and `atan2` to treat `-0` as equivalent to `0`