so here's an interesting question: should Roc support NaN, Infinity, and -Infinity?
with integers, arithmetic panics on overflow. This means that it has an extra conditional (which is cheap in isolation, but which can prevent certain LLVM optimizations that are a bigger deal), but you can opt out of it by using something like Num.addWrap - which wraps on overflow. This is important for things like hashing functions, but also it can be used as a performance optimization if you're very confident that overflow won't happen, because it skips the conditional check. (The CPU has a single "add with wrapping overflow" instruction but not a single "add with panicking overflow" instruction, because it doesn't have a concept of panicking.)
Floating-point numbers ordinarily overflow to Infinity or -Infinity. Currently, Roc's design is to add a conditional which detects when this happens and turns it into a panic instead - just like we do with integers. However, there is no equivalent of "wrapping overflow" for floats, because there's no single CPU instruction for wrapping overflow of floats (so it couldn't be a performance optimization; the single CPU instruction versions of things overflow to either -Infinity or Infinity).
This means that there are some high-performance floating-point math operations in Roc that just can't be done without sacrificing some performance due to intermediate conditionals checking for overflow. There's a similar situation with operations like sqrt and division, both of which can produce NaN (division by zero usually produces either Infinity or -Infinity, but specifically 0/0 returns NaN).
Infinity and -Infinity introduce several edge cases. For example, how should Num.toStr print them? "Infinity" and "-Infinity"? "∞" and "-∞"? And then how should it parse them when converting from a Str to a number? If it's the written word "infinity" should it have to be capitalized? Also, now Roc programmers potentially need to do defensive isFinite checks in various places where they wouldn't need to if we ruled out Infinity and -Infinity in the language by always either panicking or having a function return Result if either of them could happen.
NaN is an even bigger problem, because NaN is defined at the hardware level to be not equal to itself. This means that putting NaN into a Set or using it as a key in a Dict unavoidably causes strange behavior - like if you do Dict.insert x y where x is NaN, the dictionary's length will increase by 1, but Dict.get x will report that x is not in the dictionary.
The problem is that if we don't allow Roc values to become NaN, Infinity, or -Infinity, then there are some floating-point math operations that just can never be as fast as they would be in other languages. Allowing just Infinity and -Infinity might sound appealing, because they're less troublesome than NaN, but that would mean that only division and sqrt are slower than in other languages, when all the other floating-point operations can be just as fast as C. That would be strange.
there are a variety of possible designs here.
For example, in theory we could have the compiler treat bit patterns of NaN, Infinity, and -Infinity in floating-point numbers as Result values with Err tags. This would allow for a function like Num.addFloatUnchecked : Float a, Float a -> Result (Float a) [ InvalidFloat ]* which could compile to a single instruction. Then when pattern matching on that exact Result type, we could compile to a single CPU instruction of "is infinite"
However, even assuming this turned out to be doable, it still raises a problem: if I want to chain together several floating-point operations without handling the Result in between, how do I do that? Do we instead make it Num.addFloatUnchecked : Result (Float a) [ InvalidFloat ], Result (Float a) [ InvalidFloat ] -> Result (Float a) [ InvalidFloat ]? That sounds very clunky!
But maybe the idea is salvageable if we instead had something like a RawFloat a opaque type which wrapped a float type and let you do raw floating point operations on it, and then have a toFloat : RawFloat a -> Result (Float a) [ Infinity, MinusInfinity, NaN ]* function to turn it back into a regular number. Is that too much special-casing just for fast float math though?
anyway, I'd love to hear anyone's thoughts on this topic!
another option is to do something like what Rust does: it distinguishes between PartialEq and Eq - most types have both traits, but floats have PartialEq and not Eq because of NaN. Rust's == operator only requires PartialEq, but hashmaps require full Eq, so by default in Rust you can't sort floats or use them as keys in hashmaps.
another potential idea: define equality on floats to be x == y || (isNaN x && isNaN y) - so basically use a couple of extra instructions to redefine NaN to be effectively equal to itself, so that it doesn't break collections or sorting...and then offer a separate float-specific function that does single-instruction equality (e.g. where if both arguments are NaN it returns false)
I was rather pleased to have read in other Roc materials that Infinities and NaNs do not occur in practice. When reading that other material, I inferred that tags would be used to represent those special cases when they were needed (i.e. for "checked" floating-point arithmetic operations), such as the Result form you speak of. I don't believe the default behavior should be for infinities or NaN to happen "by accident," since this kind of in-band signalling has the same problems as null and other sentinel values.
Regarding sqrt performance, is it not the case that only negative numbers produce an imaginary result (i.e. undefined/NaN when done using floats)? If so, sqrt could be a builtin that has an input validity check at the beginning, but is otherwise annotated to not spend extra instructions checking divisors. For division performance in general, the impact of optimization is only felt over many operations (either performing the same operation on a list, or performing many operations in sequence on the same data in registers). In the latter case (a sequence of operations to a single value), problematic inputs can often be ruled out prior to starting the transformations.
Richard Feldman said:
Do we instead make it
Num.addFloatUnchecked : Result (Float a) [ InvalidFloat ], Result (Float a) [ InvalidFloat ] -> Result (Float a) [ InvalidFloat ]? That sounds very clunky!
That does sound clunky, yes. Also, why is it called addFloatUnchecked, given that the inputs (and output) look very much like "checked," aka "tagged" values?
But maybe the idea is salvageable if we instead had something like a
RawFloat aopaque type which wrapped a float type and let you do raw floating point operations on it, and then have atoFloat : RawFloat a -> Result (Float a) [ Infinity, MinusInfinity, NaN ]*function to turn it back into a regular number. Is that too much special-casing just for fast float math though?
In terms of special casing, would this be something added to the language now, which we could plausibly optimize later (iow, are we taking the C++ route of adding keywords and other features that become defunct or get repurposed, all just so that we can have better performance right now at the expense of language simplicity over time)?
At least for me, opaque numeric types are not as compelling as a richer set of operations (though I'm not at all consistent on this: for non-numerics, I've felt in some cases that opaque types are better than having more operations). I guess it comes down to: what would we do with "regular floats" that we couldn't do with the opaque type? Presumably the opaque type could be compared, be an input to arithmetic, stringified (if we chose to add a function for it), rounded to an int, etc. At that point, it seems like we'd be adding a raw type just to allow infinities and NaN, but not because we want infinities and NaN, but only because it's faster to allow them. We could achieve the same effect with Num.isNaN: Float a -> Bool and [something like] Num.isInfinity: Float a -> [ Nope, Positive, Negative ], or even Num.checkFloat : Num a -> Result (Num a) [ PlusInfinity, MinusInfinity, NaN ]; when the compiler detects the use of these functions, similar to memory fencing/synchronization, it can generate code that elides more expensive panic-checks/normalization in each operation in exchange for a final one-time check. In other words, the presence of the function call permits the compiler to generate fast float computations which are still, in the end, safe, and the absence of any of these function calls results in the compiler generating code which can produce panics after each operation. This approach is somewhat magical, however, and would need to be thoroughly documented.
That said, I would generally like to see both checked and unchecked operations (with both available, either can be the default). There is a distinction between "unchecked for speed" and "unchecked because I don't care" (usually "saturation arithmetic," i.e. for integers not wrapping but also not panicking; just remaining at the extreme upon which it lands). Both of these flavors of unchecked solve quite different needs. I tend to care about saturation arithmetic more than implicit panics (if there's a way for me to generate my own panics, I can do that while handling checked results). However, I don't tend to write math-intensive code: others will have the opposite need.
I do like where the [ Infinity, MinusInfinity, NaN]tag union is going, though it'd be nice to see balanced terms: PosInfinity paired with NegInfinity, or PlusInfinity paired with MinusInfinity. I could easily mis-infer that Infinity means "some infinity, either positive or negative."
PartialEq doesn't seem compelling to me (avoided complexity if we guarantee that Roc floats cannot represent these special values).
Richard Feldman said:
another potential idea: define equality on floats to be
x == y || (isNaN x && isNaN y)- so basically use a couple of extra instructions to redefineNaNto be effectively equal to itself, so that it doesn't break collections or sorting...and then offer a separate float-specific function that does single-instruction equality (e.g. where if both arguments areNaNit returns false)
I'd love to see what people doing a lot of math-oriented programming think, but I believe that while NaN != NaN is confusing to beginners, NaN == NaN would be equally confusing to non-beginners. We side-step this confusion, again, by prohibiting floats from being NaN, and it'll not be all that surprising if the tag NaN is equal to itself (at least if it's clear that NaN it's a tag rather than a specially named float constant).
What if phantom types where used to indicate if a Float was safe or not?
For example:
add: Float *, Float * -> Float Unsafe #We can end up with infinity so we force the phantom type to become Unsafe
sub: Float *, Float * -> Float * #This can't become infinity or NaN so we let Float keep whatever type var it already had
check: Float * -> Result (Float Safe) [ FloatMightBeInfiniteOrNan ]
someMath : Float * -> Result (Float Safe) [ FloatMightBeInfiniteOrNan ]
someMath a = a |> add 5 |> sub 2 |> check
Advantages:
Float Safe you know for certain it's safeDisadvantages:
+, *, /since now you need a type signature that doesn't match Num a -> Num a -> Num aNaN /= NaN issue. I think this should be seperately handled so that in Roc NaN == NaN even if this comes at a performance cost.random other suggestion: F32/F64 follow the spec, but Dec gets the good, ergonomic semantics
since Dec is supposed to become the default, that would mean we mostly won't have issues related to inf/nan, certainly beginners won't
oh yeah I'd want to do that regardless! :big_smile:
this is really just about actual floating-point numbers, not Dec
that makes the downsides smaller I think
except that because of interop, F32/F64 might still be very common
I forgot about some -0 edge cases: https://stackoverflow.com/questions/5095968/does-float-have-a-negative-zero-0f#5096205
so even without NaN in the picture, it's actually not true that x == y implies f x == f y when it comes to floats, unless you want to define -0 to be unequal to 0...which definitely seems like it would cause bugs :sweat_smile:
:thinking: here's an interesting idea: what if comparing NaNs for equality panics?
instead of returning either True or False
that way:
Dict or SetI could also see an argument for panicking when trying to toStr a NaN
after all, NaN was designed to be unequal to itself in order to be a self-propagating error
when the only primitives they could specify were CPU primitives, the idea was "if you try to do anything useful with this thing, it absolutely refuses to be useful - all operations return either NaN or false, even equality with itself"
I've seen how concessions in a type system early on can lead to awkwardness later on, as has happened as Go is about to launch generics: interface values are comparable based on the combination of runtime type and runtime value, but since interface values may contain non-comparable types, comparison can panic at runtime. This has lead Go's comparable ability to permit panics, which is is pretty surprising, undesirable, and merely a consequence of an earlier language choice.
In Roc, if our choices with NaNs are to 1) make them always equal to each other, 2) make them never equal to each other, or 3) make them panic at runtime when compared, then each of those seem like they will surprise a notable portion of the user base. NaNs themselves are inherently surprising. If we're discussing this to save an extra instruction in the worst case, then it doesn't seem worth complexifying the language by either declaring that Eq can panic or alternatively that anything anywhere can panic, while also complexifying the language to allow a case where x < y == y < x
To be confident in our software, when there are traps in a language, they should be surrounded by large swaths of safe, flat terrain. If something so fundamental as equality can panic, then my view of the safety and trustability of a language goes down.
It just seems that, for the standard numeric types, NaN and probably infinities should be excluded in favor of zero/Result and saturated finites (which are already so large that they're usually excellent stand-ins for "infinity" anyways).
If we exclude those special in-band values (in favor of Result tags or normalization), then we can be sure they won't be inputs to further arithmetic operations, and thus we can save that extra instruction in many cases anyways since relatively few operations produce NaN, and many of those require NaN or an infinity as input.
I'd also recommend that if we introduce additional NaN/infinity-allowing float types, we give them a name including the word "Unsafe". If we call them "fast" or "standard", people will choose them first, almost never for any noticeable gain (most applications do not need even competitive floating point performance), and that choice would be to the detriment to the quality of their software.
Something else we might want to consider: Could the compiler recognize Results around numeric operations, and then unbox them so there is a single infinity/nan check after a bunch of computations finish?
Would Result.map make that unboxing easier?
If something so fundamental as equality can panic, then my view of the safety and trustability of a language goes down.
To be fair, so can addition and subtraction, so it may fit the languages to make the default equality panic but also have a special float equality that deals with Nan and such
Something like == panics but Num.eqUnsafe sets is false with Nan
And the compiler definitely could optimize result float in general. It could make it not have a tag but consider the tag to be the part of the float that specifies if it is Nan or Inf. Optimizing a chain of operations should work as well, but probably would need to be explicit depended on how the result is used after it is checked.
an annoying thing I just realized about having == work differently is that <= and >= would too, since otherwise x <= x || x >= x would evaluate to False if x was NaN
NaN and probably infinities should be excluded in favor of zero/Result and saturated finites (which are already so large that they're usually excellent stand-ins for "infinity" anyways).
I think it's okay if Num.add on floats goes to Infinity/-Infinity, or if it panics, but I'm not okay with it saturating (add silently returning the highest non-Infinity float value). That would mean it's paying an additional runtime cost (a conditional on every add) in order to silently and intentionally accumulate increasingly wrong answers, which I'd definitely rather not do!
we haven't really talked about the "what if we prevent NaN, Infinity, and -Infinity through panics" design, so let's do that
doing that means:
==, <=, and >= all work as expected-0 with atan2 is still surprising, considering -0 == 0 but using each of those with atan2 is not equivalent, so arguably atan2 should also be special-cased to turn arguments of -0 into 0.atan2, possibly ==, <=, and >= depending on how other design questions are answered) would also need corresponding functions that give similar answers in a single CPU instruction. I'm not okay with saying "Roc is permanently slower at floating-point operations than C and there's no way you can get around that." The design has to support a way that an application author can access every individual floating point CPU operation with no overhead.(we have the same thing with ints, e.g. Num.addWrap : Int a, Int a -> Int a is a single-cpu-instruction version of Num.add for integers; unlike Num.add, it doesn't have an extra conditional that panics on overflow)
it's that last requirement that makes floats much trickier than ints!
because whereas ints do things like silently wrapping on overflow, floats silently turn into erroneous values that automatically propagate.
so just supporting addUnsafe : Float a, Float a -> Float a means that now == has to deal with Infinity and -Infinity at a minimum
supporting divUnsafe : Float a, Float a -> Float a or sqrtUnsafe means it has to deal with NaN
the only way to make the unsafe arithmetic ops single-instruction and not do that would be to have a type like RawFloat
which could be an opaque type that doesn't support equality
but of course then it either couldn't support <= or >=, or else we'd have to be okay with those having the (x <= x || x >= x) == False semantics
but at that point, we're giving up "Roc can be as fast as C at floating-point operations"
unless we say that this special RawFloat type can't use the common infix operators for comparison, but rather has to be using things like RawFloat.isEq and RawFloat.isLte
...at which point we have a design where:
Float normally, store it in your data structures, etc. Equality works intuitively.RawFloat and do all your math with that, then convert back to Float at the end.RawFloat you miss out on some infix operators.so this maybe sounds appealing in terms of guarantees, but imagine yourself as someone who uses floats all the time
like you're doing 3D graphics stuff, or physics simulations
is this giving you a good experience with Roc? I doubt it. Your program is doing tons of floating-point math, which means either you're constantly doing conversions to and from RawFloat, or else you're storing RawFloats directly in your data structures, meaning basic stuff like == is no longer supported across your program, or else you're paying a performance penalty everywhere.
when you consider that Dec exists in Roc to give you a pleasant, predictable experience with fractional numbers, and the main reason that F32 and F64 are in the language at all are for performance reasons...is it really a good design to make it cumbersome to get the best performance out of them?
meaning basic stuff like == is no longer supported across your program
Would that really be a big problem? It is often advised to not check floats for equality because of binary representation issues.
I think it would be a big deal for tests
you can no longer do like entity1 == entity2
if either of them contains a RawFloat
to summarize alternative designs that preserve x == x implies f x == f x across builtins (except in cases of panics), while still supporting single-instruction versions of all floating-point operations:
NaN to be equal to itself, modify div, mod, and atan2 to treat -0 as equivalent to 0==, <=, and >= to special-case NaNdiv, mod, and atan2 to special-case -0 to work the same way as 0 (e.g. 3/-0 == Infinity, not -Infinity)NaN to panic instead of producing NaN, modify atan2 to treat -0 as equivalent to 0, use RawFloat for fastest operations==, <=, and >= alone, and assume that a Floatvalue will never contain NaN. This implies assuming that a platform will never mistakenly send a NaN over from the host. I worry a bit that this would be an easy mistake for a host author to make, but I'm open to it possibly being the right design.div to panic on division by 0atan2 to special-case -0 to work the same way as 0Float operations are relying on the assumption that they will never contain NaN, this essentially requires having a separate type like RawFloat for all the single-instruction operations, with a conversion from RawFloat to Float that returns a Result in case the RawFloat was NaNyou can no longer do like
entity1 == entity2
For floats that is explicitly frowned upon even in tests. So not sure it is a loss.
ok, so in that vein, here's another potential design:
Equating or Ordering abilities, modify div, mod, and atan2 to treat -0 as equivalent to 0==, <=, and >= on floats would be a type mismatch, as would attempting to use List.sort on any list containing a float. Floats could not be used in Sets or as keys of Dicts.div, mod, and atan2 to special-case -0 to work the same way as 0 (e.g. 3/-0 == Infinity, not -Infinity)SafeFloat (or equivalent) which could use Result to make sure it only ever consists of finite floats, and then it could support Equating and Ordering safely.:point_up: is similar to what Rust does, except they have PartialEq and PartialOrd which == and <= and >= use, and basically what Partial___ means is "equality except NaN is weird" - so in Rust:
== supports floats, but if you use it with two NaNs it returns false - same with <= and >=I'd rather not go down the PartialEq road though
I think the thing that draws me to the "redefine NaN to be equal to itself" design is that it has these characteristics:
those downsides seem less bad than the other designs' downsides
I'm having a hard time coming up with a way in which it's less error-prone to have NaN be unequal to itself (like the spec defines it to be), which is a legendary gotcha for floats, and the "panic whenever we'd get a NaN so we don't have to worry about handling them" design has the downside of needing a totally separate float type and API for people to learn about if they want maximally fast float operations, which is likely what they want since if they didn't care that much about performance, they'd probably be using Dec anyway
Floats not supporting ordering feels like a bad idea
If I have a list of floats, I expect to be able to sort them
We could provide sort functions that use the different recommended ways to compare floats, see the section "Know what you’re doing" here. We could then also provide the necessary background information on which algorithm to choose.
I feel that it is actually a good thing to not be able to call List.sort for these RawFloat, because it will make you aware of how comparing floats can go wrong. If we do allow List.sort to be called with floats, it is guaranteed to produce hard to debug strange behavior for some people. If Roc is used at scale this will eventually go wrong in a critical situation. Floating point issues have led to loss of life before..
I'm okay with making a test of equality difficult/impossible for floats. That is error prone.
But lessThan is reasonable for floats. And lots of the sorts of numeric algorithms that are the work horse of high efficiency code depend on that sort of comparison
I mean, quick sort IS Roc's major selling point right now. Do we really want to undermine that by saying only for integers?
Floating point issues have led to loss of life before
I didn't know about this! Reading through the article, apparently it was specifically overflow when converting from f64 to i16:
The greater values of BH caused a data conversion from a 64-bit floating point number to a 16-bit signed integer value to overflow
so I guess in this case that wouldn't have made a difference, but still - fair point in general about floating point inaccuracies causing problems
I feel that it is actually a good thing to not be able to call
List.sortfor theseRawFloat, because it will make you aware of how comparing floats can go wrong.
another angle to consider: what if == isn't supported for floats, and instead there's only a comparison function? e.g. you give it two floats and an epsilon, and it compares them to see if they're no more than that far apart?
(I assume - but haven't verified - that if you give it 0, then after inlining, LLVM would optimize it to a single equality check)
or perhaps use the ULP algorithm mentioned in that article might be a nicer default - e.g. call it Num.isApproxEq : Float a, Float a -> Bool
so Num.isApproxEq (0.1 + 0.2) 0.3 would return True
the "make it most ergonomic to reach for the recommended and less error-prone strategy" angle appeals to me about that design!
hm, I guess another argument for trying that approach first is that if the ergonomics prove too painful in practice, it's easier to relax the constraints into one of the other designs later :thinking:
by defining equality (somehow) on floats
Do we want an epsilon for isApproxEq? Or maybe also have a version where you could specify an epsilon?
yeah I was thinking have two different versions
isApproxEq and like isWithin or something
Yeah, I like that naming
should floats support Hashing by default?
if they're not gonna support Ordering or Equating
(I actually have no idea what the tradeoffs are there!)
I guess it doesn't make sense if they don't have Equating
yeah I mean on the one hand it wouldn't be very useful
but on the other hand...it should work fine? haha
I mean you can hash a bunch of bits, no problem :stuck_out_tongue:
Sure, but it wouldn't let you use them as a key of aDict or Set
you also need Equating for that
right
so is there any harm in having them support Hashing?
since Dict and Set already have to require both Hashing and Equating anyway
I mean probably not, but I personally think Hashing should probably include Equating.
if a == b then hash(a) == hash(b)
That feels that an invariant of hashing. So feels weird to have hashing were that doesn't makes sense.
Are we committed to floats not supporting Ordering?
I still think that's going to be a problem for a lot of users
I'm not sure why we would support Ordering but not Equating - the reasons to exclude one seem to apply to the other, yeah?
Quantiles and histograms require Ordering.
I think we should support Ordering
Doesn't the minimal form of ordering only need <. So it shouldn't need Equating?
Sorting a list of floats feels foundational
Or at least some sort of PartialOrdering so you can still sort?
And people aren't going to be surprised if 2 numbers that differ in the 12th decimal place end up next to each other in a list
ordering needs ==
Why?
Order is defined as [ Lt, Gt, Eq ]
But I can order numbers without ever equating them. So I don't think it needs Eq
hmm
does merge sort need it? Or need it in order to be fast?
I'm assuming that's what we'd use for List.sort
Hmm...though this is still problematic. Cause someone can do Eq = \x, y -> !(Lt x y) && !(Gt x y)
My assumption is that merge sort can be written using just < or just >
So Ordering/PartialOrdering/etc would need to be just Lt or just be Gt to avoid someone deriving Eq
they can't add an ability to floats after the fact
but with or without Ordering, there should still be an operation for comparing for greater than or less than
just not gte/lte
pdquicksort which is the standard sorting algorithm for a number of languages definitely requires Eq to be as fast as possible.
does merge sort run faster if it can use equality?
I suppose we could always have sort and sortSlower with the latter requiring only Ordering and the former requiring Equating too :big_smile:
Not to my knowledge, but I would have to double check on that.
Actually, I guess all sorting algorithms that support floats in other languages have to work without depending on Eq. Otherwise, they would break if there is a NaN in the list. In the case of Pdqsort, it just uses Eq for a potential optimization where many of the elements in the list are equal. I would assume that is pretty unlikely with a list of floats. So maybe that optimization doesn't matter for floats.
so in most languages that have an Ordering equivalent, there's a way to say something like compare : a, a -> [ Lt, Gt, Eq ] | a supports Ordering, Equating
which is normally what get used for sorting
we could replace that with:
isLt : a, a -> Bool | a supports Ordering
isGt : a, a -> Bool | a supports Ordering
and then you could implement compare (or an equivalent) in terms of those, since if !(x < y || x > y) then they must be equal
or, put another way, you can implement isNotEq using those
if we defined Ordering that way, then we could implement these:
List.sort : List elem -> List elem | elem supports Ordering, Equating
List.sortSlower : List elem -> List elem | elem supports Ordering
but then there would be no way to have the Ordering ability include compare : a, a -> [ Lt, Gt, Eq ] | a supports Ordering, Equating
we'd have to make a separate ability for that
so what I'm wondering is: is compare actually necessary?
like for integers I'd assume that LLVM could probably optimize something like !(x < y || x > y) into ==
but what about for more complicated types?
basically: would accommodating sorting floats in this way make other operations slower?
ah, yes it would
here's an example: say I have a record with a bunch of info in it, and instead of having compare I have less than, greater than, and equals
in order to check equals, I may have to traverse every field in the record to make sure they're all structurally equal
and to determine less than, I might have to do the same
if the first several records I check happen to be equal, so I keep going looking for a tiebreaker
so if there's compare, I definitely only traverse the fields once, but if there isn't, I may do up to 2x as much traversing
so to sum up, that means that one of the following has to be true: either...
Ordering, so there's no way to sort themOrdering, but Ordering doesn't include compare (which it couldn't, because compare requires equality), which means all sorting operations on compound datatypes like records slow down unavoidablyPartialOrd and Ord in Rust - one is basically just for floats, and everything but floats has neither (e.g. functions) or both (e.g. every other builtin besides floats). Those basically exist just so that there can be two List.sort functions, one of which works on floats (but slower), and the other of which works on everything elsebetween those three options, I gotta say - disallowing sorting floats seems most appealing :sweat_smile:
especially considering someone could always make a SafeFloat wrapper around Float which uses Result to verify that it's not a NaN, and then implement Equating and Ordering on that, and then sort as normal
huh, this is surprising: https://godbolt.org/z/fKqYrP7oh
LLVM optimizes !(x < y || x > y) to fewer instructions than a regular ==
(but it's a SSE instruction, so I guess maybe the downside is more register traffic?)
in my mind it’s important to be able to sort floats as other comments have mentioned. not doing so would either exclude a large class of numerical mathematics from being done in Roc or the emergence of a ecosystem library that allows you to compare floats and it would likely become a de facto standard, i’m not sure that’s any better?
at the least, i think it would make sense to be able to define a strict preorder over the floats (that is, be able to answer the question “is m<n?”) and not strict equality
define a strict preorder over the floats
I think that's a reasonable idea, but specifically how should that work?
PartialOrd and Ord equivalents?
or defining Equating for floats?
I’m not sure there’s an equivalent in rust
basically you would only be able to define the method “lessThan: a, a -> Bool | a has …” (apologies for formatting, on phone)
right, sorry - I should probably stop saying PartialOrd
let's call it LtGt for now
it just supports less than and greater than, and that's it
so, specifically the design would be:
LtGt is
isLt : a, a -> Bool | a supports LtGt
isGt : a, a -> Bool | a supports LtGt
Ordering supports LtGt, Equating, is
compare : a, a -> Order | a supports Ordering
List.sort : List elem -> List elem | elem supports Ordering
List.sortSlower : List elem -> List elem | elem supports LtGt
Partial orders support less than or equals to as well, are you intentionally excluding that? If so then that’s just a strict preorder where you can only compare things in an order but there’s no notion of equality
how would you check less than or equal if you can't do equals? :thinking:
NaN <= NaN is false
right, the point is that if floats have NaN, then any operations they do involving ==, <=, or >= will work differently from how those operations work on everything else in the language
other designs previously discussed include things like:
NaNs happen :shrug: - this isn't the same thing, but emotionally it's always kinda felt to me like "NaNs are UB, good luck!"NaN to be equal to itself, despite what the IEEE-754 spec says"NaNs are UB, good luck!"
Are they? They have a clear spec, right? It is just a weird spec.
yeah I mean emotionally they feel that way
in the sense of like
"you're not gonna get an error, but weird stuff is gonna happen, so just...don't let it happen, ok?"
so I think of them similarly in terms of "I hope this doesn't happen, because if it does, the symptoms will be weird and possibly distant from where the original problem occurred"
like dictionaries/sets having lengths longer than the number of items I can get out of them
etc
so to continue with the LtGt design, a thing I don't like about it is that LtGt (pretend it has a better name) is literally only going to be used by floats. Every other type in the language will almost certainly have either Ordering (which implies LtGt) or neither LtGt nor Ordering
but that means everyone who wants to implement Ordering between now and forever needs to also learn about and implement LtGt, and that is literally only so that floats can be sorted
hm
ok, here's a weird idea that might be awesome :stuck_out_tongue:
what if we have this?
Ordering is
compare : a, a -> [ Lt, Gt, Eq ] | a supports Ordering
so like
we don't actually say you have to support equating, but you are required to say whether two things are equal
now technically
And floats just don't return Eq
I like it
actually no, floats would :joy:
that's one of the ways in which it's weird
so floats wouldn't support == because it's error prone
but if you really wanted to do float equality, you could do compare myFloat == Eq
and that would in turn mean List.sort would work optimally for all types, including floats
heh, although in that case, NaNs would have to return Eq
I mean if they have to return one of those 3 options, it's certainly not going to go well if they return Gt or Lt!
LLVM optimizes
!(x < y || x > y)to fewer instructions than a regular==
Just dug into this a bit. == is longer because it has to check the parity flag. Thus it uses a different instruction than !(x < y || x > y). I assume this is due to handling NaN
interesting, thanks for looking into it!
ha, I just realized something else: it can't be true that x == y implies f x == f y for any language that supports both floats and serializing them to bytes
because 0 and -0 are equal but have different bit patterns
so if f is "serialize my argument to bytes" and x is 0 when y is -0 then they'll be different
so I guess we're already priced into giving up that invariant for floats :sweat_smile:
interestingly, this design feels less weird if you choose different names than Lt,Gt, and Eq
like what if it's Order : [ Before, Same, After ] or something like that
like it's about their relative orders, not a notion of equality
(which is exactly how it would be used in sorting)
now we're not saying that NaN is equal to NaN, but rather we're saying that if I have two NaNs next to each other, they should stay in the same positions instead of needing to being reordered
and of course the way that would be implemented in all other cases except for floats would be using ==
but the point is that you don't need equality to be able to answer the question "should I reorder these two floats or keep them where they are?"
I'm not sure I have the same reaction to renaming to Before, Same, After, but I don't immediately see a problem with it.
Probably I'm just more okay with Lt, Gt, Eq than other people are
And I don't have a problem with that name change
A note on cmp in rust. It doesn't work on floats. They just have partial_cmp which returns an option of an ordering.
They also have total_cmp on nightly with an explicit nan ordering
ok so naming aside, the fundamental semantics of this design proposal would be:
Equating, so you can't use them with == ,<=, or >=, or use them in sets or dictionariesOrdering, and their compare implementation treats NaNs as equal to themselves - so you can use them in List.sortFloat.isApproxEq and Float.isWithin for epsilon-based comparisons, or if you really need to use float == you can do compare x y == Eq or compare x y == Same or whatever it ends up being called.I just verified that LLVM optimizes nm, forgot to order the conditionals to account for NaNcompare x y == Eq all the way down to == https://godbolt.org/z/oTj35qjsq
now we're not saying that
NaNis equal toNaN, but rather we're saying that if I have twoNaNs next to each other, they should stay in the same positions instead of needing to being reordered
I like the rewording
I mean it is the same code but reworded, right?
kinda?
To get a 0 you need to not be < and to not be >
compare x y == Same definitely would give you the equivalent of x == y || (isNaN x && isNaN y)
right
but it wouldn't give you the equivalent of doing an ordinary CPU-level float equality comparison, which would return false if both are NaN
I'm mainly surprised that there's a single SSE instruction for comparing them in a way that returns true if both are NaN!
it's not the same instruction ucomisd vs cmpeqsd
right
and there may be a slight perf difference between the two
ucomisd enables NaN == NaN
I'm just surprised there's this level of hardware support at all!
TIL :smiley:
I've always liked sort predicates that just return true if the pair is in the correct order. Are there sort algorithms that need to distinguish between before, equal, and after?
yeah we talked about it earlier and apparently there are some important ones whose performance relies on being able to check for equality
I don't think saying "relies on" is fair. I think it just enables more shortcuts/optimizations.
Also, I am kinda stunned that I have apparently never sorted floats in rust. Apparently, you have to define a custom comparison and use sort_by to sort floats in rust.
Hmm... I guess that is also an option for roc. Don't support Ordering on floats and just force people to define the comparison operator deciding how they do or don't want to deal with NaN.
it is an option! However, I think it seems reasonable to define a default Ordering that reliably sorts NaNs to the front. I can't really think of any downsides to doing that (although maybe they're there and I'm not thinking of them!) and people can always opt into List.sortWith to give a custom comparator that's faster but doesn't handle NaNs as well.
Yeah, soundd fine. Either front or back.
observation: Num.nan < 0 would be False, and so would Num.nan > 0, and yet Num.nan is not 0
so that means if you do this:
if x > y then
...
else if x < y then
...
else # x must be == y
...
that comment is incorrect if these are floats!
because either x or y (or both) could be NaN
in which case relying on the assumption that they're equal in that else branch could lead to bugs
we were just running into this in practice last night when trying to write a sorting function that handles NaN properly :sweat_smile:
is this an argument that < and > on floats shouldn't be allowed either? :scream:
it seems like it would be too hard to use them if there are no comparison functions at all
and I guess even isApproxEq and isWithin can return weird answers with NaNs too
Richard Feldman said:
another angle to consider: what if
==isn't supported for floats, and instead there's only a comparison function? e.g. you give it two floats and an epsilon, and it compares them to see if they're no more than that far apart?
An epsilon approach does seem better for "equal enough" checks, yet having skimmed the "know what you're doing" article, I wouldn't trust myself or the average programmer to pick a good epsilon except in cases in which they should be using Dec instead (i.e. in financial cases, like checking if two values are within 0.01, exclusive, of each other).
Regarding float equality (not counting RawFloat, since it sounds like people using that "know what they're doing" and should be able to do what they want), I like the idea of F32 and F64 being ordered (< and >) but not directly comparable for equality (which means also no <= or >=). A List F64 would thus theoretically be sortable (since sorting strictly only needs <).
I agree that sortable lists of floats make sense, but float comparison should be avoided. That also would mean floats can't be used as Dict keys (assuming Hash implies Eq) nor Set elements, both of which are generally regarded as quite inadvisable. Dec wouldn't have these restrictions, being fixed-point, though Dec use as a Dict key or Set element would also be suspect, though not nearly to the same degree as a float.
Richard Feldman said:
so is there any harm in having them support
Hashing?
In what situations would hashability be useful on values that cannot be compared for equality?
only use case I can think of is if you're trying to come up with a salt or a random seed or something, using a record that has a float in it
Richard Feldman said:
Orderis defined as[ Lt, Gt, Eq ]
This isn't fundamental, just an historical convention. For values with a "total ordering," you can derive x == y from !(x < y) && !(y < x). The compiler can also infer that floats have equality for comparison and such even if the language itself chose to prevent programmer-accessible equality to prevent common floating point blunders.
hmm. I think in that case we should let the user just turn the float to bytes rather than hash it.
good point about bytes - we need to support that anyway!
So just realized something. If you put a float in any record or container otherwise, would Roc just remove == from that version of the container. So like List U32 has ==, but List F32 doesn't have ==. Same with a dictionary where the values are floats?
So as a user, I can compare my model for equality, but the moment I add a float, I can no longer do that and the code won't compiler. Arguably, that is probably a good idea since the user needs to decide what they mean when they say float equality, but it feels really strange to remove a functionality when someone expands their model.
yep
to me that's the biggest risk of "disallow equality on floats" not being nice in practice
Sounds like an alien concept, but I feel like it actually does make a lot of sense.
is the effect it has on collections containing floats, including records that have a single float field
yeah - at the same time, a nice property of the design is that if we don't like it, we can introduce float equality (by some definition) without breaking existing code
But on my custom struct, I could define has Eq even if it contains a float and just set the float to use Num.isApproxEq or something and get == back for the struct as a whole, right?
if you make it an opaque type, yeah
Wait, do only opaque types get abilities?
yeah that's essential
records are anonymous
if you said like "the type alias User gets this ability," since type aliases are interchangeable with what they alias, that would be equivalent to saying "all records with the same shape as User have this ability now"
which would lead to a big mess pretty quickly :sweat_smile:
Richard Feldman said:
- There are two different ordering abilities, e.g. equivalents of
PartialOrdandOrdin Rust - one is basically just for floats, and everything but floats has neither (e.g. functions) or both (e.g. every other builtin besides floats). Those basically exist just so that there can be twoList.sortfunctions, one of which works on floats (but slower), and the other of which works on everything else
It sounds like Rust includes/implies Eq in Ord as a convenience. The mathematical notion of true partial ordering does not seem like it'd show up frequently in programming, except for abstract mathematical programming, or perhaps anything dealing with value lattices (constraint solving languages). I'm not sure the distinction in Rust implies anything fundamental is going on: Roc could have Eq and Ord as non-overlapping abilities (Roc's Ord could be identical to Rust's PartialOrd, yet just be called "Ord").
I don't see why there'd be a speed difference. Roc would know that machine code can and should be generated to compare floats with <= or == if doing so is faster, e.g. for sorting (equality checks may indeed save a few sorting swaps, though it's very much dependent on the input); those operations may just be excluded from F32/F64 in the Roc language to prevent mistakes when programming at a high level, encouraging isWithin instead.
if you said like "the type alias
Usergets this ability," since type aliases are interchangeable with what they alias, that would be equivalent to saying "all records with the same shape asUserhave this ability now"
Totally makes sense. I guess I am gonna be making really large roc modules so that everything can have access to the opaque type internals :rolling_on_the_floor_laughing:. For things that I really wanted to be regular records, but need to be opaque types for abilities.
well you can also always expose like toRaw / fromRaw or something like that
or vice versa, if you just want the opaque type for certain functions, you can expose operations on the raw type and convert to opaque just-in-time (e.g. for serialization if you want the default Encode)
since opaque types can be defined in any scope, you can even do that in the middle of a function if you want
hmm... didn't think about that.
Ok, this seems like less of a concern than I originally thought.
also, records and tag unions will get all the default abilities for free
so you'd only need to do that if you wanted to override the default ones or add additional ones
I don't see why there'd be a speed difference.
PDQsort which rust uses by default specifically checks if the current pivot is equal to the parent pivot, if so, it can do a smarter partitioning such that it only has to recurse on one of the two children lists after partitioning instead of both of them. Only matters for lists where there are multiple of the same item. Probably less common with floats, but still a potential missed optimization that can cut out large chunks of a quick sort when there are duplicates.
Update: It is also used to avoid bounds checking apparently, so that actually makes it potentially very broken if there is a NaN in the list.
Richard Feldman said:
we don't actually say you have to support equating, but you are required to say whether two things are equal
I don't get it.
In any case, LtGt could really just be:
Ordering is
isLessThan: a, a -> Bool | a supports Ordering
Gt is, in a sense, a lie with any value for which we're not defining ==, since while you can say "it's less than, or it's not less than," the "not less than" part means "greater or equal," not "greater than", and in this scenario, we don't want to say anything in terms of equality. And so when you're left with the presence of absence of the property of being less than, it's really just a bool. We could of course use isGreaterThan instead (it's an equivalent capability), we can't have [Lt, Gt], and would instead have [Lt, Gt, WeWouldRatherNotSay]
Kevin Gillette said:
Richard Feldman said:
we don't actually say you have to support equating, but you are required to say whether two things are equal
I don't get it.
later on I revised that to be "you are required to say whether they sort to the same position"
Richard Feldman said:
because
0and-0are equal but have different bit patterns
They have different bit patterns, but I had thought that IEEE754 hardware instructions specially treat those bit patterns as "equal" (assuming you don't reinterpret the memory location as being a U64, for example).
usually yes, but not in the case of division, mod, or atan2
those all give different (as in actually unequal) answers for some input combinations where the only difference is 0 vs -0
Derek Gustafson said:
Probably I'm just more okay with
Lt, Gt, Eqthan other people are
If we allow float equality, then we don't lose anything relative to other [common] languages, in that floats will be just as tricky/error prone as in those other languages. But the wider programming community won't fault Roc for the choice of using floats conventionally because the wider community won't notice the choice not made. But Roc also won't gain anything either. We can't really take away equality once we give it to floats (at least after a 1.0 release, presumably, without a major/breaking version release), but we can add equality to floats later, if we decide to, without breaking anyone in the process. In the best case, we are right to prohibit direct equality checks in the language, and Roc float-handling code will be safer than elsewhere; in the worst case, we might gain some evidence/experience that demonstrates we need float equality, and we'd have lost nothing in the process (thus the worst case has no downside). In that worst case, we could change Ord to imply Eq via inferring that x == y is equivalent to !(x < y) && !(y < x), which should be trivial for LLVM to optimize quite well in most cases, though most of the types implementing Ord would probably explicitly implement Eq anyways either way.
Richard Feldman said:
to me that's the biggest risk of "disallow equality on floats" not being nice in practice
I think we can redirect that towards "you probably meant to use Dec? use binary floats for speed, not precision" ?
In that worst case, we could change Ord to imply Eq via inferring that
x == yis equivalent to!(x < y) && !(y < x), which should be trivial for LLVM to optimize quite well in most cases, though most of the types implementing Ord would probably explicitly implement Eq anyways either way.
This is explicitly not true for floats. They will not generate the same instructions. So I don't think we can change Ord to just imply that in the worst case, but I am maybe missing some part of what you mean here.
Richard Feldman said:
those all give different (as in actually unequal) answers for some input combinations where the only difference is 0 vs -0
How much of that is historical C/pre-C computing lineage? How much of it is "intrinsic, valuable mathematical law?" I'm hesitant to accept that we must accept the precise existing edge case behaviors as-is for atan2, etc (or even give them those same names). This could also be a RawFloat vs F32/F64 distinction. I suspect any instructions we'd be saving by letting -0 have distinct behavior (and for which Roc probably wouldn't let you construct -0 via any literal) are instructions that, with code sufficiently robust and aware of ieee754 idiosyncrasies, would just be reinvested back into explicit -0 normalization checks anyways, except in certain performance-critical use-cases (which would use RawFloat).
Brendan Hansknecht said:
In that worst case, we could change Ord to imply Eq via inferring that
x == yis equivalent to!(x < y) && !(y < x), which should be trivial for LLVM to optimize quite well in most cases, though most of the types implementing Ord would probably explicitly implement Eq anyways either way.This is explicitly not true for floats. They will not generate the same instructions. So I don't think we can change Ord to just imply that in the worst case, but I am maybe missing some part of what you mean here.
I'm still dreaming of Roc's core floats precluding at least NaN. At that point, I believe the property would hold for floats as well, no (since, iiuc, infinity is equal to "itself," as is negative infinity equal to "itself" (or at least it's universally less controversial to make that leap compared to making NaN equal to itself).
Ah, I see. In that case, I think yes.
I think that is definitely interesting to explore, what if our spec says that F32 and F64 have undefined behavior if they are ever NaN or -0. Then do everything in the fastest way possible assuming that is the case. Of course for operations that could return NaN or -0, we would have to make them return a result type.
Though as a side note, -0 is actually arguably useful. It if the floating point additive inverse. x + -0 == x with floats. x + 0 does not always equal x. (have we mentioned that floats are horrible).
@Richard Feldman where have we landed on NaN thus far? Is it just the domain of RawFloat? Is it sufficiently valuable in non-RawFloat to be worth its downsides (and all the deliberating about how to handle the special cases of == around NaN)?
I believe I see your earlier point about saturating to the most extreme finite being problematic in some ways for float, but am not convinced that "compounds error" more than a somewhat arbitrary-seeming boundary (from a non-expert observer) in which finite values cease to be finite, and then stay non-finite ever after (infinity could be considered "maximum error"). I do see the argument that saturating at finite, and then dividing, subtracting, or otherwise beginning to reduce towards zero again would certainly compound error, whereas float-infinity would make the error clear, though a Result with an Infinity tag could achieve that in a way that has better support from the type system.
Another thought, would it be reasonable to treat floats like integers? If we overflow an addition or multiplication(aka get inifinty), we panic. Of course there would be a checked version of the op that returns an infinity tag that someone could use. Could build a safe, consistent, and reasonably fast implementation for Roc. And you could use saturated math if you actually want infinity to be valid.
Still would need specific nan check on division and things like that similar to ints as well.
Brendan Hansknecht said:
Though as a side note,
-0is actually arguably useful. It if the floating point additive inverse.x + -0 == xwith floats.x + 0does not always equalx. (have we mentioned that floats are horrible).
Another reason, in my mind, to exclude float equality than a reason to make -0 allowed, though I don't understand the specifics of the point you made. Most compilers would strength-reduce optimize x + 0 and x + -0 into just x, right (assuming the language holds them to be equivalent to x)? Thus to achieve that oddity, you'd need to arrange for it to happen at runtime rather than directly using the exact expressions you listed. Also, since most languages do not permit -0 literals (they just get rewritten at compile time into the bit pattern for "normal" 0), you'd need to construct -0 at runtime also by some means (such as low-level sign manipulation).
Even then, iiuc, -0 isn't mathematically distinct from 0, and thus I presume the utility of -0 is in some kind of in-band signalling or sentinel value approach, which is pretty error prone (whereas tags are the general FP answer, moving the signalling out-of-band, such that the distinction is visible to the type system).
Regarding infinity and NaN, does Dec have these concepts? What does Dec do when you overflow it or produce values that are non mathematically defined? RawFloat aside (since by definition that's a special case type for performing operations that may have special values), if Dec doesn't have infinity or NaN, and since Dec and F32/F64 are all Frac, should we not have them all have equivalent behaviors and properties (merely just having different notions of precision)?
I honestly don't know all of the details of -0, but someone who deals with floats and quantization and all the fun numerics of ML told me that using 0 over -0 as the additive inverse can have notable accuracy impacts on some ML models. So, I just know that it can matter.
does Dec have these concepts?
No
What does Dec do when you overflow it or produce values that are non mathematically defined?
I think wrapping overflow currently because no one has added the code to make it do otherwise. I think it should panic since internally it is just an I128 and those panic.
should we not have them all have equivalent behaviors and properties (merely just having different notions of precision)?
I don't think so. Dec is much more sound. It really isn't fractional at all. It is just an I128 that happens to have a unit of 10^-18
Brendan Hansknecht said:
should we not have them all have equivalent behaviors and properties (merely just having different notions of precision)?
I don't think so. Dec is much more sound. It really isn't fractional at all. It is just an
I128that happens to have a unit of10^-18
Isn't that an implementation detail though? We could consider F32 and F64 to be really buggy fixed-point number implementations also (i.e. a unit of 2^-1022). You could implement Dec with at a uniform minimum of 10^-18 precision throughout its supported value range, with a ieee754 strategy, iiuc, if you invested enough bits (and added some mandatory rounding rules), not that you'd want to solve it that way.
Dec is base 10 and it is fixed point. I think those both make it much more intuitive and usable than either F32 or F64. Sure the specifics of 10^-18 are an implementation detail, but with dec, I know that base 10 math will work until I have too many decimal places. 0.1+0.9 will for sure equal 1 and things of that nature. That is not the case with floats.
By that I mean that we probably don't want people to actually, day to day, think of Dec as "just an integer semantically scaled down." We'd rather have them think of it as a fractional number that is slower, yes, but has much more reliable precision and representational semantics, and is a better fit for many use cases compared to F32/F64.
For sure
I just think that due to its unique properties we probably don't want to box it up and restrict it like we do F32 and F64.
For example, I think it is reasonable for Dec to support Eq. Especially since it is specifically made to work with things like monetary values. I should be able to say 1.50 dollars == 1.50 dollars with a Dec.
Brendan Hansknecht said:
I just think that due to its unique properties we probably don't want to box it up and restrict it like we do F32 and F64.
Yeah, I meant more "if F32/F64 have NaN and infinities, doesn't that mean we should give Dec NaN and infinities too," to which we'd probably answer "no, we don't want that," to which I'd reply "so why do we actually want F32/F64 to have those special values too?"
Yeah, for Dec, equality is clearly free of the same caveats that floats have. Python, for example, actually does heuristic rounding of numbers like 10.3300000000001 into 10.33 when displaying the value (even though the internal value still has the long tail minutiae), and thus comparing 10.3300000000001 to 10.29999999999999999 might display as 10.33 == 10.33 yet be false, whereas with Dec, a value is represented solely in its true semantic form (presuming you're representing it in decimal notation), thus if values look equal, then, without exception, they are equal.
Ah, I see what you are getting at. We could make it so that at least by default float and double essentially acted like a Dec on these sort of issues. I definitely think that would be a reasonable approach.
Side note: I don't remember if we also wanted Dec to check for underflows, but I think we did. If we treated floats that way as well, they would underflow all the time and be pretty useless.
Yeah, most smart float printing algorithms now print the shortest number of of characters version of a float that is still less than the next float and greater than the previous float. I believe that python uses one of those algorithms. So if they print the same, they should be equal, but maybe they use a worse algorthm that does more rounding and breaks that invariant.
By underflow, do you mean go more negative than the negative most value (that's sometimes called underflow when talking about two's complement ints, iiuc), or do you [probably] mean dividing (or multiplying by a fraction) enough that the result can't be precisely represented? If the precision-loss case, I believe that's reasonable to panic or return a failed result, depending, just as would happen for exceeding the largest representable value.
yeah, precision loss
Brendan Hansknecht said:
x + -0 == xwith floats.x + 0does not always equalx.
whoa, that's news to me! Do you happen to have any links on that? Search results seem unhelpful so far. :sweat_smile:
regarding some earlier questions: one important guideline I want to follow for Roc's numeric builtins is that Roc's performance ceiling should be as high as C's except where that would require UB or memory unsafety
so for example, division by integers is undefined behavior in C. I don't want any UB in Roc, so we take the hit of always doing an extra instruction there, to have it panic on division by 0.
but I don't like the idea of the design being "there is no way in Roc to do division on floating-point numbers without it being slower than what the hardware is capable of"
one big reason for this is the ecosystem. A really important part of Roc's design is that the only way to run code that's less safe than Roc code (e.g. C, C++, Zig, or Rust code) is via a platform's host
and the only way to do it there is via asynchronous effects, which are a really inconvenient way to do math
so if Roc is slower at math than what's possible in C, we don't have the escape hatch that other languages have (e.g. Python) of "it's fine, just call C synchronously"
therefore it's important that arithmetic performance not be a limitation!
this has an important implication: if Roc is going to support (regardless of whether it's RawFloat or Float or whatever) the maximum performance ceiling, it means there will be NaNs in Roc somewhere. That is not avoidable.
so I don't think we can have a design where we always rule out NaNs and don't have to think about them
we have to deal with them
so then there's the question of "should we have Float and RawFloat or just Float?"
I think if the general guidance is "use Dec over float unless you need float for performance," then it's weird to have the guidance also be "use Float over RawFloat unless you need RawFloat for performance"
like the use case for "faster than Dec but not as fast as possible, but more reliable than the fastest-possible thing, but still less reliable than Dec"
seems very limited :sweat_smile:
to me it seems simpler to be like "use Dec unless it doesn't meet your performance needs, in which case use Float - which is as fast as the CPU will let you go - and prepare to deal with imprecision, NaN, etc."
so, big-picture, that's what I'm thinking :point_up:
Makes sense overall, but I think it still makes sense to expose safe floats. The main reason is that I think it will be common to want the speed improvement but not want the foot gun. I would need to look back at the Dec speed numbers, but I feel that it will be the case that safe Float is 90% the speed of RawFloat while being close to as safe as Dec except for precision loss.
two potential ways we could go about that:
thoughts on those?
I think that should be pick on what we believe the default should be. Fast with gotchas or Safe with escape hatches for speed.
Also, I would assume any safe float package implemented in user space would be super niche and not used. Those libraries could be made in any language, but you don't seem them. Instead float code generally works until it has a bad input and triggers something crazy and is broken.
while being close to as safe as
Decexcept for precision loss.
so something I've talked about with Ayaz is wanting to have a compiler warning for division by 0
where the basic rule would be that if you're dividing by a non-constant, then you get a warning unless:
expect thatDenominator != 0 before the divisionif thatDenominator != 0 then (or an else with the if checking to see that it is 0)and then have something similar for sqrt and negative numbers
at that point, it seems like floats in general would be "close to as safe as Dec without precision loss," yeah?
because sqrt and 0/0 are the only ways to get NaN I'm aware of
the expect would tell you in either development builds or tests if it was ever wrong, and the conditional would rule it out at runtime even in optimized builds
you could still get a NaN from the platform or from a Roc package, but if that ever happens it's almost certainly going to be due to a bug in the platform or package, and this would only be one of lots of ways bugs in third-party code could lead to bugs in your program
another related thought is that we could have some debug tooling for letting you debug where a NaN, Infinity, or -Infinity came from. Specifically what we could do is something like expect, where in debug builds (in this case only ones being run through the editor) we add an extra instruction to have every float builtin function check for NaN before returning; if it encounters one, it lets the editor know (e.g. by pushing the current stack trace onto a global List, similar to what expect does, which the editor could then traverse later to report these)
I talked with Mason Remaley (full-time indie game dev; he made Way of Rhea) and he does a ton of stuff with floats
one of the things he mentioned is that whenever he encounters a NaN, the time-consuming thing is tracking down where it came from, and it usually involves going around and littering a bunch of "if NaN here then report it" checks around his code base
but if we were to make that process quick and easy, especially considering the replay features we already want to do for debugging in the editor...how bad would it be if the occasional NaN came up?
basically here I'm trying to answer the question "if we made the experience as good as possible when you're going for Maximum Speed," how much demand would really remain for there to be a whole separate API for Almost Maximum speed?
hmmmm, just had a new realization that's cause for revisiting an earlier design!
we established earlier that the check for "are these two floats equal, while treating NaN as equal to itself?" actually runs faster than the typical "are these two floats equal, while treating NaN as unequal to itself?" check
I just tested it out, and it turns out there are also single-instruction comparison ops for <= and >= which treat NaN as equal to itself
the only catch is that they give the opposite answer when NaN is compared with a non-NaN
e.g. traditionally x <= NaN and x >= NaN both return false, but with this instruction they both return true instead
hm, but that would be weird considering x < NaN and x > NaN would still return false, and the only way to get them to return true in a single instruction would be to use the instruction that flips the truth on those as well, which would mean NaN < NaN would return true - which clearly isn't desirable
so I guess the idea would be:
NaN is equal to itselfNaN is even weirder than in most languages, because although it's still the case that both x < NaN is false and x > NaN is false (even when x != NaN), which is contradictory, we now also have the additional contradiction of x <= NaN is true (even though x < NaN is false), and same with >=so with that in mind, maybe the question of whether to disallow equality on floats really comes down to "do we want to disallow it for precision reasons alone?"
I could see arguments either way!
hmmmm, something else I just realized: compare : Float a, Float a -> Order might not be enough to sort maximally fast
we might actually need equivalents of <, <=, and ==
for the sorting algorithm to be as fast as possible
if that's true, then means one of these three things must be true:
actually that's probably fine, because we can cheat in the List.sort implementation :stuck_out_tongue:
and make monomorphized calls to List.sort that use floats use different operations that would give the same answer as if it were using Order, while not actually using it, so that it runs faster
if we did that, it would resolve the question of "should the default Ordering for floats sort all NaNs to the front or to the back?" nicely, because now we know about single-CPU-instruction operations that do both < and == comparisons on floats that can sort NaNs to one end or the other
which means that even if the default compare for floats isn't maximally fast, the default List.sort on floats still can be!
Richard Feldman said:
I think if the general guidance is "use
Decover float unless you need float for performance," then it's weird to have the guidance also be "useFloatoverRawFloatunless you needRawFloatfor performance"
That's sold me on your point. Coming from a billing systems background, I have been thinking of this in terms of "use Dec when you need precision, but use Floats otherwise," when I should be thinking of Dec as the default except when I really don't care about precision (when approximations with compounding error is fine in exchange for speed).
Given that, I don't see the benefit of RawFloat, since Float is already the "raw float" of Dec. At that point I also don't really care as much about NaN or Infinities being present either: they already can't happen with Dec, and Dec is what I would be using in most cases. It still may be nice to have checked operations available for use with Float, but which are not used by default.
Richard Feldman said:
- Comparing things to
NaNis even weirder than in most languages, because although it's still the case that bothx < NaNis false andx > NaNis false (even whenx != NaN), which is contradictory, we now also have the additional contradiction ofx <= NaNis true (even thoughx < NaNis false), and same with>=
iiuc, this logically implies that x is equal to NaN. Consider:
x < NaN => false
x > NaN => false
At this point, if we don't have equality defined on floats, then the above result is just that "nothing can be said about the relationship of x to NaN." iow, floats would not have a total ordering (the finite values and infinities would have a total ordering relative to each other, but NaN simply could not be ordered relative to the other possible values).
Now we further say:
x <= NaN => true
x >= NaN => true
At this point we are saying something about equality, and, the only possible answer is that every x is logically equal to NaN.
That would be pretty unusual. While technically not UB, it would take long enough to explain and reason about that it might just be easier for most Roc programmers to consider it magical and undefined.
We can side-step that by making it impossible to construct a NaN value directly (there may be an isNaN function, but no literal or constant expression which the compiler would permit to _directly_ produce NaN, thus we wouldn't need to worry about any of the above expressions actually occurring, except with dynamic values.
I personally think we should just label NaN as UB. Make it such that Roc can never generate NaN and say that it is a bug if a platform passes a NaN float into Roc. Then just design everything else with that in mind.
I think we should still allow for Infinity though. Offer checked and unchecked versions of ops that might generate it.
I think that we should use all of the faster comparisons that Richard mentioned, not because we are thinking about behavior and NaN, but because we are ignoring NaN and can go faster.
Also, if Roc can't generate NaN no need to ever track it down on the Roc part of the code.
Side note: a NaN check and jump can theoretically be as be as cheap as a panic that is never take for addition overflow....which would actually probably make it even cheaper than a overflow cause float instructions take longer to run.
Make it such that Roc can never generate
NaN
so this means that Roc can never have single-instruction float division, modulo, or sqrt?
can you use asserts to rule out the NaN case?
asserts still require extra runtime instructions
conditionals specifically
and use those strategically so the optimizer can, in many cases, figure out that the fast division can be used
so do bounds checks
yeah I think these are very comparable to bounds checks! :thumbs_up:
and that is a cost we accept and hope that the optimizer removes them
right, but we do that because the alternative is memory unsafety :big_smile:
and also we do it with int division because the alternative is UB
but with float division the alternative is less severe
(potentially returning NaN)
I'm very open to the idea that maybe the answer is that we should always pay the cost btw!
I just want to acknowledge it as a serious downside
Richard Feldman said:
so this means that Roc can never have single-instruction float division, modulo, or sqrt?
Do we have those for integers? I thought division had to always be checked in roc.
that's the way it's currently implemented, but I've considered that design relatively experimental and not necessarily what we'd want to do long-term
the current design rules out NaNs unless you get them from the host (which I'm generally okay with; we also have to trust pointers we get from the host will not be null. I'd say it's more likely that the host would accidentally give us a NaN than a null pointer, but then again it's also probably way less common for us to get a float from the host that the host actually knew was a float, as opposed to something we gave the host - as part of our app state - and that the host gave back to us unmodified.)
but it comes at the cost of making it be impossible to have single-instruction division, modulo, or sqrt
which on the one hand is presumably a bigger deal in some domains than others - but on the other hand, the domains that are more likely to consider them a downside (games, physics simulations) are the ones that are most likely to want to use floats for performance in the first place!
I would say that at least for now we follow along with what integers are doing. We can always add an unsafe escape hatch later.
We can always add an unsafe escape hatch later.
I think if we don't think that through, it will cause all sorts of problems :grimacing:
I think it would be very easy to accidentally commit to a design that has really nasty error-prone scenarios if NaNs happen if we design everything with the assumption that NaNs will never enter the system
and everything works great until we reintroduce the possibility of NaNs later, and then everything falls apart
so I'd really rather talk through the implications of them up front, and either commit to never having them or else plan to allow them and design accordingly!
I think a reasonable question to ask is: let's say we assume NaNs can't happen, but infinity and -infinity are still allowed...do we still want to disallow float equality (which includes disallowing using them in sets or as keys of dictionaries) just because of precision concerns?
Can we just implement our own sane floats purely in software and then lobby for CPU manufacturers to add hardware acceleration? :sweat_smile:
and if we've already disallowed float equality, how troublesome even are NaNs anymore?
like they have weird properties with x < NaN and x > NaN returning the same answer...but what specific bugs would that lead to?
and are we more worried about those bugs than about the impossibility of getting single-instruction div/mod/sqrt even in the hottest of hot loops in a game?
another potential design angle we haven't talked about: there's a way you can tell the CPU "from now on, whenever a floating point error happens, like a NaN, set this flag and never unset it again until I explicitly clear it"
so we could use that to do something like: when the host calls into Roc, we set that flag, and then we do a bunch of stuff, and then right before we return to the host, we check to see if any floating point errors happened, and then panic if so
the big downside there would be that we'd have no way of knowing where the error actually happened
just "at some point, something was gonna return NaN, so we exploded after the fact with no stack trace"
there's also SIGFPE, which actually seems very promising except for the "behavior varies by hardware" part
I guess I should really look into that more
like this is on Solaris, and looks exciting: https://docs.oracle.com/cd/E88353_01/html/E37843/sigfpe-3c.html
it would basically be "host runs 1 instruction to tell the CPU to do a SIGFPE whenever a floating-point operation would overflow, underflow, or produce a NaN, then host sets up a SIGFPE handler which can be handled using the same strategy as roc_panic"
and I read somewhere else that it can also happen for integer overflow/underflow and division by 0, although I think if we wanted to rely on that, I'm not sure whether we might need to tell LLVM to treat integer division as "not UB"
for optimization purposes
but maybe it's fine if it does? :thinking:
another potential design angle we haven't talked about: there's a way you can tell the CPU "from now on, whenever a floating point error happens, like a NaN, set this flag and never unset it again until I explicitly clear it"
That would remove single instruction float operations as well (need instructions to update the flag). On top of that, it would not tell you were the error was (something a panic theoretically could do)
Also means you could wait hours for a computation to run in roc and then get a panic right at the end when you thought it would be done. Instead of getting the panic early right when it happened.
like they have weird properties with
x < NaNandx > NaNreturning the same answer...but what specific bugs would that lead to?
Cause UB with most sorting algorthms. More specifically the issue tends to be that x < NaN and NaN < x return the same answer.
WIth SIGFPE -> "or an inexact computation"
That sounds like it would happen all the time.
For example, anything to do with 1/3 would almost always cause inexact computation.
so this means that Roc can never have single-instruction float division, modulo, or sqrt?
It appears there are multiple instructions for sqrt (or maybe multiple instruction precision variants), at least on EMT64, with a wide range of clock cycle costs. Which would we use/offer? Multiple of them?
If making NaN impossible from floats generated by Roc ends up being a goal, we can amortize the cost pretty heavily, for example if there is a long chain of float operations which can produce NaN anywhere within it, and since NaN propagates, we can just do a NaN check at the end and have a panic with an imprecise location ("something in lines 10 to 35 produced a NaN"). Given that tail call optimization already produces imprecise traces, this doesn't seem too bad (you don't if it was the first or millionth iteration that had the issue).
Since Roc is pure, when a NaN is detected at the end, you can also reapply the span of possibly NaN'ing operations to whatever was in the stack frame or a register copy, but this time with NaN checks after each suspect operation to find the precise panic point. That replay is not fast, but the initial computation was, and since you're about to crash Roc anyways, speed no longer matters.
Alternatively you could have the panic be defined as occurring on the first data read in a non-arithmetic context (when doing a comparison to find the lesser of two floats, or stringifying, or converting to a Dec). If you do an explicit isNaN check, you can prevent the panic, but otherwise the panic will occur.
An extra note, apparently SIGFPE doesn't catch floating point division by zero. It only catches integer division by zero.
If making NaN impossible from floats generated by Roc ends up being a goal, we can amortize the cost pretty heavily, for example if there is a long chain of float operations which can produce NaN anywhere within it, and since NaN propagates, we can just do a NaN check at the end and have a panic with an imprecise location ("something in lines 10 to 35 produced a NaN"). Given that tail call optimization already produces imprecise traces, this doesn't seem too bad (you don't if it was the first or millionth iteration that had the issue).
I totally agree with the general sentiment, but only a few ops can generate NaN I think it would be better to just check at those locations and never check elsewhere. Then most float operations and comparisons don't have to be slowed down at all. Of course we can still optimize a chain of NaN checking ops into a chain of ops followed by a single NaN check.
Haha, I just realized something. Float division can't generate NaN without a NaN input. It generates Inf or -Inf when dividing by 0.
What about 0/0?
I just tried this in Rust and it printed "NaN"
fn main() {
println!("0/0 = {}", 0.0/0.0);
}
ah. missed that.
So only 0.0 and -0.0 as both numerator and denominator can generate NaN with float divisions.
I think it would be better to just check at those locations and never check elsewhere.
@Brendan Hansknecht I'm assuming we'd never NaN-check runs of instructions that don't involve any NaN-producing instructions.
Sure, but you want to check at the use once we reach non-math operations. That is what I mean by elsewere. That could be in a totally different function. On top of that, it may lead to needing to check in many more locations rather than just at the source.
Would require essentially having an internal maybe nan type and dealing with propogation and checking at many sites. I think this would be a lot messier than you imagine it. It gets a lot worse due to not knowing what will or won't be inlined.
And an imprecise panic is probably good enough if the panic can come with info you can put right into the Roc editor/debugger to reproduce, and thus the developer tooling can refine an imprecise panic into something exact.
I think it's problematic for functions to check their inputs for NaN, because then every float accepting function would need to do so, and the panic tells you little. I believe it'd be better for functions that return or could pass NaN to a subcall to do the NaN check at the call/return boundary
And even then, I'm just talking about loose semantics, not what the compiler will generate
I think it's problematic for functions to check their inputs for NaN
Why would they do that?
That would remove single instruction float operations as well (need instructions to update the flag)
for what it's worth, I think you can set the flag once at the start of the program (e.g. in the host) and that's it - I don't think it's 1 instruction per float op, I think it's one instruction ever
but I also don't like that design direction anyway :big_smile:
I think you can set the flag once at the start of the program
Yeah, I didn't realize you were talking about SIGFPE when you said that. I thought you meant tracking whether or not we hit NaN with a global flag variable.
On that note, can the roc compiler set an interrupt at all? That sounds like it would only be allowed by the platform. The interupt definition would be os/platform specific
Brendan Hansknecht said:
WIth
SIGFPE-> "or an inexact computation"That sounds like it would happen all the time.
it looks like you can specify which ones you want it to signal on: https://linux.die.net/man/3/feenableexcept
Brendan Hansknecht said:
On that note, can the roc compiler set an interrupt at all? That sounds like it would only be allowed by the platform. The interupt definition would be os/platform specific
yeah I think hosts would have to set it
bc they need to enable it
So if we went with the SIGFPE route, every host would be expected to decide how/if roc handles 0/0, sqrt(-1), etc?
yeah in the same way that they specify how panic is handled
so the requirement would be "the handler must be set, and it must halt execution"
just like roc_panic
so from the app author's perspective it's indistinguishable from a panic, since the host decides what to do in both cases
Sure, but panic is different. We call panic. We don't call an interrupt.
Like with panic, I don't think returning would work. you would be returning to garbage code between functions and probably crash. With an interrupt that is not the case.
sure
is the concern that host authors wouldn't adhere to that specification?
and would do something like "div by 0 silently returns 0" instead?
I think plenty of host authors would just not add the interrupt at all and have "normal" floats. At that point, there is no point in making roc care about floats in a special way at all.
They would be exactly as unsafe as any other language
in debug builds we could test for the handler being set
like whenever the host calls main we could have it start with a check for that
How? If the interrupt is os/platform specific? That was the point of having the host enable it in the first place
hm, true
one idea is that we could make it opt-in, and if you don't opt into it, then we insert panics by default
so there'd be some way in the platform module that you'd specify "I will set up my own handling of floating point and integer errors" and if you don't opt into that, then we generate conditionals and panics
Oh, maybe a worse problem. If the host ever generates a NaN, it would also call the interrupt handler. So they may need to toggle it with every call into Roc. If the interrupt can't be set in a thread/co-routine/fiber/etc specific way, that may lead to issues with multithreading and NaN in the host.
Brendan Hansknecht said:
Oh, maybe a worse problem. If the host ever generates a
NaN, it would also call the interrupt handler. So they may need to toggle it with every call into Roc. If the interrupt can't be set in a thread/co-routine/fiber/etc specific way, that may lead to issues with multithreading andNaNin the host.
I think that's true, yeah - e.g. https://stackoverflow.com/questions/63095616/feenableexcept-not-permanent
http://www.tin.org/bin/man.cgi?section=3&topic=feenableexcept says all of these operations are thread-safe (under "attributes")
Yes, but co-routines and fibers share a single thread pool. They can get swapped out at random times and replaced by another one. This means that even if it is thread safe you will run into problems.
also on Windows - https://docs.microsoft.com/en-us/windows/win32/dxtecharts/top-issues-for-windows-titles?redirectedfrom=MSDN#manipulation-of-the-floating-point-control-word mentions that it's thread-safe, while also noting that Direct3D will set its own settings until you tell it not to
Brendan Hansknecht said:
Yes, but co-routines and fibers share a single thread pool. They can get swapped out at random times and replaced by another one. This means that even if it is thread safe you will run into problems.
would that be true in Roc though? like the host would be in charge of knowing when that happened
and it would know exactly when it starts running a single-threaded Roc function vs not
Currently all Roc functions are thread safe to my knowledge
they actually aren't because refcounts aren't atomic
that's something I definitely think should be platform-configured in the future
bc they're slower if they have to use the atomic operations (e.g. Arc vs Rc in Rust)
so if you know the Roc code will only ever be running on a single thread, it will run faster if we generate non-atomic refcount operations only
Ah, fair, but I think this is still different. That is about making sure the data you hand to roc isn't be accessed by multiple threads. This is about not calling a function that might trigger SIGFPE while a roc function is potentially suspended on the same thread.
:thinking: would that be a problem though?
like I can suspend the Roc function, change the float settings back to "don't interrupt for NaN", and then change them back before resuming a Roc function
My example situation is a game that uses fibers for multiprocessing. It calls all roc functions with data that they know has been safely loaded to the thread (so no roc refcount issues). It enables SIGFPE and calls into roc. While roc is running, the fiber is context switched and a new c++ fiber is running. The new fiber triggers SIGFPE even though it thought it was disabled.
The other fiber would have no idea that a roc function was even running
The fiber engine is just part of a library and doesn't concern itself with things like SIGFPE That would just increase context switch time (I think this is the actual case in most fiber libraries).
sure
I guess that's an argument for making this be platform-configured
like maybe the game platform actually wants to enforce that NaNs never come up, and is ok with panicking!
since games are an example of a use case where it would be desirable to have the fastest possible math operations in Roc
like maybe the game platform actually wants to enforce that NaNs never come up, and is ok with panicking!
I would guess most actual games would just assume NaN would never come up and would not want panics either.
but if that's what they want, then the signal should be fine, yeah?
like they pay one instruction at the beginning of the program, and then nothing else ever happens
No, they don't want the signal either. They want neither
they want actual NaN?
and integer division by 0 to return 0? (since that's UB otherwise)
interestingly, it looks like doing things this way actually enables more LLVM optimizations than doing a conditional panic!
from the LLVM LangRef:
The number and order of floating-point exceptions is NOT guaranteed. For example, a series of FP operations that each may raise exceptions may be vectorized into a single instruction that raises each unique exception a single time.
Not necessarily, but for many values NaN may just cause something silly for a frame and things will move on.
that's very true - that is something Mason mentioned
things looking weird in the game can be more desirable than having the game halt
But yeah, if the NaN gets into important game state, the game will probably eventually break in some way if not crash.
also interesting: LLVM can do additional optimizations if you tell it to assume there won't be NaNs:
nnanNo NaNs - Allow optimizations to assume the arguments and result are not NaN. If an argument is a nan, or the result would be a nan, it produces a poison value instead.
same thing with ninf for no infinities
Those seem useful depending on what we end up with
agreed!
Another minor note with: SIGFPE. IIUC, it will be slow to toggle (kernel call). So it will require Roc functions to do more work for them to be worth calling into if it is toggled with every call.
fair point!
honestly I think most hosts would just set it and then ignore it forever
like they'd just assume it would never come up in the host code
I have no idea how safe of an assumption that is for games
e.g. might non-Roc code outside the GPU (shaders etc) do those things? No idea!
one interesting thing about this is that if we try it out as the default, we can get a lot of signal pretty quickly on whether it's a problem in practice
honestly I think most hosts would just set it and then ignore it forever
Yeah, it hopefully will work fine that way, but it does fill the default of what every programmer expects in their language.
right
to be honest, just thinking about the types of things hosts will be doing, it seems kinda unlikely to me that they'd be writing code which would rely on that behavior being different
unless they're using libraries which do, of course
but I don't know what types of libraries might be doing that - e.g. I kinda hope a physics library or something like that is not producing NaNs or inf/-inf internally and then recovering from it? :sweat_smile:
idk. I feel like it might be semi-common in certain code areas to run a bunch of ops and then check for nan.
sure, but they'd be expecting it wouldn't be NaN, right?
and the check would be defensive
Oh, also, inf and -inf are definitely more common in random libraries.
yeah I've heard of people (e.g. game devs) intentionally making use of inf and -inf
So those are probably more likely to be problematic than nan
I wonder what would happen if we went all the way to essentially enabling -ffast-math
we would lose referential transparency across multiple compilations/cpus
just go all the way in on "floats are for speed, not for accuracy - and wow will they be speedy and inaccurate!"
whoa, seriously?
well that's no good
that reminds me: Mason mentioned that apparently it's a common problem in game dev that sin and cos and some other trig values give different answers depending on which libc you're compiling with
I have to think about that more. We definitely at least get the case were you could change one function and then it would modify the result of another function. Cause changing one function may stop it from inlining which may stop it from reordering instructions which would lead to different results.
something about people preferring to use the software implementations than hardware ones bc they give better answers or something
But yeah, I think llvm could generate faster instructions on one cpu with certain extensions that are not guaranteed to return the exact same results
yeah that makes sense
Of course if we don't have float equality, it would be hard to observe the behaviour.
well, a use case Mason brought up is saved replays in games
where you want to save all the user inputs and then replay them on a different machine to watch the game
for that to work, everything has to be completely reproducible - otherwise one tiny difference can have a cascading effect - e.g. one hitbox detection doesn't quite pass, like it did on the original machine, and from then on you're watching a totally different replay :sweat_smile:
something about people preferring to use the software implementations than hardware ones bc they give better answers or something
Yeah, they are not very fast and not very accurate. I feel like I read an article before of a software based implementation that was essentially the same speed and more accurate at the cost of more memory.
yeah I think we should draw the line at "same code always gives the same results" if possible
I actually don't know how possible that is, to be fair
like I think I may have heard of different CPUs giving slightly different answers for the exact same floating point ops
but I don't have anything I can point to to confirm or refute that!
I mean it is easy if everyone buys the exact same computer down to the last chip and they don't have any bugs.
CPUs of the same model even, definitely will have bugs due to the creation process. Bad cpus that didn't get caught or that have results that are close enough to correct.
It is rare, but it definitely happens. Including the super fun case of one cpu core having broken encryption hardware such that it can encrypt and decrypt it's own data, but other cores on the same cpu would fail to decrypt data from that core.
yiiikes
well so maybe that's not a guarantee we can (or perhaps should) even rely on anyway
if it's just asking for trouble to tell people they can count on it
"trouble" being bugs in this case
I think in practice it is a reasonable guarentee, but at scale it does have exceptions.
well for something like replay it could matetr
*matter
These type of bugs take mega corporations to notice.
like I wouldn't want to tell people "build your game in Roc and saved replays will Just Work!" and then they occasionally don't :grimacing:
It would only happen when share a replay between multiple computers, at least one of the cpus having a bug (super rare) and the bug matter to the replay(probably also quite rare).
that's fair
I don't think roc would get blamed here.
It would be the hardware bug that is to blame.
yeah, I suppose "this works unless your hardware has a bug" is in general the strongest guarantee any piece of software can give about anything :laughing:
So I would just assume that these type of CPU bugs don't exist for our purposes.
I think it would just muddle design and aspiration too much
I'm sold!
so that suggests we should be choosy about which LLVM optimizations we enable
and stick to the ones that wouldn't give different answers on different CPUs (assuming no hardware bugs)
Anyway, for the float discussion, I feel that we have touched on a ton of points. I think that it would be best to pick one of the more controversial solutions and then test it to see how it turns out in practice. I think the main candidates are:
SIGFPE with a handler (probably only for a specific subset of the signal? cause we don't want it to trigger on integer division by zero?) (check wasm exposes SIGFPE)RawFloat for anything unsafe. Limit Float to safe things and assume it isn't NaNNaN checks an explicit part of the language like division by zero with integers. Have a specific function to opt out of the NaN check for performanceNaN just call panicThe later 2 if they end up being too slow to be acceptable, could be optimized to group multiple operations before doing the NaN check. I also have just talked about NaN, but this may also apply to Inf and -Inf if we want.
@Richard Feldman a real-world game/simulation replay would likely have a "state checkpoint" concept (like video i-frames) and/or a high level annotation ("the collision occurred") anyways, to both make auditing easier at negligible extra cost (and storage _is_ cheap), as well as to allow fast "seeking" through the replay, so that you don't need to recompute the state from the very beginning just to see what happens one hour in.
It's a nice thought that we _could_ do replays that way, but I doubt that's how we'd realistically do them in practice, and having current state represented merely as the result of an unbounded sequence of float operations seems precarious in the best case (it's essentially unstructured analog computing that happens to have a paper trail).
Even if you're debugging, you probably just want to isolate changes since some reference state, not since the beginning of the program. Bugs tend to repeat, and so there's not much difference between "since program start" and "since reference state" besides memory required and the degree of debugging step-through tedium involved
it appears that wasm doesn't support float exceptions but it does support trapping on integer overflow and division by 0
this to me actually makes an interesting case for using SIGFPE on integer division
because otherwise it is literally impossible to have single-instruction integer division or modulo without also having UB
that is the only possible way to support it, and it turns out wasm supports it already
so I kinda think we should just do that, and have division no longer return a Result
(separately, I've also gotten feedback that it would be nicer for division by 0 to panic than returning a Result, especially if we had the "check for div by 0" lint mentioned earlier)
now knowing that wasm doesn't support SIGFPE for floats makes for a tricky tradeoff.
SIGFPE on non-wasm hosts, then we can get better performance than is otherwise possible, because we can safely enable LLVM optimizations that assume no NaNs and no +Inf/-Infwhich reduces to: either we make wasm slower at float ops or we make every other target slower at float ops
"make wasm slower" seems better to me for two reasons: one, wasm already has baseline performance overhead, and if you really need the performance to be maximally fast, you probably want to go native instead of wasm anyway; and two, it's possible that wasm might add support for SIGFPE on float exceptions in the future, since they clearly are already handling SIGFPE for integer division by 0!
so to this question:
Brendan Hansknecht said:
Anyway, for the float discussion, I feel that we have touched on a ton of points. I think that it would be best to pick one of the more controversial solutions and then test it to see how it turns out in practice. I think the main candidates are:
- Require the host to set
SIGFPEwith a handler (probably only for a specific subset of the signal? cause we don't want it to trigger on integer division by zero?) (check wasm exposes SIGFPE)- Make anything that would return
NaNjust call panic
I think we should try out those two :point_up:
that is, require SIGFPE handling in hosts, for both floats and integer division by 0, and then generate corresponding panics for wasm only
that design would give us:
the downside would be:
SIGFPE handler on non-wasm targetsalso, if it turns out that requiring it to be enabled is to onerous for hosts in practice, or if it's common for people to forget to set it up, leading to UB, we can introduce a way (possibly default) to have panics get generated instead even outside wasm
one potential way we could make it as easy as possible for hosts to set up is that maybe we could make the signature of roc_panic compatible with the signal handler, such that there would be a single line of boilerplate we could give host authors to run which sets up roc_panic as the handler for all those exceptions
an exciting part about trying this out is that if it works, Roc could actually be not only less error-prone (no NaNs to worry about!) than something like C++, but roc --optimize could actually legit beat clang -O3 at certain benchmarks because we'd have the same (zero) overhead on our arithmetic, but we'd be enabling more LLVM float optimizations by default!
(obviously clang lets you opt into those too, so it's not like Roc would be actually faster, but even "faster by default" for math-heavy workloads would be pretty exciting!)
assuming we go with that strategy, the case for whether or not floats should support Equating would really come down to how worried we are about precision
I was talking to Ian MacKenzie about his use cases (lots of geometry stuff), and it came up that Num.isZero actually covers a lot of the cases you'd want to use exact float equality for anyway
he did come up with an example where you do want to test for exact float equality, which when testing serialization/deserialization code (e.g. for a protobuf implementation or something) - it's pretty important to verify that exactly what you serialized is what gets deserialized afterwards
but of course in cases like that you can always do the ol' !(x < y || x > y)
or something like that
or Num.isZero (x - y)
And I believe the discussion about equality truly has been regarding ==. We can provide a named function for equality that is compiled into a single instruction, just making it clear that it should only be used by those that know what they're doing.
And yeah, I don't see a compelling case for floats as Dict keys or set elems. If someone really needs that, they can easily enough make their own type that implements Equating
he did come up with an example where you do want to test for exact float equality, which when testing serialization/deserialization code (e.g. for a protobuf implementation or something) - it's pretty important to verify that exactly what you serialized is what gets deserialized afterwards
Could also use some sort of bitwise comparison or make an exceptions for tests.
true! Like converting both the expected and actual floats to bytes and comparing those
yep
Or i guess subtract them and check Num.isZero
Mason gave a good example of inf/-inf being useful in game dev:
For example an easy way to make an immovable object is to make its mass infinite
Or if I’m comparing a bunch of objects distances I’ll initialize best result to + or - infinity
I think that should still work even with SIGFPE. I think you can still use infinity in sound ways.
Just would need to give roc a way to generate it. Cause overflowing to generate it would trigger SIGFPE
Screenshot_20220318-174738.png
yeah I think that's reasonable
we could just have Num.infinity and Num.negInfinity
from this article:
To generate a trap, a program must change the execution state of the process using the fp_trap subroutine and enable the exception to be trapped using the
fp_enableorfp_enable_allsubroutine.Changing the execution state of the program may slow performance because floating-point trapping causes the process to execute in serial mode.
anyone know what "serial mode" refers to here? 🤨
not sure if that's something specific to IBM's AIX operating system
I have never heard of it before.
yeah all the other search results I can find seem to be IBM-specific
we could just have
Num.infinityandNum.negInfinity
Perhaps
Num.floatInfinity : Num -> Float
Num.floatInfinity = \sign -> ...
floatInfinity, because just infinity suggests (to me) that it's available to number types in general, including Dec, and accepting a sign because that's more convenient than branching in many cases (yet can still be used with branching if desired for code style)
But Dec does have infinity, and it has NaN too. Isn't this what were using? https://en.m.wikipedia.org/wiki/Decimal128_floating-point_format
nope! We're using a 128-bit fixed point representation
it doesn't have infinity or nan
cool, TIL every basic FP arithmetic operation can produce NaN:
-Infinity + Infinity == NaNInfinity - Infinity == NaNInfinity * 0 == NaN0 / 0 == NaNalso not only sqrt of negative but also logof a negative
source: https://www.agner.org/optimize/nan_propagation.pdf
(plus I tried them, and...yup)
so I'm finding an argument to not do commit to the SIGFPE design, which is that I'm not actually sure if it's gonna be available indefinitely :grimacing:
it seems like Apple is not even bothering to support it in their libc for M1s
I've found people on the Internet asking how to do it and the only one response was an Apple framework dev saying "we don't support that for M1 yet but you can set these flags manually" and then the guy replied that he did set the flags manually and they were ignored, and no response, and that was a year ago :sweat_smile:
and I've been unable to get it working on my M1 even just using C
plus that Agner Fog article basically explains how hardware people don't want it to be a thing because it overcomplicates things in the presence of CPU parallelism, which makes me concerned that future chips might drop support for it altogether and encourage OSes to fall back on software emulation for anything relying on it
unfortunately, that same article revealing that plus, minus, times, and division all can produce NaN makes some of these alternatives less appealing:
Brendan Hansknecht said:
Anyway, for the float discussion, I feel that we have touched on a ton of points. I think that it would be best to pick one of the more controversial solutions and then test it to see how it turns out in practice. I think the main candidates are:
Require the host to setSIGFPEwith a handler (probably only for a specific subset of the signal? cause we don't want it to trigger on integer division by zero?) (check wasm exposes SIGFPE)- Make an explicit
RawFloatfor anything unsafe. LimitFloatto safe things and assume it isn'tNaN- Make
NaNchecks an explicit part of the language like division by zero with integers. Have a specific function to opt out of theNaNcheck for performance- Make anything that would return
NaNjust call panic- Other? A mix?
The later 2 if they end up being too slow to be acceptable, could be optimized to group multiple operations before doing the
NaNcheck. I also have just talked aboutNaN, but this may also apply toInfand-Infif we want.
@Richard Feldman so if SGIFPE has an unclear future, would anything in https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/supporting.20NaN.2FInfinity.2F-Infinity.3F/near/275836457 work?
Kevin Gillette said:
If making NaN impossible from floats generated by Roc ends up being a goal, we can amortize the cost pretty heavily, for example if there is a long chain of float operations which can produce NaN anywhere within it, and since NaN propagates, we can just do a NaN check at the end and have a panic with an imprecise location ("something in lines 10 to 35 produced a NaN").
assuming we could have the compiler reliably detect this, I'm not sure how much of a difference it would make. In a lot of cases, the chains are pretty short but they happen inside a hot loop. It's also common for the result inside the hot loop to be setting values inside structs (in Roc's case, the example would be a List.map over some records which hold floats, and then the compiler would emit in-place mutations of those records) so if we waited to the end of the loop to detect if any NaNs happened, we'd have to iterate over the whole loop again
or, alternatively, do a check on each iteration of the loop and write to a local variable if anything is NaN - but at that point we might as well just panic if the check ever passes
in general, I still think if we're going to panic on things, we also have to offer non-panicking versions so that we aren't introducing a cap on maximum possible performance
but the SIGFPE design is the only one that allows that while ruling out NaNs
so I guess the summary is, we have to do one of two things: either
SIGFPE design, orwe definitely can't treat NaNs as UB while still allowing them to be the result of every arithmetic operation, since that would mean every floating-point arithmetic operation in Roc would become a potential source of UB
we also can't rule out infinity or -infinity altogether (again except for the SIGFPE design) because overflow causes that, and requiring that all floating-point arithmetic operations do a check-and-panic would make them all more expensive and also defeat LLVM optimizations for all of them
so it's really seeming more and more like the most viable option is:
do not attempt to rule out NaNs, and design things with them in mind
Richard Feldman said:
It's also common for the result inside the hot loop to be setting values inside structs (in Roc's case, the example would be a
List.mapover some records which hold floats, and then the compiler would emit in-place mutations of those records) so if we waited to the end of the loop to detect if any NaNs happened, we'd have to iterate over the whole loop again
I see. I'm guessing this is one of those places where the list has runtime-determined size or is large enough that it can't be fully unrolled. Thus we're likely talking about memory writes rather than just register operations, and memory is slow. I know on some architectures you can treat memory reads and writes as asynchronous operations, but I don't think that's common. In any case, if so, you could start the memory write of the current element, start the memory read of the next element, and while those are happening, do the NaN check against a register.
On architectures where branch prediction is common (x86-64), the same effect might be achievable transparently: Intel execution pipelines, for example, may proceed upon the assumption that the panic branch will not execute, because it hasn't on any of the last iterations, and thus proceed with pre-fetch, making some of the post-mutation reads "free"
Essentially all modern architectures should do this, though it still better to put reads as early as possible instead of after NaN checks where possible.
Brendan Hansknecht said:
Essentially all modern architectures should do this, though it still better to put reads as early as possible instead of after NaN checks where possible.
Please elaborate on "put reads as early as possible"
Richard Feldman said:
cool, TIL every basic FP arithmetic operation can produce NaN:
-Infinity + Infinity == NaNInfinity - Infinity == NaNInfinity * 0 == NaN0 / 0 == NaN
This is the real killer. Especially since we have already stated that inifinity is useful and we want it.
Please elaborate on "put reads as early as possible"
I mean that it is always best to put memory loading/reading instructions as early in the instruction stream as possible. Given that will at minimum take a few cycles on modern CPUs, you want the pre-fetching to start as soon as possible. So if you need to load some piece of memory into a register for your next floating point instruction, it would be best to put it before the NaN check of the previous instruction.
This all being said, in many cases it will make little to no difference because the CPU will already be looking at 4+ instructions at a time and potentially looking a few hundred instructions into the future.
I was trying to point out that on essentially all modern hardware you should consider:
I know on some architectures you can treat memory reads and writes as asynchronous operations, but I don't think that's common
to be true even if it isn't represented in the instruction stream.
Also, TIL apparently the apple M1 has a larger reorder buffer than most x86_64 cpus by about 2x . It has 630 slots.
Just imagine the fact that the CPU could be executing an instruction that is 630 instructions after the memory load that your program is currently stalled on.
To top off the crazy numbers, an M1 mac can have 130 loads in flight along with 107 stores. Talk about being able to do run things super far into the future.
Richard Feldman said:
so it's really seeming more and more like the most viable option is:
do not attempt to rule out NaNs, and design things with them in mind
Having reflected upon the discussion so far, it seems like a pretty good compromise to not have RawFloat, and instead just ensure that panics happen as a result of _producing_ NaN or, perhaps, overflow to infinity. I don't believe it's important to panic immediately in either case, as long as the panic is roughly in the right place (at least all but the last stack frame should be accurate). It'd be an open question as to how Roc handles NaN given to it from the platform.
When exceptional performance is needed, perhaps unchecked arithmetic functions can be provided. Perhaps also the NaN panic implementation can be provided by the platform (with a default/reasonable implementation provided by Roc itself is the platform does not override). I don't know how well optimizations across the platform boundary work, but if the platform provides a no-op nan-check-and-panic function, then perhaps it would optimize away, thus having no runtime cost for that platform.
I don't know how well optimizations across the platform boundary work
They don't exist, but theoretically could with something like LTO and a solid amount of hassle.
and instead just ensure that panics happen as a result of _producing_ NaN
When exceptional performance is needed, perhaps unchecked arithmetic functions can be provided.
putting these two together, what we're saying is:
which reduces to "we'll treat NaNs as UB" :sweat_smile:
Brendan Hansknecht said:
Given that will at minimum take a few cycles on modern CPUs, you want the pre-fetching to start as soon as possible.
iiuc, memory reads (at least those that need to go to main memory) take a lot more than a few cycles to complete.
Brendan Hansknecht said:
I was trying to point out that on essentially all modern hardware you should consider:
I know on some architectures you can treat memory reads and writes as asynchronous operations, but I don't think that's common
to be true even if it isn't represented in the instruction stream.
...
Just imagine the fact that the CPU could be executing an instruction that is 630 instructions after the memory load that your program is currently stalled on.
This would suggest that the described common case of short sequences of float operations per memory location would mean that an extra instruction (per memory location in a list) may be negligible because of pipelining, instruction reordering, etc.
iiuc, memory reads (at least those that need to go to main memory) take a lot more than a few cycles to complete.
L1 cache...probably about 4 cycles
Main memmory...100-200ish cycles
Richard Feldman said:
well also let you opt into letting them happen
which reduces to "we'll treat NaNs as UB" :sweat_smile:
It's not UB in the C/C++ sense, where the compiler is allowed to do truly silly things based on arbitrary [mis]assumptions (like access out-of-bounds memory). The extent of this UB is that NaN would propagate, and presumably anyone opting into unchecked operations for better performance will check for NaN before they make decisions based on the result of the computation.
This would suggest that the described common case of short sequences of float operations per memory location would mean that an extra instruction (per memory location in a list) may be negligible because of pipelining, instruction reordering, etc.
It does help a ton, but a lot of times chains of dependencies really stop it from actually executing very far ahead. Like the counter value of a loop for example.
On top of that, there is a large horizon issue. If you have extra instructions in the hot loop, it will take extra longer to read through the instructions in the hot loop. So there is a chance the important load instruction is just outside of the current window and is delayed by a cycle or 2. And that happens on every loop iterations
presumably anyone opting into unchecked operations for better performance will check for NaN before they make decisions based on the result of the computation.
I think this reduces to basically everyone doing game dev though!
considering addition, subtraction, multiplication, division, sqrt, and log can all produce NaN from non-NaN arguments...people would be opting into the better-performance version all the time!
Richard Feldman said:
we definitely can't treat NaNs as UB while still allowing them to be the result of every arithmetic operation, since that would mean every floating-point arithmetic operation in Roc would become a potential source of UB
I suspect this really applies to spans of floating point operations, not individual operations. For example, given a sequence of 10 float operations that is converted to an integer and used as a list index, if a NaN is produced anywhere in the span (even in the first instruction), that really doesn't matter until the integer conversion.
Richard Feldman said:
presumably anyone opting into unchecked operations for better performance will check for NaN before they make decisions based on the result of the computation.
I think this reduces to basically everyone doing game dev though!
considering addition, subtraction, multiplication, division, sqrt, and log can all produce NaN from non-NaN arguments...people would be opting into the better-performance version all the time!
That brings us back then to "use Dec unless you know you need Float, and float has propagating NaNs and infinities, just like most other languages."
yeah I'm thinking that design is now the most promising, after exploring SIGFPE more. :thumbs_up:
to recap the design from the end of yesterday:
expect to confirm you're aware of it and are confident it's not zero. (I think we should do this regardless; it benefits integers too.)Equating, which means they don't get ==, <=, or >=, and that they can't be used as keys in hashmaps.NaN != NaN downside should primarily come up in practice when sorting, and it seems possible to provide default sorting for floats that takes this into account without any runtime performance cost.with those in place, the remaining NaN "gotchas" are that x < NaN == NaN < x even when x is not NaN. This primarily could cause problems in a sorting algorithm, although most sorting algorithms on collections would likely Order, which would explicitly sort NaNs consistently.
I wish there were some way to implement < in a single instruction such that x < NaN == NaN < x was false if x wasn't NaN, but unfortunately I suspect there's no such instruction
an
expectto confirm you're aware of it and are confident it's not zero.
Does expect do anything at runtime? Or is it just a modification at compile time? If just compile time, we might want to change the word to assume. Most people associate expect with test failures and panicking.
expect fails tests, and will report (but not panic) an error in dev builds
it does nothing in optimized builds
assume feels like it might be a weird word to use when writing tests :thinking:
assume sounds to me like a compiler hint: "Roc, don't check this thing for safety because you can assume it's safe"
Maybe I am mis-interpreting what you mean here:
Have a compiler warning when dividing by a non-constant, unless you either use a conditional or an
expect.
This sound like expect would be used in regular code to cause roc to remove a safety conditional.
Sounds like I can do:
expect (x != 0)
y = 3 / x
instead of:
y = when Num.div 3 x is
Ok v -> v
Err DivByZero -> #idk do something with errors
If you meant that you can do what I said above. That is where I am suggesting it should be:
assume (x != 0) # I know what I am doing, skip the safety
y = 3 / x
That kind of approach would be intriguing UB though, since that last example means "if x is zero, you'd normally panic because it's my fault, but in this case don't panic because it won't happen, and if it does actually happen, it's doubly my fault."
It's certainly a nicer way to introduce UB than the classic C/C++ way of the compiler implicitly making that assumption when asked to optimize, and it's certainly a more limited form of UB than C/C++ compilers may produce (it's actually just NaN propagation in Roc's case).
That said, coding assumptions are frequently incorrect in my experience: even if the assumption completely holds at the time the code was initially written, some later change in a distant part of the code will allow zeros to propagate into x.
Per Richard Feldman's point, even an "assume for speed" should probably become a panicking assert in debug builds.
I'd be concerned that an "assume for speed" would get misused, i.e. someone may add a dozen irrelevant assume lines to something trying to get more speed ("there, that should do it"), without actually telling the compiler anything useful, and without having done any profiling.
If allowed as a language feature, the expressions should be constrained somehow, so that you can't do:
x, y, z ->
assume (z != 0)
x / y
@Richard Feldman why exactly are we considering omission of NaN checks to be undefined behavior? Is it because LLVM may itself produce true UB when NaN is assumed to not occur (such as hypothetical jmps based on clever float bit patterns that go to unrelated areas in the code)? Or is it because we're considering NaN propagation to be UB? or something else?
here's how I'm thinking about things:
If we say "let's assume NaNs won't happen" then the practical consequence of that is not memory unsafety, but rather that we don't design APIs with NaNs in mind. An example of this is sorting - if we assume NaNs won't happen, then if a NaN gets into a list, and that gets sorted, what happens is then undefined behavior in that we have intentionally chosen not to think about it - that is, to define it. It'll be a surprise for whoever is unfortunate enough to encounter it!
If we know NaNs can actually happen (meaning we don't do the SIGFPE thing), then NaNs need to be considered in every floating-point API design (like sorting), and have their behavior defined - even if other constraints mean that the defined behavior in some cases is "somewhat error-prone, but making it less error-prone would have unacceptable performance characteristics."
Brendan Hansknecht said:
Have a compiler warning when dividing by a non-constant, unless you either use a conditional or an
expect.This sound like
expectwould be used in regular code to cause roc to remove a safety conditional.
it wouldn't be a safety conditional, it's just a lint reminding you to consider division by zero.
the thing that's error-prone about div by 0 is that it's easy to forget to think about it. If you do forget, the consequence is (at worst) an inf/-inf/NaN, not memory unsafety. So it's there to help you, but if you say "I know what I'm doing," it's not like the program will segfault or anything
Okay, I see what you mean with sort. So that doesn't seem to be an issue with < itself, for example, but just how we're using it.
Perhaps we could just define NaN as being less than all values so that x < y == y > x will always hold to be true. Right now it only doesn't hold because of NaN, and that's pretty unfortunate. We've called into question the value of NaN being unequal to itself (which we're sidestepping with the consideration of having no float equality), leaving us just with "unorderable NaN," which seems like plainly unnecessary complexity (no value gained for the pain it inflicts.
Such a change would make < and > cost 2-3 instructions instead of one in the worst case (when the compiler can't prove that NaN is not involved), but it seems that performance is more important for arithmetic than comparisons (since in Roc you already can't do x + (y > z) as you can in C). It also means that < and > have a better language-level definition.
When speed or comparisons-as-math are needed, we can provide functions for the C style behavior of having numeric comparisons returning 0 for false and 1 for true, and these can be defined as following ieee754 NaN rules.
is that the way to give people who need floats the best experience? :thinking:
obviously x < NaN == NaN < x is unintuitive, but can we think of some specific realistic scenarios where this would cause a bug?
I don't think we should automatically sacrifice performance or ergonomics just because a bug is theoretically possible
those considerations have to be balanced against how likely the bug actually is to occur in practice, and what the consequences would be if it did occur!
and especially in the context of who would be using the feature in question
like I think it's very safe to assume that if I'm a game programmer, my default preference is that < works normally on non-NaNs, and that it doesn't have any extra performance cost
what's the specific thing that sacrificing either of those would save me from? :thinking:
obviously
x < NaN == NaN < xis unintuitive, but can we think of some specific realistic scenarios where this would cause a bug?
Would this but lead to a bug in most sorting algorithms?
it wouldn't be a safety conditional, it's just a lint reminding you to consider division by zero.
the thing that's error-prone about div by 0 is that it's easy to forget to think about it. If you do forget, the consequence is (at worst) an inf/-inf/NaN, not memory unsafety. So it's there to help you, but if you say "I know what I'm doing," it's not like the program will segfault or anything
I still don't understand how expect would be used outside of tests. Can you give an example?
so if we have a default List.sort for be floats, and a default compare for floats, and they both sort NaNs to the edge...what would be the situation where someone writes their own sorting algorithm using < instead of compare?
I see. So we are defining the implementation of NaN comparison such that they sort to the edge. Just only if you use the compare function, not if you use <
Brendan Hansknecht said:
I still don't understand how
expectwould be used outside of tests. Can you give an example?
sure! Let's say I'm running a development build and I hit expect 1 > 2 - it would log that an expectation failed (to stderr if I'm running in a terminal, or to the editor if running in the editor), but the program would continue running. So I'd be notified that my expectation didn't hold, but it wouldn't block execution from continuing - kinda like a dbg! behind an if in Rust.
in an --optimize build, all the expects would do nothing; they'd just get removed.
Have a compiler warning when dividing by a non-constant, unless you either use a conditional or an
expect.
So unless I have expect (x != 0) I will get a compiler warning when typing 3 / x?
yeah, unless x is assigned to a nonzero literal, in which case we know for sure it's fine
or you use Num.divChecked which would check for div by 0 and return a Result
and wouldn't have a warning
also it would presumably need to be expect Num.isNotZero x since != won't be supported on floats!
so in the integer case, does the expect actually make any sense. Won't the division by zero panic anyway?
Is it just to make sure people explicitly choose either to use divChecked or add an expect?
or and a conditional around the div I guess
yeah exactly
divChecked would just be easier for the compiler to verify than a conditional
So in a really math heavy function, I could put a bunch of expects at the beginning and then just write normal math code. That sounds nice.
Would love if this could expand to things like list bounds checking. So that I can remove the checking from hot loops in optimized builds, but not sure if that would be too unsafe. Frankly some sort of assert that was still around in an optimized build would likely also be useful just to get llvm to put the conditional where I want it and potentially group them.
but this is all quite a different topic from floats.
Regarding bounds checking, aren't compilers pretty good at eliminating bounds checks when it's safe to do so?
Decently, but in writing wyhash in Roc, I have run into a number of cases where I can't get certain hot loop bounds checks to get removed. I at best can get it to merge multiple bounds checks into 1 per loop iteration, but that still has noticeable overhead.
yeah that's definitely an interesting idea!
I don't think we should make it possible to eliminate the bounds checks entirely - potentially introducing memory unsafety at runtime is a bright line I don't want to cross - but making it possible to extract them outside loops sounds like an interesting idea to explore for sure
Last updated: Jun 16 2026 at 16:19 UTC