After speaking with @Jasper Woudenberg , I decided to try and implement the Sort ability for floats. But quickly after starting, I realized there isn't a clear good way to do this. I made a list of strategies below, but was hoping to get some guidance from someone with expertise in the subject.
Strategies I am aware of:
<
and ==
, which always return false if either operand is a NaN. This resulted in compare always returning GT when there is a NaN involved. I am strongly apposed to this approach because, as the Haskell source code notes, this violates reflexivity which can result in data corruption.PartialSort implements partialCompare: a, a -> Result [LessThan, Equivalent, GreaterThan] [Unorderable]
. I like this approach because, a) it is honest about the fact that not all floats are comparable, and b), a PartialSort ability is likely useful for other types. I do not like that it will likely make working with floats more awkward.My instinct would be to go with the C# strategy and just make NaNs the smallest, though Rust's partial order approach strikes me as more in line with the Roc ethos.
Let me know what y'all think.
great question! We talked about this before:
Richard Feldman said:
so if we have a default
List.sort
for be floats, and a defaultcompare
for floats, and they both sort NaNs to the edge...what would be the situation where someone writes their own sorting algorithm using<
instead ofcompare
?
I think what we want here is:
NaN
s end up sorted to either the beginning or endso I think C#'s approach makes sense, although it would also be fine to consider NaN
to be the largest value (instead of the smallest) if it turns out that's better for performance than treating it as the smallest
I assume it's probably the same perf either way
Thanks! That completely clarifies
I have another implementation question. Which of the two below options would be preferable from a performance perspective? I know a few more instructions can be preferable to branching, but I do not have a feel for how many instructions.
a) A function with if-elses. This has a lot of branches, but the top two cases are the common ones. (I am guessing this does not actually need to be a function, but I am completely new to LLVM and compilers, so I would need some pointers on how to do that.)
float_sort(a, b)
if a < b
return 2
if a > b
return 1
if a == b
return 0
if b == b // a is NaN, so smaller
return 2
if a == a
return 1
return 0
b) This expression (a == a) + 2*(b == b) - (a < b) - 2*(a > b) - 3*(a == b)
. This has no branches, but it is a lot of instructions. (It would also warrant a comment with a table demonstrating the logic)
Almost certainly branchless will be better for average performance. That said, the top implement may turn into cmovs anyway, though probably not due to returns. But if you assign that if back to a variable, probably would cmov and be quite good...though not fully sure.
Other way, I would bet on the expression being the better option
Last updated: Jul 06 2025 at 12:14 UTC