Stream: ideas

Topic: List.binarySearch


view this post on Zulip Ayaz Hafiz (Dec 08 2023 at 00:00):

Thoughts on adding List.binarySearch : List a, (a -> Bool) -> Result a [NotFound]?

view this post on Zulip Agus Zubiaga (Dec 08 2023 at 00:05):

How would this work for any a? Don’t we need an Ord ability first?

view this post on Zulip Brendan Hansknecht (Dec 08 2023 at 00:07):

I think you just need the function to be (a -> [Smaller, Eq, Larger]) and it would work.

view this post on Zulip Brendan Hansknecht (Dec 08 2023 at 00:07):

Since the user passes in the function, no need for the ability

view this post on Zulip Agus Zubiaga (Dec 08 2023 at 00:08):

Right, like List.sortWith

view this post on Zulip Agus Zubiaga (Dec 08 2023 at 00:13):

Yeah, I suppose Ord doesn’t fix this because you would also need to provide the value you’re looking for to compare

view this post on Zulip Ayaz Hafiz (Dec 08 2023 at 00:14):

Yes, sorry, s/Bool/Ord

view this post on Zulip Agus Zubiaga (Dec 08 2023 at 00:21):

I suppose my Lists aren’t usually sorted, but maybe they should be :big_smile:

view this post on Zulip Brendan Hansknecht (Dec 08 2023 at 00:26):

This is now making me think of cache friendly binary search and how for max perf you actually want to sort in a really weird way that is closer to a heap.

view this post on Zulip Richard Feldman (Dec 08 2023 at 00:51):

yeah I definitely think we should have this!

view this post on Zulip Richard Feldman (Dec 08 2023 at 00:52):

I think the main question is what sorting algorithm to use :big_smile:

view this post on Zulip Richard Feldman (Dec 08 2023 at 02:19):

(er, sorry - I meant for Ord, not for binarySearch :stuck_out_tongue:)

view this post on Zulip Brendan Hansknecht (Dec 11 2023 at 01:43):

For sorting and Ord, we definitely should pick a stable sort. That is my only request. The rest is just a performance detail.

view this post on Zulip Ayaz Hafiz (Dec 11 2023 at 01:54):

I don't think the sorting algorithm is relevant for Ord itself. Different containers can use different algorithms but compare elements with the same ability.

view this post on Zulip Brendan Hansknecht (Dec 11 2023 at 02:13):

For sure

view this post on Zulip Brendan Hansknecht (Dec 11 2023 at 02:14):

Also, two fun talks:

  1. How to make binary search way more CPU friendly: https://youtu.be/1RIPMQQRBWk?si=InuPpGmT7bUqX-gd
  2. Better sorting and some of the pains of trying to upgrade sorting algorithms that aren't stable: https://youtu.be/cMRyQkrjEeI?si=zXZV98r_rlQHCg2C

view this post on Zulip Richard Feldman (Dec 11 2023 at 02:34):

Brendan Hansknecht said:

For sorting and Ord, we definitely should pick a stable sort. That is my only request. The rest is just a performance detail.

:thinking: is stability even observable in Roc?

view this post on Zulip Ayaz Hafiz (Dec 11 2023 at 02:42):

For sure, because if a sort sorts by e.g. length of word, then stable([abc, abd]) = [abc, abd] but it may be that unstable([abc, abd]) = [abd, abc]

view this post on Zulip Richard Feldman (Dec 11 2023 at 03:42):

ahh fair point! :thumbs_up:

view this post on Zulip Kevin Gillette (Dec 11 2023 at 15:48):

Why not just call it List.search or similar. If we want to be true to the name, and have purely predictable performance characteristics, we'd perhaps lock ourselves out of opportunities to lower the average runtime (such as galloping within an already-fetched cache-line if the algorithm estimates that it's close to the answer). Do we want to promise we'll find the answer in the same number of steps as a binary search would have (for better or worse), when we have the opportunity for a more efficient algorithm?

view this post on Zulip Kevin Gillette (Dec 11 2023 at 15:56):

Also, can this benefit from being configured with an object of tags, such as to set search index bounds, and to match for a literal value vs a set of values vs taking a function (vs taking a function which takes the index and value).

The characteristics of the algorithm/dataset (such as sparse vs dense, uniform vs non-uniform) could also optionally be specified if there's any opportunity for hinting we can take advantage of.

view this post on Zulip Brendan Hansknecht (Dec 11 2023 at 16:05):

A big advantage of naming and using a form of binary search sit at users will understand expectations about setup. With a generic search function, I think it would be less clear that a list needs to be in a certain form for it to work correctly. All fast search algorithms require the list to be in a specific order.

view this post on Zulip Kevin Gillette (Dec 11 2023 at 19:19):

That's true

view this post on Zulip Kevin Gillette (Dec 11 2023 at 19:42):

though if we use tags, we could achieve a similar effect: AssumeOrdered and AssumeUnique.

Some binary search implementations continue until they find the first element that matches a "greater or equal" predicate, though an application may only require the first found element which yields Eq or equivalent. It's not ergonomic to have that flexibility when there are multiple bespoke functions, but it could make sense with a single customizable function.


Last updated: Jun 16 2026 at 16:19 UTC