Stream: beginners

Topic: Nonempty list


view this post on Zulip Martin Stewart (Jul 18 2022 at 08:10):

In Elm, if I were to define a nonempty list I'd probably write this type NonemptyList a = NonemptyList a (List a). This way it's impossible to accidentally create one that is empty.

In Roc I could define it the same way but it seems like it would come at a significant performance cost? For example,

type NonemptyList a = NonemptyList a (List a)

toList : NonemptyList a -> List a
toList \(NonemptyList first rest) ->
    -- Adding an item to the start of the list is slow
    List.cons first rest

reverse : NonemptyList a -> NonemptyList a
reverse \(NonemptyList first rest) ->
    -- Adding an item to the start of the list is slow
    tempList = List.cons first rest |> List.reverse
    when List.get 0 tempList is
        Ok value ->
            -- Removing one item from the end of the list is slow
            NonemptyList value (List.drop 1 tempList)
        Err _ ->
            -- This should never happen but I don't know how to prove it since we don't have list destructuring
            (NonemptyList first rest)

My question is, would it make more sense to just implement nonempty list unsafely, i.e. type NonemptyList a = NonemptyList (List a) and then write tests to try catching any potential mistakes?

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 08:44):

Ooh! Interesting idea!

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 08:49):

It might be possible to express NonemptyList just as "an array which is guaranteed to not be empty", which would allow equivalents of the List functions that otherwise have to return a Result to be total (e.g. first, last) to return the success value always.

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 08:51):

Another thing I was thinking about to try more generally to port linked list-based algorithms to Roc, is to store the elements in reverse inside the underlying array: Appending is fast as reallocation only happens whenever the list size doubles, whereas prepending always requires moving all elements around.

view this post on Zulip Martin Stewart (Jul 18 2022 at 08:57):

Oh, I didn't realize that appending is fast with Roc arrays. In that case I could define my nonempty list as type NonemptyList a = NonemptyList (List a) a

view this post on Zulip Folkert de Vries (Jul 18 2022 at 09:05):

yes

view this post on Zulip Martin Stewart (Jul 18 2022 at 09:58):

In that case, why can't Roc have list pattern matching like Elm has, only you'd be taking items off the end of the list instead of the start?

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 10:20):

It definitely is possible to add pattern matching syntax for lists. It would be syntactic sugar over a combination of List.sublist + List.get where intermediate bounds checks could be removed in many cases.

There was some talk about this in the past, but no consensus yet. (This is a 'nice to have'; other things have higher priority right now.)
One problem with introducing too liberal pattern matching options, is that people will start using them without considering the potential performance characteristics.
Roc strives for "abstraction without sacrificing performance". In certain areas this is a tricky balancing act.

view this post on Zulip Martin Stewart (Jul 18 2022 at 10:23):

Yeah fair enough that it's a low implementation priority. But what do you mean by "One problem with introducing too liberal pattern matching options, is that people will start using them without considering the potential performance characteristics."? You wrote that list pattern matching would remove a bounds check so wouldn't this be faster?

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 10:23):

For instance, consider this Rust array-based pattern match example. It is a very elegant-looking implementation of a palindrome checker. However, it is only fast because Rust has a separate abstraction for 'slices' (a cheap view of a subpart of an array using just a pointer + length).

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 10:24):

Without slices, it would require a lot of copying of the underlying array.

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 10:27):

So it is more about what you do with the result of matching in the pattern, than about the pattern matching itself.

view this post on Zulip Qqwy / Marten (Jul 18 2022 at 10:28):

Does that make sense? :blush:

view this post on Zulip Martin Stewart (Jul 18 2022 at 10:31):

Yeah, that makes sense. I think I'd still prefer to have list pattern matching even if it introduces a performance foot gun, since it only took me ~20 lines of code to end up with an unreachable pattern that I couldn't remove :sweat_smile:

view this post on Zulip Martin Stewart (Jul 18 2022 at 10:32):

Though to be clear, I'm not suggesting the user should be able to destruct a single item from the start of a list, only from the end.

view this post on Zulip Martin Stewart (Jul 18 2022 at 13:37):

Never mind, it occurred to me that my nonempty list example isn’t a good motivating use case for list pattern matching since this would be something where I’d want it to run as fast as possible and therefore I wouldn’t be able to use the list pattern match anyway

view this post on Zulip Folkert de Vries (Jul 18 2022 at 13:39):

yes I wanted to ask: can't you (easily?) express the logic in terms of map/keepIf/walk?

view this post on Zulip Brendan Hansknecht (Jul 18 2022 at 15:01):

Also, we do plan to add slices eventually, which would fix the performance foot gun (at least to some extent)

view this post on Zulip Martin Stewart (Jul 18 2022 at 15:45):

Folkert de Vries said:

yes I wanted to ask: can't you (easily?) express the logic in terms of map/keepIf/walk?

True, folding over the list would work too. I used List.reverse because I figured it would be super optimised and maybe doing a fold (I'm guessing List.foldl in Elm is List.walk in Roc?) wouldn't be as efficient

view this post on Zulip Brendan Hansknecht (Jul 18 2022 at 16:49):

Hmm. That's an interesting note on API.

Reverse is actually quite costly because it does the actual work of reversing the list and doesn't just recreate a reverse iterator (or similar).

Looping over the list with fold or map or etc, would likely be much more efficient.

view this post on Zulip Brendan Hansknecht (Jul 18 2022 at 16:49):

Also, if we don't already have walkReverse we should definitely add it.

view this post on Zulip Folkert de Vries (Jul 18 2022 at 16:49):

note here is the definition of reverse

reverse : List a -> List a
reverse = \list ->
    reverseHelp list 0 (Num.subSaturated (List.len list) 1)

reverseHelp = \list, left, right ->
    if left < right then
        reverseHelp (List.swap list left right) (left + 1) (right - 1)
    else
        list

view this post on Zulip Folkert de Vries (Jul 18 2022 at 16:50):

and walkBackwards is a thing

view this post on Zulip Martin Stewart (Jul 19 2022 at 10:57):

I got Roc running on my Macbook and decided to try writing NonemptyList for real. Here's the result:

NonemptyList

I couldn't figure out how to add type variables so for now NonemptyList can only contain Str. Also the reverse function probably isn't optimal. It does a lot of unneeded checks.

view this post on Zulip Qqwy / Marten (Jul 19 2022 at 11:40):

To add type variables you can write the definition as NonemptyList a := { rest: (List a), lastItem: a }

view this post on Zulip Qqwy / Marten (Jul 19 2022 at 11:47):

Some other tips:

view this post on Zulip Qqwy / Marten (Jul 19 2022 at 11:50):

And if you want it to be even more usable, implement iterate. Following that, you can implement walk, walkUntil, find, and many others on top of that. See the source for List itself for more inspiration. Only the functions which have a type signature without an implementation are implemented as "native builtins", the others are built on top of those inside Roc itself.

view this post on Zulip Martin Stewart (Jul 19 2022 at 11:53):

Qqwy / Marten said:

To add type variables you can write the definition as NonemptyList a := { rest: (List a), lastItem: a }

That's what I tried but I got an error message that I didn't understand

view this post on Zulip Anton (Jul 19 2022 at 11:54):

You're welcome to share the error message :)

view this post on Zulip Martin Stewart (Jul 19 2022 at 11:57):

Here's the code with type vars

Code

and here's the error message

./roc examples/hello-world/zig-platform/helloZig.roc
thread '<unnamed>' panicked at 'index out of bounds: the len is 0 but the index is 0', compiler/mono/src/ir.rs:9147:35
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

view this post on Zulip Martin Stewart (Jul 19 2022 at 11:58):

With RUST_BACKTRACE=1

RUST_BACKTRACE=1 ./roc examples/hello-world/zig-platform/helloZig.roc
thread '<unnamed>' panicked at 'index out of bounds: the len is 0 but the index is 0', compiler/mono/src/ir.rs:9147:35
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic_bounds_check
   3: roc_mono::ir::match_on_lambda_set
   4: roc_mono::ir::with_hole
   5: roc_mono::ir::from_can
   6: roc_mono::ir::specialize_variable_help
   7: roc_mono::ir::specialize_external_help
   8: roc_mono::ir::specialize_all
   9: roc_load_internal::file::run_task
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

view this post on Zulip Anton (Jul 19 2022 at 11:59):

Can you make an issue for that? The compiler should never panic.
If you have the time it would be great if you could minimize the code that reproduces the issue.

view this post on Zulip Martin Stewart (Jul 19 2022 at 12:09):

https://github.com/rtfeldman/roc/issues/3584 I figured out the mistake too. I did a find+replace on Str -> a but that messed up Str.joinWith so it became a.joinWith.

view this post on Zulip Anton (Jul 19 2022 at 12:12):

Excellent, thanks for making the issue!

view this post on Zulip Martin Stewart (Jul 19 2022 at 15:00):

I've implemented nonempty versions of all the exposed List functions https://github.com/MartinSStewart/Nonempty.

One exception is List.range which currently returns [] if you call it with something like this: List.range 3 1. I was wondering if it would make more sense to have it instead return [ 3, 2, 1 ]? The reason is that then NonemptyList.range could be guaranteed to always return a nonempty list.

view this post on Zulip jan kili (Jul 19 2022 at 15:09):

@Martin Stewart We discussed List.range boundaries in another stream earlier this month: https://roc.zulipchat.com/#narrow/stream/231635-compiler-development/topic/List.2Erange.20boundaries/near/288637840

view this post on Zulip Martin Stewart (Jul 19 2022 at 15:09):

That link seems to be broken (or the conversation is private)?

view this post on Zulip jan kili (Jul 19 2022 at 15:10):

Yes that stream is private, and we did not reach a final conclusion. Idk who admins that "compiler development" stream, but I think access is granted automatically when you contribute to the compiler.

view this post on Zulip jan kili (Jul 19 2022 at 15:11):

The topic was a spin-off of another compiler-related discussion, but it probably should be moved out into public - #ideas maybe?

view this post on Zulip Martin Stewart (Jul 19 2022 at 15:11):

I'm guessing my one github issue doesn't count? :sweat_smile:

view this post on Zulip jan kili (Jul 19 2022 at 15:11):

Haha I guess not

view this post on Zulip jan kili (Jul 19 2022 at 15:12):

When my first compiler PR was merged I was invited

view this post on Zulip jan kili (Jul 19 2022 at 15:12):

The smoky back room where the llvm sausage gets made

view this post on Zulip jan kili (Jul 19 2022 at 15:12):

I don't understand half the words in there but it's fun to watch

view this post on Zulip Martin Stewart (Jul 19 2022 at 15:13):

Don't promote it too much or there's going to be hacktoberfest-esque PR spam from people in order to get in :stuck_out_tongue:

view this post on Zulip jan kili (Jul 19 2022 at 15:13):

Can someone with Zulip powers please move that "List.range boundaries" topic into the appropriate public stream?

view this post on Zulip Anton (Jul 19 2022 at 15:15):

I moved it to ideas

view this post on Zulip jan kili (Jul 19 2022 at 15:15):

@Anton you rock. https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/List.2Erange.20boundaries/near/290113775

view this post on Zulip jan kili (Jul 19 2022 at 16:32):

Qqwy / Marten said:

Unfortunately nothing prevents people from passing in ranges which are outside of the srange of the list, say List.range [1,2,3] 20 100.

They meant List.sublist.
This is a good point, @Martin Stewart, will you return a Result from functions like List.sublist?

view this post on Zulip Martin Stewart (Jul 19 2022 at 16:34):

sublist : NonemptyList a, { start: Nat, len: Nat } -> List a

In general, any List function that returns a sub list of the original list was converted to NonemptyList a -> List a

view this post on Zulip Richard Feldman (Jul 19 2022 at 22:29):

I realized if list pattern matching existed, I would have used it in the demo I'm giving tomorrow: https://gist.github.com/rtfeldman/8c83efd5e9e091254e3f842f2be217b7


Last updated: Jul 06 2025 at 12:14 UTC