Nonempty list · beginners · Zulip Chat Archive

Martin Stewart (Jul 18 2022 at 08:10):

In Elm, if I were to define a nonempty list I'd probably write this: type NonemptyList a = NonemptyList a (List a). This way it's impossible to accidentally create one that is empty.

In Roc I could define it the same way but it seems like it would come at a significant performance cost? For example,

type NonemptyList a = NonemptyList a (List a)

toList : NonemptyList a -> List a
toList = \(NonemptyList first rest) ->
    -- Adding an item to the start of the list is slow
    List.cons first rest

reverse : NonemptyList a -> NonemptyList a
reverse = \(NonemptyList first rest) ->
    -- Adding an item to the start of the list is slow
    tempList = List.cons first rest |> List.reverse
    when List.get 0 tempList is
        Ok value ->
            -- Removing one item from the start of the list is slow
            NonemptyList value (List.drop 1 tempList)
        Err _ ->
            -- This should never happen but I don't know how to prove it since we don't have list destructuring
            (NonemptyList first rest)

My question is, would it make more sense to just implement nonempty list unsafely, i.e. type NonemptyList a = NonemptyList (List a), and then write tests to try to catch any potential mistakes?

Qqwy / Marten (Jul 18 2022 at 08:44):

Qqwy / Marten (Jul 18 2022 at 08:49):

It might be possible to express NonemptyList just as "an array which is guaranteed to not be empty". That would allow the equivalents of List functions that otherwise have to return a Result (e.g. first, last) to be total and always return the success value.
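
A minimal sketch of that idea, written in Rust rather than Roc. The type name NonEmptyVec and its methods are hypothetical, chosen only for illustration. The field is private, so the constructors are the only way to build a value, and that is what lets first and last skip the Option/Result wrapper.

```rust
/// A vector that is guaranteed to hold at least one element.
/// The inner Vec is private, so callers can only build one via the constructors.
#[derive(Debug, Clone, PartialEq)]
pub struct NonEmptyVec<T>(Vec<T>);

impl<T> NonEmptyVec<T> {
    /// Returns None only when the input is empty.
    pub fn from_vec(items: Vec<T>) -> Option<Self> {
        if items.is_empty() {
            None
        } else {
            Some(NonEmptyVec(items))
        }
    }

    /// Always succeeds: a single item is already non-empty.
    pub fn singleton(item: T) -> Self {
        NonEmptyVec(vec![item])
    }

    /// Total: the invariant rules out the empty case, so no Option is needed.
    pub fn first(&self) -> &T {
        &self.0[0]
    }

    /// Total for the same reason as `first`.
    pub fn last(&self) -> &T {
        &self.0[self.0.len() - 1]
    }

    /// Growing the vector can never break the invariant.
    pub fn push(&mut self, item: T) {
        self.0.push(item);
    }
}
```

The trade-off is the one raised in the question: the compiler does not prove the invariant, so every function inside the module has to be written (and tested) to preserve it.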

Qqwy / Marten (Jul 18 2022 at 08:51):

Another thing I was thinking about trying, as a more general way to port linked-list-based algorithms to Roc, is to store the elements in reverse inside the underlying array. Appending is fast, as reallocation only happens whenever the list size doubles, whereas prepending always requires moving all elements around.
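
The reversed-storage trick can be sketched as follows, in Rust rather than Roc. ReversedList, cons and uncons are hypothetical names. The logical head of the list lives at the end of the backing vector, so adding or removing the head is a push or pop instead of a shift of every element.

```rust
/// A "cons list" facade over a Vec. The logical head is stored at the END
/// of the vector, so cons/uncons map to push/pop (amortized O(1)).
#[derive(Debug)]
pub struct ReversedList<T> {
    items: Vec<T>, // stored back to front
}

impl<T> ReversedList<T> {
    pub fn new() -> Self {
        ReversedList { items: Vec::new() }
    }

    /// Adds an item at the logical front.
    pub fn cons(&mut self, item: T) {
        self.items.push(item);
    }

    /// Removes the item at the logical front, if any.
    pub fn uncons(&mut self) -> Option<T> {
        self.items.pop()
    }

    /// Iterates in logical order (head first) by walking the Vec backwards.
    pub fn iter(&self) -> impl Iterator<Item = &T> + '_ {
        self.items.iter().rev()
    }
}
```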

Martin Stewart (Jul 18 2022 at 08:57):

Oh, I didn't realize that appending is fast with Roc arrays. In that case I could define my nonempty list as type NonemptyList a = NonemptyList (List a) a
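
As a rough illustration of this "rest plus last item" layout, here is a Rust sketch (not Roc; the names NonEmptyList, from_parts and into_vec are made up for the example). Because the guaranteed item sits at the end, converting to a plain list is a single append.

```rust
/// Non-empty by construction: `last` always exists, `rest` may be empty.
#[derive(Debug, Clone, PartialEq)]
pub struct NonEmptyList<T> {
    rest: Vec<T>, // every item before the last one
    last: T,
}

impl<T> NonEmptyList<T> {
    pub fn from_parts(rest: Vec<T>, last: T) -> Self {
        NonEmptyList { rest, last }
    }

    pub fn len(&self) -> usize {
        self.rest.len() + 1
    }

    /// Appending the last item is cheap compared to prepending a first item.
    pub fn into_vec(self) -> Vec<T> {
        let mut items = self.rest;
        items.push(self.last);
        items
    }

    /// Indices 0..rest.len() map straight into `rest` with no offset;
    /// index rest.len() is the last item.
    pub fn get(&self, index: usize) -> Option<&T> {
        if index == self.rest.len() {
            Some(&self.last)
        } else {
            self.rest.get(index)
        }
    }
}
```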

Folkert de Vries (Jul 18 2022 at 09:05):

Martin Stewart (Jul 18 2022 at 09:58):

In that case, why can't Roc have list pattern matching like Elm has, only you'd be taking items off the end of the list instead of the start?

Qqwy / Marten (Jul 18 2022 at 10:20):

It definitely is possible to add pattern matching syntax for lists. It would be syntactic sugar over a combination of List.sublist + List.get where intermediate bounds checks could be removed in many cases.

There was some talk about this in the past, but no consensus yet. (This is a 'nice to have'; other things have higher priority right now.)
One problem with introducing too liberal pattern matching options, is that people will start using them without considering the potential performance characteristics.
Roc strives for "abstraction without sacrificing performance". In certain areas this is a tricky balancing act.
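
To make the "syntactic sugar" point concrete, here is how the two forms compare in Rust, which already has slice patterns. This only illustrates the idea and is not a proposal for Roc syntax; both function names are hypothetical.

```rust
/// Manual version: an explicit length check, then indexing plus a sub-slice.
pub fn split_last_manual(items: &[i32]) -> Option<(&[i32], i32)> {
    if items.is_empty() {
        None
    } else {
        let last_index = items.len() - 1;
        Some((&items[..last_index], items[last_index]))
    }
}

/// Pattern version: the match expresses the same split, taking the item
/// off the end of the slice.
pub fn split_last_pattern(items: &[i32]) -> Option<(&[i32], i32)> {
    match items {
        [rest @ .., last] => Some((rest, *last)),
        [] => None,
    }
}
```

In Rust the `rest` binding is a borrowed view, so nothing is copied. Whether an equivalent Roc pattern would be cheap depends on how the remaining elements are represented, which is the concern raised above.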

Martin Stewart (Jul 18 2022 at 10:23):

Yeah fair enough that it's a low implementation priority. But what do you mean by "One problem with introducing too liberal pattern matching options, is that people will start using them without considering the potential performance characteristics."? You wrote that list pattern matching would remove a bounds check so wouldn't this be faster?

Qqwy / Marten (Jul 18 2022 at 10:23):

For instance, consider this Rust array-based pattern match example. It is a very elegant-looking implementation of a palindrome checker. However, it is only fast because Rust has a separate abstraction for 'slices' (a cheap view of a subpart of an array using just a pointer + length).
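
The linked example is not preserved in this archive. The following Rust sketch is a reconstruction in the same spirit, not the original code: a recursive palindrome check that peels one item off each end with a slice pattern.

```rust
/// Returns true when the slice reads the same forwards and backwards.
/// `middle` is a borrowed sub-slice (pointer + length), so each recursive
/// step is cheap and copies nothing.
pub fn is_palindrome<T: PartialEq>(items: &[T]) -> bool {
    match items {
        [first, middle @ .., last] => first == last && is_palindrome(middle),
        // Zero or one item: trivially a palindrome.
        _ => true,
    }
}
```

If `middle` had to be copied into a fresh list at every step, the same elegant-looking code would do quadratic work.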

Qqwy / Marten (Jul 18 2022 at 10:24):

Qqwy / Marten (Jul 18 2022 at 10:27):

So it is more about what you do with the result of matching in the pattern, than about the pattern matching itself.

Qqwy / Marten (Jul 18 2022 at 10:28):

Martin Stewart (Jul 18 2022 at 10:31):

Yeah, that makes sense. I think I'd still prefer to have list pattern matching even if it introduces a performance foot gun, since it only took me ~20 lines of code to end up with an unreachable pattern that I couldn't remove :sweat_smile:

Martin Stewart (Jul 18 2022 at 10:32):

Though to be clear, I'm not suggesting the user should be able to destruct a single item from the start of a list, only from the end.

Martin Stewart (Jul 18 2022 at 13:37):

Never mind, it occurred to me that my nonempty list example isn’t a good motivating use case for list pattern matching, since this is something I’d want to run as fast as possible and therefore I wouldn’t be able to use the list pattern match anyway.

Folkert de Vries (Jul 18 2022 at 13:39):

yes I wanted to ask: can't you (easily?) express the logic in terms of map/keepIf/walk?

Brendan Hansknecht (Jul 18 2022 at 15:01):

Also, we do plan to add slices eventually, which would fix the performance foot gun (at least to some extent)

Martin Stewart (Jul 18 2022 at 15:45):

True, folding over the list would work too. I used List.reverse because I figured it would be super optimised and maybe doing a fold (I'm guessing List.foldl in Elm is List.walk in Roc?) wouldn't be as efficient

Brendan Hansknecht (Jul 18 2022 at 16:49):

Reverse is actually quite costly because it does the actual work of reversing the list and doesn't just recreate a reverse iterator (or similar).

Looping over the list with fold, map, etc. would likely be much more efficient.
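
The difference can be illustrated in Rust (an analogy only; Roc's List.reverse and List.walk may have different costs). The first function builds a reversed copy before using it, while the second folds over the items back to front and never creates a reversed list. Both function names are hypothetical.

```rust
/// Eager: materializes a reversed copy, then joins it.
pub fn join_reversed_eager(items: &[&str]) -> String {
    let mut reversed = items.to_vec();
    reversed.reverse();
    reversed.join(", ")
}

/// Fold: walks the slice back to front and builds the result directly.
pub fn join_reversed_fold(items: &[&str]) -> String {
    items
        .iter()
        .rev()
        .enumerate()
        .fold(String::new(), |mut acc, (index, item)| {
            // Put a separator before every item except the first one emitted.
            if index > 0 {
                acc.push_str(", ");
            }
            acc.push_str(item);
            acc
        })
}
```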

Brendan Hansknecht (Jul 18 2022 at 16:49):

Folkert de Vries (Jul 18 2022 at 16:49):

Folkert de Vries (Jul 18 2022 at 16:50):

Martin Stewart (Jul 19 2022 at 10:57):

I got Roc running on my Macbook and decided to try writing NonemptyList for real. Here's the result:

NonemptyList

app "helloZig"
    packages { pf: "main.roc" }
    imports [ Result ]
    provides [main] to pf


NonemptyList :=
    { rest: (List Str), lastItem: Str }


toList : NonemptyList -> List Str
toList = \@NonemptyList { rest, lastItem } ->
    List.append rest lastItem


fromList : List Str -> Result NonemptyList [ OutOfBounds ]*
fromList = \list ->
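    # Num.subSaturated avoids underflow when the list is empty (len 0);
    # List.get then returns Err OutOfBounds for index 0.
    when List.get list (Num.subSaturated (List.len list) 1) is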
        Ok value ->
            @NonemptyList { rest: List.dropLast list, lastItem: value } |> Ok

        Err error ->
            Err error


fromSingle : Str -> NonemptyList
fromSingle = \item ->
    @NonemptyList { rest: [], lastItem: item }


from : List Str, Str -> NonemptyList
from = \rest, lastItem ->
    @NonemptyList { rest: rest, lastItem: lastItem }


len : NonemptyList -> Nat
len = \@NonemptyList { rest } ->
    List.len rest + 1


get : NonemptyList, Nat -> Result Str [ OutOfBounds ]*
get = \nonempty, index ->
    (@NonemptyList { rest, lastItem }) = nonempty
    if index == len nonempty - 1 then
        Ok lastItem
    else
        # rest holds the items at indices 0..len-2, so no offset is needed
        List.get rest index


set : NonemptyList, Nat, Str -> NonemptyList
set = \nonempty, index, value ->
    (@NonemptyList { rest, lastItem }) = nonempty
    if index == len nonempty - 1 then
        @NonemptyList { rest: rest, lastItem: value }
    else
        # rest holds the items at indices 0..len-2, so no offset is needed
        { rest: List.set rest index value, lastItem: lastItem }
            |> @NonemptyList


first : NonemptyList -> Str
first = \(@NonemptyList { rest, lastItem }) ->
    List.first rest |> Result.withDefault lastItem


last : NonemptyList -> Str
last = \(@NonemptyList { lastItem }) ->
    lastItem


dropLast : NonemptyList -> List Str
dropLast = \(@NonemptyList { rest }) ->
    rest


swap : NonemptyList, Nat, Nat -> NonemptyList
swap = \nonempty, left, right ->
    (@NonemptyList { rest, lastItem }) = nonempty
    lastIndex = len nonempty - 1
    if left != lastIndex && right != lastIndex then
        @NonemptyList { rest: List.swap rest left right, lastItem: lastItem }

    else if left != lastIndex then
        when List.get rest left is
            Ok leftValue ->
                @NonemptyList { rest: List.set rest left lastItem, lastItem: leftValue }

            Err _ ->
                nonempty

    else if right != lastIndex then
        when List.get rest right is
            Ok rightValue ->
                @NonemptyList { rest: List.set rest right lastItem, lastItem: rightValue }

            Err _ ->
                nonempty

    else
        nonempty


reverse : NonemptyList -> NonemptyList
reverse = \nonempty ->
    reverseHelp nonempty 0 (Num.subSaturated (len nonempty) 1)


reverseHelp : NonemptyList, Nat, Nat -> NonemptyList
reverseHelp = \nonempty, left, right ->
    if left < right then
        reverseHelp (swap nonempty left right) (left + 1) (right - 1)
    else
        nonempty


main =
    from [ "D", "C", "B" ] "A" |> reverse |> toList |> Str.joinWith ", "

I couldn't figure out how to add type variables so for now NonemptyList can only contain Str. Also the reverse function probably isn't optimal. It does a lot of unneeded checks.
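
One way to avoid the per-swap checks, sketched in Rust rather than Roc and under the same "rest plus last item" layout (the struct and field names are hypothetical): swap the first item of rest with the last item once, then reverse the remainder of rest in place.

```rust
pub struct NonEmptyList<T> {
    pub rest: Vec<T>, // every item before the last one
    pub last: T,
}

impl<T> NonEmptyList<T> {
    /// Reverses the logical sequence `rest ++ [last]` in place.
    /// [r0, r1, .., rn | L] becomes [L, rn, .., r1 | r0].
    pub fn reverse(&mut self) {
        // With an empty `rest` there is a single item and nothing to do.
        if let Some((first, tail)) = self.rest.split_first_mut() {
            // The old last item becomes the new first item, and vice versa.
            std::mem::swap(first, &mut self.last);
            // Everything in between just flips order.
            tail.reverse();
        }
    }
}
```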

Qqwy / Marten (Jul 19 2022 at 11:40):

To add type variables you can write the definition as NonemptyList a := { rest: (List a), lastItem: a }

Qqwy / Marten (Jul 19 2022 at 11:47):

Qqwy / Marten (Jul 19 2022 at 11:50):

And if you want it to be even more usable, implement iterate. Following that, you can implement walk, walkUntil, find, and many others on top of that. See the source for List itself for more inspiration. Only the functions which have a type signature without an implementation are implemented as "native builtins", the others are built on top of those inside Roc itself.
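
The layering described here (one early-exit traversal, with the other functions built on top) might look like this in Rust. This is a sketch under assumptions: the names mirror Roc's walkUntil, walk and find, but the signatures are invented for the example and Rust's std::ops::ControlFlow stands in for Roc's Continue/Break tags.

```rust
use std::iter::once;
use std::ops::ControlFlow;

pub struct NonEmptyList<T> {
    pub rest: Vec<T>,
    pub last: T,
}

impl<T> NonEmptyList<T> {
    /// The single primitive: visit items in order; the callback may stop early.
    pub fn walk_until<'a, S>(
        &'a self,
        init: S,
        mut step: impl FnMut(S, &'a T) -> ControlFlow<S, S>,
    ) -> S {
        let mut state = init;
        for item in self.rest.iter().chain(once(&self.last)) {
            match step(state, item) {
                ControlFlow::Continue(next) => state = next,
                ControlFlow::Break(done) => return done,
            }
        }
        state
    }

    /// A plain fold is walk_until that never breaks.
    pub fn walk<S>(&self, init: S, mut step: impl FnMut(S, &T) -> S) -> S {
        self.walk_until(init, |state, item| ControlFlow::Continue(step(state, item)))
    }

    /// find is walk_until that breaks on the first match.
    pub fn find(&self, mut pred: impl FnMut(&T) -> bool) -> Option<&T> {
        self.walk_until(None, |_, item| {
            if pred(item) {
                ControlFlow::Break(Some(item))
            } else {
                ControlFlow::Continue(None)
            }
        })
    }
}
```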

Martin Stewart (Jul 19 2022 at 11:53):

Anton (Jul 19 2022 at 11:54):

Martin Stewart (Jul 19 2022 at 11:57):

Code

app "helloZig"
    packages { pf: "main.roc" }
    imports [ Result ]
    provides [main] to pf


NonemptyList a :=
    { rest: (List a), lastItem: a }


toList : NonemptyList a -> List a
toList = \@NonemptyList { rest, lastItem } ->
    List.append rest lastItem


fromList : List a -> Result (NonemptyList a) [ OutOfBounds ]*
fromList = \list ->
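    # Num.subSaturated avoids underflow when the list is empty (len 0);
    # List.get then returns Err OutOfBounds for index 0.
    when List.get list (Num.subSaturated (List.len list) 1) is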
        Ok value ->
            @NonemptyList { rest: List.dropLast list, lastItem: value } |> Ok

        Err error ->
            Err error


fromSingle : a -> NonemptyList a
fromSingle = \item ->
    @NonemptyList { rest: [], lastItem: item }


from : List a, a -> NonemptyList a
from = \rest, lastItem ->
    @NonemptyList { rest: rest, lastItem: lastItem }


len : NonemptyList a -> Nat
len = \@NonemptyList { rest } ->
    List.len rest + 1


get : NonemptyList a, Nat -> Result a [ OutOfBounds ]*
get = \nonempty, index ->
    (@NonemptyList { rest, lastItem }) = nonempty
    if index == len nonempty - 1 then
        Ok lastItem
    else
        # rest holds the items at indices 0..len-2, so no offset is needed
        List.get rest index


set : NonemptyList a, Nat, a -> NonemptyList a
set = \nonempty, index, value ->
    (@NonemptyList { rest, lastItem }) = nonempty
    if index == len nonempty - 1 then
        @NonemptyList { rest: rest, lastItem: value }
    else
        # rest holds the items at indices 0..len-2, so no offset is needed
        { rest: List.set rest index value, lastItem: lastItem }
            |> @NonemptyList


first : NonemptyList a -> a
first = \(@NonemptyList { rest, lastItem }) ->
    List.first rest |> Result.withDefault lastItem


last : NonemptyList a -> a
last = \(@NonemptyList { lastItem }) ->
    lastItem


dropLast : NonemptyList a -> List a
dropLast = \(@NonemptyList { rest }) ->
    rest


swap : NonemptyList a, Nat, Nat -> NonemptyList a
swap = \nonempty, left, right ->
    (@NonemptyList { rest, lastItem }) = nonempty
    lastIndex = len nonempty - 1
    if left != lastIndex && right != lastIndex then
        @NonemptyList { rest: List.swap rest left right, lastItem: lastItem }

    else if left != lastIndex then
        when List.get rest left is
            Ok leftValue ->
                @NonemptyList { rest: List.set rest left lastItem, lastItem: leftValue }

            Err _ ->
                nonempty

    else if right != lastIndex then
        when List.get rest right is
            Ok rightValue ->
                @NonemptyList { rest: List.set rest right lastItem, lastItem: rightValue }

            Err _ ->
                nonempty

    else
        nonempty


reverse : NonemptyList a -> NonemptyList a
reverse = \nonempty ->
    reverseHelp nonempty 0 (Num.subSaturated (len nonempty) 1)


reverseHelp : NonemptyList a, Nat, Nat -> NonemptyList a
reverseHelp = \nonempty, left, right ->
    if left < right then
        reverseHelp (swap nonempty left right) (left + 1) (right - 1)
    else
        nonempty


main =
    from [ "D", "C", "B" ] "A" |> reverse |> toList |> Str.joinWith ", "

Martin Stewart (Jul 19 2022 at 11:58):

RUST_BACKTRACE=1 ./roc examples/hello-world/zig-platform/helloZig.roc
thread '<unnamed>' panicked at 'index out of bounds: the len is 0 but the index is 0', compiler/mono/src/ir.rs:9147:35
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic_bounds_check
   3: roc_mono::ir::match_on_lambda_set
   4: roc_mono::ir::with_hole
   5: roc_mono::ir::from_can
   6: roc_mono::ir::specialize_variable_help
   7: roc_mono::ir::specialize_external_help
   8: roc_mono::ir::specialize_all
   9: roc_load_internal::file::run_task
  10: core::ops::function::FnOnce::call_once{{vtable.shim}}
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Anton (Jul 19 2022 at 11:59):

Can you make an issue for that? The compiler should never panic.
If you have the time it would be great if you could minimize the code that reproduces the issue.

Martin Stewart (Jul 19 2022 at 12:09):

Anton (Jul 19 2022 at 12:12):

Martin Stewart (Jul 19 2022 at 15:00):

One exception is List.range which currently returns [] if you call it with something like this: List.range 3 1. I was wondering if it would make more sense to have it instead return [ 3, 2, 1 ]? The reason is that then NonemptyList.range could be guaranteed to always return a nonempty list.
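
The proposed behaviour can be sketched like this in Rust. The function name is hypothetical, and treating both bounds as inclusive is an assumption made for the example, not a statement about how Roc's List.range works.

```rust
/// Inclusive range that counts down when `from > to`.
/// The result always contains at least `from`, so it is never empty.
pub fn range_inclusive(from: i64, to: i64) -> Vec<i64> {
    if from <= to {
        (from..=to).collect()
    } else {
        (to..=from).rev().collect()
    }
}
```

Since every call yields at least one item, a NonemptyList version of range would not need to return a Result.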

jan kili (Jul 19 2022 at 15:09):

Martin Stewart (Jul 19 2022 at 15:09):

jan kili (Jul 19 2022 at 15:10):

Yes that stream is private, and we did not reach a final conclusion. Idk who admins that "compiler development" stream, but I think access is granted automatically when you contribute to the compiler.

jan kili (Jul 19 2022 at 15:11):

The topic was a spin-off of another compiler-related discussion, but it probably should be moved out into public - #ideas maybe?

Martin Stewart (Jul 19 2022 at 15:11):

jan kili (Jul 19 2022 at 15:11):

jan kili (Jul 19 2022 at 15:12):

Martin Stewart (Jul 19 2022 at 15:13):

Don't promote it too much or there's going to be hacktoberfest-esque PR spam from people in order to get in :stuck_out_tongue:

jan kili (Jul 19 2022 at 15:13):

Can someone with Zulip powers please move that "List.range boundaries" topic into the appropriate public stream?

Anton (Jul 19 2022 at 15:15):

jan kili (Jul 19 2022 at 15:15):

jan kili (Jul 19 2022 at 16:32):

They meant List.sublist.
This is a good point, @Martin Stewart, will you return a Result from functions like List.sublist?

Martin Stewart (Jul 19 2022 at 16:34):

In general, any List function that returns a sublist of the original list was converted to NonemptyList a -> List a.
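
A small Rust sketch of that convention (hypothetical names and signatures; Roc's List.sublist takes its arguments differently): operations that may leave nothing behind return a plain list instead of wrapping a non-empty list in a Result.

```rust
pub struct NonEmptyList<T> {
    rest: Vec<T>,
    last: T,
}

impl<T: Clone> NonEmptyList<T> {
    pub fn new(rest: Vec<T>, last: T) -> Self {
        NonEmptyList { rest, last }
    }

    /// The requested window may be empty or out of range,
    /// so the result is an ordinary Vec.
    pub fn sublist(&self, start: usize, len: usize) -> Vec<T> {
        self.rest
            .iter()
            .chain(std::iter::once(&self.last))
            .skip(start)
            .take(len)
            .cloned()
            .collect()
    }

    /// Dropping the last item of a one-item list leaves nothing,
    /// so this also returns an ordinary Vec.
    pub fn drop_last(self) -> Vec<T> {
        self.rest
    }
}
```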

Richard Feldman (Jul 19 2022 at 22:29):