Stream: ideas

Topic: non-empty lists


view this post on Zulip Juliano (Aug 28 2022 at 18:38):

Even though Haskell ships with a Data.List.NonEmpty, it's pretty rare for libraries to give you back a NonEmpty for operations that can't return an empty list (using sepBy1 in attoparsec rn made me reflect about this).

Elm doesn't ship with NonEmpty in the standard lib, not sure why. Could be a decision to keep the stdlib as small as possible, could be there's tradeoffs to making NonEmpty pervasive that I'm unaware.

I noticed Roc doesn't have NonEmpty yet, and was curious: do folks have an opinion about making NonEmpty pervasive/normative in a language's ecosystem, and thoughts on how to go about it in Roc?

view this post on Zulip Martin Stewart (Aug 28 2022 at 20:07):

I've started on something like this https://github.com/MartinSStewart/roc-nonempty with the hope that we can have a single package handling common nonempty use cases, rather than ending up with functions just returning (a, List a) or having multiple competing packages.

Maybe it would make more sense to have this part of the standard Roc lib though.

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:14):

so I had an idea around this awhile ago, but I wasn't convinced the stdlib complexity would be worth it

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:15):

to me, one of the natural questions that comes up when you have nonempty lists is: what are the implications for APIs?

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:15):

like for example, let's say I have Dict.fromList - do I also want Dict.fromNonEmptyList?

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:16):

arguably not, because if I already have Dict.fromList, I can always call that passing NonEmpty.toList or something like that

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:16):

but what if I have a function that accepts a List Str and returns a List Str?

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:17):

now if I pass it a NonEmpty list (via NonEmpty.toList), the returned List has lost its "nonemptiness"

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:18):

one possible solution to that is to say "if you're in that situation, just offer 2 versions, one for empty lists and one for nonempty lists"

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:18):

(or in Haskell's case, probably something like Functor f - but I definitely don't want to go down that road for this use case!)

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:19):

one possible solution is to introduce a new stdlib opaque type called (let's say for the sake of argument) Seq elem emptiness

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:19):

and then define List and NonEmptyList as type aliases of it, similar to what we do with Num:

List elem : Seq elem [PotentiallyEmpty]

NonEmptyList elem : Seq elem [NonEmpty]

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:20):

then change the meaning of list literals to be:

["a"] : Seq Str *

[] : Seq * [PotentiallyEmpty]

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:20):

so square bracket literals are no longer a List, but rather a Seq which is compatible with either potentially-empty or guaranteed NonEmpty lists

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:21):

then you can also say, instead of List Str -> List Str, something like Seq Str emptiness -> Seq Str emptiness

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:21):

and now you can give it either lists or nonempty lists, and it returns whatever you gave it

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:23):

in this world, I assume it would be recommended to accept Seq Whatever * instead of List Whatever, because it's more flexible; you can give it either a List or a NonEmptyList

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:24):

also, you could potentially take the same idea further and define Dict and Set to be type aliases of Seq as well

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:24):

such that (for example) a function that accepts a Seq could also accept either a Dict or a Set

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:25):

one downside of this is that as soon as you start down that road, much like with Num, there's now a demand for userspace extensions of Seq

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:26):

e.g. "I made my custom data structure and it doesn't work with Seq; how can I make it Just Work for any functions that accept Seq?" but of course the best you could do would be to offer a toSeq function which converts it into one of the stdlib data structures

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:26):

which would often be less performant

view this post on Zulip Martin Stewart (Aug 28 2022 at 20:26):

I vaguely remember reading about this approach in an earlier version of the roc-for-elm-programmers doc :thinking:

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:27):

like I said at the beginning...I'm not convinced it's worth the complexity, but I'm also not convinced it's definitely a mistake either, so I'm curious what others think on this topic in general :big_smile:

view this post on Zulip Martin Stewart (Aug 28 2022 at 20:30):

I agree, I don't think it's worth the complexity. My idea was to just add in nonempty versions of List, Set, and Dict into the standard lib and then add Nonempty* versions of functions like List Str -> List Str where it makes sense. That feels inelegant but I suspect in practice it will be fine.

view this post on Zulip Richard Feldman (Aug 28 2022 at 20:31):

one thing that it would make pretty different is that it's start calling stuff like Seq.map instead of List.map, but you'd still have List.get and not Seq.get because the types would be different (depending on whether it was nonempty)

view this post on Zulip Juliano (Aug 28 2022 at 20:34):

Richard Feldman said:

e.g. "I made my custom data structure and it doesn't work with Seq; how can I make it Just Work for any functions that accept Seq?" but of course the best you could do would be to offer a toSeq function which converts it into one of the stdlib data structures

What's an example where people would create a data structure that wouldn't work with Seq? I love the idea of Seq elem [NonEmpty], and I think I'm not seeing the implications :S

view this post on Zulip Qqwy / Marten (Aug 28 2022 at 20:38):

I think what Richard means is that there are many other sequential collections one can think of besides doubling-in-size vectors and nonempty-doubling-in-size vectors. Like for instance linked lists, RRB vectors, finger trees, numerical ranges, etc.

view this post on Zulip Qqwy / Marten (Aug 28 2022 at 20:39):

An alternative might be to define non-higher-kindred traits (side-stepping Functor a) by taking inspiration from, for instance,
https://docs.rs/cc-traits/latest/cc_traits/

view this post on Zulip Qqwy / Marten (Aug 28 2022 at 20:42):

It does depend on associated types, but only having those does not make it possible to implement the Monad stack.

view this post on Zulip Richard Feldman (Aug 29 2022 at 00:47):

yeah but with associated types we're talking about a whole new language feature in what I consider the riskiest category of new language features from the standpoint of overall ecosystem complexity (namely, making abilities more powerful), so I'd consider the bar for introducing something like that to be very high :big_smile:

view this post on Zulip Richard Feldman (Aug 29 2022 at 02:00):

although I suppose it could be something like:

Seq.custom : structure, (state, (state, elem -> state) -> state) -> Seq elem [Custom structure] [PotentiallyEmpty]

Seq.customNonEmpty : structure, elem, (state, (state, elem -> state) -> state) ->  Seq elem [Custom structure] [NonEmpty]

Seq.unwrapCustom : Seq * [Custom structure] * -> structure

example usage:

ConsList.toSeq : ConsList elem -> Seq elem [Custom (ConsList elem)] [PotentiallyEmpty]
ConsList.toSeq = \consList ->
    Seq.custom consList \state, transform ->
        ConsList.walk consList state transform

that way you could still use your custom data structure with functions that work on Seq

view this post on Zulip Richard Feldman (Aug 29 2022 at 02:01):

it wouldn't be as ergonomic as the builtin ones, but that seems like an ok tradeoff given that builtin data structures always have an ergonomics advantage due to having things like literals in the syntax for them etc

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 11:02):

Interesting idea. In a sense, you're building a trait-implementation in the manual way.

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 11:07):

However, it does not compose. If I wrap my fancy RRB vector using Seq.custom I cannot call any RRBVector.* functions on it until calling Seq.unwrapCustom again.

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 11:07):

So it will indeed be very clunky to use

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:29):

thinking about it more, this Seq idea (with Seq.custom) would be most similar to Rust's Iterator

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:29):

like you can iterate over it, but you can't map etc

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:31):

which is interesting because in Rust I've used that a lot, but so far I can't remember any times (in Rust at least) that I really wanted something that was "a generic collection that supports map" for example

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:32):

which is relevant because I don't think that custom design could implement a way to provide a map implementation

view this post on Zulip Folkert de Vries (Aug 29 2022 at 12:33):

you don't need the map on the data structure because you can always use the iterator to map

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 12:34):

Rust's Iterator does use an associated type however.

view this post on Zulip Folkert de Vries (Aug 29 2022 at 12:34):

isn't that an implementation detail? you could have Iterator<T> right?

view this post on Zulip Folkert de Vries (Aug 29 2022 at 12:34):

but that's pretty verbose in rust

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 12:35):

:thinking: I think you're right

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:36):

right, and I explicitly want to avoid adding associated types to the language :big_smile:

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 12:37):

Yes, that's why I mentioned it. But I think Folkert is right that they are not needed.

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:38):

yeah so you couldn't offer Seq.map itself, but since you'd have Seq.walk, you wouldn't need to

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:39):

Seq.len would be implementable if desired

view this post on Zulip Richard Feldman (Aug 29 2022 at 12:39):

which would make Seq like ExactSizeIterator in Rust. There are pros and cons to that

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:00):

could also have it ask for walkUntil instead of walk, and then Seq.first would be implementable

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 13:20):

Richard Feldman said:

which would make Seq like ExactSizeIterator in Rust. There are pros and cons to that

You could have multiple versions that stack on top of each other, just like Iterator and ExactSizeIterator

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:30):

it's possible, but that one seems like a stretch in terms of being worth it :big_smile:

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:33):

are there use cases for nonempty dictionaries and sets? :thinking:

view this post on Zulip Qqwy / Marten (Aug 29 2022 at 13:35):

There are use-cases for many other different kinds of sequential datastructures

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:36):

right, but for those specifically

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:37):

like the way I typically use dictionaries and sets, I'm always asking questions like "is this specific key present?" (and in the case of a dictionary, possibly "is this specific value present?")

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:38):

as opposed to a nonempty list where I sometimes ask questions like "give me the first element from this list, and I know there's one there so I don't want to have to deal with a Result"

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:39):

so I'm wondering whether that's a use case worth thinking about while exploring this idea!

view this post on Zulip Ayaz Hafiz (Aug 29 2022 at 13:40):

I use non-empty sets in OCaml a lot, in particular to choose one arbitrary element and do some processing over it, but I want to make sure there are no duplicate elements that I need to process over

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:41):

interesting!

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:41):

how does OCaml do nonempty sets?

view this post on Zulip Ayaz Hafiz (Aug 29 2022 at 13:42):

It doesn't. I assume an invariant that the set I'm working over is non-empty :)

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:43):

ah :big_smile:

view this post on Zulip léo (Aug 29 2022 at 13:55):

you can have real nonempty sets with GADTs in ocaml

view this post on Zulip Martin Stewart (Aug 29 2022 at 13:55):

Richard Feldman said:

are there use cases for nonempty dictionaries and sets? :thinking:

For nonempty sets I've used it to ensure a user has selected at least one item before submitting a form. I didn't use a nonempty list because each item should be unique.

For nonempty dict I've used it for modelling a user account that has one or more "cases" and each case has an id used to reference it. So NonemptyDict CaseId Case. Part of the account creation is to set up a case so there is guaranteed to be at least one.

view this post on Zulip Richard Feldman (Aug 29 2022 at 13:59):

also interesting, thanks! :thumbs_up:

view this post on Zulip léo (Aug 29 2022 at 14:09):

type (_, 'a) set =
  | Nil : ([`Empty], 'a) set
  | Cons : 'a * (_, 'a) set -> ([`Nonempty], 'a) set

let hd : ([`Nonempty], 'a) set -> 'a = function
  | Cons (hd, _tl) -> hd

let empty = Nil

let cons hd tl = Cons (hd, tl)

let () =
  let s = cons 1 empty in
  Printf.printf "hd = %d\n" (hd s)

That's just non-empty lists, but if you want to have uniqueness, you can have a smart constructor that ensures uniqueness

view this post on Zulip Brendan Hansknecht (Aug 29 2022 at 14:36):

For nonempty sets I've used it to ensure a user has selected at least one item before submitting a form. I didn't use a nonempty list because each item should be unique.

How does this work? If a user unselected an item, wouldn't it degrade into a regular set? Do you just block a user from unselecting unless they have at least 2 items selected?

Aside, what is the gain over just adding a length check at the beginning of your processing/submitting step?

I don't think I have ever used non-empty containers, so trying to understand better.

view this post on Zulip léo (Aug 29 2022 at 14:39):

I'm not sure about how you would use this to interact with the user...

view this post on Zulip léo (Aug 29 2022 at 14:41):

The gains are usually performance (no runtime check) and safety (the invariant is encoded in the type).

view this post on Zulip Brendan Hansknecht (Aug 29 2022 at 14:43):

But if you have to define the type with some sort of recursive structure or union, I would bet that is generally much slower than a linear list/array in memory. Following pointers is generally quite slow.

view this post on Zulip léo (Aug 29 2022 at 14:52):

You don't have to, you can juste have something like :

type (_, 'a) set =
  | Empty : ([`Empty], 'a) set
  | Nonempty : 'a array -> ([`Nonempty], 'a) set

Then ofc, you'll have to have a smart constructor and hide stuff to prevent user to build a Nonempty variant with an empty array.

view this post on Zulip léo (Aug 29 2022 at 15:13):

You would end up with something like:

module Non_empty_array : sig
  type (_, 'a) t
  val empty : ([> `Empty], 'a) t
  val cons : 'a -> ([ `Empty | `Nonempty], 'a) t -> ([`Nonempty], 'a) t
  val hd : ([`Nonempty], 'a) t -> 'a
end = struct

  type (_, 'a) t =
    | Empty : ([>`Empty], 'a) t
    | Nonempty : 'a array -> ([>`Nonempty], 'a) t

  let hd : ([`Nonempty], 'a) t -> 'a = function
    | Nonempty a -> a.(0)

  let empty = Empty

  let cons : 'a -> ([ `Empty | `Nonempty], 'a) t -> ([`Nonempty], 'a) t =
    fun hd -> function
    | Empty -> Nonempty ([|hd|])
    | Nonempty a ->
      let len = Array.length a + 1 in
      let f = function
        | 0 -> hd
        | i -> a.(i + 1)
      in
      let a = Array.init len f in
      Nonempty a
end

let () =
  let open Non_empty_array in
  let s = cons 1 empty in
  Printf.printf "hd = %d\n" (hd s)

(cons is terribly inefficient as I'm using non-resizable arrays, but that another problem)

view this post on Zulip Martin Stewart (Aug 29 2022 at 15:23):

Brendan Hansknecht said:

For nonempty sets I've used it to ensure a user has selected at least one item before submitting a form. I didn't use a nonempty list because each item should be unique.

How does this work? If a user unselected an item, wouldn't it degrade into a regular set? Do you just block a user from unselecting unless they have at least 2 items selected?

A normal set is used to store items while the user is still selecting them. The nonempty set is for places where the selection must be validated. For example, showing the selection on the submit success page, or storing the data in the backend.

Aside, what is the gain over just adding a length check at the beginning of your processing/submitting step?

The advantage of using nonempty set is that it's not possible for me to forget to do the validation when the user presses submit since I need to convert from a set to a nonempty set. Additionally having nonempty set better documents what is intended.

view this post on Zulip Richard Feldman (Aug 29 2022 at 16:53):

anyone have thoughts on nonempty strings, while we're on the subject?

view this post on Zulip Richard Feldman (Aug 29 2022 at 17:03):

e.g. for form fields, validating that things like username and email were nonempty

view this post on Zulip Martin Stewart (Aug 29 2022 at 17:06):

I made a nonempty string package in Elm. My experience after using it for a while is it can be useful* but there's some drawbacks:

  1. Often you end up needing more guarantees than just "this string is not the empty string". Things like, this string contains more than just white space. Or this string isn't longer than 100 chars. For those cases you'll need to make your own opaque type with its own validation at which point you don't gain much from using nonempty internally.
  2. Constructing a nonempty string constant value in Elm is super awkward. You either write NonemptyString 'M' "y string" or write
unsafeNonemptyString : String -> NonemptyString
unsafeNonemptyString text =
    case NonemptyString.fromString text of
        Just nonempty -> nonempty
        Nothing -> unsafeNonemptyString text -- I hope we never reach this

myString = unsafeNonemptyString "My string"

And since Roc doesn't have a char type, the first approach won't work.

*One usecase I had was creating a package for sending emails via sendgrid. The sendgrid API will return an error if the subject field is empty. So I used NonemptyString to ensure the user won't accidentally write code that does that.

view this post on Zulip Brendan Hansknecht (Aug 29 2022 at 18:52):

The advantage of using nonempty set is that it's not possible for me to forget to do the validation when the user presses submit since I need to convert from a set to a nonempty set. Additionally having nonempty set better documents what is intended.

Ah, I guess I wasn't thinking API boundaries and passing data to another function. That makes sense. Now a user has to verify a set is non-empty before even calling certain functions. So just encoding function constraints in type constructors.

view this post on Zulip Thomas Dwyer (Aug 30 2022 at 01:48):

Non-empty lists are a form of "Parse, don't validate." More common in languages with more powerful type systems, it can still be very useful in languages with expressive datatypes like Roc.

view this post on Zulip Qqwy / Marten (Aug 30 2022 at 08:00):

Maybe at some point we might improve the constant value story in Roc by simultaneously:

view this post on Zulip Qqwy / Marten (Aug 30 2022 at 08:03):

Then you could have something like

urlConstant : String -> Url
urlConstant string =
  case Url.parse string of
    Just url -> url
    Nothing -> absurd

view this post on Zulip Qqwy / Marten (Aug 30 2022 at 08:04):

But obviously this would require being able to run semi-arbitrary Roc code at compile time

view this post on Zulip Qqwy / Marten (Aug 30 2022 at 08:05):

I guess it would be nice to indicate in the types of the example that it's a compile-time thing too by the way, otherwise its type signature looks 'too good to be true'

view this post on Zulip Qqwy / Marten (Aug 30 2022 at 08:14):

urlConstant : CompileTime (String -> Url)
urlConstant string =
  case Url.parse string of
    Just url -> url
    Nothing -> absurd

or something

view this post on Zulip Martin Stewart (Aug 30 2022 at 08:15):

I think this was discussed a while back (I don't remember which thread). I would love to have this feature but there are some challenges.

view this post on Zulip Martin Stewart (Aug 30 2022 at 08:19):

I can add that, I've done something similar in Elm:

{-| Be very careful when using this!
-}
url : String -> Url
url urlText =
    case Url.fromString urlText of
        Just url_ ->
            url_

        Nothing ->
            unreachable ()


{-| Be very careful when using this!
-}
message : String -> Message
message text =
    case Message.fromString text of
        Ok ok ->
            ok

        Err _ ->
            unreachable ()


{-| Be very careful when using this!
-}
emailAddress : String -> EmailAddress
emailAddress text =
    case EmailAddress.fromString text of
        Just emailAddress_ ->
            emailAddress_

        Nothing ->
            unreachable ()


{-| Be very careful when using this!
-}
unreachable : () -> a
unreachable () =
    let
        _ =
            causeStackOverflow 0
    in
    unreachable ()


causeStackOverflow : Int -> Int
causeStackOverflow value =
    -- Don't TCO
    causeStackOverflow value + 1

In practice this has never given me any trouble*. So maybe this approach is good enough even without any compiler support

*I've only done this on solo projects so far. Maybe with multiple people, the odds are higher that someone will misuse these functions (i.e. Unsafe.url urlDeterminedAtRuntime)

view this post on Zulip Qqwy / Marten (Aug 30 2022 at 08:23):

Yes, you would need to work with what the cool kids nowadays call a 'gas meter'.
It's a simple trade-off between power/safety and compilation-speed and I do not think there is any way around it.

view this post on Zulip Qqwy / Marten (Aug 30 2022 at 08:23):

For the vast majority of things we want, the amount of time that needs to be spent during compilation will be very low (e.g. not recursive at all), of course.

view this post on Zulip lue (Sep 03 2022 at 06:27):

Cycling back to the annoyances of multiple operations for non-empty or emptiable containers,
there's a safe and convenient approach, introducing a Never/() type argument signaling empty-ability:

type List element emptyTag_ emptiable =
      Empty emptiable
    | Append element (List element Empty emptiable)

-- same with Set, ...
append :
       element
    -> List element Empty emptiable_
    -> List element Empty never_

-- same with Set, ...
top  : List element Empty Never -> element

Set.fromList : List element Empty emptiable -> Set element Empty emptiable

The linked elm API lists more use-cases and on how to implement


Last updated: Jun 16 2026 at 16:19 UTC