Stream: beginners

Topic: number ranges without heap allocations


view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 18:46):

Can someone explain a way to make a generator for a long array of numbers that would just, well, make numbers? List.range is no good as it seems to allocate the actual list (even if I will only ever walk the thing)

view this post on Zulip Luke Boswell (Aug 26 2024 at 18:48):

This may be helpful https://github.com/lukewilliamboswell/roc-random

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 18:51):

Luke Boswell said:

This may be helpful https://github.com/lukewilliamboswell/roc-random

which part of it? it still uses List.range to make the numbers, so it will OOM if I want to loop over 10 billion numbers

view this post on Zulip Sam Mohr (Aug 26 2024 at 19:02):

If you want to loop over 10 billion numbers in a functional language, you'll want to write a recursive function that is tail call optimized into a "while loop" at runtime.

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:04):

Well I do not want 10billion, I just want 1 billion, and I get a segfault with this...

app [main] { pf: platform "https://github.com/roc-lang/basic-cli/releases/download/0.14.0/dC5ceT962N_4jmoyoffVdphJ_4GlW3YMhAPyGPr-nU0.tar.br" }

import pf.Stdout
import pf.Task

loop = \i, max ->
    if i < max then
        i + loop (i + 1) max
    else
        i

main =
    x = 1000 * 1000 * 1000
    z = loop 0 x
    # z = List.range {start: At 0, end: Before x} |> List.walk 0 Num.add
    Stdout.line! "Hello, World! $(Num.toStr z)"

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:05):

and yes, I am fully aware that this not what roc was designed to do, but I think it is still relevant for the language that aims at high performance to not allocate arrays on the heap that will be insta-discarded after 1 use. even python is not that wasteful =)

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:06):

Of course, not segfaulting would be nice too =)I'll go file a bug.

view this post on Zulip Sam Mohr (Aug 26 2024 at 19:06):

Well, that's not gonna be tail-call optimized, since the looping happens in the middle of the function

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:06):

Sam Mohr said:

Well, that's not gonna be tail-call optimized, since the looping happens in the middle of the function

em? it is for all practical purposes... how else should I write it?

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:08):

It isn't a tail call cause addition has to happen after the next loop call returns.

You need a state to accumulate into

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:09):

errrr... hmhmh. ok. still it should not segfault on me, right?

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:09):

loop = \sum, i, max ->
    if i < max then
        loop (sum + i) (i + 1) max
    else
        sum

view this post on Zulip Sam Mohr (Aug 26 2024 at 19:10):

Yeah, probably not. It might be segfaulting because of an addition overflow? I'd have to gdb it

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:11):

I would expect it to stack overflow

view this post on Zulip Sam Mohr (Aug 26 2024 at 19:11):

I presume you're running roc <your-file>.roc, which would run the dev backend

view this post on Zulip Sam Mohr (Aug 26 2024 at 19:12):

So something in our custom assembly builder isn't handling either a stack overflow or a number addition failure, probably the first

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:12):

I am not getting the stack overflow, only SEGV.

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:14):

rustc  count.rs
headhunter@hunter-laptop /tmp> ./count
499999999500000000
headhunter@hunter-laptop /tmp> cat count.rs
fn main(){
   let s:u64 = (0u64..1000*1000*1000).sum();
    println!("{s}");
}

there is no overflow in this program as far as rust can tell

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:15):

I'm assuming that roc defaults to u64 for num representation here (not sure that is true, and even if it does not the program would just run forever)

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:15):

I64

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:15):

But yeah, definitely more than just the stack overflow issue being hit here

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:16):

loop = \sum, i, max ->
    if i < max then
        loop (sum + i) (i + 1) max
    else
        sum

this does not crash and gives correct answer

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:17):

so yes, missing TCO apparently leads to SEGV. which is kinda sad.

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:17):

but I would not even be here if I could have a generator for this sort of thing that I can then walk over

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:18):

feel like I'm inventing a range from python here, which feels wrong

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:18):

I mean rust has (min..max) notation that just gives you an iterator and you do not even need to think about it, would not be out of place in roc imo

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:20):

Yeah, iterators/streams have been talked about a few times on Zulip. There are concerns about splitting the ecosystem between lists and streams. But they are the right solution for certain problems.

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:22):

Brendan Hansknecht said:

Yeah, iterators/streams have been talked about a few times on Zulip. There are concerns about splitting the ecosystem between lists and streams. But they are the right solution for certain problems.

well its either that or better optimization passes that can detect when list is abused as iterator so it does not get backed by memory.

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:22):

either way it is a massive gotcha, as I would have never in my life expected List.range to allocate

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:22):

Interesting. It returns a List though

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:24):

Today, has gotchas but forces users to be explicit about allocations. Technically iterators can be built up in userland if required. But yeah, it isn't a great state for usability.

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:25):

Feels like roc needs an Iterator trait ability

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:26):

this way you can have functions taking in things that have Iterator ability rather than specifically Lists

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:28):

That would sadly require higher kinded abilites. Which roc does not have.

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:28):

At least that is my understanding of the issue

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 19:28):

But haven't looked into it in a while

view this post on Zulip Alexander Pyattaev (Aug 26 2024 at 19:40):

Brendan Hansknecht said:

That would sadly require higher kinded abilites. Which roc does not have.

what are those?

view this post on Zulip Brendan Hansknecht (Aug 26 2024 at 20:09):

An ability with a type variable. In the case of an iterator, it is needed to capture the element type.

In roc, you could make an I64Iterator ability, but not a generic Iterator ability

view this post on Zulip Alexander Pyattaev (Aug 27 2024 at 06:38):

Well in this case maybe some methods on numeric types would be nice to make ranges?

view this post on Zulip Alexander Pyattaev (Aug 27 2024 at 06:38):

as a stopgap measure at least

view this post on Zulip batyoullfly (Aug 27 2024 at 11:13):

Brendan Hansknecht said:

That would sadly require higher kinded abilites. Which roc does not have.

It would? You can have type variables in the functions belonging to an ability, and there are iterator interfaces in languages like C#/F# and Java where there are no higher kinded types

view this post on Zulip batyoullfly (Aug 27 2024 at 11:24):

Iterator implements next : {} -> Result a [NoMoreElements] something like this?

view this post on Zulip batyoullfly (Aug 27 2024 at 11:26):

Or I guess there’s a rule about the type variables needing to implement the ability

view this post on Zulip batyoullfly (Aug 27 2024 at 11:34):

Why is that a restriction exactly?

view this post on Zulip batyoullfly (Aug 27 2024 at 11:44):

Oh, well I guess these aren’t methods with implicit access to an iterator object so you would need to have the iterator as an argument even if that wasn’t being enforced. Hmm.
Couldn’t you still do something like
Iterator implements next : iter, a -> { state: iter, elem: Result a [NoMoreElements] } where iter implements Iterator
and just leave it up to the implementation to ensure that a is in fact the element type of the type implementing the ability?

view this post on Zulip batyoullfly (Aug 27 2024 at 12:07):

I guess then the issue is you couldn’t actually use it very effectively. Is there any reason abilities shouldn’t be able to take type parameters themselves as long as they are concrete types?

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 16:55):

This is the general faq on that: https://www.roc-lang.org/faq#arbitrary-rank-types

view this post on Zulip Notification Bot (Aug 27 2024 at 18:41):

47 messages were moved here from #beginners > Processing output in a CLI tool by Richard Feldman.

view this post on Zulip Richard Feldman (Aug 27 2024 at 18:44):

Alexander Pyattaev said:

Can someone explain a way to make a generator for a long array of numbers that would just, well, make numbers? List.range is no good as it seems to allocate the actual list (even if I will only ever walk the thing)

what if we added this to Num? (the ...etc part is the rest of the configuration record that List.range uses)

Num.walkRange :
    { start: [At (Int a), Before (Int a), After (Int a)], ...etc },
    state,
    (state, Num a -> state)
    -> state

view this post on Zulip Sam Mohr (Aug 27 2024 at 18:49):

I think this would be a great allocation-free way to achieve the result, but points to an underlying issue that streams would solve in many ways. We have

Because we want to have iterators, but don't want to support higher-kinded types. So having some mechanism to achieve that result would lower the cardinality of our APIs while also avoiding allocations.

view this post on Zulip Sam Mohr (Aug 27 2024 at 18:50):

I will try to look around this weekend at potential solutions that won't require our language to open the floodgates for type abuse

view this post on Zulip batyoullfly (Aug 27 2024 at 20:05):

Brendan Hansknecht said:

This is the general faq on that: https://www.roc-lang.org/faq#arbitrary-rank-types

Would allowing abilities to have type parameters be a form of higher ranked types?

view this post on Zulip batyoullfly (Aug 27 2024 at 20:23):

Would it be rank 2? Because instead of a function in an implementation being able to satisfy a signature of a -> a by accepting a specific type and returning that specific type, it would have to be able to accept any arbitrary type?

view this post on Zulip Sven van Caem (Aug 27 2024 at 20:23):

Sam Mohr said:

I will try to look around this weekend at potential solutions that won't require our language to open the floodgates for type abuse

I believe, to this end, the plan is for the compiler to eventually support deforestation/stream fusion, which would automatically combine successive list operations to avoid intermediate allocations. This way List.walkBackwards could be replaced by List.reverse |> List.walk with no loss in performance. In the meantime though, having this expanded family of list operations acts as a stop-gap to achieve the desired performance characteristics anyway.

view this post on Zulip Sam Mohr (Aug 27 2024 at 20:43):

Yes, that was a mechanism proposed by Rich. It would probably be better for users to not need to know the difference between List and Iterators for perf reasons, but there are some spots where we avoid giving "easy ways out", e.g. roc-unicode being a separate package instead of a standard library add with simple Str.len tools.

view this post on Zulip Sam Mohr (Aug 27 2024 at 20:45):

So yes, I'm okay with the expanded family for now, but it seems like if we could find a way to do iterators, that would be better than what we have. Meaning the issue isn't that iterators are the wrong solution, but that we don't know how to implement them in Roc's current type system.

view this post on Zulip Richard Feldman (Aug 27 2024 at 21:02):

for what it's worth, I spent a lot of time looking into various iterator designs and my conclusion was that "there are more functions exposed" was a better design than any of the ones I came up with

view this post on Zulip Richard Feldman (Aug 27 2024 at 21:05):

maybe another way to say it is that I prefer the downside of "this API could be a few functions smaller" to the downside of "there is a new entire chapter's worth of complexity in the tutorial on iterators"

view this post on Zulip Richard Feldman (Aug 27 2024 at 21:09):

that's not to say we should never do it, just that it's in the "explored various designs and none seemed like they were worth the complexity cost" state rather than the "hasn't really been investigated seriously" state :big_smile:

view this post on Zulip Sam Mohr (Aug 27 2024 at 21:19):

Yeah, that tracks. If there was an easy way to do it, it would've been in the language a few years ago.

view this post on Zulip Sam Mohr (Aug 27 2024 at 21:20):

Which is why I think maybe some macro to combine the various walk* functions in the compiler would be the way to go. Duplication in the compiler isn't bad if the stdlib API is nicer to use.

view this post on Zulip batyoullfly (Aug 27 2024 at 21:39):

Other than the number of functions for dealing with Lists, lack of any way to implement iterators still means there’s no generic way of writing a function that works with an unknown type of collection, right?

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:41):

Actually that is doable and done for decode and encode

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:41):

Or maybe it is done for the future version of encode and decode that aren't implemented yet

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:46):

SequenceWalker state seq elem : seq, state, (state, elem -> state) -> state
LengthInfo : [Length U64, UnknownLength]

# In the encode ability for sequences:
    sequence :
        seq,
        LengthInfo,
        SequenceWalker state seq elem,
        (elem -> FutureEncoder state)
        -> FutureEncoder state where state implements FutureEncoderFormatting

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:46):

You then can pass List.walk or Set.walk into that and it would work. Theoretically could also pass a lazy Iterator.walk.

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:47):

That said, it is technically missing information. It doesn't know the type of the element in the container. It just knows you have an impl of Container.walk that generates an element type. Technically that walk function could be mapping the element type to something different

view this post on Zulip batyoullfly (Aug 27 2024 at 21:49):

So I guess similar to what I asked about earlier with having an iterator ability, you could do operations on the collection but you couldn’t, for example, implement a sum function that works for all collections of Nums

view this post on Zulip Richard Feldman (Aug 27 2024 at 21:51):

that's correct

view this post on Zulip Richard Feldman (Aug 27 2024 at 21:51):

that was actually the original motivation for my looking into it

view this post on Zulip Richard Feldman (Aug 27 2024 at 21:51):

it seemed like something that the language "should be able to do" in the abstract, but when I got down to it I honestly had more and more trouble convincing myself it was actually important in practice as opposed to something that sounds important in the abstract :sweat_smile:

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:51):

Yeah, and you have to be careful about some weird cases. Like if you require elem as an input to specify the input type, that may lead to impossible cases the user doesn't have a way to generate an element type

view this post on Zulip batyoullfly (Aug 27 2024 at 21:52):

I was kind of thinking the same. Like in my mind it seems frustrating and strange that you wouldn’t be able to operate on generic collections of some specific type, but also I have never needed to do that really so it wouldn’t change anything for me in practice using the language

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:53):

Aside, isn't sequence walker almost the entire definitely or an iterator? Besides making apis inconvenient cause you have to pass in a input type and a walk function, I think it would work for most things iterators

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:53):

You could make a version that uses Num a as the element type and implement sum

view this post on Zulip batyoullfly (Aug 27 2024 at 21:56):

Oh? That’s cool, I think I don’t quite understand the usage of the example you gave then. I guess in situations where you’re writing library code and really anticipate people wanting to swap in their own data structure rather than using lists, you could always provide an alternate API for doing so, even if it’s a little awkward, because most of the time you’ll just use a list anyway

view this post on Zulip Brendan Hansknecht (Aug 27 2024 at 21:57):

Yeah, if you expect that you need to support multiple container types and you only need to walk the container. You could force the user to pass on the walk function and it would be functional. And they could even define a "container" that is really an iterator and doesn't have backing storage.

view this post on Zulip Brendan Hansknecht (Aug 28 2024 at 02:28):

So I guess iterables can actually work today in roc. A bit of a strange setup, but should be functional. Also, the typing is a bit jank. It does work though..

Also, I'm sure something nicer could be made from this. Just the first thing I tested really:

Functional iterator

view this post on Zulip Brendan Hansknecht (Aug 28 2024 at 02:29):

Since abilities get monomorphize, roc will generate the concrete Walker type that is created by a walk function. Thus, later uses will type check/error correctly based on the concrete final types.

view this post on Zulip Brendan Hansknecht (Aug 28 2024 at 02:33):

I guess that would suggest that something like this should just work:

Iterator implements
    next : iter -> (iter, Result elm [NoMoreElements]) where iter implements Iterator

view this post on Zulip Brendan Hansknecht (Aug 28 2024 at 02:35):

Of course you really need an extra layer of indirection. Cause an iterator is not just a container. It is general a container plus some extra state. But I guess it could be paired with:

Iterable implements
    toIter : iterable -> iterator where iterable implements Iterable, iterator implements Iterator

This definitely is throwing out some type information. Like we have no idea what the element type is from the signature, but I think it would still compile and work in roc.

view this post on Zulip Brendan Hansknecht (Aug 28 2024 at 02:35):

Kinda feels like magic that shouldn't work.

view this post on Zulip Sam Mohr (Aug 28 2024 at 02:36):

I did something earlier at work, I'm gonna benchmark stuff when I get home


Last updated: Jul 06 2025 at 12:14 UTC