Stream: contributing

Topic: Random number generator


view this post on Zulip Martin Stewart (Jul 20 2022 at 15:54):

I'm going to try implementing a random number generator package but I figure I should double check first if anyone has already implemented one.

view this post on Zulip Ayaz Hafiz (Jul 20 2022 at 15:57):

I believe @JanCVanB did something around this

view this post on Zulip Brendan Hansknecht (Jul 20 2022 at 15:58):

https://github.com/JanCVanB/roc-random

view this post on Zulip Martin Stewart (Jul 20 2022 at 15:59):

Nice, good thing I checked then :thumbs_up:

view this post on Zulip jan kili (Jul 20 2022 at 18:12):

:grinning_face_with_smiling_eyes:

view this post on Zulip jan kili (Jul 20 2022 at 18:13):

I got it to v1 in a week, a few months ago, and I was looking for feedback for v2.

view this post on Zulip jan kili (Jul 20 2022 at 18:14):

Feedback, contributions, and collaborators are always welcome! Usage and interest of the lib would make me reprioritize it.

view this post on Zulip jan kili (Jul 20 2022 at 18:19):

This was my first experience with RNG theory & implementation, so some of my naming/terminology might be off.

view this post on Zulip jan kili (Jul 20 2022 at 18:20):

Why are you interested in RNG? :) I was drawn to its seemingly-impure nature (potentially conflicting with Roc's purity) and use cases in art, data visualization, simulation, and games.

view this post on Zulip jan kili (Jul 20 2022 at 18:23):

I was very proud of this plot, even though I never successfully diagnosed the 0 bias: https://raw.githubusercontent.com/JanCVanB/roc-plotters/main/examples/scatter.png

view this post on Zulip Martin Stewart (Jul 20 2022 at 18:27):

I want to implement some of the core package stuff. Random numbers is pretty core!

view this post on Zulip Martin Stewart (Jul 20 2022 at 18:28):

I'm having a look at the code you've written now and experimenting with back passing to see if it can be used here.

view this post on Zulip Brendan Hansknecht (Jul 20 2022 at 18:30):

@JanCVanB I thought we eventually discovered the bug was related to a bug in the roc shifting functions. That should be fixed now (I think?), so maybe updating to latest roc would fix that plot.

view this post on Zulip jan kili (Jul 20 2022 at 18:31):

@Brendan Hansknecht Oh yeah, that might have been where I left it.

view this post on Zulip Qqwy / Marten (Jul 20 2022 at 19:01):

@JanCVanB Which pseudoRNG algorithm does the library implement?

view this post on Zulip Qqwy / Marten (Jul 20 2022 at 19:05):

That a RNG library exists and that we now have expect makes me hyped for a property-based testing library! :happy:

view this post on Zulip Martin Stewart (Jul 20 2022 at 19:05):

Here's a proof of concept of my backpassing idea

interface Random
    exposes [
        getSeed,
        u32,
        finish,
        Seed,
        run
    ]
    imports []

Generator a : Seed -> { value: a, newState: Seed }

Seed := U32

getSeed : U32 -> Seed
getSeed = \seed -> @Seed seed

u32 : U32, U32, (U32 -> Generator a) -> Generator a
u32 = \low, high, func ->
    \(@Seed state) ->
        generator = func (state % (high - low) + low)
        generator
            # The random seed just increments between uses
            (@Seed (state + 1))

finish : a -> Generator a
finish = \value ->
    (\state -> { value: value, newState : state })

# Example user code

point : Generator { x: U32, y: U32 }
point =
    x <- u32 0 9
    y <- u32 10 19
    finish { x, y }

run =
    point (getSeed 123)

view this post on Zulip Qqwy / Marten (Jul 20 2022 at 19:09):

@Martin Stewart That's great! We might even be able to go a step further with inferring the return type

view this post on Zulip Brendan Hansknecht (Jul 20 2022 at 19:09):

It uses a form of pcg if I recall

view this post on Zulip Martin Stewart (Jul 20 2022 at 19:10):

I'm not sure if this is what you mean but at first I tried writing int : Int b, Int b, (Int b -> Generator a) -> Generator a instead of u32 : U32, U32, (U32 -> Generator a) -> Generator a but Roc kept saying that it was inferring Int b as Int Unsigned32 and that was preventing me from adding two ints together.

view this post on Zulip Qqwy / Marten (Jul 20 2022 at 19:10):

So you can just have

myVal : Generator {x: U32, y: I64, z: List Int64, w: Str}
x <- range 0 9
y <- range 10 19
z <- random
w <- random
finish {x, y, z, w}

view this post on Zulip Martin Stewart (Jul 20 2022 at 19:12):

I'm not sure I understand, could you provide an example of how random would be implemented?

view this post on Zulip jan kili (Jul 20 2022 at 19:19):

Yes it's pcg based, and I put pcg links in the Roc comments

view this post on Zulip Qqwy / Marten (Jul 20 2022 at 19:20):

:thinking: To implement random we need an ability (Roc's version of what is known as a 'trait' or 'typeclass' or roughly an 'interface' in other languages).
Adding support for abilities is one of the things which is very recent (and it's not entirely finished yet if I remember correctly) but it should be possible to write this soon.

view this post on Zulip Martin Stewart (Jul 20 2022 at 19:30):

Gotcha. I don't think that's a good idea because, if I understand correctly, that means assigning exactly 1 way to randomly generate a given type. Most types will need some configuration, for example Random.str probably will take a min and max length (and maybe there could be another version called Random.ascii).

view this post on Zulip Martin Stewart (Jul 20 2022 at 19:32):

@JanCVanB If the backpassing approach looks good, I can try adding it to your code and making a PR

view this post on Zulip Qqwy / Marten (Jul 20 2022 at 19:33):

Yes, you're absolutely right! Sometimes you do want to customize something further!

...

https://media1.giphy.com/media/3o85xIO33l7RlmLR4I/giphy.gif?cid=c623cb350xcnug6zjprgvhqji08s64h9f3z7qarern1qb1lc&rid=giphy.gif&ct=g

:nerd:

view this post on Zulip jan kili (Jul 20 2022 at 20:36):

@Martin Stewart PRs are very welcome, thank you! I haven't really considered backpacking, so I'm looking forward to digging into your example above.

view this post on Zulip jan kili (Jul 20 2022 at 21:01):

:laughing: backpaSSing

view this post on Zulip jan kili (Jul 20 2022 at 23:26):

@Martin Stewart I think I see - is the backpassing support just to simplify the use of one-time generators?

view this post on Zulip jan kili (Jul 20 2022 at 23:46):

I definitely care about improving the ergonomics, so I'm open to stuff like this!

view this post on Zulip Martin Stewart (Jul 21 2022 at 06:29):

I'm not sure what you mean by one-time generators but yes this should simplify all generators

view this post on Zulip jan kili (Jul 21 2022 at 10:18):

In your example above, there is some conflation between seed & state (a state is initialized with a seed) and since both seeds & output values are U32 it's a little tricky to say for certain whether variables like x and y are intended as seeds or output values (the backpassing callback function input parameter name calls them seeds, but they're returned from point like output values).

view this post on Zulip jan kili (Jul 21 2022 at 10:19):

However, details aside, using backpassing with generators seems like it would prevent state chaining.

view this post on Zulip jan kili (Jul 21 2022 at 10:20):

This means that if your x and y domains were identical, they would generate identical values for each iteration (because using the same seed leads to the same sequence of values, without chaining).

view this post on Zulip jan kili (Jul 21 2022 at 10:21):

In general, I see backpassing as primarily for asynchronous processes, but in this case it's used to avoid... what?

view this post on Zulip jan kili (Jul 21 2022 at 10:22):

I may be missing something critical here, so please let me know! :)

view this post on Zulip jan kili (Jul 21 2022 at 10:23):

I'd be curious to see one of the roc-random examples rewritten with backpassing, that might help me understand.

view this post on Zulip Martin Stewart (Jul 21 2022 at 10:24):

This code really is passing along a new seed to each generator. So x and y will get different values event if they had the same range

point : Generator { x: U32, y: U32 }
point =
    x <- u32 0 9
    y <- u32 10 19
    finish { x, y }

run : { value : { x: U32, y: U32 }, newState: Seed }
run =
    point (getSeed 123)

It might be easier to see why this is possible if I replace the backpassing with the desugared version:

point : Generator { x: U32, y: U32 }
point =
    # The second u32 is passed in as a parameter to the first u32 so it's possible for it to be given the next seed
    u32 0 9 (\x -> u32 10 19 (\y -> finish { x, y }))

run : { value : { x: U32, y: U32 }, newState: Seed }
run =
    point (getSeed 123)

view this post on Zulip jan kili (Jul 21 2022 at 10:25):

Oh, wait, sorry, I just realized your example above is a from-scratch RNG implementation - I thought it was modifying my roc-random API.

view this post on Zulip Martin Stewart (Jul 21 2022 at 10:25):

I'll update your roc-random examples once I have the rest of my PR ready

view this post on Zulip jan kili (Jul 21 2022 at 10:27):

Ah! Are you using output values as state?

view this post on Zulip jan kili (Jul 21 2022 at 10:28):

Yes, that's it - okay. My seed chaining confusion came from the fact that I was assuming internal state and external return values were distinct.

view this post on Zulip jan kili (Jul 21 2022 at 10:30):

It's advantageous with PCG to have a transformation step during value return. By preventing direct exposure of the internal state, the randomness (against either just malicious actors or also stats, I can't remember) improves.

view this post on Zulip jan kili (Jul 21 2022 at 10:31):

Therefore, perhaps I'm simply expecting more complex parameters passed into the backpassing callback.

view this post on Zulip jan kili (Jul 21 2022 at 10:33):

I still don't see where the internal state (or any seed) is passed/chained between generators, though...

view this post on Zulip Martin Stewart (Jul 21 2022 at 10:38):

It's the last line here that passes the seed into the next generator.

u32 : U32, U32, (U32 -> Generator a) -> Generator a
u32 = \low, high, func ->
    \(@Seed state) ->
        generator = func (state % (high - low) + low)
        generator
            # The random seed just increments between uses
            (@Seed (state + 1))

I can give a more detailed explanation later if you'd like. Busy at the moment unfortunately

view this post on Zulip jan kili (Jul 21 2022 at 10:40):

That part makes sense, thank you.

view this post on Zulip jan kili (Jul 21 2022 at 10:41):

I'm wondering how point works, because I don't see how its input Seed is passed to its u32 subgenerators.

view this post on Zulip jan kili (Jul 21 2022 at 10:42):

Is that the magic of backpassing?

view this post on Zulip jan kili (Jul 21 2022 at 10:44):

finish = \value ->
    (\state -> { value: value, newState : state })
...
point =
    x <- u32 0 9
    y <- u32 10 19
    finish { x, y }

This seems like point will return the same .value every time.

view this post on Zulip jan kili (Jul 21 2022 at 11:05):

Your desugared version above (thank you for that, very helpful for my backpassing noob brain) is
point = u32 0 9 (\x -> u32 10 19 (\y -> finish { x, y }))
but I'm expecting something more like
...

view this post on Zulip jan kili (Jul 21 2022 at 11:06):

GAH! While thinking what I wanted in the second code snippet above, I think I figured out how this backpassing magic works :sweat_smile:

view this post on Zulip jan kili (Jul 21 2022 at 11:06):

My brain hurts. So many overlapping functions...

view this post on Zulip jan kili (Jul 21 2022 at 11:17):

Your backpassing system might be amazing :smiley: I'll have to dig in more later!

view this post on Zulip jan kili (Jul 21 2022 at 11:23):

If it smoothly (though subtly) eliminates the need for Random.next/.step inside custom generators, that's great!

view this post on Zulip jan kili (Jul 21 2022 at 11:26):

I'm looking forward to implementing/seeing this with Generations and I8 values.

view this post on Zulip jan kili (Jul 21 2022 at 11:27):

I wonder if renaming some variables would help me not need a 10-second mental loading spinner every time I re-read the backpassing lines :laughing:

view this post on Zulip jan kili (Jul 21 2022 at 16:03):

I'm curious to see whether this backpassing system works better for simpler use cases or more complex use cases! Right now I think it's good for chaining but not for one-shots... but I'm vascillating on that with my shifting mental pseudocode. I'll have to try some examples!

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 16:21):

Even with one off uses, it can often make something look more like linear code and help reduce indenting

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 16:22):

So helps with readability

view this post on Zulip jan kili (Jul 21 2022 at 16:51):

When there's already callbacks involved yes, but this system is adding both callbacks and backpassing at the same time, so they're equally flat. This might still make it more readable, though.

view this post on Zulip jan kili (Jul 21 2022 at 16:53):

The readability gains would just be in hiding the state machine's internals, rather than decreasing indentation.

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 16:54):

Ah, yeah. Makes sense. Hiding state can definitely be really nice in Roc. Otherwise you get everyone's favorite variables state0, state1, state2, state3, ...

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 16:54):

This is the one reason I really miss shadowing.

view this post on Zulip jan kili (Jul 21 2022 at 16:55):

I made helper functions to pass state around externally, but x.value and Random.step are ugly tbh.

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 16:57):

Yeah, which is definitely an improvement and what I did for the second version of the wasm cpu emulator in Roc.

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 16:57):

But definitely less nice than bsckpassing

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 16:57):

Though probably a lot faster to compile

view this post on Zulip jan kili (Jul 21 2022 at 17:04):

Interesting, I hadn't even begun to think about performance implications

view this post on Zulip jan kili (Jul 21 2022 at 17:05):

The nested callback function call stack for each generator chain... Probably fine

view this post on Zulip Qqwy / Marten (Jul 21 2022 at 17:58):

Hopefully that will become less of a thing in the near future :sweat_smile:

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 18:49):

Oh, I more meant that no matter what it will cost more to compilation time. It means you have lambdas that have to be inlined and optimized. That is not the case for directly passing state around. So it will always have some form of higher cost. Should be relatively small though (if we do things right)

view this post on Zulip Brendan Hansknecht (Jul 21 2022 at 18:50):

For runtime, it should get inlined and have zero cost

view this post on Zulip jan kili (Jul 22 2022 at 04:55):

@Martin Stewart I've been playing around with refactoring roc-random to internalize its chained generator state management with backpassing instead of externalizing it, but I can't run it yet due to seemingly-unrelated compilation issues that I'll have to debug another time. In the meantime, this syntax looks really promising for cleanliness!

view this post on Zulip jan kili (Jul 22 2022 at 05:00):

I only have one remaining concern right now, and that's regarding chaining chains. For example, your point generator above - I believe it can't be chained with other point generators (as currently written) because the only state/seed it can return is the initial state/seed. Is that correct? It seems like the backpassing hiding state management (good) is also preventing that state from chaining with other contexts above/beyond the function context that calls those backpasses (bad), since that means multiple points requires multiple seedings.

view this post on Zulip Qqwy / Marten (Jul 22 2022 at 06:40):

In the example, finish extracts the final state (in the example: after two RNG 'hops'), not the initial state.

view this post on Zulip Qqwy / Marten (Jul 22 2022 at 06:41):

What could be added is a function Generator a -> (a -> Generator b) -> Generator b which could be called chain or andThen or something similar. This would allow chaining arbitrary generators together (like calling point multiple times on the same generator)

view this post on Zulip Martin Stewart (Jul 22 2022 at 13:11):

Yeah, as Marten said, the latest seed does get returned. But there will need to be an andThen or something in order to make a finished generator compose with other generators

view this post on Zulip Martin Stewart (Jul 22 2022 at 13:12):

Haven't had a chance to continue working on it but I should have time this weekend

view this post on Zulip jan kili (Jul 22 2022 at 13:15):

Ah, so the finish is just a final wrapping step. Thank you both for that clarification!

view this post on Zulip jan kili (Jul 22 2022 at 13:16):

Yes, it seems that helpers like Result.step and/or Result.next will still be necessary when chaining without backpassing.

view this post on Zulip jan kili (Jul 22 2022 at 13:16):

I wonder if it would be valuable to support both ways, backpassing and not...

view this post on Zulip Richard Feldman (Jul 22 2022 at 13:27):

oh definitely - for example, sometimes it's useful to store the seed in application state and then come back to it later

view this post on Zulip Richard Feldman (Jul 22 2022 at 13:27):

I did this a bunch in the implementation of elm-test

view this post on Zulip Richard Feldman (Jul 22 2022 at 13:28):

because it would run the tests in parallel, but for the property-based tests, you could pass in a --seed value which would deterministically reproduce the results of a previous randomly-generated run

view this post on Zulip Richard Feldman (Jul 22 2022 at 13:28):

and the way it did that was by storing the seeds in the application state in between test runs

view this post on Zulip Richard Feldman (Jul 22 2022 at 13:29):

so it doesn't read as well when chaining several random operations together, but much nicer for storage!

view this post on Zulip jan kili (Jul 22 2022 at 14:12):

Is there a convention for how to support both patterns? Perhaps some sort of wrapper? Or is it just API x 2?

view this post on Zulip jan kili (Jul 22 2022 at 14:22):

I can probably figure something out by playing with callbacks and wrappers, but I don't see it just yet...

view this post on Zulip Qqwy / Marten (Jul 22 2022 at 14:59):

What do you mean?
Using the andThen/finish based API does not prevent you from using the manual API interchangeably in your program. (Nor vice-versa)

view this post on Zulip jan kili (Jul 22 2022 at 18:16):

I hope to still support simple calls like this:

diceRoll = (Random.int 1 6) myNoise
Stdout.line (Num.toStr diceRoll)

view this post on Zulip jan kili (Jul 22 2022 at 18:18):

Adding a callback parameter to Random.int seems to deprecate the above simplicity, no?

view this post on Zulip jan kili (Jul 22 2022 at 18:21):

(granted, the current API isn't quite that simple, as it includes a { value: diceRoll } destructuring of the wrapped state, but the above would be nice)

view this post on Zulip jan kili (Jul 22 2022 at 18:26):

I might be talking nonsense, though, as I still haven't finished the refactoring to see the changes in action!

view this post on Zulip jan kili (Jul 22 2022 at 18:26):

Thank you VERY much @Anton for unsticking me on Rust/Nix stuff :)

view this post on Zulip Martin Stewart (Jul 24 2022 at 21:21):

So my from the beginning the idea was to not need any Random.andThen and you'd only write things like

randomPoint =
    x <- Random.u32
    y <- Random.u32
    finish { x, y }

randomPerson =
    position <- randomPoint
    age <- Random.u16
    finish { position, age }

randomPeople =
    count <- Random.u8
    people <- Random.list count randomPerson
    finish people

foo =
    { value, newSeed } = Random.generate (Random.initialSeed 123) randomPeople
    value

It seems like this isn't possible to do though. As @JanCVanB pointed out randomPoint doesn't have the right type to be used the same way Random.u32 does for example. Maybe I'll call off that PR unless I come up with some better idea :sweat_smile:

view this post on Zulip Qqwy / Marten (Jul 24 2022 at 21:26):

I'm pretty sure there is a way to make it work, probably by just slightly altering the output type of finish.
I'm on vacation this week so I can only take a look afterwards though :zip_it:

view this post on Zulip Martin Stewart (Jul 24 2022 at 21:30):

That would be super cool! I'm interested in seeing how it can be done

view this post on Zulip jan kili (Jul 24 2022 at 21:30):

This backpassing stuff is a fantastic idea @Martin Stewart, thank you for proposing it! Yes, I intend for both primitive and custom generators to behave/chain identically, but what do you think, should that not be a requirement?

view this post on Zulip Qqwy / Marten (Jul 24 2022 at 21:31):

Haskell's Data.Random module allows this. But translating that to Roc might be a bit tricky unless you know Haskell well because Haskell code is quite terse (and you need to take extra care to translate stuff that might depend on Haskell's lazy-by-default nature to Roc, which is strict)

view this post on Zulip Qqwy / Marten (Jul 24 2022 at 21:31):

Sorry, I should stop reading and responding this late

view this post on Zulip Martin Stewart (Jul 24 2022 at 21:32):

JanCVanB said:

This backpassing stuff is a fantastic idea Martin Stewart, thank you for proposing it! Yes, I intend for both primitive and custom generators to behave/chain identically, but what do you think, should that not be a requirement?

I agree that it would be ideal if primitive and custom generates behave the same way.

view this post on Zulip Qqwy / Marten (Jul 24 2022 at 21:32):

The Haskell implementation does make liberal use of andThen

view this post on Zulip jan kili (Jul 24 2022 at 21:32):

I like the example above, how it chains lots of things together :))) very much what I'd like to see


Last updated: Jul 06 2025 at 12:14 UTC