Random number generator · contributing

I'm going to try implementing a random number generator package but I figure I should double check first if anyone has already implemented one.

Ayaz Hafiz (Jul 20 2022 at 15:57):

Brendan Hansknecht (Jul 20 2022 at 15:58):

Martin Stewart (Jul 20 2022 at 15:59):

jan kili (Jul 20 2022 at 18:12):

jan kili (Jul 20 2022 at 18:13):

I got it to v1 in a week, a few months ago, and I was looking for feedback for v2.

jan kili (Jul 20 2022 at 18:14):

Feedback, contributions, and collaborators are always welcome! Usage and interest of the lib would make me reprioritize it.

jan kili (Jul 20 2022 at 18:19):

This was my first experience with RNG theory & implementation, so some of my naming/terminology might be off.

jan kili (Jul 20 2022 at 18:20):

Why are you interested in RNG? :) I was drawn to its seemingly-impure nature (potentially conflicting with Roc's purity) and use cases in art, data visualization, simulation, and games.

jan kili (Jul 20 2022 at 18:23):

Martin Stewart (Jul 20 2022 at 18:27):

I want to implement some of the core package stuff. Random numbers is pretty core!

Martin Stewart (Jul 20 2022 at 18:28):

I'm having a look at the code you've written now and experimenting with back passing to see if it can be used here.

Brendan Hansknecht (Jul 20 2022 at 18:30):

@JanCVanB I thought we eventually discovered the bug was related to a bug in the roc shifting functions. That should be fixed now (I think?), so maybe updating to latest roc would fix that plot.

jan kili (Jul 20 2022 at 18:31):

Qqwy / Marten (Jul 20 2022 at 19:01):

Qqwy / Marten (Jul 20 2022 at 19:05):

That a RNG library exists and that we now have expect makes me hyped for a property-based testing library! :happy:

Martin Stewart (Jul 20 2022 at 19:05):

interface Random
    exposes [
        getSeed,
        u32,
        finish,
        Seed,
        run
    ]
    imports []

Generator a : Seed -> { value: a, newState: Seed }

Seed := U32

getSeed : U32 -> Seed
getSeed = \seed -> @Seed seed

u32 : U32, U32, (U32 -> Generator a) -> Generator a
u32 = \low, high, func ->
    \(@Seed state) ->
        generator = func (state % (high - low) + low)
        generator
            # The random seed just increments between uses
            (@Seed (state + 1))

finish : a -> Generator a
finish = \value ->
    (\state -> { value: value, newState : state })

# Example user code

point : Generator { x: U32, y: U32 }
point =
    x <- u32 0 9
    y <- u32 10 19
    finish { x, y }

run =
    point (getSeed 123)

Qqwy / Marten (Jul 20 2022 at 19:09):

@Martin Stewart That's great! We might even be able to go a step further with inferring the return type

Brendan Hansknecht (Jul 20 2022 at 19:09):

Martin Stewart (Jul 20 2022 at 19:10):

I'm not sure if this is what you mean but at first I tried writing int : Int b, Int b, (Int b -> Generator a) -> Generator a instead of u32 : U32, U32, (U32 -> Generator a) -> Generator a but Roc kept saying that it was inferring Int b as Int Unsigned32 and that was preventing me from adding two ints together.

Qqwy / Marten (Jul 20 2022 at 19:10):

myVal : Generator {x: U32, y: I64, z: List Int64, w: Str}
x <- range 0 9
y <- range 10 19
z <- random
w <- random
finish {x, y, z, w}

Martin Stewart (Jul 20 2022 at 19:12):

I'm not sure I understand, could you provide an example of how random would be implemented?

jan kili (Jul 20 2022 at 19:19):

Qqwy / Marten (Jul 20 2022 at 19:20):

:thinking: To implement random we need an ability (Roc's version of what is known as a 'trait' or 'typeclass' or roughly an 'interface' in other languages).
Adding support for abilities is one of the things which is very recent (and it's not entirely finished yet if I remember correctly) but it should be possible to write this soon.

Martin Stewart (Jul 20 2022 at 19:30):

Gotcha. I don't think that's a good idea because, if I understand correctly, that means assigning exactly 1 way to randomly generate a given type. Most types will need some configuration, for example Random.str probably will take a min and max length (and maybe there could be another version called Random.ascii).

Martin Stewart (Jul 20 2022 at 19:32):

@JanCVanB If the backpassing approach looks good, I can try adding it to your code and making a PR

Qqwy / Marten (Jul 20 2022 at 19:33):

Yes, you're absolutely right! Sometimes you do want to customize something further!

jan kili (Jul 20 2022 at 20:36):

@Martin Stewart PRs are very welcome, thank you! I haven't really considered backpacking, so I'm looking forward to digging into your example above.

jan kili (Jul 20 2022 at 21:01):

jan kili (Jul 20 2022 at 23:26):

@Martin Stewart I think I see - is the backpassing support just to simplify the use of one-time generators?

jan kili (Jul 20 2022 at 23:46):

I definitely care about improving the ergonomics, so I'm open to stuff like this!

Martin Stewart (Jul 21 2022 at 06:29):

I'm not sure what you mean by one-time generators but yes this should simplify all generators

jan kili (Jul 21 2022 at 10:18):

In your example above, there is some conflation between seed & state (a state is initialized with a seed) and since both seeds & output values are U32 it's a little tricky to say for certain whether variables like x and y are intended as seeds or output values (the backpassing callback function input parameter name calls them seeds, but they're returned from point like output values).

jan kili (Jul 21 2022 at 10:19):

However, details aside, using backpassing with generators seems like it would prevent state chaining.

jan kili (Jul 21 2022 at 10:20):

This means that if your x and y domains were identical, they would generate identical values for each iteration (because using the same seed leads to the same sequence of values, without chaining).

jan kili (Jul 21 2022 at 10:21):

In general, I see backpassing as primarily for asynchronous processes, but in this case it's used to avoid... what?

jan kili (Jul 21 2022 at 10:22):

jan kili (Jul 21 2022 at 10:23):

Martin Stewart (Jul 21 2022 at 10:24):

This code really is passing along a new seed to each generator. So x and y will get different values event if they had the same range

point : Generator { x: U32, y: U32 }
point =
    x <- u32 0 9
    y <- u32 10 19
    finish { x, y }

run : { value : { x: U32, y: U32 }, newState: Seed }
run =
    point (getSeed 123)

It might be easier to see why this is possible if I replace the backpassing with the desugared version:

point : Generator { x: U32, y: U32 }
point =
    # The second u32 is passed in as a parameter to the first u32 so it's possible for it to be given the next seed
    u32 0 9 (\x -> u32 10 19 (\y -> finish { x, y }))

run : { value : { x: U32, y: U32 }, newState: Seed }
run =
    point (getSeed 123)

jan kili (Jul 21 2022 at 10:25):

Oh, wait, sorry, I just realized your example above is a from-scratch RNG implementation - I thought it was modifying my roc-random API.

Martin Stewart (Jul 21 2022 at 10:25):

jan kili (Jul 21 2022 at 10:27):

jan kili (Jul 21 2022 at 10:28):

Yes, that's it - okay. My seed chaining confusion came from the fact that I was assuming internal state and external return values were distinct.

jan kili (Jul 21 2022 at 10:30):

It's advantageous with PCG to have a transformation step during value return. By preventing direct exposure of the internal state, the randomness (against either just malicious actors or also stats, I can't remember) improves.

jan kili (Jul 21 2022 at 10:31):

Therefore, perhaps I'm simply expecting more complex parameters passed into the backpassing callback.

jan kili (Jul 21 2022 at 10:33):

I still don't see where the internal state (or any seed) is passed/chained between generators, though...

Martin Stewart (Jul 21 2022 at 10:38):

u32 : U32, U32, (U32 -> Generator a) -> Generator a
u32 = \low, high, func ->
    \(@Seed state) ->
        generator = func (state % (high - low) + low)
        generator
            # The random seed just increments between uses
            (@Seed (state + 1))

I can give a more detailed explanation later if you'd like. Busy at the moment unfortunately

jan kili (Jul 21 2022 at 10:40):

jan kili (Jul 21 2022 at 10:41):

I'm wondering how point works, because I don't see how its input Seed is passed to its u32 subgenerators.

jan kili (Jul 21 2022 at 10:42):

jan kili (Jul 21 2022 at 10:44):

finish = \value ->
    (\state -> { value: value, newState : state })
...
point =
    x <- u32 0 9
    y <- u32 10 19
    finish { x, y }

jan kili (Jul 21 2022 at 11:05):

Your desugared version above (thank you for that, very helpful for my backpassing noob brain) is
point = u32 0 9 (\x -> u32 10 19 (\y -> finish { x, y }))
but I'm expecting something more like
...

jan kili (Jul 21 2022 at 11:06):

GAH! While thinking what I wanted in the second code snippet above, I think I figured out how this backpassing magic works :sweat_smile:

jan kili (Jul 21 2022 at 11:06):

jan kili (Jul 21 2022 at 11:17):

Your backpassing system might be amazing :smiley: I'll have to dig in more later!

jan kili (Jul 21 2022 at 11:23):

If it smoothly (though subtly) eliminates the need for Random.next/.step inside custom generators, that's great!

jan kili (Jul 21 2022 at 11:26):

I'm looking forward to implementing/seeing this with Generations and I8 values.

jan kili (Jul 21 2022 at 11:27):

I wonder if renaming some variables would help me not need a 10-second mental loading spinner every time I re-read the backpassing lines :laughing:

jan kili (Jul 21 2022 at 16:03):

I'm curious to see whether this backpassing system works better for simpler use cases or more complex use cases! Right now I think it's good for chaining but not for one-shots... but I'm vascillating on that with my shifting mental pseudocode. I'll have to try some examples!

Brendan Hansknecht (Jul 21 2022 at 16:21):

Even with one off uses, it can often make something look more like linear code and help reduce indenting

Brendan Hansknecht (Jul 21 2022 at 16:22):

jan kili (Jul 21 2022 at 16:51):

When there's already callbacks involved yes, but this system is adding both callbacks and backpassing at the same time, so they're equally flat. This might still make it more readable, though.

jan kili (Jul 21 2022 at 16:53):

The readability gains would just be in hiding the state machine's internals, rather than decreasing indentation.

Brendan Hansknecht (Jul 21 2022 at 16:54):

Ah, yeah. Makes sense. Hiding state can definitely be really nice in Roc. Otherwise you get everyone's favorite variables state0, state1, state2, state3, ...

Brendan Hansknecht (Jul 21 2022 at 16:54):

jan kili (Jul 21 2022 at 16:55):

I made helper functions to pass state around externally, but x.value and Random.step are ugly tbh.

Brendan Hansknecht (Jul 21 2022 at 16:57):

Yeah, which is definitely an improvement and what I did for the second version of the wasm cpu emulator in Roc.

Brendan Hansknecht (Jul 21 2022 at 16:57):

jan kili (Jul 21 2022 at 17:04):

jan kili (Jul 21 2022 at 17:05):

The nested callback function call stack for each generator chain... Probably fine

Qqwy / Marten (Jul 21 2022 at 17:58):

Brendan Hansknecht (Jul 21 2022 at 18:49):

Oh, I more meant that no matter what it will cost more to compilation time. It means you have lambdas that have to be inlined and optimized. That is not the case for directly passing state around. So it will always have some form of higher cost. Should be relatively small though (if we do things right)

Brendan Hansknecht (Jul 21 2022 at 18:50):

jan kili (Jul 22 2022 at 04:55):

@Martin Stewart I've been playing around with refactoring roc-random to internalize its chained generator state management with backpassing instead of externalizing it, but I can't run it yet due to seemingly-unrelated compilation issues that I'll have to debug another time. In the meantime, this syntax looks really promising for cleanliness!

jan kili (Jul 22 2022 at 05:00):

I only have one remaining concern right now, and that's regarding chaining chains. For example, your point generator above - I believe it can't be chained with other point generators (as currently written) because the only state/seed it can return is the initial state/seed. Is that correct? It seems like the backpassing hiding state management (good) is also preventing that state from chaining with other contexts above/beyond the function context that calls those backpasses (bad), since that means multiple points requires multiple seedings.

Qqwy / Marten (Jul 22 2022 at 06:40):

In the example, finish extracts the final state (in the example: after two RNG 'hops'), not the initial state.

Qqwy / Marten (Jul 22 2022 at 06:41):

What could be added is a function Generator a -> (a -> Generator b) -> Generator b which could be called chain or andThen or something similar. This would allow chaining arbitrary generators together (like calling point multiple times on the same generator)

Martin Stewart (Jul 22 2022 at 13:11):

Yeah, as Marten said, the latest seed does get returned. But there will need to be an andThen or something in order to make a finished generator compose with other generators

Martin Stewart (Jul 22 2022 at 13:12):

Haven't had a chance to continue working on it but I should have time this weekend

jan kili (Jul 22 2022 at 13:15):

Ah, so the finish is just a final wrapping step. Thank you both for that clarification!

jan kili (Jul 22 2022 at 13:16):

Yes, it seems that helpers like Result.step and/or Result.next will still be necessary when chaining without backpassing.

jan kili (Jul 22 2022 at 13:16):

Richard Feldman (Jul 22 2022 at 13:27):

oh definitely - for example, sometimes it's useful to store the seed in application state and then come back to it later

Richard Feldman (Jul 22 2022 at 13:27):

Richard Feldman (Jul 22 2022 at 13:28):

because it would run the tests in parallel, but for the property-based tests, you could pass in a --seed value which would deterministically reproduce the results of a previous randomly-generated run

Richard Feldman (Jul 22 2022 at 13:28):

and the way it did that was by storing the seeds in the application state in between test runs

Richard Feldman (Jul 22 2022 at 13:29):

so it doesn't read as well when chaining several random operations together, but much nicer for storage!

jan kili (Jul 22 2022 at 14:12):

Is there a convention for how to support both patterns? Perhaps some sort of wrapper? Or is it just API x 2?

jan kili (Jul 22 2022 at 14:22):

I can probably figure something out by playing with callbacks and wrappers, but I don't see it just yet...

Qqwy / Marten (Jul 22 2022 at 14:59):

What do you mean?
Using the andThen/finish based API does not prevent you from using the manual API interchangeably in your program. (Nor vice-versa)

jan kili (Jul 22 2022 at 18:16):

diceRoll = (Random.int 1 6) myNoise
Stdout.line (Num.toStr diceRoll)

jan kili (Jul 22 2022 at 18:18):

Adding a callback parameter to Random.int seems to deprecate the above simplicity, no?

jan kili (Jul 22 2022 at 18:21):

(granted, the current API isn't quite that simple, as it includes a { value: diceRoll } destructuring of the wrapped state, but the above would be nice)

jan kili (Jul 22 2022 at 18:26):

I might be talking nonsense, though, as I still haven't finished the refactoring to see the changes in action!

jan kili (Jul 22 2022 at 18:26):

Martin Stewart (Jul 24 2022 at 21:21):

So my from the beginning the idea was to not need any Random.andThen and you'd only write things like

randomPoint =
    x <- Random.u32
    y <- Random.u32
    finish { x, y }

randomPerson =
    position <- randomPoint
    age <- Random.u16
    finish { position, age }

randomPeople =
    count <- Random.u8
    people <- Random.list count randomPerson
    finish people

foo =
    { value, newSeed } = Random.generate (Random.initialSeed 123) randomPeople
    value

It seems like this isn't possible to do though. As @JanCVanB pointed out randomPoint doesn't have the right type to be used the same way Random.u32 does for example. Maybe I'll call off that PR unless I come up with some better idea :sweat_smile:

Qqwy / Marten (Jul 24 2022 at 21:26):

I'm pretty sure there is a way to make it work, probably by just slightly altering the output type of finish.
I'm on vacation this week so I can only take a look afterwards though :zip_it:

Martin Stewart (Jul 24 2022 at 21:30):

jan kili (Jul 24 2022 at 21:30):

This backpassing stuff is a fantastic idea @Martin Stewart, thank you for proposing it! Yes, I intend for both primitive and custom generators to behave/chain identically, but what do you think, should that not be a requirement?

Qqwy / Marten (Jul 24 2022 at 21:31):

Haskell's Data.Random module allows this. But translating that to Roc might be a bit tricky unless you know Haskell well because Haskell code is quite terse (and you need to take extra care to translate stuff that might depend on Haskell's lazy-by-default nature to Roc, which is strict)

Qqwy / Marten (Jul 24 2022 at 21:31):

Martin Stewart (Jul 24 2022 at 21:32):

I agree that it would be ideal if primitive and custom generates behave the same way.

Qqwy / Marten (Jul 24 2022 at 21:32):

jan kili (Jul 24 2022 at 21:32):

I like the example above, how it chains lots of things together :))) very much what I'd like to see

Stream: contributing

Topic: Random number generator

Martin Stewart (Jul 20 2022 at 15:54):

Ayaz Hafiz (Jul 20 2022 at 15:57):

Brendan Hansknecht (Jul 20 2022 at 15:58):

Martin Stewart (Jul 20 2022 at 15:59):

jan kili (Jul 20 2022 at 18:12):

jan kili (Jul 20 2022 at 18:13):

jan kili (Jul 20 2022 at 18:14):

jan kili (Jul 20 2022 at 18:19):

jan kili (Jul 20 2022 at 18:20):

jan kili (Jul 20 2022 at 18:23):

Martin Stewart (Jul 20 2022 at 18:27):

Martin Stewart (Jul 20 2022 at 18:28):

Brendan Hansknecht (Jul 20 2022 at 18:30):

jan kili (Jul 20 2022 at 18:31):

Qqwy / Marten (Jul 20 2022 at 19:01):

Qqwy / Marten (Jul 20 2022 at 19:05):

Martin Stewart (Jul 20 2022 at 19:05):

Qqwy / Marten (Jul 20 2022 at 19:09):

Brendan Hansknecht (Jul 20 2022 at 19:09):

Martin Stewart (Jul 20 2022 at 19:10):

Qqwy / Marten (Jul 20 2022 at 19:10):

Martin Stewart (Jul 20 2022 at 19:12):

jan kili (Jul 20 2022 at 19:19):

Qqwy / Marten (Jul 20 2022 at 19:20):

Martin Stewart (Jul 20 2022 at 19:30):

Martin Stewart (Jul 20 2022 at 19:32):

Qqwy / Marten (Jul 20 2022 at 19:33):

jan kili (Jul 20 2022 at 20:36):

jan kili (Jul 20 2022 at 21:01):

jan kili (Jul 20 2022 at 23:26):

jan kili (Jul 20 2022 at 23:46):

Martin Stewart (Jul 21 2022 at 06:29):

jan kili (Jul 21 2022 at 10:18):

jan kili (Jul 21 2022 at 10:19):

jan kili (Jul 21 2022 at 10:20):

jan kili (Jul 21 2022 at 10:21):

jan kili (Jul 21 2022 at 10:22):

jan kili (Jul 21 2022 at 10:23):

Martin Stewart (Jul 21 2022 at 10:24):

jan kili (Jul 21 2022 at 10:25):

Martin Stewart (Jul 21 2022 at 10:25):

jan kili (Jul 21 2022 at 10:27):

jan kili (Jul 21 2022 at 10:28):

jan kili (Jul 21 2022 at 10:30):

jan kili (Jul 21 2022 at 10:31):

jan kili (Jul 21 2022 at 10:33):

Martin Stewart (Jul 21 2022 at 10:38):

jan kili (Jul 21 2022 at 10:40):

jan kili (Jul 21 2022 at 10:41):

jan kili (Jul 21 2022 at 10:42):

jan kili (Jul 21 2022 at 10:44):

jan kili (Jul 21 2022 at 11:05):

jan kili (Jul 21 2022 at 11:06):

jan kili (Jul 21 2022 at 11:06):

jan kili (Jul 21 2022 at 11:17):

jan kili (Jul 21 2022 at 11:23):

jan kili (Jul 21 2022 at 11:26):

jan kili (Jul 21 2022 at 11:27):

jan kili (Jul 21 2022 at 16:03):

Brendan Hansknecht (Jul 21 2022 at 16:21):

Brendan Hansknecht (Jul 21 2022 at 16:22):

jan kili (Jul 21 2022 at 16:51):

jan kili (Jul 21 2022 at 16:53):

Brendan Hansknecht (Jul 21 2022 at 16:54):

Brendan Hansknecht (Jul 21 2022 at 16:54):

jan kili (Jul 21 2022 at 16:55):

Brendan Hansknecht (Jul 21 2022 at 16:57):

Brendan Hansknecht (Jul 21 2022 at 16:57):

Brendan Hansknecht (Jul 21 2022 at 16:57):

jan kili (Jul 21 2022 at 17:04):

jan kili (Jul 21 2022 at 17:05):

Qqwy / Marten (Jul 21 2022 at 17:58):

Brendan Hansknecht (Jul 21 2022 at 18:49):

Brendan Hansknecht (Jul 21 2022 at 18:50):

jan kili (Jul 22 2022 at 04:55):

jan kili (Jul 22 2022 at 05:00):

Qqwy / Marten (Jul 22 2022 at 06:40):

Qqwy / Marten (Jul 22 2022 at 06:41):