Seeking feedback / code review on idiomatic roc usage · beginners

Hi folks, I'm gradually working my way up to being able to do advent of code problems, and one library that I find constantly useful in Go is "charmap": a map of (x,y) to character. I decided to start implementing it in Roc, and I'd love to get feedback on what could be clearer, more concise, more idiomatic, etc.

Zellyn Hunter (Dec 17 2023 at 04:16):

interface Charmap
    exposes [
      parse,
      minmax,
      toString
    ]
    imports [
    ]

Point : (I64, I64)

parse : Str -> Dict Point Str
parse = \input ->
    input
    |> Str.trim
    |> Str.split "\n"
    |> List.map Str.graphemes
    |> List.mapWithIndex \list, y ->
        list |> List.mapWithIndex \char, x ->
            ((Num.toI64 x, Num.toI64 y), char)
    |> List.join
    |> Dict.fromList

pointsMap : Point, Point, (I64, I64 -> I64) -> Point
pointsMap = \(x1, y1), (x2, y2), f ->
    (f x1 x2, f y1 y2)

minmax : Dict Point Str -> Result ( Point, Point) [EmptyMap]
minmax = \dict ->
  when Dict.keys dict is
    [] -> Err EmptyMap
    [head, .. as tail] ->
        List.walk tail ( head, head ) \(min, max), point ->
            ( pointsMap min point Num.min, pointsMap max point Num.max )
        |> Ok

 toString : Dict Point Str, Str -> Str
 toString = \dict, default ->
    when minmax dict is
        Err EmptyMap -> ""
        Ok ((minx, miny), (maxx, maxy)) ->
            List.range { start: At miny, end: At maxy}
            |> List.map \y ->
                List.range { start: At minx, end: At maxx}
                |> List.map \x ->
                    Dict.get dict (x,y) |> Result.withDefault default
                |> Str.joinWith ""
            |> Str.joinWith "\n"

Brendan Hansknecht (Dec 17 2023 at 05:19):

For character, you may want to use a U8 or U32 depending on the goal. I know we have a way to convert a string to code points. Unicode codepoints are U32s.

Brendan Hansknecht (Dec 17 2023 at 05:19):

It will be a lot more performant that using a string assuming these truly are single character

Brendan Hansknecht (Dec 17 2023 at 05:20):

That said, character vs codepoints vs graphemes still confuses me. Maybe graphemes is the best if you want to support all forms of emoji and such.

Richard Feldman (Dec 17 2023 at 12:51):

yeah I've banned the word "character" from all of Roc's documentation because it is such a huge source of confusion and Unicode bugs :sweat_smile:

Richard Feldman (Dec 17 2023 at 12:52):

Richard Feldman (Dec 17 2023 at 12:53):

Zellyn Hunter (Dec 17 2023 at 16:57):

Yeah. I considered Str.toScalars, mainly because I felt slightly squeamish about dict entries and the default param being able to be any other length than one grapheme (make impossible states impossible in your data representation and all that), but dbg output is a lot harder to understand when all you see is ascii codes.

Zellyn Hunter (Dec 17 2023 at 17:00):

"charmaps" aren't really a thing in Go. I just use my "charmap" library constantly in adventofcode. It's defined like so in charmap.go:

// M is a map of geom.Vec2 to rune.
type M map[geom.Vec2]rune

Zellyn Hunter (Dec 17 2023 at 17:02):

I appreciate the feedback on always defaulting to Str. Any other code comments? Is stringing the whole function into one expression normal, or would it be more common to create variables holding intermediate results? (That's the way I had it at first, while fighting my ignorance of Roc, naturally.)

Zellyn Hunter (Dec 17 2023 at 17:04):

Also, I kinda used |> or not based on some combination of whether it felt cleaner, avoided parenthesized function calls as args, matched surrounding code, or was the only thing I could think of. And I didn't use <- because while I technically understand exactly what it does, I haven't quite internalized its "gestalt" / "feels" / whatever you'd call it.

Zellyn Hunter (Dec 17 2023 at 17:06):

One last question: I toyed around with Result.mapErr and Result.map to sort of thread the logic through a linear, non-branching sequence of steps, but it _seemed_ like pattern matching on Ok/Err came out clearer. Does my usage there seem idiomatic?

Brendan Hansknecht (Dec 17 2023 at 17:45):

The code reads quite well overall. I don't think backpassing is really warranted in these cases. Quite clean with all the pipelines and explicit when ... is

Brendan Hansknecht (Dec 17 2023 at 17:48):

Brendan Hansknecht (Dec 17 2023 at 17:54):

Due to a bug, this type will likely be very slow currently. The strings in the dictionary will get refcounted a lot as it gets passed around. Hopefully I will fix this soon, but just a warning.

Zellyn Hunter (Dec 17 2023 at 18:50):

Since this is for advent of code, calculating the min and max by iteration is almost always fine.

Zellyn Hunter (Dec 17 2023 at 18:51):

Brendan Hansknecht (Dec 17 2023 at 18:58):

Brendan Hansknecht (Dec 17 2023 at 19:03):

I'm currently traveling and not sure how much I will get done over the holidays but fixing this refcounting bug is my next priority, just will be a lot of pipelining. So hopefully by mid January at the latest, the choice here shouldn't affect perf.

Zellyn Hunter (Dec 18 2023 at 03:40):

Interesting. For strings of such small lengths (1 character!) it would be more efficient to do something akin to "intern" in other languages.

Zellyn Hunter (Dec 18 2023 at 03:41):

Or invent a separate representation for strings that are short enough to fit in a 64-bit value in a register or on the stack, although that seems a bit painful.

Luke Boswell (Dec 18 2023 at 03:42):

For Advent of Code I use Str.toUtf8 and represent "characters" as U8 as I know they are all ASCII and that is very efficient and fast.

Zellyn Hunter (Dec 18 2023 at 03:43):

Since strings are immutable, and there aren't any could-be-faster-if-you-modify-in-place methods on Str (well, depending on how you treat generating substrings, and if you count replacing with exactly-the-same-length replacements), I guess the refcount for strings is solely for GC, not for Roc's modify-in-place optimization.

Zellyn Hunter (Dec 18 2023 at 03:44):

Is there a way to create a new type that is an alias for U8, but has a dbg representation of an ascii character?

Brendan Hansknecht (Dec 18 2023 at 04:04):

Strings get mutated in place optimizations if you change them (like replace text). They also have a small string optimization to avoid allocations. Also, substrings end up being slices, so they avoid any allocations

Brendan Hansknecht (Dec 18 2023 at 04:07):

You can make a custom dbg impl if a type is opaque, but that may be inconvenient for the List U8 that is kinda a string use cases.

Also, if you were to switch your charmap to be a Dict Point (List U8), it would hit the same recursive refcount perf bug. But if you switch to Dict Point U8 it would probably work fine for everything Advent of code. Might be worth making charmap an opaque type with a custom inspect impl such that it prints as if it were a string.

Stream: beginners

Topic: Seeking feedback / code review on idiomatic roc usage

Zellyn Hunter (Dec 17 2023 at 04:16):

Zellyn Hunter (Dec 17 2023 at 04:16):

Brendan Hansknecht (Dec 17 2023 at 05:19):

Brendan Hansknecht (Dec 17 2023 at 05:19):

Brendan Hansknecht (Dec 17 2023 at 05:19):

Brendan Hansknecht (Dec 17 2023 at 05:20):

Richard Feldman (Dec 17 2023 at 12:51):

Richard Feldman (Dec 17 2023 at 12:52):

Richard Feldman (Dec 17 2023 at 12:53):

Zellyn Hunter (Dec 17 2023 at 16:57):

Zellyn Hunter (Dec 17 2023 at 17:00):

Zellyn Hunter (Dec 17 2023 at 17:02):

Zellyn Hunter (Dec 17 2023 at 17:04):

Zellyn Hunter (Dec 17 2023 at 17:06):

Brendan Hansknecht (Dec 17 2023 at 17:45):

Brendan Hansknecht (Dec 17 2023 at 17:48):

Brendan Hansknecht (Dec 17 2023 at 17:54):

Zellyn Hunter (Dec 17 2023 at 18:50):

Zellyn Hunter (Dec 17 2023 at 18:51):

Brendan Hansknecht (Dec 17 2023 at 18:58):

Brendan Hansknecht (Dec 17 2023 at 19:03):

Zellyn Hunter (Dec 18 2023 at 03:40):

Zellyn Hunter (Dec 18 2023 at 03:41):

Luke Boswell (Dec 18 2023 at 03:42):

Zellyn Hunter (Dec 18 2023 at 03:43):

Zellyn Hunter (Dec 18 2023 at 03:44):

Brendan Hansknecht (Dec 18 2023 at 04:04):

Brendan Hansknecht (Dec 18 2023 at 04:07):