Stream: beginners

Topic: Seeking feedback / code review on idiomatic roc usage


view this post on Zulip Zellyn Hunter (Dec 17 2023 at 04:16):

Hi folks, I'm gradually working my way up to being able to do advent of code problems, and one library that I find constantly useful in Go is "charmap": a map of (x,y) to character. I decided to start implementing it in Roc, and I'd love to get feedback on what could be clearer, more concise, more idiomatic, etc.

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 04:16):

Here's what I have so far:

interface Charmap
    exposes [
      parse,
      minmax,
      toString
    ]
    imports [
    ]

Point : (I64, I64)

parse : Str -> Dict Point Str
parse = \input ->
    input
    |> Str.trim
    |> Str.split "\n"
    |> List.map Str.graphemes
    |> List.mapWithIndex \list, y ->
        list |> List.mapWithIndex \char, x ->
            ((Num.toI64 x, Num.toI64 y), char)
    |> List.join
    |> Dict.fromList

pointsMap : Point, Point, (I64, I64 -> I64) -> Point
pointsMap = \(x1, y1), (x2, y2), f ->
    (f x1 x2, f y1 y2)

minmax : Dict Point Str -> Result ( Point, Point) [EmptyMap]
minmax = \dict ->
  when Dict.keys dict is
    [] -> Err EmptyMap
    [head, .. as tail] ->
        List.walk tail ( head, head ) \(min, max), point ->
            ( pointsMap min point Num.min, pointsMap max point Num.max )
        |> Ok

 toString : Dict Point Str, Str -> Str
 toString = \dict, default ->
    when minmax dict is
        Err EmptyMap -> ""
        Ok ((minx, miny), (maxx, maxy)) ->
            List.range { start: At miny, end: At maxy}
            |> List.map \y ->
                List.range { start: At minx, end: At maxx}
                |> List.map \x ->
                    Dict.get dict (x,y) |> Result.withDefault default
                |> Str.joinWith ""
            |> Str.joinWith "\n"

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 05:19):

For character, you may want to use a U8 or U32 depending on the goal. I know we have a way to convert a string to code points. Unicode codepoints are U32s.

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 05:19):

U8 if you only care about ASCII

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 05:19):

It will be a lot more performant that using a string assuming these truly are single character

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 05:20):

That said, character vs codepoints vs graphemes still confuses me. Maybe graphemes is the best if you want to support all forms of emoji and such.

view this post on Zulip Richard Feldman (Dec 17 2023 at 12:51):

yeah I've banned the word "character" from all of Roc's documentation because it is such a huge source of confusion and Unicode bugs :sweat_smile:

view this post on Zulip Richard Feldman (Dec 17 2023 at 12:52):

I'm not sure what Go charmaps are used for though!

view this post on Zulip Richard Feldman (Dec 17 2023 at 12:53):

in general I'd always default to Str

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 16:57):

Yeah. I considered Str.toScalars, mainly because I felt slightly squeamish about dict entries and the default param being able to be any other length than one grapheme (make impossible states impossible in your data representation and all that), but dbg output is a lot harder to understand when all you see is ascii codes.

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 17:00):

"charmaps" aren't really a thing in Go. I just use my "charmap" library constantly in adventofcode. It's defined like so in charmap.go:

// M is a map of geom.Vec2 to rune.
type M map[geom.Vec2]rune

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 17:02):

I appreciate the feedback on always defaulting to Str. Any other code comments? Is stringing the whole function into one expression normal, or would it be more common to create variables holding intermediate results? (That's the way I had it at first, while fighting my ignorance of Roc, naturally.)

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 17:04):

Also, I kinda used |> or not based on some combination of whether it felt cleaner, avoided parenthesized function calls as args, matched surrounding code, or was the only thing I could think of. And I didn't use <- because while I technically understand exactly what it does, I haven't quite internalized its "gestalt" / "feels" / whatever you'd call it.

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 17:06):

One last question: I toyed around with Result.mapErr and Result.map to sort of thread the logic through a linear, non-branching sequence of steps, but it _seemed_ like pattern matching on Ok/Err came out clearer. Does my usage there seem idiomatic?

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 17:45):

The code reads quite well overall. I don't think backpassing is really warranted in these cases. Quite clean with all the pipelines and explicit when ... is

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 17:48):

I would maybe:

  1. Make an explicit charmap type
  2. Add the min and max to the type so you don't have to do the expensive call to scan the entire dictionary (though this depends on how often you need them)

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 17:54):

Also, a note:

Due to a bug, this type will likely be very slow currently. The strings in the dictionary will get refcounted a lot as it gets passed around. Hopefully I will fix this soon, but just a warning.

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 18:50):

Since this is for advent of code, calculating the min and max by iteration is almost always fine.

view this post on Zulip Zellyn Hunter (Dec 17 2023 at 18:51):

The string refcount thing would be an argument for using scalars…

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 18:58):

Yeah, if you hit perf issues, switching to scalar would probably fix it.

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 19:03):

I'm currently traveling and not sure how much I will get done over the holidays but fixing this refcounting bug is my next priority, just will be a lot of pipelining. So hopefully by mid January at the latest, the choice here shouldn't affect perf.

view this post on Zulip Zellyn Hunter (Dec 18 2023 at 03:40):

Interesting. For strings of such small lengths (1 character!) it would be more efficient to do something akin to "intern" in other languages.

view this post on Zulip Zellyn Hunter (Dec 18 2023 at 03:41):

Or invent a separate representation for strings that are short enough to fit in a 64-bit value in a register or on the stack, although that seems a bit painful.

view this post on Zulip Luke Boswell (Dec 18 2023 at 03:42):

For Advent of Code I use Str.toUtf8 and represent "characters" as U8 as I know they are all ASCII and that is very efficient and fast.

view this post on Zulip Zellyn Hunter (Dec 18 2023 at 03:43):

Since strings are immutable, and there aren't any could-be-faster-if-you-modify-in-place methods on Str (well, depending on how you treat generating substrings, and if you count replacing with exactly-the-same-length replacements), I guess the refcount for strings is solely for GC, not for Roc's modify-in-place optimization.

view this post on Zulip Zellyn Hunter (Dec 18 2023 at 03:44):

Is there a way to create a new type that is an alias for U8, but has a dbg representation of an ascii character?

view this post on Zulip Brendan Hansknecht (Dec 18 2023 at 04:04):

Strings get mutated in place optimizations if you change them (like replace text). They also have a small string optimization to avoid allocations. Also, substrings end up being slices, so they avoid any allocations

view this post on Zulip Brendan Hansknecht (Dec 18 2023 at 04:07):

You can make a custom dbg impl if a type is opaque, but that may be inconvenient for the List U8 that is kinda a string use cases.

Also, if you were to switch your charmap to be a Dict Point (List U8), it would hit the same recursive refcount perf bug. But if you switch to Dict Point U8 it would probably work fine for everything Advent of code. Might be worth making charmap an opaque type with a custom inspect impl such that it prints as if it were a string.


Last updated: Jul 06 2025 at 12:14 UTC