Hi folks, I'm gradually working my way up to being able to do advent of code problems, and one library that I find constantly useful in Go is "charmap": a map of (x,y) to character. I decided to start implementing it in Roc, and I'd love to get feedback on what could be clearer, more concise, more idiomatic, etc.
Here's what I have so far:
interface Charmap
exposes [
parse,
minmax,
toString
]
imports [
]
Point : (I64, I64)
parse : Str -> Dict Point Str
parse = \input ->
input
|> Str.trim
|> Str.split "\n"
|> List.map Str.graphemes
|> List.mapWithIndex \list, y ->
list |> List.mapWithIndex \char, x ->
((Num.toI64 x, Num.toI64 y), char)
|> List.join
|> Dict.fromList
pointsMap : Point, Point, (I64, I64 -> I64) -> Point
pointsMap = \(x1, y1), (x2, y2), f ->
(f x1 x2, f y1 y2)
minmax : Dict Point Str -> Result ( Point, Point) [EmptyMap]
minmax = \dict ->
when Dict.keys dict is
[] -> Err EmptyMap
[head, .. as tail] ->
List.walk tail ( head, head ) \(min, max), point ->
( pointsMap min point Num.min, pointsMap max point Num.max )
|> Ok
toString : Dict Point Str, Str -> Str
toString = \dict, default ->
when minmax dict is
Err EmptyMap -> ""
Ok ((minx, miny), (maxx, maxy)) ->
List.range { start: At miny, end: At maxy}
|> List.map \y ->
List.range { start: At minx, end: At maxx}
|> List.map \x ->
Dict.get dict (x,y) |> Result.withDefault default
|> Str.joinWith ""
|> Str.joinWith "\n"
For character, you may want to use a U8 or U32 depending on the goal. I know we have a way to convert a string to code points. Unicode codepoints are U32s.
U8 if you only care about ASCII
It will be a lot more performant that using a string assuming these truly are single character
That said, character vs codepoints vs graphemes still confuses me. Maybe graphemes is the best if you want to support all forms of emoji and such.
yeah I've banned the word "character" from all of Roc's documentation because it is such a huge source of confusion and Unicode bugs :sweat_smile:
I'm not sure what Go charmaps are used for though!
in general I'd always default to Str
Yeah. I considered Str.toScalars
, mainly because I felt slightly squeamish about dict entries and the default
param being able to be any other length than one grapheme (make impossible states impossible in your data representation and all that), but dbg
output is a lot harder to understand when all you see is ascii codes.
"charmaps" aren't really a thing in Go. I just use my "charmap" library constantly in adventofcode. It's defined like so in charmap.go:
// M is a map of geom.Vec2 to rune.
type M map[geom.Vec2]rune
I appreciate the feedback on always defaulting to Str
. Any other code comments? Is stringing the whole function into one expression normal, or would it be more common to create variables holding intermediate results? (That's the way I had it at first, while fighting my ignorance of Roc, naturally.)
Also, I kinda used |>
or not based on some combination of whether it felt cleaner, avoided parenthesized function calls as args, matched surrounding code, or was the only thing I could think of. And I didn't use <-
because while I technically understand exactly what it does, I haven't quite internalized its "gestalt" / "feels" / whatever you'd call it.
One last question: I toyed around with Result.mapErr
and Result.map
to sort of thread the logic through a linear, non-branching sequence of steps, but it _seemed_ like pattern matching on Ok/Err came out clearer. Does my usage there seem idiomatic?
The code reads quite well overall. I don't think backpassing is really warranted in these cases. Quite clean with all the pipelines and explicit when ... is
I would maybe:
Also, a note:
Due to a bug, this type will likely be very slow currently. The strings in the dictionary will get refcounted a lot as it gets passed around. Hopefully I will fix this soon, but just a warning.
Since this is for advent of code, calculating the min and max by iteration is almost always fine.
The string refcount thing would be an argument for using scalars…
Yeah, if you hit perf issues, switching to scalar would probably fix it.
I'm currently traveling and not sure how much I will get done over the holidays but fixing this refcounting bug is my next priority, just will be a lot of pipelining. So hopefully by mid January at the latest, the choice here shouldn't affect perf.
Interesting. For strings of such small lengths (1 character!) it would be more efficient to do something akin to "intern" in other languages.
Or invent a separate representation for strings that are short enough to fit in a 64-bit value in a register or on the stack, although that seems a bit painful.
For Advent of Code I use Str.toUtf8
and represent "characters" as U8 as I know they are all ASCII and that is very efficient and fast.
Since strings are immutable, and there aren't any could-be-faster-if-you-modify-in-place methods on Str
(well, depending on how you treat generating substrings, and if you count replacing with exactly-the-same-length replacements), I guess the refcount for strings is solely for GC, not for Roc's modify-in-place optimization.
Is there a way to create a new type that is an alias for U8, but has a dbg
representation of an ascii character?
Strings get mutated in place optimizations if you change them (like replace text). They also have a small string optimization to avoid allocations. Also, substrings end up being slices, so they avoid any allocations
You can make a custom dbg
impl if a type is opaque, but that may be inconvenient for the List U8
that is kinda a string use cases.
Also, if you were to switch your charmap to be a Dict Point (List U8)
, it would hit the same recursive refcount perf bug. But if you switch to Dict Point U8
it would probably work fine for everything Advent of code. Might be worth making charmap an opaque type with a custom inspect impl such that it prints as if it were a string.
Last updated: Jul 06 2025 at 12:14 UTC