I’m working on a sudoku solver in Roc. I started by porting over some Elm code for an interactive puzzle solving aid, but since I’m using basic-cli, I’m ditching the interactive bits and just building a (hopefully) fast solver.
Since the constraints are different, I’m trying to simplify where I can.
Here’s my first question:
I have an opaque type Number := U8
to represent a value from 1-9, inclusive. The idea here is to prevent me from mistakenly ending up in a situation where a cell contains a nonsensical value. However, I’m not convinced this abstraction is worth the complexity. I am parsing a puzzle from a string, then manipulating it with a solving algorithm. I’m not accepting any user input in the process. Sure, my Number.fromStr
function is a nice helper for parsing, but definitely isn’t necessary.
What would you do?
Here's the code, for reference: https://github.com/sandprickle/roc-sudoku-solver/blob/main/Sudoku/Number.roc
I somewhat reflexively question any abstraction I create, even one as simple as this. Not sure how productive it is.
Could also use a tag of 9 explicitly labelled values, which is arguably better than a U8.
Ah, that would also prevent errors within the Number module.
What would the memory footprint be of such a tag union?
That said, I would just use a U8 personally, but I think it would be easy to make a zero cost abstraction here.
You can still do your opaque number wrapper. Just don't defensively verify. Make your verification only be through fromNumber (don't have link rn, but parse don't validate) and only exposing Number.one through Number.nine otherwise. This way you limit where checks appear and ensure you are never accidentally generate a bad number.
Tag of up to 256 elements that doesn't wrap any datat is the same size as a U8
So I think both would be equally valid and performant
Also, the opaque wrapper will compile away so also should be as performant as just using the underlying U8
Plus since we have abilities now, Number can have Eq, Inspect, Hash, if you need them. Probably can just use the autoderive impl.
Yeah, I figured I wasn't incurring a runtime performance penalty here. Was just wondering whether the opaque wrapper was a good idea or not.
Right now the only verification I'm doing are in Number.fromInt
, Number.fromIntNormalize
, and Number.fromStr
, which was my attempt at using the parse don't validate approach.
Yes, I am autoderiving Eq and Hash
Glad memory footprint of Tags came up. I was wondering whether I could make any improvements to my representation of a cell:
Cell : [ Fixed Number, Empty (Set Number)]
So this would use 1 U8
plus the size of the Set
per cell?
Instead of fromInt
I bet most of the time, you can just expose a constant number and a list of all numbers and build from that, to remove a tiny bit of extra checks, but I wouldn't worry about perf too much right away. That last detail can be dug into when profiling.
Yeah, your size is correct 25bytes. Then cause padding is cruel, 32byte. If I'm correct
Ah, so getting rid of the tag wouldn't actually save me anything because of padding?
Getting rid of the tag would make it 24bytes.
And it wouldn't pad.
Oh, so that would actually be an improvement then.
I don't think I need to differentiate between a fixed value and a cell with only one possible value, so that shouldn't be difficult.
Yes
And could definitely be improved more if you are willing to write a BitSet abstraction on top of a U16.
That also probably would be a lot faster due to removing hashing and a memory lookup
Depending on how advance my solving algorithm gets, I may need to encode whether the cell was a hint given at the beginning or not. But there are plenty of extra bits where I could hide that information.
I was definitely wondering whether a Set was a good choice here. Logically it's what i want, but I had a hunch a List might be better.
But a List still allocates on the heap, and another question was going to be how I might avoid those heap allocations.
Two options.
Writing a bit set seems fine to me
I kinda want to write the same algorithm in Zig and benchmark both implementations. Not using a BitSet would put Roc at a disadvantage because Zig packed structs make that trivial
Hmm, a BitSet should also allow me to bypass a lot of Result
s
yep
I’ll plan on implementing it a few different ways and sharing the benchmarks for each.
One version I’d like to implement is a version where the 81-element list representing the Grid is swapped for some stack structure. Except that it seems rather unwieldy to do so. I need to be able to access arbitrary cells. Is this even possible with a tuple?
Maybe the better approach to avoiding that heap allocation cost would be to use a platform that does arena allocation
Though I guess it’s probably only one allocation anyway so maybe not much of an issue
.0
, .1.
, ...., .80
Then make a wrapper function that maps from index to those fucntions if you need to use an arbitrary nat index....it isn't great......
Ok I wasn’t sure whether I could do that or not.
Definitely not ergonomic. I’ll have to see how much performance it gains me.
Yeah......
I want tuples to work like arrays on the stack with more features like this
Or just to have a real stack allocated array type
You may find the SIMD discussion and design proposal interesting background on Tuples
Any plans for a List.walkUntilWithIndex
, or tips on how I could implement something like this myself?
They are written here in roc: https://github.com/roc-lang/roc/blob/f795d0856a8a68747aa7bc975827bd02ff8d4610/crates/compiler/builtins/roc/List.roc#L494-L579
Would be a great first issue to add it.
Ah, ok. I see there are some helper functions in that file I could use to implement it.
Shall I go ahead and submit an issue for it?
Also, want to check to see whether I'm looking for the right function here.
My sudoku grid is represented currently as List [Fixed Number, Empty (List Number)]
When cell does not have a number filled in, I instead store a list of possible values for that cell. The first step in solving the puzzle is to prune extraneous values from this list. After that, I will use a backtracking algorithm to visit each empty cell in turn, try setting the value to one of the possible values, and check whether that produces a valid solution. If so, I return it. If not, I try the next possible value. If none of those work, I move on to the next empty cell.
Hmm, List.walkWithIndex
would probably be sufficient if I actually want to check for multiple solutions. Which I should probably do.
I'm still open to adding walkUntilWithIndex
if we want that though.
Actually, I still want to be able to bail early when checking for multiple solutions. If I find a second solution halfway through, I want to be able to return an error immediately.
:duck:
If you want to work on it, just feel free to add and make a PR. If you don't filing an issue would be great.
Last updated: Jul 06 2025 at 12:14 UTC