I created the platform, that I will use for AoC this year:
https://github.com/ostcar/roc-aoc-platform
The main function has this signature: [Part1, Part2] -> List U8
.
The platform statically allocates memory on startup. While Roc is running, there are no allocations. As default, it allocates one GiB of memory, but other sizes can be used via the argument --memory
. The platform does not deallocate any memory. Last year, I activated deallocations on day 17, since this was necessary for my solution. It is possible, that I will implement it as well this year. But it will be optional.
The output of the platform looks like this:
Part1 in 24.141ms, used 64167756 bytes:
2377
Part2 in 20.347ms, used 64167757 bytes:
71220
I think that the platform is a good example for a zig-platform, since it has a working build.zig
file, that creates all files necessary for the surgical linker and the legacy linker for linux and mac (I don't have a mac, so I could only test it with linux, but I should work). The platform does not use C-malloc
and fiends, but a zig allocator (but without deallocate at the moment). As soon as expect
works like dbg
and does not need shm_open
, mmap
and getppid
anymore, it should be possible to use the platform without libc.
Perfect timing! I was just talking to a friend yesterday who's considering doing AoC in Roc this year, just sent them the link to this.
Also excited to take a look at how you did the Zig bits. I'm working on a zig platform on Linux and never got the surgical linker working. Also wanted to take a look at how I might run allocation through a Zig allocator but never felt sure enough of what I was doing to try it out, so will definitly check out how you did it!
I didn't get the surgical linking to work unfortunately, keep running into an error. I think it's a better error then the one I had before though: previously when using the surgical linker I got a crash when zig host code allocated memory. Now I get a nice error linking to a Roc issue about surgical linking not support absolute relocations. Either way I can use the legacy linker to work around it for now.
I'm liking working with the Zig build system! Wondering if we should set up a Zig package for Roc platform development. It could include roc.Build
namespace with Roc-specific build steps, a roc.std
for the roc standard library integrations, maybe some comptime helpers for calling host-exposed functions, and a roc.mem
with your allocator-for-Roc code.
Wondering if we should set up a Zig package for Roc platform development
https://github.com/lukewilliamboswell/roc-platform-template-zig
I've not given it much love recently... but this was the plan :smiley:
Or do you mean a roc_std
like, zig package?
I was aware of your zig platform template, it was a great help getting started with my platform work!
Yeah, I meant a package, as in something you could add to the build.zig.zon
file of a zig platform to get all the batteries for roc platform development.
This is one of the main reasons I'd like to land the LLVM / Zig 13 upgrade. :smiley:
AFAIK you can't use a zig package with 0.11.0
I'm hoping we can land this rebuild host PR soon (like very soon), and then I can merge the latest into that branch and it should be pretty close.
It's something we kind of need in the repository for that upgrade. My current workaround, is to copy the builtins into each of the test platforms - once at the start of running the tests. I experimented with a package, but that was deviating a bit from the builtins.
I'm starting to think it might be a good idea to make the builtins depend on a roc_std
zig package that lives in the repo, and the test platform also use that, and then also anyone else who is making a zig platform.
@Brendan Hansknecht does it sound like a terrible idea to have the builtins and zig test platform use a common zig package for the primitives? Could we separate out all the extra builtin stuff easily enough?
I'm super excited about this!
One question: could it lead to versioning issues if zig platforms use one version of the standard library and the applications people write another?
It should be fine, as long as the platform uses the correct C-ABI etc
I think it would be great for there to be a zig package for all the builtin types.
Fully in the zig ecosystem but tied to a roc version for compatibility reasons
Jasper Woudenberg said:
I'm super excited about this!
One question: could it lead to versioning issues if zig platforms use one version of the standard library and the applications people write another?
Yes it could. Your platform version is tied to a range of roc versions. Outside of those versions, it would interact incorrectly and break
I implemented deallocations. The current default is not to deallocate. Without the argument --deallocate
, roc_dealloc
is a noop.
I tested the platform on some of my AoC2023 solutions. I am surprised, that there is no real performance boost if deallocations are skipped. In many cases, the solution was even a bit faster with deallocations.
Could it be that the os needs to page in less memory when you are using a smaller part of the backing buffer? Like, when you deallocate and then allocate again, you can reuse already paged-in memory whereas when you don't deallocate each allocation might have to page in new memory
Yeah, it is likely exactly that...or that plus allocator overhead.
Cause allocators may start doing smart things but fall back to slower paths if nothing is being freed
Also, you still have the full cost of recounting in roc. The recounting will cost much more than the dealloc generally
If you preallocated an arena and paged in roughly the memory needed, you should see some gains
Brendan Hansknecht said:
If you preallocated an arena and paged in roughly the memory needed, you should see some gains
Thats what I am doing with or without the --deallocate
. I am using the page_allocator to get 1GiB of memory and use the FixedBufferAllocator to manage it: https://github.com/ostcar/roc-aoc-platform/blob/015c80f03680fa708c119db8371b2351649ce8a7/host/host.zig#L31-L33
So I would think, that the os creates all the pages anyway. Or is it smart enough only to give the memory, when it is actually used?
I don't think that the roc refcounting could change anything, since roc does not know that roc_dealloc is a noop.
I would have guessed, that it could be CPU cache locality. Less memory could mean less cache misses.
The difference was not so big. Maybe with a better benchmark I would get other numbers. I am currently only measuring the time that roc__solutionForHost_1_exposed_generic
needs.
Oskar Hahn said:
Brendan Hansknecht said:
If you preallocated an arena and paged in roughly the memory needed, you should see some gains
Thats what I am doing with or without the
--deallocate
. I am using the page_allocator to get 1GiB of memory and use the FixedBufferAllocator to manage it: https://github.com/ostcar/roc-aoc-platform/blob/015c80f03680fa708c119db8371b2351649ce8a7/host/host.zig#L31-L33So I would think, that the os creates all the pages anyway. Or is it smart enough only to give the memory, when it is actually used?
Yeah, I'm pretty sure that will lazily load pages on page fault
Which, to be fair, is generally the wanted behaviour
Oskar Hahn said:
I don't think that the roc refcounting could change anything, since roc does not know that roc_dealloc is a noop.
We technically could give platforms control over this setting. The compiler currently has an internal config to change recounting to a no-op. That said, with recounting as a no-op, it also breaks uniqueness analysis...so that may actually be even worse for perf.
Brendan Hansknecht said:
Oskar Hahn said:
Brendan Hansknecht said:
If you preallocated an arena and paged in roughly the memory needed, you should see some gains
Thats what I am doing with or without the
--deallocate
. I am using the page_allocator to get 1GiB of memory and use the FixedBufferAllocator to manage it: https://github.com/ostcar/roc-aoc-platform/blob/015c80f03680fa708c119db8371b2351649ce8a7/host/host.zig#L31-L33So I would think, that the os creates all the pages anyway. Or is it smart enough only to give the memory, when it is actually used?
Yeah, I'm pretty sure that will lazily load pages on page fault
The difference is really small, so I could be mistaken. But it seems, that when I set the hole memory buffer with @memset(buffer, 'x');
, then without deallocations is either faster or the same.
Brendan Hansknecht said:
If you preallocated an arena and paged in roughly the memory needed, you should see some gains
You are right. I tested the difference between preallocated memory and using the zig GeneralPurposeAllocator:
GPA with deallocation:
2.692s
GPA without deallocation:
440.34ms
Preallocated with deallocation:
18.369ms
Preallocated without deallocation:
13.631ms
The difference is remarkable.
Wow, deallocation costs way more than I expected. I wonder how much of that is simply that zig's GPA is pretty bad currently vs would happen with other allocators.
Whoa thats surprising!
@Oskar Hahn are you able to test with something like: https://github.com/joadnacer/jdz_allocator
Or I guess with malloc
and free
?
Both should be more robust allocators than the gpa.
I did more testing. It seems, there are many factors that drastically change the numbers. For example, in the middle of my tests, my laptop battery was empty and I plugged in the power cable. This probably disabled some power safe mode in the CPU and the outcome was twice as fast. The following numbers are all with a power cable. I only included one without the cable for comparison. All the tests are with my AoC2023 solution for day 1 part 2. Maybe other solutions would allocate and free memory in a different way.
And please keep in mind, that I have no experience with benchmarks. I probably did many mistakes (for example running the benchmark directly after running zig build
with a hot CPU. If you want to use the numbers, you should rerun them yourself.
Here are my numbers
Some conclusions are:
That rough summary and those numbers match pretty close to my mental model
There are plans to make gpa performant. I know there is a larger tracking issue on the zig GitHub, but it really is a very primitive allocator currently that is not very fast. Some people suggested renaming it to debug allocator for now, but they decided not to cause the plan is to improve it until it is a good general purpose allocator. But in zig today gpa is general not great to use except for debugging
I am currently thinking about the API of the platform. The current version is [Part1, Part2] -> List U8
. For this to work, you have to embed the input file into your roc program.
What I am thinking about is to change this to two functions: part1 : Str -> List U8
and part2 : Str -> List U8
. So the platform would provide the puzzle input to the Roc script.
The platform could either look for an input file next to the Roc script or read from stdin. If you read the FAQ Can I copy/redistribute part of Advent of Code?, then reading from stdin sounds like the only allowed option if you want to publish your solutions in a repo.
What do you think? If you want to use the platform for AoC, what would be your preference?
I'd definitely prefer to read from an input file, you could set up an example repo for cloning that already has a gitignore for input files so they don't get uploaded.
This is where the fancy solution would be to write a script that steals cookies from the browser and auto downloads the file to an ignored folder. Would open up the login page if the cookies can't be found.
This seems possible :joy:
https://pypi.org/project/advent-of-code-data/
Yeah, yt-dlp made me think of it. They steal cookies from browser to enable downloading videos from patreon among other protected sites
Total Roc beginner here. What's the reason for creating a platform for AoC? Why not use one of the existing "standard" platforms like basic-cli? Is it because you also want to experiment with Roc platforms while you develop for AoC?
Just gives a really simple and tailored experience. Could easily do AOC with the basic-cli platform.
Is it possible to wrap one platform in another? What if I wanted to use all the code in basic-cli but replace main.roc with my own? Would I have to fork it or can I wrap it somehow?
I think you can replace the term platform with the term framework in other languages.
For example in python, you can use Django to create a webpage with one input element. For you use a much simpler framework.
I would say, basic-cli is like Django. You can do many things with it. But for AoC, you just need to transform an input string.
I like simple solutions. And a small AoC framework/platform seems much simpler then basic-cli.
Ah, I see the motivation then. Thanks. Are there things in basic-cli that go beyond Roc builtins, or does your platform drop parts of the builtin stuff for simplicity's sake?
Never mind; I see all the extra complexity in basic-cli. The Roc platform concept is kind of interesting / new to me; I'm used to all of the things in basic-cli just being always available but I like the idea of a platform being as simple as possible for the purpose.
If you're interested I've also got a template for AoC I shared. It's using basic-cli and is just a roc package.
Technically you could use any platform that provides the required effects to pass in a module parameters to the package, but realistically basic-cli is probably the easiest rn.
Thanks!
I think I'll try my hand at a super simple Go based platform. I like the idea of a solution "part" being just a function from a string (or maybe list of strings) to another string. The platform can handle reading the input and writing the output.
I changed the signature of the platform. Now, it requires the functions part1 : Str -> Result Str _
and part2 : Str -> Result Str _
To work with List U8
was not so nice in the tests, since the values are not shown as string, but as a List of ascii codes.
The new signature works good with try
.
I found out today, that the zig FixedBufferAllocator does not support deallocation. It can free the last allocation, but not anything before that.
This does not work with the way roc uses the allocator.
Its a bit sad, but I had to switch back to a c-allocator.
Here is the zig issue. I also asked in there help discord channel. It seems, that there is no FixBufferAllocator, that can deallocate.
That seems strange, I thought that TigerBeetle used a FixedBufferAllocator for the entire lifetime of the database
But I guess what they actually say is they _never allocate_ while the database is running, after setup
So maybe they do, but then initialize all the structs within that layout and never worry about deallocation from then on
Yes, that is what they saying. I have not looked, how they do it
I think there DB just has a very specific structure where they can precisely lay out everything from the beginning
And they use asserts EVERYWHERE to make sure none of their invariants are broken during development
That's their "TigerStyle" coding practice
Yeah, tiger beetle only allocated once and never deallocates
Everything accounted for up front
Fixed buffer allocator is just a limited arena allocator
@Oskar Hahn what is your goal for the allocator? Why use a fixed buffer in the first place?
I thought, it would be faster. And for advent of code, it seemed like a good trade of, to allocate one big chunk of memory at the beginning
That's fair. Depending on the exact goals, something like the allocator for roc wasm4 may work. It is written in zig and very light weight. Could give it a chunk of memory and it will still properly free. The allocator it's a copy of was specifically designed to be light weight for embedded.
In the zig forum, there was the hint to use a MemoryPool
There could be pools specific sizes, like 100 bytes, 100 kb etc. For big values, it could fall back to a "normal" allocator.
Of cause, this is a lot of wasted space, but this would work with a preallocated memory.
Since the memory is big enough, the wast should not be a problem. I also don't think that it is a CPU-cache problem, since each memory is packed. Only the space between to allocations is bigger then needed. But there is no guaranty anyway, that two allocations are next to each other.
Yeah, all allocators use various levels of pools. Generally after a certain size they fall back on more direct allocation via mmap.
The allocator for roc wasm4 assumes a fixed size memory limit and has no fallback, but otherwise should be quite fast and generally light on resources
More robust allocators like tcmalloc, mimalloc and jbz allocator have a lot of tricks, but they also have a lot of complexity due to expecting multithreaded complex applications.
I would guess allocating is rarely a noticeable bottleneck for Advent of code applications, but if you want to try preallocating a specific size chunk of memory for allocation, you could give this a try based on umm_malloc: https://github.com/lukewilliamboswell/roc-wasm4/blob/main/platform/allocator.zig
Following up on the original post here, for anyone look for more zig platform examples. I've taken some hints from your build.zig file, Oskar, then added some additional features to the one I'm using for a Zig-based Roc platform:
That can be found here in case it's of use to anyone!
https://github.com/jwoudenberg/jay/blob/main/build.zig
Last updated: Jul 05 2025 at 12:14 UTC