Stream: show and tell

Topic: Advent of Code platform


view this post on Zulip Oskar Hahn (Nov 10 2024 at 14:30):

I created the platform, that I will use for AoC this year:

https://github.com/ostcar/roc-aoc-platform

The main function has this signature: [Part1, Part2] -> List U8.

The platform statically allocates memory on startup. While Roc is running, there are no allocations. As default, it allocates one GiB of memory, but other sizes can be used via the argument --memory. The platform does not deallocate any memory. Last year, I activated deallocations on day 17, since this was necessary for my solution. It is possible, that I will implement it as well this year. But it will be optional.

The output of the platform looks like this:

Part1 in 24.141ms, used 64167756 bytes:
2377

Part2 in 20.347ms, used 64167757 bytes:
71220

I think that the platform is a good example for a zig-platform, since it has a working build.zig file, that creates all files necessary for the surgical linker and the legacy linker for linux and mac (I don't have a mac, so I could only test it with linux, but I should work). The platform does not use C-malloc and fiends, but a zig allocator (but without deallocate at the moment). As soon as expect works like dbg and does not need shm_open, mmap and getppid anymore, it should be possible to use the platform without libc.

view this post on Zulip Jasper Woudenberg (Nov 10 2024 at 14:37):

Perfect timing! I was just talking to a friend yesterday who's considering doing AoC in Roc this year, just sent them the link to this.

Also excited to take a look at how you did the Zig bits. I'm working on a zig platform on Linux and never got the surgical linker working. Also wanted to take a look at how I might run allocation through a Zig allocator but never felt sure enough of what I was doing to try it out, so will definitly check out how you did it!

view this post on Zulip Jasper Woudenberg (Nov 11 2024 at 21:54):

I didn't get the surgical linking to work unfortunately, keep running into an error. I think it's a better error then the one I had before though: previously when using the surgical linker I got a crash when zig host code allocated memory. Now I get a nice error linking to a Roc issue about surgical linking not support absolute relocations. Either way I can use the legacy linker to work around it for now.

I'm liking working with the Zig build system! Wondering if we should set up a Zig package for Roc platform development. It could include roc.Build namespace with Roc-specific build steps, a roc.std for the roc standard library integrations, maybe some comptime helpers for calling host-exposed functions, and a roc.mem with your allocator-for-Roc code.

view this post on Zulip Luke Boswell (Nov 11 2024 at 22:01):

Wondering if we should set up a Zig package for Roc platform development

https://github.com/lukewilliamboswell/roc-platform-template-zig

I've not given it much love recently... but this was the plan :smiley:

view this post on Zulip Luke Boswell (Nov 11 2024 at 22:02):

Or do you mean a roc_std like, zig package?

view this post on Zulip Jasper Woudenberg (Nov 11 2024 at 22:29):

I was aware of your zig platform template, it was a great help getting started with my platform work!

Yeah, I meant a package, as in something you could add to the build.zig.zon file of a zig platform to get all the batteries for roc platform development.

view this post on Zulip Luke Boswell (Nov 11 2024 at 22:39):

This is one of the main reasons I'd like to land the LLVM / Zig 13 upgrade. :smiley:

view this post on Zulip Luke Boswell (Nov 11 2024 at 22:40):

AFAIK you can't use a zig package with 0.11.0

view this post on Zulip Luke Boswell (Nov 11 2024 at 22:40):

I'm hoping we can land this rebuild host PR soon (like very soon), and then I can merge the latest into that branch and it should be pretty close.

view this post on Zulip Luke Boswell (Nov 11 2024 at 22:55):

It's something we kind of need in the repository for that upgrade. My current workaround, is to copy the builtins into each of the test platforms - once at the start of running the tests. I experimented with a package, but that was deviating a bit from the builtins.

I'm starting to think it might be a good idea to make the builtins depend on a roc_std zig package that lives in the repo, and the test platform also use that, and then also anyone else who is making a zig platform.

view this post on Zulip Luke Boswell (Nov 11 2024 at 22:59):

@Brendan Hansknecht does it sound like a terrible idea to have the builtins and zig test platform use a common zig package for the primitives? Could we separate out all the extra builtin stuff easily enough?

view this post on Zulip Jasper Woudenberg (Nov 11 2024 at 23:06):

I'm super excited about this!

One question: could it lead to versioning issues if zig platforms use one version of the standard library and the applications people write another?

view this post on Zulip Luke Boswell (Nov 11 2024 at 23:12):

It should be fine, as long as the platform uses the correct C-ABI etc

view this post on Zulip Brendan Hansknecht (Nov 11 2024 at 23:59):

I think it would be great for there to be a zig package for all the builtin types.

view this post on Zulip Brendan Hansknecht (Nov 11 2024 at 23:59):

Fully in the zig ecosystem but tied to a roc version for compatibility reasons

view this post on Zulip Brendan Hansknecht (Nov 12 2024 at 00:00):

Jasper Woudenberg said:

I'm super excited about this!

One question: could it lead to versioning issues if zig platforms use one version of the standard library and the applications people write another?

Yes it could. Your platform version is tied to a range of roc versions. Outside of those versions, it would interact incorrectly and break

view this post on Zulip Oskar Hahn (Nov 13 2024 at 21:13):

I implemented deallocations. The current default is not to deallocate. Without the argument --deallocate, roc_dealloc is a noop.

I tested the platform on some of my AoC2023 solutions. I am surprised, that there is no real performance boost if deallocations are skipped. In many cases, the solution was even a bit faster with deallocations.

view this post on Zulip teskje (Nov 13 2024 at 21:32):

Could it be that the os needs to page in less memory when you are using a smaller part of the backing buffer? Like, when you deallocate and then allocate again, you can reuse already paged-in memory whereas when you don't deallocate each allocation might have to page in new memory

view this post on Zulip Brendan Hansknecht (Nov 13 2024 at 21:39):

Yeah, it is likely exactly that...or that plus allocator overhead.

view this post on Zulip Brendan Hansknecht (Nov 13 2024 at 21:39):

Cause allocators may start doing smart things but fall back to slower paths if nothing is being freed

view this post on Zulip Brendan Hansknecht (Nov 13 2024 at 21:40):

Also, you still have the full cost of recounting in roc. The recounting will cost much more than the dealloc generally

view this post on Zulip Brendan Hansknecht (Nov 13 2024 at 21:40):

If you preallocated an arena and paged in roughly the memory needed, you should see some gains

view this post on Zulip Oskar Hahn (Nov 13 2024 at 21:47):

Brendan Hansknecht said:

If you preallocated an arena and paged in roughly the memory needed, you should see some gains

Thats what I am doing with or without the --deallocate. I am using the page_allocator to get 1GiB of memory and use the FixedBufferAllocator to manage it: https://github.com/ostcar/roc-aoc-platform/blob/015c80f03680fa708c119db8371b2351649ce8a7/host/host.zig#L31-L33

So I would think, that the os creates all the pages anyway. Or is it smart enough only to give the memory, when it is actually used?

view this post on Zulip Oskar Hahn (Nov 13 2024 at 21:48):

I don't think that the roc refcounting could change anything, since roc does not know that roc_dealloc is a noop.

view this post on Zulip Oskar Hahn (Nov 13 2024 at 21:51):

I would have guessed, that it could be CPU cache locality. Less memory could mean less cache misses.

The difference was not so big. Maybe with a better benchmark I would get other numbers. I am currently only measuring the time that roc__solutionForHost_1_exposed_generic needs.

view this post on Zulip Brendan Hansknecht (Nov 13 2024 at 22:02):

Oskar Hahn said:

Brendan Hansknecht said:

If you preallocated an arena and paged in roughly the memory needed, you should see some gains

Thats what I am doing with or without the --deallocate. I am using the page_allocator to get 1GiB of memory and use the FixedBufferAllocator to manage it: https://github.com/ostcar/roc-aoc-platform/blob/015c80f03680fa708c119db8371b2351649ce8a7/host/host.zig#L31-L33

So I would think, that the os creates all the pages anyway. Or is it smart enough only to give the memory, when it is actually used?

Yeah, I'm pretty sure that will lazily load pages on page fault

view this post on Zulip Brendan Hansknecht (Nov 13 2024 at 22:02):

Which, to be fair, is generally the wanted behaviour

view this post on Zulip Brendan Hansknecht (Nov 13 2024 at 22:03):

Oskar Hahn said:

I don't think that the roc refcounting could change anything, since roc does not know that roc_dealloc is a noop.

We technically could give platforms control over this setting. The compiler currently has an internal config to change recounting to a no-op. That said, with recounting as a no-op, it also breaks uniqueness analysis...so that may actually be even worse for perf.

view this post on Zulip Oskar Hahn (Nov 13 2024 at 22:13):

Brendan Hansknecht said:

Oskar Hahn said:

Brendan Hansknecht said:

If you preallocated an arena and paged in roughly the memory needed, you should see some gains

Thats what I am doing with or without the --deallocate. I am using the page_allocator to get 1GiB of memory and use the FixedBufferAllocator to manage it: https://github.com/ostcar/roc-aoc-platform/blob/015c80f03680fa708c119db8371b2351649ce8a7/host/host.zig#L31-L33

So I would think, that the os creates all the pages anyway. Or is it smart enough only to give the memory, when it is actually used?

Yeah, I'm pretty sure that will lazily load pages on page fault

The difference is really small, so I could be mistaken. But it seems, that when I set the hole memory buffer with @memset(buffer, 'x');, then without deallocations is either faster or the same.

view this post on Zulip Oskar Hahn (Nov 13 2024 at 22:37):

Brendan Hansknecht said:

If you preallocated an arena and paged in roughly the memory needed, you should see some gains

You are right. I tested the difference between preallocated memory and using the zig GeneralPurposeAllocator:

GPA with deallocation:
2.692s

GPA without deallocation:
440.34ms

Preallocated with deallocation:
18.369ms

Preallocated without deallocation:
13.631ms

The difference is remarkable.

view this post on Zulip Brendan Hansknecht (Nov 14 2024 at 00:03):

Wow, deallocation costs way more than I expected. I wonder how much of that is simply that zig's GPA is pretty bad currently vs would happen with other allocators.

view this post on Zulip Isaac Van Doren (Nov 14 2024 at 00:05):

Whoa thats surprising!

view this post on Zulip Brendan Hansknecht (Nov 14 2024 at 00:31):

@Oskar Hahn are you able to test with something like: https://github.com/joadnacer/jdz_allocator

Or I guess with malloc and free?

Both should be more robust allocators than the gpa.

view this post on Zulip Oskar Hahn (Nov 14 2024 at 07:46):

I did more testing. It seems, there are many factors that drastically change the numbers. For example, in the middle of my tests, my laptop battery was empty and I plugged in the power cable. This probably disabled some power safe mode in the CPU and the outcome was twice as fast. The following numbers are all with a power cable. I only included one without the cable for comparison. All the tests are with my AoC2023 solution for day 1 part 2. Maybe other solutions would allocate and free memory in a different way.

And please keep in mind, that I have no experience with benchmarks. I probably did many mistakes (for example running the benchmark directly after running zig build with a hot CPU. If you want to use the numbers, you should rerun them yourself.

Here are my numbers

Some conclusions are:

view this post on Zulip Brendan Hansknecht (Nov 14 2024 at 08:10):

That rough summary and those numbers match pretty close to my mental model

view this post on Zulip Brendan Hansknecht (Nov 14 2024 at 08:11):

There are plans to make gpa performant. I know there is a larger tracking issue on the zig GitHub, but it really is a very primitive allocator currently that is not very fast. Some people suggested renaming it to debug allocator for now, but they decided not to cause the plan is to improve it until it is a good general purpose allocator. But in zig today gpa is general not great to use except for debugging

view this post on Zulip Oskar Hahn (Nov 16 2024 at 13:01):

I am currently thinking about the API of the platform. The current version is [Part1, Part2] -> List U8. For this to work, you have to embed the input file into your roc program.

What I am thinking about is to change this to two functions: part1 : Str -> List U8 and part2 : Str -> List U8. So the platform would provide the puzzle input to the Roc script.

The platform could either look for an input file next to the Roc script or read from stdin. If you read the FAQ Can I copy/redistribute part of Advent of Code?, then reading from stdin sounds like the only allowed option if you want to publish your solutions in a repo.

What do you think? If you want to use the platform for AoC, what would be your preference?

view this post on Zulip Anton (Nov 16 2024 at 13:13):

I'd definitely prefer to read from an input file, you could set up an example repo for cloning that already has a gitignore for input files so they don't get uploaded.

view this post on Zulip Brendan Hansknecht (Nov 16 2024 at 17:08):

This is where the fancy solution would be to write a script that steals cookies from the browser and auto downloads the file to an ignored folder. Would open up the login page if the cookies can't be found.

view this post on Zulip Oskar Hahn (Nov 16 2024 at 17:18):

This seems possible :joy:
https://pypi.org/project/advent-of-code-data/

view this post on Zulip Brendan Hansknecht (Nov 16 2024 at 17:23):

Yeah, yt-dlp made me think of it. They steal cookies from browser to enable downloading videos from patreon among other protected sites

view this post on Zulip Matt Harden (Nov 28 2024 at 21:03):

Total Roc beginner here. What's the reason for creating a platform for AoC? Why not use one of the existing "standard" platforms like basic-cli? Is it because you also want to experiment with Roc platforms while you develop for AoC?

view this post on Zulip Brendan Hansknecht (Nov 28 2024 at 21:05):

Just gives a really simple and tailored experience. Could easily do AOC with the basic-cli platform.

view this post on Zulip Matt Harden (Nov 28 2024 at 21:08):

Is it possible to wrap one platform in another? What if I wanted to use all the code in basic-cli but replace main.roc with my own? Would I have to fork it or can I wrap it somehow?

view this post on Zulip Oskar Hahn (Nov 28 2024 at 21:21):

I think you can replace the term platform with the term framework in other languages.

For example in python, you can use Django to create a webpage with one input element. For you use a much simpler framework.

I would say, basic-cli is like Django. You can do many things with it. But for AoC, you just need to transform an input string.

I like simple solutions. And a small AoC framework/platform seems much simpler then basic-cli.

view this post on Zulip Matt Harden (Nov 28 2024 at 21:23):

Ah, I see the motivation then. Thanks. Are there things in basic-cli that go beyond Roc builtins, or does your platform drop parts of the builtin stuff for simplicity's sake?

view this post on Zulip Matt Harden (Nov 28 2024 at 21:34):

Never mind; I see all the extra complexity in basic-cli. The Roc platform concept is kind of interesting / new to me; I'm used to all of the things in basic-cli just being always available but I like the idea of a platform being as simple as possible for the purpose.

view this post on Zulip Luke Boswell (Nov 28 2024 at 21:42):

If you're interested I've also got a template for AoC I shared. It's using basic-cli and is just a roc package.

https://roc.zulipchat.com/#narrow/channel/358903-advent-of-code/topic/2024.20AoC.20Template/near/481387177

Technically you could use any platform that provides the required effects to pass in a module parameters to the package, but realistically basic-cli is probably the easiest rn.

view this post on Zulip Matt Harden (Nov 28 2024 at 21:43):

Thanks!

view this post on Zulip Matt Harden (Nov 28 2024 at 21:52):

I think I'll try my hand at a super simple Go based platform. I like the idea of a solution "part" being just a function from a string (or maybe list of strings) to another string. The platform can handle reading the input and writing the output.

view this post on Zulip Oskar Hahn (Dec 01 2024 at 17:51):

I changed the signature of the platform. Now, it requires the functions part1 : Str -> Result Str _ and part2 : Str -> Result Str _

To work with List U8 was not so nice in the tests, since the values are not shown as string, but as a List of ascii codes.

The new signature works good with try.

view this post on Zulip Oskar Hahn (Dec 06 2024 at 14:45):

I found out today, that the zig FixedBufferAllocator does not support deallocation. It can free the last allocation, but not anything before that.

This does not work with the way roc uses the allocator.

Its a bit sad, but I had to switch back to a c-allocator.

Here is the zig issue. I also asked in there help discord channel. It seems, that there is no FixBufferAllocator, that can deallocate.

view this post on Zulip Anthony Bullard (Dec 06 2024 at 14:50):

That seems strange, I thought that TigerBeetle used a FixedBufferAllocator for the entire lifetime of the database

view this post on Zulip Anthony Bullard (Dec 06 2024 at 14:51):

But I guess what they actually say is they _never allocate_ while the database is running, after setup

view this post on Zulip Anthony Bullard (Dec 06 2024 at 14:51):

So maybe they do, but then initialize all the structs within that layout and never worry about deallocation from then on

view this post on Zulip Oskar Hahn (Dec 06 2024 at 14:51):

Yes, that is what they saying. I have not looked, how they do it

view this post on Zulip Anthony Bullard (Dec 06 2024 at 14:52):

I think there DB just has a very specific structure where they can precisely lay out everything from the beginning

view this post on Zulip Anthony Bullard (Dec 06 2024 at 14:52):

And they use asserts EVERYWHERE to make sure none of their invariants are broken during development

view this post on Zulip Anthony Bullard (Dec 06 2024 at 14:53):

That's their "TigerStyle" coding practice

view this post on Zulip Brendan Hansknecht (Dec 06 2024 at 16:34):

Yeah, tiger beetle only allocated once and never deallocates

view this post on Zulip Brendan Hansknecht (Dec 06 2024 at 16:34):

Everything accounted for up front

view this post on Zulip Brendan Hansknecht (Dec 06 2024 at 16:35):

Fixed buffer allocator is just a limited arena allocator

view this post on Zulip Brendan Hansknecht (Dec 06 2024 at 16:36):

@Oskar Hahn what is your goal for the allocator? Why use a fixed buffer in the first place?

view this post on Zulip Oskar Hahn (Dec 07 2024 at 08:01):

I thought, it would be faster. And for advent of code, it seemed like a good trade of, to allocate one big chunk of memory at the beginning

view this post on Zulip Brendan Hansknecht (Dec 07 2024 at 08:44):

That's fair. Depending on the exact goals, something like the allocator for roc wasm4 may work. It is written in zig and very light weight. Could give it a chunk of memory and it will still properly free. The allocator it's a copy of was specifically designed to be light weight for embedded.

view this post on Zulip Oskar Hahn (Dec 08 2024 at 15:23):

In the zig forum, there was the hint to use a MemoryPool

There could be pools specific sizes, like 100 bytes, 100 kb etc. For big values, it could fall back to a "normal" allocator.

Of cause, this is a lot of wasted space, but this would work with a preallocated memory.

Since the memory is big enough, the wast should not be a problem. I also don't think that it is a CPU-cache problem, since each memory is packed. Only the space between to allocations is bigger then needed. But there is no guaranty anyway, that two allocations are next to each other.

view this post on Zulip Brendan Hansknecht (Dec 08 2024 at 16:37):

Yeah, all allocators use various levels of pools. Generally after a certain size they fall back on more direct allocation via mmap.

view this post on Zulip Brendan Hansknecht (Dec 08 2024 at 16:38):

The allocator for roc wasm4 assumes a fixed size memory limit and has no fallback, but otherwise should be quite fast and generally light on resources

view this post on Zulip Brendan Hansknecht (Dec 08 2024 at 16:39):

More robust allocators like tcmalloc, mimalloc and jbz allocator have a lot of tricks, but they also have a lot of complexity due to expecting multithreaded complex applications.

view this post on Zulip Brendan Hansknecht (Dec 08 2024 at 16:42):

I would guess allocating is rarely a noticeable bottleneck for Advent of code applications, but if you want to try preallocating a specific size chunk of memory for allocation, you could give this a try based on umm_malloc: https://github.com/lukewilliamboswell/roc-wasm4/blob/main/platform/allocator.zig

view this post on Zulip Jasper Woudenberg (Dec 19 2024 at 20:32):

Following up on the original post here, for anyone look for more zig platform examples. I've taken some hints from your build.zig file, Oskar, then added some additional features to the one I'm using for a Zig-based Roc platform:

That can be found here in case it's of use to anyone!
https://github.com/jwoudenberg/jay/blob/main/build.zig


Last updated: Jul 05 2025 at 12:14 UTC