zig compiler - intern pool · compiler development

Stream: compiler development

Topic: zig compiler - intern pool

Brendan Hansknecht (Feb 03 2025 at 06:04):

For the actual zig compiler to make IR nodes small, they store a lot of data out of band. They have InternPool to manage this while giving a type safe api.

Is this a pattern we want to follow in the new roc compiler? Do we want to use the zig intern pool directly in roc?

Fundamentally, it enables storing unions of various sizes in a compact way out of band.

Brendan Hansknecht (Feb 03 2025 at 06:05):

Design talked about here: https://youtu.be/KOZcJwGdQok?si=YjBElUxvNBhG6CwI&t=1620

Sam Mohr (Feb 03 2025 at 06:20):

That's what the Env was gonna be for, but it's trickier to ensure data stored out-of-band is present in a multithreaded environment, so I think keeping the interns in each stage is a cleaner solution for now

Sam Mohr (Feb 03 2025 at 06:21):

Eventually something like this would make sense. I expect we'd want to outline the API we want, ignoring the Zig version, and then if it turns out that our desired API matches Zig, then we'd use theirs

Brendan Hansknecht (Feb 03 2025 at 06:21):

I don't think it needs to be global at all. I think simply having it per stage would be really good for reduce memory bloat. I would also assume in a multithreaded context that we may have an arena per thread

Brendan Hansknecht (Feb 03 2025 at 06:23):

As noted in that talk, zig tried a few iterations before reaching this. A number of them are not type safe and I don't think those stepping stones would be worth doing while focusing on correctness. The intern pool (and maybe something simpler but similar) keeps type safety, gives you the exact struct representation you want, and saves memory. So I think it is very low cost for anything correctness and would be a nice win for design.

Brendan Hansknecht (Feb 03 2025 at 06:25):

Not sure it would be their exact intern pool, but I assume it would be a data structure that is shared between multi IRs (not the specific instance, but the type itself).

Sam Mohr (Feb 03 2025 at 06:29):

At least per stage is good, but the current plan is per module, per stage. It avoids the worry of coordination between modules for now, and shouldn't be that hard to pull out once everything works

Sam Mohr (Feb 03 2025 at 06:29):

To beat a dead horse, there are tons of spots where we can get better performance for a low cost

Sam Mohr (Feb 03 2025 at 06:30):

But I'm trying to buy a correctness cheeseburger with all of these perf pennies on the ground

Brendan Hansknecht (Feb 03 2025 at 06:30):

Yeah, per module, per stage

Brendan Hansknecht (Feb 03 2025 at 06:31):

I more generally asking about the general IR design and if we plan to lean more out of band (like intern polol is) or just do large tags

Sam Mohr (Feb 03 2025 at 06:31):

Okay, same page!

Sam Mohr (Feb 03 2025 at 06:31):

Yeah, interning is a must to plan for

Sam Mohr (Feb 03 2025 at 06:32):

Basically any data that's not a number or union should be interned

Brendan Hansknecht (Feb 03 2025 at 06:34):

cool! That is exactly what I was trying to ask

Sam Mohr (Feb 03 2025 at 06:36):

Great

Anthony Bullard (Feb 03 2025 at 13:54):

I think this already what the parser is shooting for already

Anthony Bullard (Feb 03 2025 at 15:49):

Ok, I watched that talk (thought it was Andrew's from 2021 originally). I think we should be careful here if we are aiming for a straightforward compiler that is easy to debug AND easy to contribute to. Peruse the implementation of the Intern Pool - it's not trivial (especially if/when you want to handle multi-threading) for probably 90% of potential contributors.

Brendan Hansknecht (Feb 03 2025 at 16:44):

I don't think intern pool has to be trivial. I just think the interface to it has to be trivial

Brendan Hansknecht (Feb 03 2025 at 16:45):

Just like we can depend on a complex implementation of a hash map that most people don't understand, I think we could depend on a complex intern pool

Brendan Hansknecht (Feb 03 2025 at 16:46):

Clearly there is a balance, but I think API is far more important than underlying complexity for something like this. (Not saying use intern pool, but I think we should think deeply about the easiest way to store data out of band)

Anthony Bullard (Feb 03 2025 at 19:39):

The problem is the intern pool has a simple interface - if you are not touching any of the interned nodes

Anthony Bullard (Feb 03 2025 at 19:40):

If you do, you are right back into the mess

Brendan Hansknecht (Feb 03 2025 at 19:49):

I have not messed with it. We likely need to spin up a prototype to see what use would look like.

Brendan Hansknecht (Feb 03 2025 at 19:49):

That or look at concrete use in the zig compiler

Andrew Kelley (Feb 03 2025 at 22:44):

InternPool started off as some very straightforward code in the compiler, which would have been a good reference I think, but at this point it might be a bit overcomplicated to use as inspiration since it implements an experimental sharding system so that it can be used in multi-threaded contexts

Andrew Kelley (Feb 03 2025 at 22:46):

at its core it's essentially the same pattern as the tokenizer, parser, and other phases of the pipeline, just with some fancy hash map stuff thrown in there

Andrew Kelley (Feb 03 2025 at 22:47):

a neat trick is you can use a nonexhaustive enum for indices of the intern pool, and reserve the first N indices for specific known types. then any time you want to intern a common thing, you can use an enum literal instead

Andrew Kelley (Feb 03 2025 at 22:48):

so your code looks like doSomething(.i32_zero) and .i32_zero lowers to whatever intern pool index you have reserved for that value

Brendan Hansknecht (Feb 04 2025 at 01:15):

Yeah, definitely sounds like the current intern pool would not map to the simplicity we want to start with.

Brendan Hansknecht (Feb 04 2025 at 01:16):

I wonder if we can essentially reimplement the simple starting version that is single threaded

Brendan Hansknecht (Feb 04 2025 at 01:17):

Otherwise, I guess we can start by storing data out of band in multiple different allocations. One for each different type (or at least size) of data. That would be a lot simpler while still avoiding tons of wasted space.

Richard Feldman (Feb 04 2025 at 01:26):

what if we just had each module have its own interns that it populated from scratch in its own thread every time?

Sam Mohr (Feb 04 2025 at 01:27):

That's what I'm currently writing

Sam Mohr (Feb 04 2025 at 01:27):

I'm starting with per module, per stage

Sam Mohr (Feb 04 2025 at 01:27):

But I think we'd actually want just per-module

Richard Feldman (Feb 04 2025 at 01:27):

yeah it duplicates work but has the best cache coherence

Richard Feldman (Feb 04 2025 at 01:27):

and is very simple :big_smile:

Sam Mohr (Feb 04 2025 at 01:28):

So once I finish the Zig conversion, I'll see if I can make just per-module work. That avoids needing to copy cache data between stages

Sam Mohr (Feb 04 2025 at 01:28):

And there's no multithreading concern because a module can't have two stages worked on simultaneously

Ayaz Hafiz (Feb 04 2025 at 01:54):

you may need to be careful that interns across pools (per module or whatever) are equivalent under comparison

Ayaz Hafiz (Feb 04 2025 at 01:55):

the way this is done in the current implementation is that interns per module keep a reference to a global interner that is probed upon insertion (so effectively a write-through cache)

Anthony Bullard (Feb 04 2025 at 02:05):

The parser we are working with is pretty similar to the non-type-safe iteration of what you guys built

Anthony Bullard (Feb 04 2025 at 02:06):

I think we could get to the type-safe API of the intern pool there with me having more experience with comptime

Last updated: Jul 26 2025 at 12:14 UTC