For the actual zig compiler to make IR nodes small, they store a lot of data out of band. They have InternPool to manage this while giving a type safe api.
Is this a pattern we want to follow in the new roc compiler? Do we want to use the zig intern pool directly in roc?
Fundamentally, it enables storing unions of various sizes in a compact way out of band.
Design talked about here: https://youtu.be/KOZcJwGdQok?si=YjBElUxvNBhG6CwI&t=1620
That's what the Env was gonna be for, but it's trickier to ensure data stored out-of-band is present in a multithreaded environment, so I think keeping the interns in each stage is a cleaner solution for now
Eventually something like this would make sense. I expect we'd want to outline the API we want, ignoring the Zig version, and then if it turns out that our desired API matches Zig, then we'd use theirs
I don't think it needs to be global at all. I think simply having it per stage would be really good for reduce memory bloat. I would also assume in a multithreaded context that we may have an arena per thread
As noted in that talk, zig tried a few iterations before reaching this. A number of them are not type safe and I don't think those stepping stones would be worth doing while focusing on correctness. The intern pool (and maybe something simpler but similar) keeps type safety, gives you the exact struct representation you want, and saves memory. So I think it is very low cost for anything correctness and would be a nice win for design.
Not sure it would be their exact intern pool, but I assume it would be a data structure that is shared between multi IRs (not the specific instance, but the type itself).
At least per stage is good, but the current plan is per module, per stage. It avoids the worry of coordination between modules for now, and shouldn't be that hard to pull out once everything works
To beat a dead horse, there are tons of spots where we can get better performance for a low cost
But I'm trying to buy a correctness cheeseburger with all of these perf pennies on the ground
Yeah, per module, per stage
I more generally asking about the general IR design and if we plan to lean more out of band (like intern polol is) or just do large tags
Okay, same page!
Yeah, interning is a must to plan for
Basically any data that's not a number or union should be interned
cool! That is exactly what I was trying to ask
Great
I think this already what the parser is shooting for already
Ok, I watched that talk (thought it was Andrew's from 2021 originally). I think we should be careful here if we are aiming for a straightforward compiler that is easy to debug AND easy to contribute to. Peruse the implementation of the Intern Pool - it's not trivial (especially if/when you want to handle multi-threading) for probably 90% of potential contributors.
I don't think intern pool has to be trivial. I just think the interface to it has to be trivial
Just like we can depend on a complex implementation of a hash map that most people don't understand, I think we could depend on a complex intern pool
Clearly there is a balance, but I think API is far more important than underlying complexity for something like this. (Not saying use intern pool, but I think we should think deeply about the easiest way to store data out of band)
The problem is the intern pool has a simple interface - if you are not touching any of the interned nodes
If you do, you are right back into the mess
I have not messed with it. We likely need to spin up a prototype to see what use would look like.
That or look at concrete use in the zig compiler
InternPool started off as some very straightforward code in the compiler, which would have been a good reference I think, but at this point it might be a bit overcomplicated to use as inspiration since it implements an experimental sharding system so that it can be used in multi-threaded contexts
at its core it's essentially the same pattern as the tokenizer, parser, and other phases of the pipeline, just with some fancy hash map stuff thrown in there
a neat trick is you can use a nonexhaustive enum for indices of the intern pool, and reserve the first N indices for specific known types. then any time you want to intern a common thing, you can use an enum literal instead
so your code looks like doSomething(.i32_zero)
and .i32_zero
lowers to whatever intern pool index you have reserved for that value
Yeah, definitely sounds like the current intern pool would not map to the simplicity we want to start with.
I wonder if we can essentially reimplement the simple starting version that is single threaded
Otherwise, I guess we can start by storing data out of band in multiple different allocations. One for each different type (or at least size) of data. That would be a lot simpler while still avoiding tons of wasted space.
what if we just had each module have its own interns that it populated from scratch in its own thread every time?
That's what I'm currently writing
I'm starting with per module, per stage
But I think we'd actually want just per-module
yeah it duplicates work but has the best cache coherence
and is very simple :big_smile:
So once I finish the Zig conversion, I'll see if I can make just per-module work. That avoids needing to copy cache data between stages
And there's no multithreading concern because a module can't have two stages worked on simultaneously
you may need to be careful that interns across pools (per module or whatever) are equivalent under comparison
the way this is done in the current implementation is that interns per module keep a reference to a global interner that is probed upon insertion (so effectively a write-through cache)
The parser we are working with is pretty similar to the non-type-safe iteration of what you guys built
I think we could get to the type-safe API of the intern pool there with me having more experience with comptime
Last updated: Jul 06 2025 at 12:14 UTC