would it be worth having some kind of struct of arrays thing as a builtin? similar to zigs MultiArrayList
after handmade Seattle I was really impressed with how these can improve performance and the control is in "userspace"
I hadn't really thought about it to be honest!
so I guess the idea would be that we have MultiList a as a builtin, and the way it works is that when you put a record type in there, it compiles to a record of Lists, where each List represents a field in the list?
and any non-records you put in there would work like they do in a normal List?
intuitively. You could imagine that tag unions store the tag id in a separate array
this idea sounds very attractive to me
and I guess nested product types would get flattened automatically :thinking:
yea or like an Array of structs
I forget which is which but MultiArrayList from zig is the exact implementation I’m thinking off
yeah that one's a struct of arrays
where each field from a record goes into its own list or something
right!
how would iteration work? :thinking:
So { a, b }
becomes List a and List b under the hood
like if I did a walk on a MultiList { x : I64, y : I64 }
what would that compile to?
hm not sure, let me think a little and I’ll come back after this appointment I have
you do the bounds check once, and then just get the value out of each array
ah that makes sense!
you can re-assemble into the record type that you pretend to be storing. That's just moving values around on the stack
so then like map would be more efficient, but walk might be less efficient
than a List
and then call the function
wow yeah I think that would Just Work in general :thinking:
and get the efficiency benefits
like even if map and such were all implemented in terms of walk
unless I'm missing something?
MultiArrayList in Zig, for context:
https://zig.news/kristoff/struct-of-arrays-soa-in-zig-easy-in-userland-40m0
https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig
I thnk that just works yes
If I can drop a comment or two: it's efficient since you don't actually store structs in an array (having to care about padding to satisfy alignment reqs), but store each struct field in a separate array. In zig we then have a "view" into each slice which you can use to iterate on, etc.
I think we wouldn't even need the "view" in our case, right? :thinking:
the view is hidden in the map/walk/etc functions
I think you always need a view into each array, but you can synthesise into a struct one level up
anyway, example usage: https://github.com/ziglang/zig/blob/3d528161c80c753a9b00cae84cae245526b29c10/src/arch/x86_64/Emit.zig#L60
is a view more than an index?
or equivalently, a pointer
Folkert de Vries said:
or equivalently, a pointer
Just an index/pointer
right
a relevant distinction with Zig's MultiArrayList is that this wouldn't be in userspace in Roc - it would be a language builtin, part of the standard library
You don't actually need to build a struct on the way out always - sometimes it's better to just iterate on fields
that's true but something we cannot easily do because this is not in userspace
but I've been doing that a bunch in rust and it's great
Roc is higher level so returning a struct but storing each field in a separate array should be doable and fine, right?
e.g. using a vector instead of a hashmap, store the keys separately in an array from the elements, and then if the keys are small simd can just tear through that array to find if a key is in there
yes as a user you could always do that yourself
but then other functionality is more error prone
or slower
Folkert de Vries said:
e.g. using a vector instead of a hashmap, store the keys separately in an array from the elements, and then if the keys are small simd can just tear through that array to find if a key is in there
interesting, I never thought of relying on SIMD in a situation like this but this makes sense
Folkert de Vries said:
or slower
Slower? How so? Isn't it just extracting an index and then doing an O(1) * n where n is the number of fields in the struct?
yes but we do bounds checking always
Folkert de Vries said:
yes but we do bounds checking always
Ah, gotcha
so it's both slower, and you have to deal with "optionals" every time
Right
so totally possible, but not ergonomic
hence why we'd want to bake it into the language
and then you loose the ability to iterate over just one of the arrays
but in practice llvm/inlining might elide the lookups that are not used
Sweet I love how well received the idea is
I’m down to put it on my queue of roc things to do but I’d probably need help so if someone gets to it without me it’s totally fine
This feature should do well with game devs
or anyone looking to reduce padding
Do we have any ideas for enabling users to walk a sub array. Like explicitly saying that they want to loop over the a part of { a, b }. I think this may be important from a compiler optimization perspective. I am not sure that llvm will be able to/will always correct fix a map that only uses a subset of the fields. So generating code to load data from all arrays and build a struct only to use one field may not optimize well.
hm good point
I am not sure either, but I think there is a good chance it will. The alternatives are worse: a single array will always load the data, the user making multiple lists is still possible, but super inconvenient and you always pay the cost of the bounds check
I am not sure that llvm will be able to/will always correct fix a map that only uses a subset of the fields.
if it can't, we could potentially do our own dead code elimination pass on that
we have all the info, after all!
Brendan Hansknecht said:
So generating code to load data from all arrays and build a struct only to use one field may not optimize well.
Is it really necessary to effectively build structs out of the multi-array-lists? Couldn't the compiler internally treat an index into a multi-array-list like a stand-alone record? Something like the distinction between short and long strings: both are strings on the surface, but have a different representation in memory?
it's not required, but it's the easiest way to do it. If that turns out to be inefficient, then, as Richard said, we have all the info to do specific optimizations
What would be the downsides / how much would it cost in performance to implement the existing List type like this?
You mean have a list of records in Roc always compile to a record of lists?
Yes
i think the worst case would be random access where you care about all elements in a single record. You would be doing n random memory loads where n is the number of fields in the record. Instead of 1 random and the rest cached. That would probably be a large hit on performance. Also, adding a single element could lead to n calls to the allocator.
Also, probably a large cost to sort a list given n locations to move memory from instead of 1 chunk.
Though probably faster lookup if your key is a single element and we can use simd.
So definitely has some hefty potential costs to consider. Very useful when used right, but definitely has bad cases as well.
yea it's worth giving the control over to the user as a possible optimization if needed
What need to be done to make SOA possible to define in userspace?
Zig uses compile-time execution and reflection to generate a SOA type for a struct, and generate the corresponding pointer type too.
What's the semantic of a struct in Roc? If there is no guarantee of ABI, the members of a struct don't even need to be contiguous in memory. Then SOA or AOS is an implementation detail, not part of the type of a value. It is similar to inline, that I think it should be a hint to the compiler.
Idealistic view of optimization: Humans are terrible are guessing performance of a program. I hope I don't need to tell the compiler how to optimize my code, that the compiler will search in the output-space and find the best assembly implementation for a program.
do you know of any academic research on compilers that can detect at compile time whether AoS versus SoA would have better performance at runtime?
That would be really valuable if someone figured out a way to do it reliably!
futhark does the transformation, but I think it just always does it
I think a well done SoA tends to be better in most cases, but well done tends to depend on design decisions that a human would make rather than a compiler.
AoS is really only conducive to cases where each struct in the array is accessed randomly and most or all of its fields are used on each access (Where you might also use a hashmap). If that is the case, you are doing something that already sucks on modern hardware, so it still won't be fast.
All that being said, both should be completely doable in roc currently. Just would be done manually.
To make it even close to automatic in our current state i don't think is doable. No current way to map from a struct to the set of arrays.
Also, roc does have a struct abi that defines order, size, and alignment of each field. Because platforms can depend on it, it is not up to the compiler to change it as it sees fit based on AoS or SoA. That being said, a specific data structure could theoretically automatically pick as long as it has an API that deals with the conversion to/illusion of regular structs
yeah the MultiList a builtin idea definitely seems the most promising to me!
Last updated: Jun 16 2026 at 16:19 UTC