Stream: compiler development

Topic: zig compiler - recursion


view this post on Zulip Jared Ramirez (Jun 02 2025 at 21:56):

Brendan Hansknecht said:

For example, a list or a box should break infinite recursion and be valid:

Node a: { data: a, children: List(Node(a)) }

For this (and your other example), these would need to be wrapped in a nominal type, not an alias, correct? Based on this RFC

view this post on Zulip Brendan Hansknecht (Jun 02 2025 at 21:57):

I hope not

view this post on Zulip Brendan Hansknecht (Jun 02 2025 at 21:57):

At least for the list one (same with box, set, dict, etc)

view this post on Zulip Brendan Hansknecht (Jun 02 2025 at 21:58):

For the other one, I think nominal type is reasonable and should make the rfc

view this post on Zulip Notification Bot (Jun 02 2025 at 22:02):

4 messages were moved here from #compiler development > casual conversation by Jared Ramirez.

view this post on Zulip Jared Ramirez (Jun 02 2025 at 22:09):

Moving this into its own topic!

I’m a bit confused about what kinds of recursion are allowed and what aren’t. I’ve been working on implementing an occurs check based on this RFC — but it’s possible I misunderstood the intent there.

My understanding was that, due to the issue outlined in that RFC, the problem stemmed from recursive types that are equivalent-but-not-equal, which lead to layout mismatches. I thought the proposed solution in the new compiler was to require all recursion to go through a nominal type. That seemed to be echoed in this Zulip thread as well.

I'm not taking a position on whether recursion should always go through nominal types — I just want to clarify what the actual rules are, so I can implement the correct behavior!

view this post on Zulip Richard Feldman (Jun 02 2025 at 22:24):

@Ayaz Hafiz would be the expert on this topic for sure :smile:

view this post on Zulip Brendan Hansknecht (Jun 02 2025 at 22:28):

Personally, I want to just special case list and box (special casing list by extension special cases dict and set which are implemented with list).

They contain pointers (that is just part of the implementation). As such, they should always break recursion.

view this post on Zulip Brendan Hansknecht (Jun 02 2025 at 22:28):

So no matter the rest of the rules, I think those need special casing

view this post on Zulip Ayaz Hafiz (Jun 04 2025 at 18:33):

I think the only way to make it work is to have the recursion only allowed if it's defined from a nominal type

view this post on Zulip Ayaz Hafiz (Jun 04 2025 at 18:33):

require all recursion to go through a nominal type
this is the way imo

view this post on Zulip Richard Feldman (Jun 04 2025 at 21:25):

so in other words this would work:

Node(a) := [Node({ data: a, children: List(Node(a)) })]

...but the type alias version with an anonymous record wouldn't work: (the distinction being : vs :=)

Node(a) : { data: a, children: List(Node(a)) }

view this post on Zulip Anthony Bullard (Jun 04 2025 at 22:31):

i hope we have a really good error message for this

view this post on Zulip Anthony Bullard (Jun 04 2025 at 22:31):

this is going to feel surprising for functional devs

view this post on Zulip Richard Feldman (Jun 04 2025 at 22:32):

yeah error message can be very direct

view this post on Zulip Richard Feldman (Jun 04 2025 at 22:33):

like "hey you need to do it this other way"

view this post on Zulip Anthony Bullard (Jun 04 2025 at 22:35):

i meant that i think it should explain why. it seems arbitrary if you aren't used to thinking about memory layout

view this post on Zulip Ayaz Hafiz (Jun 04 2025 at 22:40):

there are good explanations without invoking memory layout. most ML family languages require names for types like these anyway

view this post on Zulip Anthony Bullard (Jun 04 2025 at 22:42):

Sure, but we aren't talking about anonymous records versus nominal. we are talking about named (via alias here) versus nominal. i think many will struggle to understand why one is boxed and one is not

view this post on Zulip Anthony Bullard (Jun 04 2025 at 22:43):

Even being called an alias, I think many will feel "Node(a) = ..." is naming a type

view this post on Zulip Richard Feldman (Jun 04 2025 at 23:03):

there's actually no difference in terms of boxing

view this post on Zulip Richard Feldman (Jun 04 2025 at 23:03):

I don't know if there's an explanation concise enough to fit in an error message :sweat_smile:

view this post on Zulip Richard Feldman (Jun 04 2025 at 23:04):

the reason for the design decision is so subtle that nobody even realized it was a problem until people had been making Roc programs for years

view this post on Zulip Ayaz Hafiz (Jun 04 2025 at 23:12):

the explanation is that Roc needs to know where the recursion happens and with anonymous and alias types the point is ambiguous. You need to know where recursion happens for efficiency reasons

view this post on Zulip Ayaz Hafiz (Jun 04 2025 at 23:13):

you could also break module isolation in implementation which solves this problem without the surface level change. but that is harder to implement

view this post on Zulip Ayaz Hafiz (Jun 04 2025 at 23:13):

it’s really an implementation constraint

view this post on Zulip Anthony Bullard (Jun 04 2025 at 23:15):

i guess i don't even understand it :rolling_on_the_floor_laughing::rolling_on_the_floor_laughing::rolling_on_the_floor_laughing:

view this post on Zulip Jared Ramirez (Jun 04 2025 at 23:36):

Richard Feldman said:

Node(a) := [Node({ data: a, children: List(Node(a)) })]

I think this would also work?

Node(a) := { data: a, children: List(Node(a)) }

node = @Node({ ... })

Since it's a List (which breaks recursion), and passes thru a nominal type

Or maybe this would've worked with opaque types in the old compiler, but will not with nominal types in the new one? I'm not clear on exactly how nominal types will work (eg how to construct them, with an @?)

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:03):

I think List(a) and Box(a) should be special exceptions that don't require nominal types.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:03):

they should work with simple aliases.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:05):

Given List(a) and Box(a) are guaranteed to break recursion, I see no reason they need nominal types.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:05):

If you see List(a) simply don't check for recursion on the a.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:06):

So to the recursion checker, these two should be equivalent:

Node(a) : { data: a, children: List(Node(a)) }
Node(a) : { data: a, children: List({}) }

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:31):

so the current planned design, which I'm already thinking should be tweaked slightly, is that only nominal tag unions are a thing

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:32):

the way those work is that I can use := but only if the thing right after it is a tag union, and then it becomes nominal

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:33):

and the way I refer to its variants are by fully-qualifiyng them with the name of the type, e.g. this would work:

ConsList(a) := [Nil, Node(ConsList(a))]

empty : ConsList(a)
empty = ConsList.Nil

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:42):

Brendan Hansknecht said:

So to the recursion checker, these two should be equivalent:

Node(a) : { data: a, children: List(Node(a)) }

Node(a) : { data: a, children: List({}) }

yeah, re-reading the RFC it seems like this is compatible with it?

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:43):

from the RFC:

Types must be equal, not just equivalent.

I think if the recursion in a structural type is only done through some combination of List and/or Box, then we wouldn't have the problem outlined in the RFC because reproducing that problem requires recursion through tag unions with equivalent types but different layouts

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:44):

but if a List and a Box are type-compatible, then they also have the same layout

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:44):

so unless I'm missing something (and again @Ayaz Hafiz would know better than I would!) I think it should work?

view this post on Zulip Richard Feldman (Jun 05 2025 at 00:44):

but then again I thought the original design would work, so... :joy:

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:44):

Richard Feldman said:

so the current planned design, which I'm already thinking should be tweaked slightly, is that only nominal tag unions are a thing

Would that hurt automatic error tag accumulation? Seems like one of the big roc features many people like

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 00:50):

No, it doesn't work - it either must be named or the implementation needs to break module isolation when computing the points of recursion

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:53):

Really, you couldn't just blackbox all list element types when doing recursion checking. Like literally treat all List(a) types as List({}). Cause from a recursion standpoint they are the same.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:54):

They are also the same from a layout stand point.

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 00:56):

thats not the problem though

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 00:57):

like you might have a larger type where there is an equivalent version where the recursion does not happen at List a (a being the anonymous type)

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 00:57):

then you are back to the same problem

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 00:58):

I don't follow. If we are only allowing recusion through nominal tags or List(a), where does this other recursion get introduced in the larger type?

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 01:01):

refer to this example in the document

Rec1 : [ From2 { from1 : <Rec1> }, End ]
Rec2 : { from1 : [ From2 <Rec2>, End ] }

now imagine this was

Rec1 : [ From2 { from1 : List(<Rec1>) }, End ]
Rec2 : List({ from1 : [ From2 <Rec2>, End ] })

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 01:04):

actually, that example you could maybe make work with a special case. but there are larger types where it doesn't.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 01:05):

Should be this?

Rec1 : [ From2 { from1 : List(<Rec1>) }, End ]
Rec2 : { from1 : List([ From2 <Rec2>, End ]) }

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 01:07):

yes

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 01:15):

I think this should work.

If only recursion through List(a), then the tag is simply a flat tag and payload with no pointers (except those in the list). As such { from1: List(Rec1) } and Rec2 will have the exact same layout.

If recursion happens through any other mechanism, we are still requiring the type to be nominal. As such, you won't be able to compare them anyway.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 01:17):

But maybe I am missing something that would relate to multi-recusion or nested recursion via List(a) and List(b) together.

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 01:19):

i considered this when i wrote this

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 01:19):

you can write out a bunch of increasing complicated examples and what they compile to and you will see it fails

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 01:20):

the only way i have found is the three options listed in the document. I am probably missing something, but it's also not entirely naive, and the List/Box thing doesn't solve it

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 01:21):

I trust you, I'm just quite surprised about List based recursion specifically. I totally understand from the recursive tag perspective that this needs one of your 3 solutions.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 01:44):

Actually, wait, as I think about this more, I am more convinced List(a) is fine. The fundamental problem discussed in the RFC is that tags, when generated in different context, can lead to different layouts. Thus, they need one of the suggested solutions. This is not true for List. No matter how a list is generated, it will always have the exact same layout.

So I feel pretty convinced that any structural type that only have recursions via lists should be fine.

For recursion via tags directly, we still need one of the 3 suggested solutions. And if you mix list recursion and direct tag recursion, you must follow tag recursion rules. So that again will use one of the 3 suggested solutions, but I think pure list recursion is a proper special case that is valid with structural types.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 01:46):

Box(a) is a bit more complicated than List(a) in terms of rules, but also could be special cased in a similar way. Box(a) though requires both avoiding any sort of direct tag based recursion and being in a tag with a base case (otherwise, it is an infinite type).

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 01:48):

I really just want Node a : { val: a, children: List(Node(a)) } to work as a structural type. Otherwise, I think it is bad UX for the language. We have seen a ton of beginners hit this over the years.

view this post on Zulip Ayaz Hafiz (Jun 05 2025 at 13:38):

i don't disagree with that, but the fundamental problem is that

Rec1 : [ From2 { from1 : List(<Rec1>) }, End ]
Rec2 : { from1 : List([ From2 <Rec2>, End ]) }

still have unequal layouts

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 14:24):

Due to starting at different points in the recursion chain, sure, but I don't see the problem with that. If you line up the layouts by wrapping Rec1, they become the exact same layout and are comparable.

{ from1: List(Rec1) } is equivalent to Rec2

And extracting the first element from the list in a Rec2 has the same layout as a Rec1.

Ao once lined up in the recursion chain, they will always have identical layout.

So what is the problem with that?

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 14:25):

Even if eventually recursively the same, they are not the same without some sort of wrapping or unwrapping. That seems fine.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 14:26):

To check for the ability to compare, both the starting point and the recursion point have to be the same.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 14:32):

In terms of your rfc, those two type are not equivalent structurally recursive types without either apply some sort of wrapping or unwrapping. Once that is applied, they are guaranteed to have equivalent layouts due to List always having the same layout and tags being sorted alphabetically.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 14:42):

Looking concrete at:

val1 : { from1 : List(Rec1) }
val1 = ...

val2 : Rec2
val2 = ...

expect val1 == val2

The layout of val1 is the same as a list (3 usizes)
The layout of val2 is the same as a list (3 usizes)

Both have a wrapper record field that does not affect layout.
Both lists contain a tag with 2 variants (From1 and End). Both tags have variant 0 containing a payout of a List (wrapped in a record). Both tags have an empty variant 1.

So when the starting point is aligned, the recursive layout is guaranteed the same when using Lists as the only recursion point.

view this post on Zulip Brendan Hansknecht (Jun 05 2025 at 14:49):

The important note here is that from the tags perspective, it is not really recursive at all. As such, there are no raw pointers introduced and the tag and payload are always stored inline.

view this post on Zulip Richard Feldman (Jun 06 2025 at 22:30):

hm, I think this case would be problematic?

view this post on Zulip Richard Feldman (Jun 06 2025 at 22:30):

in a nominal type that couldn't happen bc they can't grow

view this post on Zulip Brendan Hansknecht (Jun 07 2025 at 03:25):

Is that an open record specific problem?

view this post on Zulip Brendan Hansknecht (Jun 07 2025 at 03:26):

Isnt it the same if I make some silly function that gets a value from a list and the passes it into the same function. So infinitely deep lists?

view this post on Zulip Brendan Hansknecht (Jun 07 2025 at 03:27):

Also, multiple recursion points is fine. They just all have to be through list.


Last updated: Jul 06 2025 at 12:14 UTC