Stream: ideas

Topic: exhaustive record destructuring


view this post on Zulip Richard Feldman (Dec 16 2023 at 23:35):

in Rust, if I destructure a struct, by default I get an error if I leave off any fields. I can opt out of that error by adding .. as one of the fields, at which point the destructure works like how Roc's record destructures work.

I've found this to be an annoying default, but I do occasionally want it. For example, I'm about to write a function that returns the total length in bytes of all the fields in a struct (they're all collections) so I can know how much space to preallocate for copying them. In that case, the exhaustive destructuring is a nice way to make sure that if I add a new field, I don't forget to add it to that calculation.
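For readers less familiar with the Rust behavior described above, here is a minimal sketch. The `Buffers` struct and its field names are hypothetical, chosen only to mirror the "total length of all fields" use case; they do not come from the original post.

```rust
// Hypothetical struct for illustration; every field is a collection.
struct Buffers {
    names: Vec<u8>,
    values: Vec<u8>,
    comments: String,
}

/// Total length in bytes across all fields.
fn total_len(buffers: &Buffers) -> usize {
    // No `..` in the pattern, so adding a field to `Buffers` makes this
    // line a compile error until the new field is handled here.
    let Buffers { names, values, comments } = buffers;
    names.len() + values.len() + comments.len()
}

/// Opting out with `..`: fields that are not listed are ignored.
fn names_len(buffers: &Buffers) -> usize {
    let Buffers { names, .. } = buffers;
    names.len()
}

fn main() {
    let b = Buffers {
        names: vec![1, 2, 3],
        values: vec![4, 5],
        comments: String::from("hi"),
    };
    println!("{}", total_len(&b)); // prints 7
    println!("{}", names_len(&b)); // prints 3
}
```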

view this post on Zulip Richard Feldman (Dec 16 2023 at 23:35):

so here's an idea: what if we made this opt-in in Roc?

view this post on Zulip Richard Feldman (Dec 16 2023 at 23:35):

something like this:

{ foo, bar, ..{} }

view this post on Zulip Richard Feldman (Dec 16 2023 at 23:36):

with the ..{} meaning "...and nothing else"

view this post on Zulip David Mell (Dec 16 2023 at 23:45):

Would this also work with tuples? (I guess the syntax would be ..() in that case.)

view this post on Zulip Agus Zubiaga (Dec 17 2023 at 00:08):

Yes! This would be great. Part of me wants it to be opt-out like in Rust, but I’m happy to get it either way :big_smile:

view this post on Zulip Agus Zubiaga (Dec 17 2023 at 00:17):

Here is an example of a function where I need to do something with all the fields of a record

view this post on Zulip Agus Zubiaga (Dec 17 2023 at 00:19):

I’d also use it in a lot of other situations where I’m likely to have to add code if I add a new field, even if I don’t need all of them

view this post on Zulip Richard Feldman (Dec 17 2023 at 00:20):

the thing is, in Rust I am writing .. almost always, like 95% of the time - it's really annoying :sweat_smile:

view this post on Zulip Agus Zubiaga (Dec 17 2023 at 00:23):

Yeah, I get it. When it’s manageable, I like to do { usedField, unusedField: _ } instead of .. because it gives me peace of mind to know the compiler will let me know if I add a new field. I know it’s not for everyone, though :grinning:
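A small Rust sketch of the `field: _` style mentioned above, assuming a hypothetical `Config` struct (not from the thread). Each unused field is ignored by name, so the pattern stays exhaustive and a newly added field still triggers a compile error.

```rust
// Hypothetical struct for illustration.
struct Config {
    host: String,
    port: u16,
    verbose: bool,
}

fn describe(config: &Config) -> String {
    // `verbose: _` names the field but binds nothing. Unlike `..`, this
    // pattern stops compiling if `Config` gains another field.
    let Config { host, port, verbose: _ } = config;
    format!("{host}:{port}")
}

fn main() {
    let config = Config {
        host: String::from("localhost"),
        port: 8080,
        verbose: true,
    };
    println!("{}", describe(&config)); // prints localhost:8080
}
```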

view this post on Zulip Agus Zubiaga (Dec 17 2023 at 00:34):

It depends a lot on the use case. You’re probably right about the best default.

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 01:21):

I feel like reversing it and having a similar syntax would look confusing

view this post on Zulip Brendan Hansknecht (Dec 17 2023 at 01:21):

With a different syntax, maybe

view this post on Zulip Pearce Keesling (Dec 17 2023 at 04:28):

I think defaulting to non exhaustive makes more sense in roc where type inference and structural typing are the norm. In rust since it is all concrete types you know exactly what fields to expect in a struct and it is unlikely to grow/shrink as much up the stack

view this post on Zulip Eli Dowling (Dec 18 2023 at 06:14):

How about a different assignment operator?
Eg:

{a,b} :=  {a:1,b:2,c:3} #error record not fully destructured
{a,b} =  {a:1,b:2,c:3} #all good

But then it wouldn't work inside pattern matching.

If you did go for a symbol, I'd definitely prefer a simpler symbol. I think adding another record like syntax into the mix is confusing, and looks a bit weird inside tuples
Eg:

{a,b,!}

Because "!" generally means "not" or negation
Or

{a,b,~}

Because it's visually very clear and isn't used elsewhere.

Basically the symbol would be the opposite of "_" instead of "and then the rest" it means "and there is no more" I think it's quite intuitive actually.

view this post on Zulip Anton (Dec 18 2023 at 09:49):

I like words for their clarity and searchability, my suggested keywords:

view this post on Zulip Anton (Dec 18 2023 at 09:49):

nothingElse

view this post on Zulip Anton (Dec 18 2023 at 09:49):

nothing-else

view this post on Zulip Anton (Dec 18 2023 at 09:50):

It's very unlikely that you want a variable named nothingElse :p so it's fine in that respect

view this post on Zulip Kevin Gillette (Dec 22 2023 at 22:20):

I think keywords are more elegant when they're uniform-case and avoid underscores.

nothing-else looks a lot better to me than nothingElse or nothing_else, but it's still suspect. If else weren't already a keyword, then nothing-else would be precariously similar to a subtraction expression.

view this post on Zulip Kevin Gillette (Dec 22 2023 at 22:26):

I like the suggestion of ! the most so far since it has well understood meaning and the behavior has a decent chance of being inferred by the reader even if they don't know about that feature.

Without explanation, I'd have no idea what ..{} means. {} as a type param to close a record is perhaps self-consistent, but imo not all that intuitive, so I don't believe we should expand use of that syntax into more areas.

view this post on Zulip Richard Feldman (Jan 02 2025 at 22:10):

one potentially interesting design: we could make it so that structural record destructures are non-exhaustive, but custom record destructures work the way they do in Rust

view this post on Zulip Richard Feldman (Jan 02 2025 at 22:11):

so for example:

{ x, y } = # not exhaustive, like today
Point.{ x, y } = # exhaustive, like Rust
Point.{ x, y, .. } = # not exhaustive, like Rust

view this post on Zulip Richard Feldman (Jan 02 2025 at 22:13):

then if desired, we could do something like this for exhaustive structural records:

{ x, y, ..{} } = # exhaustive

view this post on Zulip Richard Feldman (Jan 02 2025 at 22:13):

which doesn't look the prettiest, but also seems like it would be extremely rare to want in practice

view this post on Zulip Anthony Bullard (Jan 02 2025 at 22:46):

Why not just use an identifier for the "rest", and use _ if you don't care

view this post on Zulip Richard Feldman (Jan 02 2025 at 22:50):

could work!

view this post on Zulip Kilian Vounckx (Jan 03 2025 at 07:36):

Like { x, y, ..rest }?

That looks kinda weird with underscore in my opinion: { x, y, .._ }.

But just the underscore looks okay to me: { x, y, _ }.

But do exhaustive structural records really come up?

view this post on Zulip Dawid Danieluk (Jan 16 2025 at 15:50):

Another idea, use ellipsis ... operator (there were some discussions about having it right?).
Want to use rest? { x, y, ..rest }
Don't care about it? { x, y, ... }

If ... would be introduced then I think it'd be pretty nice usage of it (as in "i don't care about the 'rest' right now") so it's similar conceptually to todo!() with nice side effect that changing ... into ..rest requires fewer keystrokes and looks similar.

It doesn't introduce new concepts and would reuse something already in the language (assuming that ellipsis will be added).

view this post on Zulip Sam Mohr (Jan 16 2025 at 18:13):

We already plan on having { x, y, .. } meaning open record and { x, y } meaning closed record

view this post on Zulip Anthony Bullard (Jan 16 2025 at 19:13):

I think Dawid is talking about taking the "rest" of the open record and doing something with it

view this post on Zulip Anthony Bullard (Jan 16 2025 at 19:14):

For some function like: { a : Str, b : Str, }a -> (a, Str)

view this post on Zulip Sam Mohr (Jan 16 2025 at 19:28):

Ellipsis will definitely be added

view this post on Zulip Sam Mohr (Jan 16 2025 at 19:28):

Anthony, I don't understand how this would help that. Could you give an example?

view this post on Zulip Sam Mohr (Jan 16 2025 at 19:29):

Also, I think supporting .. and ... in the same location could lead to some very surprising code breaks

view this post on Zulip Sam Mohr (Jan 16 2025 at 19:30):

Though hopefully the presence of a warning saying "you wrote a ..., remove it eventually" would help

view this post on Zulip Sam Mohr (Jan 16 2025 at 19:30):

As is the plan for all ellipses

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:29):

I think the idea is, ... means "there might be other stuff here, but I don't care about it", and ..<IDENT> means "there might be other stuff, and if so, put that other stuff into a record and assign it to the variable IDENT".

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:30):

That's how .. is supposed to work.

{ x, y, .. } = { x: 123, y: 456, z: 789, foo: "bar" }

z and foo are dropped here

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:31):

I don't think we would support ... and .. unqualified

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:32):

Yes, but:

{ x, y, ..rest } = { x: 123, y: 456, z: 789, foo: "bar" }
expect x == 123
expect y == 456
expect rest == { z: 789, foo: "bar" }

In this proposal

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:32):

And if you don't care about rest, you would use

{ x, y, ... } = { x: 123, y: 456, z: 789, foo: "bar" }

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:33):

Oh, yeah, the current intent is

{ x, y, ..rest } = { x: 123, y: 456, z: 789, foo: "bar" }
expect x == 123
expect y == 456
expect rest == { x: 123, y: 456, z: 789, foo: "bar" }

We don't want to have rest only contain the uncaptured fields because that requires us to create a new record, which is inefficient if done a lot
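The two candidate meanings of `..rest` discussed here can be sketched in Rust. This is a model only: Rust has no rest-capture in struct patterns, so both variants are written by hand, and the `Full` and `Rest` types and both function names are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Full {
    x: i64,
    y: i64,
    z: i64,
    foo: String,
}

// Only the fields that the pattern did not capture.
#[derive(Debug, PartialEq)]
struct Rest {
    z: i64,
    foo: String,
}

// Meaning A: `rest` holds only the uncaptured fields, so a new,
// smaller value has to be built from the original.
fn split_off_rest(full: Full) -> (i64, i64, Rest) {
    let Full { x, y, z, foo } = full;
    (x, y, Rest { z, foo })
}

// Meaning B: `rest` refers to the whole original record, so no new
// value is built.
fn keep_whole(full: &Full) -> (i64, i64, &Full) {
    (full.x, full.y, full)
}

fn main() {
    let record = Full { x: 123, y: 456, z: 789, foo: String::from("bar") };
    let (x, y, whole) = keep_whole(&record);
    println!("{x} {y} {whole:?}");
    let (x, y, rest) = split_off_rest(record);
    println!("{x} {y} {rest:?}");
}
```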

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:33):

But that doesn't really jibe with how similar features (in the few languages that have it) work

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:34):

Anthony Bullard said:

And if you don't care about rest, you would use

{ x, y, ... } = { x: 123, y: 456, z: 789, foo: "bar" }

I don't understand why { x, y, .. } doesn't work here

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:34):

And it doesn't really make sense

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:34):

I think .. could work

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:34):

If we treat it like _ today

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:34):

Where the IDENT is optional

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:35):

But I would be INCREDIBLY surprised that ..rest didn't only give me back a new struct with the uncaptured fields

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:36):

First of all, only a crazy person is doing:

{ x, y, ..rest } = { x: 123, y: 456, z: 789, foo: "bar" }
expect x == 123
expect y == 456
expect rest == { x: 123, y: 456, z: 789, foo: "bar" }

They are doing

{ x, y, ..rest } = some_func()
expect x == 123
expect y == 456
expect rest == { z: 789, foo: "bar" }

More than likely

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:36):

You're right that it's different for us to have rest capture everything (a.k.a. be a reference to the original record), but if the default from other languages makes it easy to write inefficient code, we should help people write more efficient code

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:37):

I think it is obviously inefficient

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:37):

I don't think it is obvious

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:37):

I mean it's a new stack allocated struct

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:37):

It's obvious to someone that knows what's running

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:37):

So it's not SUPER inefficient

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:38):

But Roc will be used by people that don't know what a stack or a heap are

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:38):

Creating a new string isn't efficient

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:39):

So if I call Str.split_first I have to understand there's a new stack allocated struct (tuple), and probably two new heap allocated strings

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:39):

Okay, just to make sure we're on the same page, if { x, y, ..rest } returned only the other fields into rest, you think that .. would handle what we want instead of ...?

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:39):

Even if the strings are interned

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:39):

Yes, I don't think we need ...

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:39):

I think above Richard thought we might need to have .._ for ignoring the rest

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:40):

There aren't heap-allocated strings, I think. We should just take references to the slices we want

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:40):

And that is horrible

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:40):

So ... seems more reasonable

view this post on Zulip Richard Feldman (Jan 16 2025 at 22:40):

Anthony Bullard said:

So if I call Str.split_first I have to understand there's a new stack allocated struct (tuple), and probably two new heap allocated strings

both strings actually share references to the original allocation, so that one happens to be efficient :big_smile:

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:40):

But I don't think so

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:40):

Richard Feldman said:

Anthony Bullard said:

So if I call Str.split_first I have to understand there's a new stack allocated struct (tuple), and probably two new heap allocated strings

both strings actually share references to the original allocation, so that one happens to be efficient :big_smile:

Really? We just create a new seamless slice over the original? Basically a new view?

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:41):

If so, bravo that's awesome
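As an analogy only (this does not show Roc's implementation), Rust's `str::split_once` also returns views into the original string rather than new allocations. The sketch below checks that with pointer ranges; the helper name `points_into` is made up for this example.

```rust
/// True if `part` starts inside the memory occupied by `original`.
fn points_into(original: &str, part: &str) -> bool {
    let range = original.as_bytes().as_ptr_range();
    range.contains(&part.as_ptr())
}

fn main() {
    let original = String::from("key=value");
    // Both halves are slices borrowed from `original`; nothing is copied.
    let (before, after) = original.split_once('=').unwrap();
    println!("{before} {after}");
    println!("{}", points_into(&original, before));
    println!("{}", points_into(&original, after));
}
```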

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:41):

If you wanted to make sure that rest was empty, you could do { x, y, ..{} }, but { x, y } does that, so no need for .._ or ..{}

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:41):

Yeah, I agree

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:42):

(At some point I need to read the zig runtime code)

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:43):

Another reason I think we shouldn't do ... here is because I think it should unambiguously refer to "code I haven't written yet"

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:43):

And this would overload it with another meaning

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:43):

Namely "values I'm discarding"

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:43):

Oh yeah, I forgot about the ... TODO thing you proposed

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:44):

Yeah, I want to keep that design space clear for you

view this post on Zulip Anthony Bullard (Jan 16 2025 at 22:44):

Did you ever create an issue for it?

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:45):

Yep: https://github.com/roc-lang/roc/issues/7440

view this post on Zulip Sam Mohr (Jan 16 2025 at 22:45):

I'm trying to keep all syntax change issues in the pinned syntax change issue

view this post on Zulip Brendan Hansknecht (Jan 17 2025 at 20:39):

You're right that it's different for us to have rest capture everything (a.k.a. be a reference to the original record), but if the default from other languages makes it easy to write inefficient code, we should help people write more efficient code

I was the original person to really push against struct mutation syntaxes (that actually change the fields contained) because they are inefficient. I honestly think that was likely a mistake at this point. Except for very large structs or very hot loops, no programmer will actually care about the efficiency here. Same with tags conversions from one union to another.

I'm not saying we should just enable these things, but I really think we have started to weigh them too heavily. Yes, small hits across an entire program can really hurt. Same in a hot loop. But being able to just use the language and get things done is more important to most people.

The big question in my mind is when a user has completed their app and goes to optimize, will they feel that they need to remove the feature in many locations or that using it was a mistake?

For most of these features, I think that almost always the user will not care and these features are unlikely to be at the bottleneck. If they are a bottleneck it will be in a few limited hot loops.

That said, still really hard to gauge. A single large struct destructuring might lead to tons of data copying due to destructuring and a ton of extra refcount updates. So it could be very heavy.

view this post on Zulip Brendan Hansknecht (Jan 17 2025 at 20:41):

At the same time, most low level languages will never hit this class of issues due to only having nominal types. It is structural types that specifically opt into these kinds of questions and problems.

view this post on Zulip Brendan Hansknecht (Jan 17 2025 at 20:42):

Really hard to pick a balance cause roc is trying to live in two different worlds. One where the feature is a no brainer and another where the feature is questionable at best.

view this post on Zulip Richard Feldman (Jan 17 2025 at 21:25):

I definitely think it's most likely to be fine

view this post on Zulip Richard Feldman (Jan 17 2025 at 21:25):

another factor is that LLVM in a lot of cases may end up breaking up the structs anyway

view this post on Zulip Richard Feldman (Jan 17 2025 at 21:25):

like it's not as if we are definitely going to end up making an actual whole new struct, after all the optimization passes have happened

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:27):

another factor is that LLVM in a lot of cases may end up breaking up the structs anyway

I'm not sure how likely this is in practice, but yeah, theoretically a completely local struct could be split into many separate variables. I'm not sure I've ever seen it in our optimized IR though.

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:29):

if we are definitely going to end up making an actual whole new struct

I think in practice we will, almost all the time.

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:30):

but also, what is the actual cost? It is just moving a handful of bytes from one stack offset to another

view this post on Zulip Sam Mohr (Jan 18 2025 at 00:41):

I think the perf cost is not really a thing, I'm more worried about the specialization cost (more records, longer compile times). But that also should be negligible

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:41):

Oh, actually, I think I am wrong here and the promotion happens more often than I realize. It just does it in a weird way that still leaves around a lot of allocas even if structs are broken up and mostly treated as scalars.

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:41):

Looking at rocci-bird llvm ir right now to try and gauge this a bit better

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:42):

At a minimum, structs that are local to a function will be split into n alloca instructions for each field. That should make all of this data movement free as long as we don't cross the function boundary.

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:42):

That said, sometimes llvm gets confused by data movement that involves pointers and allocas.

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:43):

Not the same for tags though. They are opaque to llvm due to being unions and a major cause of most alloca and data movement that sticks around.

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:47):

I'm more worried about the specialization cost (more records, longer compile times). But that also should be negligible

I think generally that won't be an issue. You likely will just get one specialization to a function. The only special case will be if a record is open going into a function and then you return rest still leaving it open. That will then specialize per record type passed in.

Like

fn : {a : Str, ..rest } -> { ..rest }
fn = \{a, ..rest} ->
    dbg a
    rest

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:49):

Though I guess any function that takes an open record (which is all of them) is already susceptible to this. So no real change.

This will actually specialize just as much as the function above. For every different shaped record passed in, it is a new specialization.

fn : {a : Str} -> Str
fn = \{a} ->
    a
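A rough Rust analogy for the per-shape specialization described above, assuming a hypothetical `HasA` trait to stand in for "any record with an `a : Str` field". Rust compiles a separate copy of a generic function for each concrete type it is called with, which is similar in spirit; how Roc actually specializes is not shown here.

```rust
// Hypothetical trait standing in for "a record that has an `a` field".
trait HasA {
    fn a(&self) -> &str;
}

struct Small {
    a: String,
}

#[allow(dead_code)] // `b` and `c` exist only to give this type a different shape
struct Large {
    a: String,
    b: u32,
    c: bool,
}

impl HasA for Small {
    fn a(&self) -> &str {
        &self.a
    }
}

impl HasA for Large {
    fn a(&self) -> &str {
        &self.a
    }
}

// One source-level function. The compiler emits one specialized copy per
// concrete `R` used at a call site (here: `Small` and `Large`).
fn get_a<R: HasA>(record: &R) -> &str {
    record.a()
}

fn main() {
    let small = Small { a: String::from("s") };
    let large = Large { a: String::from("l"), b: 1, c: true };
    println!("{} {}", get_a(&small), get_a(&large));
}
```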

view this post on Zulip Sam Mohr (Jan 18 2025 at 00:50):

Yeah, makes sense

view this post on Zulip Sam Mohr (Jan 18 2025 at 00:50):

Okay, well, then sounds like we should go for this?

view this post on Zulip Brendan Hansknecht (Jan 18 2025 at 00:53):

Might even be worth reconsidering record update to allow adding fields (though that has more weird consequences about exactly how it will work)

view this post on Zulip Richard Feldman (Jan 18 2025 at 01:06):

yeah in general I think we can plan to revisit record features sometime after 0.1.0

view this post on Zulip Richard Feldman (Jan 18 2025 at 01:07):

definitely not urgent and they can all be nonbreaking changes as long as we've already switched to the .. syntax

view this post on Zulip Sam Mohr (Jan 18 2025 at 01:09):

Yeah, that last part is the thing I'm gonna try to fix this weekend as the last syntax push. I think { x, y } now being closed instead of open ({ x, y, .. } is now open) could break stuff, so it'd be nice to get that in

view this post on Zulip Sam Mohr (Jan 18 2025 at 01:09):

Otherwise, sure, let's revisit in the future


Last updated: Jun 16 2026 at 16:19 UTC