Stream: ideas

Topic: ✔ Should type annos use ::


view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:17):

You should probably scroll near the bottom for the actual meat of this convo, which started off talking about parser issues and a different solution

A similar issue has reared its head, but now it is with type annos as the first statement. I even left a comment about it in the parser

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:20):

One fix would be to require that a (new) token like NoSpaceOpColon be used for record fields (and forbid it in type annotation statements). Without doing that, the only way to tell that an Expr introduced by an OpenCurly is a block (versus a record) is to have a backtracking function that tries to parse a record to completion (without saving the nodes) if we see a LowerIdent followed by an OpColon
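
For readers who want to see the backtracking idea concretely, here is a minimal Zig sketch. Everything in it is an assumption made for illustration: the `Token` kinds, the `Parser` struct, and `looksLikeRecordField` are hypothetical and much simpler than the real Roc tokenizer and parser. It only shows the mechanic of scanning ahead without saving nodes and then restoring the position.

```zig
const std = @import("std");

// Hypothetical, simplified token kinds; names loosely follow the thread.
const Token = enum { open_curly, close_curly, lower_ident, op_colon, op_assign, comma, other };

const Parser = struct {
    tokens: []const Token,
    pos: usize = 0,

    /// Speculatively scan `ident : <stuff>` and report whether it ends the way
    /// a record field must (with `,` or `}` at nesting depth 0). No nodes are
    /// saved and the position is always restored, so the caller has to parse
    /// the same tokens again afterwards.
    fn looksLikeRecordField(self: *Parser) bool {
        const saved = self.pos;
        defer self.pos = saved; // backtrack unconditionally

        if (!self.eat(.lower_ident) or !self.eat(.op_colon)) return false;

        var depth: usize = 0;
        while (self.pos < self.tokens.len) : (self.pos += 1) {
            switch (self.tokens[self.pos]) {
                .open_curly => depth += 1,
                .close_curly => {
                    if (depth == 0) return true;
                    depth -= 1;
                },
                .comma => if (depth == 0) return true,
                // A second `:` or an `=` at depth 0 means a new statement began,
                // so the first line was a type annotation inside a block.
                .op_colon, .op_assign => if (depth == 0) return false,
                .lower_ident, .other => {},
            }
        }
        return false;
    }

    fn eat(self: *Parser, expected: Token) bool {
        if (self.pos < self.tokens.len and self.tokens[self.pos] == expected) {
            self.pos += 1;
            return true;
        }
        return false;
    }
};
```

The scan may walk over every token of a large annotation before it can answer, and all of that work is thrown away, which is the cost being discussed in this thread.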

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:22):

Because today, this:

foo = |x| {
    something : a
}

Should be valid, parseable code - but it is unclear _what_ it should be. Do we have a function that returns a record with a something field with the value of a? Or a block with a single, useless type annotation statement in it?

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:25):

But with my suggestion the above would always be the latter and:

foo = |x| {
    something: a
}

Would always be the former

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:25):

But it does come at a cost of flexibility

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:26):

Another, worse idea would be that Records require a sigil before the OpenCurly (like Zig). I don't condone or promote that idea at all, but it is a valid solution.

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:28):

So TL;DR:

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:33):

I recall, we had a subtle thing with formatting where the types would have a space, but the values wouldn't.

my_record : {
    name : Str,
    age : U64,
}
my_record = {
    name: "foo",
    age: 42,
}

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:34):

Yep, and inside of an Anno or Pattern Record Field it's not really a problem to parse. It is only the one that separates the header from the anno

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:34):

So we would need to throw an error on:

my_record: {
    name : Str,
    age : U64,
}

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:35):

But

my_record : {
    name: Str,
    age: U64,
}

Is fine to parse

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:36):

But I know that doesn't have design consistency with say = in decls

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:36):

Which need not worry about whitespace on either side

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:37):

It also disallows comments after the header and before the OpColon

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:38):

But I think it's better than

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:38):

I'm not opposed to the sigil idea tbh

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:39):

There is plenty of prior art for that, but that's a big change for an important data structure

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:39):

I don't like the idea of introducing whitespace significance (after we just removed it)

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:40):

Dude

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:40):

Also there is now a very subtle difference between OpColon and NoSpaceOpColon

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:40):

The most obvious, least offensive move would be to have type annos use ::

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:41):

Like Haskell

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:41):

OpDoubleColon

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:43):

I haven't spent a lot of time writing the new syntax but while writing these snapshots for records... I have been finding the "is this a block or a record" a little confusing

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:48):

Anthony Bullard said:

Because today, this:

foo = |x| {
    something : a
}

Should be valid, parseable code - but it is unclear _what_ it should be. Do we have a function that returns a record with a something field with the value of a? Or a block with a single, useless type annotation statement in it?

Maybe we should just make a rule that this parses one way or the other. How big of an issue would it be I wonder

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:49):

Changing the syntax would be a pretty major downside. Making the parser slightly less performant or tolerant might be acceptable.

view this post on Zulip Luke Boswell (Jun 26 2025 at 22:51):

the only way to tell that a Expr introduced by an OpenCurly is a block (versus a record) is to have a backtracking function that tries to parse a record to completion (without saving the nodes) if we see a LowerIdent followed by a OpColon

How bad is backtracking in this situation?

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:55):

it depends on the annotation

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:55):

could be very bad in situations with a big annotation

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:56):

but i think the confusion is a bigger deal

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:56):

we started using : in annos when we didn't have {} delimited blocks

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:58):

this would be very clear :

foo = |x| {
    something :: a
}

that's a block and either

foo = |x| {
    something : a
}

or

foo = |x| {
    something: a
}

would be a record and there is NO room for confusion by the actual human

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:59):

and the latter would be the canonical formatting for records

view this post on Zulip Anthony Bullard (Jun 26 2025 at 22:59):

And there is no change needed in a TypeRecordField

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:03):

so

my_record : {
    name: Str,
    age: U64,
}

would just be

my_record :: {
    name: Str,
    age: U64,
}

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:04):

It shifts the annotation by one character in single line form

my_record :: { name : Str, age : U64 }
my_record = { name: "john", age: 64 }

my_list :: List(Str)
my_list = [ "one", "two", "three"]

my_int :: U64
my_int = 42

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:07):

that's true, but hasn't bothered Haskell users for the past 27 years :rolling_on_the_floor_laughing:

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:08):

We're not using :: anywhere else right?

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:10):

Wait... does the :: only go in the declaration part, not inside the record??

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:11):

yes

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:11):

the record could use the same as in pattern or expr record

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:11):

see above sample

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:12):

here's some haskell for example

main = do
    str <- getContents
    let rna :: [RNA]
        rna = map (\c -> read [c]) str

    let aminoAcids :: [AminoAcid]
        aminoAcids = decodeAll rna

    putStrLn (concatMap show aminoAcids)

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:14):

I'm trying to think of a valid example where we have a lower ident after the colon. Is this the only way?

foo : List(a) -> U64
foo = |x| {

    something : a # refers to the `a` from the foo annotation
    something = x

    x.len()
}

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:14):

if you are on desktop and can do it, could you move all of this from https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/Needed.20Function.20signature.20and.20lambda.20expr.20change/near/525987327 to a new topic here in ideas?

something like "Move type annotations to use :: between header and type"

view this post on Zulip Notification Bot (Jun 26 2025 at 23:15):

45 messages were moved here from #ideas > Needed Function signature and lambda expr change by Luke Boswell.

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:15):

thanks @Luke Boswell

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:16):

the problem isn't only lower ident after the colon, it's any token that could start an expr

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:16):

and you could go quite a ways before you realized this isn't a record field

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:17):

At the end of the day, this is Richard's decision but i'm glad we laid out the scenario for everyone and him

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:18):

I could change the formatter to format code like this and format a big module with lots of annotations

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:19):

to get a good sample

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:21):

(deleted) to avoid confusion

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:23):

i'm not sure about the last one

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:25):

I guess the proposal here is only changing the type anno.

# Type Annotation
foo :: List(a) -> U64
foo = ...

# Nominal Type Declaration
Foo(a) := [Good(a), Bad]

# Alias Type Declaration
Foo(a) : [Good(a), Bad]

My line of thinking was that these all look nice

(edited to clarify wording of statements)

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:29):

i prefer this

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:29):

type aliases don't really cause confusion

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:30):

createUser :: UserId, UserName, UserAge -> User
createUser = |id, name, age| { id, name, age }

getUserName :: User -> UserName
getUserName = |user| user.name

main! = |_| {

    user :: User
    user = createUser(123, "Alice", 25)

    getUserName(user)
}

Yeah, I've been working through examples... and it's definitely growing on me too.

It feels much clearer and visually distinct where we add type annotations.

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:31):

i'd love to see what someone like @Niclas Ahden thinks about it, as an actual Roc practitioner

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:33):

I've gone through most of our snapshot examples and in every case I think the :: is an improvement.

view this post on Zulip Anthony Bullard (Jun 26 2025 at 23:34):

i hope Richard feels the same

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:35):

I like that it visually distinguishes the annotations from the aliases (further than just upper/lower case idents)

view this post on Zulip Luke Boswell (Jun 26 2025 at 23:57):

Here's my attempt at a summary

So the only downsides I can think of are;

foo :: U64
foo = 42

bar :: { age : U8 }
bar = { age: 42 }

The upsides I can think of are;

view this post on Zulip Luke Boswell (Jun 27 2025 at 00:01):

Also type annotations are always optional, so the :: isn't required on everything, so when it is included it stands out. This feels appropriate because the author has deliberately inserted an annotation.

view this post on Zulip Anthony Bullard (Jun 27 2025 at 00:20):

i love this summary and i'll let it stand. hopefully someone else will step in and speak up

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:08):

Anthony Bullard said:

Because today, this:

foo = |x| {
    something : a
}

Should be valid, parseable code - but it is unclear _what_ it should be. Do we have a function that returns a record with a something field with the value of a? Or a block with a single, useless type annotation statement in it?

that would be a block with no expression at the end, which isn't allowed, right?

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:08):

so I think a record would be the only valid way to parse it

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:09):

a block can have one or more statements in parsing

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:09):

in Can a block needs a trailing expr

view this post on Zulip Luke Boswell (Jun 27 2025 at 01:10):

Isn't the issue that there is an unbounded amount you would have to parse ahead before you could know it's not a record.

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:10):

ah I see

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:11):

Luke Boswell said:

Isn't the issue that there is an unbounded amount you would have to parse ahead before you could know it's not a record.

this is the issue for the Parser. the thing Richard is picking on is more about how a human seeing code would parse it

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:12):

so there is two distinct issues and i think both are solved with annos using ::

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:13):

I strongly don't want to do :: so I'd very much prefer to explore alternatives :smile:

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:13):

i was worried you'd say that

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:13):

is there a reason, or just aesthetics

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:13):

Or Haskell PTSD? :rolling_on_the_floor_laughing:

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:14):

I guess we can go on a brief tangent about that :laughing:

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:14):

so basically every language except the Haskell family uses : over :: for types (assuming they use one or the other)

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:14):

the other ML family languages used :: for cons

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:14):

Anthony Bullard said:

But I think it's better than

I think these were the options i found

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:15):

we don't have to backtrack, there's another fix

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:15):

the Haskell committee reversed it and used : for cons and :: for types because they thought that people were going to be using cons way more often than type annotations so it would be a nice ergonomics improvement

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:15):

obviously that decision did not age well

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:16):

every other language except for Haskell and direct descendants of Haskell (PureScript comes to mind) either always used : for types or went back to it (e.g. Idris and Elm came after Haskell and went with : for types)

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:17):

separately, it's super common in modern languages to use : without a space for types, e.g. foo: bar

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:17):

we use a space, which is already a little bit weird; adding a second : just does not seem like a justifiable use of weirdness budget to solve a parsing edge case that can be solved in another way

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:18):

yeah so the other option i found which i know you want is inline annos

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:18):

but what's this alternative to backtracking

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:18):

so this is an exploratory idea for sure, but I've been thinking about it for awhile and I haven't been able to come up with a reason that it wouldn't work

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:18):

i know we could share nodes between exprs and patterns

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:18):

exactly

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:19):

they have total overlap in terms of syntax, and the only things that are invalid in one but not the other can be checked during canonicalization

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:19):

yeah but do we just bail on the pattern parsing as soon as we find something that couldn't be a pattern?

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:19):

nh

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:19):

*nah

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:20):

well, Alternatives and as are not part of expr

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:20):

the idea would be that the parser is just concerned with the structure

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:20):

right, but canonicalization could give an error for that just as easily

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:20):

one of the reasons to do the design would be that it could make formatting faster because the parser can do less work

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:20):

by deferring some of the checks that normally happen during parsing to canonicalization

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:21):

so parsing becomes about turning tokens into a valid "shape" but not about deciding which things are patterns vs expressions vs record fields vs type annotations etc.

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:21):

that becomes canonicalization's job, and the job becomes easier because canonicalization has a more complete picture to work with

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:22):

ok

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:22):

Does that mean we need to do Can before we format then?

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:22):

I don't think so

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:22):

I can't think of a situation where it would matter :thinking:

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:22):

Or am I going to just blow up the formatter when I get a pattern in the middle of formatting an expr?

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:22):

Ok, maybe that's true

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:22):

I'm down to try

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:22):

do we format patterns and exprs differently?

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:23):

I don't think we do, but I could be missing something :smile:

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:23):

pretty sure they just follow the same rules

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:23):

They are different today

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:23):

interesting!

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:23):

Though this is a type

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:23):

Different functions

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:23):

ah so they're separate just because the types are different

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:23):

I think they are very similar

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:23):

Yes

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:23):

gotcha

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:24):

But the thing we are talking about in this topic is about a TYPE

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:24):

if we're going to try combining them, there's another cool Zig thing we can try - a technique I liked in Layout

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:25):

I can only tell this is NOT a record by trying to parse at least one record field

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:25):

Which depending on the type of annotation, could be a large number of tokens

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:25):

I can talk more in like an hour, putting kids to bed

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:26):

layout's tag
layout's union

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:28):

so the basic idea is that this is similar to a Zig tagged union, except that you can store all the tags, and then separately store all the unions - like what we do with lhs and rhs right now, except that you get to specify that lhs and rhs are unions

code example: https://github.com/roc-lang/roc/blob/main/src/layout/store.zig#L209-L220

return switch (layout.tag) {
    .scalar => switch (layout.data.scalar.tag) {
        .int => layout.data.scalar.data.int.size(),
        .frac => layout.data.scalar.data.frac.size(),
        .bool => 1, // bool is 1 byte
        .str, .opaque_ptr => target_usize.size(), // str and opaque_ptr are pointer-sized
    },
    .box, .box_of_zst => target_usize.size(), // a Box is just a pointer to refcounted memory
    .list, .list_of_zst => target_usize.size(), // TODO: get this from RocStr.zig and RocList.zig
    .record => self.record_data.get(@enumFromInt(layout.data.record.idx.int_idx)).size,
    .tuple => self.tuple_data.get(@enumFromInt(layout.data.tuple.idx.int_idx)).size,
};

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:29):

the relevant technique there is

.int => layout.data.scalar.data.int.size()

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:29):

so we know that since the tag was int we can use the data.int union variant

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:30):

and what's cool about this is that in debug builds, Zig actually secretly tracks at runtime which union variant you instantiated

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:30):

so if in this code I wrote data.frac instead of data.int there, I'd get a runtime panic in debug builds

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:30):

saying that I'd put an int in that union originally, but now I'm trying to read it as a frac

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:30):

of course in release builds it doesn't do this

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:31):

this is the "untagged unions" feature
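
A self-contained sketch of the pattern described above, with tags stored in one array and bare unions in another. `Tag`, `Data`, and `Store` are made-up names for illustration, not the types in the Roc repo, and the sketch assumes only what the thread says: the tag tells you which union field is safe to read.

```zig
const std = @import("std");

const Tag = enum { int, frac };

// A bare union (no enum attached): the type itself stores no discriminant
// that user code can read. In safety-checked builds Zig tracks the active
// field behind the scenes and panics on a mismatched read.
const Data = union {
    int: u32,
    frac: f32,
};

/// Tags and payloads live in parallel arrays; index i in one matches index i
/// in the other.
const Store = struct {
    tags: []const Tag,
    data: []const Data,

    fn isZero(self: Store, i: usize) bool {
        return switch (self.tags[i]) {
            // The tag was .int, so reading the .int field is the valid access.
            .int => self.data[i].int == 0,
            // Writing `.int` here instead of `.frac` would panic in a debug
            // build for any entry that was stored as a frac.
            .frac => self.data[i].frac == 0.0,
        };
    }
};
```

The switch on the separately stored tag is what keeps the union access honest; the debug-mode check is only a backstop for mistakes.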

view this post on Zulip Luke Boswell (Jun 27 2025 at 01:33):

https://ziglang.org/documentation/master/#toc-Anonymous-Union-Literals

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:33):

so Data could use union for its lhs and rhs in this way

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:33):

and then we'd get that extra runtime safety, plus the code could be more self-documenting in various places

view this post on Zulip Luke Boswell (Jun 27 2025 at 01:37):

So related to the original problem above.. we'd parse a block/record shaped thing with statement shaped things in them. Then in Can if we have valid statements followed by a final expression we have a block, otherwise it's a record?

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:42):

isn't the point of lhs and rhs that we have a low-byte fixed layout for all nodes, and any extra data is referenced in other nodes via indexes stored in extra data?

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:43):

maybe i should read more about this but it seems like this could lead to much fatter nodes where the data list has items the size of the largest union variant

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:44):

yeah so a union in Zig (without an enum in there) - e.g. const Foo = union { ... } is just saying "Foo could be any one of these types at runtime, and I'm not storing any metadata about which it would be"

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:45):

so yes it's taking up the space of whatever its biggest variant is, but none of its variants would be bigger than u32 anyway

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:45):

another way to say it is that union is just a way to be more formal than u32 about what different types that u32 could be referring to

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:45):

but it doesn't change the runtime representation in any way - at least not in a release build

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:46):

but in a debug build Zig keeps extra info (I guess in a side table somewhere or something?) so you can also get at least a runtime type mismatch if you think you've got one type in there, but actually that's not the type that was put in there in practice when you set the value of lhs or rhs

view this post on Zulip Anthony Bullard (Jun 27 2025 at 01:50):

i feel like i must be misunderstanding. you are saying data is a union, but nothing in the union would be larger than a u32, but also that it has several nested structs in it?

.int => layout.data.scalar.data.int.size()

what's the union here?

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:53):

oh no, no nested structs

view this post on Zulip Richard Feldman (Jun 27 2025 at 01:53):

let me give a concrete example, 1 sec

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:03):

ok so this code is currently:

.module => |mod| {
    node.tag = .module_header;
    node.data.lhs = @intFromEnum(mod.exposes);
    node.region = mod.region;
},
.hosted => |hosted| {
    node.tag = .hosted_header;
    node.data.lhs = @intFromEnum(hosted.exposes);
    node.region = hosted.region;
},
.package => |package| {
    node.tag = .package_header;
    node.data.lhs = @intFromEnum(package.exposes);
    node.data.rhs = @intFromEnum(package.packages);
    node.region = package.region;
},

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:04):

...but it could be:

.module => |mod| {
    node.tag = .module_header;
    node.data = .{ .mod = .{ .exposes = mod.exposes } };
    node.region = mod.region;
},
.hosted => |hosted| {
    node.tag = .hosted_header;
    node.data = .{ .hosted = .{ .exposes = hosted.exposes } };
    node.region = hosted.region;
},
.package => |package| {
    node.tag = .package_header;
    node.data = .{
        .package = .{
            .packages = package.packages,
            .exposes = package.exposes,
        }
    };
    node.region = package.region;
},

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:04):

...and then Data would be something like:

const Data = union {
    mod: struct {
        exposes: Collection.Idx,
    },
    hosted: struct {
        exposes: Collection.Idx,
    },
    package: struct {
        exposes: Collection.Idx,
        packages: Collection.Idx,
    },
};

memory would be exactly the same ones and zeros as today (at least in release builds)

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:05):

but now we've documented what the different possibilities are for what could be in Data, and the Zig compiler can use that so if I set node.data = .{ .hosted = ... }; in a particular node, and then later access node.data.package instead of node.data.hosted in that node, I get a runtime exception because I put a .hosted in that node, not a .package

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:05):

(in debug builds only)

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:24):

I see

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:25):

That's cool, but I guess I'm at a loss for how this helps resolve the issue that's the root of this particular topic

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:26):

oh it's separate, sorry

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:26):

Were you trying to say earlier that we should share the same nodes for Exprs, Patterns, AND Type Annotations? Because even that doesn't help

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:27):

Unless we go SUPER abstract with the syntax tree to the point of barely doing more than tokenization

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:29):

Which is basically an entire re-rewrite of the parser at that point

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:33):

So taking out backtracking for the moment the other options (besides :: which I still didn't feel there was a compelling argument against):

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:34):

hm, so I just thought of a potentially easier fix:

foo = |x| {
    something : a
}

let's suppose that when we start parsing, we assume we're building up a record

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:35):

as soon as we hit a , after the expr, we know that's confirmed and we're all set. so for example this comma after a:

foo = |x| {
    something : a,
    other : b
}

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:36):

conversely, if we later hit something that tells us we're not a record, such as the above without a comma...

foo = |x| {
    something : a
    other : b
}

(which is unambiguously two consecutive type annotations)

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:36):

then we know we've actually been parsing a block

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:37):

but an important observation here is that at the point where we make this realization, it is for sure the case that we have parsed exactly:

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:37):

But what if you just get a curly after?

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:37):

then it's definitely a record

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:38):

(for the reasons discussed earlier)
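
The rule in the last few messages can be written down as a tiny decision function. This is only a sketch under stated assumptions: the `Token` and `CurlyKind` enums and `classifyAfterFirstField` are hypothetical names, and the function assumes the parser has already consumed `{ ident : thing` and is looking at the single token that follows.

```zig
const std = @import("std");

// Hypothetical, simplified token kinds; the real tokenizer has many more.
const Token = enum { comma, close_curly, lower_ident, other };

const CurlyKind = enum { record, block };

/// Decide what `{ ident : thing` turned out to be, given only the token that
/// follows `thing`. The parser starts out assuming a record.
fn classifyAfterFirstField(next: Token) CurlyKind {
    return switch (next) {
        // `{ something : a, ...` confirms the record assumption.
        .comma => .record,
        // `{ something : a }` stays a record, per the messages above.
        .close_curly => .record,
        // Anything else begins another statement, so the first line was a
        // type annotation and the whole thing is a block.
        .lower_ident, .other => .block,
    };
}
```

Called with `.comma` or `.close_curly` it answers `.record`; called with `.lower_ident` (as in the `other : b` example with no comma) it answers `.block`.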

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:38):

So I'll get a LSP error about an undeclared variable when the code is in this state?

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:38):

no, because we assume it's a record

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:38):

until proven otherwise

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:38):

couldn't it be another record type annotation?

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:39):

Since there is a type variable (above the function is a top-level annotation, which is unambiguous, that introduces it). But here a is not a defined variable

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:39):

@Luke Boswell not in that position

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:39):

That could only be an expr

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:39):

And there are two Exprs that start with OpenCurly: Record and Block

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:41):

Richard Feldman said:

but an important observation here is that at the point where we make this realization, it is for sure the case that we have parsed exactly:

to finish this thought:

at this point, if we want to change our mind about what we've been building up, we don't have to backtrack and redo work; we can just reach back and swap the node type in constant time

just say "instead of a record with exactly 1 field, whose name we have already interned, this is now a type annotation where the pattern ident is the record field name we interned, and the type is the thing we thought was an expr after a record field"
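
A rough Zig sketch of that "swap the node type in constant time" step. The `Node` layout, the `NodeTag` values, and `retagAsTypeAnno` are assumptions for illustration and do not mirror the real NodeStore; the point is only that when both readings share one node shape, changing our mind is a single tag write.

```zig
const std = @import("std");

// Hypothetical tags for the two readings of `ident : thing`.
const NodeTag = enum { record_field, type_anno };

// Hypothetical fixed-size node: a tag plus two u32 slots.
const Node = struct {
    tag: NodeTag,
    lhs: u32, // index of the interned identifier (field name or annotated name)
    rhs: u32, // index of the node parsed after the colon
};

/// Reinterpret an already-built record-field node as a type annotation.
/// lhs and rhs are left untouched, so nothing is re-parsed.
fn retagAsTypeAnno(nodes: []Node, idx: usize) void {
    std.debug.assert(nodes[idx].tag == .record_field);
    nodes[idx].tag = .type_anno;
}
```

This only works if the node parsed after the colon is stored in a form that is valid for both a type and an expr, which is the shared-node-type requirement discussed here.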

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:41):

but that relies on being able to have one node type for types and exprs

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:42):

The big thing here is that if we have (the actual motivating snapshot):

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
    thing = x  # refers to the value from the function parameter
    thing
}

What is thing : a stored in the NodeStore as?

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:42):

Richard Feldman said:

but that relies on being able to have one node type for types and exprs

Yeah, and therein lies the problem

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:42):

We've now collapsed three node types into one node type

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:43):

at the parsing stage, but they all have the same structure right?

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:43):

Which I guess for Nodes maybe isn't the biggest deal

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:43):

yeah

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:43):

there are cases where it would be really confusing but I don't actually think this is one of them

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:44):

like ok [A, B, C] could be a pattern, or a tag union type, or a list expression

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:44):

I have to constantly remind myself that Nodes are just a stored representation, whereas the Typed values are what's important to downstream consumers

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:44):

like the canonicalization logic would be almost the same

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:44):

it's still "lhs is pattern, rhs is expression"

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:45):

just the arguments to the functions you pass lhs and rhs to have different types

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:45):

The can logic wouldn't have to change at all

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:45):

but they still have the same conditionals and the same branches

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:45):

yeah exactly

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:45):

Except catch things that don't make sense

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:45):

aside from whatever logic we do or don't decide to move to canonicalization

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:45):

right!

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:46):

Ok, so luckily type annotations come pretty far down the tutorial - and at that point we'll just have to let people know (because of auto-bracketing editors) that a type annotation by itself in a block will be treated as a record

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:47):

In case they see weird errors

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:47):

That would have to be exceedingly rare

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:48):

Like with the motivating example, if at some point of entering it they end up in this state:

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
}

The LSP will report that a is undefined and that the return type of the function doesn't match the annotation of it

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:48):

Because when they typed { their editor gave them the final }

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:49):

You could special case this one scenario

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:49):

It'll be a transitory error, but still confusing

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:49):

hm, why wouldn't it see that as a record? :thinking:

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:49):

It would -- that's why it would be confusing

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:49):

It does see it as a record

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:49):

oh I see

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:49):

But a is not defined in the scope

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:49):

yeah

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:49):

I gotcha

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:50):

This kind of thing is why with arrow functions in JS you have to wrap an object returned as a bare expression in ()s

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:50):

yeah I'm not worried about that being a problem in practice haha

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:50):

especially with AI autocomplete probably suggesting a thing = right below

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:50):

you'd probably tab-complete before the LSP even had a chance to complain haha

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:51):

I think it'll be surprising still to many

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:51):

Because I know you work for Zed, but there are those of us out there not using AI - even for completions

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:52):

But yeah, it's not likely a big issue. But you seem really concerned about confusing new users

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:52):

fair, but also inline type annotations are super rare in practice

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:52):

Which is laudable

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:52):

Yeah, to the point of my wondering if they are even necessary outside of the top-level

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:52):

they are occasionally

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:53):

sometimes it's really nice for clarifying what something is, because it's nonobvious from other context

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:53):

I love using them in blocks

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:53):

Well, never necessary technically, but convenient

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:53):

but come to think of it, in those scenarios I almost always find myself adding them after the fact

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:53):

rather than up front

view this post on Zulip Richard Feldman (Jun 27 2025 at 02:53):

which also doesn't run into that LSP scenario

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:53):

It's documentation

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:54):

Yeah, it's a special kind of user who knows exactly what type something is, feels the need to have the annotation, and then writes it before even writing the declaration for it

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:54):

Anthony Bullard said:

Like with the motivating example, if at some point of entering it they end up in this state:

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
}

The LSP will report that a is undefined and that the return type of the function doesn't match the annotation of it

By special case, I mean if we are not expecting a record return type, but we have a record with exactly one field then give a different warning that also suggests this might be a block expression or something.

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:54):

And also likes a language with total type inference

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:55):

unfortunately I'd be that guy

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:55):

Luke Boswell said:

Anthony Bullard said:

Like with the motivating example, if at some point of entering it they end up in this state:

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
}

The LSP will report that a is undefined and that the return type of the function doesn't match the annotation of it

By special case, I mean if we are not expecting a record return type, but we have a record with exactly one field then give a different warning that also suggests this might be a block expression or something.

That could be something for error reporting for sure if Can can give us that info
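One way the special case could look, as a rough sketch only. `RecordExpr` and `block_hint` are hypothetical names, and it assumes canonicalization can report both whether a record was expected and how many fields the parsed record has.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecordExpr:
    field_names: List[str]


def block_hint(expected_is_record: bool, found: RecordExpr) -> Optional[str]:
    """Return an extra hint for a type mismatch, or None if the special case does not apply."""
    if not expected_is_record and len(found.field_names) == 1:
        name = found.field_names[0]
        return (
            f"This was parsed as a record with the single field `{name}`. "
            f"If you meant a block that starts with a type annotation for `{name}`, "
            "add another statement after it."
        )
    return None


# `identity : a -> a` does not expect a record, but `{ thing : a }` parsed as one.
hint = block_hint(expected_is_record=False, found=RecordExpr(["thing"]))
```

The hint would be attached to the existing mismatch report rather than replacing it, so records with one field that are genuinely wrong still get the normal error.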

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:56):

Luke Boswell said:

unfortunately I'd be that guy

I thought Claude wrote all of your code :stuck_out_tongue_wink:

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:57):

So it is resolved: We are not moving to :: for type annotations (sorry @Luke Boswell who I sold it to, and actually came to really like it).

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:57):

My action plan going forward: align the node structure for Expr, TypeAnno, and Pattern to be the same. Then adopt this strategy: if I see a statement start with LowerIdent, UpperIdent, OpenRound, OpenSquare, or OpenCurly, just start parsing it, and once I know what I'm working with, convert the node id to the appropriate typed id
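A small sketch of that strategy, under stated assumptions: the leading node is parsed generically, and the token that follows it decides which typed id it becomes. `OpAssign`, the `NewType` wrappers, and `classify` are hypothetical illustrations, not the real parser's names.

```python
from typing import NewType, Optional, Tuple

NodeIdx = NewType("NodeIdx", int)
ExprIdx = NewType("ExprIdx", int)
PatternIdx = NewType("PatternIdx", int)
TypeAnnoIdx = NewType("TypeAnnoIdx", int)

# Statement starts that could turn out to be an expr, a pattern, or a type.
AMBIGUOUS_STARTS = {"LowerIdent", "UpperIdent", "OpenRound", "OpenSquare", "OpenCurly"}


def classify(first_token: str, next_token: str) -> Tuple[str, Optional[str]]:
    """Return (kind of the leading node, kind of what follows the operator)."""
    if first_token not in AMBIGUOUS_STARTS:
        raise ValueError(f"not an ambiguous statement start: {first_token}")
    if next_token == "OpAssign":   # `thing = x`: pattern, then an expr
        return ("pattern", "expr")
    if next_token == "OpColon":    # `thing : a`: pattern, then a type annotation
        return ("pattern", "type_anno")
    return ("expr", None)          # otherwise a bare expression statement


def to_typed(node: NodeIdx, kind: str):
    # Same integer, different static type: the "conversion" allocates nothing.
    wrap = {"expr": ExprIdx, "pattern": PatternIdx, "type_anno": TypeAnnoIdx}[kind]
    return wrap(node)


lhs_kind, rhs_kind = classify("LowerIdent", "OpColon")
typed_lhs = to_typed(NodeIdx(7), lhs_kind)
```

Because all three node kinds share a structure, the conversion is just a relabeling of the index, which is what makes deferring the decision cheap.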

view this post on Zulip Luke Boswell (Jun 27 2025 at 02:58):

To be fair, @Anthony Bullard you could sell ice to an eskimo

view this post on Zulip Anthony Bullard (Jun 27 2025 at 02:59):

Hahahahaaha. Try telling that to my Director

view this post on Zulip Anthony Bullard (Jun 27 2025 at 03:00):

But maybe I don't know how to sell AI solutions yet since I still hate it :stuck_out_tongue:

view this post on Zulip Luke Boswell (Jun 27 2025 at 03:00):

The selling, the AI, or the solutions?

view this post on Zulip Anthony Bullard (Jun 27 2025 at 03:01):

At least the first two

view this post on Zulip Notification Bot (Aug 11 2025 at 10:59):

Nils Hjelte has marked this topic as resolved.


Last updated: Jun 16 2026 at 16:19 UTC