✔ Should type annos use :: · ideas

Stream: ideas

Topic: ✔ Should type annos use ::

Anthony Bullard (Jun 26 2025 at 22:17):

You should probably scroll near the bottom for the actual meat of this convo, which started off talking about parser issues and a different solution

A similar issue has reared its head, but now it is with type annos as the first statement. I even left a comment about it in the parser

Anthony Bullard (Jun 26 2025 at 22:20):

Unless we require a (new) token like NoSpaceOpColon for record fields (and forbid it in type annotation statements), the only way to tell that an Expr introduced by an OpenCurly is a block (versus a record) is to have a backtracking function that tries to parse a record to completion (without saving the nodes) if we see a LowerIdent followed by an OpColon

Anthony Bullard (Jun 26 2025 at 22:22):

Because today, this:

foo = |x| {
    something : a
}

Should be valid, parseable code - but it is unclear _what_ it should be. Do we have a function that returns a record with a something field with the value of a? Or a block with a single, useless type annotation statement in it?

Anthony Bullard (Jun 26 2025 at 22:25):

But with my suggestion the above would always be the latter and:

foo = |x| {
    something: a
}

Would always be the former

Anthony Bullard (Jun 26 2025 at 22:25):

But it does come at a cost of flexibility

Anthony Bullard (Jun 26 2025 at 22:26):

Another, worse idea would be that Records require a sigil before the OpenCurly (like zig). I don't condone or promote that idea at all, but it is a valid solution.

Anthony Bullard (Jun 26 2025 at 22:28):

So TL;DR:

Type annos at the top of blocks cause predictably bad parsing errors
We have two realistic options to solve it:
- Create a NoSpaceOpColon token and require it for record fields, and disallow it being used to separate a type header from the type annotation itself in a type anno statement.
- Do a lot of backtracking in the scenario where a block looks like it could be a record
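
A minimal sketch of the first option, assuming a tokenizer that decides between the two colon tokens purely from the whitespace before the colon. The regex, the `tokenize` function and the `Other` kind are hypothetical and only illustrate the idea; this is not the actual Roc tokenizer.

```python
import re

# Hypothetical token names: NoSpaceOpColon is the proposed new token;
# OpColon, LowerIdent and UpperIdent follow the names used in the thread.
TOKEN_RE = re.compile(
    r"(?P<ws>\s+)"
    r"|(?P<lower>[a-z_][A-Za-z0-9_]*)"
    r"|(?P<upper>[A-Z][A-Za-z0-9_]*)"
    r"|(?P<colon>:)"
    r"|(?P<other>.)"
)


def tokenize(src):
    """Return (kind, text) pairs, splitting ':' by the whitespace before it."""
    tokens = []
    prev_kind = None  # kind of the previous match, whitespace included
    for m in TOKEN_RE.finditer(src):
        kind = m.lastgroup
        if kind == "lower":
            tokens.append(("LowerIdent", m.group()))
        elif kind == "upper":
            tokens.append(("UpperIdent", m.group()))
        elif kind == "colon":
            # A colon glued to the preceding identifier is a record-field colon.
            glued = prev_kind == "lower"
            tokens.append(("NoSpaceOpColon" if glued else "OpColon", ":"))
        elif kind == "other":
            tokens.append(("Other", m.group()))
        prev_kind = kind
    return tokens
```

With tokens like these, `something: a` and `something : a` differ at the second token, so a parser could pick record or block without backtracking. The cost is the whitespace sensitivity raised later in the thread.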

Luke Boswell (Jun 26 2025 at 22:33):

I recall, we had a subtle thing with formatting where the types would have a space, but the values wouldn't.

my_record : {
    name : Str,
    age : U64,
}
my_record = {
    name: "foo",
    age: 42,
}

Anthony Bullard (Jun 26 2025 at 22:34):

Yep, and inside of an Anno or Pattern Record Field it's not really a problem to parse. It is only the one that separates the header from the anno

Anthony Bullard (Jun 26 2025 at 22:34):

So we would need to throw an error on:

my_record: {
    name : Str,
    age : U64,
}

Anthony Bullard (Jun 26 2025 at 22:35):

But

my_record : {
    name: Str,
    age: U64,
}

Is fine to parse

Anthony Bullard (Jun 26 2025 at 22:36):

But I know that doesn't have design consistency with say = in decls

Anthony Bullard (Jun 26 2025 at 22:36):

Which need not worry about whitespace on either side

Anthony Bullard (Jun 26 2025 at 22:37):

It also disallows comments after the header and before the OpColon

Anthony Bullard (Jun 26 2025 at 22:38):

But I think it's better than:

- backtracking
- requiring a sigil for record
- changing the symbol used for separating the header and anno
- a symbol or keyword needed to begin a type anno

Luke Boswell (Jun 26 2025 at 22:38):

I'm not opposed to the sigil idea tbh

Anthony Bullard (Jun 26 2025 at 22:39):

There is plenty of prior art for that, but that's a big change for an important data structure

Luke Boswell (Jun 26 2025 at 22:39):

I don't like the idea of introducing whitespace significance (after we just removed it)

Anthony Bullard (Jun 26 2025 at 22:40):

Dude

Luke Boswell (Jun 26 2025 at 22:40):

Also there is now a very subtle difference between OpColon and NoSpaceOpColon

Anthony Bullard (Jun 26 2025 at 22:40):

The most obvious, least offensive move would be to have type annos use ::

Anthony Bullard (Jun 26 2025 at 22:41):

Like Haskell

Anthony Bullard (Jun 26 2025 at 22:41):

OpDoubleColon

Luke Boswell (Jun 26 2025 at 22:43):

I haven't spent a lot of time writing the new syntax but while writing these snapshots for records... I have been finding the "is this a block or a record" a little confusing

Luke Boswell (Jun 26 2025 at 22:48):

Anthony Bullard said:

Because today, this:
foo = |x| {
    something : a
}
Should be valid, parseable code - but it is unclear _what_ it should be. Do we have a function that returns a record with a something field with the value of a? Or a block with a single, useless type annotation statement in it?

Maybe we should just make a rule that this parses one way or the other. How big of an issue would it be I wonder

Luke Boswell (Jun 26 2025 at 22:49):

Changing the syntax would be a pretty major downside. Making the parser slightly less performant or tolerant might be acceptable.

Luke Boswell (Jun 26 2025 at 22:51):

the only way to tell that an Expr introduced by an OpenCurly is a block (versus a record) is to have a backtracking function that tries to parse a record to completion (without saving the nodes) if we see a LowerIdent followed by an OpColon

How bad is backtracking in this situation?

Anthony Bullard (Jun 26 2025 at 22:55):

it depends on the annotation

Anthony Bullard (Jun 26 2025 at 22:55):

could be very bad in situations with a big annotation

Anthony Bullard (Jun 26 2025 at 22:56):

but i think the confusion is a bigger deal

Anthony Bullard (Jun 26 2025 at 22:56):

we started using : in annos when we didn't have {} delimited blocks

Anthony Bullard (Jun 26 2025 at 22:58):

this would be very clear :

foo = |x| {
    something :: a
}

that's a block and either

foo = |x| {
    something : a
}

foo = |x| {
    something: a
}

would be a record and there is NO room for confusion by the actual human
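
A rough sketch of why a separate token makes this decision cheap, assuming the tokenizer emits a distinct `OpDoubleColon` for `::`. The function name and the `"undecided"` fallback are hypothetical; other openers (such as `{ x = 1 ... }` or `{ id, name }`) would need their own rules.

```python
def classify_open_curly(kinds):
    """Classify a `{ ... }` expression from the first two tokens inside it.

    `kinds` is a list of token kind names starting at the OpenCurly.
    OpDoubleColon is the hypothetical token for `::` proposed in the thread.
    """
    if len(kinds) < 3 or kinds[0] != "OpenCurly":
        raise ValueError("expected OpenCurly followed by at least two tokens")
    first, second = kinds[1], kinds[2]
    if first == "LowerIdent" and second == "OpDoubleColon":
        return "block"   # `something :: a` can only be a type annotation statement
    if first == "LowerIdent" and second == "OpColon":
        return "record"  # `something : a` or `something: a` is a record field
    return "undecided"   # other openers are out of scope for this sketch
```

The point is that two tokens of lookahead are enough, regardless of how large the annotation after the colon is.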

Anthony Bullard (Jun 26 2025 at 22:59):

and the latter would be the canonical formatting for records

Anthony Bullard (Jun 26 2025 at 22:59):

And there is no change needed in a TypeRecordField

Anthony Bullard (Jun 26 2025 at 23:03):

my_record : {
    name: Str,
    age: U64,
}

would just be

my_record :: {
    name: Str,
    age: U64,
}

Luke Boswell (Jun 26 2025 at 23:04):

It shifts the annotation by one character in single line form

my_record :: { name : Str, age : U64 }
my_record = { name: "john", age: 64 }

my_list :: List(Str)
my_list = ["one", "two", "three"]

my_int :: U64
my_int = 42

Anthony Bullard (Jun 26 2025 at 23:07):

that's true, but hasn't bothered Haskell users for the past 27 years :rolling_on_the_floor_laughing:

Luke Boswell (Jun 26 2025 at 23:08):

We're not using :: anywhere else right?

Luke Boswell (Jun 26 2025 at 23:10):

Wait... does the :: only go in the declaration part, not inside the record??

Anthony Bullard (Jun 26 2025 at 23:11):

yes

Anthony Bullard (Jun 26 2025 at 23:11):

the record could use the same as in pattern or expr record

Anthony Bullard (Jun 26 2025 at 23:11):

see above sample

Anthony Bullard (Jun 26 2025 at 23:12):

here's some haskell for example

main = do
    str <- getContents
    let rna :: [RNA]
        rna = map (\c -> read [c]) str

    let aminoAcids :: [AminoAcid]
        aminoAcids = decodeAll rna

    putStrLn (concatMap show aminoAcids)

Luke Boswell (Jun 26 2025 at 23:14):

I'm trying to think of a valid example where we have a lower ident after the colon. Is this the only way?

foo : List(a) -> U64
foo = |x| {

    something : a # refers to the `a` from the foo annotation
    something = x

    x.len()
}

Anthony Bullard (Jun 26 2025 at 23:14):

if you are on desktop and can do it, could you move all of this from https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/Needed.20Function.20signature.20and.20lambda.20expr.20change/near/525987327 to a new topic here in ideas?

something like "Move type annotations to use :: between header and type"

Notification Bot (Jun 26 2025 at 23:15):

45 messages were moved here from #ideas > Needed Function signature and lambda expr change by Luke Boswell.

Anthony Bullard (Jun 26 2025 at 23:15):

thanks @Luke Boswell

Anthony Bullard (Jun 26 2025 at 23:16):

the problem isn't only lower ident after the colon, it's any token that could start an expr

Anthony Bullard (Jun 26 2025 at 23:16):

and you could go quite a ways before you realized this isn't a record field

Anthony Bullard (Jun 26 2025 at 23:17):

At the end of the day, this is Richard's decision but i'm glad we laid out the scenario for everyone and him

Anthony Bullard (Jun 26 2025 at 23:18):

I could change the formatter to format code like this and format a big module with lots of annotations

Anthony Bullard (Jun 26 2025 at 23:19):

to get a good sample

Luke Boswell (Jun 26 2025 at 23:21):

(deleted) to avoid confusion

Anthony Bullard (Jun 26 2025 at 23:23):

i'm not sure about the last one

Luke Boswell (Jun 26 2025 at 23:25):

I guess the proposal here is only changing the type anno.

# Type Annotation
foo :: List(a) -> U64
foo = ...

# Nominal Type Declaration
Foo(a) := [Good(a), Bad]

# Alias Type Declaration
Foo(a) : [Good(a), Bad]

My line of thinking was that these all look nice

(edited to clarify wording of statements)

Anthony Bullard (Jun 26 2025 at 23:29):

i prefer this

Anthony Bullard (Jun 26 2025 at 23:29):

type aliases don't really cause confusion

Luke Boswell (Jun 26 2025 at 23:30):

createUser :: UserId, UserName, UserAge -> User
createUser = |id, name, age| { id, name, age }

getUserName :: User -> UserName
getUserName = |user| user.name

main! = |_| {

    user :: User
    user = createUser(123, "Alice", 25)

    getUserName(user)
}

Yeah, I've been working through examples... and it's definitely growing on me too.

It feels much clearer and visually distinct where we add type annotations.

Anthony Bullard (Jun 26 2025 at 23:31):

i'd love to see what someone like @Niclas Ahden thinks about it, as an actual Roc practitioner

Luke Boswell (Jun 26 2025 at 23:33):

I've gone through most of our snapshot examples and in every case I think the :: is an improvement.

Anthony Bullard (Jun 26 2025 at 23:34):

i hope Richard feels the same

Luke Boswell (Jun 26 2025 at 23:35):

I like that it visually distinguishes the annotations from the aliases (further than just upper/lower case idents)

Luke Boswell (Jun 26 2025 at 23:57):

Here's my attempt at a summary

So the only downsides I can think of are:

- is different from current Roc (minor impact on strangeness budget)
- introduces :: as a new operator
- visually offsets the annotation by 1 character, imo only really affects single line things (examples below)

foo :: U64
foo = 42

bar :: { age : U8 }
bar = { age: 42 }

The upsides I can think of are:

- keeps the parser efficient and unambiguous, no need for backtracking
- commonly used to mean the same thing in other languages (Rust, Haskell)
- improves the visual distinction between type annotations and alias type declarations
- three distinct operators for the "type" statements, :: vs := vs :

Luke Boswell (Jun 27 2025 at 00:01):

Also type annotations are always optional, so the :: isn't required on everything, so when it is included it stands out. This feels appropriate because the author has deliberately inserted an annotation.

Anthony Bullard (Jun 27 2025 at 00:20):

i love this summary and i'll let it stand. hopefully someone else will step in and speak up

Richard Feldman (Jun 27 2025 at 01:08):

Anthony Bullard said:

Because today, this:
foo = |x| {
    something : a
}
Should be valid, parseable code - but it is unclear _what_ it should be. Do we have a function that returns a record with a something field with the value of a? Or a block with a single, useless type annotation statement in it?

that would be a block with no expression at the end, which isn't allowed, right?

Richard Feldman (Jun 27 2025 at 01:08):

so I think a record would be the only valid way to parse it

Anthony Bullard (Jun 27 2025 at 01:09):

a block can have one or more statements in parsing

Anthony Bullard (Jun 27 2025 at 01:09):

in Can a block needs a trailing expr

Luke Boswell (Jun 27 2025 at 01:10):

Isn't the issue that there is an unbounded amount you would have to parse ahead before you could know it's not a record?

Richard Feldman (Jun 27 2025 at 01:10):

ah I see

Anthony Bullard (Jun 27 2025 at 01:11):

Luke Boswell said:

Isn't the issue that there is an unbounded amount you would have to parse ahead before you could know it's not a record.

this is the issue for the Parser. the thing Richard is picking on is more about how a human seeing code would parse it

Anthony Bullard (Jun 27 2025 at 01:12):

so there are two distinct issues and i think both are solved with annos using ::

Richard Feldman (Jun 27 2025 at 01:13):

I strongly don't want to do :: so I'd very much prefer to explore alternatives :smile:

Anthony Bullard (Jun 27 2025 at 01:13):

i was worried you'd say that

Anthony Bullard (Jun 27 2025 at 01:13):

is there a reason, or just aesthetics

Anthony Bullard (Jun 27 2025 at 01:13):

Or Haskell PTSD? :rolling_on_the_floor_laughing:

Richard Feldman (Jun 27 2025 at 01:14):

I guess we can go on a brief tangent about that :laughing:

Richard Feldman (Jun 27 2025 at 01:14):

so basically every language except the Haskell family uses : over :: for types (assuming they use one or the other)

Richard Feldman (Jun 27 2025 at 01:14):

the other ML family languages used :: for cons

Anthony Bullard (Jun 27 2025 at 01:14):

Anthony Bullard said:

But I think it's better than:

- backtracking
- requiring a sigil for record
- changing the symbol used for separating the header and anno
- a symbol or keyword needed to begin a type anno

I think these were the options i found

Richard Feldman (Jun 27 2025 at 01:15):

we don't have to backtrack, there's another fix

Richard Feldman (Jun 27 2025 at 01:15):

the Haskell committee reversed it and used : for cons and :: for types because they thought that people were going to be using cons way more often than type annotations so it would be a nice ergonomics improvement

Richard Feldman (Jun 27 2025 at 01:15):

obviously that decision did not age well

Richard Feldman (Jun 27 2025 at 01:16):

every other language except for Haskell and direct descendants of Haskell (PureScript comes to mind) either always used : for types or went back to it (e.g. Idris and Elm came after Haskell and went with : for types)

Richard Feldman (Jun 27 2025 at 01:17):

separately, it's super common in modern languages to use : without a space for types, e.g. foo: bar

Richard Feldman (Jun 27 2025 at 01:17):

we use a space, which is already a little bit weird; adding a second : just does not seem like a justifiable use of weirdness budget to solve a parsing edge case that can be solved in another way

Anthony Bullard (Jun 27 2025 at 01:18):

yeah so the other option i found which i know you want is inline annos

Anthony Bullard (Jun 27 2025 at 01:18):

but what's this alternative to backtracking

Richard Feldman (Jun 27 2025 at 01:18):

so this is an exploratory idea for sure, but I've been thinking about it for a while and I haven't been able to come up with a reason that it wouldn't work

Anthony Bullard (Jun 27 2025 at 01:18):

i know we could share nodes between exprs and patterns

Richard Feldman (Jun 27 2025 at 01:18):

exactly

Richard Feldman (Jun 27 2025 at 01:19):

they have total overlap in terms of syntax, and the only things that are invalid in one but not the other can be checked during canonicalization

Anthony Bullard (Jun 27 2025 at 01:19):

yeah but do we just bail on the pattern parsing as soon as we find something that couldn't be a pattern?

Richard Feldman (Jun 27 2025 at 01:19):

*nah

Anthony Bullard (Jun 27 2025 at 01:20):

well Alternatives and `as` are not part of expr

Richard Feldman (Jun 27 2025 at 01:20):

the idea would be that the parser is just concerned with the structure

Richard Feldman (Jun 27 2025 at 01:20):

right, but canonicalization could give an error for that just as easily

Richard Feldman (Jun 27 2025 at 01:20):

one of the reasons to do the design would be that it could make formatting faster because the parser can do less work

Richard Feldman (Jun 27 2025 at 01:20):

by deferring some of the checks that normally happen during parsing to canonicalization

Richard Feldman (Jun 27 2025 at 01:21):

so parsing becomes about turning tokens into a valid "shape" but not about deciding which things are patterns vs expressions vs record fields vs type annotations etc.

Richard Feldman (Jun 27 2025 at 01:21):

that becomes canonicalization's job, and the job becomes easier because canonicalization has a more complete picture to work with

Anthony Bullard (Jun 27 2025 at 01:22):

Does that mean we need to do Can before we format then?

Richard Feldman (Jun 27 2025 at 01:22):

I don't think so

Richard Feldman (Jun 27 2025 at 01:22):

I can't think of a situation where it would matter :thinking:

Anthony Bullard (Jun 27 2025 at 01:22):

Or am I going to just blow up the formatter when I get a pattern in the middle of formatting an expr?

Anthony Bullard (Jun 27 2025 at 01:22):

Ok, maybe that's true

Anthony Bullard (Jun 27 2025 at 01:22):

I'm down to try

Richard Feldman (Jun 27 2025 at 01:22):

do we format patterns and exprs differently?

Richard Feldman (Jun 27 2025 at 01:23):

I don't think we do, but I could be missing something :smile:

Richard Feldman (Jun 27 2025 at 01:23):

pretty sure they just follow the same rules

Anthony Bullard (Jun 27 2025 at 01:23):

They are different today

Richard Feldman (Jun 27 2025 at 01:23):

interesting!

Anthony Bullard (Jun 27 2025 at 01:23):

Though this is a type

Anthony Bullard (Jun 27 2025 at 01:23):

Different functions

Richard Feldman (Jun 27 2025 at 01:23):

ah so they're separate just because the types are different

Anthony Bullard (Jun 27 2025 at 01:23):

I think they are very similar

Anthony Bullard (Jun 27 2025 at 01:23):

Yes

Richard Feldman (Jun 27 2025 at 01:23):

gotcha

Anthony Bullard (Jun 27 2025 at 01:24):

But the thing we are talking about in this topic is about a TYPE

Richard Feldman (Jun 27 2025 at 01:24):

if we're going to try combining them, there's another cool Zig thing we can try - a technique I liked in Layout

Anthony Bullard (Jun 27 2025 at 01:25):

I can only tell this is NOT a record by trying to parse at least one record field

Anthony Bullard (Jun 27 2025 at 01:25):

Which depending on the type of annotation, could be a large number of tokens

Anthony Bullard (Jun 27 2025 at 01:25):

I can talk more in like an hour, putting kids to bed

Richard Feldman (Jun 27 2025 at 01:26):

layout's tag
layout's union


Richard Feldman (Jun 27 2025 at 01:28):

so the basic idea is that this is similar to a Zig tagged union, except that you can store all the tags, and then separately store all the unions - like what we do with lhs and rhs right now, except that you get to specify that lhs and rhs are unions

code example: https://github.com/roc-lang/roc/blob/main/src/layout/store.zig#L209-L220

return switch (layout.tag) {
    .scalar => switch (layout.data.scalar.tag) {
        .int => layout.data.scalar.data.int.size(),
        .frac => layout.data.scalar.data.frac.size(),
        .bool => 1, // bool is 1 byte
        .str, .opaque_ptr => target_usize.size(), // str and opaque_ptr are pointer-sized
    },
    .box, .box_of_zst => target_usize.size(), // a Box is just a pointer to refcounted memory
    .list, .list_of_zst => target_usize.size(), // TODO: get this from RocStr.zig and RocList.zig
    .record => self.record_data.get(@enumFromInt(layout.data.record.idx.int_idx)).size,
    .tuple => self.tuple_data.get(@enumFromInt(layout.data.tuple.idx.int_idx)).size,
};

Richard Feldman (Jun 27 2025 at 01:29):

the relevant technique there is

.int => layout.data.scalar.data.int.size()

Richard Feldman (Jun 27 2025 at 01:29):

so we know that since the tag was int we can use the data.int union variant

Richard Feldman (Jun 27 2025 at 01:30):

and what's cool about this is that in debug builds, Zig actually secretly tracks at runtime which union variant you instantiated

Richard Feldman (Jun 27 2025 at 01:30):

so if in this code I wrote data.frac instead of data.int there, I'd get a runtime panic in debug builds

Richard Feldman (Jun 27 2025 at 01:30):

saying that I'd put an int in that union originally, but now I'm trying to read it as a frac

Richard Feldman (Jun 27 2025 at 01:30):

of course in release builds it doesn't do this

Richard Feldman (Jun 27 2025 at 01:31):

this is the "untagged unions" feature

Luke Boswell (Jun 27 2025 at 01:33):

https://ziglang.org/documentation/master/#toc-Anonymous-Union-Literals

Richard Feldman (Jun 27 2025 at 01:33):

so Data could use union for its lhs and rhs in this way

Richard Feldman (Jun 27 2025 at 01:33):

and then we'd get that extra runtime safety, plus the code could be more self-documenting in various places

Luke Boswell (Jun 27 2025 at 01:37):

So related to the original problem above.. we'd parse a block/record shaped thing with statement shaped things in them. Then in Can if we have valid statements followed by a final expression we have a block, otherwise it's a record?

Anthony Bullard (Jun 27 2025 at 01:42):

isn't the point of lhs and rhs that we have a low-byte fixed layout for all nodes, and any extra data is referenced in other nodes via indexes stored in extra data?

Anthony Bullard (Jun 27 2025 at 01:43):

maybe i should read more about this but it seems like this could lead to much fatter nodes where the data list has items the size of the largest union variant

Richard Feldman (Jun 27 2025 at 01:44):

yeah so a union in Zig (without an enum in there) - e.g. const Foo = union { ... } is just saying "Foo could be any one of these types at runtime, and I'm not storing any metadata about which it would be"

Richard Feldman (Jun 27 2025 at 01:45):

so yes it's taking up the space of whatever its biggest variant is, but none of its variants would be bigger than u32 anyway

Richard Feldman (Jun 27 2025 at 01:45):

another way to say it is that union is just a way to be more formal than u32 about what different types that u32 could be referring to

Richard Feldman (Jun 27 2025 at 01:45):

but it doesn't change the runtime representation in any way - at least not in a release build

Richard Feldman (Jun 27 2025 at 01:46):

but in a debug build Zig keeps extra info (I guess in a side table somewhere or something?) so you can also get at least a runtime type mismatch if you think you've got one type in there, but actually that's not the type that was put in there in practice when you set the value of lhs or rhs

Anthony Bullard (Jun 27 2025 at 01:50):

i feel like i must be misunderstanding. you are saying data is a union, but nothing in the union would be larger than a u32, but also that it has several nested structs in it?

.int => layout.data.scalar.data.int.size()

what's the union here?

Richard Feldman (Jun 27 2025 at 01:53):

oh no, no nested structs

Richard Feldman (Jun 27 2025 at 01:53):

let me give a concrete example, 1 sec

Richard Feldman (Jun 27 2025 at 02:03):

ok so this code is currently:

.module => |mod| {
    node.tag = .module_header;
    node.data.lhs = @intFromEnum(mod.exposes);
    node.region = mod.region;
},
.hosted => |hosted| {
    node.tag = .hosted_header;
    node.data.lhs = @intFromEnum(hosted.exposes);
    node.region = hosted.region;
},
.package => |package| {
    node.tag = .package_header;
    node.data.lhs = @intFromEnum(package.exposes);
    node.data.rhs = @intFromEnum(package.packages);
    node.region = package.region;
},

Richard Feldman (Jun 27 2025 at 02:04):

...but it could be:

.module => |mod| {
    node.tag = .module_header;
    node.data = .{ .mod = .{ .exposes = mod.exposes } };
    node.region = mod.region;
},
.hosted => |hosted| {
    node.tag = .hosted_header;
    node.data = .{ .hosted = .{ .exposes = hosted.exposes } };
    node.region = hosted.region;
},
.package => |package| {
    node.tag = .package_header;
    node.data = .{
        .package = .{
            .packages = package.packages,
            .exposes = package.exposes,
        }
    };
    node.region = package.region;
},

Richard Feldman (Jun 27 2025 at 02:04):

...and then Data would be something like:

const Data = union {
    mod: struct {
        exposes: Collection.Idx,
    },
    hosted: struct {
        exposes: Collection.Idx,
    },
    package: struct {
        exposes: Collection.Idx,
        packages: Collection.Idx,
    },
};

memory would be exactly the same ones and zeros as today (at least in release builds)

Richard Feldman (Jun 27 2025 at 02:05):

but now we've documented what the different possibilities are for what could be in Data, and the Zig compiler can use that so if I set node.data = .{ .hosted = ... }; in a particular node, and then later access node.data.package instead of node.data.hosted in that node, I get a runtime exception because I put a .hosted in that node, not a .package

Richard Feldman (Jun 27 2025 at 02:05):

(in debug builds only)
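
A loose Python analogy for the behaviour described above, assuming the goal is only to show the "remember which variant was written, complain on a mismatched read" idea. `DebugUnion` is a hypothetical class invented for this sketch; Zig performs this check natively on a bare `union`, and Python's `__debug__` flag stands in for a debug build here.

```python
class DebugUnion:
    """Holds exactly one named variant and checks reads against it."""

    def __init__(self, **variant):
        if len(variant) != 1:
            raise ValueError("a union holds exactly one variant")
        # Remember which variant was written, like the hidden debug-only tag.
        self._active, self._value = next(iter(variant.items()))

    def get(self, field):
        # __debug__ is True unless Python runs with -O, which loosely
        # mirrors "checked in debug builds, unchecked in release builds".
        if __debug__ and field != self._active:
            raise RuntimeError(
                f"read .{field} but the union was set as .{self._active}"
            )
        return self._value


# Mirrors the header example: the node was written as .hosted ...
data = DebugUnion(hosted={"exposes": 3})
exposes = data.get("hosted")["exposes"]
# ... so data.get("package") would raise in a normal (non -O) run.
```

Unlike this analogy, the Zig version described in the thread changes nothing about the stored bytes in release builds.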

Anthony Bullard (Jun 27 2025 at 02:24):

I see

Anthony Bullard (Jun 27 2025 at 02:25):

That's cool, but I guess I'm at a loss for how this helps resolve the issue that's the root of this particular topic

Richard Feldman (Jun 27 2025 at 02:26):

oh it's separate, sorry

Anthony Bullard (Jun 27 2025 at 02:26):

Were you trying to say earlier that we should share the same nodes for Exprs, Patterns, AND Type Annotations? Because even that doesn't help

Anthony Bullard (Jun 27 2025 at 02:27):

Unless we go SUPER abstract with the syntax tree to the point of barely doing more than tokenization

Anthony Bullard (Jun 27 2025 at 02:29):

Which is basically an entire re-rewrite of the parser at that point

Anthony Bullard (Jun 27 2025 at 02:33):

So taking out backtracking for the moment the other options (besides :: which I still didn't feel there was a compelling argument against):

- requiring a sigil for record (like zig)
- changing the symbol used for separating the header and anno (to something besides :: if that is out)
- a symbol or keyword needed to begin a type anno (something like let as found in many ML languages)

Richard Feldman (Jun 27 2025 at 02:34):

hm, so I just thought of a potentially easier fix:

foo = |x| {
    something : a
}

let's suppose that when we start parsing, we assume we're building up a record

Richard Feldman (Jun 27 2025 at 02:35):

as soon as we hit a , after the expr, we know that's confirmed and we're all set. so for example this comma after a:

foo = |x| {
    something : a,
    other : b
}

Richard Feldman (Jun 27 2025 at 02:36):

conversely, if we later hit something that tells us we're not a record, such as the above without a comma...

foo = |x| {
    something : a
    other : b
}

(which is unambiguously two consecutive type annotations)

Richard Feldman (Jun 27 2025 at 02:36):

then we know we've actually been parsing a block

Richard Feldman (Jun 27 2025 at 02:37):

but an important observation here is that at the point where we make this realization, it is for sure the case that we have parsed exactly:

- exactly 1 record field, namely something - which we have interned
- the expr that goes after it

Anthony Bullard (Jun 27 2025 at 02:37):

But what if you just get a curly after?

Richard Feldman (Jun 27 2025 at 02:37):

then it's definitely a record

Richard Feldman (Jun 27 2025 at 02:38):

(for the reasons discussed earlier)

Anthony Bullard (Jun 27 2025 at 02:38):

So I'll get a LSP error about an undeclared variable when the code is in this state?

Richard Feldman (Jun 27 2025 at 02:38):

no, because we assume it's a record

Richard Feldman (Jun 27 2025 at 02:38):

until proven otherwise

Luke Boswell (Jun 27 2025 at 02:38):

couldn't it be another record type annotation?

Anthony Bullard (Jun 27 2025 at 02:39):

Since there is a type variable (above the function is a top-level annotation, which is unambiguous, that introduces it). But here a is not a defined variable

Anthony Bullard (Jun 27 2025 at 02:39):

@Luke Boswell not in that position

Anthony Bullard (Jun 27 2025 at 02:39):

That could only be an expr

Anthony Bullard (Jun 27 2025 at 02:39):

And there are two Exprs that start with OpenCurly: Record and Block

Richard Feldman (Jun 27 2025 at 02:41):

Richard Feldman said:

but an important observation here is that at the point where we make this realization, it is for sure the case that we have parsed exactly:

- exactly 1 record field, namely something - which we have interned
- the expr that goes after it

to finish this thought:

at this point, if we want to change our mind about what we've been building up, we don't have to backtrack and redo work; we can just reach back and swap the node type in constant time

just say "instead of a record with exactly 1 field, whose name we have already interned, this is now a type annotation where the pattern ident is the record field name we interned, and the type is the thing we thought was an expr after a record field"
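
A minimal sketch of the "assume a record, then swap the node type" idea, assuming a flat node list with a tag plus two index slots. `Node`, the tag strings and `parse_first_entry` are hypothetical names for illustration, not the real NodeStore API. Treating every token other than a comma or closing curly as proof of a block is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Node:
    tag: str  # "record_field" or "type_anno" (hypothetical tag names)
    lhs: int  # index of the interned identifier before the colon
    rhs: int  # index of the node parsed after the colon


def parse_first_entry(nodes, name_idx, rhs_idx, next_token):
    """Store `ident : thing` after `{` as a record field, then confirm or retag.

    `next_token` is the kind of the token that follows the parsed entry.
    Returns what the enclosing `{ ... }` turned out to be and the node index.
    """
    nodes.append(Node("record_field", name_idx, rhs_idx))
    idx = len(nodes) - 1
    if next_token in ("Comma", "CloseCurly"):
        return "record", idx  # a comma or `}` confirms the record reading
    # Anything else means we were in a block all along. No re-parsing:
    # the same lhs/rhs are reused and only the tag changes, in constant time.
    nodes[idx].tag = "type_anno"
    return "block", idx
```

This only works if the thing after the colon is stored the same way whether it is an expr or a type, which is the shared-node requirement raised in the next message.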

Richard Feldman (Jun 27 2025 at 02:41):

but that relies on being able to have one node type for types and exprs

Anthony Bullard (Jun 27 2025 at 02:42):

The big thing here is that if we have (the actual motivating snapshot):

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
    thing = x  # refers to the value from the function parameter
    thing
}

What is thing : a stored in the NodeStore as?

Anthony Bullard (Jun 27 2025 at 02:42):

Richard Feldman said:

but that relies on being able to have one node type for types and exprs

Yeah, and therein lies the problem

Anthony Bullard (Jun 27 2025 at 02:42):

We've now collapsed three node types into one node type

Richard Feldman (Jun 27 2025 at 02:43):

at the parsing stage, but they all have the same structure right?

Anthony Bullard (Jun 27 2025 at 02:43):

Which I guess for Nodes maybe isn't the biggest deal

Richard Feldman (Jun 27 2025 at 02:43):

yeah

Richard Feldman (Jun 27 2025 at 02:43):

there are cases where it would be really confusing but I don't actually think this is one of them

Richard Feldman (Jun 27 2025 at 02:44):

like ok [A, B, C] could be a pattern, or a tag union type, or a list expression

Anthony Bullard (Jun 27 2025 at 02:44):

I have to constantly remind myself that Nodes are just a stored representation, whereas the Typed values are what's important to downstream consumers

Richard Feldman (Jun 27 2025 at 02:44):

like the canonicalization logic would be almost the same

Richard Feldman (Jun 27 2025 at 02:44):

it's still "lhs is pattern, rhs is expression"

Richard Feldman (Jun 27 2025 at 02:45):

just the arguments to the function you pass lhs and rhs to have different types

Anthony Bullard (Jun 27 2025 at 02:45):

The can logic wouldn't have to change at all

Richard Feldman (Jun 27 2025 at 02:45):

but they still have the same conditionals and the same branches

Richard Feldman (Jun 27 2025 at 02:45):

yeah exactly

Anthony Bullard (Jun 27 2025 at 02:45):

Except catch things that don't make sense

Richard Feldman (Jun 27 2025 at 02:45):

aside from whatever logic we do or don't decide to move to canonicalization

Richard Feldman (Jun 27 2025 at 02:45):

right!

Anthony Bullard (Jun 27 2025 at 02:46):

Ok, so luckily type annotations come pretty far down the tutorial - and at that point we'll just have to let people know (because of auto-bracketing editors) that a type annotation by itself in a block will be treated as a record

Anthony Bullard (Jun 27 2025 at 02:47):

In case they see weird errors

Luke Boswell (Jun 27 2025 at 02:47):

That would have to be exceedingly rare

Anthony Bullard (Jun 27 2025 at 02:48):

Like with the motivating example, if at some point of entering it they end up in this state:

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
}

The LSP will report that a is undefined and that the return type of the function doesn't match the annotation of it

Anthony Bullard (Jun 27 2025 at 02:48):

Because when they typed { their editor gave them the final }

Luke Boswell (Jun 27 2025 at 02:49):

You could special case this one scenario

Anthony Bullard (Jun 27 2025 at 02:49):

It'll be a transitory error, but still confusing

Richard Feldman (Jun 27 2025 at 02:49):

hm, why wouldn't it see that as a record? :thinking:

Luke Boswell (Jun 27 2025 at 02:49):

It would -- that's why it would be confusing

Anthony Bullard (Jun 27 2025 at 02:49):

It does see it as a record

Richard Feldman (Jun 27 2025 at 02:49):

oh I see

Anthony Bullard (Jun 27 2025 at 02:49):

But a is not defined in the scope

Richard Feldman (Jun 27 2025 at 02:49):

yeah

Richard Feldman (Jun 27 2025 at 02:49):

I gotcha

Anthony Bullard (Jun 27 2025 at 02:50):

This kind of thing is why, with arrow functions in JS, you have to wrap an object literal in ()s to return it as a bare expression
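For readers who have not hit the JS case, here is a minimal sketch of it in TypeScript. The names asBlock, asRecord, and thing are made up for illustration. It relies only on standard JS parsing rules: braces directly after => open a function body, so `thing: 1` becomes a labeled statement rather than a field.

```typescript
// Braces directly after => start a function body, so `thing:` is parsed as a
// statement label and `1` as an expression statement. Nothing is returned.
const asBlock = () => { thing: 1 };

// Wrapping the braces in parentheses forces them to be read as an object literal.
const asRecord = () => ({ thing: 1 });

console.log(asBlock());  // undefined
console.log(asRecord()); // { thing: 1 }
```

JS resolves the ambiguity in favor of the block, while the approach discussed here resolves it in favor of the record.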

Richard Feldman (Jun 27 2025 at 02:50):

yeah I'm not worried about that being a problem in practice haha

Richard Feldman (Jun 27 2025 at 02:50):

especially with AI autocomplete probably suggesting a thing = right below

Richard Feldman (Jun 27 2025 at 02:50):

you'd probably tab-complete before the LSP even had a chance to complain haha

Anthony Bullard (Jun 27 2025 at 02:51):

I think it'll be surprising still to many

Anthony Bullard (Jun 27 2025 at 02:51):

Because I know you work for Zed, but there are those of us out there not using AI - even for completions

Anthony Bullard (Jun 27 2025 at 02:52):

But yeah, it's not likely a big issue. But you seem really concerned about confusing new users

Richard Feldman (Jun 27 2025 at 02:52):

fair, but also inline type annotations are super rare in practice

Anthony Bullard (Jun 27 2025 at 02:52):

Which is laudable

Anthony Bullard (Jun 27 2025 at 02:52):

Yeah, to the point of my wondering if they are even necessary outside of the top-level

Richard Feldman (Jun 27 2025 at 02:52):

they are occasionally

Richard Feldman (Jun 27 2025 at 02:53):

sometimes it's really nice for clarifying what something is, because it's nonobvious from other context

Luke Boswell (Jun 27 2025 at 02:53):

I love using them in blocks

Anthony Bullard (Jun 27 2025 at 02:53):

Well, never necessary technically, but convenient

Richard Feldman (Jun 27 2025 at 02:53):

but come to think of it, in those scenarios I almost always find myself adding them after the fact

Richard Feldman (Jun 27 2025 at 02:53):

rather than up front

Richard Feldman (Jun 27 2025 at 02:53):

which also doesn't run into that LSP scenario

Anthony Bullard (Jun 27 2025 at 02:53):

It's documentation

Anthony Bullard (Jun 27 2025 at 02:54):

Yeah, it's a special kind of user who knows exactly what type something is, feels the need to have the annotation, and then writes it before even writing the declaration for it

Luke Boswell (Jun 27 2025 at 02:54):

Anthony Bullard said:

Like with the motivating example, if at some point of entering it they end up in this state:

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
}

The LSP will report that a is undefined and that the return type of the function doesn't match the annotation of it

By special case, I mean if we are not expecting a record return type, but we have a record with exactly one field then give a different warning that also suggests this might be a block expression or something.
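A rough sketch of what that special case might look like, written in TypeScript purely for illustration. The Type shape, describeMismatch, and the message wording are all hypothetical and are not taken from the Roc compiler. The sketch assumes the reporter can see both the expected and the actual return type.

```typescript
// Hypothetical, simplified type shapes; not the Roc compiler's real data structures.
type Type =
  | { kind: "record"; fields: string[] }
  | { kind: "other"; name: string };

function describeMismatch(expected: Type, actual: Type): string {
  const base = "The return type does not match the annotation.";
  // Special case: a one-field record showed up where no record was expected.
  // That is the shape an unfinished block with a lone type annotation produces.
  if (
    expected.kind !== "record" &&
    actual.kind === "record" &&
    actual.fields.length === 1
  ) {
    return (
      `${base} Hint: { ${actual.fields[0]} : ... } was parsed as a record with one field. ` +
      "If this was meant to be a block starting with a type annotation, add the definition below it."
    );
  }
  return base;
}

const expected: Type = { kind: "other", name: "a" };
console.log(describeMismatch(expected, { kind: "record", fields: ["thing"] }));
console.log(describeMismatch(expected, { kind: "record", fields: ["x", "y"] }));
```

Only the first call gets the extra hint. The two-field record falls through to the plain mismatch message.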

Anthony Bullard (Jun 27 2025 at 02:54):

And also likes a language with total type inference

Luke Boswell (Jun 27 2025 at 02:55):

unfortunately I'd be that guy

Anthony Bullard (Jun 27 2025 at 02:55):

Luke Boswell said:

Anthony Bullard said:

Like with the motivating example, if at some point of entering it they end up in this state:

identity : a -> a
identity = |x| {
    thing : a  # refers to the type var introduced in function type annotation
}

The LSP will report that a is undefined and that the return type of the function doesn't match the annotation of it

By special case, I mean if we are not expecting a record return type, but we have a record with exactly one field then give a different warning that also suggests this might be a block expression or something.

That could be something for error reporting for sure if Can can give us that info

Anthony Bullard (Jun 27 2025 at 02:56):

Luke Boswell said:

unfortunately I'd be that guy

I thought Claude wrote all of your code :stuck_out_tongue_wink:

Anthony Bullard (Jun 27 2025 at 02:57):

So it is resolved: We are not moving to :: for type annotations (sorry @Luke Boswell, who I sold it to, and who actually came to really like it).

Anthony Bullard (Jun 27 2025 at 02:57):

My action plan going forward: align the node structure for Expr, TypeAnno, and Pattern so they are the same. Then adopt this strategy: if I see a statement start with LowerIdent, UpperIdent, OpenRound, OpenSquare, or OpenCurly, just start parsing it, and once I know what I'm working with, convert the node id to the appropriate typed id
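To make that strategy concrete, here is a small sketch in TypeScript. It is not the real parser. Every name in it (Token, Node, Parser, parseTerm, parseStatement, lex) is hypothetical, and it covers only identifiers and square-bracket lists. The idea it shows: parse the leading term into one shared node shape, then brand the node index as a pattern, type annotation, or expression once the following token is known.

```typescript
// All names here are hypothetical; this is not the Roc compiler's actual API.
type Token =
  | { tag: "LowerIdent" | "UpperIdent"; text: string }
  | { tag: "OpenSquare" | "CloseSquare" | "Comma" | "OpColon" | "OpAssign" | "EOF" };

// One node shape shared by expressions, patterns, and type annotations.
type Node = { tag: "ident"; text: string } | { tag: "list"; items: number[] };

// Typed ids are branded indices into the same node store.
type ExprIdx = { kind: "expr"; idx: number };
type PatternIdx = { kind: "pattern"; idx: number };
type TypeAnnoIdx = { kind: "typeAnno"; idx: number };

type Statement =
  | { tag: "decl"; pattern: PatternIdx; body: ExprIdx }
  | { tag: "anno"; header: number; anno: TypeAnnoIdx }
  | { tag: "expr"; expr: ExprIdx };

const EOF: Token = { tag: "EOF" };
const PUNCT: Record<string, Token> = {
  "[": { tag: "OpenSquare" }, "]": { tag: "CloseSquare" }, ",": { tag: "Comma" },
  ":": { tag: "OpColon" }, "=": { tag: "OpAssign" },
};

// Tiny tokenizer so the sketch can be driven from source text.
function lex(src: string): Token[] {
  const out: Token[] = [];
  for (const m of src.match(/[A-Za-z_]\w*|[\[\],:=]/g) ?? []) {
    if (m in PUNCT) out.push(PUNCT[m]);
    else out.push({ tag: /^[a-z_]/.test(m) ? "LowerIdent" : "UpperIdent", text: m });
  }
  return out;
}

class Parser {
  nodes: Node[] = [];
  pos = 0;
  tokens: Token[];
  constructor(tokens: Token[]) { this.tokens = tokens; }

  peek(): Token { return this.pos < this.tokens.length ? this.tokens[this.pos] : EOF; }

  // Parse a term without committing to what kind of thing it is.
  parseTerm(): number {
    const tok = this.peek();
    this.pos++;
    if (tok.tag === "LowerIdent" || tok.tag === "UpperIdent") {
      return this.nodes.push({ tag: "ident", text: tok.text }) - 1;
    }
    if (tok.tag === "OpenSquare") {
      const items: number[] = [];
      while (this.peek().tag !== "CloseSquare") {
        if (this.peek().tag === "EOF") throw new Error("unterminated list");
        items.push(this.parseTerm());
        if (this.peek().tag === "Comma") this.pos++;
      }
      this.pos++; // consume CloseSquare
      return this.nodes.push({ tag: "list", items }) - 1;
    }
    throw new Error(`unexpected token ${tok.tag}`);
  }

  // Only after the leading term is parsed do we decide which typed id it gets.
  parseStatement(): Statement {
    const head = this.parseTerm();
    const next = this.peek().tag;
    if (next === "OpAssign") {
      this.pos++;
      return { tag: "decl", pattern: { kind: "pattern", idx: head },
               body: { kind: "expr", idx: this.parseTerm() } };
    }
    if (next === "OpColon") {
      this.pos++;
      return { tag: "anno", header: head, anno: { kind: "typeAnno", idx: this.parseTerm() } };
    }
    return { tag: "expr", expr: { kind: "expr", idx: head } };
  }
}

// The same `[A, B, C]` node is branded three different ways.
for (const src of ["[A, B, C] = xs", "xs : [A, B, C]", "[A, B, C]"]) {
  console.log(src, "->", new Parser(lex(src)).parseStatement().tag);
}
```

In all three inputs the list is stored as the same `list` node. Only the brand on its index differs, which is why the canonicalization branches can stay the same.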

Luke Boswell (Jun 27 2025 at 02:58):

To be fair, @Anthony Bullard you could sell ice to an eskimo

Anthony Bullard (Jun 27 2025 at 02:59):

Hahahahaaha. Try telling that to my Director

Anthony Bullard (Jun 27 2025 at 03:00):

But maybe I don't know how to sell AI solutions yet since I still hate it :stuck_out_tongue:

Luke Boswell (Jun 27 2025 at 03:00):

The selling, the AI, or the solutions?

Anthony Bullard (Jun 27 2025 at 03:01):

At least the first two

Notification Bot (Aug 11 2025 at 10:59):

Nils Hjelte has marked this topic as resolved.

Last updated: Jul 23 2026 at 13:15 UTC