Over the last couple days I started poking around the compiler to see if I could figure out how to update the abilities syntax. I think I need a hint :upside_down:
Specifically, what part of the compiler identifies specific character sequences in the source file and translates them to AST nodes?
What you describe is the parse
crate in general. I don't know the specifics about the abilities parser but I would start at the type_annotations
module.
For example, here's how the has
clause is parsed
Which is used by has_clause_chain
to parse one or more of them.
You can also see the |
char here which we want to replace by where
Ah ha! I did spend some time digging around in the parse
crate, but not quite enough by the looks of it.
This is very helpful, I think I can change the behavior of the parser now. Once I have that working, I’ll update the various types s/Has/Implements/g
Then I can update tests, comments, etc.
I realize we also want to require curly brackets for new ability definitions but I’m hoping I can make that a separate PR.
Thanks @Agus Zubiaga !
Sounds good! Happy to help if you have more specific questions later :)
As you reading more of the parsing code you'll start to notice some of these helpers everywhere: and!
, loc!
,wordN
, map!
, etc.
These are defined in parser.rs. Looking at how these are implemented really helped me demystify them.
Oh, also! You can find where things are parsed by opening parse/src/ast.rs and using your editor's Go to References
feature. Look at those under crates/compiler/parse/src
and you'll find where the AST node is created:
CleanShot-2023-05-18-at-11.07.472x.png
Ok, looks like it’s time for me to bite the bullet and learn how macros are defined in Rust. This day was bound to arrive eventually.
I did use the go to references feature in my editor, but I’m pretty sure I was looking at references to the wrong symbols or that I just missed the key references I needed.
I also just realized there’s an entire README in the compiler crate that I missed. I guess being productive on a plane is harder than I thought lol.
Thanks for the pointers, this helps a lot!
For the Set
builtin I had to change
Set k := Dict.Dict k {}
has [
Eq {
isEq,
},
]
to
Set k := Dict.Dict k {} | k implements Hash & Eq
implements [
Eq {
isEq,
},
]
in order to make the compiler happy. Is this correct?
@Brendan Hansknecht You left a comment here a while back (https://github.com/roc-lang/roc/blob/28e56937ccab4da5792fa9779bb3779063a2f9f5/crates/compiler/builtins/roc/Set.roc#L26)
Note: I'm only working on s/has/implements/g
for now.
Assuming it doesn't break the formatter anymore, that is a valid change. Just is clearer specification of the type.
Not sure why just a naming change would make it required though
Yeah, it makes me wonder if I did something wrong in my changes to the parser.
Yeah the constraint on k
should not be required. Did the definition of Dict change in a similar way?
All that changed for the constraint on k
in Dict
is the keyword has
changing to implements
Draft PR is #5426
When Set is defined as:
Set k := Dict.Dict k {}
implements [
Eq {
isEq,
},
]
Compilation fails with this error (followed by a huge string of others):
── TOO MANY TYPE ARGUMENTS ─────────────────────────────────────────── Set.roc ─
The Dict opaque expects 2 type arguments, but it got 4 instead:
31│> Set k := Dict.Dict k {}
32│> implements [
33│> Eq {
34│> isEq,
35│> },
36│> ]
Are there missing parentheses?
But defined like this:
Set k := Dict.Dict k {} | k implements Hash & Eq
implements [
Eq {
isEq,
},
]
The formatter fails:
An internal compiler expectation was broken.
This is definitely a compiler bug.
Please file an issue here: https://github.com/roc-lang/roc/issues/new/choose
thread 'main' panicked at 'Formatting bug; formatted code isn't valid
I wrote the incorrect result to this file for debugging purposes:
crates/compiler/builtins/roc/Set.roc-format-failed
Parse error was: NotEndOfFile(@686)
', crates/cli/src/format.rs:79:13
Hmm
# crates/compiler/builtins/roc/Set.roc-format-failed
## Provides a [set](https://en.wikipedia.org/wiki/Set_(abstract_data_type))
## type which stores a collection of unique values, without any ordering
Set k := Dict.Dict k {} | k has Hash & Eq
has [
Eq {
isEq,
},
]
Are the builtins formatted during the build process? Looks like I failed to update the formatter.
after fixing the formatter, it seems to be working fine as long as I keep the constraint on k
for Set
However, it still seems to be broken if I don't include the constraint on k
I'm also getting some weird failures on reporting tests. for example, cargo test --package roc_reporting --test test_reporting -- test_reporting::ability_first_demand_not_indented_enough
gives me:
-old snapshot
+new results
────────────┬──────────────────────────────────────────────────────────────────────────────────
0 │-── UNFINISHED ABILITY ── tmp/ability_first_demand_not_indented_enough/Test.roc ─␊
0 │+── SYNTAX PROBLEM ──────────────────────────────────────── /code/proj/Main.roc ─␊
...
Not the sort of failure I would expect.
This seems to me to be a strong indicator that I either introduced or uncovered a bug when making changes to the parsing logic.
I have two questions:
I’ll take a look in a few minutes. I think larger sized PRs are fine for this kind of thing, namely an atomic refactor of the language. I think we should figure out why the type definitions changed though, that should not be necessary.
The type changes were things such as ast::Has
-> ast::Implements
, ast::HasClause
-> ast::ImplementsClause
, etc. My thought was that like leaving them as-is could be a source of confusion in the future.
The PR LGTM so far! Left a couple comments.
Sweet thanks!
I noticed that has
was not defined as a constant in roc_parse::keywords
is there a specific reason for this, or is it ok if I add a new constant there for implements
and replace the various string literals?
(I definitely did not spend hours troubleshooting a simple failure to find and replace. Not me. Why would you think that?)
Bryce Miller said:
I noticed that
has
was not defined as a constant inroc_parse::keywords
is there a specific reason for this, or is it ok if I add a new constant there forimplements
and replace the various string literals?
we should probably rename that - it's just for reserved keywords in expressions specifically
so implements
is only reserved in type annotations and ability definitions
really, what we should do is have a separate list of keywords that are reserved in expressions vs in types!
Richard Feldman said:
really, what we should do is have a separate list of keywords that are reserved in expressions vs in types!
If we did that, would that mean you could use type keywords in expressions and vice versa? E.g. Could you have a type signature List if -> List if
? Not saying anyone would do this, but would that change make it possible?
heh, I guess? Seems like a bad idea, of course :sweat_smile:
to be honest, that does make a decent argument for unifying the keywords list just to make the rules easier to understand :thinking:
Ohh, are the strings in the keywords
array automatically allowed in expressions? I was wondering what that was for.
All I really wanted was a const
How would a separate list of keywords reserved in types allow keywords from the other list to be used in types?
Aside from names changing in the AST files for test_snapshots
, would changes to regions also be expected?
I'd also like to confirm whether you want me to keep the changes I made to names of types, functions, values, etc. I could revert those changes so that this PR contains only functional changes. Then make another PR changing the names of things to reflect the new syntax.
The PR is already at +665 −617 and I haven't changed |
to where
or added the brackets for ability definitions yet.
I just took a look through it and it looks fantastic!!! :heart_eyes:
I could revert those changes so that this PR contains only functional changes. Then make another PR changing the names of things to reflect the new syntax.
I don't think that's necessary; I think it's fine if both changes are in the same PR :thumbs_up:
Aside from names changing in the AST files for
test_snapshots
, would changes to regions also be expected?
I'd expect regions to change if keywords change lengths, so yeah I'd assume there would be some region changes here!
How would a separate list of keywords reserved in types allow keywords from the other list to be used in types?
if we only give errors for identifiers from one list vs the other depending on whether we're parsing a type vs an expression
but honestly based on this discussion I think we should unify those lists and just have one list of reserved keywords for both types and expressions :big_smile:
Richard Feldman said:
How would a separate list of keywords reserved in types allow keywords from the other list to be used in types?
if we only give errors for identifiers from one list vs the other depending on whether we're parsing a type vs an expression
Ohhh I think I understand now.
Richard Feldman said:
Aside from names changing in the AST files for
test_snapshots
, would changes to regions also be expected?I'd expect regions to change if keywords change lengths, so yeah I'd assume there would be some region changes here!
Ah ha, right. So it looks like the only changes to these ASTs are names and regions. Would it make sense to just overwrite the old AST files with the new output? And if so, is there a script somewhere to do that?
Speaking of ASTs, while I was looking at the code I was kinda wondering why most nodes included a reference to a Loc
. It just occurred to me that that’s probably there to facilitate error messages. Is that correct?
I'm running into thread 'gen_abilities::encode_derive_to_encoder_for_opaque' has overflowed its stack
Ignoring that test results in a failure for the next test that uses derived abilities.
Any suggestions on where to check for the problem?
It's looking as though merging main has fixed the problem
now gen_tags::issue_5162_recast_nested_nullable_unwrapped_layout
is failing with a stack overflow, but we've made progress
Does it fail in release mode?
I’ll try that
Yeah that one seems to pass in release mode
okay then you will want to use the with_larger_stack helper to increase the test stack size in debug mode
That worked. Thanks!
I have completed s/has/implements/g
and s/|/where/g
. I'm now working on adding the braces around ability members in ability definitions.
## Definition of the [DecoderFormatting] ability
DecoderFormatting implements {
u8 : Decoder U8 fmt where fmt implements DecoderFormatting,
u16 : Decoder U16 fmt where fmt implements DecoderFormatting,
u32 : Decoder U32 fmt where fmt implements DecoderFormatting,
}
Does the following approach make sense?
roc_parse::ast::TypeDef::Ability
fromAbility {
header: TypeHeader<'a>,
loc_implements: Loc<Implements<'a>>,
members: &'a [AbilityMember<'a>],
},
to something like
Ability {
header: TypeHeader<'a>,
loc_implements: Loc<Implements<'a>>,
ann: Loc<TypeAnnotation<'a>>,
},
roc_parse::type_annotation
to parse the text following the implements
keywordI noticed that the implements
keyword plays the same role as :
or :=
in alias or opaque type definitions, but is the only type definition that maintains the location of that sequence in the AST.
An alternative I see to the above approach is to parse the braces in roc_parse::expr::finish_parsing_ability_def_help
I was planning to bring this up later, but I wonder if there is any merit to using something other than the implements
keyword for ability definitions. In this context, it has a different meaning than it does in other contexts.
In the the following contexts, the implements
keyword means "the preceding type implements the following ability":
# Type annotations:
foo : a -> Str where a implements Bar
# Implementing abilities for opaque types:
Foo := U64 implements [Eq]
In the context of an ability definition, the implements
keyword means "The preceding name identifies an ability which can be implemented by providing the following functions"
One idea I had for disambiguating these use cases was to reserve a new keyword such as implemented-with
to use instead of implements
in ability definitions. I don't like reserving another keyword though, and that one is long and cumbersome.
It occurred to me today that perhaps another approach would be to use a symbol or set of symbols, like all other type and value definitions use:
# Value Def
nice = 69
# Type Def (alias)
Foo : { bar : U64 }
# Type Def (opaque)
Baz := U64
# Ability Def (???)
SomeAbility :@ {
foo : a -> Str where a implements SomeAbility,
}
This seemed relevant to me because I think the first approach I suggested for implementing the braces requirement would make more sense if ability defs looked more like the other type annotations whose parsers live in type_annotation.rs
Anyway, the last few messages should probably be moved to #ideas > Abilities Syntax. I wasn't even planning to bring that up until after we had a chance to use the new syntax for a while.
Possibly related:
When I add the implements
keyword to roc_parse::keyword::KEYWORDS
, the compiler chokes on ability definitions in the builtins.
Does anybody have any advice on which direction to go in regards to parsing the braces? I’m not liking my first idea because an ability definition really isn’t a type annotation. I know I will also need to provide error messages for missing braces.
I apologize for the parsing questions. I think I’d be much better off here if I had experience implementing a parser for a toy language. I suspect I’m getting a little tripped up simply because I haven’t worked with parsers before.
have you taken a look at the record!
macro in the parser? there are various places which use it, and I think that's what I would use here!
Thanks for the lead! I’ll take a look!
cool, happy to help! :smiley:
I’ve noticed special treatment of indentation for ability definitions. For example, ability::demand_parser
takes an ability::IndentLevel
and returns a parser which produces a (u32, AbilityDemand)
value.
Does adding braces around the list of demands change how we treat indentation here?
It looks like I’m going to have to make more than a couple tweaks to get the types to work out for the record!
macro, because the parsing logic for ability definitions seems different from most of the rest of the parsing logic.
Bryce Miller said:
Does adding braces around the list of demands change how we treat indentation here?
yeah I don't think we should have to care about indentation anymore after introducing the record literal :thumbs_up:
I see @Richard Feldman and @Anton have merged my branch into main. I feel I owe you all a status update, and regret that I haven't done so before now.
As you can see, I completed the |
-> where
and has
-> implements
changes. I have not yet completed the proposed addition of curly braces in ability definitions. This is because I was getting tripped up a lot in my attempts to implement the change, and my personal life has also been hairy.
Realistically, I don't think I'll be able to implement the last part of the proposal. I apologize for not recognizing this and letting you know sooner. I kept thinking that I might be able to try again and get it, so that we would have only one breaking syntax change instead of two. But alas, weeks have passed and nothing of the sort has happened.
I could possibly manage it with a pairing session, but anyone who would pair with me on this could probably get it done faster alone. I don't want to take anyone away from the other important things they are doing to make Roc awesome. Ergo, I think I'm going to need to pass the torch.
hey no worries! I really appreciate all the work you did on it - you got the most valuable part done for sure!
Glad I was able to help a little!
I would say a lot! :smiley:
thanks again!
Last updated: Jul 05 2025 at 12:14 UTC