7897: Handle parsing ambiguous nodes
Starting work on updating if-else:
predictNodeIndex
Just a note on the above, I'll be touching huge swaths of check/parse/AST.zig, a good bit of check/parse/NodeStore.zig, and a bit of check/parse/Parser.zig. If you could check in with me about changes there that will be coming in within the next week, it would be helpful to avoid a very painful merge conflict.
I do aim to merge the most impactful stuff to AST and NodeStore as soon as I have tests passing
I'm getting rid of other uses of predictNodeIndex
@Anthony Bullard would you mind looking at https://github.com/roc-lang/roc/pull/7898? Does that impact your parser work?
I'm wanting to finish my crusade to add a single unit test per Can NodeStore add/get variant to verify they roundtrip correctly.
I'm working on SingleQuote tokenization and possibly parsing. @Anthony Bullard it will likely interfere with your changes, although the conflicts shouldn't be very big.
Also, I plan to make tokenizer.zig use Regions in relevant places in a separate PR
@Kiryl Dziamura I don't think there is work to do here...Maybe I'm wrong. But the main thing that needs to be done is adding a node for it in the Parser and actually parsing it.
The tokenization is happening here: https://github.com/roc-lang/roc/blob/259b290c2a39e45f683d329d16ba2963cec13c68/src/check/parse/tokenize.zig#L980
and it lacks return after this line:
https://github.com/roc-lang/roc/blob/259b290c2a39e45f683d329d16ba2963cec13c68/src/check/parse/tokenize.zig#L999
Oh, I didn't even see that! Great catch!
Planning on starting type checking on match statements once Luke's draft PR lands
I may get the Can for match (at least with current supported syntax) done by your time tomorrow. Somewhat tempted to take a detour and clean up the snapshot mess I feel like I've contributed to
Started working on tokenize.zig cleanup related to offsets (the goal is to use regions if they don't introduce any penalty):
https://github.com/roc-lang/roc/pull/7917
General question about the above PR. Why are we storing lengths in the tokenizer at all?
Retokenizing should be essentially free.
So why not only store the offset and leave regions to later parts of the stack. Even later parts of the stack likely could just reference two token indices and then just request the start and end from the two tokens.
Not saying that is the best setup, but curious about our strategy here
I guess interned is the reason? maybe it's redundant tho
Also, whitespaces are skipped, I think without end offset we'd have to collect all gaps as tokens
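To make the trade-off above concrete, here is a rough sketch (in Python, not the actual tokenize.zig; the regex and every function name are made up for illustration) of storing only start offsets and recovering a token's end by re-lexing that one token. Because whitespace is skipped, the end of a token is not simply the start of the next one, which is the gap concern raised above.

```python
import re

# Toy token grammar (an assumption, not Roc's real lexical rules):
# identifiers, integers, a few two-character operators, then any other
# single non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|==|!=|<=|>=|->|\S")


def token_starts(src):
    """Tokenize, keeping only the start offset of each token."""
    # Whitespace matches no alternative, so finditer skips over it.
    return [m.start() for m in TOKEN_RE.finditer(src)]


def token_end(src, start):
    """Recover a token's end offset by re-lexing the single token at `start`."""
    return TOKEN_RE.match(src, start).end()


def region(src, starts, first, last):
    """Build a (start, end) region from two token indices."""
    return (starts[first], token_end(src, starts[last]))
```

With `src = "foo  == 42"`, `token_starts(src)` gives the three start offsets, and `region(src, starts, 0, 2)` spans the whole expression without any stored lengths.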
Jared Ramirez said:
Planning on starting type checking on match statements once Luke's draft PR lands
Working on canonicalization + type checking for match
@Jared Ramirez are you building on top of my branch? Does it look ok to merge as is?
Yeah, it looks good to me!
I've been mostly reading about exhaustiveness checking so far, haven't written any code yet. But was planning on starting with error messages until your branch is merged!
#7919 -- I've started adding a roc package. I'm thinking about and exploring how we will do multi file snapshots. Also wanting more realistic examples to test our parser/can implementation against to understand what areas need work still.
My current line of thinking is we make a directory that represents the root, with the .roc files in it exactly as they would be.
Until we get multiple module things up and running, I might have the snapshot tool just pick up these .roc files and treat them as independent file type snapshots and generate a .md for each...
I'm going to start investigating support for Nominal Tag Unions.
PR https://github.com/roc-lang/roc/pull/7922
#7923 - draft of making effectful functions work (and not use a type variable anymore!) along with some type inference fixes and other miscellaneous improvements
I've started refactoring CIR a little... https://github.com/roc-lang/roc/pull/7925. Basically pulling the obvious parts into separate files (Expressions, Statements, Patterns, TypeAnnotations etc), adding doc comments and examples. etc.
I've been researching how to implement new features like nominal types, and figure I may as well clean up and document everything as I learn more. It's easier for me to understand how things are wired together when it's organised, and hopefully helps the next guy that comes along too.
I can keep this in Draft and rebase it until @Richard Feldman and @Jared Ramirez land those two PRs sometime tomorrow (my time) I assume, I'm not tracking anyone else working in CIR right now.
I've rebased this CIR refactor on main. I'll continue poking at it for a few hours. Please avoid making PRs that touch CIR unless you're basing off this branch.
I'm taking a look at type checking binops. Are all of the following (except and, or, pipe_forward, and null_coalesce) going to use static dispatch?
/// Binary operators available in Roc.
pub const Op = enum {
    add, // +
    sub, // -
    mul, // *
    div, // /
    rem, // %
    lt, // <
    gt, // >
    le, // <=
    ge, // >=
    eq, // ==
    ne, // !=
    pow, // ^
    div_trunc, // //
    @"and", // and
    @"or", // or
    pipe_forward, // |>
    null_coalesce, // ?
};
pipe_forward should be -> instead of |> (and prob could use a different name)
it doesn't use static dispatch
it's just sugar for a normal function call
(maybe arrow_call might be a better name?)
e.g. arg1->my_fn(arg2, arg3) does the same thing as my_fn(arg1, arg2, arg3)
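As a tiny illustration of that equivalence (a Python sketch over a made-up AST; `Call`, `ArrowCall` and `desugar` are hypothetical names, not the compiler's actual nodes or passes), the arrow form just moves the left-hand side into the first argument slot:

```python
from dataclasses import dataclass


@dataclass
class Call:
    fn: str
    args: list


@dataclass
class ArrowCall:
    first_arg: object  # expression on the left of `->`
    call: Call         # the call on the right of `->`


def desugar(node):
    """Rewrite arg1->f(arg2, ...) into f(arg1, arg2, ...), recursively."""
    if isinstance(node, ArrowCall):
        rest = [desugar(arg) for arg in node.call.args]
        return Call(node.call.fn, [desugar(node.first_arg)] + rest)
    if isinstance(node, Call):
        return Call(node.fn, [desugar(arg) for arg in node.args])
    return node  # leaves (plain names here) pass through unchanged
```

So `desugar(ArrowCall("arg1", Call("my_fn", ["arg2", "arg3"])))` produces the same node as a direct `Call("my_fn", ["arg1", "arg2", "arg3"])`. No static dispatch is involved; it is an ordinary function call.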
null_coalesce also doesn't use static dispatch, and should be renamed to something like return_err
it works like this:
answer = my_result?
...is equivalent to:
answer = match my_result {
    Ok(val) => val
    Err(err) => return Err(err)
}
or maybe postfix_question_mark if we want to name it based on how it looks :smile:
btw I definitely want to make it so that if we get type mismatches with these, we report them using the binop names (since that's what was in the source code) rather than the functions they effectively desugar to!
like if I write + or ? in my code, I should see + or ? in the error message (and not just in the source code snippet, but in the words too!)
okay cool, since no desugaring has happened by type checking, nice error messages should be straightforward here
then ? shouldn't actually be a binop, since it's really just a suffix? Maybe I should remove it?
true
Maybe this null_coalesce was ported from the older ?? or optional record fields?
ohh
yeah ?? is totally a binop
that's like Result.withDefault right?
kinda - it's also not static dispatch, but rather a match sugar:
answer = my_result ?? 5
...is equivalent to:
answer = match my_result {
    Ok(val) => val
    Err(_) => 5
}
the distinction matters if you want to use it with things that affect control flow, like ?? return 5 or ?? crash "blah"
whereas if it just desugared to actual .with_default, those use cases wouldn't work
I'd like to re-attack Nominal Types.
Before I can do that we need to support parsing multiple UpperIdent separated by Comma tokens, and also a package prefix, e.g. json.Core.Utf8.Encoder might be an example of the Encoder type declaration (presumably nominal but could be an alias) in the Core/Utf8.roc module inside the json package.
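To pin down what that example is meant to resolve to, here is a small Python sketch that splits a qualified name into package, module path, and type. It works on a plain string rather than tokens, and it assumes the package prefix is the only lowercase-initial segment; `split_qualified` and `module_file` are hypothetical helpers, not parser functions.

```python
def split_qualified(name):
    """Split e.g. 'json.Core.Utf8.Encoder' into (package, module_parts, type_name)."""
    parts = name.split(".")
    # Assumption: a lowercase first segment is a package prefix.
    package = parts[0] if parts[0][:1].islower() else None
    rest = parts[1:] if package is not None else parts
    *module_parts, type_name = rest
    return package, module_parts, type_name


def module_file(module_parts):
    """Map module segments to a relative file path, e.g. ['Core', 'Utf8'] -> 'Core/Utf8.roc'."""
    return "/".join(module_parts) + ".roc"
```

For the example above this yields the json package, the Core/Utf8.roc module, and the Encoder type.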
I've pushed my WIP to https://github.com/roc-lang/roc/pull/7931
I think it's looking ok, but I'm still working through the diffs in the snapshots and picking up minor bugs that I'm fixing. I've run out of time today to finish this... I should be able to pick it up again Sat evening.
@Anthony Bullard there are changes in here in the Parser and AST/NodeStore to support parsing of qualified types and values in packages/module chains.
Hey! I'm continuing to read through the codebase, and now I have some questions regarding parsing. Can you please take a look at https://github.com/roc-lang/roc/pull/7936/files ?
I was looking at switching our build script back to a proper check step. On top of that, making check cover all executables and tests, but then having smaller steps for faster checking. The most relevant probably being check-test, which would likely be a solid default for most work. One thing that still annoys me with how zig/zls currently do checking is that if you have multiple check targets that include the same source, you get a ton of repeated error messages. So for every executable, we get another copy of each error message. Theoretically the fix for this is to use more modules. That way the executables just import our "main" module and the "main" module runs check once on all the files it contains. I haven't tried factoring this way yet, but I want to make some of this nicer overall (though it would require a refactoring of imports).
Have a draft PR for at least some starter work as I think of better factoring: https://github.com/roc-lang/roc/pull/7942
Not actually sure this is the right way to go, just tinkering and trying to match what zls suggests.
I feel like adding support for qualified Idents, and Nominal Types (including recursive) has been a mighty big yak.
I've gotten to the home stretch a few times now, and then noticed a bug which has taken me down another rabbit hole.
I got type instantiation landed, but it needs a refactor. gonna do that next.
Gonna work on checking nominal types next.
Thinking a next mini-milestone for me might be: after nominal type checking work, I might create a pared-down version of the bool.roc built-in file. Then canonicalize and type check that, and pass those types into the actual user roc file to use the built-in Bool nominal type in type checking in things like if conditions.
that sounds sweet!
if you wanted to get advanced, you could do the same with List.roc and implement it as a Cons list (just for now, of course)
:alert: PSA! :alert: as of https://github.com/roc-lang/roc/pull/7949 we now have a new EXPECTED section of snapshots, which lists the PROBLEMS we expect to see in the rest of the snapshot (just the ALL CAPS name of the problem and the source region, not the entire error message)
the goal here is just that if we accidentally cause regressions in our snapshots, this will let us know! (If we actually make fixes and reduce the reported PROBLEMS, then we can update the snapshot with a revised EXPECTED section.) And since it doesn't verify the entire contents of the snapshot, just the errors reported, it shouldn't give us false positives for things like refactors that change ident numbers etc.
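The idea of the check can be sketched roughly like this (Python, not the snapshot tool's real code; the dictionary keys, the region string format, and `check_expected` are all assumptions for illustration). Only the problem name and region are compared, so the rest of the snapshot can change freely:

```python
def check_expected(expected, reported):
    """Compare expected (NAME, region) pairs against reported problems.

    `expected` is a list of (name, region) tuples from the EXPECTED section.
    `reported` is a list of dicts; only 'name' and 'region' are inspected,
    so message text and ident numbers cannot cause a mismatch.
    """
    got = [(problem["name"], problem["region"]) for problem in reported]
    missing = [pair for pair in expected if pair not in got]
    unexpected = [pair for pair in got if pair not in expected]
    return missing, unexpected
```

An empty `unexpected` list with a non-empty `missing` list would correspond to the "we fixed something, update EXPECTED" case, while a non-empty `unexpected` list signals a possible regression.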
Wow, that PR hits more files than I expected...we have way more snapshots than I realized.
in the future we could maybe also do something with like verifying expected types of things
yeah we've accumulated a lot of them already :smile:
many of them have problems though
(as in, unintentional ones, e.g. because we haven't implemented things yet, or in some cases because they're using old syntax)
I've been talking with @Joshua Warner about the snapshots. I've started reviewing all of the old-syntax ones and I'm going through and deleting any that I think are not helpful, and updating the ones I think are useful.
I still plan on implementing Can for where, but I'm just taking a quick detour to clean up the snapshots a little. I got a bit carried away just converting them all using a script, and didn't take the time to properly review them.
Last updated: Jul 06 2025 at 12:14 UTC