Stream: compiler development

Topic: Scope behaviour


view this post on Zulip Luke Boswell (Jun 18 2025 at 10:51):

I haven't touched any code today. Instead I spent my time re-reading through the design documents and trying to understand/scope out the intended behaviour of Scopes specifically... and their interaction with lambdas, var re-assignment, and the loops for/while.

I've put it all together into the following design document:
https://gist.github.com/lukewilliamboswell/9175249d51fa89b26d7a32bd308fc531

I've been through the scenarios a few times, but I'm not super confident I've got everything correct. I feel like I'm close, but would appreciate a second pair of eyes to sense check the behaviour.

It will be much easier for me to implement (and verify correctness) once I know for sure how it's meant to behave.

view this post on Zulip Anthony Bullard (Jun 18 2025 at 11:38):

I read this today and discussed in detail with Luke, but in summary this is my feedback:

view this post on Zulip Luke Boswell (Jun 18 2025 at 11:44):

I thought we needed Scope in the parser to avoid an extra pass through the AST. Anthony has an idea for how we avoid that. It sounds like a good idea to try. If we can keep the Scope functionality in Can, then we should rename Level to Scope and move the levels: std.ArrayListUnmanaged(Level) = .{}, -> scopes: std.ArrayListUnmanaged(Level) = .{}, into src/check/canonicalize.zig.

view this post on Zulip Anthony Bullard (Jun 18 2025 at 11:46):

I don't think a single iteration through the top-level decls in a module is expensive enough to justify Scope in parsing

view this post on Zulip Anthony Bullard (Jun 18 2025 at 11:47):

And we can toss levels away after Can, as we only need CanIR and the exposed members after that

view this post on Zulip Luke Boswell (Jun 18 2025 at 11:47):

It wouldn't be that hard to change it later if we decided we really needed to save a pass through the AST (in the worst case)

view this post on Zulip Anthony Bullard (Jun 18 2025 at 11:49):

To be clear, it's not a pass through the AST. The module has the top-level decl StatementIds in a slice, iteration should be _very_ straightforward as we only need to find decls and introduce idents in the patterns
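As a rough illustration of that point, here is a small Python model (not the compiler's actual Zig code) of a single loop over a module's top-level statements that introduces the identifiers bound by each decl's pattern before any bodies are looked at. The `Decl` and `Expect` shapes and the name `introduce_top_level_idents` are made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Decl:
    """Hypothetical stand-in for a top-level declaration statement."""
    pattern_idents: list  # identifiers bound by the decl's pattern
    body: str             # not inspected by this loop


@dataclass
class Expect:
    """Hypothetical non-decl statement; the loop skips it."""
    body: str


def introduce_top_level_idents(statements):
    """Walk the top-level statements once and collect every ident a decl binds.

    Returns (scope, duplicates): scope maps ident -> index of the statement
    that introduced it; duplicates lists idents bound more than once.
    """
    scope = {}
    duplicates = []
    for index, stmt in enumerate(statements):
        if not isinstance(stmt, Decl):
            continue  # only decls introduce idents
        for ident in stmt.pattern_idents:
            if ident in scope:
                duplicates.append(ident)
            else:
                scope[ident] = index
    return scope, duplicates


statements = [
    Decl(["foo"], "|_| bar"),  # foo refers to bar, which is declared later
    Expect("foo(1) == 2"),
    Decl(["bar"], "2"),
]
scope, duplicates = introduce_top_level_idents(statements)
```

Because only the statement list is visited and bodies are never descended into, the cost is proportional to the number of top-level statements, not to the size of the AST.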

view this post on Zulip Luke Boswell (Jun 18 2025 at 11:49):

Does that mean I can have both of these next to each other?

x = 2
var x_ = 3

view this post on Zulip Anthony Bullard (Jun 18 2025 at 11:49):

Luke Boswell said:

Does that mean I can have both of these next to each other?

x = 2
var x_ = 3

Yes. Though I doubt that exact scenario would often crop up.

You might want to do something like

var count_ = 0
for something in some_list {
    count = something.get_count()
    count_ = if count > 5 count_ + count else count_
}

Which looks dumb, but in a realistic scenario like this, not being able to name that count seems like an unnecessary inconvenience

view this post on Zulip Luke Boswell (Jun 18 2025 at 11:56):

My thought process is a bit silly, but consider these scenarios with the _:

So it feels to me more like a semantic meaning thing that modifies an ident, and not a syntax thing.

view this post on Zulip Anthony Bullard (Jun 18 2025 at 11:59):

So, do you think _foo and foo are different?

view this post on Zulip Luke Boswell (Jun 18 2025 at 12:01):

They feel like the same thing.

view this post on Zulip Anthony Bullard (Jun 18 2025 at 12:05):

Interesting. I disagree, but we should get a lot of feedback and thoughts on it.

view this post on Zulip Anthony Bullard (Jun 18 2025 at 12:06):

I also don't mind (and actually kind of love) being able to use ' as much as you want at the end of an ident like you can in OCaml and F#

view this post on Zulip Anthony Bullard (Jun 18 2025 at 12:06):

And they are all unique

view this post on Zulip Anton (Jun 18 2025 at 12:29):

_foo and foo are different

This seems to be the simplest approach

view this post on Zulip Richard Feldman (Jun 18 2025 at 12:56):

yeah

view this post on Zulip Richard Feldman (Jun 18 2025 at 12:56):

I appreciate the concerns about it, but in literally every language I've ever heard of, identifiers are different if their names are different, full stop

view this post on Zulip Richard Feldman (Jun 18 2025 at 12:57):

I don't think this is worth the strangeness cost

view this post on Zulip Richard Feldman (Jun 18 2025 at 13:04):

regarding reassignment being disallowed across function boundaries, here's what I was thinking:

view this post on Zulip Richard Feldman (Jun 18 2025 at 13:18):

happy to elaborate/clarify/restate any of that :smile:

view this post on Zulip Brendan Hansknecht (Jun 18 2025 at 16:39):

Richard Feldman said:

I appreciate the concerns about it, but in literally every language I've ever heard of, identifiers are different if their names are different, full stop

I can think of a few exceptions, but they are very niche.

view this post on Zulip Brendan Hansknecht (Jun 18 2025 at 16:39):

What does "for is not special" mean?

view this post on Zulip Anthony Bullard (Jun 18 2025 at 17:27):

Brendan Hansknecht said:

What does "for is not special" mean?

just means that for introduces a normal scope just like a when branch, not anything special like a function scope
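To make the distinction concrete, here is a small Python model (again, not the compiler's Zig code) of a scope stack in which a `for` body pushes an ordinary scope while a lambda pushes a function-boundary scope. It assumes the rule mentioned earlier in the thread, that a `var` may be reassigned from nested ordinary scopes but not from across a function boundary. All class and method names are invented for the sketch.

```python
class ScopeError(Exception):
    pass


class Scopes:
    def __init__(self):
        # Each entry: (is_function_boundary, {ident: is_var})
        self.stack = [(False, {})]

    def enter(self, is_function_boundary=False):
        self.stack.append((is_function_boundary, {}))

    def exit(self):
        self.stack.pop()

    def declare(self, ident, is_var=False):
        self.stack[-1][1][ident] = is_var

    def reassign(self, ident):
        crossed_function = False
        # Search from the innermost scope outward.
        for is_boundary, idents in reversed(self.stack):
            if ident in idents:
                if not idents[ident]:
                    raise ScopeError(f"{ident} is not a var")
                if crossed_function:
                    raise ScopeError(f"{ident} reassigned across a function boundary")
                return
            # Moving outward past this scope; remember if it was a function scope.
            crossed_function = crossed_function or is_boundary
        raise ScopeError(f"{ident} is not in scope")


scopes = Scopes()
scopes.enter(is_function_boundary=True)  # body of a lambda
scopes.declare("count_", is_var=True)
scopes.enter()                           # for loop body: a normal scope
scopes.reassign("count_")                # allowed, no function boundary crossed
scopes.exit()
scopes.enter(is_function_boundary=True)  # nested lambda
try:
    scopes.reassign("count_")
    nested_lambda_allowed = True
except ScopeError:
    nested_lambda_allowed = False
scopes.exit()
```

In this model the only thing that makes a function scope "special" is the boundary flag; a `for` body and a `when` branch would both call `enter()` with the default.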

view this post on Zulip Luke Boswell (Jun 18 2025 at 23:13):

I've updated my scope analysis above. I've also started implementing this refactor.

There's a few things that need implementing/fixing along the way before I can properly validate the behaviour is correct :sweat_smile:

view this post on Zulip Anthony Bullard (Jun 18 2025 at 23:40):

such as...?

view this post on Zulip Luke Boswell (Jun 18 2025 at 23:41):

Parsing statements in lambdas

view this post on Zulip Anthony Bullard (Jun 18 2025 at 23:41):

Ok

view this post on Zulip Anthony Bullard (Jun 18 2025 at 23:42):

that should be relatively easy if we constrain it at first to decls and a limited set of exprs

view this post on Zulip Luke Boswell (Jun 18 2025 at 23:42):

My methodology is basically to write out a snapshot test that I want to work.

Then step through the tokens, parser, problems etc and work my way down the compiler stages ensuring everything is behaving the way I think it should.

view this post on Zulip Anthony Bullard (Jun 18 2025 at 23:42):

and then we can move to ifs, when's, crash, expect, etc

view this post on Zulip Anthony Bullard (Jun 18 2025 at 23:43):

so is that what you would like me to help with?

view this post on Zulip Luke Boswell (Jun 18 2025 at 23:46):

No thank you. I'm chipping away at that. I'm posting things here just to keep everyone informed of what I'm doing so we avoid duplicating efforts.

view this post on Zulip Luke Boswell (Jun 19 2025 at 00:20):

We talked about a top-level var being an error, but we thought we should introduce it to our scope anyway. It's currently a parser error so we have a malformed node and therefore we don't have any information to handle it in Can.

~~~SOURCE
module []

# This should cause an error - var not allowed at top level
var topLevelVar_ = 0

(file (1:1-39:33)
    (module (1:1-1:10) (exposes (1:8-1:10)))
    (statements
        (malformed_stmt (4:1-4:4) "var_only_allowed_in_a_body")

If we want to continue, we could potentially make this a diagnostic instead of a malformed AST node, then convert it to an assignment without the var or trailing underscore. This would be a change to the Parser.
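One way that recovery could look, sketched in Python instead of the parser's Zig: report a diagnostic for the top-level `var`, then hand later stages an ordinary assignment with the `var` keyword and the trailing underscore dropped. The tuple shapes and the name `recover_top_level_var` are hypothetical; only the tag `var_only_allowed_in_a_body` comes from the snapshot output above.

```python
def recover_top_level_var(stmt, diagnostics):
    """Turn a top-level ("var", name, expr) into ("assign", name, expr).

    Appends a diagnostic instead of producing a malformed node, so later
    stages still see a usable declaration.
    """
    kind, name, expr = stmt
    if kind != "var":
        return stmt  # nothing to recover
    diagnostics.append(("var_only_allowed_in_a_body", name))
    plain_name = name[:-1] if name.endswith("_") else name
    return ("assign", plain_name, expr)


diagnostics = []
recovered = recover_top_level_var(("var", "topLevelVar_", "0"), diagnostics)
```

With this shape, Can would still receive a declaration it can introduce into scope, and the error is reported once at the point where the `var` was seen.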

view this post on Zulip Luke Boswell (Jun 19 2025 at 08:38):

#7842 -- DRAFT

Implements some of what I described in the Scope behaviour above. I'd like to re-read with fresh eyes tomorrow before marking as ready for review.

Here is a demo of it so far...

module []

my_long_ident = "global"

foo = |_| {
    my_long_ident = "shadowing here"

    var sum_ = 0

    sum_ = sum_ + 1
    sum_ = sum_ + 1
    sum_ = sum_ + 1

    sum_
}

(attached image: Screenshot 2025-06-19 at 18.38.10.png)

view this post on Zulip Luke Boswell (Jun 19 2025 at 08:42):

I am reasonably sure the CI failures come from snapshots that include invalid things: when a PROBLEM includes a slice of the original source, this causes issues across OSes. I need to investigate further if we want to have pretty rendered problem reports included in snapshot files.

One hack solution might be to add a flag in the META section to include pretty rendered problems, otherwise by default it just prints out the tag name, for example PARSER .not_implemented.

view this post on Zulip Anthony Bullard (Jun 19 2025 at 10:11):

I know I would prefer to have the pretty-printed errors so that the snapshots also act as tests for the problem reports.

view this post on Zulip Luke Boswell (Jun 19 2025 at 11:23):

We can have them. Maybe the flag behaviour is to turn the pretty off instead. So we could flag snapshots that are deliberately testing "misuse" or bad utf8 etc.
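A minimal Python sketch of how such a flag might behave, assuming pretty rendering is on by default and a snapshot can opt out in its META section. The `pretty_problems` key, the tuple layout, and the rendered format are all hypothetical; the real snapshot tool may look quite different.

```python
def render_problems(problems, meta):
    """Render a snapshot's PROBLEMS section.

    `meta` is the parsed META section; `pretty_problems` is a hypothetical
    flag that defaults to on and can be switched off per snapshot.
    """
    pretty = meta.get("pretty_problems", True)
    lines = []
    for stage, tag, source_slice in problems:
        if pretty:
            # Includes a slice of the original source in the output.
            lines.append(f"{stage} .{tag}: {source_slice!r}")
        else:
            # Tag only: no slice of the original source ends up in the snapshot.
            lines.append(f"{stage} .{tag}")
    return "\n".join(lines)


problems = [("PARSER", "not_implemented", "var x_")]
plain = render_problems(problems, {"pretty_problems": False})
pretty = render_problems(problems, {})
```

Snapshots that deliberately test misuse or bad UTF-8 would set the flag to off, so their expected output never embeds the problematic source bytes.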


Last updated: Jul 06 2025 at 12:14 UTC