Stream: compiler development

Topic: Canonicalization overhaul - overview


view this post on Zulip Sam Mohr (Jan 13 2025 at 22:32):

For those of you that don't know, I am working on reworking the canonicalization code in roc_can for a few reasons:

In order to achieve these goals, the plan is to break roc_can into two new crates: roc_can_solo and roc_can_combine.

Once roc_can_combine is finished, we can pass the results to the type checker as before. In the future, we can even do partial typechecking on the values in each solo module, which should not even require that many changes to roc_constrain, but I don't know how feasible that is.

Some nice things that will come out of this change:

The plan for implementing this change comes down to me understanding the final outcome's shape well enough to implement it in small, peer-reviewable chunks. To that end, I'm working in this branch of my fork with a machete to roughly shape a copy of roc_can into the right shape. At the same time, I have a markdown document on my machine where I'm writing down what the "plain English" recipe for a solo module canonicalization, and then a combined one, will look like. Once I have those together, I'll start making PRs!

Feel free to ask any questions.

view this post on Zulip Joshua Warner (Jan 13 2025 at 22:35):

Awesome!!!

view this post on Zulip Joshua Warner (Jan 13 2025 at 22:35):

Think I'll take a pause on trying to resolve can panics for the moment then

view this post on Zulip Sam Mohr (Jan 13 2025 at 22:40):

This is really my main focus when working on Roc at the moment, but I am just one person. If anyone is concerned by the likely slowdown to implementing static dispatch in a big bundle with all of these other changes, I understand and partly share your concern. Since this doesn't immediately slot into the rest of the compiler plan, it might be a couple months for us to have static dispatch available. So if someone really wants it now, I'd be happy to talk about how we can parallelize this work

view this post on Zulip Sam Mohr (Jan 13 2025 at 22:41):

Otherwise, I'm happy to do this myself. It's really so enriching to get to put blood, sweat, and tears into what will be the best language someday

view this post on Zulip Luke Boswell (Jan 13 2025 at 22:48):

Ah man, you're taking all the fun jobs :smiling_face:

view this post on Zulip Luke Boswell (Jan 13 2025 at 22:49):

I volunteer to help find all the bugs you leave behind

view this post on Zulip Sam Mohr (Jan 13 2025 at 22:51):

Oh yeah, I forgot lmao. When I was writing the snake_case conversion, I wanted to break a test intentionally and write panic!("Luke, you're my only hope") or something.

view this post on Zulip Sam Mohr (Jan 13 2025 at 22:51):

I'll make sure to do that for this next set of PRs just for you

view this post on Zulip Anton (Jan 14 2025 at 10:15):

We have lots of language changes that need to be implemented, and it'll be faster to just implement those all in one go instead of incrementally. I'm talking:

These are not like the syntax changes, it seems like difficult bugs are a real possibility here. For debugging it can help a lot if you only need to consider a small set of changes. Are you sure this will be faster if you include potential debug time for issues that may pop up from the entire Roc ecosystem?

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:19):

The plan is not to make one giant PR with all changes included, but to make the changes incrementally on a new canonicalization code

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:19):

I just have no idea what it'll look like yet. What change do I make first?

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:21):

I don't know how to make these changes in steps. One benefit to this approach is that I get to ignore a lot of features to begin with. I'm starting without tracking lambda sets, without module params, etc.

view this post on Zulip Anton (Jan 14 2025 at 10:21):

changes incrementally on a new canonicalization code

Can you explain this in more detail?

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:22):

If someone would know how to do this without such a nuclear option that wouldn't take 6 months, it'd be great to hear

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:22):

I'm planning on modelling my changes on the strategy Agus has been taking with the new monomorphization code, more or less

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:23):

Start by outlining the end shape, and leaving a whole lot of "implement this and TODO" in places where it's obvious what needs to happen

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:24):

That can help us start with PRs that other people can understand

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:24):

To help with the robustness of this, I think a very important step will also be defining a testing strategy for all of these features

view this post on Zulip Anton (Jan 14 2025 at 10:25):

Makes sense!

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:26):

I've not dug too deeply into that side of things, but the canonicalization testing today is mostly testing individual warnings here or there, and a lot of desugaring testing.

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:27):

We'll need to figure that out as well. My hope is that we can do more unit testing on "just canonicalize this alias", not "create a whole module with aliases and check the problems that arise"

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:27):

That should make it more readable, and more modular
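A minimal sketch of what a "just canonicalize this alias" unit test could look like, in Rust. Everything here is hypothetical and only for illustration: `canonicalize_alias`, `Alias`, and `Problem` are made-up names, not the real roc_can API, and the two checks are stand-ins for whatever the real canonicalizer would validate.

```rust
use std::collections::HashSet;

// Hypothetical problem type; not the real roc_can problem enum.
#[derive(Debug, PartialEq)]
enum Problem {
    LowercaseAliasName(String),
    UnknownType(String),
}

// Hypothetical canonical form of a type alias.
#[derive(Debug, PartialEq)]
struct Alias {
    name: String,
    target: String,
}

// Canonicalize a single alias against the set of type names in scope.
// No surrounding module is needed, so a test can target exactly one rule.
fn canonicalize_alias(
    name: &str,
    target: &str,
    in_scope: &HashSet<&str>,
) -> Result<Alias, Problem> {
    let starts_upper = name.chars().next().map_or(false, char::is_uppercase);
    if !starts_upper {
        return Err(Problem::LowercaseAliasName(name.to_string()));
    }
    if !in_scope.contains(target) {
        return Err(Problem::UnknownType(target.to_string()));
    }
    Ok(Alias {
        name: name.to_string(),
        target: target.to_string(),
    })
}
```

A test then only has to build a small scope, call `canonicalize_alias` once, and compare the `Result`, rather than assembling a whole module and filtering the problems that come out.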

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:28):

Probably once I figure out the overall plan, I'll try to write it up in more detail and jump on a call with someone. That will give me an opportunity to make sure that there isn't a big hole in it somewhere.

view this post on Zulip Sam Mohr (Jan 14 2025 at 10:32):

I think the main steps before I can start drafting an outline for PR are:

view this post on Zulip Luke Boswell (Jan 14 2025 at 22:49):

If someone would know how to do this without such a nuclear option that wouldn't take 6 months, it'd be great to hear

I've been thinking about this.

We will be in a position with two Can stages, the current (legacy) one, and the (new) one being developed. Both of these take the same input, Parser AST... and eventually produce the same output, Mono IR?? right?

Can we wire up a test harness that can feed the same input in, and confirm it's getting the same output?

Starting with the most basic of expressions, but over time as the new Can implementation matures we can add tests and eventually be in a position where we have feature parity.
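A rough sketch of that harness idea in Rust, under heavy assumptions: `Ast`, `Ir`, `legacy_can`, and `new_can` are toy stand-ins invented for illustration, not the real parser AST, Mono IR, or either Can implementation. The new implementation returns `None` for nodes it does not support yet, so coverage can grow one node kind at a time.

```rust
// Toy input and output types; hypothetical stand-ins only.
#[derive(Debug, Clone, PartialEq)]
enum Ast {
    Num(i64),
    Add(Box<Ast>, Box<Ast>),
    Neg(Box<Ast>),
}

#[derive(Debug, Clone, PartialEq)]
enum Ir {
    Const(i64),
    Add(Box<Ir>, Box<Ir>),
    Neg(Box<Ir>),
}

// Stand-in for the current (legacy) stage: handles every node.
fn legacy_can(ast: &Ast) -> Ir {
    match ast {
        Ast::Num(n) => Ir::Const(*n),
        Ast::Add(l, r) => Ir::Add(Box::new(legacy_can(l)), Box::new(legacy_can(r))),
        Ast::Neg(e) => Ir::Neg(Box::new(legacy_can(e))),
    }
}

// Stand-in for the new stage: `None` means "not supported yet".
fn new_can(ast: &Ast) -> Option<Ir> {
    match ast {
        Ast::Num(n) => Some(Ir::Const(*n)),
        Ast::Add(l, r) => Some(Ir::Add(Box::new(new_can(l)?), Box::new(new_can(r)?))),
        Ast::Neg(_) => None,
    }
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Match,
    Mismatch,
    NotYetSupported,
}

// Feed the same input to both stages and classify the result.
fn compare(ast: &Ast) -> Outcome {
    match new_can(ast) {
        None => Outcome::NotYetSupported,
        Some(ir) if ir == legacy_can(ast) => Outcome::Match,
        Some(_) => Outcome::Mismatch,
    }
}
```

Keeping `NotYetSupported` separate from `Mismatch` means unsupported inputs (from hand-written tests or a fuzzer) do not count as failures, while any disagreement on supported inputs does.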

view this post on Zulip Luke Boswell (Jan 14 2025 at 22:50):

Or maybe there is a way to use the fuzzer, and incrementally add supported AST nodes

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:51):

The same output won't come out because of a number of changes. Static dispatch for instance

view this post on Zulip Luke Boswell (Jan 14 2025 at 22:51):

This won't catch all the new features... but maybe it helps get us something we can use sooner

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:51):

And I'm avoiding supporting abilities

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:51):

But yes

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:52):

For those things in common, it should output the same Mono IR

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:52):

Well...

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:52):

Lambda sets are getting built differently as well

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:53):

In that they're supposed to be built later in the compiler

view this post on Zulip Luke Boswell (Jan 14 2025 at 22:53):

Even without Abilities and Lambda sets... we could still cover a lot of the AST though

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:53):

Probably

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:53):

Worth a try to make sure we're on the right track

view this post on Zulip Luke Boswell (Jan 14 2025 at 22:56):

If we had the new Can module (even just stubbed out) @Joshua Warner might be able to help with the test harness

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:56):

Sure!

view this post on Zulip Sam Mohr (Jan 14 2025 at 22:56):

I think I'd be able to get something in the next few weeks as a stub

view this post on Zulip Luke Boswell (Jan 14 2025 at 23:02):

Sam Mohr said:

And I'm avoiding supporting abilities

Is there a way we could rip this out of current Can, and make it another pass to the side, or move it to the end or something? Basically... could we do something now so we can keep the current impl and then it could be compatible with the new Can?

And I guess lambda sets are in the same boat

view this post on Zulip Joshua Warner (Jan 14 2025 at 23:04):

In my professional experience, it can be very very tempting to do a rewrite _and_ make significant functionality changes at the same time, but it's almost always a terrible idea

view this post on Zulip Luke Boswell (Jan 14 2025 at 23:05):

Yeah, I'm trying to find ways we can keep everything online and enable an incremental approach.

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:06):

That's why I think the first step is to try to understand the end state, and then write down a plan that outlines what things should look like at the end state, and then break that into incremental changes as much as is feasible

view this post on Zulip Luke Boswell (Jan 14 2025 at 23:07):

It's also ok to start and change course along the way. More of a discovery or R&D type approach than an up front engineering effort

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:08):

Yes, I'd call this the R&D stage for sure

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:08):

An idea: between roc_can_solo and roc_can_combine, the latter is basically what we do today, but separated. We can maybe start by making a very small roc_can_solo that only does a little bit of work, and then passes everything else to the old roc_can

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:09):

And eventually we move as much as possible to roc_can_solo until it all works

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:09):

So step two would be to figure out the caching mechanism and roughly set that up

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:09):

And step one is to do the prep work of making roc_can ready for this work

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:10):

Meaning moving to use arenas as much as possible, changing names of things, using CompilerProblems where possible

view this post on Zulip Joshua Warner (Jan 14 2025 at 23:10):

What if we did something like:

view this post on Zulip Luke Boswell (Jan 14 2025 at 23:10):

Yeah, so for anything low technical maturity/R&D I would highly recommend taking a more agile/incremental approach -- keeping everything online and running ops normal.

I think the biggest risk here is the unknown unknowns (sorry for the cliché).

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:11):

Yeah, Josh's suggestion is basically what I was expecting. I can try it

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:12):

The subtle difference is that I think that roc_can_solo and roc_can_combine will use the same IR

view this post on Zulip Joshua Warner (Jan 14 2025 at 23:14):

There's no reason that can't be the case eventually

view this post on Zulip Sam Mohr (Jan 14 2025 at 23:14):

Well, one option is for the new desugared IR to be a roc_can_solo::Expr that looks just like desugared IR to start with, but over time we change it bit by bit, and once roc_can_solo::Expr and roc_can_combine::Expr are the same thing, we can use roc_can_solo::Expr

view this post on Zulip Joshua Warner (Jan 14 2025 at 23:14):

I would try to get there incrementally tho

view this post on Zulip Richard Feldman (Apr 11 2025 at 17:48):

so now that the Frontend Masters stuff is wrapped up, I have a backlog of things I should be doing...but what I'm fired up to do instead is to write some Zig canonicalization code :grinning_face_with_smiling_eyes:

view this post on Zulip Richard Feldman (Apr 11 2025 at 17:48):

what's the current status of that? I have no idea how far along things are!

view this post on Zulip Richard Feldman (Apr 11 2025 at 17:48):

I'm assuming @Sam Mohr might know?

view this post on Zulip Joshua Warner (Apr 11 2025 at 18:04):

I have some local changes to implement sexprs for the can ir, planning on submitting a pr “soon”

view this post on Zulip Sam Mohr (Apr 11 2025 at 18:52):

I'd love to see Richard working on it! I don't have that much work done, but I can put it in a branch and see what comes out.

If it's not already obvious, I've been burned out on Roc development for like a month and I don't know how to fix it. I was hoping taking time to play games and not think about it would work, but nothing seems to be working... Life outside has been tough. I'll give an update soon when I have the energy to come back.

view this post on Zulip Sam Mohr (Apr 11 2025 at 18:53):

So yes, thank you Richard for picking up my slack!

view this post on Zulip Brendan Hansknecht (Apr 11 2025 at 19:20):

Sam Mohr said:

If it's not already obvious, I've been burned out on Roc development for like a month and I don't know how to fix it. I was hoping taking time to play games and not think about it would work, but nothing seems to be working... Life outside has been tough.

This is normal and something that might just take time or the right break/inspiration. The time I invest in Roc varies significantly month by month. Oftentimes, it just takes a while to revive. Generally, certain things re-energize and inspire (like community events and longer vacations, e.g. holidays).

view this post on Zulip Brendan Hansknecht (Apr 11 2025 at 19:21):

Take the time you need and don't worry about roc. It will keep moving and it will still be here when you get back.

view this post on Zulip Anthony Bullard (Apr 11 2025 at 19:29):

We will miss your presence in the chat, hope to talk to you soon, buddy

view this post on Zulip Sam Mohr (Apr 11 2025 at 19:29):

Thanks Brendan

view this post on Zulip Sam Mohr (Apr 11 2025 at 19:30):

Yeah, maybe when work chills out

view this post on Zulip Richard Feldman (Apr 11 2025 at 21:42):

yeah super normal feeling... please don't feel bad about it! You're welcome whenever you're feeling it, just drop in and we'll catch you up on whatever's been happening :heart:

view this post on Zulip Richard Feldman (Apr 11 2025 at 21:42):

and thanks for all your awesome contributions so far!

view this post on Zulip Richard Feldman (Apr 11 2025 at 22:11):

also @Sam Mohr I'm happy to start from a blank slate, so no need to push a WIP branch unless you really want to :big_smile:

view this post on Zulip Anton (Apr 12 2025 at 08:11):

Life outside has been tough.

It pains me to hear that, I hope things get better :hugging:

view this post on Zulip Joshua Warner (Apr 12 2025 at 23:48):

Adding sexpr formatting to the can IR: https://github.com/roc-lang/roc/pull/7737
Note that this is largely untested & doesn't get hit (yet)
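For readers unfamiliar with the idea, here is a minimal sketch of s-expression formatting for a tree-shaped IR, written in Rust purely for illustration. It is not the code from the PR above; the `Expr` type, its variants, and the tag names (`int`, `ident`, `call`) are all assumptions.

```rust
// Hypothetical, tiny IR; not the real can IR.
enum Expr {
    Int(i64),
    Ident(String),
    Call(Box<Expr>, Vec<Expr>),
}

// Render a node as `(tag child child ...)`, recursing into children.
fn to_sexpr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => format!("(int {n})"),
        Expr::Ident(name) => format!("(ident {name})"),
        Expr::Call(func, args) => {
            let mut out = format!("(call {}", to_sexpr(func));
            for arg in args {
                out.push(' ');
                out.push_str(&to_sexpr(arg));
            }
            out.push(')');
            out
        }
    }
}
```

A textual form like this is handy for snapshot-style tests, since the expected output of a canonicalization step can be written and diffed as plain text.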

view this post on Zulip Isaac Van Doren (Apr 13 2025 at 01:14):

I’m constantly impressed by how mature and emotionally healthy the Roc community is :heart:

view this post on Zulip Sam Mohr (Apr 13 2025 at 21:47):

Yeah, it's really nice to see

view this post on Zulip Sam Mohr (Apr 13 2025 at 21:48):

There's a selection bias in this group for people that are willing to give their free time for no pay to improve the state of programming

view this post on Zulip Sam Mohr (Apr 13 2025 at 21:48):

So how surprised can we be?

view this post on Zulip Anthony Bullard (May 17 2025 at 18:51):

I wonder if anyone has made any progress here?

view this post on Zulip Anthony Bullard (May 17 2025 at 18:52):

I'd really love to do enough to get a Hello World program to be able to run (in an interpreter)

view this post on Zulip Brendan Hansknecht (May 17 2025 at 18:54):

I know this PR is moving, but not sure how much else has moved: https://github.com/roc-lang/roc/pull/7772

view this post on Zulip Anthony Bullard (May 17 2025 at 18:57):

I'd love to sit with someone as they review this and learn how to even make it through it. It's just too much code in an area I am not an expert in for me to review with any sort of authority at the moment

view this post on Zulip Anthony Bullard (May 17 2025 at 18:57):

I've implemented a few simple type checkers, and unification once before (but not to completion). But this is a LOT

view this post on Zulip Brendan Hansknecht (May 17 2025 at 19:00):

Haha, this is an area of the compiler I tend to avoid. I don't feel like it is that complicated, but I have never felt like grokking all the type checking pieces.

view this post on Zulip Richard Feldman (May 17 2025 at 19:15):

yeah I took a bunch of notes about that PR on the plane (no wifi, couldn't comment) - overall looks good, I just want to leave some comments

view this post on Zulip Richard Feldman (May 17 2025 at 19:15):

but yeah I'm planning to merge it this weekend!

view this post on Zulip Richard Feldman (May 17 2025 at 19:16):

I also have some cache serialization stuff that's close but needs some more work

view this post on Zulip Luke Boswell (May 18 2025 at 23:14):

I've been following along with all the commits. But I also don't feel qualified to really comment on it.

view this post on Zulip Jared Ramirez (May 19 2025 at 18:03):

Yeah, sorry the PR is so huge; I know it makes it really hard to digest. I debated breaking it up, just didn’t have time. I’ll make sure to chunk things up better in the future for ease of review.

Planning on looking at Richard’s comments in detail later today, but probably will merge as-is then open a follow up PR addressing comments this week.

And I’m happy to find a time with whoever is interested (@Anthony Bullard ?) and talk through it to share the knowledge!

view this post on Zulip Anthony Bullard (May 19 2025 at 20:35):

heck yeah! thanks Jared!

view this post on Zulip Luke Boswell (May 19 2025 at 23:04):

I'd love to join that discussion too

view this post on Zulip Anthony Bullard (May 20 2025 at 12:53):

I think if no one else is working on it, I'd like to try to set up desugaring. Should I start a new topic on that?

view this post on Zulip Anthony Bullard (May 20 2025 at 12:53):

I'm going to assume so :-)


Last updated: Jul 06 2025 at 12:14 UTC