Stream: compiler development

Topic: zig compiler - IRs and pass ordering


view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 00:44):

Are all of those IRs separate data structures? Many (especially in the build phase) feel like they should be the same IR, just with multiple passes running on it.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 00:46):

Also, I feel like adding in reference counting is likely too late in the pipeline. The interpreter will need to run reference counting. So likely we will want refcounting right after the check phase and to have the interpreter run on the refcounted IR, but maybe I am missing something with the dependencies here.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 00:48):

I also definitely think we should think more about the build phase pass ordering. What would lead to the simplest implementation?

view this post on Zulip Richard Feldman (Feb 02 2025 at 00:53):

I think the interpreter would do its own runtime reference counting

view this post on Zulip Richard Feldman (Feb 02 2025 at 00:53):

separate system

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 00:54):

Sure, but it still needs to know when to check reference counts, right? Like, when calling a function, it needs to know that the function requires incrementing the reference count of the 3rd arg passed in.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 00:55):

Though I guess that probably can still be figured out at runtime. Hmm.

view this post on Zulip Luke Boswell (Feb 02 2025 at 00:55):

That stage has a big question mark around it. Sam mentioned he was really unsure about what it does and where it should go.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 01:00):

I see it as a few different pieces:

  1. You have inserting refcount increments and decrements based on lifetimes.
  2. You have optimizations like drop specialization to remove a ton of refcounting changes by realizing when records or containers go out of scope but some of their elements stick around (like record destructuring).
  3. You have morphic, which we don't plan to port and don't need to worry about for now, but which can statically recognize uniqueness for mutation purposes.
  4. You have borrow inference, which is super important for hot loop perf and allows avoiding refcounts if a function argument is only ever read from and never potentially mutated (currently it only works for lists, but theoretically it would be more powerful if it understood aggregates).
  5. Finally, you have the actual generation of code that knows how to perform a refcount. For example, you have to know how to recursively decrement refcounts of complex data structures. This makes code gen a lot simpler because code gen no longer has to understand how to refcount types.
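Pieces 1 and 5 together can be sketched roughly like this, over a hypothetical straight-line IR (the tuple shape and op names are invented for illustration): a last-use scan inserts `dec_ref` after a value's final use, and only heap-allocated values get refcount ops at all.

```python
def insert_refcounts(instrs, heap_values):
    """instrs: list of (dest, op, args) tuples.
    Inserts a dec_ref after the last use of each heap-allocated value."""
    last_use = {}
    for i, (dest, op, args) in enumerate(instrs):
        for a in args:
            last_use[a] = i  # later uses overwrite earlier ones
    out = []
    for i, (dest, op, args) in enumerate(instrs):
        out.append((dest, op, args))
        for a in set(args):
            if a in heap_values and last_use[a] == i:
                out.append((None, "dec_ref", (a,)))
    return out

prog = [
    ("xs", "make_list", ()),
    ("n",  "list_len",  ("xs",)),  # last use of xs: dec_ref goes after this
    ("m",  "add_one",   ("n",)),   # n is an Int, so no refcount op is emitted
]
lowered = insert_refcounts(prog, heap_values={"xs"})
```

This only covers straight-line code; real lifetime analysis has to handle branches and loops, which is where the later optimization passes come in.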

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 01:03):

also, I'm definitely keen to help port and work on a lot of this backend stuff and our new wiring to LLVM. I feel like I have a solid grasp on many things after specialization (though I have not actually looked into the algorithm for drop specialization).

The nice thing with most of these passes is that they are optional. So in the list above, you would start by porting just 1 and 5. Later, 2, 3, and 4 can be added in.

view this post on Zulip Richard Feldman (Feb 02 2025 at 01:59):

Brendan Hansknecht said:

Though I guess that probably can still be figured out at runtime. Hmm.

yeah, for example Python does automatic reference counting fully at runtime

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 07:33):

As a note, I realized I missed reset-reuse, which is also refcounting-related. Lots of optimization passes to eventually add, but I don't think they are fundamentally needed.

Oh, another note on IRs: sharing IRs between multiple passes means you can turn off passes or implement new passes without changing API boundaries significantly.

view this post on Zulip Sam Mohr (Feb 02 2025 at 09:25):

You're right. However, it's easier to ensure correctness when we have the shape of the IRs constrained to what each phase can actually use and produce. Less of a need to say "I shouldn't have a function here; save a compiler problem to a big list."

view this post on Zulip Sam Mohr (Feb 02 2025 at 09:25):

It's prioritizing correctness over performance

view this post on Zulip Sam Mohr (Feb 02 2025 at 09:25):

It really won't be that hard to change later

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:47):

I think adding 10 to 20 clones of the IR for all the phases is still a mistake either way. And I don't think it helps correctness in many cases.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:49):

As a first note, if an optional pass can avoid changing to a new IR, that really helps with testing correctness. It makes it easy to toggle any pass off, which is huge for debugging correctness.
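A minimal sketch of that toggling idea, with made-up pass names: when optional passes share one IR type, any of them can be switched off when bisecting a miscompile, because every pass takes and returns the same IR.

```python
# Toy passes over a list-based IR; the names are illustrative, not Roc's
# actual pipeline. Each pass has the same signature: IR in, IR out.
def insert_refcounts_pass(ir):    return ir + ["refcounted"]  # required
def drop_specialization_pass(ir): return ir + ["drop_spec"]   # optional
def borrow_inference_pass(ir):    return ir + ["borrowed"]    # optional

OPTIONAL = {
    "drop_specialization": drop_specialization_pass,
    "borrow_inference": borrow_inference_pass,
}

def run_pipeline(ir, enabled=frozenset()):
    ir = insert_refcounts_pass(ir)      # required passes always run
    for name, pass_fn in OPTIONAL.items():
        if name in enabled:             # same IR in and out, so any
            ir = pass_fn(ir)            # optional pass can be skipped
    return ir

full = run_pipeline([], enabled={"drop_specialization", "borrow_inference"})
bisect = run_pipeline([], enabled={"borrow_inference"})  # one pass toggled off
```

If each optional pass instead produced a distinct IR type, skipping one would require an extra conversion step, which is the trade-off being debated here.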

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:50):

I'm pretty sure @Joshua Warner was running into multiple "this should have been desugared" errors while getting fuzzing up to speed

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:50):

We can assume testing will catch these things, but I'd always rather trust types than tests

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:50):

On top of that, some passes like lifting lambdas really should not change the IR. It will be easier to move a few nodes around than to make an entire new IR.

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:51):

And Ayaz keeps reiterating that IR translation costs basically nothing compared to algorithmic costs like specialization

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:52):

Sure, but even ignoring the compile time cost, I think it hurts correctness for many phases (especially optimization phases that are optional)

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:52):

I don't understand how correctness could be hurt

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:52):

Cause you lose the ability to turn off a pass. It is now always required due to creating a new IR.

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:53):

I don't think any of our passes besides maybe refcounting and maaaaybeee lower_ir could be turned off

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:54):

reset reuse, drop specialization, borrow inference, morphic (if we ever re-add it), any other future optimization

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:54):

Oh, those, sure

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:54):

Those aren't in the preliminary design

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:55):

The 9 in the preliminary design are all required

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:55):

Everything past that could get away with reuse

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:55):

Ah, ok, then quite possibly it is fine.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:56):

What are the differences between the middle IRs? Like lift functions?

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:56):

The 9 I'm thinking about are:

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:56):

I'll edit that message to talk about structural differences

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 16:58):

yes, please do, I would like to understand

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:58):

done

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:59):

Very terse descriptions, let me know what needs explaining

view this post on Zulip Sam Mohr (Feb 02 2025 at 16:59):

The only one we might consider is solve functions, but that's playing with fire a little bit

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:00):

resolve imports (reused for typechecking) (any "delayed import" values are resolved to the actual values)

Why is this an entire new IR and not just a new dictionary of resolved imports, or editing all import nodes and then banning the old type of import node? What does a new IR help with?

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:00):

just one example

view this post on Zulip Sam Mohr (Feb 02 2025 at 17:04):

As a relative percentage, like 90%+ of the IR is shared

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:05):

yeah, so why make a new IR instead of just making a verifier that the old node type is gone/banned? That is exceptionally easy to fuzz/test for correctness
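A sketch of such a verifier, using hypothetical node names: after import resolution runs, walk the shared IR and assert that no banned node kind survives anywhere in the tree.

```python
def verify_no_banned_nodes(node, banned=("DelayedImport",)):
    """Walk a (kind, children) tree and assert no banned node kind remains.
    Node names are illustrative, not the real Roc IR."""
    kind, children = node
    assert kind not in banned, f"{kind} should have been resolved by now"
    for child in children:
        verify_no_banned_nodes(child, banned)

good = ("Call", [("ResolvedImport", []), ("IntLit", [])])
verify_no_banned_nodes(good)  # passes: the resolution pass did its job

bad = ("Call", [("DelayedImport", [])])
# verify_no_banned_nodes(bad)  # would raise AssertionError
```

Run in debug builds or under a fuzzer, this catches "stale node" bugs at the pass boundary without a separate IR type, at the cost of being a runtime check rather than a compile-time one, which is the objection raised below.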

view this post on Zulip Agus Zubiaga (Feb 02 2025 at 17:10):

Yeah, some passes seem to just return subsets of the previous IRs instead of changing the fundamental shape of them

view this post on Zulip Sam Mohr (Feb 02 2025 at 17:13):

:shrug: it's a question of trust. Not that big a deal either way

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:14):

For context, I am not against spinning up new IRs. I have worked with MLIR professionally, and it is designed to make creating new IRs super easy. In practice, making too many IRs always sucks. Verifiers and progressively adding more detail generally work a lot better if IRs are similar.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:14):

it's a question of trust

Verifiers can remove the need for trust.

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:15):

The feedback on verifiers is far from instantaneous, though

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:15):

e.g. you have to find a failing input first, which is not always obvious

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:16):

If you have different IR types, the compiler will yell at you immediately

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:17):

I agree that it gives some compile time guarantees around typing which can be nice.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:18):

I still am not sold that it is worth 9 IRs, many of which are exceptionally similar. As I mentioned above, from working with MLIR, too many dialects end up being more painful to work with. I think there is a balance where some passes should share and others should be split. It depends on the amount of difference between the IRs.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:22):

Anyway, not a hill I will die on. I think 9 exceptionally similar IRs will just make a lot of extra work and maintenance burden for us with little gain.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:23):

I wonder if there are ways we can catch these at compile time without the entire new IR. Not sure.

view this post on Zulip Agus Zubiaga (Feb 02 2025 at 17:25):

Maybe storing the parts that are different out of band?

view this post on Zulip Agus Zubiaga (Feb 02 2025 at 17:25):

Not sure if that helps correctness, though, we would have to look at the specifics

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:25):

Ah yeah, that would work. Then, because the old out-of-band data is no longer around, you are guaranteed to have transitioned.

view this post on Zulip Agus Zubiaga (Feb 02 2025 at 17:26):

Yeah, exactly

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:26):

Though I guess you could still potentially have a stale reference to the old out-of-band data.

view this post on Zulip Agus Zubiaga (Feb 02 2025 at 17:27):

There could be debug-mode checks for that I think

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:34):

Yeah, just like verifiers on the same IR for classes of nodes being gone.
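The out-of-band idea can be sketched under these assumptions (node ids and the `resolve_imports` stage are hypothetical): the shared node store never changes shape, and each stage's extra facts live in a side table keyed by node id that the stage consumes and replaces.

```python
def resolve_imports(delayed_imports):
    """Consume the previous stage's side table ({node_id: module_name})
    and produce the next stage's ({node_id: resolved_value}).
    The node tree itself is never touched."""
    return {nid: f"<value of {mod}>" for nid, mod in delayed_imports.items()}

nodes = {0: "import", 1: "call", 3: "import"}  # shared store, untouched
delayed = {0: "Json", 3: "Http"}               # stage N side table
resolved = resolve_imports(delayed)            # stage N+1 side table
del delayed  # dropping the old table guarantees the transition happened
```

The stale-reference risk mentioned above is real in a language without ownership tracking; debug-mode checks (or poisoning the old table) are the suggested mitigation.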

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:40):

Also, one simple example of maintenance pain: if you add a new IR node, you now have to add it to and wire it through 9 dialects instead of 3. Any larger refactor also becomes multiple times bigger if you have more IRs.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:41):

You lose the ability to ever reorder passes without significant hassle (this may not matter for Roc if we truly know the pipeline, but it has been a big pain in practice for work I have done before)

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:42):

If passes have fundamentally different requirements and guarantees on input and output (e.g. it expects a specific node to have been desugared), then it's already not re-orderable

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:43):

Some passes are not re-orderable. That is true. But that is not true of all passes

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:43):

And that is where verifiers come in.

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:43):

I would contend that it is true of all passes that should have separate IRs

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:44):

(and vice versa)

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:45):

If you add a new IR node, you now have to add it to and wire it through 9 dialects instead of 3.

This "wire things around" kind of update is the sort of thing I expect LLM-driven tooling to get much better at very quickly.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:47):

I think that is a bad metric for what is a maintainable and fun to work on compiler.

view this post on Zulip Joshua Warner (Feb 02 2025 at 17:47):

Fair :shrug:

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:48):

@Sam Mohr is specialize where we make N copies of each function?

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 17:53):

If so, we may want to order passes as:

  1. lower IR
  2. Add in refcounting and do any sort of refcounting optimizations (not sure we can do all the optimizations before specialization)
  3. specialize
  4. code gen concrete refcounting functions (this will also turn unused refcount nodes into noops)
  5. gen llvm/other backends

Note: for 2, I think at a minimum we can do the refcount insertion and borrow inference. We may not be able to do other optimizations like reset-reuse.

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:00):

How can we do refcounting for a function that uses either a list or an int? We'd have to optionally do it

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:01):

If you're doing this level of reordering, we need an expert to agree on it, aka @Ayaz Hafiz . I'm pretty sure we need to make everything first order before we start lowering

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:10):

Refcounting is the same. Int refcounting will just turn into a noop when finalized. So we would insert refcounts for everything concrete that needs refcounting and anything dynamic that might need refcounting.
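That finalization step can be sketched like this (op and type names invented for illustration): the same generic, pre-specialization IR serves both a List instance and an Int instance, because refcount ops on unboxed types are simply dropped.

```python
UNBOXED = {"Int", "Float", "Bool"}  # refcount ops on these finalize to no-ops

def finalize_refcounts(instrs, types):
    """Drop inc_ref/dec_ref on unboxed types; keep them for heap types."""
    out = []
    for op, arg in instrs:
        if op in ("inc_ref", "dec_ref") and types[arg] in UNBOXED:
            continue  # becomes a no-op, so the node just disappears
        out.append((op, arg))
    return out

generic = [("inc_ref", "x"), ("use", "x"), ("dec_ref", "x")]
# Two specializations of the same generic IR:
as_list = finalize_refcounts(generic, {"x": "List"})  # refcounts kept
as_int  = finalize_refcounts(generic, {"x": "Int"})   # refcounts dropped
```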

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:11):

Why not just do refcounting once at the end?

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:11):

I see your goal

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:11):

Which is to specialize later, to avoid doing the same work on more copies later on

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:11):

Just trying to move as much stuff as possible before specialization to avoid repeating it n times

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:11):

Yeah, exactly

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:12):

I'm just assuming @Ayaz Hafiz would have suggested that was an option if we could

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:12):

But maybe I'm putting words in his mouth

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:13):

Also, I assume specializing a dense IR will be faster than specializing a recursive IR, hence moving lower IR before specialization. I think for the most part this shouldn't hurt complexity/correctness much. Just a different order.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:13):

But fundamentally the same work

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:13):

Actually, I don't know how lowering to statements would work with child functions not having been lifted yet

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:14):

I'm not sure I follow; isn't lifting functions much earlier in the pipeline?

view this post on Zulip Luke Boswell (Feb 02 2025 at 23:14):

Can we spin these off into separate discussion threads, and keep the contributor coordination meeting thread scoped to planning/scheduling discussion? It's going to be hard to find things later.

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:14):

It's after type specialization

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:14):

I'm on mobile, could someone else do it?

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:17):

@Sam Mohr can you explain the difference between type and function specialization? I assumed function specialization is what leads to N copies of each function, but maybe that is wrong?

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:18):

No, it's different

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:19):

Type specialization takes every top-level value and every top-level function and makes a concretely-typed copy for each specific set of types it should have at runtime
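A toy illustration of type specialization as described here (the name-mangling scheme is made up): one concretely-typed copy per distinct tuple of argument types seen at call sites, with duplicates collapsing.

```python
def specialize(fn_name, call_site_types):
    """Return one concretely-typed copy per distinct tuple of call-site types."""
    copies = {}
    for types in call_site_types:
        mangled = fn_name + "_" + "_".join(types)
        copies[mangled] = types  # duplicate type tuples collapse here
    return copies

# Three call sites, but only two distinct type tuples -> two copies.
copies = specialize("len", [("List", "Int"), ("List", "Str"), ("List", "Int")])
```

This is also why the earlier discussion cares about pass ordering: any work done after this point runs once per copy, not once per function.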

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:20):

Function specialization is https://github.com/roc-lang/rfcs/blob/ayaz/compile-with-lambda-sets/0102-compiling-lambda-sets.md#function_specialize

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:20):

oh....

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:21):

Which is a way to convert higher-order functions to functions we call using a tag union to represent the different possible arguments

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:21):

It's kinda like, if not exactly, defunctionalization
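A small illustration of defunctionalization in general (not the RFC's exact scheme): each lambda becomes a tag carrying its captures, and calls go through a single `apply` that dispatches on the tag, so no function pointers are needed.

```python
# Original higher-order code would be: map(lambda x: x + n, xs)
# Defunctionalized: the closure's captures ride along as the tag's payload.

def apply(closure, x):
    tag, payload = closure
    if tag == "AddN":    # captures n
        return x + payload
    if tag == "Double":  # captures nothing
        return x * 2
    raise ValueError(f"unknown closure tag: {tag}")

def map_defun(closure, xs):
    return [apply(closure, x) for x in xs]

result = map_defun(("AddN", 10), [1, 2, 3])  # [11, 12, 13]
```

In a typed setting the `(tag, payload)` pair is the tag union the message above refers to, and each higher-order call site gets its own `apply`-style dispatcher.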

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:21):

Confusing name, I agree

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:22):

and type specialization has to come before all of these?

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:22):

We could maybe call it defunctionalize to make the name distinct

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:22):

It doesn't have to, but it's gonna be pretty hard to do those correctly otherwise I think

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:22):

ack

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:22):

That section of the compiler, even if we combine those phases, will be hard to reorder

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:23):

Yeah, then the suggested order sounds fine

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:23):

Though it'd be nice to avoid doing more work in the phases after it because of the type specialization

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:25):

Yeah, would be nice if we could share more. Cause, for example, if you have 10 specializations of the same function that uses lists, they all refcount the same way.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:26):

But unless that is easy to do, I 100% agree with correctness first.

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:28):

Brendan Hansknecht said:

Also, one simple example of maintenance pain: if you add a new IR node, you now have to add it to and wire it through 9 dialects instead of 3. Any larger refactor also becomes multiple times bigger if you have more IRs.

Not sure if we'll come to an agreement, but it feels like we're not on the same page, so I'll ask: if we can't really reorder from the suggested order, then are you still against the IRs being different?

One big benefit of different IRs is that if we make sure that calculating stage N takes the stage N+1 IRs for the dep modules, then we have a type guarantee that the deps must have been compiled first

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:29):

I think it's my lizard brain speaking, but I see fuzzing as a high probability test for correctness, and types as true guarantees

view this post on Zulip Sam Mohr (Feb 02 2025 at 23:30):

I always prefer types where possible, but I'm definitely always a high-level thinker for modeling problems

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:31):

my heuristic for this would be:

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:31):

all optimizations over the so-called mono IR, refcounting, etc. are probably the same IR

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:31):

cost of creating a new IR is a bad reason not to create an IR IMO

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:32):

the IR creation being redundant and obfuscating the actual transformation is a good reason not to create an IR, which is the case for refcounting and optimizations

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:37):

Yeah, that sounds pretty reasonable overall as a metric. It's up to the stage author to decide if the invariants are simple enough to reuse the same IR. In my mind, many of the "it's the same IR minus one specific thing" cases are pretty simple invariants, but I won't be writing those passes, so whoever writes the passes can decide.

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:39):

@Ayaz Hafiz do you think any work can be hoisted above type specialization to avoid repeating it N times? Just curious if there is anything flexible that we should try to move now instead of later, when the pipeline is more ironed out.

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:49):

what is the worry about what might be repeated N times?

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:51):

It is N times faster to do work once than N times. So anything that is reasonable to move before specialization probably should be.

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:51):

i see

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:52):

i probably wouldn't think about this too deeply right now

view this post on Zulip Brendan Hansknecht (Feb 02 2025 at 23:52):

I don't want to think too deeply about it, but if there is anything we think would be easy to move now without a concern for correctness, I think we should move it. I think later it will be much harder to fix these kinds of issues.

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:53):

if there's something obvious that can be lifted to an earlier pass, it will be obvious and easy to lift, i think

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:53):

for example, function lifting can kind of be done anywhere after typechecking

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:53):

but that is one of those kind of obvious/trivial things

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:54):

i think it's worth embracing that all of this will have to be rewritten at the limit anyway

view this post on Zulip Ayaz Hafiz (Feb 02 2025 at 23:54):

if you want the fastest compiler there are some parts of this architecture that will be blockers

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:27):

Ayaz Hafiz said:

i think it's worth embracing that all of this will have to be rewritten at the limit anyway

I don't want to embrace that

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:27):

I think that will lead us to an unnecessarily and noticeably slow compiler for years and years

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:27):

I think that's what most compilers do, and I think we can do better without going overboard like we did before

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:28):

I do think it's fine to have different IRs for each pass (when it would be valuable)

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:28):

but I think eventually once things have settled down we might explore combining some of them as an incremental change

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:28):

eventually as in "someday" not like right after 0.1.0 or anything :sweat_smile:

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:29):

and I think that can be done more incrementally than rewriting the whole compiler

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:29):

i don't know. i don't think you can have both worlds. the current implementation is closer to what the limit needs to look like. but there's a reason a rewrite is being discussed.

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:30):

I'm saying I think we should prioritize correctness, simplicity, maintainability, and debuggability for this rewrite, but I think "embrace that it's all gonna be rewritten and don't worry about it" goes too far

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:30):

i'm not saying this approach will be too slow or anything like that. it will be totally fine for the vast majority of programs, because the limiting factors for any non-trivial program are going to be algorithmic, not the number of passes or whatever.

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:30):

well, i'm just saying that the fastest possible approach is not this approach

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:31):

for sure!

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:31):

I just think we shouldn't be all-or-nothing about it haha

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:31):

before the goal was "fastest possible" and that got us into trouble

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:31):

I don't think we should aim for "fastest possible" for this rewrite

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:31):

but I also don't think we should aim for "only algorithmic perf matters" because I don't think that will turn out to be correct

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:31):

I mean maybe I'll be wrong!

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:32):

i think it's true. the only cases i can think of in the current implementation where it was a problem were because of algorithms. but maybe i'm missing cases you have in mind?

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:33):

I agree with that, but "the current thing that's heavily optimized for perf only has algorithmic trouble" does not imply "if we have one that's not optimized at all for perf it will also only have algorithmic trouble"

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:34):

that would imply that only algorithmic performance optimizations matter to compilers as a category, which I would be extremely skeptical of haha

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:34):

i agree

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:36):

i think the difference is in limiting factors. it's far more likely that algorithmic passes are the limiting factor, because Roc is doing a lot of work to lower a high-level language to a low-level one, rather than just rewriting stuff to a different form. But at the limit you have to fuse both, and this approach doesn't really work for that, I think.

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:37):

but also like i think they should be two different planes

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:37):

the concerns about the algorithmic passes should be entirely disjoint from the implementation of how data is stored

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:37):

I guess a really big question mark here is whether the "dev backend can be an interpreter and it's fine" hypothesis proves correct in practice

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:37):

that way you have a nice interface boundary between the two and can prefer to optimize one or the other

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:38):

because if that works out, then in dev builds we stop after type-checking

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:38):

and only --optimize builds do any of this expensive stuff

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:38):

if that kind of boundary can be done correctly, then you don't have to worry about getting both right at the same time

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:38):

which get run way less frequently, and the downside of a slower feedback loop is much less (unless you're doing performance optimization, in which case it's painful if they're slow)

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:38):

I think it will work out fine, taking e.g. Python or OCaml as prior art

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:38):

OCaml is interpreted? :astonished:

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:39):

both of which don't have sophisticated interpreters (or at least didn't until the past few years)

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:39):

ocaml has many modes

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:39):

TIL!

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:46):

yeah the specific scenario I'm concerned about (because I don't have any idea how likely it is to come up in practice) is something like:

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:46):

Brendan pointed out that our current dev backend is so unoptimized that it might not actually be enough faster than an interpreter to be noticeable

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:46):

and also, we can potentially JIT up the interpreter if :point_up: becomes a problem in practice

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:47):

which also would still skip all the steps after type-checking for dev builds

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:48):

i think this is the kind of thing where it's probably just worth punting until it comes up

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:48):

if the implementation is simple, then it's not hard to refactor to make this work or introduce complexity for specific usage patterns

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:49):

and maybe it never comes up

view this post on Zulip Richard Feldman (Feb 03 2025 at 00:49):

"this" being the IRs after type-checking?

view this post on Zulip Ayaz Hafiz (Feb 03 2025 at 00:49):

well just the compiler in general

view this post on Zulip Richard Feldman (Feb 21 2025 at 16:19):

just a note I remembered: today, we have automatic dead code elimination of unused constants (e.g. I use a package but I only use some of its exposed operations, and nothing I actually use involves certain top-level constants the package defines) because we compile them to thunks which automatically get DCE'd

view this post on Zulip Richard Feldman (Feb 21 2025 at 16:19):

but in the new compiler where we're instead evaluating constants at compile-time, we need to make sure that we don't include constants that have been compiled down to literals unless they're actually referenced somewhere!

view this post on Zulip Richard Feldman (Feb 21 2025 at 16:20):

(same goes for string literals etc in general)
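The usage-flagging idea amounts to a reachability walk: starting from the entry points, mark every constant actually referenced, and emit only those, rather than emitting everything and relying on the linker to strip the rest. A minimal sketch (the reference graph shape is invented for illustration):

```python
def reachable_constants(refs, roots):
    """refs: {name: [names it references]}.
    Returns the subset of definitions reachable from the roots."""
    used, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name not in used:
            used.add(name)
            stack.extend(refs.get(name, []))
    return used

refs = {
    "main": ["greeting"],
    "greeting": ["name"],
    "name": [],
    "unused_const": ["name"],  # defined by a package but never referenced
}
used = reachable_constants(refs, roots=["main"])  # excludes unused_const
```

Note that a constant referenced only by dead code (`unused_const` above would keep `name` alive if we scanned definitions instead of walking from roots) is correctly excluded, which is exactly the thunk-DCE behavior being replaced.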

view this post on Zulip Brendan Hansknecht (Feb 21 2025 at 17:09):

Depends on what section granularity we use

view this post on Zulip Brendan Hansknecht (Feb 21 2025 at 17:10):

You can also put each constant in its own section, just like functions.

view this post on Zulip Richard Feldman (Feb 21 2025 at 17:15):

true, although it'll improve build speed if we only flag things as used when we actually use them, and then only include them if so

view this post on Zulip Richard Feldman (Feb 21 2025 at 17:15):

(compared to adding everything and then having the linker remove them again in a later pass)

view this post on Zulip Brendan Hansknecht (Feb 21 2025 at 17:44):

Same for functions preferably

view this post on Zulip Brendan Hansknecht (Feb 21 2025 at 17:45):

I think we could use specialization to do a liveness check even on functions that are already concrete.

view this post on Zulip Richard Feldman (Feb 21 2025 at 17:47):

we've always done it for functions

view this post on Zulip Brendan Hansknecht (Feb 21 2025 at 17:49):

Oh, never realized

view this post on Zulip Sam Mohr (Feb 21 2025 at 18:57):

I think we can have type specialization handle this


Last updated: Jul 06 2025 at 12:14 UTC