The interpreter is still quite slow. Is it worth it to optimize it, or will it be replaced by a dev backend soon anyway?
One concrete slowdown I see is that the compiler gets significantly slower with loops over 2000 iterations.
app [main!] { pf: platform "https://github.com/lukewilliamboswell/roc-platform-template-zig/releases/download/0.6/2BfGn4M9uWJNhDVeMghGeXNVDFijMfPsmmVeo6M4QjKX.tar.zst" }
main! = |_| {
_foo = 0.to(<n>)
Ok({})
}
depending on n, I have the following time output (Windows x86 machine):
| n | time (s) |
|---|---|
| 500 | 0.283 |
| 1000 | 0.323 |
| 2000 | 0.47 |
| 3000 | 1.282 |
| 4000 | 2.331 |
| 8000 | 9.773 |
I did not investigate. Is it worth doing so? And performance of the interpreter in general?
I think we want to keep the interpreter for constant folding. I would also like to keep it to keep our options open. It could be difficult to resurrect compared to just keeping it running. Definitely for this simple case where perf is bad, optimization seems worth it.
I recommend checking out https://github.com/roc-lang/roc/blob/main/src/PROFILING/README.md
My understanding was that the interpreter would eventually become fast enough to become the de-facto dev workflow. Other languages have shown that interpreters can be fast enough for development workflows. Also, another advantage is that it is much more amenable to a debugger.
I recall the main benefit of dev backends was to shorten the startup time, albeit with zero optimization. I believe the interpreter should start up even faster and can eventually be JIT-compiled for optimization on hot paths.
So why are the dev backends being resurrected? I am not sure... I suppose there is a Claude translation just in case, so that they are not lost when the Rust backend gets yanked.
(i am just an outsider, so take my words with a huge grain of salt)
Related conversation:
I don't think there are any short- or medium-term plans to not have an interpreter.
Longer term the benefits of reviving the dev backends is that we can simplify a lot of things at runtime I think.
Before we started the rewrite, the hypothesis was that an interpreter would be relatively easy to build... in practice we have learnt that there is a lot of type wrangling that has to happen at runtime to support polymorphism. So instead of using the same Mono build pipeline at compile time, we have just as much work to do at runtime.
In a compiled world with dev backends, if the interpreter didn't exist, we could eliminate all of this additional complexity and just have a single pipeline.
Fabian Schmalzried said:
The interpreter is still quite slow. Is it worth it to optimize it, or will it be replaced by a dev backend soon anyway?
One concrete slowdown I see is that the compiler gets significantly slower with loops over 2000 iterations.
app [main!] { pf: platform "https://github.com/lukewilliamboswell/roc-platform-template-zig/releases/download/0.6/2BfGn4M9uWJNhDVeMghGeXNVDFijMfPsmmVeo6M4QjKX.tar.zst" }
main! = |_| {
_foo = 0.to(<n>)
Ok({})
}
depending on n, I have the following time output (Windows x86 machine):
| n | time (s) |
|---|---|
| 500 | 0.283 |
| 1000 | 0.323 |
| 2000 | 0.47 |
| 3000 | 1.282 |
| 4000 | 2.331 |
| 8000 | 9.773 |
I did not investigate. Is it worth doing so? And performance of the interpreter in general?
I definitely think it's worth investigating any performance issues. It shouldn't be really bad or anything. This could just be some obvious bug in our implementation.
We had something similar recently where Brendan sped us up like 6000x with a single line change or something like that.
The focus so far has been on correctness, so there is a lot of low hanging fruit in the performance department.
Alright, thanks for all the input. I will try to see what I can do to improve the performance.
So one culprit for the slowdown is types.snapshot() in unifyWithConf(..), which is used to restore state when unification fails: https://github.com/roc-lang/roc/blob/eb54be9749da1966bcf0ce01a87b3f96a3919e87/src/check/unify.zig#L183
Removing this turns the ~9.7s to ~1.2s.
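For intuition, here is a minimal Python model of that pattern (purely illustrative, not Roc's actual code): a union-find style type store where every unify call copies the whole store up front so it can roll back on a mismatch. With N unify calls, the copies alone cost O(N²), which would match the superlinear timings above.

```python
class TypeStore:
    """Toy union-find type store with the "optimistic snapshot" pattern."""

    def __init__(self):
        self.parent = []  # union-find parent links, mutated in place

    def fresh_var(self):
        self.parent.append(len(self.parent))
        return len(self.parent) - 1

    def find(self, v):
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def unify(self, a, b, compatible=lambda x, y: True):
        snapshot = list(self.parent)  # full copy, O(store size), every call
        ra, rb = self.find(a), self.find(b)
        if not compatible(ra, rb):
            self.parent = snapshot    # restore on failure
            return False
        self.parent[ra] = rb          # in-place merge on success
        return True

store = TypeStore()
a, b = store.fresh_var(), store.fresh_var()
assert store.unify(a, b)
assert store.find(a) == store.find(b)
```

The snapshot is pure overhead on the success path, which is why dropping it gives such a large speedup here.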
Of course, just removing this does break something; at least the tests are now broken.
Does anyone have insights on how to handle this? I see these options:
Helpful commit message here: https://github.com/roc-lang/roc/commit/8b97a2d03e1c485507d599a18262f75c11f074d8
What do you think @Richard Feldman?
I'm leaning towards "1. Figure out why this simple code is unifying so much and reduce that (probably worth it anyway)"
agree, in this code sample the value of n should not increase the number of times the type checker unifies.
also note, there are compile time types (in Check) and runtime types (in the interpreter), so it’s also worth figuring out which part of the codebase the unifies are coming from
Anton said:
Helpful commit message here: https://github.com/roc-lang/roc/commit/8b97a2d03e1c485507d599a18262f75c11f074d8
reading through that commit message, the optimistic snapshot seems wasteful if the unification does not error.
i wonder if there’s some other way to solve this, like changing the unification order so we could avoid the “inner unify success pollutes outer error message” thing
I think this strategy goes all the way back to elm's type checker...the only idea I have for how to improve it is basically a "double or nothing" - we optimistically assume there will be no type mismatches in the entire module, and then if we're wrong, we start over and redo type checking for the entire module with snapshotting
the problem is just that unification has to use in-place mutation for performance, and it just unavoidably loses information
i’ll need to double check, but i think in elm, because it uses continuation-passing style, type descriptors are not updated until after all recursive unification has succeeded (or not). maybe we could capture “pending” merges during unification, then only actually update the types store after we know whether it was a success or failure
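That "pending merges" idea could also be done as a trail-based undo, as in this hypothetical Python model (illustrative names, not Roc's code): record each in-place merge on a trail and only walk the trail backwards if unification ultimately fails, so the success path does no copying at all.

```python
class TrailStore:
    """Toy union-find store with trail-based rollback instead of snapshots."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.trail = []  # (var, old_parent) undo records

    def find(self, v):
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def merge(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.trail.append((ra, self.parent[ra]))  # remember old link
            self.parent[ra] = rb                      # in-place merge

    def mark(self):
        return len(self.trail)

    def undo_to(self, mark):
        # failure path: pop undo records instead of restoring a full copy
        while len(self.trail) > mark:
            var, old = self.trail.pop()
            self.parent[var] = old

store = TrailStore(3)
m = store.mark()
store.merge(0, 1)
assert store.find(0) == store.find(1)
store.undo_to(m)                # cheap rollback, proportional to work done
assert store.find(0) != store.find(1)
```

This is the classic trail technique from Prolog implementations: rollback cost is proportional to the merges performed since the mark, not to the size of the store.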
but, that aside, it would surprise me if the value of an argument caused the type checking phase to call unify many more times, so i suspect it's a unify call in the interpreter
Jared Ramirez said:
but, that aside, it would surprise me if the value of an argument caused the type checking phase to call unify many more times, so i suspect it's a unify call in the interpreter
Yes, here: https://github.com/roc-lang/roc/blob/eb54be9749da1966bcf0ce01a87b3f96a3919e87/src/eval/interpreter.zig#L17716
I think this runs for every .append in the range_to builtin.
Any idea how to best improve this?
I also don't quite understand why a copy of the arg is necessary at this point.
Can the unification in the interpreter fail? Type checking should prevent this, right? So could we have a unification function just for the interpreter that does not create this snapshot?
Would the unification in the interpreter still be required when we have the "monomorphized roc code"?
Don't we fully type check before the interpreter? Can we remove all type checking from the interpreter and just assert types?
I don't think we need mono. Anything with static dispatch just needs a virtual table. They literally could store an index to their module and anything with dispatch would just grab that module and run, no type checking
I obviously don't know the pipeline. But something of this nature should be doable and huge for perf.
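As a rough illustration of that vtable idea (hypothetical names throughout; this is not how Roc's pipeline actually works): each value carries an index into a dispatch table decided at check time, so the interpreter resolves static dispatch with a plain lookup and no unification at runtime.

```python
# Hypothetical per-type "modules": method tables built during checking.
INT_MODULE = {"to_str": lambda x: str(x)}
LIST_MODULE = {"to_str": lambda xs: "[" + ", ".join(str(x) for x in xs) + "]"}

# The module index for each value is fixed at check time.
MODULE_TABLE = [INT_MODULE, LIST_MODULE]

def dispatch(module_idx, method, value):
    # Runtime does no type checking: just grab the module and run.
    return MODULE_TABLE[module_idx][method](value)

assert dispatch(0, "to_str", 42) == "42"
assert dispatch(1, "to_str", [1, 2]) == "[1, 2]"
```

The open question raised below is whether layout computation still forces type information (and hence unification) to exist at runtime even with such a table.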
Fabian Schmalzried said:
Can the unification in the interpreter fail? Type checking should prevent this, right? So we can just have a unification function just for the interpreter that does not create this snapshot?
that's true, it should never fail in the interpreter!
Brendan Hansknecht said:
I don't think we need mono. Anything with static dispatch just needs a virtual table. They literally could store an index to their module and anything with dispatch would just grab that module and run, no type checking
If I remember right, I think the reason we have to unify in the interpreter is because of layout:
#8967: for the example above it's now >10x faster
Found something else that I don't know how to fix:
the refcount of the $answer list in range_to is 2 for all the appends, therefore the list is not mutated in place. I think it should be just 1? Does anyone have hints on where to look for this unnecessary refcount?
Ah. Layouts require it. That makes sense. Maybe the interpreter really does need to be after mono then. Not sure.
For range_to, it sounds like we get confused by the mutable variable
Because it is referenced after the loop as well. So that is technically an extra refcount
But that is only true for non-mutable variables. This mutable variable is actually changing in the loop
So it sounds like we need special refcount exceptions for mutable variables.
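The refcount interaction described above can be sketched roughly like this (an illustrative Python model, not the interpreter's actual representation): an append may reuse the buffer only when the list is unique (refcount 1); if the mutable loop variable is also counted as "referenced after the loop", the refcount sits at 2 and every append pays for a full copy.

```python
class RcList:
    """Toy refcounted list with copy-on-write appends."""

    def __init__(self, items, rc=1):
        self.items = items
        self.rc = rc
        self.copies = 0  # how many O(n) copies this value has caused

    def append(self, x):
        if self.rc == 1:
            self.items.append(x)          # unique: mutate in place, O(1)
            return self
        fresh = RcList(self.items + [x])  # shared: copy-on-write, O(n)
        fresh.copies = self.copies + 1
        return fresh

# Unique list: the whole loop mutates in place, zero copies.
unique = RcList([])
for i in range(3):
    unique = unique.append(i)
assert unique.copies == 0

# Extra refcount (as described for $answer): every append copies.
shared = RcList([1, 2], rc=2)
result = shared.append(3)
assert result is not shared   # got a copy instead of in-place mutation
assert shared.items == [1, 2]
```

This is why an incorrectly-high refcount on a mutable loop variable turns an O(n) loop of appends into O(n²) work.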
Last updated: Jan 12 2026 at 12:19 UTC