it occurred to me that our interpreter can be faster than a lot of interpreters because it doesn't need to do runtime type checking
because we'll already have type-checked it, so we know that if there are any type mismatches, we've already dealt with them at compile time
Yes, though we will have to still dispatch on type for at least refcounting
oh sure, and also for static dispatch
Yeah, and to call zig builtins
today, the way we (very often incorrectly) deal with type mismatches in the "I want to run my program anyway even though I know there are type mismatches" scenario is that during the building process, whenever we encounter a canonical IR node and go ask it what its type is, we then notice when its type is "type mismatch" and emit an IR node that's supposed to crash at runtime
Well, not dispatch on type, but fill out type specific information for the call
in the case where we're not doing a full build, and are instead running an interpreter right after type checking finishes, we aren't doing any more compile-time passes over the canonical IR, so we don't have the same opportunity to detect those type mismatches at compile time
one way we could deal with those is to, at runtime, always check every single canonical IR node's type to see if it happens to be a type mismatch, and then crash if so
but at that point we're potentially worse off than languages that check types at runtime :sweat_smile:
I think there's a much simpler solution
every time the type checker hits a type mismatch, in addition to pushing a Problem onto a list (to be reported later), it also pushes a "hey this Variable was a type mismatch" onto a list too
then after type-checking is done, if we're going to be interpreting, we have one very quick pass where we go through the list of recorded type mismatches and replace all the canonical IR nodes corresponding to those Variables with crash nodes
if the list is empty and there are no type mismatches, this is free
if there are type mismatches, then we only pay for however many we actually need to fixup
and then at runtime we don't need to check for any type mismatches at all!
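A minimal sketch of that record-then-fixup idea, written in Python rather than the compiler's own implementation language. Every name here (`Node`, `TypeChecker`, `report_mismatch`, `fixup_mismatches`, `var_to_node`, `CrashError`, `interpret`) is hypothetical and only illustrates the shape of the pass, not Roc's real data structures:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a canonical IR node (hypothetical shape)."""
    kind: str              # e.g. "int", "add", "crash"
    value: object = None   # literal payload or crash message
    children: list = field(default_factory=list)  # child node IDs


@dataclass
class TypeChecker:
    problems: list = field(default_factory=list)         # reported to the user later
    mismatched_vars: list = field(default_factory=list)  # consumed by the fixup pass

    def report_mismatch(self, var: int, message: str) -> None:
        # Record the Problem for reporting AND remember which Variable it was.
        self.problems.append(message)
        self.mismatched_vars.append(var)


def fixup_mismatches(nodes: list, mismatched_vars: list, var_to_node) -> int:
    """Replace the node behind each mismatched Variable with a crash node.

    Work is proportional to the number of mismatches; an empty list does nothing.
    Returns how many nodes were replaced.
    """
    for var in mismatched_vars:
        nodes[var_to_node(var)] = Node("crash", "type mismatch")
    return len(mismatched_vars)


class CrashError(Exception):
    pass


def interpret(nodes: list, node_id: int):
    """Evaluate a node. There is no per-node type-mismatch check in here."""
    node = nodes[node_id]
    if node.kind == "int":
        return node.value
    if node.kind == "add":
        left, right = node.children
        return interpret(nodes, left) + interpret(nodes, right)
    if node.kind == "crash":
        raise CrashError(node.value)
    raise ValueError(f"unknown node kind: {node.kind}")
```

The interpreter only fails when evaluation actually reaches a replaced node, so a program whose mismatches sit on untaken paths still runs.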
also, we can make the "get the canonical IR node from the Variable" lookup trivial if we assign variables to be the same number as nodes by default, except for nodes that have multiple variables (in which case we can hand out those extra Variables starting from -1 and decrementing, whereas the canonical IR node IDs would all be 0 or positive)
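A rough Python sketch of that numbering scheme. `VarAllocator` and its methods are hypothetical names, and the side table that maps negative Variables back to their owning node is an assumption; the message above does not say how those extra Variables would be resolved:

```python
class VarAllocator:
    """Hands out Variable IDs (hypothetical sketch of the numbering idea)."""

    def __init__(self) -> None:
        self._next_extra = -1
        self._extra_owner = {}  # negative Variable -> owning node ID (assumed)

    def primary_var(self, node_id: int) -> int:
        # A node's default Variable shares the node's own non-negative ID.
        return node_id

    def extra_var(self, node_id: int) -> int:
        # Additional Variables for the same node count down from -1.
        var = self._next_extra
        self._next_extra -= 1
        self._extra_owner[var] = node_id
        return var

    def node_for(self, var: int) -> int:
        # Non-negative: the Variable is the node ID, so no table lookup is needed.
        if var >= 0:
            return var
        return self._extra_owner[var]
```

The common case (one Variable per node) costs a sign check; only the extra Variables pay for a lookup.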
Richard Feldman said:
are instead running an interpreter right after type checking finishes, we aren't doing any more compile-time passes over the canonical IR, so we don't have the same opportunity to detect those type mismatches at compile time
Can you explain this more? In the current pipeline does creating these nodes happen later in the pipeline like after specialization or something?
yeah exactly
I think it's during specialization actually
because we go and look up the type associated with the node to see what its layout is
and then if the type is a mismatch, instead of getting a layout we replace it with a crash (in theory, although in practice we mess this up often)
I don't think this strategy could be done with other forms of dispatch though
In the interpreter, we would instead do it when calling a function (cause we know the concrete types for that call at that point)?
it relies on the fact that once we hit a type mismatch in solving, we know it's game over
I'm proposing that we do one more pass before we run the interpreter
after type checking has completed
and in that pass we go fixup some canonical IR nodes
to replace them with crash nodes
so the interpreter just encounters them and says "oh, a crash, I guess I will crash"
How does that work for a function that is called with 4 different specializations?
Like each specialization might or might not have the crash
they'll all have it for sure
type mismatches apply before specializations
it's not possible in Roc to have a type mismatch for one specialization but not others
like it would happen at the call site
either that call site is busted or it isn't
Oh, then sounds like in general type checking can just change the node to a crash?
yeah exactly
Yeah, sounds reasonable to fold into type checking or a pass right after.
yeah I think we'd want to do it right after, because if we're only running roc check or editor analysis, fixing up the canonical IR is a waste of time
What about static dispatch?
That would depend on the specific arg type for whether or not it is a crash.
static dispatch always resolves to the same type
one way to think of it is that
if you have a language like Elm or OCaml or Haskell
they don't monomorphize, there is no specialization pass
but of course they still have type mismatches
we have specialization, but there's no such thing as a "specialization error"
like we don't report compile-time problems in that pass because all the userspace errors occur before that
(I guess we can still have bugs in our compiler implementation though haha)
so Elm, Haskell, OCaml, and Roc would all implement this feature in the same way
If we have this function
fn: x -> U64 where x.len() -> U64
At one call site, it might need to crash due to the type being passed in not having .len()
and at another it might run just fine, right?
yes, but we'd know that after type-checking
we wouldn't need to specialize
maybe I should clarify that when I say specialize I'm talking about monomorphization
But we now have one call site that runs the crash and another call site that doesn't run the crash.
there's also function application, which is arguably a form of specialization, but that's not really the term I hear used for that :big_smile:
totally!
but we have all the info necessary to do that after type-checking
we don't need to perform specialization
like Haskell literally has this feature haha
and they have ad-hoc polymorphism (typeclasses, not static dispatch, but they both look up implementations based on their type)
https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/defer_type_errors.html
Interesting. I trust you, but I definitely don't understand.
fair enough, I don't think I'm doing a good job explaining it :sweat_smile:
Ayaz could probably explain it better
Oh, I guess you could always generate a .len() that just calls crash for the type that is missing .len()
well we'd just crash at the call site
like replace the call itself
not the function
so when you get to the call, instead of calling anything it just crashes
main! = ||
    x = []
    y = "string has no len()"
    tmp = fn(x, Bool.true) + fn(y, Bool.false)
    z = fn(y, Bool.true)

fn : a, Bool -> U64 where a.len() -> U64
fn = |a, b|
    if b then
        a.len()
    else
        0
In the above, the first two calls to fn should pass with no issues. Only on the third call to fn (line with z = ...), it should crash. In this case, the crash should be generated from the line a.len() due to the type of a being a Str.
We could/should crash on the second call site because its type is malformed. The type signature doesn’t match the args provided.
Yeah, I guess that would be valid.
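A small Python sketch of the per-call-site reading agreed on in the last two messages, applied to the example program. The `METHODS` table, `satisfies_where_clause`, and `lower_call_sites` are hypothetical, and treating Str as lacking len() is taken from the example, not from Roc's actual builtins:

```python
# Hypothetical method table; per the example, Str is assumed to lack len().
METHODS = {"List": {"len"}, "Str": set()}

# The three call sites of fn from the example: (label, argument type, Bool flag).
CALL_SITES = [
    ("fn(x, Bool.true)", "List", True),
    ("fn(y, Bool.false)", "Str", False),
    ("fn(y, Bool.true)", "Str", True),
]


def satisfies_where_clause(arg_type: str) -> bool:
    # fn's signature requires `a.len() -> U64`.
    return "len" in METHODS[arg_type]


def lower_call_sites(call_sites):
    """Decide per call site, after type checking, whether to emit a call or a crash.

    The body of fn is never touched; only mismatched call sites are replaced.
    The Bool flag is ignored because the decision is made from types alone.
    """
    return [
        (label, "call" if satisfies_where_clause(arg_type) else "crash")
        for label, arg_type, _flag in call_sites
    ]
```

Under this reading both calls that pass y become crashes, including the one with Bool.false that would never reach a.len() at runtime, and no specialization of fn is needed to decide that.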
Last updated: Jul 06 2025 at 12:14 UTC