Stream: compiler development

Topic: Static dispatch typing / unify implementation


view this post on Zulip Joshua Warner (Aug 22 2025 at 15:55):

What's the strategy for implementing static dispatch in unify? How would this example be handled?

|x, y| {
  joined = x.map2(y, Pair)
  res = joined.walk([], |acc, el| acc)
  if false { res } else { res.push(0) }
}

It seems that would need to be something like:

a, b -> c
  where a.map2(b, d): e,
   e.walk(...): c,
   c.push(...): c

(not sure I have the syntax exactly right there?)

Prior to static dispatch that method would have resolved as a much more pronounceable List(a), List(b) -> List(c)

Tracking all that in type inference sounds like a headache :sweat_smile:

view this post on Zulip Anton (Aug 22 2025 at 16:07):

Tracking it seems manageable but it does not look nice in the LSP type hover...

view this post on Zulip Joshua Warner (Aug 22 2025 at 16:10):

Err, I guess more specifically - is the intent to implement this via separate constraints? or constrained types? or some kind of row polymorphism types?

view this post on Zulip Anton (Aug 22 2025 at 17:04):

I believe we have not discussed this before @Richard Feldman

view this post on Zulip Joshua Warner (Aug 22 2025 at 18:00):

I ask because I've been tinkering with a type inference system where I tried to do this via row polymorphism... and this is one example where that strategy runs into issues. I think doing it properly that way would require tracking where types are _used_ (i.e. back-links), so that effectively the constraints can be reversed back out of the types. Furthermore, when checking if this function can be applied in a given scenario, we need to follow that whole type graph, making unification much more complicated. So.. probably one of the first two approaches would be best. That, or significant restrictions need to be placed on how much inference we do in this case.

view this post on Zulip Joshua Warner (Aug 22 2025 at 18:20):

Which is to say, I think some form of global constraint tracking is necessary here

view this post on Zulip Joshua Warner (Aug 22 2025 at 18:21):

I have a very vague memory of someone here discussing row-polymorphism for implementing static dispatch, but I can't find it, and I could very well be imagining it.

view this post on Zulip Richard Feldman (Aug 22 2025 at 19:40):

nah it's basically the same as open records or anonymous tag unions

view this post on Zulip Richard Feldman (Aug 22 2025 at 19:40):

I can write it out in more detail tonight

view this post on Zulip Joshua Warner (Aug 22 2025 at 19:42):

Very interested in how it handles that example!

view this post on Zulip Richard Feldman (Aug 23 2025 at 02:53):

yeah so in that example, it would indeed be something similar - so if you put this into the repl:

|x, y| {
  joined = x.map2(y, Pair)
  res = joined.walk([], |acc, el| acc)
  if false { res } else { res.push(0) }
}

view this post on Zulip Richard Feldman (Aug 23 2025 at 02:57):

I believe the inferred type would be:

x, y -> res
    where [
        module(x).map2 : x, y, [Pair, ..] -> joined,
        module(joined).walk : List(elem_), (acc, el_ -> acc) -> res,
        module(res).push(num) -> res,
        module(num).from_digits : Iter(U8) -> Result(num, BadNumLiteral(Str)),
    ]

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:01):

of course, in practice people will normally annotate top-level functions, at which point the type checker would just verify that the body of the function matches the annotation like normal

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:01):

so the only time you'd see big inferred types like that would be if you weren't annotating top-level functions at all, which in turn would likely only be when writing quick scripts anyway

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:02):

as far as how this works across modules, yeah it's row types similar to records or anonymous tag unions

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:03):

so let's say I write this function (no annotation) and just expose it from my module

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:03):

let's say the function's name is join_walk_push

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:04):

another module could call it like so:

|arg1, arg2| FirstModule.join_walk_push(arg1, arg2)

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:04):

in that case, the inferred type of that function would of course be the same, because all it's doing is calling a function with that type and returning its answer

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:05):

similarly, if the other module were to expose that function, and another module were to call it, that function's type would still be unchanged

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:08):

however, if at any point the function gets called with a narrower type, e.g.

|arg1, arg2| FirstModule.join_walk_push([].concat(arg1), arg2)

...then the type of that function (but not all the others) would be instantiated (at the call site) and unified to something narrower, in this case (because of the [] being passed for the first arg):

List(elem), List(elem) -> List(num)
    where [
        module(num).from_digits : Iter(U8) -> Result(num, BadNumLiteral(Str)),
    ]

(assuming List.push exists; otherwise it's a type mismatch - these end up needing to be Lists because of x.map2(y, ...) when x is a List means this map2 will be List.map2 etc.)

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:09):

and again, you could export that function from the module, and other modules could import it, and then either re-export it by just calling it and returning the answer (in which case this narrower type would be unchanged), or else narrowing it even further

view this post on Zulip Joshua Warner (Aug 23 2025 at 03:15):

Ahh ok, so this is a bit more sorted out in my head now

view this post on Zulip Joshua Warner (Aug 23 2025 at 03:17):

The issue I was running into is, in the row-polymorphism approach, you end up with this type information hanging off of the inferred argument types of the function. Which would be fine on its own - except that, for reasons, I want/need to understand what properties the return value of that function could possibly have - and that would require traversing all the type references _backwards_.

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:18):

so the key here is that in order to narrow a type variable in a module (e.g. the x in module(x)), you have to actually pas a specific nominal type

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:18):

and as soon as you've passed a specific nominal type, we know what module it came from

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:19):

and as soon as we know the module, it's trivial to unify moudle(x).foo : type goes here with the type of that module's exposed foo function (or give an error if that module doesn't expose a foo)

view this post on Zulip Joshua Warner (Aug 23 2025 at 03:20):

Yep yep

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:20):

so basically you can add constraints by calling more methods, and they just accumulate in the inferred type as more module(a).whatevers

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:20):

and then you can reduce constraints by providing actual nominal types

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:20):

and eventually you end up providing nominal types for everything (or variables stay unbound, e.g. empty list, which is also fine)

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:21):

at which point we know all the functions that need to be called, and it all works out! :smiley:

view this post on Zulip Joshua Warner (Aug 23 2025 at 03:22):

Random observation - when you use static dispatch and let the compiler infer the type, the inferred type ends up pretty closely reflecting the shape of the code in the function, but obviously omitting some details. I wonder if that could be useful somehow... :thinking:

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:34):

that's interesting!

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:34):

I kinda suspect it would be like..."almost but not quite" something we could rely on in the compiler

view this post on Zulip Richard Feldman (Aug 23 2025 at 03:35):

like there'd be just enough differences that we couldn't just say "the type is [just go look at this canonical IR node]"

view this post on Zulip Joshua Warner (Aug 23 2025 at 03:40):

Right, it "looks thru" inlining, for example

view this post on Zulip Joshua Warner (Aug 23 2025 at 03:41):

I was more thinking of either from the _user_ side, or perhaps the _possible language extension_ side.

view this post on Zulip Joshua Warner (Aug 23 2025 at 03:44):

The analogy that was popping into my head was how some ML frameworks (e.g. pytorch) "compile" functions via bytecode inspection. And vaguely similarly, you can do interesting things with passing Proxy objects to javascript functions and just always return new proxy objects from them - you end up essentially getting a trace of the execution, minus branching and constant evaluation, etc.


Last updated: Sep 09 2025 at 12:16 UTC