Stream: compiler development

Topic: Alias analysis error across modules


view this post on Zulip Folkert de Vries (Apr 29 2024 at 19:50):

We are looking at this example (from https://github.com/roc-lang/roc/issues/5701#issuecomment-2016888604) of an alias analysis error.

# Main.roc
interface Main
    exposes []
    imports [Helper]

expect
    _ = Helper.f (loop {})
    1 == 1

loop = \{} ->
    Helper.foobar \_ ->
        if Bool.true then
            loop {}
        else
            \_ -> {}
# Helper.roc
interface Helper
    exposes [f, foobar]
    imports []

f : ({} -> {}) -> {}
f = \_ -> {}

foobar : ({} -> ({} -> {})) -> ({} -> {})
foobar = \fn -> \_ -> (fn {}) {}

run

roc test Main.roc

What happens here is that Main.roc will call foobar with a particular LayoutId (really a ([LayoutId, Niche, LayoutId)), say foobar#32. It will also ask Helper to generate an implementation of foobar for the particular type at the call site. Helper will do so, but gives it another name, e.g. foobar#33. This wil now fail with an alias analysis error because loop calls a function that does not exist.

So, in effect, the same type gets two different layout ids. That causes problems down the line. We need to somehow guarantee that both parent and child module use the same LayoutId at both the definition and call site.

When we put all definitions in the same module, it works. So something fails to translate between modules.

@Ayaz Hafiz any idea for how how to track this down? We've tried printing a bunch of things but it's not really telling us much besides confirming the above diagnosis

cc @Richard Feldman

view this post on Zulip Ayaz Hafiz (Apr 29 2024 at 21:51):

yeah, i don't really know how we solve this problem in general

view this post on Zulip Ayaz Hafiz (Apr 29 2024 at 21:52):

i think i discussed this before, let me see if i can find earlier notes

view this post on Zulip Ayaz Hafiz (Apr 29 2024 at 21:55):

this is one reason this issue can happen: https://github.com/roc-lang/roc/issues/5464#issuecomment-1631583439. As i mention there i have no idea how to fix it

view this post on Zulip Ayaz Hafiz (Apr 29 2024 at 22:00):

I fixed this a while ago for some structural types via "fixpoint fixing", which during unifications finds if there are types that are isomorphic but whose fixpoint is not equivalent (e.g. https://roc.zulipchat.com/#narrow/stream/231635-compiler-development-.28private.29/topic/.E2.9C.94.20alias.20references.20to.20mutually.20recursive.20types/near/300843640). It's possible that if the type is cloned across modules, its appearance in the new module causes it to get fixpoint-fixed again into a form that produces a different layout (i.e. different fixpoint) in the new module.

view this post on Zulip Ayaz Hafiz (Apr 29 2024 at 22:06):

this is truly the curse of structural types. Maybe we need some way to "lock" the structure of a type once it gets exported?

view this post on Zulip Ayaz Hafiz (Apr 29 2024 at 22:06):

I'm not sure

view this post on Zulip Richard Feldman (Apr 30 2024 at 02:42):

if performance were no concern, is there any sort of "brute force" solution that would be comically inefficient but get the right answer?

view this post on Zulip Ayaz Hafiz (Apr 30 2024 at 03:06):

"locking" as mentioned above would solve it

view this post on Zulip Ayaz Hafiz (Apr 30 2024 at 03:06):

i.e. don't allow the type to change after it's exported, all other types must unify to it exactly

view this post on Zulip Richard Feldman (Apr 30 2024 at 10:27):

but that changes semantics right? Like some programs wouldn't compile anymore

view this post on Zulip Richard Feldman (Apr 30 2024 at 10:28):

or rather, they wouldn't type-check anymore, when today they would

view this post on Zulip Ayaz Hafiz (Apr 30 2024 at 12:50):

why wouldn’t they?

view this post on Zulip Richard Feldman (Apr 30 2024 at 13:14):

I might be misunderstanding the implications of this: :sweat_smile:

Ayaz Hafiz said:

i.e. don't allow the type to change after it's exported, all other types must unify to it exactly

it sounds like "all other types must unify to it exactly" means that there would be some scenario(s) where a type gets less polymorphic

view this post on Zulip Richard Feldman (Apr 30 2024 at 13:15):

like today it would be allowed to unify into a different type which later successfully unifies with other types, but instead it would refuse to unify into that other type, and therefore might have a type mismatch with other types later

view this post on Zulip Richard Feldman (Apr 30 2024 at 13:15):

but maybe I'm misunderstanding the implications there!

view this post on Zulip Ayaz Hafiz (May 01 2024 at 04:24):

is there a case where a type gets imported from a module and doesn't become less, or equivalently, polymorphic?

view this post on Zulip Ayaz Hafiz (May 01 2024 at 04:24):

i guess the crux of my point is that once you find a location of a recursionpointer in a type, you must decide whether that recursionpointer's location is allowed to change in the future or not.

view this post on Zulip Ayaz Hafiz (May 01 2024 at 04:25):

if the type is unique in the sense that nothing else has unified with it, it's fine for it to change.

view this post on Zulip Ayaz Hafiz (May 01 2024 at 04:25):

but if it's not unique (e.g. exported to another module), it cannot change without running into the bug here.

view this post on Zulip Ayaz Hafiz (May 01 2024 at 04:25):

so finding some way to encode/enforce that would be the fix.

view this post on Zulip Richard Feldman (May 12 2024 at 16:28):

I'd love to try this out, and I'll actually have a week off after I get back from Milan, but I have no idea where to start with the implementation :laughing:

view this post on Zulip Richard Feldman (May 12 2024 at 16:29):

any suggestions for how to go about doing this?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 16:39):

yeah, so it's probably most useful to start here with the definitions of recursive types:
https://github.com/roc-lang/roc/blob/e500d664fdd5b79b004ec5be4614424557b67425/crates/compiler/types/src/subs.rs#L2362-L2366

and how fixpoint fixing works

https://github.com/roc-lang/roc/blob/e500d664fdd5b79b004ec5be4614424557b67425/crates/compiler/unify/src/fix.rs#L14

view this post on Zulip Ayaz Hafiz (May 12 2024 at 16:40):

im realizing though this may be hard more difficult than it might seem though

view this post on Zulip Ayaz Hafiz (May 12 2024 at 16:41):

if there are two recursive types that are isomorphic but not equal, imported from two different modules into the same module M, then when we compare them in M, it may be impossible to reconcile them into the same type

view this post on Zulip Ayaz Hafiz (May 12 2024 at 16:43):

yeah okay i don't know how to deal with that without breaking module isolation for typechecking, lol

view this post on Zulip Ayaz Hafiz (May 12 2024 at 16:44):

or compiling all recursive types as boxed at every level

view this post on Zulip Ayaz Hafiz (May 12 2024 at 16:50):

to put the problem concretely: suppose we have modules A and B both importing the definitions

F : [FromG G]
G : [G {lst : List F}]

# the unfurled representation of these types is
# F = [FromG [G {lst: List <1>}] as <1>
# G = [G {lst: List [FromG <2>]}] as <2>
# where <1>, <2> are recursion pointers

both expose val where

# module A
val : F
val = FromG (G {lst: []})

# module B
val : G
val = G {lst: []}

now you import both in module C and do a comparison

A.val == (FromG B.val)

In C we are comparing

A.val : [FromG [G {lst: List <1>}] as <1>
to
(FromG B.val) :  [FromG [G {lst: List [FromG <2>]}] as <2>]

which typechecks because these types are clearly isomorphic. But their layouts are not the same. And we cannot change the type of either A.val or B.val to be equal to the other's type without also changing the types in module A and B, respectively.

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:12):

ugh. if we break module isolation, incremental compilation will become very difficult in the future.

view this post on Zulip Folkert de Vries (May 12 2024 at 17:13):

can we not make F and G structurally the same in the defining module?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:14):

we could, but suppose it was instead

F : [FromG G_]
G_ : [G {lst : List F}]

F_ : [FromG G]
G : [G {lst : List F_}]

now F and G are not related in the defining module but [FromG G] is still isomorphic to F

view this post on Zulip Folkert de Vries (May 12 2024 at 17:17):

hmm yes

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:18):

I'll run this by some ocaml people

view this post on Zulip Folkert de Vries (May 12 2024 at 17:18):

also even in the earlier example F and G could be defined in distinct modules (with slight modifications), right?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:18):

yeah

view this post on Zulip Folkert de Vries (May 12 2024 at 17:18):

because this is all about structural equality

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:18):

yeah, exactly

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:19):

the curse of structural recursion with unboxed compilation. maybe this is why everyone uses nominal types

view this post on Zulip Richard Feldman (May 12 2024 at 17:24):

Ayaz Hafiz said:

ugh. if we break module isolation, incremental compilation will become very difficult in the future.

is that necessarily true if this is the only way we break it?

view this post on Zulip Richard Feldman (May 12 2024 at 17:24):

for example, if what we cache is "these are the resolved type annotations, but they can be modified further later"

view this post on Zulip Richard Feldman (May 12 2024 at 17:24):

I guess we potentially have to load a lot more from disk if we need all the subs

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:24):

i think the dependency graph becomes a mess, because it is no longer is DAG

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:25):

If you had the graph A -> C <- B, where C depends on both A and B as in the example above, a change in A now can force a change in B

view this post on Zulip Richard Feldman (May 12 2024 at 17:25):

hm true

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:26):

ok crazy idea. recursive types must be nominal

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:27):

this is potentially a can of worms, but probably worth considering

view this post on Zulip Richard Feldman (May 12 2024 at 17:27):

:thinking: what about "exposed types must be annotated" instead?

view this post on Zulip Richard Feldman (May 12 2024 at 17:27):

like to expose them from a module

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:28):

we still have the same problem though right? For example if val is annotated as F and G accordingly in both modules A and B in the example above.

view this post on Zulip Richard Feldman (May 12 2024 at 17:29):

yeah true

view this post on Zulip Richard Feldman (May 12 2024 at 17:29):

so in other words they can only be recursive if they're opaque

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:30):

yes. the idea being that the name of the opaque type forces one representation of the type, and hence one layout

view this post on Zulip Richard Feldman (May 12 2024 at 17:30):

interesting!

view this post on Zulip Richard Feldman (May 12 2024 at 17:30):

can we think of cases where this would cause problems?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:33):

yeah, there's certainly some issues to figure out with that approach though. A couple off the top of my head:

Op := # my opaque recursive definition

eq = \@Op ob1, @Op ob2 -> ob1.fieldA == ob2.fieldA && ob1.fieldB == ob2.fieldB

inside this definition, the field accesses are structural and recursive, which we've disallowed. So some language rule to allow this needs to be defined.

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:34):

Actually the second might have an easy solution - if the recursion point is named by an opaque type it's fine.

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:35):

but this needs a design doc

view this post on Zulip Richard Feldman (May 12 2024 at 17:36):

I'm very hype about this direction! :smiley:

view this post on Zulip Richard Feldman (May 12 2024 at 17:43):

it seems like it would come up relatively rarely, and existing cases of "recursion has to happen behind an opaque type" haven't felt like a big burden

view this post on Zulip Richard Feldman (May 12 2024 at 17:43):

more of a surprise than anything

view this post on Zulip Richard Feldman (May 12 2024 at 17:46):

@Ayaz Hafiz are you up for writing a design doc on this?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:46):

yeah i can

view this post on Zulip Richard Feldman (May 12 2024 at 17:48):

awesome, thank you so much!

view this post on Zulip Richard Feldman (May 12 2024 at 17:48):

it would be so huge to have this working

view this post on Zulip Richard Feldman (May 12 2024 at 17:49):

huh, I guess morphic doesn't have this problem because it's a separate pass

view this post on Zulip Richard Feldman (May 12 2024 at 17:50):

maybe that's a relevant difference in correctness between doing it during type checking vs in a separate pass

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:50):

morphic only has nominal types

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:51):

for recursive defs anyway

view this post on Zulip Richard Feldman (May 12 2024 at 17:51):

ha, til!

view this post on Zulip Richard Feldman (May 12 2024 at 17:51):

but a separate pass would also fix the problem, right?

view this post on Zulip Richard Feldman (May 12 2024 at 17:52):

because then "revisiting modules" wouldn't be a problem

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:52):

yes, but it would need a whole-program analysis

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:52):

i.e. you would still need to examine all modules, i think

view this post on Zulip Ayaz Hafiz (May 12 2024 at 17:52):

to find all instances of a type and produce one canonical form

view this post on Zulip Richard Feldman (May 12 2024 at 17:54):

right

view this post on Zulip Richard Feldman (May 12 2024 at 17:55):

yeah not saying it's the right design, just something I hadn't thought of until now

view this post on Zulip Richard Feldman (May 12 2024 at 17:56):

requiring opaque types feels like the better design to proceed with

view this post on Zulip Richard Feldman (May 12 2024 at 19:31):

I can't think of a reason this would be a problem, but if you define the opaque type as := _ does that reintroduce any issues because its (recursive) structure is being completely inferred?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 19:33):

i’m not sure but it probably does introduce complexity

view this post on Zulip Richard Feldman (May 12 2024 at 19:45):

worth thinking though I guess, since that's definitely a thing that can happen today!

view this post on Zulip Ayaz Hafiz (May 12 2024 at 19:54):

huh? writing O := _?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 19:54):

pretty sure that's not allowed

view this post on Zulip Richard Feldman (May 12 2024 at 19:59):

it's allowed today!

view this post on Zulip Richard Feldman (May 12 2024 at 20:00):

and even if it weren't, if it turned out that was a problem, you'd probably have to disallow _ from appearing in there at all, including in any type aliases it references - since type aliases can also be defined with _ today

view this post on Zulip Richard Feldman (May 12 2024 at 20:03):

that said, I'm open to disallowing _ in opaque type and/or type alias declarations if necessary to fix this (since they're allowed today but don't seem to be used significantly in practice)

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:05):

── UNBOUND TYPE VARIABLE ─────────────────────────────────────────────── B.roc ─

The definition of O has an unbound type variable:

3│  O := _
         ^

Tip: Type variables must be bound before the :=. Perhaps you intended
to add a type parameter to this type?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:05):

Am I doing this wrong?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:05):

interface B exposes [] imports []

O := _

f = \@O x -> x + 1

expect
    f (@O 1) == 2

this program doesn't compile for me

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:07):

pretty sure type aliases also can't have _

── UNBOUND TYPE VARIABLE in B.roc ──────────────────────────────────────────────

The definition of O has an unbound type variable:

3│  O : _
        ^

Tip: Type variables must be bound before the :. Perhaps you intended
to add a type parameter to this type?

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:09):

Allowing _ in type aliases would have a few UX problems too

view this post on Zulip Richard Feldman (May 12 2024 at 20:10):

huh, I'm on mobile and the online repl accepted it :sweat_smile:

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:13):

seems like a bug. im pretty sure this isn't allowed. even if it is, i think we should revisit ever supporting this

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:14):

looks like the repl is deferring evaluation. if you use

O := _

f = \@O x -> x + 1

f (@O 1)

you'll see it errors out

view this post on Zulip Richard Feldman (May 12 2024 at 20:20):

ahh gotcha

view this post on Zulip Richard Feldman (May 12 2024 at 20:20):

yeah I guess we should give an error for this at the parsing stage then

view this post on Zulip Richard Feldman (May 12 2024 at 20:20):

for both opaque type and type alias declarations

view this post on Zulip Richard Feldman (May 12 2024 at 20:28):

should ability declarations also disallow it?

view this post on Zulip Richard Feldman (May 12 2024 at 20:33):

I'm gonna add an explicit error for it

view this post on Zulip Ayaz Hafiz (May 12 2024 at 20:43):

yes, let's disallow it in abilities too

view this post on Zulip Sam Mohr (May 12 2024 at 21:00):

I posted a simpler, single file reduction in the issue here.

app [main] {
    pf: platform "https://github.com/roc-lang/basic-cli/releases/download/0.10.0/vNe6s9hWzoTZtFmNkvEICPErI9ptji_ySjicO6CkucY.tar.br",
}

import pf.Stdout
import pf.Task exposing [Task]

First : {}
Second : First {}

main =
    testFunc {}

testFunc : Second -> Task {} [StdoutErr Stdout.Err]
testFunc = \{} ->
    Stdout.line "123"

view this post on Zulip Sam Mohr (May 12 2024 at 21:02):

It looks (without compiler analysis) that we generate a layout for First, think Second has the same layout, but don't pass First's layout to the generated Task

view this post on Zulip Sam Mohr (May 12 2024 at 21:02):

That's just a guess though

view this post on Zulip Sam Mohr (May 12 2024 at 21:03):

Sorry to interrupt discussion, just figured there's more visibility on Zulip

view this post on Zulip Ayaz Hafiz (May 12 2024 at 23:58):

Thanks for the note Sam. Unfortunately I think that issue is different and the bug is that Second : First {} isn't caught as a type error; if I change it to Second : First, the program compiles.

view this post on Zulip Sam Mohr (May 13 2024 at 05:43):

Yes, that's true. Good to know

view this post on Zulip Richard Feldman (May 13 2024 at 14:56):

btw is this related to https://github.com/roc-lang/roc/issues/5464#issuecomment-1631583439 or is that a separate issue?

view this post on Zulip Ayaz Hafiz (May 13 2024 at 14:57):

That's a separate issue

view this post on Zulip Ayaz Hafiz (May 14 2024 at 04:46):

Document for consideration on this issue: https://github.com/roc-lang/rfcs/pull/1. Feedback appreciated from everyone.

view this post on Zulip Eli Dowling (May 14 2024 at 22:25):

Wow... That's super thorough and impressive!
As to your proposed solution of using opaque types. I think it is another example of the fact that opaque types kind of come in two forms "types I wish to hide the implementation of" and "types I need the features of opaque types for (see some of the discussion on Json decoding and enum types)".
I do think the current setup is unergonomic for those second class of types because I generally don't actually want them to be "opaque"

In my own codebase I tested out implementing from and to abilities for opaque types that I wnated to be less opaque. eg: when from myopaqueType is

In summary. My only concern with solution 1 is reduced ergonomics, but I think that is already an issue that needs to be investigated.

view this post on Zulip Ayaz Hafiz (May 14 2024 at 22:32):

Thanks for the feedback Eli!

view this post on Zulip Richard Feldman (May 20 2024 at 19:07):

Eli Dowling said:

I think it is another example of the fact that opaque types kind of come in two forms "types I wish to hide the implementation of" and "types I need the features of opaque types for (see some of the discussion on Json decoding and enum types)".
I do think the current setup is unergonomic for those second class of types because I generally don't actually want them to be "opaque"

I know this has come up before, but personally I really hope we can avoid introducing "nominal but not opaque types" to the language

view this post on Zulip Richard Feldman (May 20 2024 at 19:08):

that said, I just realized that this solution to the problem will create a different (much smaller) problem: I think it's important for multiple reasons that we disallow sending opaque types across the host boundary (in either direction), and this will mean that it would be disallowed to send recursive types to or from the host

view this post on Zulip Richard Feldman (May 20 2024 at 19:09):

obviously it would be ideal to not have that limitation somehow, but like I said, I really would prefer not to introduce the language feature of nominal types that are not opaque :sweat_smile:

view this post on Zulip Folkert de Vries (May 20 2024 at 19:33):

so we tried the solution of making task an opaque type, but that does not (yet) work either. This code

interface Task
    exposes [Task, stdoutLine, Op]
    imports []

Op : [
    StdoutLine Str (Box ({} -> Op)),
    StdinLine (Box (Str -> Op)),
    None,
]

Task ok err := (Result ok err -> Op) -> Op

stdoutLine : Str -> Task {} a
stdoutLine = \line ->
    helper: ((Result {} a) -> Op) -> Op
    helper =
        \fromResult -> StdoutLine line (Box.box \{} -> fromResult (Ok {}))

    @Task helper

gives this error with cargo run -- check examples/platform-switching/rust-platform/Task.roc

── CIRCULAR TYPE in examples/platform-switching/rust-platform/Task.roc ─────────

I'm inferring a weird self-referential type for stdoutLine:

14│>  stdoutLine = \line ->
15│>      helper: ((Result {} a) -> Op) -> Op
16│>      helper =
17│>          # crash "foo"
18│>          \fromResult -> StdoutLine line (Box.box \{} -> fromResult (Ok {}))
19│>
20│>      @Task helper

Here is my best effort at writing down the type. You will see ∞ for
parts of the type that repeat something already printed out
infinitely.

    Str -> Task {} a

@Ayaz Hafiz this is no longer a module crossing problem now, is what we observe here entirely different from the original problem we're trying to solve here?

view this post on Zulip Ayaz Hafiz (May 20 2024 at 19:52):

yes that seems different

view this post on Zulip Ayaz Hafiz (May 20 2024 at 19:53):

try making Op opaque too

view this post on Zulip Folkert de Vries (May 20 2024 at 19:55):

right, we tried that next

Op := [
    StdoutLine ({} -> Op),
    StdinLine ({} -> Op),
    None,
]

Task := ({} -> Op) -> Op

stdoutLine : Task
stdoutLine =
    helper: ({} -> Op) -> Op
    helper =
        \fromResult -> @Op (StdoutLine ((\{} -> fromResult {})))

    @Task helper

view this post on Zulip Folkert de Vries (May 20 2024 at 19:55):

gives

── CIRCULAR TYPE in examples/platform-switching/rust-platform/Task.roc ─────────

I'm inferring a weird self-referential type for stdoutLine:

14│  stdoutLine =
     ^^^^^^^^^^

Here is my best effort at writing down the type. You will see ∞ for
parts of the type that repeat something already printed out
infinitely.

    Task

view this post on Zulip Ayaz Hafiz (May 20 2024 at 19:55):

hm

view this post on Zulip Ayaz Hafiz (May 20 2024 at 19:55):

will take a look in a minute

view this post on Zulip Ayaz Hafiz (May 20 2024 at 20:00):

this is some kind of bug in the occurs checking

view this post on Zulip Ayaz Hafiz (May 20 2024 at 20:00):

if you remove the annotation on stdoutLine it works for me

view this post on Zulip Ayaz Hafiz (May 20 2024 at 20:00):

well, it typechecks at least

view this post on Zulip Ayaz Hafiz (May 20 2024 at 20:09):

interface B exposes [main] imports []

# Task.roc
Task v op := (v -> op) -> op

await : Task a op, (a -> Task b op) -> Task b op
await = \@Task fromResult, next ->
    @Task \continue ->
        fromResult \result ->
            @Task inner = next result
            inner continue

# StdinEffect.roc
OpIn a b : [
    StdinLine (Str -> OpIn a b),
    Done a,
]b

# Stdin.roc
lineIn : Task Str (OpIn * *)
lineIn = @Task \toNext -> StdinLine \s -> (toNext s)

# StdoutEffect.roc
OpOut a b : [
    StdoutLine Str ({} -> OpOut a b),
    Done a,
]b

# Stdout.roc
lineOut : Str -> Task {} (OpOut * *)
lineOut = \s -> @Task \toNext -> StdoutLine s \{} -> (toNext {})

# Platform
# really, we want a syntax like [Done a](OpIn a)(OpOut a) here
Op a : [
    StdinLine (Str -> Op a),
    StdoutLine Str ({} -> Op a),
    Done a,
]

main : Task {} (Op *)
main =
    lineIn
    |> await \s -> lineOut s

view this post on Zulip Ayaz Hafiz (May 20 2024 at 20:11):

interface B exposes [stdoutLine] imports []

Op := [
    StdinLine (Str -> Op),
    StdoutLine Str ({} -> Op),
    Done,
]

Task := ({} -> Op) -> Op

stdoutLine =
    @Task \fromResult -> @Op (StdoutLine "foo" (\{} -> fromResult {}))

view this post on Zulip Ayaz Hafiz (May 21 2024 at 03:03):

Okay, I suggest as a path forward here:
-Remove lambda sets in favor of erasure, at least for the time being
-Remove anonymous recursive types as described above, or at least severely restrict them

view this post on Zulip Ayaz Hafiz (May 21 2024 at 03:06):

I know this has various downsides including runtime performance. But lambda sets can be added later, in a separate pass. This will temporarily push Roc further away from its end goals, but I think the goal of making a fast compile-time and runtime language has put the implementation in a position where it’s difficult to recover from these issues we discover only years later. A functional compiler is more useful than a fast one that has bugs. And, without lambda sets or anonymous recursive types, the implementation should be much more approachable to contributors.

view this post on Zulip Folkert de Vries (May 21 2024 at 11:44):

but, the erased version also does not work with current examples. Do you know what the problem is there?

view this post on Zulip Ayaz Hafiz (May 21 2024 at 13:33):

Almost certainly the problem of recursive types across modules

view this post on Zulip Richard Feldman (May 21 2024 at 16:04):

I was just rereading the doc and I wondered - would the fixpoint fixing work if we delayed it until after typechecking was done? e.g. if we did it during mono, when we’re already revisiting modules

view this post on Zulip Ayaz Hafiz (May 21 2024 at 16:37):

You would still need to break module isolation during type inference

view this post on Zulip Ayaz Hafiz (May 21 2024 at 16:38):

because you need to know "what types, across all modules, are in the same strongly-connected components"

view this post on Zulip Richard Feldman (May 21 2024 at 16:42):

gotcha

view this post on Zulip Ayaz Hafiz (May 21 2024 at 16:43):

I really think the goal here should be to simplify, at the cost of the performance, until this works again, and then layer on optimizations as needed

view this post on Zulip Folkert de Vries (May 21 2024 at 17:04):

@Ayaz Hafiz is breaking that isolation required during type inference, or only in later steps, assuming these steps

  1. type inference
  2. type mono
  3. lambda set inference (using graph SSC)
  4. lambda set mono

where is the isolation breaking actually required? Intuitively it's only at step 3 that we need it, given the restriction on recursive types that was proposed earlier in this thread

view this post on Zulip Ayaz Hafiz (May 21 2024 at 17:11):

yes, if recursive types must be nominal, then

  1. type inference over Can AST. This can happen with module isolation.
  2. type specialization. Let's call this IR MonoTy. All code is combined into one program; there is no longer a module concept in the same way there is no module concept after the Mono IR today.
  3. lambda set inference over MonoTy
  4. lambda set specialization over MonoTy, let's say the new IR is called MonoLam

view this post on Zulip Ayaz Hafiz (May 21 2024 at 17:12):

MonoLam = Mono IR today

view this post on Zulip Folkert de Vries (May 21 2024 at 17:13):

can we not perform step 2 just in subs only (while traversing Can, but not materializing a whole separate AST)?

view this post on Zulip Ayaz Hafiz (May 21 2024 at 17:25):

there is probably a way, but i don't see it quite yet. one complication is abilities. if you do not materialize the call to a specific lambda in the AST you must keep the call information elsewhere. so you would need a lookaside table that associates a type specialization ID + can AST node ID => ability specialization

view this post on Zulip Ayaz Hafiz (May 21 2024 at 17:25):

this could be an optimization. i suspect it will be easier to implement and debug if it's a separate AST.

view this post on Zulip Richard Feldman (May 21 2024 at 21:56):

Ayaz Hafiz said:

I really think the goal here should be to simplify, at the cost of the performance, until this works again, and then layer on optimizations as needed

ok so how would we go about doing this? I remember we'd talked about this before but I don't remember the specific steps it would take :sweat_smile:

view this post on Zulip Richard Feldman (May 21 2024 at 21:57):

something to do with marking closures as erased maybe?

view this post on Zulip Ayaz Hafiz (May 21 2024 at 22:57):

  1. Replace lambda sets with type erased closures. This is already implemented for the most part. The missing pieces are reference counting and abilities.
  2. In parallel, the restriction of recursive types to be nominal can be put in place. Folkert and I discussed how to do this. The RecursiveType constructor needs to be adjusted to hold a symbol naming the recursive type. During can, construct recursive types from opaque type definitions as needed. Two recursive types are equal only if their RecursiveType constructors are equal in symbol name, and the arguments to their opaque type unify.
    I don't have the bandwidth to implement this myself but I'm happy to provide suggestions or mentorship as needed

view this post on Zulip Richard Feldman (May 24 2024 at 14:58):

Ayaz Hafiz said:

  1. Replace lambda sets with type erased closures. This is already implemented for the most part. The missing pieces are reference counting and abilities.

ok cool! So where should we look for implementing type-erased closures?

view this post on Zulip Richard Feldman (May 24 2024 at 15:00):

more specifically, where does the refcounting need to happen? Could we reuse some logic from Box maybe?

view this post on Zulip Richard Feldman (May 24 2024 at 15:01):

also, what needs to happen with regards to abilities? Normally functions don't have any abilities, but I'm assuming there's some nuance I'm missing there :big_smile:

view this post on Zulip Folkert de Vries (May 24 2024 at 17:37):

whe need to generate a function that decrements all captured values I think? and make that accessible such that a dec someClosure calls the corresponding function

view this post on Zulip Folkert de Vries (May 24 2024 at 17:37):

right?

view this post on Zulip Brendan Hansknecht (May 24 2024 at 17:41):

Like with my work for lists, can we make this lazy? Does that make sense here.

As in just refcount the erased closure instead of every value in the closure? I guess it maybe makes less sense here than in lists.

Fundamentally, for wrapped types, I think only refcounting the wrapper is a nice property.

view this post on Zulip Brendan Hansknecht (May 24 2024 at 17:41):

Of course when loading values from the wrapper, you would need to deal with refcounts

view this post on Zulip Folkert de Vries (May 24 2024 at 17:42):

yes sure but the generation work is mostly the same (but only runs when the rc goes from 1 to 0)

view this post on Zulip Brendan Hansknecht (May 24 2024 at 17:44):

Yep

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:13):

actually, nothing needs to happen with abilities so you can disregard that

view this post on Zulip Richard Feldman (May 24 2024 at 18:16):

ok cool!

view this post on Zulip Richard Feldman (May 24 2024 at 18:17):

I wonder - is this a situation where we can split up the work by having someone knowledgeable about the low-level parts make a starting point commit like "ok now the refcounting happens in this one place, but it also needs to happen in these N other similar situations that need to be considered individually" and someone else can pick up that part?

view this post on Zulip Folkert de Vries (May 24 2024 at 18:18):

@Ayaz Hafiz what is the memory layout supposed to be? where does that function pointer go?

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:20):

one sec, im putting up the document i wrote about this before

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:24):

https://github.com/roc-lang/rfcs/pull/4

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:24):

the structure of the erased closure is

struct erased {
  void* value;
  void (*refcounter)(void*);  // NULL if value needs no refcounting
  void* callee;  // NULL if the erased value is not a closure
  struct ability_dictionary* abilities;  // Discussed below
};

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:24):

ability_dictionary is NULL for closures

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:24):

callee is the function target`

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:24):

value are the captures (NULL if no captures)

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:25):

refcounter is a pointer to the refcounting function for the captures. the name needs to be generated/looked up whenever the erased closure is constructed.

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:26):

i suppose it should really be void (*refcounter)(int, void*) where the first parameter indicates inc or dec

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:26):

or split them out into separate pointers

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:28):

when refcounting happens for captures is the same as when refcounting happens today. you just need to lookup the refcounting function differently because it's not statically known from the caller's side

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:28):

fyi this memory layout is already implemented

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:28):

but the refcounting function is not populated

view this post on Zulip Richard Feldman (May 24 2024 at 18:31):

nice! do you know where that code lives at the moment?

view this post on Zulip Richard Feldman (May 24 2024 at 18:31):

like where the refcounting fn needs to go

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:33):

https://github.com/roc-lang/roc/blob/25f230fda81ef57387dc73cc315a15b5e383752b/crates/compiler/gen_llvm/src/llvm/erased.rs#L61-L89

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:34):

oh, well the actual implementation already refernces refcounter_inc and refcounter_dec, so it is two separate functions

/// Erased is laid out like
///
/// text /// struct Erased { /// value: void*, /// callee: void*, /// refcounter_inc: (void* -> void) *, /// refcounter_dec: (void* -> void) *, /// } ///

view this post on Zulip Ayaz Hafiz (May 24 2024 at 18:34):

https://github.com/roc-lang/roc/blob/25f230fda81ef57387dc73cc315a15b5e383752b/crates/compiler/gen_llvm/src/llvm/erased.rs#L33


Last updated: Jul 06 2025 at 12:14 UTC