We are looking at this example (from https://github.com/roc-lang/roc/issues/5701#issuecomment-2016888604) of an alias analysis error.
# Main.roc
interface Main
exposes []
imports [Helper]
expect
_ = Helper.f (loop {})
1 == 1
loop = \{} ->
Helper.foobar \_ ->
if Bool.true then
loop {}
else
\_ -> {}
# Helper.roc
interface Helper
exposes [f, foobar]
imports []
f : ({} -> {}) -> {}
f = \_ -> {}
foobar : ({} -> ({} -> {})) -> ({} -> {})
foobar = \fn -> \_ -> (fn {}) {}
run
roc test Main.roc
What happens here is that Main.roc
will call foobar
with a particular LayoutId (really a ([LayoutId, Niche, LayoutId)
), say foobar#32
. It will also ask Helper
to generate an implementation of foobar
for the particular type at the call site. Helper
will do so, but gives it another name, e.g. foobar#33
. This wil now fail with an alias analysis error because loop
calls a function that does not exist.
So, in effect, the same type gets two different layout ids. That causes problems down the line. We need to somehow guarantee that both parent and child module use the same LayoutId at both the definition and call site.
When we put all definitions in the same module, it works. So something fails to translate between modules.
@Ayaz Hafiz any idea for how how to track this down? We've tried printing a bunch of things but it's not really telling us much besides confirming the above diagnosis
cc @Richard Feldman
yeah, i don't really know how we solve this problem in general
i think i discussed this before, let me see if i can find earlier notes
this is one reason this issue can happen: https://github.com/roc-lang/roc/issues/5464#issuecomment-1631583439. As i mention there i have no idea how to fix it
I fixed this a while ago for some structural types via "fixpoint fixing", which during unifications finds if there are types that are isomorphic but whose fixpoint is not equivalent (e.g. https://roc.zulipchat.com/#narrow/stream/231635-compiler-development-.28private.29/topic/.E2.9C.94.20alias.20references.20to.20mutually.20recursive.20types/near/300843640). It's possible that if the type is cloned across modules, its appearance in the new module causes it to get fixpoint-fixed again into a form that produces a different layout (i.e. different fixpoint) in the new module.
this is truly the curse of structural types. Maybe we need some way to "lock" the structure of a type once it gets exported?
I'm not sure
if performance were no concern, is there any sort of "brute force" solution that would be comically inefficient but get the right answer?
"locking" as mentioned above would solve it
i.e. don't allow the type to change after it's exported, all other types must unify to it exactly
but that changes semantics right? Like some programs wouldn't compile anymore
or rather, they wouldn't type-check anymore, when today they would
why wouldn’t they?
I might be misunderstanding the implications of this: :sweat_smile:
Ayaz Hafiz said:
i.e. don't allow the type to change after it's exported, all other types must unify to it exactly
it sounds like "all other types must unify to it exactly" means that there would be some scenario(s) where a type gets less polymorphic
like today it would be allowed to unify into a different type which later successfully unifies with other types, but instead it would refuse to unify into that other type, and therefore might have a type mismatch with other types later
but maybe I'm misunderstanding the implications there!
is there a case where a type gets imported from a module and doesn't become less, or equivalently, polymorphic?
i guess the crux of my point is that once you find a location of a recursionpointer in a type, you must decide whether that recursionpointer's location is allowed to change in the future or not.
if the type is unique in the sense that nothing else has unified with it, it's fine for it to change.
but if it's not unique (e.g. exported to another module), it cannot change without running into the bug here.
so finding some way to encode/enforce that would be the fix.
I'd love to try this out, and I'll actually have a week off after I get back from Milan, but I have no idea where to start with the implementation :laughing:
any suggestions for how to go about doing this?
yeah, so it's probably most useful to start here with the definitions of recursive types:
https://github.com/roc-lang/roc/blob/e500d664fdd5b79b004ec5be4614424557b67425/crates/compiler/types/src/subs.rs#L2362-L2366
and how fixpoint fixing works
im realizing though this may be hard more difficult than it might seem though
if there are two recursive types that are isomorphic but not equal, imported from two different modules into the same module M, then when we compare them in M, it may be impossible to reconcile them into the same type
yeah okay i don't know how to deal with that without breaking module isolation for typechecking, lol
or compiling all recursive types as boxed at every level
to put the problem concretely: suppose we have modules A and B both importing the definitions
F : [FromG G]
G : [G {lst : List F}]
# the unfurled representation of these types is
# F = [FromG [G {lst: List <1>}] as <1>
# G = [G {lst: List [FromG <2>]}] as <2>
# where <1>, <2> are recursion pointers
both expose val
where
# module A
val : F
val = FromG (G {lst: []})
# module B
val : G
val = G {lst: []}
now you import both in module C and do a comparison
A.val == (FromG B.val)
In C we are comparing
A.val : [FromG [G {lst: List <1>}] as <1>
to
(FromG B.val) : [FromG [G {lst: List [FromG <2>]}] as <2>]
which typechecks because these types are clearly isomorphic. But their layouts are not the same. And we cannot change the type of either A.val
or B.val
to be equal to the other's type without also changing the types in module A
and B
, respectively.
ugh. if we break module isolation, incremental compilation will become very difficult in the future.
can we not make F
and G
structurally the same in the defining module?
we could, but suppose it was instead
F : [FromG G_]
G_ : [G {lst : List F}]
F_ : [FromG G]
G : [G {lst : List F_}]
now F and G are not related in the defining module but [FromG G]
is still isomorphic to F
hmm yes
I'll run this by some ocaml people
also even in the earlier example F
and G
could be defined in distinct modules (with slight modifications), right?
yeah
because this is all about structural equality
yeah, exactly
the curse of structural recursion with unboxed compilation. maybe this is why everyone uses nominal types
Ayaz Hafiz said:
ugh. if we break module isolation, incremental compilation will become very difficult in the future.
is that necessarily true if this is the only way we break it?
for example, if what we cache is "these are the resolved type annotations, but they can be modified further later"
I guess we potentially have to load a lot more from disk if we need all the subs
i think the dependency graph becomes a mess, because it is no longer is DAG
If you had the graph A -> C <- B
, where C depends on both A and B as in the example above, a change in A now can force a change in B
hm true
ok crazy idea. recursive types must be nominal
this is potentially a can of worms, but probably worth considering
:thinking: what about "exposed types must be annotated" instead?
like to expose them from a module
we still have the same problem though right? For example if val
is annotated as F and G accordingly in both modules A and B in the example above.
yeah true
so in other words they can only be recursive if they're opaque
yes. the idea being that the name of the opaque type forces one representation of the type, and hence one layout
interesting!
can we think of cases where this would cause problems?
yeah, there's certainly some issues to figure out with that approach though. A couple off the top of my head:
Op := # my opaque recursive definition
eq = \@Op ob1, @Op ob2 -> ob1.fieldA == ob2.fieldA && ob1.fieldB == ob2.fieldB
inside this definition, the field accesses are structural and recursive, which we've disallowed. So some language rule to allow this needs to be defined.
Actually the second might have an easy solution - if the recursion point is named by an opaque type it's fine.
but this needs a design doc
I'm very hype about this direction! :smiley:
it seems like it would come up relatively rarely, and existing cases of "recursion has to happen behind an opaque type" haven't felt like a big burden
more of a surprise than anything
@Ayaz Hafiz are you up for writing a design doc on this?
yeah i can
awesome, thank you so much!
it would be so huge to have this working
huh, I guess morphic doesn't have this problem because it's a separate pass
maybe that's a relevant difference in correctness between doing it during type checking vs in a separate pass
morphic only has nominal types
for recursive defs anyway
ha, til!
but a separate pass would also fix the problem, right?
because then "revisiting modules" wouldn't be a problem
yes, but it would need a whole-program analysis
i.e. you would still need to examine all modules, i think
to find all instances of a type and produce one canonical form
right
yeah not saying it's the right design, just something I hadn't thought of until now
requiring opaque types feels like the better design to proceed with
I can't think of a reason this would be a problem, but if you define the opaque type as := _
does that reintroduce any issues because its (recursive) structure is being completely inferred?
i’m not sure but it probably does introduce complexity
worth thinking though I guess, since that's definitely a thing that can happen today!
huh? writing O := _
?
pretty sure that's not allowed
it's allowed today!
and even if it weren't, if it turned out that was a problem, you'd probably have to disallow _
from appearing in there at all, including in any type aliases it references - since type aliases can also be defined with _
today
that said, I'm open to disallowing _
in opaque type and/or type alias declarations if necessary to fix this (since they're allowed today but don't seem to be used significantly in practice)
── UNBOUND TYPE VARIABLE ─────────────────────────────────────────────── B.roc ─
The definition of O has an unbound type variable:
3│ O := _
^
Tip: Type variables must be bound before the :=. Perhaps you intended
to add a type parameter to this type?
Am I doing this wrong?
interface B exposes [] imports []
O := _
f = \@O x -> x + 1
expect
f (@O 1) == 2
this program doesn't compile for me
pretty sure type aliases also can't have _
── UNBOUND TYPE VARIABLE in B.roc ──────────────────────────────────────────────
The definition of O has an unbound type variable:
3│ O : _
^
Tip: Type variables must be bound before the :. Perhaps you intended
to add a type parameter to this type?
Allowing _
in type aliases would have a few UX problems too
huh, I'm on mobile and the online repl accepted it :sweat_smile:
seems like a bug. im pretty sure this isn't allowed. even if it is, i think we should revisit ever supporting this
looks like the repl is deferring evaluation. if you use
O := _
f = \@O x -> x + 1
f (@O 1)
you'll see it errors out
ahh gotcha
yeah I guess we should give an error for this at the parsing stage then
for both opaque type and type alias declarations
should ability declarations also disallow it?
I'm gonna add an explicit error for it
yes, let's disallow it in abilities too
I posted a simpler, single file reduction in the issue here.
app [main] {
pf: platform "https://github.com/roc-lang/basic-cli/releases/download/0.10.0/vNe6s9hWzoTZtFmNkvEICPErI9ptji_ySjicO6CkucY.tar.br",
}
import pf.Stdout
import pf.Task exposing [Task]
First : {}
Second : First {}
main =
testFunc {}
testFunc : Second -> Task {} [StdoutErr Stdout.Err]
testFunc = \{} ->
Stdout.line "123"
It looks (without compiler analysis) that we generate a layout for First
, think Second
has the same layout, but don't pass First
's layout to the generated Task
That's just a guess though
Sorry to interrupt discussion, just figured there's more visibility on Zulip
Thanks for the note Sam. Unfortunately I think that issue is different and the bug is that Second : First {}
isn't caught as a type error; if I change it to Second : First
, the program compiles.
Yes, that's true. Good to know
btw is this related to https://github.com/roc-lang/roc/issues/5464#issuecomment-1631583439 or is that a separate issue?
That's a separate issue
Document for consideration on this issue: https://github.com/roc-lang/rfcs/pull/1. Feedback appreciated from everyone.
Wow... That's super thorough and impressive!
As to your proposed solution of using opaque types. I think it is another example of the fact that opaque types kind of come in two forms "types I wish to hide the implementation of" and "types I need the features of opaque types for (see some of the discussion on Json decoding and enum types)".
I do think the current setup is unergonomic for those second class of types because I generally don't actually want them to be "opaque"
In my own codebase I tested out implementing from
and to
abilities for opaque types that I wnated to be less opaque. eg: when from myopaqueType is
In summary. My only concern with solution 1 is reduced ergonomics, but I think that is already an issue that needs to be investigated.
Thanks for the feedback Eli!
Eli Dowling said:
I think it is another example of the fact that opaque types kind of come in two forms "types I wish to hide the implementation of" and "types I need the features of opaque types for (see some of the discussion on Json decoding and enum types)".
I do think the current setup is unergonomic for those second class of types because I generally don't actually want them to be "opaque"
I know this has come up before, but personally I really hope we can avoid introducing "nominal but not opaque types" to the language
that said, I just realized that this solution to the problem will create a different (much smaller) problem: I think it's important for multiple reasons that we disallow sending opaque types across the host boundary (in either direction), and this will mean that it would be disallowed to send recursive types to or from the host
obviously it would be ideal to not have that limitation somehow, but like I said, I really would prefer not to introduce the language feature of nominal types that are not opaque :sweat_smile:
so we tried the solution of making task an opaque type, but that does not (yet) work either. This code
interface Task
exposes [Task, stdoutLine, Op]
imports []
Op : [
StdoutLine Str (Box ({} -> Op)),
StdinLine (Box (Str -> Op)),
None,
]
Task ok err := (Result ok err -> Op) -> Op
stdoutLine : Str -> Task {} a
stdoutLine = \line ->
helper: ((Result {} a) -> Op) -> Op
helper =
\fromResult -> StdoutLine line (Box.box \{} -> fromResult (Ok {}))
@Task helper
gives this error with cargo run -- check examples/platform-switching/rust-platform/Task.roc
── CIRCULAR TYPE in examples/platform-switching/rust-platform/Task.roc ─────────
I'm inferring a weird self-referential type for stdoutLine:
14│> stdoutLine = \line ->
15│> helper: ((Result {} a) -> Op) -> Op
16│> helper =
17│> # crash "foo"
18│> \fromResult -> StdoutLine line (Box.box \{} -> fromResult (Ok {}))
19│>
20│> @Task helper
Here is my best effort at writing down the type. You will see ∞ for
parts of the type that repeat something already printed out
infinitely.
Str -> Task {} a
@Ayaz Hafiz this is no longer a module crossing problem now, is what we observe here entirely different from the original problem we're trying to solve here?
yes that seems different
try making Op opaque too
right, we tried that next
Op := [
StdoutLine ({} -> Op),
StdinLine ({} -> Op),
None,
]
Task := ({} -> Op) -> Op
stdoutLine : Task
stdoutLine =
helper: ({} -> Op) -> Op
helper =
\fromResult -> @Op (StdoutLine ((\{} -> fromResult {})))
@Task helper
gives
── CIRCULAR TYPE in examples/platform-switching/rust-platform/Task.roc ─────────
I'm inferring a weird self-referential type for stdoutLine:
14│ stdoutLine =
^^^^^^^^^^
Here is my best effort at writing down the type. You will see ∞ for
parts of the type that repeat something already printed out
infinitely.
Task
hm
will take a look in a minute
this is some kind of bug in the occurs checking
if you remove the annotation on stdoutLine it works for me
well, it typechecks at least
interface B exposes [main] imports []
# Task.roc
Task v op := (v -> op) -> op
await : Task a op, (a -> Task b op) -> Task b op
await = \@Task fromResult, next ->
@Task \continue ->
fromResult \result ->
@Task inner = next result
inner continue
# StdinEffect.roc
OpIn a b : [
StdinLine (Str -> OpIn a b),
Done a,
]b
# Stdin.roc
lineIn : Task Str (OpIn * *)
lineIn = @Task \toNext -> StdinLine \s -> (toNext s)
# StdoutEffect.roc
OpOut a b : [
StdoutLine Str ({} -> OpOut a b),
Done a,
]b
# Stdout.roc
lineOut : Str -> Task {} (OpOut * *)
lineOut = \s -> @Task \toNext -> StdoutLine s \{} -> (toNext {})
# Platform
# really, we want a syntax like [Done a](OpIn a)(OpOut a) here
Op a : [
StdinLine (Str -> Op a),
StdoutLine Str ({} -> Op a),
Done a,
]
main : Task {} (Op *)
main =
lineIn
|> await \s -> lineOut s
interface B exposes [stdoutLine] imports []
Op := [
StdinLine (Str -> Op),
StdoutLine Str ({} -> Op),
Done,
]
Task := ({} -> Op) -> Op
stdoutLine =
@Task \fromResult -> @Op (StdoutLine "foo" (\{} -> fromResult {}))
Okay, I suggest as a path forward here:
-Remove lambda sets in favor of erasure, at least for the time being
-Remove anonymous recursive types as described above, or at least severely restrict them
I know this has various downsides including runtime performance. But lambda sets can be added later, in a separate pass. This will temporarily push Roc further away from its end goals, but I think the goal of making a fast compile-time and runtime language has put the implementation in a position where it’s difficult to recover from these issues we discover only years later. A functional compiler is more useful than a fast one that has bugs. And, without lambda sets or anonymous recursive types, the implementation should be much more approachable to contributors.
but, the erased version also does not work with current examples. Do you know what the problem is there?
Almost certainly the problem of recursive types across modules
I was just rereading the doc and I wondered - would the fixpoint fixing work if we delayed it until after typechecking was done? e.g. if we did it during mono, when we’re already revisiting modules
You would still need to break module isolation during type inference
because you need to know "what types, across all modules, are in the same strongly-connected components"
gotcha
I really think the goal here should be to simplify, at the cost of the performance, until this works again, and then layer on optimizations as needed
@Ayaz Hafiz is breaking that isolation required during type inference, or only in later steps, assuming these steps
where is the isolation breaking actually required? Intuitively it's only at step 3 that we need it, given the restriction on recursive types that was proposed earlier in this thread
yes, if recursive types must be nominal, then
MonoLam = Mono IR today
can we not perform step 2 just in subs only (while traversing Can, but not materializing a whole separate AST)?
there is probably a way, but i don't see it quite yet. one complication is abilities. if you do not materialize the call to a specific lambda in the AST you must keep the call information elsewhere. so you would need a lookaside table that associates a type specialization ID + can AST node ID => ability specialization
this could be an optimization. i suspect it will be easier to implement and debug if it's a separate AST.
Ayaz Hafiz said:
I really think the goal here should be to simplify, at the cost of the performance, until this works again, and then layer on optimizations as needed
ok so how would we go about doing this? I remember we'd talked about this before but I don't remember the specific steps it would take :sweat_smile:
something to do with marking closures as erased maybe?
RecursiveType
constructor needs to be adjusted to hold a symbol naming the recursive type. During can, construct recursive types from opaque type definitions as needed. Two recursive types are equal only if their RecursiveType constructors are equal in symbol name, and the arguments to their opaque type unify.Ayaz Hafiz said:
- Replace lambda sets with type erased closures. This is already implemented for the most part. The missing pieces are reference counting and abilities.
ok cool! So where should we look for implementing type-erased closures?
more specifically, where does the refcounting need to happen? Could we reuse some logic from Box
maybe?
also, what needs to happen with regards to abilities? Normally functions don't have any abilities, but I'm assuming there's some nuance I'm missing there :big_smile:
whe need to generate a function that decrements all captured values I think? and make that accessible such that a dec someClosure
calls the corresponding function
right?
Like with my work for lists, can we make this lazy? Does that make sense here.
As in just refcount the erased closure instead of every value in the closure? I guess it maybe makes less sense here than in lists.
Fundamentally, for wrapped types, I think only refcounting the wrapper is a nice property.
Of course when loading values from the wrapper, you would need to deal with refcounts
yes sure but the generation work is mostly the same (but only runs when the rc goes from 1 to 0)
Yep
actually, nothing needs to happen with abilities so you can disregard that
ok cool!
I wonder - is this a situation where we can split up the work by having someone knowledgeable about the low-level parts make a starting point commit like "ok now the refcounting happens in this one place, but it also needs to happen in these N other similar situations that need to be considered individually" and someone else can pick up that part?
@Ayaz Hafiz what is the memory layout supposed to be? where does that function pointer go?
one sec, im putting up the document i wrote about this before
https://github.com/roc-lang/rfcs/pull/4
the structure of the erased closure is
struct erased {
void* value;
void (*refcounter)(void*); // NULL if value needs no refcounting
void* callee; // NULL if the erased value is not a closure
struct ability_dictionary* abilities; // Discussed below
};
ability_dictionary is NULL for closures
callee
is the function target`
value
are the captures (NULL if no captures)
refcounter
is a pointer to the refcounting function for the captures. the name needs to be generated/looked up whenever the erased closure is constructed.
i suppose it should really be void (*refcounter)(int, void*)
where the first parameter indicates inc or dec
or split them out into separate pointers
when refcounting happens for captures is the same as when refcounting happens today. you just need to lookup the refcounting function differently because it's not statically known from the caller's side
fyi this memory layout is already implemented
but the refcounting function is not populated
nice! do you know where that code lives at the moment?
like where the refcounting fn needs to go
oh, well the actual implementation already refernces refcounter_inc and refcounter_dec, so it is two separate functions
/// Erased is laid out like
///
/// text
/// struct Erased {
/// value: void*,
/// callee: void*,
/// refcounter_inc: (void* -> void) *,
/// refcounter_dec: (void* -> void) *,
/// }
///
Last updated: Jul 06 2025 at 12:14 UTC