I'm working on canonicalization that allows shadowing, and I'm finding some tricky edge cases in even just defining (and trying to explain in a way where people can understand them) what the rules are
here's an idea I currently like for what the rules would be:
- You can write foo = at the top level, but then later on at the top level you can't say foo = again; that's an error (just like today).
- If blah is defined at the top level, you can refer to blah before it has been defined. (Just like today.) You can define mutually recursive functions this way, but of course you can't make mutually recursive non-functions (also like today).
- When not at the top level, ordering is strictly enforced. (In a nested scope, you can write foo = to shadow the top-level one if you like, and in fact you can also write foo = again in that same scope to redeclare it.)
these rules seem hopefully pretty straightforward to learn and to apply
an implication of these rules is that mutually recursive functions can only be defined at the top level
that seems fine to me since they come up very rarely, even when they do come up they're usually written at the top level anyway, and of course it wouldn't block anything because you can always rewrite any lambda to be a top-level function by having it take explicit parameters for whatever it would have been closing over
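a rough sketch of that rewrite, in Rust rather than Roc since the rules above are only a proposal (even_hops, odd_hops and step are made-up names, and step is assumed to be greater than 0): a pair of mutually recursive helpers that would have closed over step instead take it as an explicit parameter, which lets them live at the top level where declaration order doesn't matter

```rust
/// True when `n` can be reduced below `step` in an even number of hops.
/// Assumption: `step > 0`, otherwise the recursion never terminates.
/// `step` is the value a nested closure would have captured; here it is
/// passed explicitly so the function can live at the top level.
pub fn even_hops(step: u32, n: u32) -> bool {
    if n < step {
        true
    } else {
        // Refers to `odd_hops` before its declaration; fine at the top level.
        odd_hops(step, n - step)
    }
}

/// The other half of the mutual recursion, declared after its first use.
pub fn odd_hops(step: u32, n: u32) -> bool {
    if n < step {
        false
    } else {
        even_hops(step, n - step)
    }
}
```

the cost is just the extra step argument threaded through every call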
it might seem needlessly restrictive, but consider (as I just got through doing!) what happens if you lift the "When not at the top level, ordering is strictly enforced" restriction: what happens if I want to write mutually recursive closures? One must necessarily refer to the other one before it has been declared, and the other one might be declared later on in an enclosing scope rather than in the current block of defs
then think about how that interacts with shadowing: what if one of the closures in the mutual recursion is named foo, but foo is redeclaring (or shadowing) something that was previously named foo, but that one wasn't a function
so theoretically we have enough information to infer that foo and some other function are mutually recursive, by ignoring the non-function foo that's declared in between the function foo and the thing that mutually recurses with it...
...but at this point it might be very confusing, to say the least, to figure out what is referring to what
the nice thing about the proposed rule set is that:
- foo refers to: "either a foo that's declared earlier in the source file, or if there is none, then a foo top-level declaration later in the file" - otherwise it's definitely a naming error!
thoughts?
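to make that lookup rule concrete, here is a minimal sketch in Rust. It is not how the Roc compiler does it: Decl, resolve and the flat list of declarations are all hypothetical, and the caller is assumed to pass only declarations whose scope encloses the lookup.

```rust
/// One declaration in a hypothetical flattened view of a source file.
#[derive(Debug, Clone, PartialEq)]
pub struct Decl {
    pub name: &'static str,
    /// Position of the declaration in the source file.
    pub pos: usize,
    /// True for top-level declarations, false for nested ones.
    pub top_level: bool,
}

/// Resolve a lookup of `name` at source position `use_pos`.
/// Assumption: `visible` holds only declarations in scope at `use_pos`.
/// Returns `None` for what the thread calls a naming error.
pub fn resolve<'a>(visible: &'a [Decl], name: &str, use_pos: usize) -> Option<&'a Decl> {
    // Rule 1: the most recent declaration earlier in the file wins.
    let earlier = visible
        .iter()
        .filter(|d| d.name == name && d.pos < use_pos)
        .max_by_key(|d| d.pos);

    // Rule 2: otherwise fall back to a top-level declaration later in the file.
    earlier.or_else(|| {
        visible
            .iter()
            .filter(|d| d.name == name && d.top_level && d.pos > use_pos)
            .min_by_key(|d| d.pos)
    })
}
```

with a top-level foo at position 1, a nested foo at position 5 and a top-level bar at position 10, a lookup of foo at position 7 picks the nested one, a lookup of bar at position 7 picks the later top-level one, and a lookup of baz finds nothing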
When not at the top level, should we disallow redeclaration of same-scope functions? It seems it would be confusing to have:
foo = -1
foo = \x -> x * foo
foo = \x ->
    if x >= 5 then
        x
    else
        foo (x * 3)
foo 2
Does this example (which iiuc may be permitted by the proposed rules) result in:
- -6, in case the last foo uses the middle foo instead of self-recursion?
- an error, in case the middle foo is self-referential and thus fails due to multiplication involving a function value?
In any case, at least for non-dev builds, we should perhaps disallow unused declarations that are not top-level?
in a nested scope, you can write foo = to shadow the top-level one if you like
Can you give an example when this is useful?
sure, I'm writing a parser and I want a top level value named str so I can expose it as Parser.str but I also want to name something str locally as an intermediate value when implementing a different parser
that said, we can also address that by letting you name the top level one as like strInner and then expose it as strInner as str
so I could see an argument for "top level values can't be shadowed/redeclared at all" as opposed to just "they can't do it themselves"
Kevin Gillette said:
In any case, at least for non-dev builds, we should perhaps disallow unused declarations that are not top-level?
I don't think that's necessary; you get a warning for unused declarations regardless, so if you ship it it's because you were informed about it and decided to ignore it. I don't think we need to disallow shipping under those circumstances!
Kevin Gillette said:
When not at the top level, should we disallow redeclaration of same-scope functions? It seems it would be confusing to have:
foo = -1
foo = \x -> x * foo
foo = \x ->
    if x >= 5 then
        x
    else
        foo (x * 3)
foo 2
self-recursion should always be allowed (like today), so in this example:
foo = \x -> x * foo
here, foo refers to itself (self-recursion) because itself is always the "most recent declaration"
same here: foo refers to itself
foo = \x ->
    if x >= 5 then
        x
    else
        foo (x * 3)
foo 2
this refers to the most recent foo, also as normal
so I don't think that example needs to be disallowed
worth noting: another factor which led me to start thinking about this is that allowing things like nested mutual recursion both complicates and slows down the compiler significantly (at least percentage-wise) compared to this rule set
which led me to start wondering "why pay such a high runtime and implementation complexity cost to support writing confusing code?"
I don't know if I'd personally choose to redefine a function with foo = several times in a row, but I don't think it's hard to figure out what it's doing :big_smile:
This sounds great to me!
Can you shadow with lambda params? I.e.
foo = 10
bar = \foo ->
    Str.concat foo "bar"
This is something I’ve run into a few times in practice
yeah that would be allowed in this idea
Okay perfect
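for comparison, Rust already allows the same thing with closure parameters. A small sketch (lambda_param_shadowing is a made-up name, and the outer foo is an integer here just to show the two bindings are unrelated):

```rust
/// A closure parameter named `foo` shadows an outer `foo` only inside
/// the closure body; the outer binding is untouched afterwards.
pub fn lambda_param_shadowing() -> String {
    let foo = 10;
    // Inside the closure, `foo` is the string parameter, not the integer.
    let bar = |foo: &str| format!("{}bar", foo);
    // Outside the closure, `foo` is still the integer declared above.
    format!("{} {}", bar("baz"), foo)
}
```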
I agree, same-level redeclaration is still a lot better than mutation because at any point in the code there is still only one declaration that matters and the editor can point me right to it
foo = -1
foo = \x -> x * foo
self-recursion should always be allowed (like today), so in this example:
foo = \x -> x * foo
here, foo refers to itself (self-recursion) because itself is always the "most recent declaration"
I think this should be reconsidered. I think this code would confuse most users.
I think that most users would read this as:
-- Foo is -1
foo = -1
-- Foo is a function that takes in x, captures foo (which is -1), and returns x * foo
foo = \x -> x * foo
-- This returns -7
foo 7
If instead the first declaration of foo is unused and the second is self-recursive, then this code would return a compile-time error, because in x * foo, foo is being used as a Num a instead of a Num a -> Num a: no args are being passed to the self-recursive foo function. That error alone would probably be quite confusing to users, given that just above that line is a version of foo that is defined as a Num a.
interesting, so basically recursive closures would be disallowed in that design
not just mutually recursive, but recursive in general
so if you wanted to do any type of recursion, it would have to be at the top level
I guess that's how it works in a lot of languages, to be fair :thinking:
I think if no shadowing was involved, a recursive closure would make sense. That said, once shadowing is involved, I think referencing the previous value makes more sense.
At least from the perspective of quickly looking at the code and asking what most users would immediately expect it to do.
I dunno, honestly I prefer the simpler rule at that point: "if you want to recurse, do it at the top level"
as opposed to "you can recurse outside the top level, but only if you are specifically recursing on something that has not been defined earlier in scope"
I think I would actually prefer "shadowing of closures isn't allowed". This allows for closures to be recursive.
But yeah, should pick that or your other simple rule
Also, the reason for my preference is that it can often be nice to hide recursive helper functions and give them simple names
Instead of:
thisIsAlreadyALongName = \... ->
    ...
thisIsAlreadyALongNameHelper = \... ->
    ...
You can write:
thisIsAlreadyALongName = \... ->
    helper = \... ->
        ...
    ...
so to summarize, the concrete idea I'm now thinking of is:
...and if you need them to be recursive, move them to the top level!
Given only top level functions are free from ordering, I think it makes a lot of logical sense.
oh yeah, good point - I edited it to note that :big_smile:
interestingly, in conjunction with compile-time evaluation, these can be taught to Rust programmers as:
- top-level declarations are like fn or const, depending on whether you're assigning to a lambda
- declarations everywhere else are like let
with the one additional rule that top-level fn and const in Rust are allowed to be shadowed
but not in Roc
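a small Rust sketch of that analogy (BASE, DOUBLE_BASE, total, double and shadow_top_level are all made-up names; only the fn case of shadowing is shown): top-level const and fn items can refer to each other in any order, while let bindings inside a function are strictly ordered and can be redeclared

```rust
// Top-level items are order-independent: `BASE` uses a later `const`.
const BASE: i64 = DOUBLE_BASE / 2;
const DOUBLE_BASE: i64 = 20;

pub fn total(x: i64) -> i64 {
    // `let` bindings are ordered; each `x` redeclares the previous one.
    let x = x + BASE;
    // `double` is a top-level fn declared further down the file.
    let x = double(x);
    x
}

fn double(n: i64) -> i64 {
    n * 2
}

pub fn shadow_top_level() -> i64 {
    // A local `let` may shadow the top-level fn `double` in Rust;
    // this is the part the proposal above does not carry over to Roc.
    let double = 3;
    double + 1
}
```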
a thing I like about this is that the rules only come up when you're writing code, but when you're reading it you probably don't need to be aware of them at all
Brendan Hansknecht said:
Also, the reason for my preference is that it can often be nice to hide recursive helper functions and give them simple names
yeah thinking about it more, and talking to @Folkert de Vries about it, I think this is worth preserving
One nice thing about being able to define a recursive helper as a sub definition is that then you can close over values that don’t change during the recursion which can make the code a lot cleaner.
A note on the recursion discussion. Coming from any other language I would immediately assume that foo is recursive and the first definition is unused.
foo = -1
foo = \x -> x * foo
I would think the other behaviour would just cause regular frustration. I think the more weird edge cases you have the less pleasant a language is to use. Like "you can recurse, but not within shadowing within local functions": that just feels like another silly thing to remember.
Coming from any other language?
foo = -1
foo = \x -> x * foo
I thought it was using the previous foo in the function body and wasn't recursing. The recursive case makes more sense if I do some reasoning from a lang. designer point of view, but I wouldn't want to think about such things while coding. I don't think this kind of code should be allowed.
foo = -1
foo = \x -> x * foo
I don't even know what I would expect here. I think in practice, the foo = -1 would be somewhere else most of the time, and you would only see foo = \x -> x * foo, and logically would assume recursion here.
But another Idea: Should this even be allowed? I want shadowing for stuff like model = doSomethingWith model, not for functions. Would it be possible to just not allow reassigning a variable with a function?
in general I'm not a fan of the idea of more complicated shadowing rules than "is allowed everywhere except at the top level"
like shadowing/redeclaration rules varying by type sounds more complicated than it's worth to me
as opposed to (for example) allowing it but culturally discouraging it
another thing that just occurred to me: all the reasons that non-function defs should not be allowed to be out-of-order are just as applicable in the top level as they are in function bodies
e.g. if any of them has a dbg, or a failed expect, or a crash, and they're out of order, then those will print in a surprising order
because they got silently reordered by the compiler
so I think the actual rule we want here is "top-level functions can be declared in any order"
(and of course type aliases and opaque types and abilities, since those can be declared in any order anywhere)
but top-level constants that aren't functions still need to be in order
that becomes especially true when we evaluate those at compile time, because then they will always be getting evaluated in exactly the order they appear in the source file, so dbg/crash/expect output appearing in a different order will be even more surprising
so then in that world, the overall proposed rules would be:
It is kinda interesting. In most languages functions aren't values (at least the standard written way, they may still have lambdas separately) so they wouldn't hit this issue. Functions and values are just a separate class.
Only since functions are values does roc even have to consider this issue as needing special rules
e.g. if any of them has a dbg, or a failed expect, or a crash, and they're out of order, then those will print in a surprising order
I honestly wouldn't worry about that. Those messages and such will be part of compile time. Or the top level constants will be secretly lambdas like today (which means order really doesn't matter).
I think forcing top level constants to be declared in order would be non-ergonomic. All top levels being out of order is very nice for code organization.
Roc isn't python. I don't think anyone expects the top level declarations to run in order like an imperative scripting language. So I would really push against that assumption.
I think it would just make the language semantics worse for no reason.
Once inside a function definition or a nested scope, that is different. Now you are in the world of a list of steps to do something or build something.
Brendan Hansknecht said:
I think forcing top level constants to be declared in order would be non-ergonomic. All top levels being out of order is very nice for code organization.
I think the thing that's nice ergonomically is that I can refer to any top-level constant from any function
like I don't think it really impacts my ergonomics whether this is allowed or not:
bar = foo + 1
foo = 2
Can I put constants used by a function after the function is defined?
sure
the constants would just need to be ordered with respect to each other
Ah. Then either is fine to me
so that they're ordered in the same order they'll be evaluated
cool!
Richard Feldman said:
so I could see an argument for "top level values can't be shadowed/redeclared at all" as opposed to just "they can't do it themselves"
I want to revisit this, actually - as I'm going through the implementation, I realize that enforcing this has a nontrivial performance cost; it either requires a second pass over the AST, or else much higher memory usage
what are people's thoughts about allowing top-level constants to be shadowed in nested scopes?
(given that the performance cost is nontrivial, I think the strength of the preference for this over the faster alternative design should also be nontrivial!)
Would this mean they can be shadowed anywhere, even at the top level? I assume not, I assume it is just that they can be shadowed within a function/ body of another top level?
yeah not at the top level
so the rule would be "top-level can't shadow" rather than "top-level can't be shadowed"
Sounds fine
Yeah, that sounds desirable to me
"Top level defs can't be shadowed" seems arbitrary to me anyway. As a language user, knowing that top level definitions are constants doesn't change the fact that I may want to shadow them, so I actually prefer the more performant one from a design point of view as well.
Yeah this makes sense to me too.
The way I would say this is: top level values can be shadowed but not redeclared.
And that's great, I prefer it. Fine to reuse a name as long as you have scope to distinguish it.
And redeclaring in the module scope would feel weird to me, so I'm happy not to have it.
Please reconsider the decision to allow shadowing. It allows non local edits to code to change meaning. You're making confusing rules about where recursion can be used in a pure functional language just in order to be able to footgunly reuse variable names and confuse the poor maintainer of your code six months down the line.
How hard is it to add an extra character or two versus how hard it is to fix a bug that you couldn't see because you read the foo, parsed the foo, transformed the foo, updated the foo, used the foo and sent the foo, but we found out last week that somehow since we started logging the foo a month ago, it isn't getting transformed at all and all the logged foos are out of date because they weren't updated. If I'd been forced by the super fast, super helpful compiler to call it something ugly like updatedFoo two years ago, none of this would have happened.
There's a function called str already. Don't let me call my variable str. There's an x in scope already. Don't let me make a nested lambda with x as the new parameter. I'll get this x sometimes when I meant the other one. I won't be able to immediately see what I did wrong by reading the code.
Yes, shadowing can be abused. Code style/review is important. Many languages have mutable variables. Many languages have shadowing. These languages are able to function just fine even with these features. This doesn't mean we should or shouldn't have shadowing, but it is an important piece of context.
Using many numeric suffixes on a variable in roc has already led to bugs and friction. So this isn't a clean tradeoff on one being buggy and messy while the other is clean and less buggy. This is a more complex tradeoff that needs more nuance in discussion. It isn't simply about adding an extra character to a variable name.
@Andrew C we plan on trying out shadowing; if it doesn't turn out well we'll change it.
Last updated: Jun 16 2026 at 16:19 UTC