Stream: ideas

Topic: shadowing and redeclaration except at top level


view this post on Zulip Richard Feldman (Feb 05 2024 at 02:03):

I'm working on canonicalization that allows shadowing, and I'm finding some tricky edge cases in even just defining (and trying to explain in a way where people can understand them) what the rules are

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:07):

here's an idea I currently like for what the rules would be:

  1. At the top level, redeclaration is not allowed. You can say foo = at the top level, but then later on at the top level you can't say foo = again; that's an error (just like today).
  2. Anything can always refer to anything in the top level regardless of order. So if blah is defined at the top level, you can refer to blah before it has been defined. (Just like today.) You can define mutually recursive functions this way, but of course you can't make mutually recursive non-functions (also like today).
  3. When not at the top level, you can shadow and redeclare things. (So in a nested scope, you can write foo = to shadow the top-level one if you like, and in fact you can also write foo = again in that same scope to redeclare it.)
  4. When not at the top level, ordering is strictly enforced. You can still reference top-level things that haven't been declared yet, but you cannot reference any non-top-level thing unless it has already been declared earlier in scope. Otherwise, you get an error.
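
The four rules above can be sketched as a toy checker. This is only an illustrative model in Python, not Roc's canonicalizer: the `(name, refs)` representation and the function names `check_top_level` and `check_nested` are assumptions made up for this sketch, and self-recursion (discussed later in the thread) is deliberately not modeled.

```python
# Toy model of the four proposed rules (hypothetical helpers, not Roc's canonicalizer).
# A def is a pair (name, refs): the name being bound and the names its body mentions.

def check_top_level(defs):
    """Rule 1: report names declared more than once at the top level."""
    seen, errors = set(), []
    for name, _refs in defs:
        if name in seen:
            errors.append(f"top-level redeclaration of {name}")
        seen.add(name)
    return errors


def check_nested(top_level_names, defs):
    """Rules 2-4: inside a nested scope, defs are checked strictly in order."""
    declared, errors = set(), []
    for name, refs in defs:
        for ref in refs:
            # Rule 2: top-level names are visible regardless of order.
            # Rule 4: anything else must already be declared earlier in scope.
            if ref not in top_level_names and ref not in declared:
                errors.append(f"{name} refers to {ref} before it is declared")
        # Rule 3: shadowing and redeclaring are allowed, so no duplicate check here.
        declared.add(name)
    return errors


top = {"blah", "foo"}
body = [("x", ["blah"]), ("foo", ["x"]), ("foo", ["foo"]), ("y", ["z"])]
print(check_nested(top, body))
# → ['y refers to z before it is declared']
print(check_top_level([("foo", []), ("bar", ["foo"]), ("foo", [])]))
# → ['top-level redeclaration of foo']
```

In this model the nested `foo` may shadow the top-level `foo` and then be redeclared, while the only error comes from `y` mentioning a name that is neither top-level nor declared earlier.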

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:08):

these rules seem hopefully pretty straightforward to learn and to apply

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:09):

an implication of these rules is that mutually recursive functions can only be defined at the top level

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:10):

that seems fine to me since they come up very rarely, even when they do come up they're usually written at the top level anyway, and of course it wouldn't block anything because you can always rewrite any lambda to be a top-level function by having it take explicit parameters for whatever it would have been closing over
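
As a rough illustration of that rewrite, here is a Python sketch of two mutually recursive closures and their lifted top-level equivalents. The names (`alternating_sum_nested`, `add`, `sub`) are invented for this example. Python itself permits the nested form; the sketch only shows the transformation of passing the closed-over value (`xs`) explicitly.

```python
# Nested form: add and sub are mutually recursive closures over xs.
def alternating_sum_nested(xs):
    def add(i):
        return 0 if i == len(xs) else xs[i] + sub(i + 1)

    def sub(i):
        return 0 if i == len(xs) else -xs[i] + add(i + 1)

    return add(0)


# Lifted form: the same functions at the top level, with the value they
# used to close over (xs) passed as an explicit parameter.
def add(xs, i):
    return 0 if i == len(xs) else xs[i] + sub(xs, i + 1)


def sub(xs, i):
    return 0 if i == len(xs) else -xs[i] + add(xs, i + 1)


def alternating_sum(xs):
    return add(xs, 0)


print(alternating_sum_nested([5, 3, 2]))  # → 4
print(alternating_sum([5, 3, 2]))         # → 4
```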

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:12):

it might seem needlessly restrictive, but consider (as I just got through doing!) what happens if you lift the "When not at the top level, ordering is strictly enforced" restriction: what happens if I want to write mutually recursive closures? One must necessarily refer to the other one before it has been declared, and the other one might be declared later on in an enclosing scope rather than in the current block of defs

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:13):

then think about how that interacts with shadowing: what if one of the closures in the mutual recursion is named foo, but foo is redeclaring (or shadowing) something that was previously named foo, but that one wasn't a function

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:13):

so theoretically we have enough information to infer that foo and some other function are mutually recursive, by ignoring the non-function foo that's declared in between the function foo and the thing that mutually recurses with it...

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:14):

...but at this point it might be very confusing, to say the least, to figure out what is referring to what

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:15):

the nice thing about the proposed rule set is that:

view this post on Zulip Richard Feldman (Feb 05 2024 at 02:20):

thoughts?

view this post on Zulip Kevin Gillette (Feb 05 2024 at 07:17):

When not at the top level, should we disallow redeclaration of same-scope functions? It seems it would be confusing to have:

foo = -1
foo = \x -> x * foo
foo = \x ->
  if x >= 5 then
    x
  else
    foo (x * 3)

foo 2

Does this example (which iiuc may be permitted by the proposed rules) result in:

  1. -6, in case the last foo uses the middle foo instead of self-recursion?
  2. A compilation error because the middle foo is self-referential and thus fails due to multiplication involving a function value?

In any case, at least for non-dev builds, we should perhaps disallow unused declarations that are not top-level?

view this post on Zulip Anton (Feb 05 2024 at 09:54):

in a nested scope, you can write foo = to shadow the top-level one if you like

Can you give an example when this is useful?

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:38):

sure, I'm writing a parser and I want a top level value named str so I can expose it as Parser.str but I also want to name something str locally as an intermediate value when implementing a different parser

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:41):

that said, we can also address that by letting you name the top level one as like strInner and then expose it as strInner as str

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:42):

so I could see an argument for "top level values can't be shadowed/redeclared at all" as opposed to just "they can't do it themselves"

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:50):

Kevin Gillette said:

In any case, at least for non-dev builds, we should perhaps disallow unused declarations that are not top-level?

I don't think that's necessary; you get a warning for unused declarations regardless, so if you ship it it's because you were informed about it and decided to ignore it. I don't think we need to disallow shipping under those circumstances!

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:52):

Kevin Gillette said:

When not at the top level, should we disallow redeclaration of same-scope functions? It seems it would be confusing to have:

foo = -1
foo = \x -> x * foo
foo = \x ->
  if x >= 5 then
    x
  else
    foo (x * 3)

foo 2

self-recursion should always be allowed (like today), so in this example:

foo = \x -> x * foo

here, foo refers to itself (self-recursion) because itself is always the "most recent declaration"

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:52):

same here: foo refers to itself

foo = \x ->
  if x >= 5 then
    x
  else
    foo (x * 3)

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:53):

foo 2

this refers to the most recent foo, also as normal

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:53):

so I don't think that example needs to be disallowed

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:54):

worth noting: another factor which led me to start thinking about this is that allowing things like nested mutual recursion both complicates and slows down the compiler significantly (at least percentage-wise) compared to this rule set

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:55):

which led me to start wondering "why pay such a high runtime and implementation complexity cost to support writing confusing code?"

view this post on Zulip Richard Feldman (Feb 05 2024 at 11:56):

I don't know if I'd personally choose to redefine a function with foo = several times in a row, but I don't think it's hard to figure out what it's doing :big_smile:

view this post on Zulip Isaac Van Doren (Feb 05 2024 at 12:24):

This sounds great to me!

view this post on Zulip Isaac Van Doren (Feb 05 2024 at 12:26):

Can you shadow with lambda params? I.e.

foo = 10

bar = \foo ->
    Str.concat foo "bar"

view this post on Zulip Isaac Van Doren (Feb 05 2024 at 12:26):

This is something I’ve run into a few times in practice

view this post on Zulip Richard Feldman (Feb 05 2024 at 12:35):

yeah that would be allowed in this idea

view this post on Zulip Isaac Van Doren (Feb 05 2024 at 12:37):

Okay perfect

view this post on Zulip Pearce Keesling (Feb 05 2024 at 12:47):

I agree, same-level redeclaration is still a lot better than mutation because at any point in the code there is still only one declaration that matters and the editor can point me right to it

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 16:10):

foo = -1
foo = \x -> x * foo

self-recursion should always be allowed (like today), so in this example:

foo = \x -> x * foo

here, foo refers to itself (self-recursion) because itself is always the "most recent declaration"

I think this should be reconsidered. I think this code would confuse most users.
I think that most users would read this as:

-- Foo is -1
foo = -1
-- Foo is a function that takes in x, captures foo(which is -1), and returns x * foo
foo = \x -> x * foo
-- This returns -7
foo 7

If instead the first declaration of foo is unused and the second is self-recursive, then this code would return a compile-time error, because in x * foo, foo is being used as a Num a instead of a Num a -> Num a (no args are being passed to the self-recursive foo function). That error alone would probably be quite confusing to users, given that just above that line is a version of foo that is defined as a Num a
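
The two readings can be mimicked in Python as a rough analogy. This is not Roc and the function names are made up for the sketch. Python resolves a closure's free variables at call time, so it happens to behave like the self-recursive reading, except that the failure shows up at runtime rather than compile time.

```python
def previous_binding_reading():
    # Each redeclaration is treated as a fresh name; the lambda captures the earlier one.
    foo_0 = -1
    foo_1 = lambda x: x * foo_0
    return foo_1(7)


def self_recursive_reading():
    foo = -1                  # effectively unused under this reading
    foo = lambda x: x * foo   # foo is looked up when called: it is the lambda itself
    return foo(7)             # raises TypeError: a number times a function


print(previous_binding_reading())  # → -7
try:
    self_recursive_reading()
except TypeError as err:
    print("TypeError:", err)
```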

view this post on Zulip Richard Feldman (Feb 05 2024 at 16:27):

interesting, so basically recursive closures would be disallowed in that design

view this post on Zulip Richard Feldman (Feb 05 2024 at 16:27):

not just mutually recursive, but recursive in general

view this post on Zulip Richard Feldman (Feb 05 2024 at 16:28):

so if you wanted to do any type of recursion, it would have to be at the top level

view this post on Zulip Richard Feldman (Feb 05 2024 at 16:34):

I guess that's how it works in a lot of languages, to be fair :thinking:

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 17:11):

I think if no shadowing was involved, a recursive closure would make sense. That said, once shadowing is involved, I think referencing the previous value makes more sense.

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 17:11):

At least from the perspective of "just quickly looking at the code, what would I expect most users to immediately expect it to do?"

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:14):

I dunno, honestly I prefer the simpler rule at that point: "if you want to recurse, do it at the top level"

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:15):

as opposed to "you can recurse outside the top level, but only if you are specifically recursing on something that has not been defined earlier in scope"

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 17:16):

I think I would actually prefer "shadowing of closures isn't allowed". This allows for closures to be recursive.

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 17:16):

But yeah, should pick that or your other simple rule

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 17:18):

Also, the reason for my preference is that it can often be nice to hide recursive helper functions and give them simple names

Instead of:

thisIsAlreadyALongName = \... ->
    ...

thisIsAlreadyALongNameHelper = \... ->
    ...

You can write:

thisIsAlreadyALongName = \... ->
    helper = \... ->
        ...
    ...

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:18):

so to summarize, the concrete idea I'm now thinking of is:

...and if you need them to be recursive, move them to the top level!

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 17:20):

Given only top level functions are free from ordering, I think it makes a lot of logical sense.

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:21):

oh yeah, good point - I edited it to note that :big_smile:

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:23):

interestingly, in conjunction with compile-time evaluation, these can be taught to Rust programmers as:

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:23):

with the one additional rule that top-level fn and const in Rust are allowed to be shadowed

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:23):

but not in Roc

view this post on Zulip Richard Feldman (Feb 05 2024 at 17:24):

a thing I like about this is that the rules only come up when you're writing code, but when you're reading it you probably don't need to be aware of them at all

view this post on Zulip Richard Feldman (Feb 05 2024 at 18:15):

Brendan Hansknecht said:

Also, the reason for my preference is that it can often be nice to hide recursive helper functions and give them simple names

yeah thinking about it more, and talking to @Folkert de Vries about it, I think this is worth preserving

view this post on Zulip Isaac Van Doren (Feb 05 2024 at 18:43):

One nice thing about being able to define a recursive helper as a sub definition is that then you can close over values that don’t change during the recursion which can make the code a lot cleaner.
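
A small Python sketch of that point, with invented names (`count_in_range`, `go`, `count_in_range_go`): the nested helper closes over the values that stay fixed, while a top-level helper has to thread them through every call.

```python
def count_in_range(values, low, high):
    # values, low, and high never change during the recursion, so the
    # nested helper only carries the index and the running count.
    def go(i, count):
        if i == len(values):
            return count
        in_range = low <= values[i] <= high
        return go(i + 1, count + (1 if in_range else 0))

    return go(0, 0)


# Top-level equivalent: every fixed value becomes an extra parameter.
def count_in_range_go(values, low, high, i, count):
    if i == len(values):
        return count
    in_range = low <= values[i] <= high
    return count_in_range_go(values, low, high, i + 1, count + (1 if in_range else 0))


print(count_in_range([1, 5, 9, 12], 4, 10))              # → 2
print(count_in_range_go([1, 5, 9, 12], 4, 10, 0, 0))     # → 2
```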

view this post on Zulip Eli Dowling (Feb 05 2024 at 21:55):

A note on the recursion discussion. Coming from any other language I would immediately assume that foo is recursive and the first definition is unused.

foo = -1
foo = \x -> x * foo

I would think the other behaviour would just cause regular frustration. I think the more weird edge cases you have the less pleasant a language is to use. Like "you can recurse, but not within shadowing within local functions": that just feels like another silly thing to remember.

view this post on Zulip Brendan Hansknecht (Feb 05 2024 at 21:58):

Coming from any other language?

view this post on Zulip Norbert Hajagos (Feb 06 2024 at 09:02):

foo = -1
foo = \x -> x * foo

I thought it was using the previous foo in the function body and wasn't recursing. The recursive case makes more sense if I do some reasoning from a lang. designer point of view, but I wouldn't want to think about such things while coding. I don't think this kind of code should be allowed.

view this post on Zulip Fabian Schmalzried (Feb 06 2024 at 09:49):

foo = -1
foo = \x -> x * foo

I don't even know what I would expect here. I think in practice, the foo = -1 would be somewhere else most of the time, and you would only see foo = \x -> x * foo, and logically would assume recursion here.
But another idea: Should this even be allowed? I want shadowing for stuff like model = doSomethingWith model, not for functions. Would it be possible to just not allow reassigning a variable with a function?

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:21):

in general I'm not a fan of the idea of more complicated shadowing rules than "is disallowed at the top level but allowed everywhere else"

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:21):

like shadowing/redeclaration rules varying by type sounds more complicated than it's worth to me

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:21):

as opposed to (for example) allowing it but culturally discouraging it

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:24):

another thing that just occurred to me: all the reasons that non-function defs should not be allowed to be out-of-order are just as applicable in the top level as they are in function bodies

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:25):

e.g. if any of them has a dbg, or a failed expect, or a crash, and they're out of order, then those will print in a surprising order

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:25):

because they got silently reordered by the compiler

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:26):

so I think the actual rule we want here is "top-level functions can be declared in any order"

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:26):

(and of course type aliases and opaque types and abilities, since those can be declared in any order anywhere)

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:26):

but top-level constants that aren't functions still need to be in order

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:27):

that becomes especially true when we evaluate those at compile time, because then they will always be getting evaluated in exactly the order they appear in the source file, so dbg/crash/expect output appearing in a different order will be even more surprising

view this post on Zulip Richard Feldman (Feb 06 2024 at 15:32):

so then in that world, the overall proposed rules would be:

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 15:42):

It is kinda interesting. In most languages functions aren't values (at least the standard written way, they may still have lambdas separately) so they wouldn't hit this issue. Functions and values are just a separate class.

Only since functions are values does roc even have to consider this issue as needing special rules

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 15:44):

e.g. if any of them has a dbg, or a failed expect, or a crash, and they're out of order, then those will print in a surprising order

I honestly wouldn't worry about that. Those messages and such will be part of compile time. Or the top level constants will be secretly lambdas like today (which means order really doesn't matter).

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 15:46):

I think forcing top level constants to be declared in order would be non-ergonomic. All top levels being out of order is very nice for code organization.

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 15:48):

Roc isn't python. I don't think anyone expects the top level declarations to run in order like an imperative scripting language. So I would really push against that assumption.

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 15:48):

I think it would just make the language semantics worse for no reason.

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 15:49):

Once inside a function definition or a nested scope, that is different. Now you are in the world of a list of steps to do something or build something.

view this post on Zulip Richard Feldman (Feb 06 2024 at 16:09):

Brendan Hansknecht said:

I think forcing top level constants to be declared in order would be non-ergonomic. All top levels being out of order is very nice for code organization.

I think the thing that's nice ergonomically is that I can refer to any top-level constant from any function

view this post on Zulip Richard Feldman (Feb 06 2024 at 16:10):

like I don't think it really impacts my ergonomics whether this is allowed or not:

bar = foo + 1
foo = 2

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 16:10):

Can I put constants used by a function after the function is defined?

view this post on Zulip Richard Feldman (Feb 06 2024 at 16:10):

sure

view this post on Zulip Richard Feldman (Feb 06 2024 at 16:10):

the constants would just need to be ordered with respect to each other

view this post on Zulip Brendan Hansknecht (Feb 06 2024 at 16:10):

Ah. Then either is fine to me

view this post on Zulip Richard Feldman (Feb 06 2024 at 16:10):

so that they're ordered in the same order they'll be evaluated
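
One way to picture the rule being discussed here is the toy check below. It is a sketch under assumptions, not the compiler's logic: the `(name, kind, refs)` triples and the function name `check_top_level_order` are invented, and it only looks at direct references (it does not follow a constant's dependencies through the functions it calls).

```python
def check_top_level_order(defs):
    """defs: list of (name, kind, refs) in source order; kind is "fn" or "const"."""
    kinds = {name: kind for name, kind, _refs in defs}
    seen_consts, errors = set(), []
    for name, kind, refs in defs:
        if kind == "const":
            for ref in refs:
                # Only constant-to-constant references are order-sensitive.
                if kinds.get(ref) == "const" and ref not in seen_consts:
                    errors.append(f"{name} refers to constant {ref} before it is defined")
            seen_consts.add(name)
        # Functions may mention any top-level name, whatever the order.
    return errors


# bar = foo + 1 followed by foo = 2: constants out of order with respect to each other.
print(check_top_level_order([("bar", "const", ["foo"]), ("foo", "const", [])]))
# → ['bar refers to constant foo before it is defined']

# A function defined before the constant it uses is fine.
print(check_top_level_order([("useFoo", "fn", ["foo"]), ("foo", "const", [])]))
# → []
```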

view this post on Zulip Richard Feldman (Feb 06 2024 at 16:10):

cool!

view this post on Zulip Richard Feldman (Feb 07 2024 at 04:45):

Richard Feldman said:

so I could see an argument for "top level values can't be shadowed/redeclared at all" as opposed to just "they can't do it themselves"

I want to revisit this, actually - as I'm going through the implementation, I realize that enforcing this has a nontrivial performance cost; it either requires a second pass over the AST, or else much higher memory usage

view this post on Zulip Richard Feldman (Feb 07 2024 at 04:46):

what are people's thoughts about allowing top-level constants to be shadowed in nested scopes?

view this post on Zulip Richard Feldman (Feb 07 2024 at 04:51):

(given that the performance cost is nontrivial, I think the strength of the preference for this over the faster alternative design should also be nontrivial!)

view this post on Zulip Brendan Hansknecht (Feb 07 2024 at 04:55):

Would this mean they can be shadowed anywhere, even at the top level? I assume not, I assume it is just that they can be shadowed within a function/ body of another top level?

view this post on Zulip Richard Feldman (Feb 07 2024 at 05:03):

yeah not at the top level

view this post on Zulip Richard Feldman (Feb 07 2024 at 05:03):

so the rule would be "top-level can't shadow" rather than "top-level can't be shadowed"

view this post on Zulip Brendan Hansknecht (Feb 07 2024 at 05:13):

Sounds fine

view this post on Zulip Isaac Van Doren (Feb 07 2024 at 13:00):

Yeah, that sounds desirable to me

view this post on Zulip Norbert Hajagos (Feb 07 2024 at 19:01):

Top level defs can't be shadowed seems arbitrary to me anyways. As a language user, knowing that top level definitions are constants doesn't change the fact that I may want to shadow them, so I actually prefer the more performant one from a design point as well.

view this post on Zulip Brian Carroll (Feb 07 2024 at 21:00):

Yeah this makes sense to me too.
The way I would say this is: top level values can be shadowed but not redeclared.
And that's great, I prefer it. Fine to reuse a name as long as you have scope to distinguish it.
And redeclaring in the module scope would feel weird to me, so I'm happy not to have it.

view this post on Zulip Andrew C (Feb 09 2024 at 01:08):

Please reconsider the decision to allow shadowing. It allows non local edits to code to change meaning. You're making confusing rules about where recursion can be used in a pure functional language just in order to be able to footgunly reuse variable names and confuse the poor maintainer of your code six months down the line.

How hard is it to add an extra character or two versus how hard it is to fix a bug that you couldn't see because you read the foo, parsed the foo, transformed the foo, updated the foo, used the foo and sent the foo, but we found out last week that somehow since we started logging the foo a month ago, it isn't getting transformed at all and all the logged foos are out of date because they weren't updated. If I'd been forced by the super fast, super helpful compiler to call it something ugly like updatedFoo two years ago, none of this would have happened.

There's a function called str already. Don't let me call my variable str. There's an x in scope already. Don't let me make a nested lambda with x as the new parameter. I'll get this x sometimes when I meant the other one. I won't be able to immediately see what I did wrong by reading the code.

view this post on Zulip Brendan Hansknecht (Feb 09 2024 at 03:01):

Yes, shadowing can be abused. Code style/review is important. Many languages have mutable variables. Many languages have shadowing. These languages are able to function just fine even with these features. This doesn't mean we should or shouldn't have shadowing, but it is an important piece of context.

Using many numeric suffixes on a variable in roc has already led to bugs and friction. So this isn't a clean tradeoff on one being buggy and messy while the other is clean and less buggy. This is a more complex tradeoff that needs more nuance in discussion. It isn't simply about adding an extra character to a variable name.

view this post on Zulip Anton (Feb 09 2024 at 10:00):

@Andrew C we plan on trying out shadowing, if it doesn't turn out good we'll change it.


Last updated: Jun 16 2026 at 16:19 UTC