so yesterday I was writing some Roc code and I got an error about shadowing. In this particular case, the error was helpful because I actually hadn't realized I had chosen the same name for two different values in the same scope!
This was an example of Roc's ban on shadowing making my code less error-prone, but we've also seen examples in practice of it making code more error-prone.
One way this can happen is when a value becomes stale and it becomes actively important that it not be used again in the same scope. There's a workaround for this (extract the chunk of code where it's stale into a separate function and call it passing only the non-stale value) but applying that workaround can induce otherwise undesirable indirection that makes the code harder to understand.
buf for a buffer is a classic example of this: when building up a buffer, you really don't want to ever accidentally use a previous version of the buffer
this also comes up with random number seeds
one idea we talked about in the past was having an opt-in sigil which enables shadowing for a given variable
for example,
~buf = ""
...and then from then on you always refer to it as ~buf so it's always clear at both the definition site and also at every usage site that this is potentially shadowed
I think if this existed, a potentially good rule might be that you can't have both ~buf and regular unshadowable buf in the same scope, because that would seem like a very easy mistake to make
This still would require changing Roc from being declarative to imperative (i.e., order/place of definitions matter), right?
I think the main pitch for adding this to the language would be:
LoipesMas said:
This still would require changing Roc from being declarative to imperative (i.e., order/place of definitions matter), right?
order of declarations already matters because of dbg, expect, and crash
we have an issue for giving a warning when declarations are out of order
(not implemented yet)
arguments against it would include:
btw without going on a big tangent, declaration order matters in every Turing complete language that runs on real hardware, so I don't think that's really something we've "lost" so much as "acknowledged" :big_smile:
anyway, I'm curious to get thoughts on this idea!
Richard Feldman said:
order of declarations already matters because of dbg, expect, and crash
Right, I forgot about those side effects.
I'm still slightly against shadowing, but less so with the sigil. I don't think the pain points are worth it.
"it's not introducing any new power to the language" - this arguably could be considered a point against shadowing in general
I have been hitting this a lot lately and I really want shadowing.
It is exceptionally painful when you never want to use the old version of a value again. You simply want the newest version always.
It also can be a nice naming/readability improvement when used with something like List.walk
Imagine this base idea but more complex
count =
    List.walk list 0 \count, val ->
        if val == ... then
            count + 1
        else
            count
I have hit something like this often and a lot of the time, the inner function is simple and nice to read inline, but I have to change names of variables due to shadowing.
All of those values are the same count, it is nice to give them all the same name.
Anyway, I have hit this a solid amount, go look at the new dict code if you want to see plenty of real examples.
As for the syntax of shadowing, I would advise either:
snap = ~buf. (This is definitely what I would prefer). I think with 2, it gives a clear indicator in every location that a value is special. It always sticks out. We would emit a warning if the value is never shadowed.
yeah I like the design of 2 better than 1
so it's always ~foo whether you're declaring it or referencing it
Brendan Hansknecht said:
Imagine this base idea but more complex
count =
    List.walk list 0 \count, val ->
        if val == ... then
            count + 1
        else
            count
I have hit something like this often and a lot of the time, the inner function is simple and nice to read inline, but I have to change names of variables due to shadowing.
Couldn't the compiler deduce here that the inner count is only defined inside the inner function and there is no count before that expression, so it's fine? So the scope of the inner count ends before outer count is being assigned, which means no shadowing takes place
I think the small change from your original proposal is that I want seeing ~foo to ban foo in the same scope. It would be ~foo in the entire scope instead of starting at the first reassignment
so it's always ~foo whether you're declaring it or referencing it
I wonder - in the scenarios where you've wanted shadowing, how would the code look with and without this feature? :thinking:
People can look at my dict code and try to write it without shadowing (and my example in the shadowing and redeclarations thread, that is worse cause rngs).
I personally stuck with x0, x1, ...
That said, you can also split all of the functions into smaller chunks if you want to fix that
Couldn't the compiler deduce here that the inner count is only defined inside the inner function and there is no count before that expression, so it's fine? So the scope of the inner count ends before outer count is being assigned, which means no shadowing takes place
If we wanted it to. That could be a limited valid use of shadowing. That said, with backpassing that can be very strange and feel like regular shadowing. In fact allowing that would allow someone to just make a shadow function that uses backpassing.
x = 7
x <- x + 1 |> shadow
x <- x + 1 |> shadow
This wouldn't work (in my idea), because x was already defined in the outer scope
that's a really interesting idea @LoipesMas! I've never heard of a language doing it.
Maybe another way to say it: "it doesn't count as shadowing if referencing the variable would be a naming error"
In my mind it's kind of similar to Rust's lifetimes. Lifetime of the count variable name ends within the function/expression, so after the function/expression returns it's free. And if we ensure that variable name can't be used while it's already in use (i.e., no actual shadowing), then it should be fine
LoipesMas said:
This wouldn't work (in my idea), because x was already defined in the outer scope
Oh, but this is a very common use case as part of the problem. I guess I have to make my example slightly more complex. This is the most common case:
count =
    List.walk list count \count, val ->
        # same body as before
Anyway, I would totally be for adding what @LoipesMas mentioned, but I don't think it alleviates the need for shadowing at all.
Yeah, it wouldn't solve that case. But it solves other cases in an unambiguous way, without the need for shadowing.
For example, with shadowing, in:
List.walk list count \count, val ->
...
in the inner function you would have to guess (or know) that count refers to the argument and not the previous count. And if someone changed the argument name, count would suddenly refer to the previous count and not the argument. Potentially confusing, error-prone
Yeah, definitely solves one specific case, and I'm sure people run into it. So I think we should add it. Just want to clearly state that I hit multiple other cases that are still painful and really would like some form of shadowing.
@Brendan Hansknecht do you know of any examples in context of the more complex scenario? (Where the proposed idea wouldn't solve it)
Definitely hit it in one of my dict functions (maybe in a branch that is gonna get tossed though).
Was essentially, pass in state, in the middle of the function walk something updating the state. Use the state a bit more. Pass it into the next recursive call to the same function.
Otherwise, as can be seen all over the new map PR, my main issue is things like List.set and other updating functions that happen in the middle of a function. So you have multiple versions of the same variable.
Anything where you want to actually "mutate" the variable needs shadowing. I don't think there is any way around that
Richard Feldman said:
Maybe another way to say it: "it doesn't count as shadowing if referencing the variable would be a naming error"
count =
    List.walk list 0 \count, val ->
        if val == ... then
            count + 1
        else
            count
I think Roc already implements this rule. The problem in this example is that it is NOT a naming error to refer to count in the right-hand side of its declaration, because variables in Roc are allowed to have recursive definitions. (This is useful when defining functions in a local variable, but not so much when defining numbers)
oh that's a good point, I forgot about that :laughing:
I like the idea, particularly that it is opt in. Can I suggest using $ as the sigil. I think it looks like an S for Shadow.
something I definitely struggle with in this design question is that Rust allows shadowing and redeclaration anywhere and there's a huge imbalance between how often it feels error-prone (and the magnitude of the error - I think I've gotten bitten by it once, ever, and it was really minor and was quickly apparent what the problem was) and how often it feels nice (all the time, honestly)
so I kinda wonder about the complexity of this being worth it compared to just allowing shadowing and redeclaration everywhere
every time we have beginners doing Advent of Code there's renewed demand for shadowing (and also experts run into scenarios where it would be desirable), and I don't think a sigil would be discoverable for beginners
it would just be a pain point
maybe another way to say it is: the more I think about the world where shadowing and redeclaration are just allowed, the harder I find it to justify why that's a bad world to be in
it seems like the increase in error-prone-ness would be nonzero but honestly negligible based on my experience in Rust
and because of that, it feels hard to justify introducing new syntax to avoid a negligible concern
I wonder how much of this is just being used to shadowing/redeclaration and writing code in a style that "needs" that
for me it's definitely not that...I spent years writing Elm more than any other language, where shadowing and redeclaration are fully disallowed, and then some years writing more Rust where both are fully allowed, so I'm comfortable in both
and the reason it keeps coming up for experienced Roc programmers is that even though they're used to it being banned in Roc (which it always has been, because that was one of many design decisions Roc inherited from Elm), there are still scenarios where it's painful in practice to use multiple names (or more error-prone to allow a stale value to be accidentally reused, with no way to disallow it other than splitting out a function that makes the code harder to understand and thus more error-prone in a different way)
"shadowing sigil" sounds like a wizard spell, and that's got to be worth something
Maybe it's down to personal preference. Or maybe I'm just in the wrong here. Luckily it's not up to me to make a decision ;]
Here is another example. In theory I can update this API to use record builder, but shadowing would be helpful.
graphic : Graphic
graphic =
    g0 = Graphic.graphic {
        width: 400,
        height: 400,
    }
    g1, purple <- g0 |> Graphic.applyColor (Color.fromBasic Purple)
    g2, green <- g1 |> Graphic.applyColor (Color.fromBasic Green)
    # draw a series of vertical lines
    lines = Command.drawLines
        {
            style: Style.radial (325, 210) (375, 235) purple green,
            lw: Set 2.5,
        }
        [
            ({ x: 275, y: 185 }, { x: 375, y: 195 }),
            ({ x: 275, y: 195 }, { x: 375, y: 205 }),
            ({ x: 275, y: 205 }, { x: 375, y: 215 }),
            ({ x: 275, y: 215 }, { x: 375, y: 225 }),
        ]
    g2
    |> Graphic.addCommand lines
Apologies for the extra indentation, it's hard to fix on my phone
in that exact example, it looks possible to make a wrapper which would thread the g through, similarly to how Random.Generator can avoid passing a seed around
Wrappers are pretty nice until you hit a task boundary.
Or some other wrapping type where interleaving doesn't work right
indeed
Richard Feldman said:
so yesterday I was writing some Roc code and I got an error about shadowing. In this particular case, the error was helpful because I actually hadn't realized I had chosen the same name for two different values in the same scope!
I just looked back at this code and I realized it would not have been a bug in this case - would have had the same behavior either way
also I realized a subset of potential shadowing bugs would be caught by unused warnings (due to assigning something and then reassigning/shadowing it before it ever got read), which would reduce the error-prone-ness even further
To me, the ban on shadowing has always felt like the graybeards telling me that my callow desire to use redeclaration is a danger as yet incomprehensible to me. I also really like it in Rust though and found that it is the opposite of error-prone. I want to redeclare that variable because the shadowed variable should not be used anymore. You could factor your code to not need them, but I think there is value in functions that are more than 20 LoC, because there is mental overhead in jumping around. The longer a function is, the more I'll need shadowing. Besides, when I don't have shadowing, this is a common thought process for me.
x = 1
# x is consumed by doSomething and should not be used after this line.
# Even though in your head, you simplify it as mutating x,
# I sure hope you don't forget that x has a new name from now on!
x1 = doSomething x
...later...
# very good job!
y = something x1
...later...
# I told you not to forget! This should be x1!
z = blahblah x
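For comparison, here is a minimal Rust sketch of the same thought process. Rust is used only because it already allows shadowing; `do_something`, `something`, and `blahblah` are hypothetical stand-ins for the functions in the example, with made-up bodies. With numbered names the stale `x` stays reachable, while with shadowing the old value cannot be referenced after the rebinding.

```rust
// Hypothetical stand-ins for the functions in the example above.
fn do_something(x: i32) -> i32 {
    x + 10
}
fn something(x: i32) -> i32 {
    x * 2
}
fn blahblah(x: i32) -> i32 {
    x - 1
}

// Without shadowing: the stale `x` is still in scope, so the last call
// compiles even though it uses the wrong version.
fn numbered_names() -> (i32, i32) {
    let x = 1;
    let x1 = do_something(x);
    let y = something(x1);
    let z = blahblah(x); // bug: this should have been x1
    (y, z)
}

// With shadowing: after the rebinding, `x` can only mean the new value.
fn shadowed() -> (i32, i32) {
    let x = 1;
    let x = do_something(x);
    let y = something(x);
    let z = blahblah(x);
    (y, z)
}

fn main() {
    println!("{:?}", numbered_names());
    println!("{:?}", shadowed());
}
```

The two functions differ only in `z`: the numbered version silently computes it from the stale value.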
has anyone seen a good beginner-friendly explanation of how shadowing and redeclaration work?
I'm trying to think of how to teach that (the current "names can never be overridden/redeclared/etc." is extremely easy to explain!) and in particular how it's different from mutable variables
I mean the difference is explicit assignment
Nothing can change without explicitly seeing a new version of the name
e.g. the explanation would need to make it clear why this doesn't do what it would in (for example) JavaScript:
x = 1
y = List.map nums \num -> x = x + num
Hmm...well that wouldn't even check in roc, but fair.
like you'd find out the hard way pretty quickly that it didn't work, but I can imagine a lot of beginner questions asking why it didn't work given that this works:
x = 1
x = x + 1
That is why I like the explicitness of let for redeclarations in rust
Makes it very clear
But probably doesn't fit roc for shadowing.
yep
also it's pretty awkward with our style of type annotations
x : List Str
let x = ["foo"]
compared to
x : List Str
x = ["foo"]
and I think it's something that can be picked up pretty quickly
but I'm not sure how to teach it in the first place to avoid everyone having to learn it the hard way :big_smile:
Also cases such as:
xs = [..]
x = 0
y = List.map xs \x ->
    x = x + 5
    x
might be confusing. The x is shadowed in the anonymous function, but isn't changed in the outer scope and we use inner x. But if it wasn't shadowed (i.e., we used a different name for the argument) then the outer x could be used, but still not changed
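A rough Rust analogue of that case, as a sketch only (Rust needs `let` for the inner rebinding, which Roc's proposed syntax would not, and the function names here are made up for illustration). The first function shadows the outer `x` with the closure parameter; the second is the same body after renaming only the parameter, so `x + 5` now reads the outer `x`. In both, the outer `x` is never changed.

```rust
// The closure parameter `x` shadows the outer `x` inside the closure only.
fn with_shadowing(xs: &[i32]) -> (Vec<i32>, i32) {
    let x = 0;
    let ys: Vec<i32> = xs
        .iter()
        .map(|&x| {
            let x = x + 5; // rebinds the inner name; the outer `x` is untouched
            x
        })
        .collect();
    (ys, x)
}

// Same body after renaming only the parameter: `x + 5` now reads the outer `x`.
fn after_text_rename(xs: &[i32]) -> (Vec<i32>, i32) {
    let x = 0;
    let ys: Vec<i32> = xs
        .iter()
        .map(|&_num| {
            let x = x + 5; // a new inner binding computed from the outer `x`
            x
        })
        .collect();
    (ys, x)
}

fn main() {
    println!("{:?}", with_shadowing(&[1, 2, 3]));
    println!("{:?}", after_text_rename(&[1, 2, 3]));
}
```

Both versions compile, which is the confusing part: the second one returns the same number for every element instead of an error.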
Yeah, Rust's let makes it simpler. Shadowing sigil would be kind of like that, but not exactly
right, if you do a text-only rename (that is, not using an editor semantic Rename operation) of the x in \x -> then it still compiles, but does something different
to be fair, although I think Rust's use of let makes it easier for beginners to learn, I don't think it matters in terms of how error prone it is once you understand the semantics
It does force you to be explicit when changing the type of the variable. Although with Roc being typed, it's hard to use a variable of wrong type
Would Roc's shadowing care about types? I'd guess not
yeah you could change types no problem
That could also be confusing. For learners and for code-readers
we could always make a rule that they have to have the same type, although I'm not sure what the benefit would be in practice :thinking:
To be fair, that could also be confusing (for learners) :upside_down:
and I don't think a sigil would be discoverable for beginners
We could mention it in the tip of the DUPLICATE NAME error and put a shadowing example in the examples repo.
I tried to pretend shadowing (no sigil) was allowed, and revised some code to use it - I have to say, I think it makes the code less error-prone in this case
I looked quickly at the code and, honestly, was a bit confused when I saw that the chomp function returns the src that was passed as an argument. Took me a second to recall that it was actually changed, especially because the name doesn't fit (rest would be more descriptive as a return value). Even if I were used to Roc having shadowing, it still takes time to see where src comes from and where it's changed (it isn't so bad in simple functions, but gets worse with bigger functions).
But I'm getting the feeling that I've probably voiced my position enough already :sweat_smile:
Two examples from my end with the change to having shadowing. Both big gains in my opinion.
@LoipesMas curious what you think of these two examples, I think the naming has less meaning than something like chomp. It is still the same dict or list or rng in my case, just mutated potentially.
In those examples it makes more sense. But I've got the feeling that those are more "low-level" examples that generally don't favor functional/declarative programming, and thus are not as reflective of the way the language would be used by end-users. We don't expect people to re-implement dictionaries in Roc, right?
Also, for example, keepShared could be re-written as:
keepShared = \xs, ys ->
    if len ys < len xs then
        keepShared ys xs
    else
        walk ...
removing the need for shadowing. And I know it's just a part of a bigger example, but I think my point still stands.
I think it would make more sense to make that decision once we have more people using Roc for things it's (more) intended for, for a longer time. And I think it's easier to add shadowing later than to take it back.
Yeah, ignore keep shared, it just hit a compile bug so I wrote it the other way (was the recursive way originally)
As for dictionaries, yeah, rare use case (though I'm sure libraries will implement other complex datastructures and this can be helpful even with normal lists). RNG is a lot less rare though.
here's how that example would look with the $ sigil https://github.com/rtfeldman/roc-iso8601/compare/shadowing-sigil
Still, the amount of time spent writing a datastructure is a lot smaller than time spent using that datastructure. Especially if the library is popular. And if it is, then it's also going to be more thoroughly reviewed than whatever uses that library.
RNG is a tough one, because it can't be as easily abstracted, but again, how much time will people spend actually writing RNG code (in all applications)? Your guess is probably better than mine, though.
IMO it's worth it to make it harder to write a library, if it makes it easier/safer to use one
In my case, I have not yet hit a case where disallowing shadowing has saved me from a bug (that I know of), but I have definitely written bugs due to not having shadowing.
So I pretty much only see shadowing as a gain. Yes, on small occasions it can lead to a bit of poor naming, but that is more tangential to shadowing than a problem caused by shadowing.
in fairness, I think the nicest way to write the parsing code I linked is actually with a parser combinator:
yr <- keep digits4
{} <- skip (symbol '-')
mn <- keep digits2
{} <- skip (symbol '-')
dy <- keep digits2
ok { year: yr, month: mn, day: dy }
however, I'm not sure how well we can optimize out the combinators (into the bare minimum number of conditionals that you'd get if writing it out)
:thinking: maybe with inlining closures it's possible to get there?
I wonder if Record Builder pattern could be used for RNG? Or something similar, I just read the example for that, so I'm not too familiar yet
wouldn't help in the cases where shadowing is desirable, unfortunately
I think I would like Roc to try shadowing. I always thought it was cool that Elm could prevent me from a bug by not having it, but I don't think I ever ran into that in practice. I have, however, definitely run into the "using old value" problem.
I don't think you even need to go to complex problems such as parsing or data structure implementation to run into this. At work, I have fixed quite a few bugs in Elm update functions where a newModel let binding is defined with an updated model, but some things still use the old model in scope.
here's how that example would look with the $ sigil https://github.com/rtfeldman/roc-iso8601/compare/shadowing-sigil
I like it, it's nice to have shadowing clearly marked.
thinking about this more, I think probably the best way to go is to actually try out the sigil
all three options seem potentially reasonable, but this is the one that's never been tried in any language I know of, so the only way to get data on how it feels is to actually try it
Let's do it and see!
Should be easy to implement I think.
well the sigil part is easy
the hard part is rewriting canonicalization so that shadowing is possible :laughing:
canonicalization needs to be rewritten for multiple reasons though
Is shadowing (/ shadowing sigil) going to be handled differently for top level definitions?
that's a good question...my default thought would be to disallow it for top-level declarations, but maybe there's a use case I'm missing? :thinking:
like if we have https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/renaming.20while.20exposing then I'm not sure where the demand would come from
I think disabling it for top-level declarations makes sense
Yeah, please do not allow for top levels.
There's also a related question of ordering of top-level declarations. Right now it's ignored, because it's ignored for all declarations, but you mentioned that this will change. Will top-level declarations be exempted from this as well?
Yeah, they will be exempt from ordering as well. I know this was discussed before. Not sure what thread at this point though
:thinking: I remember concluding that they didn't need to be ordered, but I could see an argument for ordering them anyway so that if you use dbg in them, they don't appear in a misleading order
using dbg at the top level?
yeah like if you have a top-level declaration which calls a function that has a dbg in it
Why would that matter for top level ordering?
That is inside the top level itself
suppose I have two different top-level declarations, each of which calls the same function that has dbg in it, and we're in the future where we eval top-level constants at compile time
Oh, I guess:
x =
    dbg y
    7
y =
    dbg "testing 123"
    8
sure, that too (although I think top-level constants calling a function that has dbg in it is more likely!)
point being, you could see those dbgs and reasonably assume they're being outputted in the same order as the declarations in your source code
That said, ordering top levels is more complex cause I think for functions, many people prefer them in essentially a reverse topological sort
so if they could be silently reordered, that could lead to a misunderstanding in what you're seeing
oh functions never need to be ordered
in top-level or otherwise
ordering only ever matters for non-function constants
but they could have closure captures. So they do topologically sort
hm, true :thinking:
I don't think that matters though
because nothing gets printed when they're defined, only when they're called
As long as you don't do something crazy like:
fn =
    dbg Something
    \x -> x
That said, to capture, a function might have to lift a definition unless the top level definition is anchored and the function moves.
So still could reorder other prints.
oh yeah when I say function I mean like x = \ ... -> with nothing between the = and \ except comments :big_smile:
otherwise reordering might be necessary
Still could be a problem with something like this:
# This captures `y`. So it could pull its dbg print before x.
fn = \{} ->
    y
x =
    # This theoretically could call `fn`. That would definitely require `y` to be defined before `x`
    dbg Something
    "Some top level"
y =
    dbg SomethingElse
    "Some other top level"
hm yeah that's a fair point
I could see an argument for requiring it in a case like that :thinking:
since the order then actually does matter
I suppose in general if we have a warning for reordering and it turns out to be significantly annoying then we can always reconsider the design
Will the warning only print if you use dbgs or have captures?
Cause at least for functions I tend to put main function first and inner/helper functions after. So that is a reverse topological sort.
I can see arguments both ways
like on the one hand, if we only warn when there's an actual dbg, then often your code will be unaffected and you won't have to reorder in a situation where it would be annoying
on the other hand, it could mean that as soon as you introduce a dbg somewhere, you start getting these warnings out of nowhere that you need to reorganize things just to debug your code
in F# they have a thing where everything always has to be declared before it's referenced (I guess except mutually recursive functions?)
and apparently originally this was a limitation of the parser, but then they intentionally kept it because they liked that it was a forcing function for a consistent style
so that would be an argument for requiring it even if there's no dbg (or expect or even crash, arguably)
Means that for something like the dict library, all of the impl details like the hash algorithm and key search algorithm have to go at the top of the file due to ordering restrictions. Big time not a fan of that.
hm, really? If anything, I think it would mean that functions have to go below constants
or at least, below any constants they reference
I can't think of a scenario where ordering would require a function to appear higher than otherwise :thinking:
You said they have to be topo sorted as well due to potential captures and the effect of dbg, right?
So:
exposedApiFn = \...
    ...
    helperFn
helperFn = \...
    ...
    lowLevelDetailFn
lowLevelDetailFn = \...
    ...
would need to be changed to:
lowLevelDetailFn = \...
    ...
helperFn = \...
    ...
    lowLevelDetailFn
exposedApiFn = \...
    ...
    helperFn
Cause any of those could have closure captures over constants that have a dbg statement which could lead to incorrect print ordering.
oh I'm saying base it on actual captures of constants (including transitively by functions) but not of functions themselves, and not taking into account whether dbg was actually used
ah...based on captures.
ok yeah, should be totally fine.
and "constants go before the functions that use them" is the normal style anyway, so making that a stronger convention doesn't seem bad, especially since it has concrete non-stylistic benefit
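One possible reading of that rule, sketched in Rust under loud assumptions: the `Decl` representation, the `refs` lists, and both function names are invented for illustration, and this is not how the Roc compiler works. A declaration is flagged only when a constant it captures, directly or through functions it calls, is declared later; the order of functions relative to each other is never checked.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy)]
enum Kind {
    Constant,
    Function,
}

// Hypothetical model of a top-level declaration: its name, its kind,
// and the top-level names mentioned in its body.
struct Decl<'a> {
    name: &'a str,
    kind: Kind,
    refs: Vec<&'a str>,
}

// Collect constants reachable from `name`, looking through functions
// but stopping at the first constant on each path.
fn captured_constants<'a>(
    name: &'a str,
    decls: &HashMap<&'a str, &Decl<'a>>,
    seen: &mut HashSet<&'a str>,
    out: &mut Vec<&'a str>,
) {
    let decl = match decls.get(name) {
        Some(d) => d,
        None => return,
    };
    for &r in &decl.refs {
        if !seen.insert(r) {
            continue; // already visited (also guards against recursion)
        }
        match decls.get(r).map(|d| d.kind) {
            Some(Kind::Constant) => out.push(r),
            Some(Kind::Function) => captured_constants(r, decls, seen, out),
            None => {}
        }
    }
}

// Return (declaration, constant) pairs where the constant is declared
// after a declaration that captures it.
fn out_of_order<'a>(decls: &'a [Decl<'a>]) -> Vec<(&'a str, &'a str)> {
    let by_name: HashMap<&str, &Decl> = decls.iter().map(|d| (d.name, d)).collect();
    let position: HashMap<&str, usize> =
        decls.iter().enumerate().map(|(i, d)| (d.name, i)).collect();
    let mut problems = Vec::new();
    for (i, d) in decls.iter().enumerate() {
        let mut seen = HashSet::new();
        let mut consts = Vec::new();
        captured_constants(d.name, &by_name, &mut seen, &mut consts);
        for c in consts {
            if position[c] > i {
                problems.push((d.name, c));
            }
        }
    }
    problems
}

fn main() {
    // The earlier example where `fn` captures `y`, which is declared last.
    let decls = vec![
        Decl { name: "fn", kind: Kind::Function, refs: vec!["y"] },
        Decl { name: "x", kind: Kind::Constant, refs: vec![] },
        Decl { name: "y", kind: Kind::Constant, refs: vec![] },
    ];
    for (user, constant) in out_of_order(&decls) {
        println!("{} captures {}, which is declared later", user, constant);
    }
}
```

Under this reading, a chain like exposedApiFn, helperFn, lowLevelDetailFn with no constants involved produces no warnings in either order.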
Though I guess I'm curious. This captures const, but isn't considered to capture fn? Just feels a bit strange or inconsistent somehow.
const = 123
fn = \x -> x + 1
capturing = \{} ->
    fn const
well in terms of ordering at least
but anyway, I think I am on the same page for the end result that you want to generate
I generally prefer high level functions to go above the lower level functions they depend on, and always exported declarations before private. I find it unfortunate when the reader needs to wade through a lot of not-terribly-important code to get to the parts that remotely matter.
I can see the F# people wanting to enforce a consistent style, but I believe they chose to enforce one of the less valuable styles, and they chose to enforce it in the wrong way (a compiler limitation that modern languages have generally moved beyond)
WRT the sigil symbol - ~ feels like an operator coming from other languages, and will block this being an operator in the future.
Right now _ prefix is used to mark unused label names, but could it also be used as a shadowing sigil? I think these two features are orthogonal, so there won't be a conflict there?
the _ prefix is also used for unused variable names, not just labels :big_smile:
Oh, that's what I meant - for example, unused function arguments.
Imo, that would be a reasonable UX:
"Hey, that's a variable prefixed with underscore! I should pay attention to it - either it is an unused placeholder, or the context of this name changes during the function flow. Oh, and there cannot be a non-prefixed version of it."
:thinking: is $ the best sigil for this given the number of non-English keyboards which don't have that key?
might be confusing, but @ for lowercase identifiers (variable names) is not taken (it's only taken for uppercase identifiers, namely opaque types)
I think @ would be easily confused with opaque types even if it technically is clear. Just an easier gotcha.
:thinking: are there other reasonable options?
is $ the best sigil for this given the number of non-English keyboards which don't have that key?
from quick google search, it seems that $ is common on keyboards no matter the language, but I am not that educated on the matter.
Skimming other symbols, I do think that most other symbols like ~, |, ^, >, or & would look like operators before a variable name.
So yeah, probably @ or $ would be best.
I'm not sure if it was discussed, but how would shadowing an already shadowed variable work? e.g.
x = 1
func = \~x ->
    mapX ~x \~x ->
        ...
Or maybe every variable can be only shadowed once?
The current thought is that a variable is shadowable if it is named with the sigil
So you would define it on the first line with the sigil and all uses would also have the sigil
Basically it is part of the variable name rather than a one off use when shadowing
A tell to the reader that the value will be shadowed at least once
ohh thank you, now it makes sense
yeah one way to think of the proposed design (with $ as the sigil) is:
when you define a variable with $ at the start of the name, the compiler essentially appends an incremented integer to the end. So the first time you define $foo it gets internally defined as $foo1, and the next time you define something named $foo it gets internally defined as $foo2, etc.
when you reference it (e.g. myFunction $foo) it internally refers to the most recently defined one in scope (e.g. $foo1 or $foo2, whichever was defined more recently in scope)
Is there an issue for that? Recently, I found myself struggling with the lack of shadowing (the same reason discussed here, I’d like to just drop obsolete values, in particular, after typecasting) and then found this thread.
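The "append an incremented integer" way of thinking about `$foo` can be sketched in a few lines of Rust. This is only an illustration of the mental model, not the compiler's canonicalization: `Step` and `resolve` are invented names, and the sketch handles a single flat scope with no nesting.

```rust
use std::collections::HashMap;

// One step in a straight-line scope: define a sigil name, or reference it.
enum Step<'a> {
    Define(&'a str),
    Use(&'a str),
}

// Resolve each step to an internal numbered name: every definition bumps
// the counter for that name, every use refers to the latest definition.
fn resolve(steps: &[Step]) -> Vec<String> {
    let mut latest: HashMap<&str, u32> = HashMap::new();
    let mut out = Vec::new();
    for step in steps {
        match step {
            Step::Define(name) => {
                let n = latest.entry(*name).or_insert(0);
                *n += 1;
                out.push(format!("{}{}", name, n));
            }
            Step::Use(name) => match latest.get(name) {
                Some(n) => out.push(format!("{}{}", name, n)),
                None => out.push(format!("<undefined {}>", name)),
            },
        }
    }
    out
}

fn main() {
    let steps = [
        Step::Define("$foo"),
        Step::Use("$foo"),
        Step::Define("$foo"),
        Step::Use("$foo"),
    ];
    println!("{:?}", resolve(&steps));
}
```

Each reference resolves to whichever numbered definition came most recently before it, so an older definition cannot be reached once a newer one exists.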
The $ sigil looks like a reasonable approach tho I anticipate it will be a temptation to use it by default. One way to overcome it is to forbid the use of the sigil if there is nothing to shadow. This way it wouldn't be the case where $ means "potentially shadowed" but "definitely shadowed"
a couple of random speculative thoughts:
besides shadowing an outer scope variable, I feel that shadowing makes sense only for transforming the value behind the variable. I think about it as of linearity for an abstraction, or delayed pipelining. e.g it feels reasonable to have
$x = 1
$x = $x + 2 * $x
but in the following example, $x kinda loses its story
$x = 1
$x = 3
the first example can be written this way (let's assume that outer-scope shadowing is allowed) to preserve the story but it looks a bit messy to me
x =
    1
    |> \x -> x + 2 * x
the second example can be expressed the same way, and it's instantly clear that is something wrong there. btw it should generate a warning IMO (I checked and it didn't)
x =
    1
    |> \_ -> 3
In this metaphor, $ can mean $hady $tory :grinning_face_with_smiling_eyes:
I like how the sigil shouts "beware of mutations in my history!" so you have to find exactly the last reassignment of the variable if you want to use it
besides shadowing an outer scope variable, I feel that shadowing makes sense only for transforming the value behind the variable
On the other hand, it probably makes sense to forbid the use of the shadowed variable inside of the shadowing (i.e. $x = $x + 1).
So $ would mean “a new story”. This constraint would potentially balance the extra power the shadowing provides (but would it still be possible via indirection? :man_tipping_hand: need to try to understand).
I propose to restrict either the recursive case or the opposite one (the latter is preferable I think). Or at least to think about what the implications are for these options.
allowing only the recursive case would mean $x is always derived from the previous state of $x, so you don't need to think hard if two $xs mean a completely different thing. they are just different pages of the same story.
as a result, the sigil would mean "derive new value, drop the previous" (scope-wise obviously. backpassing fits!) and never "override the value with smth new"
a random thought that just occurred to me: awhile back in the discussion there was a question about allowing shadowing but not redeclaration, and there was a point that backpassing meant that redeclaration effectively existed no matter what
I am not a fan of either of those restrictions.
If I am working with anything stateful or data structure like, I will absolutely need the recursive case. I think this is the fundamental motivation for shadowing.
That said, in certain cases I might clear or explicitly set the data structure. Those are both non recursive.
but if we have ! instead of backpassing, we could revisit that
this pattern would be possible for reassignment. but it's ok I think
shadow = \_ -> 42
main =
    $x = 1
    $x = shadow $x
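For comparison, here is a minimal Rust sketch of the same escape hatch. The helper name `shadow` mirrors the snippet above; `rebind_via_helper` is a made-up name for illustration. It only shows that a "new value must be derived from the old one" rule can be satisfied syntactically while the old value is ignored.

```rust
// Hypothetical helper mirroring `shadow = \_ -> 42` above:
// it accepts the old value and ignores it.
fn shadow(_old: i64) -> i64 {
    42
}

fn rebind_via_helper() -> i64 {
    let x = 1;
    // Syntactically the new `x` is derived from the old one,
    // but the old value has no influence on the result.
    let x = shadow(x);
    x
}

fn main() {
    println!("{}", rebind_via_helper());
}
```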
On top of that, in something like Roc-Wasm4, I might set the line height as a value. Use it N times
Then set a different line height and use it N more times. So same name but different value.
I think redeclaration will be a common pattern that we shouldn't restrict behind bad code.
but again, wouldn't shadowing become a default pattern then? yes, the only restriction might be "don't allow the sigil unless there's an explicit redeclaration". otherwise, the effect might be the same as shadowing by default but with php-like naming.
I'm a fan of shadowing if anything and don't really believe in the downsides of it. I'm just trying to find the balance between its obvious advantages and the reasoning that came from Elm.
but again, wouldn't shadowing become a default pattern then?
Maybe, but you can only add the $ sigil if it is actually used. And I think most values won't have a use for it.
I personally am not really concerned.
I get that the word let is more verbose in rust, but I don't see it abused like crazy.
Rust also has mut, but it is far away from the default
Sure, this gives someone the power to abuse it, but as pointed out in your example, I can write s = \_, v -> v and then shadow all I want either way $x = s $x newValue. I think it would be much worse to see a bunch of code in that form with magic reassignment functions than to just fully allow shadowing with the sigil.
btw, in my terms, let mut is for keeping the story, and let is for starting a new one (not always of course). so there's an explicit differentiation. but $ means both concepts to some extent
let is the same story very often in rust. Especially with variables that need to have their type changed.
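A small sketch of the type-changing case in Rust, assuming a made-up function `parse_count` purely for illustration. The name `count` is reused while its type goes from `&str` to `usize`:

```rust
use std::num::ParseIntError;

// Hypothetical example: one name, two types, same "story".
fn parse_count(raw: &str) -> Result<usize, ParseIntError> {
    // `count` starts out as trimmed text...
    let count = raw.trim();
    // ...and is shadowed by its parsed form. After this line the
    // `&str` binding can no longer be reached by name.
    let count: usize = count.parse()?;
    Ok(count)
}

fn main() {
    println!("{:?}", parse_count(" 24 "));
}
```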
yeah, it would be more correct to say let mut doesn't allow a new story. nvm, just thinking aloud
the more I think about this (which is more and more lately because I'm hoping to start rewriting canonicalization in February for ! and making canonicalization tolerant to shadowing in some form) the more I think we should just try doing it the way Rust does it
like no sigil, just allow redeclaration and shadowing, see how it goes
the thing I keep coming back to is that I do it all the time in Rust and it's really been a very positive experience
I think I would like Rust less if they replaced it with a sigil or with disallowing shadowing or redeclaration, even though I'm aware of the potential risks
so I'm having an increasingly hard time convincing myself that it would be a mistake for Roc to have this feature that in practice I use all the time and haven't really seen the downsides materialize in a significant way (at least not anywhere near significant enough to outweigh the upsides)
or maybe to say it another way, when I look at the potential risks and I imagine the response being "yeah but it'll be fine in practice" I can't really honestly be like "no it won't be fine in practice!" because based on my experience from years of Rust, deep down I really do think that yeah, it actually will be fine in practice :big_smile:
and although it's true that reading and understanding code happens a lot more than writing it, allowing these things definitely makes code a lot more pleasant to write (I don't think it's at all close), and when it comes to reading I actually do think it's close - because being able to close off stale values and say "there's no chance the stale value will be referred to again after this point" is very valuable
and whenever I write Roc code and think "if I had a sigil, would I use it here?" I have to be honest, I don't like the experience of asking myself that question all the time
because what happens next is that I try to think "ok well let's see if I can do without" whereas in Rust I'd just be like "obviously I'm going to recycle the name here, let's go!" so it feels like my brain doesn't go on these little unimportant tangents as often
and the frequency with which that question comes up kinda reveals how often shadowing and redeclaration would be not only nice, but the obviously best choice if they were available :big_smile:
so putting all that together, I'm thinking that we should just try doing it full Rust style and see what we think of it in practice. It's very easy to detect, so if we later decide to make it a warning or error, and/or try a sigil instead, all of those options are still on the table
but I'm curious what others think!
I would love shadowing to work without extra sigils or ceremony. I do not think forbidding it is worth the annoyance of having to come up with new names for variables “all the time” :)
Nice. I am happy you lean towards not using sigils for this
Yeah, seems like a good investment in the friendliness of the language :smile:
My only concern is that let and mut both stick out in rust. With roc, there is never any sign.
But I would guess that I will be totally happy with full shadowing and no sigil.
Richard Feldman said:
and although it's true that reading and understanding code happens a lot more than writing it...
... being able to close off stale values and say "there's no chance the stale value will be referred to again after this point" is very valuable
Having fewer names to keep track of also helps with readability and comprehension. If the reader sees x1, x2, etc as a means to "name their way around shadowing restrictions," or in some ways worse, comes up with thoughtful names for each step, it can often give the reader the mistaken impression that the old names and new names need to be used side-by-side.
Some misc notes/anecdotes:
else which extends to cover the containing block. The conditionally truncated control has definitely been observed (at least by me) to improve code comprehension and reduce bugs, since at the point it happens, the reader has one less thing to keep track of, e.g. "this unusual case was handled with a return, so the rest of this function doesn't have to worry about it," instead of "well, I need to remember about this unusual case to see if it might still apply after the end of this really long else, so I can either skip ahead to check now, or keep reading and hope I don't forget."
Brendan Hansknecht said:
My only concern is that let and mut both stick out in rust. With roc, there is never any sign.
mut only sticks out at the point it's declared, which might be pages ago for a really long function.
Simple assignment can only be used with mutable variables in Rust, iirc, so it may be that the simple assignments themselves stick out. Further, iiuc, it's not possible to tell from a Rust let in isolation whether it's introducing a new name or shadowing one (unless you see let x = x + 1), so ultimately it seems to me that, either way, to know if there's shadowing in the general case, you need to retain knowledge about earlier code. That'd be no better or worse for Roc than it is for Rust lets.
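To make that concrete, a short Rust sketch (the function name `assignment_vs_shadowing` is made up for illustration). Plain assignment requires `mut` at the declaration, while a shadowing `let` carries no marker of its own and ends with its block:

```rust
fn assignment_vs_shadowing() -> (i32, i32) {
    // Assignment needs `mut` at the declaration site.
    let mut total = 1;
    total = total + 1; // plain assignment, no `let`

    // The first `let x` gives no hint that it will be shadowed.
    let x = 1;
    {
        // This looks like any other `let`; only the earlier code
        // reveals that it shadows the outer `x`.
        let x = x + 10;
        assert_eq!(x, 11);
    }
    // The inner binding ended with its block, so the outer `x` is visible again.
    (total, x)
}

fn main() {
    println!("{:?}", assignment_vs_shadowing());
}
```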
I'm going to throw a contrary opinion into the discussion here.
@Richard Feldman I have a different subjective experience than you in Rust with redeclaration specifically. I really dislike this feature of Rust. While contributing to the Roc compiler I have very often wished Rust didn't have redeclaration, and I've rewritten blocks of Rust code to remove all redeclaration so that I can understand it better.
I don't think there's a good excuse in Rust for reusing the same name. There is always a reason for redeclaring a variable, so just use that. If you have a variable called state and you are redeclaring it because, say, you want to dereference it, then call it state_deref. :shrug:
It's a lot harder to avoid redeclaration in Roc because there's no mutation.
I have no such problem with shadowing in Rust because I need to be careful about scopes for other reasons and they are easy to see.
However in Roc we deliberately make scope boundaries harder to see with our back-passing feature (or the proposed chaining syntax, which is identical in this regard) so shadowing and redeclaration look more similar.
So I think we probably have to do this in Roc because of those differences. But based on my subjective Rust experience I expect to find it more annoying than you do!
I appreciate that thoughtful insight, thank you!
I definitely think it's tricky to balance all these considerations, because like you said, the tradeoffs in Roc aren't quite the same as in any other language
I have tried not to throw my two cents in this discussion, but look at them clink!
I have not written a lot of Rust, but I have read quite a bit of it, and as Brian says it increases mental burden because while reading the code you have to start creating state in your head that will change as you progress through the code.
One thing is that shadowing might make you a "happy coder", but for maintenance that is a big downside. One of the best parts of Elm is its readability and the speed with which you can dive into some code, understand it, and start making tweaks. My fear is that Roc would lose that if we allow any kind of shadowing.
Now we come to the crux of the dilemma: should Roc be "Friendly" or "Maintainable" ? :)
I very well may be quite biased from mostly programming in mutable languages. That and/or working for google where the style of code recommends shadowing in certain cases, but I think shadowing and reusing names helps with code readability and maintainability.
Using a single name for a concept removes noise and helps a programmer focus on what is important. Yes, shadowing can be used poorly, but I think it is much more likely to help than to hurt.
The most obvious win is any sort of stateful data that gets versioned.
This is just noise
rng0 = Random.new seed
rng1, x = Random.I32InRange rng0 { start: At 0, end: Before 24 }
rng2, y = Random.I32InRange rng1 { start: At 0, end: Before 24 }
rng3, z = Random.I32InRange rng1 { start: At 0, end: Before 24 }
The second case is when you have one idea but it is masked by error cases.
This is also noise. We are giving three names to the same thing.
valueRes = someResultingConstruct a b c
value =
    when valueRes is
        Ok v -> v
        Err _ -> ...
Changing the above two examples to always use the same name helps with readability and especially for the first example reduces bugs.
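As a rough illustration of what "always use the same name" looks like, here is a sketch in Rust, where shadowing is already allowed. Everything here is hypothetical: `next_in_range` is a toy linear congruential step standing in for a pure RNG (not a quality generator), and the `0` fallback in `parse_or_zero` is an arbitrary choice for the elided error branch.

```rust
// Hypothetical pure RNG step: returns the next state and a value in 0..24.
fn next_in_range(rng: u64) -> (u64, u64) {
    let next = rng
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (next, (next >> 33) % 24)
}

fn three_values(seed: u64) -> (u64, u64, u64) {
    let rng = seed;
    // Each line shadows `rng`, so a stale state cannot be passed by mistake.
    let (rng, x) = next_in_range(rng);
    let (rng, y) = next_in_range(rng);
    let (_, z) = next_in_range(rng);
    (x, y, z)
}

fn parse_or_zero(raw: &str) -> i32 {
    // One concept, one name: the Result is shadowed by the unwrapped value.
    let value = raw.parse::<i32>();
    let value = match value {
        Ok(v) => v,
        Err(_) => 0, // arbitrary fallback for illustration
    };
    value
}

fn main() {
    println!("{:?} {}", three_values(7), parse_or_zero("12"));
}
```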
A more controversial, but probably the next most common, use of shadowing is for printing/parsing/type conversion.
count = someCalculations 123
countNat = Num.toNat count
... use in list functions
Overall, I think that shadowing will make roc code more maintainable and easier to read. The issue is that the cases where shadowing is really needed (state versioning) have no good solution in roc and are a huge source of hard to read and buggy code. It may not happen that commonly, but it happens enough to merit a solution.
Sorry Brendan, but I have to keep arguing against it. :)
The idea of shadowing names is changing the type and the value a name holds, and that is the antithesis of having immutability as a mental tool for constructing programs.
Versioning data in the example with random generation can be solved through piping; value deconstruction can be solved by inlining someResultingConstruct 123 in the when call, e.g. when someResultingConstruct 123 is ....
I think that we have to start figuring out definitions and metrics for "maintainable" and "easier to read", because we are diverging because of our different experiences reading different kinds of code.
As a side note: neither Google's nor anybody else's company standard practices are a good measuring stick for us, because they have practices that are enforced first at the hiring level and then at the economic level. The idea with Roc is that any plebs can pick it up and start working with it, so we have to keep in mind that practices in companies are well-tended gardens, but we are more like a public park where we have to design guard rails so dumb kids don't get eaten by bugs. (sic!)
Versioning data in example with random generation can be solved through piping, value deconstruction can be solved by inlining someResultingConstruct 123 in when call e.g. when someResultingConstruct 123 is ....
I wish I could agree with this, but as code gains any form of complexity, this has not been true from what I have seen in practice. In lots of code, there are multiple pieces of logic intertwined such that you can't make a pipeline for every single variable that needs versioning (notice the variables with numeric suffixes). Lots of code is broken up by task handling, such that you can't simply inline a result. You also often can't use a monad to hold state, because result or task is already being used in a monadic way that can't compose with whatever state you have (rng or otherwise).
While I love the theory and think it is good to follow where practical, I think that it often falls short. This leaves us with code that is hard to read, write, and maintain.
Neither Google's nor anybody else's company standard practices are a good measuring stick for us
Overall, I agree, but I do think they are the most robust places that you will find 'metrics for "maintainable" and "easier to read"'.
As I read through the code, trying to understand where you are coming from, could you please just give me a hint of how it would look with shadowing:
in the removeBucket one you would go with
\@Dict { buckets, data, maxBucketCapacity, maxLoadFactor, shifts }, bucketIndex -> ...
and then further in the code you would say
(buckets, bucketIndex) = removeBucketHelper buckets bucketIndex
buckets = List.set buckets bucketIndex emptyBucket
Question mark :)
Exactly, you would remove all of the numeric suffixes and completely remove the option to use the previous version.
That way there are no bugs where you accidentally type buckets2 when you should have typed buckets3.
Also, if you add a new line of code, there is no need to update all bucketsN that come after your current iteration.
I've read through most of this discussion back and forth and whilst I started firmly on the side of shadowing is good and there's no reason not to have it, I do think that taking a step back there are some valid reasons from a readability standpoint to not have it. However, from the standpoint of the annoyance of actually writing code, they suck. So given this is more of an issue of reading code rather than writing it, maybe this would make more sense as some kind of editor integration: you could highlight all shadowed variables or provide some kind of little prefix or underline or other indicator that shows that this variable is shadowing some other variable, and you could even have a code action that then takes you to the original.
In fact, if you did implement this you could then also get a taste of what reading code with explicit shadowing annotations would look like without having to commit to it or rewrite any code
With a custom editor, it would be possible to have a shadow-less representation on demand :thinking:
maybe it’s possible to inline the shadow counter helper in editors via lsp?
Another dubious idea: I wonder, how bad it would be to allow shadowing with warnings and allow the formatter to fix it (via counter suffix). It would allow much faster prototyping without code fiddling.
First, a disclaimer: I’m also biased, and likely I don’t understand all the tradeoffs. As a code writer, I don't want to mess with manual counting but solve a certain problem. I also don’t see variables (defs?) as a matter of immutability but rather as an abstraction of data flow (but maybe I just got used to it). As a code reader, without shadowing, I still have to create the state in my head but obsolete states are piling up, at least they are for me.
Shadow-less syntax leads to more granular functions and extensive use of piping (which is great as it’s an unbreakable chain of computations). On the one hand, such code “breathes”. On the other hand, shadow-less leads to inevitable workarounds such as counters in names, meaningless names, or redundant indirections through helper functions. Shadowing in its turn can lead to writing big functions with complex but tangled dataflows which are hard to reason about. From these two I tend to choose the latter, but again, that’s probably because I’m biased.
Brendan Hansknecht said:
Exactly, you would remove all of the numeric suffixes and completely remove the option to use the previous version.
Ok, but in functional languages definitions within a closure are order independent! Nobody says that we have to keep it like that, but as a code author in FP languages, I have benefited from it tremendously.
There would be a hell of a weird behavior if somebody, for some reason, even by accident, moved one line above the other.
I know that the order of assignments is religiously important in imperative programming, but in functional languages that was not the case. And I think that is a part of the functional paradigm that we should keep.
I don't think that holds
buckets2 when you meant buckets3.
Then I stand corrected, but I'm not convinced that it is a good idea.
Those arguments come from pushing functional paradigm out of the language in favor of imperative style, and no me gusta nada.
Let the experiment run and see how it goes!
thinking about the implementation of this, I wonder if we want a rule that function declarations specifically are not allowed to shadow. (So if I write x = \... then I get an error if x is already in scope.)
Otherwise mutual recursion can get really weird because that requires accessing a name that hasn't been defined yet. If functions can shadow, there might be multiple function definitions in the same scope after the current function which could work.
so we'd need a rule like "if a function is referring to another function that's defined later on in scope, then it'll be the next one that's defined rather than the most recent one, and if it gets redeclared again after that, it doesn't count" - which sounds more complicated to explain than just "function declarations can't shadow"
and I don't think any of the motivating use cases for shadowing involve shadowing functions with other functions
on the other hand, maybe this isn't a problem in practice
I guess we could just try it out and see if it actually comes up in practice :big_smile:
I agree that shadowing a function with anything else would probably be confusing. When reading code, I'd be used to expecting typical variables to get shadowed, but once I see a function defined, I'm not expecting its definition to change
The only function shadowing that I could imagine being valid is some sort of building of a function by applying multiple levels on top of it.
Something like this:
mutator = \x -> ...
mutator =
    if applyEffect1 then
        effect1 mutator
    else
        mutator
mutator = effect2 mutator
that said, I think that is safe to deny by default and instead require an opaque wrapper to do something like that. Similar to what is done with a generator type.
@Brendan Hansknecht but which mutator do the second and third functions refer to? Different readers with different backgrounds may infer opposing answers.
Also, if I want to define a recursive fibonacci function, I'd like fibonacci = \n -> ... to eagerly bind so that i don't need a helper or alias to self-recurse. Does the incremental mutator usecase come up more than the self-recursion usecase?
To be clear, I'm not suggesting we support this. Just noting that with closures, this is the single valid case I can think of for shadowing a function. Probably would look more reasonable if mutator was passed into a function instead of being defined in a function. None of this would ever be top level.
Patterns like this are sometimes seen with web framework middleware for example.
A more correct example may look like this:
addEffects : (I64 -> I64), Bool -> (I64 -> I64)
addEffects = \mutator, applyEffect1 ->
    mutator =
        if applyEffect1 then
            effect1 mutator
        else
            mutator
    mutator = effect2 mutator
    effect3 mutator
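For reference, roughly the same shape can be written in Rust today, where each shadowing `let` refers to the binding defined just before it. This is only a sketch: the `Mutator` alias and the bodies of `effect1`, `effect2`, and `effect3` are invented stand-ins, since the original leaves them unspecified.

```rust
// Hypothetical function type for the wrapped mutator.
type Mutator = Box<dyn Fn(i64) -> i64>;

// Invented effects, each wrapping the function it is given.
fn effect1(f: Mutator) -> Mutator {
    Box::new(move |x| f(x) + 1)
}

fn effect2(f: Mutator) -> Mutator {
    Box::new(move |x| f(x) * 2)
}

fn effect3(f: Mutator) -> Mutator {
    Box::new(move |x| f(x) - 3)
}

fn add_effects(mutator: Mutator, apply_effect1: bool) -> Mutator {
    // The right-hand side always sees the previous `mutator`.
    let mutator = if apply_effect1 { effect1(mutator) } else { mutator };
    let mutator = effect2(mutator);
    effect3(mutator)
}

fn main() {
    let wrapped = add_effects(Box::new(|x: i64| x), true);
    println!("{}", wrapped(5));
}
```

In this version there is no ambiguity about which `mutator` is meant, because Rust closures cannot refer to a later `let` in the same scope; that is exactly the forward-reference question raised above for Roc.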
There’s another case for shadowing of functions: destruct module namespaces locally.
E.g. you want to use List functions inside a function and just don’t want to type List every time, you destruct it locally, but you can have reverse defined somewhere else. A made-up example, but you got the idea.
So something like
myFn = \... ->
    # I'm gonna do a lot of list manipulation here, lets import those locally
    { get, reverse, set, ...} = import List
    ...
    etc
    # Also this is a data structure or I have this function for other reasons:
    reverse = \myType ->
        ...
fair points! I'm gonna plan to make it work and see how it feels
:axe: :angelic:
I have just spent some time trying to formulate my reason for opposing this idea and I couldn't come up with good words. I won't stop trying to formulate it, but for now it comes down to this: shadowing changes how we look at the code and at the things we have named.
In a world full of shadows (I have to give it a dark LotR vibe), apple = ... is not just an association of a name with an immutable value, because the first apple is just a shadow of the apple that might have gotten bitten a few times down the line. So when you pass the value of apple to applePieRecepie, it might not be the same apple that we created at the top of the scope. Given the complexity of the scope, we may have lost track of the fact that we have bitten the apple a few times, i.e. the value might have changed, and we need to consult a wall of text, searching for apple =, to check whether the value got shadowed, either by accident or on purpose. Even then, apple might get shadowed by the deconstruction of an object: {apple, pear, ...} = basketOfFruits.
Also, am I wrong, or does the compiler have to support all of this and keep track of it during execution time?
Please ignore me and go on with the experiment!
I am just trying to formulate all the problems that keep hitting me in the head, so I can sleep at night :)
The compiler would not have to keep track of it at execution time. Rust would not have it as a feature if it produced any slowdown. This is a compile-time question: "which definition does this name refer to?" You could even think about it conceptually as the compiler appending numbers to the def names, as we are doing by hand now, so this is all done at compile time. To me, this does not seem like an operation that would slow down the compiler, so I don't worry about the performance aspect of it. That said, I have only written 1 interpreter, which kept track of scope and shadowing at runtime, so this is just my best guess :big_smile:
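A toy sketch of that "append a number at compile time" idea, written in Rust. This is not how the Roc compiler works; `Scope`, `define`, and `resolve` are hypothetical names, and nested scopes are ignored for brevity. It only shows that resolving a shadowed name is a lookup done once, before the program runs.

```rust
use std::collections::HashMap;

/// Hypothetical compile-time scope: tracks how many times each
/// source name has been defined so far.
#[derive(Default)]
struct Scope {
    counts: HashMap<String, u32>,
}

impl Scope {
    /// Records a new definition and returns its unique internal name.
    fn define(&mut self, name: &str) -> String {
        let count = self.counts.entry(name.to_string()).or_insert(0);
        *count += 1;
        format!("{}{}", name, count)
    }

    /// Resolves a use to the most recent definition, if there is one.
    fn resolve(&self, name: &str) -> Option<String> {
        self.counts
            .get(name)
            .map(|count| format!("{}{}", name, count))
    }
}

fn main() {
    let mut scope = Scope::default();
    println!("{}", scope.define("buckets"));
    println!("{}", scope.define("buckets"));
    println!("{:?}", scope.resolve("buckets"));
}
```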
Last updated: Jun 16 2026 at 16:19 UTC