Followup to Advent of Code 2022 feedback:
There are times when being able to shadow a variable, either with the same type or a different type, would be both convenient and perhaps less error prone compared to needing to use multiple identifiers (or alternate techniques, such as magic number indices).
An example of this is incrementally consuming a list of word strings, where you might want to "reassign" the remainder of unconsumed words, at each step, back into a words variable of type List Str.
Another case would be parsing values or formatting values, where you might need to temporarily obtain a Str but ultimately need a U32, or where you have a U32 but ultimately need a Str. While pipelining can satisfy many of these cases, redeclaration after some processing steps might simply be clearer. In other cases, a prior intermediate result may no longer be needed (and juggling multiple [no longer used] names is tricky because naming is hard).
Rust has a redeclaration feature: although it doesn't allow assignment to immutable (default) variables, it does permit new declarations with the same name, and possibly a different type. iiuc, each such declaration essentially introduces an implicit nested scope.
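For reference, a minimal Rust sketch of the redeclaration described above, covering both the "remaining words" case and the Str-to-number case. The function name `parse_count` and the input format are made up for illustration.

```rust
/// Parses the first word of `line` as a count and returns it with the remaining words.
/// (Hypothetical helper, only meant to show redeclaration.)
fn parse_count(line: &str) -> Option<(u32, Vec<&str>)> {
    // `words` starts out as every whitespace-separated word in the line.
    let words: Vec<&str> = line.split_whitespace().collect();

    // `count` is first bound to a string slice...
    let (count, rest) = words.split_first()?;
    // ...and then redeclared as a u32; the string version can no longer be named.
    let count: u32 = count.parse().ok()?;

    // `words` is redeclared as the unconsumed remainder.
    let words: Vec<&str> = rest.to_vec();

    Some((count, words))
}
```

Calling `parse_count("3 red green blue")` yields the count 3 plus the three remaining words, while a line whose first word is not a number yields `None`.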
@Brendan Hansknecht reasonably suggested that shadowing of top-level declarations, as well as the name of any function which may get called recursively, should be prohibited, as doing so in either of these cases would likely be confusing.
I would like more lenient rules on shadowing between scopes, but I would really not like to have redeclaration. I think having the same name mean different things depending on where you are inside the same scope makes it harder to understand what's going on. I think of it as a design flaw in Rust and feel it's bad practice to use it. I've often thought it should trigger a Clippy warning or something.
On a more practical note, I think it is only really possible to have redeclaration in an imperative language where the sequence of lines of text corresponds to a sequence in time. In Roc and all other declarative, expression-based languages, there are no assignments, only "equations" that can be in any order. I don't know how we could implement redeclaration in the compiler if we wanted to.
But as for shadowing, I do often find myself surprised at the places Roc gives an error about it, and feel it's getting in my way. I wouldn't have expected to have this experience, but I do.
I haven't tried to analyse exactly where these surprising places are though.
Static single assignment is a pretty common compiler pass in which each assignment is given a unique name, even if in the original source, they share the same name. Presumably something similar could apply to Roc?
Even though, yes, Roc distills down to one large computation, we could treat each non-top-level let-style expression as introducing a nested scope, permit shadowing, and thus all identifier references would correctly resolve to the latest/deepest scope in which it was (re)introduced.
Whether or not we should, it does seem feasible.
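To make the SSA-style renaming idea a bit more concrete, here is a rough sketch in Rust (not how the Roc compiler works, just an illustration). It assumes a flat sequence of bindings, each listing the identifiers its right-hand side mentions; `Binding` and `rename_shadowed` are hypothetical names, and nested scopes are not handled.

```rust
use std::collections::HashMap;

/// A binding `name = <expr>`, where `refs` lists the identifiers the expression mentions.
/// (Hypothetical representation, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
struct Binding {
    name: String,
    refs: Vec<String>,
}

/// Gives every (re)declaration a unique name and points each reference at the
/// most recent declaration that precedes it.
fn rename_shadowed(bindings: &[Binding]) -> Vec<Binding> {
    // Latest version number seen so far for each source-level name.
    let mut latest: HashMap<String, usize> = HashMap::new();
    let mut out = Vec::with_capacity(bindings.len());

    for b in bindings {
        // Resolve references first: the right-hand side sees the previous binding.
        let refs = b
            .refs
            .iter()
            .map(|r| match latest.get(r) {
                Some(n) => format!("{}_{}", r, n),
                None => r.clone(), // not declared in this sequence (e.g. an argument)
            })
            .collect();

        // Then introduce a fresh version of the declared name.
        let version = latest.get(&b.name).map_or(0, |n| n + 1);
        latest.insert(b.name.clone(), version);
        out.push(Binding {
            name: format!("{}_{}", b.name, version),
            refs,
        });
    }
    out
}
```

With the input `path = f arg`, `path = g path`, `len = h path`, this produces `path_0`, `path_1` (referring to `path_0`), and `len_0` (referring to `path_1`), so later passes never see two bindings with the same name.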
Stylistically, Roc does already have a notion of temporality/sequencing in the form of pipelines: the source of a pipeline step necessarily must be evaluated before the entirety of the destination may be evaluated (even if parts of the destination expression could be evaluated before the source). Backpassing is an even stronger example: the lines after backpassing syntax notionally happen after the lines before the backpassing syntax.
Generally, it's not a semantically incorrect interpretation that, within a function, earlier lines are evaluated before later lines. While in reality they could be evaluated in a different order, because Roc is a side-effect-free language, then, assuming the compiler is sound, such an interpretation is convenient and hard to challenge.
quick note: I actually think that although today we silently reorder defs, we should start giving a warning for it - https://github.com/roc-lang/roc/issues/4430 - reordering makes sense if it's unobservable, but it can give you misleading dbg output (among other things, e.g. expect and crashes) if those get reordered
that said, I don't think this has much bearing on the design question of whether or not we should allow redeclaration - I think the main consideration here should be to figure out what's most helpful overall
broadly speaking, I appreciate being able to shadow when I'm writing code, and I appreciate knowing that shadowing is disallowed when I'm reading code
if shadowing is disallowed, then if I see something declared in one place and then used later on at the same indentation level (or higher), then I know instantly that the usage connects to that declaration and couldn't have "changed" in between
if it's allowed, then I have to audit it to be sure; I have to scan all the lines in between the declaration and the usage to check whether the name has changed to refer to something else in between
that said, I have found it to be very convenient when writing code and miss it from Rust when I'm writing Roc.
but then again, I spend a lot more time writing Roc code than debugging it. I've very rarely been bitten by shadowing in Rust, but it has happened. I remember one time losing over an hour to something where I had the wrong mental model about what was going on due to shadowing in a particularly complicated function, and then thinking "wow, did this one bug just erase all the time shadowing in Rust has ever saved me in the writing phase?"
(I now wish I'd written down what the bug was!)
so given that code is read more often than it's written, and that right now Roc code bases are all pretty small and mostly only read by the person who wrote them, I think it's good to consider what we'd be giving up by allowing shadowing anywhere
I'm open to the idea btw, and I've independently considered it before. I actually asked Jose Valim what people think of it in Elixir, and he said it gets mixed reviews; a significant number of people like it and a significant number don't
he pointed me to this article as a good summary of the pros and cons in Elixir: https://dashbit.co/blog/comparing-elixir-and-erlang-variables
keeping in mind that elixir supports redeclaration like Rust does, not just when introducing a new scope (like Brian proposed)
Good points Richard. I still find it frustrating to write but maybe it's the best option
Kevin mentioned back passing and it occurred to me that that's not redeclaration but actually shadowing! Everything after the back pass is really the body of a function. So it's a new scope!
Which means if we allowed shadowing in inner scopes, it would read like redeclaration. This weakens my earlier argument!
ha, good point - I didn't think of that either! :big_smile:
Richard Feldman said:
broadly speaking, I appreciate being able to shadow when I'm writing code, and I appreciate knowing that shadowing is disallowed when I'm reading code
I would generally prefer trade-offs which favor readability over trade-offs which favor write-ability, in cases where they're opposed.
However, I'm not sure it's as simple as that in this case, since, as called out before, the absence of shadowing can lead to more names being used, each with, from the perspective of the reader, an indefinite lifetime, compared to a shadowed equivalent that just rebinds the same name multiple times (thus appearing to be procedural processing steps). Such processing steps usually are just refinements of the same data (e.g. peeling away words from the same list of words, or iteratively removing extraneous details from a string).
Arguably less mental context is needed for both the reader and the writer to deal with same-type/shape refinements using a small number of variables than juggling a number of variables proportional to the number of processing steps. There are certainly techniques for dealing with this issue, such as splitting functions or alternate transforms with pipelines. That said, novices will likely be less familiar with those other transforms and will have a solution path in mind that will be a lot less satisfactory if the language forces them to consider other approaches without merely telling them what to do ("I see you're trying to extract a substring using a bunch of variables, but it's better to do it this way").
@Brian Carroll brings up a good point, which is that type-changing of the same name in the same function could lead to unnecessary confusion. It's certainly true that readers of code will often skip around rather than read linearly, and especially if the type of an identifier can change, it'll certainly increase cognitive burden.
Even in Python, where reusing variables and changing types was fairly common a decade or more ago, that has become far less common with newer optional type checking (since the type checker may now complain).
One case where I've used shadowing to good effect in rust is let re-binding. In some cases due to how an algorithm is specified, you can sometimes have successive intermediate results that could logically have the same name. It would be a bug if an earlier one of these were accidentally used in place of a later one - and so I prevent that by giving them both the same name, thereby making it impossible to access the earlier one.
That both makes it clear that the misuse isn't happening (when reading), and also makes it less likely to introduce such a misuse when writing/refactoring the code.
Of course, in roc you can do much the same thing and make sure the original version isn't in scope just by breaking out each successive step in the computation into its own function. But then the reader is left to verify that things are actually called in the order you expect - and you didn't typo one of the calls in the chain (accidentally skipping a step, for example).
I guess maybe what I want is to be able to declare that the scope of some particular name ends - and doesn't extend farther down the function I'm writing or into nested scopes.
Interesting. Recently I have started doing that a lot in Rust. Putting {} around a group of temporary variables to make it clear they don't escape from that block.
In Roc you could maybe do that with nested declarations!
yeah I think that already works - just indent :big_smile:
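A small Rust sketch of the "put {} around temporaries" habit mentioned above, for anyone who hasn't seen it. The function `mean_of` and its input format are invented for the example.

```rust
/// Computes the mean of the numbers in a comma-separated line.
/// (Hypothetical helper, only meant to show block-scoped temporaries.)
fn mean_of(line: &str) -> Option<f64> {
    // `fields` and `parsed` exist only inside this block; only `values` escapes.
    let values: Vec<f64> = {
        let fields = line.split(',');
        let parsed: Result<Vec<f64>, _> = fields.map(|f| f.trim().parse::<f64>()).collect();
        parsed.ok()?
    };

    // Referring to `fields` or `parsed` here would be a compile error.
    if values.is_empty() {
        return None;
    }
    Some(values.iter().sum::<f64>() / values.len() as f64)
}
```

The block makes the end of each temporary's scope explicit, which is roughly what an indented nested declaration would give you in Roc.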
I'd very much prefer shadowing, especially for backpassing. Having to name every intermediate binding is a source of friction and even led to bugs where I incorrectly used the previous binding. Two examples that can be better with shadowing: threading state in a random number generator and consecutive List.walks.
Edit: The second example has both lol
yeah threading state gets nicer but I'm not sure it's actually less error prone because of unused warnings
like if you generate a new seed and don't use it because you accidentally use a stale seed, you'll get an unused warning for the new seed
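Here is roughly what the seed-threading case looks like with rebinding, sketched in Rust. `next_u32` is a toy generator with illustrative constants, not a real RNG API, and `three_rolls` is a made-up caller.

```rust
/// One step of a small linear congruential generator: returns (value, next seed).
/// The constants are illustrative only.
fn next_u32(seed: u64) -> (u32, u64) {
    let next = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    ((next >> 32) as u32, next)
}

/// Draws three values, rebinding `seed` at each step so a stale seed cannot be named.
fn three_rolls(seed: u64) -> ([u32; 3], u64) {
    let (a, seed) = next_u32(seed);
    let (b, seed) = next_u32(seed);
    let (c, seed) = next_u32(seed);
    ([a, b, c], seed)
}
```

With distinct names (`seed1`, `seed2`, ...) the same function still works, and accidentally passing `seed1` twice would leave `seed2` unused, which is the case the unused-variable warning catches.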
yeah that second example is a good one though! Definitely state would be a less error prone name there than innerState
(if it were allowed)
as Brian noted, you can fix that by extracting it as a named function in a different scope (e.g. top level)
but I do prefer how the code reads when it can be nested like this
that is, I wouldn't choose to extract it except as a way to work around shadowing being disallowed
but what I struggle with in situations like this is: there is a workaround (extracting the function) for this, whereas there's no workaround for the downsides that come with shadowing - you just have to always be on the lookout for it forever
Can we enumerate the downsides of shadowing alongside how often we think it'll be an issue or whether we can get away with targeted restrictions? If we can detect cases which are problematic and non-useful, we should restrict them, while permitting cases which are useful and fit into common patterns.
Shritesh Bhattarai said:
Would it have worked well enough to call the outer state, which appears to be used just once, outerState, while leaving the name state for the inner state, which is used through the remainder of the function?
If tracking variable lifetimes is a major concern, perhaps we can introduce some syntax, such as a sigil/symbol, to indicate that it must be referred to exactly once. These might be called linear types, though I doubt what I'm describing has all the required properties. For example, declare as %x, and pair with a reference %x. After that later reference, the value can no longer be referred to within that same scope (either the identifier ceases to exist and could be reused, or still exists but cannot be referenced again).
In @Shritesh Bhattarai's example, where the outer state's introduction and use are on adjacent lines, this might work pretty well, but in any case where the declaration and use are separated by many lines, the value will diminish quickly.
Sometimes pipeline is not flexible enough and I end up with:
X1
X2
X3
....
In those cases, it is really easy to write buggy code by accidentally using the wrong X.
Also, using names like this makes changing code really annoying. Sometimes you need to increment so many variable names.
yeah I think a relevant question here is:
"What's likely to cause more lost debugging time? Wrong mental model when reading code because it shadowed something you didn't realize, or accidentally using a stale variable name when writing code?"
some answers that seem easy but which I think don't hold up very well in practice:
another consideration I hadn't thought of before: having access to the option of shadowing is pretty much strictly better for prototyping; it would make Roc better at that use case
so if we think these are similar in terms of overall impact on time spent debugging, that's a potential tiebreaker
on the other hand, another potential tiebreaker is learning curve: disallowing shadowing is strictly easier to teach. "You can't reuse names, the end."
On the other hand, shadowing takes longer to teach in one of two ways:
<- but not with = - there are just various rules you have to explain, along with why they work that way.
x = num + 1 inside a List.map callback won't mutate an x that's declared outside the loop.
to be fair, teaching the latter might be as easy as saying "pretend there's always a const there, except you don't have to write it. So basically you're writing const x = ... so of course it doesn't mutate the x in the outer scope!"
Just remembered an extra use case. pipelining with lambdas.
buf
|> generateDeriveStr types enumType ExcludeDebug
|> Str.concat "#[repr(u\(reprBits))]\npub enum \(escapedName) {\n"
|> \b -> walkWithIndex tags b generateEnumTags
or
buf
|> \b -> if discriminantSize > 0 then
        generateDiscriminant b types discriminantName tagNames discriminantSize
    else
        b
|> Str.concat ...
Note how I have to switch from buf to b in the lambdas.
Yes, these could be named functions, but they are very small and would be weird to name in my opinion. Of course depends on case by case.
here's another interesting angle to consider: to what extent could the editor mitigate the downsides of shadowing? :thinking:
for example, it could just straight-up tell you when something is shadowed (e.g. syntax highlight it in a different color)
or when looking at a definition, it could have a little icon next to it indicating that this definition is shadowed
maybe when you highlight a named variable, it doesn't just tell you its type, it also tells you if it's referring to a shadowed definition
would that mitigate the whole downside? part of it? none of it?
I think if it was highlighted differently or had a symbol, that would mitigate most of the downside, from my experience. Though if you shadow multiple times, you would probably need to distinguish each of them.
An interesting idea there is if you hover over a variable, what if you get a view of how it was defined rather than just the type?
my sense is that the editor solution only partially mitigates the problem though, and only for readers who are unsure of the definition source. it also doesn't address e.g. reading source code reviews on something like github
or accidentally using a stale variable name when writing code?
I've definitely had this happen to me before, but I think it happens less often than problems observed due to shadowing. like, the situation in which this happens is if you have e.g. xOuter and xInner and you use both xOuter and xInner in the innermost scope - but that seems unlikely, because presumably if you wanted to have xInner shadow xOuter, then xOuter should not be relevant in the scope where xInner lives. And if in the innermost scope you don't reference xInner, then you'll get a warning of an unused variable (which in my experience has saved me from some bugs).
Another possible disadvantage of allowing shadowing is it can lead to disciplines where you both sometimes shadow, and sometimes use unique variable names (e.g. state1, state2) where you would otherwise shadow. I know I've done this many times even though it's not exactly a great discipline for either the reader or the writer (especially the reader), and seems strictly worse than not supporting shadowing IMO
One idea if re-binding/shadowing is allowed: only allow re-binding a variable if the new definition uses the shadowed variable. So e.g. you can do
path = ...
path = Path.toStr path
but not
path = ...
path = "telluride"
This doesn't address the nested-scope problem though, so maybe not the best idea.
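As a rough illustration of how that restricted rule could be checked, here is a small Rust sketch. It assumes a flat list of bindings in one scope, each with the identifiers its right-hand side mentions; `Binding` and `invalid_rebindings` are hypothetical names, and nothing here reflects the actual Roc compiler.

```rust
use std::collections::HashSet;

/// A binding `name = <expr>`; `refs` lists the identifiers the expression mentions.
/// (Hypothetical representation, for illustration only.)
struct Binding<'a> {
    name: &'a str,
    refs: &'a [&'a str],
}

/// Returns the names whose re-binding does not mention the value it shadows.
fn invalid_rebindings<'a>(bindings: &[Binding<'a>]) -> Vec<&'a str> {
    let mut declared: HashSet<&str> = HashSet::new();
    let mut invalid = Vec::new();

    for b in bindings {
        // A re-binding is only accepted if its right-hand side uses the old value.
        let is_rebinding = declared.contains(b.name);
        if is_rebinding && !b.refs.contains(&b.name) {
            invalid.push(b.name);
        }
        declared.insert(b.name);
    }
    invalid
}
```

Under this check, `path = Path.toStr path` passes because the new definition mentions `path`, while `path = "telluride"` is reported because it mentions nothing.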
One other reason in favor of shadowing: suppose someone is writing a module, and in some nested scope they define some name.
But then they add, at the top level (or in some higher scope), a definition that is best suited to use that name.
Now, they have to change the inner-scoped-name to something else, even though it may be unrelated to the top-level change they are making. This can increase diff noise for readers and writers.
I've never actually seen this happen though, I don't know if it would
that has definitely happened to me when writing libraries
like for example in a JSON decoder, I want to expose Decode.str - now I have a top-level declaration named str, so I can no longer use str as an identifier anywhere else in the module :sweat_smile:
same thing with Decode.num etc
this reminds me of another consideration: in Rust, I don't get as much value out of shadowing because often I'll do let mut state = and then mutate in-place, instead of shadowing
which affects the ratio of "times rebinding bit me" to "times it was useful" - it's useful less often in Rust than it would be in Roc, because a lot of the time, I wouldn't be using state1 = ... state2 = ... etc. in Rust anyway because instead I'd just be reassigning in-place
I usually don't mind if I have to rename lambda variables when creating a new top level definition as it makes it clearer for the reader they're not talking about the same thing.
Thinking about it, the only case where I've used shadowing and I'd love to keep it is when I'm using it in let bindings inside a function. I'd be fine if I couldn't shadow top level definitions, but not being able to shadow values defined inside a function (be they in a sequential let binding or in nested lambdas in a pipeline) is where I seem to draw the line.
ok, I want to make a concrete proposal: let's allow full Rust/Elixir-style rebinding.
that is, all of the usual shadowing stuff works, and in addition, this becomes allowed too:
x = 5
x = "blah"
what do people think of that specific proposal?
personally, don't feel strongly one way or another, but I would suggest that in a world where re-binding in the same scope is allowed, it is a warning to not use a variable (like the first x in your example) before it is re-bound.
oh yeah 100%
that would just be an unused variable warning, like normal
just like in Rust
I like this proposal :smiley:
I really don't like it at all, I find rebinding very confusing in Rust and wish it didn't have it. I always get rid of it in any code I have to do any serious work with. I don't like that the same name means different things in the same scope depending on what line of code you're looking at.
This would make Roc feel imperative to me, because binding would be more like an "assignment statement" where order matters. I think of binding as just "giving a name to an expression" but it wouldn't really mean that any more because it would also mean something about the time sequence in which things are executed.
Originally I was in favour of shadowing names in different scopes because all the big problems for me occur within the same scope.
BUT backpassing breaks that. It's an important syntax feature that deliberately makes different scopes look like they're in the same scope.
So my conclusion is that, although it often feels annoying to me to write, we need to keep the current behaviour.
Since nothing is mutable in roc, i think the problems that need redeclaration/shadowing are much much more common than rust.
As such, i think we will run into many many more situations where code is inconvenient and brittle without some form of shadowing or redeclaration. I think the problems often arise in the same scope. That said, in the cases it doesn't arise in the same scope it still feels like it arises in the same scope due to backpassing and lambdas.
If we only limit to different scopes, I think that would be much more confusing than allowing same scope as well
I agree that it makes the language feel much more imperative, but when people see a list of variable declarations within a function, they assume it is imperative anyway. People do not naturally understand the potentially out of order execution.
On top of that we already have many operations that force things to be imperative in execution order: backpassing, pipelining, dbg, expect, data dependencies, and arguably conditionals.
Aside, since internal to the compiler we can rename any variable that is shadowing another, we can still create a true SSA form with a data dependency graph. So it should be equally optimizable to the version with new names at each use.
semantically, I'd call it "out of order evaluation." The term execution implies a side effect, at least to me.
While Roc advertises its lower level performance behavior considerably more than other functional languages, you still shouldn't need to know these lower level details to know what result the program will have. In the semantic sense, the order of evaluation is entirely irrelevant because it shouldn't be an observable property of the program, short of using a debugger or triggering a core dump.
I think it can only be observable in terms of performance (especially with ordering potentially making something non-unique and leading to copying). That is fair.
Except when you add in dbg and expect where they have side effects and the order is then observable.
Wouldn't data dependencies be a characteristic that sets pure functional apart from imperative? Data dependencies define a tree as the evaluation partial ordering mechanism, while imperative typically uses line and expression order as the (hopefully) total ordering mechanism; since imperative expressions can have side effects, those languages need more complex definitions of behavior than Roc does.
Imperative language code gets reordered as well. Yes, it is more restrictive, but we regularly add in similar restrictions via function calls, backpassing, and Tasks. I wouldn't look at them as fundamentally different in this case. They both boil down to an SSA form with effectful operations that block reordering.
debug and expect are not representative of the whole language. It's also not clear to me why expect needs to force an evaluation order: the compiler already does not print errors in line order.
In any case, could there not be declaration orderings that force debug to evaluate in non-source order (or force it to buffer?)
x = y + 5
dbg x
y = 7
dbg y
iiuc, since Roc allows out-of-order declarations within functions, the above should be a valid function body.
If it is valid (which it probably is), its output would be very confusing to people and I would argue that it shouldn't be valid.
dbg and expect are very important because they are a direct way to see the execution order of the program. If dbg prints out of order, it could lead to hours of wasted time debugging.
Why is x equal to 7... oh, it's not, the dbg prints were reordered and y was printed before x.
out-of-order declarations are going to become warnings in the future for the reason Brendan mentions
I agree with that. As such, if it is valid, I think it's an argument for limiting arbitrary declaration order to the global scope.
Within a function, too much confusion could come from writing declarations out of dependency order, and I imagine people naturally write declarations, within functions, in dependency order almost every time anyways (and perhaps many of the times they don't, it's by accident or following a refactor).
Conversely, the global scope should not allow redeclaration (except in the repl), because that seems like it should be undefined, i.e.
# All global scope...
x = x * 5
# Hundreds of unrelated lines
x = x + 2
# Hundreds of unrelated lines
x = 1
What's the value of x?
Yeah, I agree with all of that.
But if dbg and expect force an evaluation order, they are _changing_ the evaluation order, compared to a release build or simply removing those dbg and expect lines: the compiler may well determine it's more optimal to have a different order when dbg and expect are not involved.
I agree they're important for understanding properties of the program, but I disagree that they should be advertised as having any bearing or meaning on understanding the "execution order" of the program, except across tasks, since tasks are the only aspect of Roc that's "executed" (side effectful). We would make dbg and expect force line-order evaluation to avoid confusion but not to provide extra meaning to order, because that meaning would be deceptive (except across tasks).
And the tutorial lesson (for people used to imperative languages) is that evaluation order respects dependencies and respects tasks, but that's it. Any ordering that achieves the same result could be the ordering that the compiler selects, and while that's somewhat true for compiled high level imperative languages, it's even moreso the case for Roc because it has a wider optimization space to work with, or at least fewer (or different) language-induced impediments to get in the way of the optimizer.
Recently I ran into frustration around the current shadowing rules when trying to implement something that essentially uses the "state" pattern - where there should always be one "latest" version of the state that you use - and I shouldn't have to think critically about which one that needs to be (it should always be the latest one / inner-most one!).
Take this code for example:
toIdParserList : IdBindState, List Parser -> {state: IdBindState, ids: List Id}
toIdParserList = \state, parsers ->
    List.walk parsers {state, ids: []} \{state: state2, ids}, parser ->
        {state: newState2, id} = toIdParser state parser
        {state: newState2, ids: List.append ids id}
It's kinda silly I have to keep coming up with new names for state.
And if I ever accidentally use the wrong one, that's immediately a bug.
Oh and actually funnily enough, there _is_ a bug of exactly that form in that code.
Can you spot it?
(I didn't do that on purpose, I promise)
toIdParser state parser
should be toIdParser state2 parser
That would be IMO much more readable (and writable!) if I could just re-use the same state name. e.g.
toIdParserList : IdBindState, List Parser -> {state: IdBindState, ids: List Id}
toIdParserList = \state, parsers ->
    List.walk parsers {state, ids: []} \{state, ids}, parser ->
        {state, id} = toIdParser state parser
        {state, ids: List.append ids id}
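For comparison, here is approximately the same shape in Rust, where this kind of rebinding is allowed today. It's only a sketch: `to_id` is a hypothetical stand-in for toIdParser, with the state simplified to a counter and parsers to string slices.

```rust
/// Hypothetical stand-in for `toIdParser`: hands out the current counter as the id
/// and returns the advanced state as (new state, id).
fn to_id(state: u32, _parser: &str) -> (u32, u32) {
    (state + 1, state)
}

/// Rust analogue of `toIdParserList`: every binding of the state is called `state`,
/// so there is no older version left to pass by mistake.
fn to_id_list(state: u32, parsers: &[&str]) -> (u32, Vec<u32>) {
    parsers
        .iter()
        .fold((state, Vec::new()), |(state, mut ids), parser| {
            let (state, id) = to_id(state, parser);
            ids.push(id);
            (state, ids)
        })
}
```

Starting from state 10 with three parsers, this hands out ids 10, 11, 12 and ends with state 13. Inside the closure the outer `state` simply cannot be named, which is the property being asked for.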
only true for linear state passing (where you truly don't want the old state any more)
Is there a better way to do the state pattern in roc?
generally you should not pass state around so explicitly i think, if you can. you can also refactor this into something less name-y
you could take this into a sort of custom monad direction, for instance
^ is a classic example of "everything is traverse"
in this particular case, without going in that haskell direction, I'd go with something like
toIdParserList : IdBindState, List Parser -> {state: IdBindState, ids: List Id}
toIdParserList = \initialState, parsers ->
    List.walk parsers {state: initialState, ids: []} \accum, parser ->
        stepped = toIdParser accum.state parser
        {state: stepped.state, ids: List.append accum.ids stepped.id}
Hmm not sure I agree that's more readable
That naming is just more work. Allowing shadowing (and taking advantage of it) actually makes it clear to readers that there's an _absence_ of certain kinds of bugs.
+1 I’d love to be able to directly pattern match in the accum above instead of having to name it separately
well what I like about it is that I can find much quicker where a definition comes from.
more generally, I know that not having shadowing causes naming discomfort, and when you just append a number to the variable, that indeed makes it easy to slip up and re-use an old state.
but I like this resistance (in practice I think it makes my code better, even if it takes a bit more effort) and I like the absolute certainty that I have that there are 0 shadowing bugs in my code. I think it works very well in zig and elm.
separately it also makes certain compiler things a bit easier, because names are unique (in a scope)
Never have I ever encountered a bug _caused_ by shadowing. YMMV
This is honestly the scope of problem that'd make me want to maintain a fork of roc that allows shadowing. Or just not use roc at all. I find disallowing it to be very very restrictive.
it happens exactly when the state is not linear. "variable not used" warnings would mostly catch that though I think? haskell diagnostics are not so great
Interesting. I'm having a hard time imagining what that would look like. Is that a thing that only happens in functional languages? (I'm a bit of a beginner in that space...)
Maybe you could point to a concrete example where that happens?
shadowing implies an evaluation order. In functional languages (like haskell or elm) the evaluation order is deliberately unspecified (which can help greatly with optimization). So in a shadowing world, moving some code around can mean that all of a sudden you use a different state than you should.
the same is of course true in rust or zig but there at least the code is already imperative, and you'll write the code with that in mind (still it is error-prone enough that zig also has a no-shadowing rule; you need to declare variables with var when the value bound to a name can change over the course of a scope)
Oh wow I had no idea zig had that rule. You can tell how much zig I've done :)
moving some code around can mean that all of a sudden you use a different state than you should
This is kind of in tension between the linear and not-linear worlds. In the linear + no-shadowing world, moving some code around can lead you to introduce new bugs accidentally if you don't update the names correctly. In the linear + shadowing world, moving code around "mostly works" without extra effort.
Seems like a bit of a duality to me.
it's not really "no shadowing" but "no re-declaration", but the effect is the same: if the value of a name changes over the course of its lifetime, it must be defined as var someName (similar to rust's let mut someName)
and then it's mutably updated
In rust, having successive let bindings with the same name really is shadowing, not mutable updates. The old variables are still alive, just can't be accessed.
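A small Rust sketch that makes this observable: the shadowed value is not dropped at the point of shadowing, only at the end of the enclosing scope. `Noisy` and `shadow_then_log` are made-up names for the demonstration.

```rust
use std::cell::RefCell;

/// Records its label in a shared log when dropped.
struct Noisy<'a> {
    label: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.label);
    }
}

/// Shadows `thing` and returns the order in which events were logged.
#[allow(unused_variables)] // the first `thing` is intentionally never read
fn shadow_then_log() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let thing = Noisy { label: "first dropped", log: &log };
        // Shadowing: the first value is still alive, it just can't be named any more.
        let thing = Noisy { label: "second dropped", log: &log };
        log.borrow_mut().push("after shadowing");
        let _ = &thing; // keep the second binding in use
    }
    // Both values are dropped at the end of the block, in reverse declaration order.
    log.into_inner()
}
```

The returned log is "after shadowing", then "second dropped", then "first dropped", so the first value outlives the `let` that shadows it.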
right. in zig that is not allowed, mutable updates are the only way
I really like this pattern:
let thing: String = func_that_returns_an_owned_string();
let thing = thing.as_str();
// followed by a bunch of calls that expect `&str` - so I don't have to repeat the .as_str() or & everywhere
Ditto for any other "trivial" transformations, where the original form of the variable is never accessed again.
I think that specific pattern is a limitation of the rust language
Do you mean the fact that rust has two different string types?
no, the fact that you cannot do the .as_str() on the same line
Ah got it; that makes sense.
Is there a way to write "state-monad" patterns cleanly in roc, where you don't have to think carefully about the naming of each intermediate? (and be careful not to use the wrong one!)
My understanding was roc doesn't have quite the same monad niceness as haskell
I agree with the idea that a lot of these things should be refactored, but when you have something with X1, X2, etc., it can be very hard to figure out those refactorings. Even with pipelining and backpassing.
Joshua Warner said:
I really like this pattern:
let thing: String = func_that_returns_an_owned_string();
let thing = thing.as_str();
// followed by a bunch of calls that expect `&str` - so I don't have to repeat the .as_str() or & everywhere
This could easily be written using thingOwned in the first statement though.
Coming from Elm, shadowing is rarely a real issue. Mostly when trying to do complex conditional transformations without leaving the same scope.
It seems to me like shadowing only solves the problem of unintentionally using a previously modified value. However, it makes optimizations harder and makes moving code around more error-prone. Not allowing shadowing may be annoying in these scenarios, but if we share knowledge on how to deal with them idiomatically then we could have the best of both worlds?
(e.g. controlling shadowing explicitly by using more functions instead of keeping everything under a shared scope)
it makes optimization harder
How?
just replicating what @Folkert de Vries said as I have zero knowledge around the subject :sunglasses:
Coming from Elm, shadowing is rarely a real issue. Mostly when trying to do complex conditional transformations without leaving the same scope.
I hit it a lot with more complex pipelining. In some cases the dependencies are complex enough that you can't use |>. In other cases, you need |> with a lambda for a small function that isn't worth naming. Can't reuse the name there either.
Also, rust as str is just one example. You may also start with a string and convert to a path or any number of other types later. That often uses redefinition of the same variables.
my point was not really about optimization, but more about it being a nice property (in the compiler, but for programmers too) that a name means just one thing in a scope
there are workarounds to make shadowing possible, but it requires more work in the compiler
we actually need to have the compiler support shadowing under the hood for redeclaration in the repl to work, so we can't maintain that property of the compiler regardless :big_smile:
Related question. Is there a cost to using a record for pipelining? My gut feeling is yes, but maybe it should be optimized away.
As in if i am doing a pipeline with state but at one of the stages go from state to { state, tmp1, tmp2 }. Then the next pipeline stage might use that and go to { state, tmp1, tmp3 } then maybe i don't need the temporaries anymore and it collapses to state again.
This is a case where I am not sure if I should use pipelining and it feels a lot messier. If i don't use pipelining, i am stuck with state1, state2, ...
I felt the pain of no shadowing in Elm in a few places and I usually end up with the sad state of indexed variables. most of the time you get used to prefixing/suffixing and things are pretty sane.
however, it seems like the downsides in the ecosystem might outweigh the upsides of writing being nicer in a few places, since bugs might still appear with either approach (it being harder to track which variables hold which value, and incorrectly using an outdated value).
wouldn't a language like Roc favor safety over convenience? I think it is easier to work around the problems of not having shadowing, while having shadowing everywhere might create more unexpected problems. my 2c.
I don't see it as a safety <-> convenience tradeoff. There are legitimate and common cases where either using shadowing or not using shadowing could lead to accidental bugs.
Unless there's some better tool to address those cases (let's discuss!), then it should really be up to the developer to either shadow or not shadow as the situation dictates.
FWIW, I'd be perfectly happy if I had to sprinkle around a small sigil to indicate that "yes, I'm intentionally shadowing here".
that's interesting! I don't think we've ever discussed that idea
is there any precedent for something like that in other languages?
Not that I know of? Unless you count things like := for definitions vs = for later assignments - but that's not really the same thing.
I mean or rust doing let x = ... vs x = ... for something mutable
or like shadowed x =
Yeah, just pointing out, how it kinda appears in rust.
I would go for even something that explicit.
I think it is still nicer than X1 X2
How would that work with variables introduced in patterns? e.g.
a = 1
b = 2
List.walk list {a, b} \{a,b}, item -> {a: a + item, b: max(b, item)}
(I know that's a bit of a contrived example; but I was writing code pretty close to that yesterday)
Does that become:
List.walk list {a, b} \{shadowed a, shadowed b}, item -> {a: a + item, b: max(b, item)}
sure, let's say yes :big_smile:
(at least for purposes of exploring the idea!)
"shadowed" feels pretty verbose, but otherwise workable.
I would have gone for some sigil, like & or $ or something
or @ or ~
did you ever see a language use a sigil like that and think that was a good idea?
e.g. haskell has !var for strict evaluation
You could also just require that all instances of a name that is or will be shadowed use the sigil, since there's usually nothing particularly special about the first declaration (often it's the _least_ special)
it's uncommon, so anytime it shows up you have to remember what that was and what it does
also sigils are hard to search for
or maybe the very last declaration of a reused identifier is marked specially (or not marked, to distinguish it) so as to signal "look no further: in this and all descendent scopes, this is the last meaning this name will ever have"
Agreed. Most of Haskell was very hard (or nearly impossible) to search documentation for a decade ago. If we go the route of using many sigils, we should have a quick reference guide be the first link anyone finds on the documentation part of the Roc website, and the guide would contain a table of all symbols and their meanings, associated abilities, etc.
I definitely like the keyword better
Even though it is verbose
Hopefully will still push people to the other patterns
I don't love the idea of a keyword/sigil, tbh. It feels like another thing for a developer to have to keep in their head, for nebulous value - now I need to care about the semantic value of a variable, and whether it's shadowed or not, sort of like let vs const or mut in some languages. It puts a toll on the reader and I don't see how it's better, from a reader's perspective, than explicitly allowing shadowing
Joshua Warner said:
Take this code for example:
toIdParserList : IdBindState, List Parser -> {state: IdBindState, ids: List Id}
toIdParserList = \state, parsers ->
    List.walk parsers {state, ids: []} \{state: state2, ids}, parser ->
        {state: newState2, id} = toIdParser state parser
        {state: newState2, ids: List.append ids id}
In this example, did you get a warning that state2 was unused? I feel like in most situations where you increment the index of reused variables, at least a mitigating factor is that you typically use the variables in a linear fashion, so if they're unused the tooling can tell you.
Brendan Hansknecht said:
Hopefully will still push people to the other patterns
If the intent is to push people away from the keyword, then we shouldn't have this as a feature, except perhaps as a short term experiment to be concluded before the first stable release
hm, so I tried refactoring that example to use shadowing - here it is before and after:
original:
toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \state, parsers ->
    List.walk parsers { state, ids: [] } \{ state: state2, ids }, parser ->
        { state: newState2, id } = toIdParser state parser
        { state: newState2, ids: List.append ids id }
with shadowing:
toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \state, parsers ->
    List.walk parsers { state, ids: [] } \{ state, ids }, parser ->
        { state, id } = toIdParser state parser
        { state, ids: List.append ids id }
I have found cases where the alternatives don't work or are much more confusing. So i think it has uses. Just other forms are better when they work.
Specifically reply to Kevin above.
it's easier to write, but...that's a lot of different meanings of state in a small amount of code :sweat_smile:
here's an idea for another way to write it:
toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \initState, parsers ->
    List.walk parsers { state: initState, ids: [] } \{ state, ids }, parser ->
        answer = toIdParser state parser
        { answer & ids: List.append ids answer.id }
That works once we fix the fact it would clone ids every time
Why not use a backpassing style for that?
you mean like this?
toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \initState, parsers ->
    { state, ids }, parser <- List.walk parsers { state: initState, ids: [] }
    answer = toIdParser state parser
    { answer & ids: List.append ids answer.id }
exactly
I personally think that's harder to understand
I like having loops indented
my gut reaction to reading that code is that it feels to me like a downside of backpassing that it's possible to write it that way :big_smile:
I see. I worry sometimes that the lambda params will not get noticed, though you're right that the indentation indicates that something interesting is going on
I see that aspect of backpassing differently: the rest of the function focuses on the next level down, and the outer context has nothing else to offer going forward. In that way, it's a bit like an "inception" operator
In this example, did you get a warning that state2 was unused?
@Ayaz Hafiz TBH, probably, but during development I've found roc's 'unused' warnings to be way too noisy to be valuable to pay attention to. Like, I just wrote ~10 functions that are probably all unused because I haven't hooked them up yet and I'm just trying to get things working with small expect unit tests first - which IIRC still cause 'unused' warnings (I think?).
I don't love the idea of a keyword/sigil, tbh.
I agree - but I'd take shadowing with a sigil over no shadowing ;)
Shadowing without a sigil is still an option ;)
...downside of backpassing that it's possible to write it that way...
oof. backpassing in maps and loops is my favorite Roc syntax. It is only possible to use it in places when the mapping function is "terminal", i.e. the last thing you do in that code block. Indenting there would just be visual noise.
regarding sigils: Elixir uses the ^ operator in pattern matching to bind to an existing value and prevent shadowing. Not sure how relevant it is to the discussion but I've wanted something similar when doing pattern match over lists (also, can I haz Rust's @ in patterns as well :pleading_face:)
Shritesh Bhattarai said:
backpassing in maps and loops is my favorite Roc syntax. It is only possible to use it in places when the mapping function is "terminal", i.e. the last thing you do in that code block. Indenting there would just be visual noise.
wow, that's a strong endorsement!
maybe my first impression is wrong, and I should try embracing it and see how I feel after getting used to it... :thinking:
Shritesh Bhattarai said:
also, can I haz Rust's @ in patterns as well :pleading_face:
oh I think we should totally have that, just with as instead of @ - e.g. { x: blah } as rec -> ...
another interesting way to write the previous example, which would be an option if we have tuples:
toIdParserList : IdBindState, List Parser -> (IdBindState, List Id)
toIdParserList = \init, parsers ->
    (state, ids), parser <- List.walk parsers (init, [])
    toIdParser state parser
    |> Tuple.mapSecond \id -> List.append ids id
or, without backpassing:
toIdParserList : IdBindState, List Parser -> (IdBindState, List Id)
toIdParserList = \init, parsers ->
    List.walk parsers (init, []) \(state, ids), parser ->
        toIdParser state parser
        |> Tuple.mapSecond \id -> List.append ids id
btw in the { answer & ids: List.append ids answer.id } line above, the type of answer changes. is/should that be allowed?
I believe elm does not allow you to do that
actually, more is wrong. answer does not have an ids field, it has an id field
oh good point!
so the tuple one would work but the record one wouldn't
an interesting thing I wonder about: people have plenty of feature requests in Elm, but shadowing has never been one of them as far as I can remember. I also used Elm before and after the release where shadowing became an error, and I don't remember any complaints about it.
I wonder what's different that leads to so much more interest in it for Roc. :thinking:
also Elm's compiler doesn't warn for unused variables (a separate linter does that, which tends not to get run as often) so I'd expect "accidentally reused stale state" bugs to come up strictly more often in Elm
my guesses:
Task and stateful computation, e.g. I worked on a bytes parser in elm and there the state problem does come up. But the sort of person who would write that sort of library in elm probably has a bunch of experience in other functional languages (or with elm)
yeah I also wonder if we're disproportionately running into that edge case right now because an unusual percentage of people's time in Roc is literally writing parsers :big_smile:
because that's not something you normally do in application development, but it is in Advent of Code specifically, and it also is when building foundational libraries like JSON and CSV parsing, which comes up more often when a library ecosystem is in its infancy
I think pipelining and trying to focus on the data pipeline is what leads me to want shadowing.
Even if data can't actually use |>, but is written in a staged transformation manner. I would want shadowing.
So in my cases at least, not parsers and the like.
yeah that's a difference between Roc and Elm - in Elm you need parens around a lambda in the middle of a pipeline, so in practice people instead typically end the pipeline there
(and then e.g. name an intermediate value before moving on)
to be fair, since those lambdas tend to be very small, using a short variable name like b doesn't seem very error prone to me
unless I suppose you used b in one part of it and the outer buf elsewhere
when you should have used b everywhere
Totally fair
My current thought is that we seem to have patterns that should help with this. Languages like elm manage just fine. We should try not adding shadowing/redeclaration yet and wait until we have more samples of places where we want shadowing.
We should try documenting those cases, seeing if there are nice rewrites to avoid wanting shadowing, and reconsider later.
In Elm, i/o processing funnels back into a single function using a message union (i.e. central dispatch), and it's very hard to do it any other way.
In Roc, especially with backpassing, task processing is comparatively procedural, and any function can do it.
Both approaches have their own tradeoffs, and one of them for Roc is more of a tendency towards larger functions that "do" more things, and a higher likelihood of wanting to reuse names.
that's true, but also none of the motivating use cases for shadowing we've seen so far involve Task.await, which is the essential difference between how I/O works in Roc and in Elm (that is, Task.await is super common in Roc and super rare in Elm)
so while that is definitely a difference, I don't think it's a big part of the difference in interest for shadowing
writing down the idea for #ideas > syntax for x = f x pattern gave me an idea: what if shadowing was allowed, but only within the same block of defs?
so for example you could do this:
x = 5
x = x + 1
x = x + 3
x
...but this would be a compile-time error:
x = 5
List.map nums \num ->
    x = x + num
...because you're shadowing x from different defs, not the defs where x was originally defined
as I recall, the main request for shadowing is in exactly this situation (where you want to shadow within the same set of defs) and the major downsides are in other situations (where you introduce a new variable without realizing you've shadowed something from the outer scope)
Yeah, that would fix just about every reasonable use case i think
I would very much be for that.
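To make the same-block versus nested distinction concrete, here is a small Rust sketch. Rust allows both forms, so this only shows what each one does, not the proposed Roc rule; `same_block` and `nested` are hypothetical names.

```rust
/// Shadowing within one block of definitions: each `x` replaces the last.
fn same_block() -> i32 {
    let x = 5;
    let x = x + 1;
    let x = x + 3;
    x
}

/// Shadowing inside a nested closure: the inner `x` is a separate binding,
/// and the outer `x` keeps its value, which is easy to overlook.
fn nested(nums: &[i32]) -> (Vec<i32>, i32) {
    let x = 5;
    let mapped: Vec<i32> = nums
        .iter()
        .map(|num| {
            let x = x + num; // shadows the outer `x` only inside the closure
            x
        })
        .collect();
    (mapped, x) // the outer `x` is still 5 here
}
```

The proposal in the thread would accept the first shape and reject the second at compile time.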
What about inside an if or when? Given those are terminal, would it be fine to shadow?
x = 1
if foo then
    x = x + 2
    ... using x
else
    x = x - 3
    ...
I think what you said above plus this would be needed to cover most common cases (and would be safe unlike inside List.map)
This might be confusing with global scoping rules. For example, this reddit post gives a code example that would be error-prone, except for the global scoping and no shadowing:
https://www.reddit.com/r/roc_lang/comments/12ogars/a_noobs_bikeshedding_of_roc/
Scoping is weird:
weirdScoping = \{} ->
    f = \c -> a + c
    a = b + 3
    b = 2
    f 1
This surprisingly works. I also surprisingly really like it though, since it's consistent to how the global scope works, and it seems Roc has nice errors for infinite loops and doesn't allow shadowing. I wish this was part of the tutorial though.
It would be much worse if a could be shadowed even within the same scope. I don't expect this to come up often, but if it does come up, it's likely to cause problems.
I think we talked about this at some point and code like that should at least be a warning, maybe even an error. It makes code more confusing and harder to follow. Code that isn't at the global scope should be in some form of valid topological sort.
Wouldn't that prevent mutually recursive functions?
Only defined within another function.
It would still work at global scope
Also, you could technically pass a function into one of the functions to fix it at local scope, but that is less nice.
Yeah, I guess most mutually recursive functions should live in the global scope anyway. However, part of me, likes that scoping works the same at any level.
Also, found the comment: we want to test making this an error and see how it turns out in practice. If it doesn't add too much friction, we would keep it.
context: #ideas > roc format def re-ordering (at top of thread)
also, I don't think a new user would agree with the statement that scoping works the same at any level:
weirdScoping = \{} ->
    f = \c -> a + c
    a = b + 3
    b = 2
    f 1
# This functionally looks the same to a new user.
# `<-` is just a weird form of `=`
brokenScoping = \{} ->
    f = \c -> a + c
    a <- someFunc b
    b = 2
    f 1
here's the issue about a warning for ordering: https://github.com/roc-lang/roc/issues/4430
Should that be superseded by the discussion I linked above and #5078? That discusses making it an error instead of a warning.
so the only distinction between "warning" and "error" is that error means roc dev will refuse to run the program
(since you can always force a run regardless of whether it's a warning or an error)
I think of the distinction being "error means the compiler is inserting a runtime crash somewhere in case this comes up"
and that wouldn't apply to def ordering!
Ah
Didn't think about that
Yeah, warning sounds fine then
Wait, can't you get a warning on roc build but we still complete the compile? Or will a warning stop the compile?
If so, an error would mean it completely blocks compilation outside of roc dev, but a warning would mean compilation still completes
well the goal is that compilation always completes; you're never completely blocked from running if you want to (you just might get a crash, possibly as early as "right away" depending on where the error(s) are)
the point of introducing roc dev was to give you a workflow where you can say "I want to work through all my errors, and then once all that's left are warnings like 'unused variable' and whatnot, which won't cause crashes, then I actually want to run the program"
ok
So, @Richard Feldman does this mean that you are fine with us adding both the reordering warning and shadowing in the same scope?
Of course with exceptions for functions in both cases, I think. Functions defined out of order are fine and function shadowing is not?
I think shadowing needs more discussion
reordering warning is good to go!
Fair enough. What are your current concerns with shadowing? Like if it was limited to this:
state <- newRand {} |> Task.await
{state, data: x} = randFloat state
{state, data: y} = randFloat state
if hasZ then
    {state, data: z} = randFloat state
    {data: other} = randBool state
    SomeTask other [x,y,z]
else
    {data: other} = randBool state
    SomeTask other [x,y]
[x,y]
Would be invalid within anything nested (nested definitions, nested functions, anything with a ->, <-, or = essentially).
allowing any form of the shadowing removes a significant language-wide guarantee (namely, that any time you introduce a new name you'll either get an error or else it won't affect any existing code) so I consider it a major change to the language regardless of what restrictions are put on it (unless it's something like a sigil which preserves that guarantee)
so I just think it needs more discussion
You mean like, you could do:
x = 3
...
y = x + 1
Then transform it to:
x = 3
...
x = 4 # this x was supposed to be local and not used elsewhere
...
y = x + 1 # now this is using the wrong x
that's one example, yeah
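The x = 3 hazard above can be reproduced in any language that permits shadowing. A minimal Rust sketch (function names and the placeholder uses standing in for the elided "..." lines are assumptions):

```rust
/// Before the edit: `y` is computed from the only `x` in scope.
fn before_edit() -> i32 {
    let x = 3;
    let y = x + 1;
    y
}

/// After the edit: someone introduces a second `x`, intended as a local
/// scratch value. The compiler accepts it, and `y` silently changes.
fn after_edit() -> i32 {
    let x = 3;
    let _first_use = x * 2; // stands in for the elided code that uses the first x
    let x = 4;              // meant to be local and not used elsewhere
    let _scratch = x * 10;  // the intended use of the new x
    let y = x + 1;          // now reads the wrong x
    y
}
```

Without shadowing, the second definition would have been rejected, which is the guarantee being described.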
another consideration is that Elixir allows reassignment (or "redeclaration") and apparently it has mixed reviews from Elixir users in terms of whether they like it or don't
neither of which are outright deal-breakers, they're just serious considerations
I'm still not sure whether shadowing (and if so, in what form) is the right choice for Roc, especially considering Elm doesn't have it and the demand for Elm to introduce it is basically zero (so what's different about Roc that's creating the demand? How much of it is familiarity, how much is different use cases between Elm and Roc, how much of those are due to the language being relatively new - and will those use cases become less and less common over time? etc.)
How does elm do state and value generation like what we do for roc rand? I think pipeline fixes most things, but not that case.
I think i really only hit this in situations like that. Maybe they are less common in elm, maybe there is a different solution.
usually people don't pass around the seed, but rather compose together Random.Generator values (kinda like how you chain Task values together - this can also be done in Roc)
e.g. Random.andThen works the same way Task.await does
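A rough sketch of that generator-composition idea, written in Rust for illustration. The `Gen` type, its methods, and the toy linear congruential step are all assumptions made for this example; this is not Elm's Random API or any Roc library.

```rust
/// A generator is a function from a seed to a value plus the next seed.
struct Gen<T>(Box<dyn Fn(u64) -> (T, u64)>);

impl<T: 'static> Gen<T> {
    /// Run the generator with an explicit seed.
    fn run(&self, seed: u64) -> (T, u64) {
        (self.0)(seed)
    }

    /// Chain a second generator that depends on the first value.
    /// The seed is threaded here, once, instead of at every call site.
    fn and_then<U: 'static>(self, f: impl Fn(T) -> Gen<U> + 'static) -> Gen<U> {
        Gen(Box::new(move |seed| {
            let (value, next) = self.run(seed);
            f(value).run(next)
        }))
    }

    /// Transform the generated value without touching the seed.
    fn map<U: 'static>(self, f: impl Fn(T) -> U + 'static) -> Gen<U> {
        Gen(Box::new(move |seed| {
            let (value, next) = self.run(seed);
            (f(value), next)
        }))
    }
}

/// Toy pseudo-random step (a linear congruential generator).
fn next_u64() -> Gen<u64> {
    Gen(Box::new(|seed| {
        let next = seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (next >> 33, next)
    }))
}

/// Two values in sequence, with no seed1/seed2 names at the call site.
fn pair() -> Gen<(u64, u64)> {
    next_u64().and_then(|x| next_u64().map(move |y| (x, y)))
}
```

Calling `pair().run(42)` yields the two values and the final seed; the intermediate seed never gets a name, so there is nothing to shadow or to number.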
I'm not a fan of allowing reassignment. It seems to go against the functional programming philosophy and it's nice to be absolutely certain you only have one definition.
If we only use it for cases similar to random generation I would expect users to be surprised by it as well, because it would be a rare sight.
It also seems like it could be a source of tricky bugs.
I'm in favour of shadowing values that are defined inside a function scope.
This is basically the only thing that ever bit me using Elm because the solution there is to go for thing1 thing2 or even worse thing_ thing__.
Shadowing never caused me any problems in elixir (there is no concept of global variables, only modules and functions, so things are shadowed locally by default)
This is basically the only thing that ever bit me using Elm because the solution there is to go for thing1 thing2 or even worse thing_ thing__.
Can you give an example of code like this? I'm wondering if shadowing is the best solution in cases like this.
I definitely think going state1, state2, state3 is more error prone than reassigning to the same state variable. Also, i think this is made more complicated because many people will not know the generator pattern. So anything with state and values will hit this. I have already written many bugs because of this.
yeah, I buy that - I just wonder why it doesn't happen in Elm too :thinking:
or at least, not often enough for there to be as many requests for shadowing as we're seeing
Maybe most cases of this are dealt with by libraries? And they all know the generator pattern?
could be!
In situations where you would use state1, state2, state3 would it not be best to write a function like we do with randomList here?
Not normally. Generally I am generating multiple types with different names.
for example:
ar1 = new data
{state: ar2, value: str1} = arbitraryStr ar1
len1 = Str.countUtf8Bytes str1
{state: ar3, value: str2} = arbitraryStr ar2
len2 = Str.countUtf8Bytes str2
{state: ar4, value: reference1} = ratio ar3 1 2
{value: reference2} = ratio ar4 1 2
This is generating 2 strings and 2 bools
End goal was to get 2 random strings, their lengths, and 2 bools of whether or not to do something special related to them. Specific to this function and its API.
Generally it makes sense to name things that are different with different names (and it's useful to have this assurance when reading code). As a time saver, it's easier to just write state1, state2 to save from describing the actual difference. Having less descriptive names is convenient but leads to possible errors of selecting the wrong name.
Brendan Hansknecht said:
I definitely think going state1, state2, state3 is more error prone than reassigning to the same state variable. Also, i think this is made more complicated because many people will not know the generator pattern. So anything with state and values will hit this. I have already written many bugs because of this.
To address this...
Could we have roc do both: 1. insist that different things are named differently and 2. provide some system that would mean that the previous name may no longer be used either.
For instance:
state.1
state.2 = state.1 ...
The existence of state.2 would then mean that state.1 is not available.
That's a nice construct but it may be confusingly similar to a record access or tuple. I'd prefer to make the generator scenario work well with current roc or introducing as little complexity and/or new syntax as possible.
(syntax was just to get the point across) This could be done with state1, state2 or probably many other ways.
I don't think reasonable descriptive names exist for this code.
originalState, stateAfterCreatingFirstStr, stateAfterSecondStr, ...
Why are those names unreasonable? Too much typing?
I think it's ok for typing but in my opinion long names make the code look dense and complicated.
I think they mostly add noise and make the code harder to follow. They are describing what you already see by reading the code. So they essentially are repeating the rest of the line, but in non-tested english names.
arbitraryGeneratorNew = new data
{state: arbitraryGeneratorAfterFirstStr, value: str1} = arbitraryStr arbitraryGeneratorNew
len1 = Str.countUtf8Bytes str1
{state: arbitraryGeneratorAfterSecondStr, value: str2} = arbitraryStr arbitraryGeneratorAfterFirstStr
len2 = Str.countUtf8Bytes str2
{state: arbitraryGeneratorAfterReferenceBool, value: reference1} = ratio arbitraryGeneratorAfterSecondStr 1 2
{value: reference2} = ratio arbitraryGeneratorAfterReferenceBool 1 2
or with state to at least make the name smaller:
stateNew = new data
{state: stateAfterFirstStr, value: str1} = arbitraryStr stateNew
len1 = Str.countUtf8Bytes str1
{state: stateAfterSecondStr, value: str2} = arbitraryStr stateAfterFirstStr
len2 = Str.countUtf8Bytes str2
{state: stateAfterReferenceBool, value: reference1} = ratio stateAfterSecondStr 1 2
{value: reference2} = ratio stateAfterReferenceBool 1 2
I think when writing code things tend to seem obvious but long names like that do make it easier for someone to understand what's happening when looking at someone else's code.
I do not think so at all in this case. This is a background state that would generally be best to thread in the background and forget about.
There is no real value to a reader or difference between stateAfterFirstStr and stateAfterSecondStr.
Also, you know that it is stateAfterFirstStr because you can read the rest of the line of code: {..., value: str1} = arbitraryStr stateNew
Variables should not encode the order of use or the imperative transitions in code. Their names should just say what they are.
This is also brittle because I can't add a new value in the middle without renaming multiple things.
Well having descriptive names is kind of a tangent.
You could currently write v1, v2,v3....but with the disadvantages you point out.
(I do find it easier to understand with descriptive names though).
It would be nice for the editor to hide and automatically handle the generators in situations like this but that's not a perfect solution either.
@Brendan Hansknecht
Have you already shown how you would like the code to look (with shadowing?)
Would be this:
state = new data
{state, value: str1} = arbitraryStr state
len1 = Str.countUtf8Bytes str1
{state, value: str2} = arbitraryStr state
len2 = Str.countUtf8Bytes str2
{state, value: reference1} = ratio state 1 2
{value: reference2} = ratio state 1 2
Isn't this a pretty special case then where randomness (arguably) negates the difference between variables?
I think that is a common use case, but you could also be building up or using a state without randomness. Anytime you want part of a value/return type, this will come up. Everytime you see a pipeline, this is implicitly coming up, it is just that pipeline syntax avoids the naming issue.
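For comparison, this is roughly how the numbered and shadowed versions above look in Rust, where shadowing is allowed. `step`, `arbitrary_str`, and `ratio` are hypothetical stand-ins for the generator functions in the thread, not a real library.

```rust
// Hypothetical stand-ins for the generator functions discussed above.
fn step(state: u64) -> u64 {
    state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407)
}

fn arbitrary_str(state: u64) -> (u64, String) {
    let next = step(state);
    (next, format!("s{}", next % 1000))
}

fn ratio(state: u64, num: u64, den: u64) -> (u64, bool) {
    let next = step(state);
    (next, (next >> 33) % den < num)
}

type Output = (String, usize, String, usize, bool, bool);

/// Numbered names: every intermediate state needs its own identifier.
fn generate_numbered(seed: u64) -> Output {
    let ar1 = seed;
    let (ar2, str1) = arbitrary_str(ar1);
    let len1 = str1.len();
    let (ar3, str2) = arbitrary_str(ar2);
    let len2 = str2.len();
    let (ar4, reference1) = ratio(ar3, 1, 2);
    let (_, reference2) = ratio(ar4, 1, 2);
    (str1, len1, str2, len2, reference1, reference2)
}

/// Shadowed name: only the latest `state` can ever be referenced.
fn generate_shadowed(seed: u64) -> Output {
    let state = seed;
    let (state, str1) = arbitrary_str(state);
    let len1 = str1.len();
    let (state, str2) = arbitrary_str(state);
    let len2 = str2.len();
    let (state, reference1) = ratio(state, 1, 2);
    let (_, reference2) = ratio(state, 1, 2);
    (str1, len1, str2, len2, reference1, reference2)
}
```

Both functions compute the same result; the difference is that in the numbered one, passing `ar2` where `ar3` was meant still compiles.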
@Brendan Hansknecht Do you think it's worth it to sacrifice being able to confidently rely on a thing with a certain name having the same value? i.e. that it hasn't changed in the intervening lines.
there could also be a simple syntax to indicate that something is shadowed/shadowable: ~state
I am totally open to some form of specific syntax here. Actually probably could be pretty awesome as part of the name (though maybe really strange to read at first).
I do like this direction
~state = new data
{state: ~state, value: str1} = arbitraryStr ~state
len1 = Str.countUtf8Bytes str1
{state: ~state, value: str2} = arbitraryStr ~state
len2 = Str.countUtf8Bytes str2
{state: ~state, value: reference1} = ratio ~state 1 2
{state: ~state, value: reference2} = ratio ~state 1 2
In this code, you know that only ~state can be shadowed, no other variables
Also, still the restricted shadowing rules would apply. You couldn't do:
~state = new data
List.map mylist \x ->
    {state: ~state, value: str1} = arbitraryStr ~state
    Str.concat x str1
perhaps the first use of a shadowed variable could be used plainly (without the annotation) to make clear it is the first use?
The reason I made it explicit is that it is essentially me saying, this variable is allowed to be shadowed.
That way, if I see:
state = new data
I know that it can never be shadowed.
So upon seeing the name anywhere in the code, you know exactly what to expect
Yes, but I'm not sure that's necessary if every shadow is annotated... wouldn't it be nice to know where the original var is?
...otherwise there could always be a shadow (lurking :) ) in the code above.
Good point
I was thinking about the reverse. Accidentally using it then it gets shadowed
state = 3
...
y = state + 3
...
One day someone adds:
state = 3
...
~state = state + 1
...
y = state + 3
...
What is y now? What should it be?
Essentially what happens if i miss that it was shadowed and the value changes on me?
but that would be an error because the last state has been shadowed out by ~state
So once ~state exists, state can no longer be used?
yes
state is shadowed by ~state (and every successive ~state)
That's fair... I kinda think adding the rename once is just an unnecessary inconsistency.
It really doesn't matter if another shadow is above:
Guaranteed shadow above
~state = ~state + 1
...
Or the other case where i simply don't care if there is a shadow above. It doesn't affect me:
~state = 7
...
For tooling it could matter to make it easy to find the first definition.
Would it make a difference? Either way, it is the first time the name appears
an analogy to this direction is how Zig deals with shadowing
which is to say, shadowing is banned, but you can opt into allowing variables to be reassignable or not on a per-variable basis
(using const or not)
I'm open to the sigil idea (I used to do Perl long ago, and I have fond memories of the sigils being an extremely concise way to tell things about a particular variable) although I do remember there being some push-back (much) earlier in this thread regarding sigils in general
I programmed in go for a while where capitalization mattered. I thought it was a terrible idea at first, but i came to respect it for things that you would prefer to know at a glance. So i am open to a sigil because of that.
a thought about the ~ prefix sigil specifically: -x looks fine, but -~x looks very weird to me :sweat_smile:
same with !foo vs !~foo
maybe a ! suffix since reassignable variables have a kind of imperative feel to them?
e.g. state! = ...
there's also always the old $ prefix, e.g. $foo = ... and then -$foo or !$foo
which I suppose would feel familiar in that in languages which use the $ prefix, those variables are always reassignable :big_smile:
$ is a bit more obvious than exclamation, so it depends on what is wanted with that.
...or exclamation could be used as a prefix (but then it looks like not)
opposition to sigil earlier in the conversation:
Folkert de Vries said:
did you ever see a language use a sigil like that and think that was a good idea?
Folkert de Vries said:
also sigils are hard to search for
Ayaz Hafiz said:
I don't love the idea of a keyword/sigil, tbh. It feels like another thing for a developer to have to keep in their head, for nebulous value - now I need to care about the semantic value of a variable, and whether it's shadowed or not, sort of like let vs const or mut in some languages. It puts a toll on the reader and I don't see how it's better, from a reader's perspective, than explicitly allowing shadowing
Having a sigil is better than explicitly allowing shadowing because it allows us to keep the guarantees that an unshadowed definition hasn't changed in intervening lines of code. The sigil makes it explicit where shadowing is happening.
For people like me, who are essentially opposed to shadowing, having a sigil (or keyword I suppose) is a nice compromise. I don't have to use it. I get the benefits of having definitions as constants. If someone else uses it then I'm not surprised by it (because I see the sigil). The sigil can provide all the desired benefits for those who wish to use shadowing.
based on everything that's been discussed so far, I'm convinced that both of the following are true:
that said, I don't think it automatically follows that the only design that makes sense is one that supports redefining/shadowing in some contexts and not others; another possible answer is "despite the fact that both of these have significant value, that value doesn't justify the cost of [a particular design]"
Hey @Richard Feldman, what is the minimal amount of code to convert this to the backpassing-friendly syntax? Also, would it be usable with tasks? Or would the two forms of backpassing conflict (iirc they would type mismatch)?
state = new data
{state, value: str1} = arbitraryStr state
len1 = Str.countUtf8Bytes str1
{state, value: str2} = arbitraryStr state
len2 = Str.countUtf8Bytes str2
{state, value: reference1} = ratio state 1 2
{value: reference2} = ratio state 1 2
Where we want something like (of course with the equivalent of await as needed):
str1 <- arbitraryStr
len1 = Str.countUtf8Bytes str1
str2 <- arbitraryStr
len2 = Str.countUtf8Bytes str2
reference1 <- ratio 1 2
reference2 <- ratio 1 2
Really trying to understand the alternative to shadowing or being stuck with state1, state2, state3.
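For comparison, here is a rough sketch in Rust (which this thread mentions as allowing redeclaration) of the same state-threading shape, once with numbered names and once with shadowing. It is only an illustration of the trade-off, not a proposal for Roc syntax. `next_u32` is a hypothetical stand-in for generators like arbitraryStr and ratio; it is not from any library.

```rust
// Hypothetical generator (assumption for illustration): a tiny linear
// congruential step that returns the next state together with a value.
fn next_u32(state: u64) -> (u64, u32) {
    let state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (state, (state >> 32) as u32)
}

// Numbered names: every old state stays in scope, so a later line could
// pass `state0` again by mistake and the compiler would accept it.
fn two_values_numbered(seed: u64) -> (u32, u32) {
    let state0 = seed;
    let (state1, a) = next_u32(state0);
    let (_state2, b) = next_u32(state1);
    (a, b)
}

// Shadowing: each `let` hides the previous `state`, so a stale state
// cannot be named after the line that replaces it.
fn two_values_shadowed(seed: u64) -> (u32, u32) {
    let state = seed;
    let (state, a) = next_u32(state);
    let (_state, b) = next_u32(state);
    (a, b)
}
```

Both functions compute the same pair; the difference is only in which names remain reachable afterwards.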
Richard Feldman said:
that said, I don't think it automatically follows that the only design that makes sense is one that supports redefining/shadowing in some contexts and not others; another possible answer is "despite the fact that both of these have significant value, that value doesn't justify the cost of [a particular design]"
Sigils (or a keyword) are a compromise in terms of complexity, but without them you have to decide which half of the users to make unhappy.
well but what about the users who are unhappy with sigils? :big_smile:
People who don't like sigils may be assuming they can have it all their way in terms of shadowing or no shadowing.
Maybe if they realized that we are forced to choose:
Having sigils along with their preference of shadowing (and the added complexity).
or
Having no sigils, and a simpler language, but possibly not having their preference of shadowing.
Then perhaps people would not be so opposed to sigils.
Having half of people unhappy with shadowing or no shadowing is really a lot of unhappy people!
thinking about it personally, I think if the sigil approach existed, I would occasionally use it
I don't think I'd consider it the type of feature that's like "you can use this if you need it, but it's a code smell so try to avoid it" - rather, I imagine putting it in the tutorial as "you should default to not using it, but here are the circumstances where it can improve your code"
The really remarkable and wonderful thing about using sigils to mark shadowing is that, as someone who is opposed to all-out shadowing (and I think this goes for @Anton as well), I'm OK with the solution (except that there is a complexity trade-off I guess, but I'm not in a position to measure the consequence of this), and @Brendan Hansknecht can have all the benefits of shadowing that he is asking for (I believe). I wonder how much @Folkert de Vries and @Ayaz Hafiz (and others with a similar perspective) are opposed to having sigils to mark a name as shadowable or shadowing (and the consequent language complexity), in light of how it does seem to resolve a major divide in terms of shadowing or no shadowing (i.e. it gives us kind of the best of both worlds: confidence about when variables are constants, and shadowing with awareness of when shadowing is at play).
Richard Feldman said:
I don't think I'd consider it the type of feature that's like "you can use this if you need it, but it's a code smell so try to avoid it" - rather, I imagine putting it in the tutorial as "you should default to not using it, but here are the circumstances where it can improve your code"
Some people would still go to town and just use shadowing everywhere though.
as in they'd use the sigil everywhere?
yes
eh I kinda doubt that
it'd kinda stand out visually
I wouldn't expect people to just use that all over the place when (for example) top-level declarations couldn't use it
I like the sigil idea... but I fear opening another syntax might have its own side effects. is this the only sigil? are we creating a pattern? what are other things that can eventually use this syntax? what is the mindset the user is creating when seeing a sigil in Roc? just assuming "it's a one time thing" can easily turn into a can of worms in the future (but then again... I like it :sweat_smile: just trying to play the devil's advocate here)
Yeah, all of those are very legit concerns
Also, I just wrote up both versions of random: the backpassing generator version and the regular version.
Here is what the end user functions look like:
repeatedState = \{} ->
    state0 = new 1234
    (state1, u1) = randU64 state0 10 20
    (state2, f1) = randF64 state1 0 1
    (_, str) = randStr state2
    u1Str = Num.toStr u1
    f1Str = Num.toStr f1
    "\(str): \(u1Str), \(f1Str)"

genState = \{} ->
    generator =
        u1 <- randU64 10 20 |> andThen
        f1 <- randF64 0 1 |> andThen
        str <- randStr |> andThen
        u1Str = Num.toStr u1
        f1Str = Num.toStr f1
        constant "\(str): \(u1Str), \(f1Str)"
    state = new 1234
    (_, out) = generate state generator
    out
Messing with the examples, I think the biggest issue with the generator form is that I don't think it can be used with Tasks.
You can't do:
u1 <- randU64 10 20 |> andThen
maxFloat <- Stdin.readFloat |> Task.await
f1 <- randF64 0 maxFloat |> andThen
It will lead to a type mismatch. So generators work in isolation (with a bit more verbosity), but are limited in what they can represent. So I don't think the generator syntax fixes the issues we are discussing here.
yep, that's accurate!
doesn't compose with tasks
You could make it composable with Task if you really wanted to, but it wouldn't be the prettiest :upside_down:
:thinking: Should I even ask how?
Basically, you can make generators that return tasks: Random.Generator (Task a err)
and then have a few helper functions to chain them
Something like this:
genState = \{} ->
    generator =
        u1 <- randU64 10 20 |> andThenTask
        maxFloat <- Stdin.readFloat |> Task.await
        f1 <- randF64 0 maxFloat |> andThen
        u1Str = Num.toStr u1
        f1Str = Num.toStr f1
        constant "\(u1Str), \(f1Str)"
    state = new 1234
    (_, out) <- taskGenerate state generator |> Task.await
    out

andThenTask : Generator a, (a -> Task (Generator b) err) -> Generator (Task b err)
taskGenerate : State, Generator (Task a err) -> Task a err
As I said, not pretty
In this case you don't even need this because Stdin.readFloat doesn't use u1, and you could just do it outside of the generator
oh, true. also, quite intriguing, That isn't even that bad.
Yeah, probably puzzling for a beginner, but something I might be ok doing if I had to generate a lot of numbers that depended on results of effects
For sure.
Georges Boris said:
I like the sigil idea... but I fear opening another syntax might have its own side effects. is this the only sigil? are we creating a pattern? what are other things that can eventually use this syntax? what is the mindset the user is creating when seeing a sigil in Roc? just assuming "it's a one time thing" can easily turn into a can of worms in the future (but then again... I like it :sweat_smile: just trying to play the devil's advocate here)
Is @ not a sigil already? (For opaque types.)
true!
I found another use case where not having shadowing is really inconvenient.
I am trying to write a parser that is in DOD form. So instead of using a recursive node struct, it has a list of nodes and uses U32 to index into it.
So instead of:
Node: [
    Let {ident: Node, expre: Node}
    Ident Str
    ...
]
It would be:
Node: [
    Let {ident: U32, expre: U32}
    Ident Str
    ...
]
As such, to generate a new node, you have to append it to the node list. This means that you have a mutable node list that needs to be passed into every function of a recursive descent parser and returned mutated back up the stack. That list will get updated multiple times in a single function call. So it requires multiple names and makes the code less readable.
On top of that, you have an error list and a token index, which all are mutable and passed into and out of every function. So many things that all have multiple references in a single function.
Note: part of this pain would be alleviated if #2836 was fixed. Then all the mutable data could be passed around in one record. Until it is fixed, they all need to be separate or the functions take a huge perf hit due to tons of unnecessary copying. That said, in both cases, shadowing would be quite helpful.
I haven't read this entire thread, but Richard asked in the AoC channel that we shared our thoughts, so here are mine:
In general, I agree that shadowing most likely makes code harder to understand in almost all situations, since now there might be multiple places you need to check to figure out where a value might come from, and you also have to parse their scope in your head to actually figure out what is happening (goto-definition saves lives here). This is especially true if you allow shadowing directly on the same level, and when I first learned FP it actually took me a long time to understand the difference between
let x = 1
let x = x + 1
and mutable variables (The first functional language I learned was F#). I think I only really understood the difference after I saw recursive definitions in Haskell (like ones = 1 :: ones).
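The distinction between rebinding and mutation can be made observable with a closure. This is a small sketch in Rust rather than F#, chosen only because the thread already discusses Rust's redeclaration; the function names are made up for illustration. A closure created before the second `let` keeps seeing the first binding, while a closure over a mutable cell sees the update.

```rust
use std::cell::Cell;

// Shadowing: the second `let x` introduces a new binding. The closure
// captured the first binding, which still exists and still holds 1.
fn shadowing() -> (i32, i32) {
    let x = 1;
    let get_x = || x;
    let x = x + 1;
    (get_x(), x)
}

// Mutation: there is only one `x`. The closure reads the same cell that
// `set` later overwrites, so it observes the new value.
fn mutation() -> (i32, i32) {
    let x = Cell::new(1);
    let get_x = || x.get();
    x.set(x.get() + 1);
    (get_x(), x.get())
}
```

Here `shadowing()` returns `(1, 2)` and `mutation()` returns `(2, 2)`.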
However, having done Elm for a few years now, I ran into these situations where it would have been really convenient or beneficial to have shadowing:
(Keep in mind that not being able to shadow is a really minor problem with a trivial fix, and the benefits are quite substantial I think - like being able to move definitions around while guaranteeing that everything will still work)
Introducing a new top-level definition:
Let's say I create a module with an opaque type that wraps numberOfKittens. In every function, I unwrap my opaque type and called the value numberOfKittens. Later, I decide that some users of my module also really need to know the number of kittens there are, so I expose a new accessor for this value. Since it will be exposed, I would really like to have a nice name for this function. So now I have to go through basically every other function in my module and rename the variable into something like numberOfKittens_ or kittenCount to free that name up to be used on the top-level.
Unwrapping values to continue processing:
This comes up in 2 situations: When dealing with Maybe and Result types, and when pattern-matching on tuples (like for example returned by Str.splitFirst) to then continue processing stuff:
line <- Stdin.line |> Task.await
when line is
    Input line -> # well, I have to come up with a new name now... maybeLine? theLine?
        { before: id, after: data } <- line |> Str.splitFirst " " |> Result.try
        id <- id |> Str.toNat |> Result.try # again, probably going to rename the first one to rawId or idStr
    _ -> crash "whatever"
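Threading state:
This comes up almost never. Most of the time, all the state-threading is already abstracted away for you into some nice helpers anyways. But when you are manually passing a random seed around or you implement a custom combinator, it might sometimes make the code actually easier to understand and safer if there was shadowing, because it would prevent you from using an old value that you should no longer use:
(result1, state1) = doSomethingComplicated value initialState
(result2, state2) = rememberToUseResultNow result1 state1
(result3, state3) = hopeYouDontMessThisUp result2 state2
(combineResults result1 result2 result3, state3)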
I know this case directly contradicts what I said in the beginning, so maybe this is entirely fine, since there is already this feel of "I need to be careful here" around it, at least for me.
go or aux function that takes the same arguments as the top-level function + some initial state for the loop. I need to pass the arguments of the top-level function again, since they will change on every iteration. I can't name these arguments the same thing though, since that would shadow them! So not only do I need to come up with new names (calling the outer ones initialX usually does the trick), I also need to make sure that I don't accidentally use the initial values inside of the loop, when I should have been using the local ones instead. The common wisdom of course is to make another doSomethingHelper function at the top-level instead.
3 is definitely the main case that I struggle with when I hit it.
x = 2
y = 5
z = x * y
is the same as
z = x * y
x = 2
y = 5
(I'm not sure if there is a term for this, other than just "being declarative")
Shadowing would make this complicated, because shadowing implies order. So, for example:
z = x * y
x = 2
y = 5
x = y + 2
What would z equal in this case?
This is probably the biggest benefit of not allowing shadowing - you can re-order and pull things out however you like, and it guarantees that in the end, the result will be the same (as long as it still compiles).
To me, that trumps any annoyance of sometimes having to name variables x2 or idStr.
I've just checked how it works in Haskell: in a do block, order matters and you can shadow variables, whereas in a where clause the order doesn't matter and you can't shadow variables (so just like Roc). I prefer the Roc approach, it just fits FP better for me.
joshi said:
- Threading state:
This comes up almost never. Most of the time, all the state-threading is already abstracted away for you into some nice helpers anyways. But when you are manually passing a random seed around or you implement a custom combinator, it might sometimes make the code actually easier to understand and safer if there was shadowing, because it would prevent you from using an old value that you should no longer use:
(result1, state1) = doSomethingComplicated value initialState
(result2, state2) = rememberToUseResultNow result1 state1
(result3, state3) = hopeYouDontMessThisUp result2 state2
(combineResults result1 result2 result3, state3)
I think this case wouldn't work with shadowing, because you need those results at the end. And if you didn't need them, then you could use pipes to pass the state and result. Not saying such cases don't happen though.
I was thinking about only shadowing state, since you definitely should use the previous state at every step. I was maybe not as good as I thought at coming up with a "real-feeling" example while typing this :sweat_smile:
I know it is not advised to try and directly port imperative code to a functional language like Roc, but I wanted to port some benchmarks over and model them as closely as possible to the external benchmarks. This is by far the most I have ever wanted shadowing; this code is so painful to write.
It is managing 3 rngs along with other variables that are also versioned throughout the function.
I do find it easy to read, but definitely a good snippet to try out with possible shadowing implementations.
I was searching for a way to bind to the entire record with an optional field after setting the optional value inside a pattern match... and I ended up back at my question lol. Is there a way to do this now?
Richard Feldman said:
Shritesh Bhattarai said:
also, can I haz Rust's @ in patterns as well :pleading_face:
oh I think we should totally have that, just with as instead of @ - e.g. { x: blah } as rec -> ...
Essentially, I want this
foo = \{id ? "default value"} as rec ->
    doSomethingWith rec
I don't think it's implemented yet :sweat_smile:
just saw a concrete shadowing benefit happen in Unison live-coding: Paul spent some time debugging a problem that turned out to be because he hadn't shadowed away some stale state, which he realized here:
https://www.youtube.com/watch?v=BDe28veTf1U&t=1040s
After fixing the bug he said "I did not follow my own advice, which is to shadow variables that you're no longer using."
Did we ever get a clear verdict on yay or nay for shadowing? Since I've been doing AOC and writing some Roc experiments after a long break, I've been trying to really take note of the places where I'm finding a lot of friction.
Shadowing is probably number one, maybe tied with tag inference errors for frustrations. But shadowing is definitely the number one cause of bugs in my code right now. Exactly what Richard describes here, using stale state because I had no way to "hide the variable that shouldn't be used"
I'd be keen to do the implementation work here if there is some consensus that it's something we want.
Sorry ignore this I just saw it was added to planned breaking changes, with mention it's blocked by the canonicalization rewrite. I guess I'll try to improve tag union errors instead :sweat_smile:
As a note, I think shadowing as it is written here for any value is not planned now. I think it is planned to be reserved for variables with an _ after the name.
So interesting the differences in opinions. Andrew Kelley doesn't want shadowing in Zig because he views it as a major source of bugs. And here a lot of people arguing that NOT shadowing is a major source of bugs. I think both can be true, but how do you know which way to go?
I think that visual feedback can help. If I can see in my editor, that an identifier shadows another one, I will probably be suspicious and less likely to create a shadow bug.
But you cannot get visual help telling you that you used the wrong value (acc instead of acc2).
I think Zig might be a bit different in that many cases where a reasonable argument can be made that shadowing would avoid bugs in Roc, are cases where Zig would use mutation. For instance: in a pure function that generates a bunch of random values we might write this in Roc using shadowing:
randomPerson = \init_seed ->
    var seed_ = init_seed
    age, seed_ = random_int seed_
    name, seed_ = random_str seed_ 10
    { name, age }
In Zig you would probably pass a reference to the seed into the random functions and let it mutate. The motivation to shadow would fall away.
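A minimal sketch of that mutation-based alternative, written in Rust rather than Zig so it stays in one language with the other sketches in this thread. `random_int` and `random_pair` are hypothetical names and the generator is a toy; the point is only that once the seed is passed by mutable reference there is a single `seed` variable and nothing to shadow.

```rust
// Hypothetical generator (assumption for illustration): advances the seed
// in place and returns a value derived from the new seed.
fn random_int(seed: &mut u64) -> u32 {
    *seed = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (*seed >> 32) as u32
}

// The caller owns one mutable seed and lends it to each call, so there is
// never an old seed value left around under another name.
fn random_pair(init_seed: u64) -> (u32, u32) {
    let mut seed = init_seed;
    let first = random_int(&mut seed);
    let second = random_int(&mut seed);
    (first, second)
}
```

The function as a whole is still deterministic from the caller's point of view: the same `init_seed` always yields the same pair.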
I quite like the shadowing approach in the latest proposal. It allows shadowing but, with the var keyword and name constraint, creates just enough friction that I'd expect most people to avoid shadowing unless they have a strong use case.
Great point @Jasper. This is why Elixir allows shadowing by default (to them you are just rebinding). It's a bit different for them since the = operator is really just a pattern match that can crash a process. And they also let you "pin" the identifier if you want to say "don't bind this, make sure the matched value EQUALS the value of the existing binding". Which I think is a thing that Roc could use in when (obviously can't do that in normal assigns).
Note, zig has mutation
That removes the need for shadowing
Roc does not have mutation and that is why it needs shadowing
The _ suffix is trying to play the middle ground. Shadowing is off by default but can be opted in
This is very similar to the mut keyword
The one piece this is missing that zig mut enforces is that the type stays the same. We probably should enforce that as well
Also, just realized I repeated what @Jasper Woudenberg said. Should have read all replies before replying myself
Yeah, love that enforced immutability allows for safe use of a very useful practice that's dangerous (or at least confounding) in places with mutability
Brendan Hansknecht said:
As a note, I think shadowing as it is written here for any value is not planned now. I think it is planned to be reserved for variables with an _ after the name.
Ahh yeah, I do remember seeing that. Seems like a fine solution for now, and something that hasn't been explored elsewhere
I like the idea of the identifier being forced to carry information about being rebindable down the body of the function, but _ does feel like it will be a little overloaded in semantics in the language since it’s also used alone or as a prefix to signify it’s discarded or unused
But I realize there aren’t a lot of other symbols available
Of the rest of the possible symbols, I think ' is nice but is more commonly used to mean prime - that’s too close in intent and might be confusing
‘$’ is just subjectively not very aesthetic, especially as a suffix.
& might be nice as a prefix and there is a little bit of an overlap in semantics between this and a reference.
Nothing else seems ergonomic and reasonable
Just some thoughts on a silly bit of syntax
Yeah, it's hard to pick good symbols when balancing aesthetics, typeability (especially on non-us keyboards), and overlap with other languages in ways that would be confusing.
In one of Richard's talks I remember him saying something like: 'An underscore at the start means I'm not using this. An underscore at the end means I'm using this more than once'. I like this mnemonic, even though 'using more than once' is really 'assigning more than once' but it convinced me somewhat of using the symbol
yeah the pithy version is "underscore in front means unused, and underscore at the end means reused"
Something I just thought of though. We are changing to snake_case. I feel like this will make semantic underscores slightly less readable, but I'd have to see in practice
Great point @Kilian Vounckx and something I meant to include (given I just did the snake_case work you’d think it would have been top of mind)
Here's a slightly reworked version of Jasper's example above with all three different realistic symbols using snake_case
random_person = \seed ->
    var init_seed_ = seed
    age, init_seed_ = random_int! init_seed_
    name, init_seed_ = random_str! init_seed_ 10
    { name, age }

random_person = \seed ->
    var &init_seed = seed
    age, &init_seed = random_int! &init_seed
    name, &init_seed = random_str! &init_seed 10
    { name, age }

random_person = \seed ->
    var $init_seed = seed
    age, $init_seed = random_int! $init_seed
    name, $init_seed = random_str! $init_seed 10
    { name, age }
I think trailing _ is still fine. This is meant to be noticeable, but has no need to really stick out.
Fair. In any case, most editors can probably be configured to highlight them differently if wanted
Maybe we can make a tree sitter node for the trailing underscore @Eli Dowling ?
That’s true. You can make that sort of ident in tree-sitter assigned to a different highlight group
Sure could, either or both :)
I think making the entire identity different would be most helpful @Sam Mohr
Great idea!
Ahem, I meant identifier. Sorry on phone
Yep
Speaking of which, I need to get tree-sitter-roc set up in my neovim config. It’s jarring going over to Zed just for syntax highlighting
@Eli Dowling do you have a PR yet to add it to Mason?
Personally I would choose to only highlight the underscore.
Or I’m sorry, nvim-treesitter
There are not all that many colours available in most colour schemes , so you'd most likely end up with a conflict with something else. Overall, I'd probably emit both and let people do whatever they want
I don't personally use nvim, so feel free to go forth and make a pr with my blessing :sweat_smile:
So you’d ask people to add their own query for that? So fork it or can we add extra somehow?
Oh sorry, I don’t run in to many tree-sitter enjoyers that aren’t nvim users
Because I don’t know many EMacs users
:joy:
As a helix user, you're gonna make me cry
Or does Zed use it too?
Zed uses it
In fairness Sam, helix makes EMacs looks mainstream :wink:
No no, I'd make a sensible default, but I'd separate the underscore in the parse tree.
That would let folks customise it if they want to using an override.
Cool, I don’t know how easy that is to do in any editor without forking
Hey! I'm also a helix user, you're outnumbered in this conversation bud!
Well this is a Rust project, there is some selection bias
I love my Lua and Vim compatibility
Very easy. in nvim you just add a file called something like roc.scm to a folder in your neovim config and I think you need a comment at the top.
Ok never done it before
I’ve written TS grammars, just never customized one from a plugin without forking
That lets you do overrides to any existing highlight queries. It's pretty cool.
Cool. I’ll try to put that PR up today and post it here
I'm not sure about if helix can do overrides or if you just need to copy the whole highlighter.
I actually use neovim for debugging because the TS inspector is fab
And don’t worry Richard, I think Zed is great for a GUI editor. But my Tmux/nvim workflow is in my soul
Yeah, zed seems cool but tbh... I hate to say it, it's kinda slow...
But everything is kinda slow compared to helix :sweat_smile:
But when my laptop is unplugged and locked at 400mhz on the CPU, helix is like butter and most other things... Still not too bad, but more like butter with sand and little rocks.
Also the helix editing model really does just feel much more intuitive to me.
Maybe if one day zed gets a helix mode I'll consider switching:sweat_smile:
Rendering ASCII is pretty easy compared to drawing the entire UI pixel buffer. Especially when run in a fast terminal
But helix doesn't have AI integration... What do they expect me to write CSS by hand, like a cave man???
For work, sadly crapping out mediocre php code is just magnitudes faster with ai assistance. So I'm back to vscode a lot of the time these days.
Anthony Bullard said:
In fairness Sam, helix makes EMacs looks mainstream :wink:
I really wonder what the numbers on this are. I wouldn't be surprised if active monthly users was similar.
Eli Dowling said:
locked at 400mhz on the CPU
That is intense underclocking. Especially if it is locked
I'm also a helix user
There are a lot of us in roc. I remember first adopting it and no one knew what it was. The numbers have grown much higher since then.
a quick thought about highlighting the underscore on its own - I think that might give the misimpression that it's a separate operator instead of part of the name, so personally I think highlighting the whole name differently would be best :big_smile:
Agreed
My reasoning towards that was similar
Does it even need different highlighting? Generally for other languages, they just highlight the equivalent of a mut keyword. So the variables are less distinct than what roc will have with the trailing underscore.
I think longer variable names, now that we're moving towards snake_case, will be less obviously shadowed than an underscore after a camelCase name
Sure, but doesn't change my statement at all
So scanning code will be slightly more difficult
I think we can do better than other languages
You're right that just having the underscore is already an improvement over the status quo
Also, if we have enough colors in default schemes such that it is barely different, sounds fine, worried about it sticking out and needing extra config. I want to remind that originally we were gonna do shadowing without _. I really don't think these need to stick out at all.
Well, that was why I initially suggested just highlighting the final underscore to not make it overly conspicuous
Also, I think it will still be quite noticeable with long names. The important part is assignments. Even if you can't see the trailing _, you will notice the extra spacing at assignment.
I personally think it would make things like PR reviews a lot easier or more effective. But it’s not a hill I’d die on
This seems like a "let's try it" kind of thing
If people find it distracting, very easy to just not highlight anymore
:nod: (I really wish Zulip had gif emoticons, this nod emoji is so unsatisfying)
This is what I would have sent at work: 723a9c68cd665a45.gif
My desire for shadowing is not just restricted to mutable updates. It also comes up when writing code to recursively traverse a tree, for example:
explore = \node ->
    when node is
        Line node -> explore node
        Seq nodes -> List.map nodes explore
I could name that inner item innerNode or whatever, but if I do that, I can't guarantee that the code there _doesn't_ accidentally use the outer node arg. I have some code I was working on recently where I made this very bug (accidentally using node instead of innerNode). This is especially important as the code in the various branches gets more complicated.
This is somewhat similar to mutable rebinding in a for loop, since we are in this case iterating over a _tree_, and trying to update the current "state" variable to point to the next node we'll be looking at.
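For reference, this is roughly how the same traversal looks in a language that permits shadowing in pattern bindings. The sketch is in Rust; the `Leaf` variant and the `sum_leaves` function are assumptions added so the example has something to compute, and are not part of the code above. Inside the `Line` arm the pattern rebinds `node`, so the outer argument cannot be used there by accident.

```rust
// Assumed tree shape for illustration; only Line and Seq come from the
// example above, Leaf is added so the traversal returns a value.
enum Node {
    Leaf(u32),
    Line(Box<Node>),
    Seq(Vec<Node>),
}

fn sum_leaves(node: &Node) -> u32 {
    match node {
        Node::Leaf(n) => *n,
        // `node` here is the inner node; the outer `node` is shadowed
        // for the rest of this arm.
        Node::Line(node) => sum_leaves(node),
        // This arm binds `nodes`, so the outer `node` is still visible.
        Node::Seq(nodes) => nodes.iter().map(sum_leaves).sum(),
    }
}
```

Note that the shadowing is scoped to the single match arm; code after the `match` would see the outer `node` again.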
I think var should work here, because the idea is for patterns to be able to reassign to vars:
explore = \var node_ ->
    when node_ is
        Line node_ -> explore node_
        Seq nodes -> List.map nodes explore
Oh, interesting!
What if I have some code before/after the when, and I'd like that code to definitely always use the outer node?
There's definitely some Rust code I was writing recently for the roc formatter that looks like that
Before the when is fine.....after.....idk what our scope plan is. I think it would have to be the new inner node_. So it would implicitly bind to the outer scope.
Joshua Warner said:
What if I have some code before/after the when, and I'd like that code to definitely always use the outer node?
I'm pretty sure that in the given example, once you do Line node_ ->, the outer node_ has been covered up. But Seq would still be able to use it.
Right, which I guess means the current var design doesn't help here
Yeah, what does this become?
explore = \var node_ ->
    something = node_
    list =
        when node_ is
            Line node_ -> explore node_
            Seq nodes ->
                unused = node_
                List.map nodes explore
    List.append list node_
My gut feeling is this
explore = \node ->
    something = node
    (list, node2) =
        when node is
            Line innerNode -> (explore innerNode, innerNode)
            Seq nodes ->
                unused = node
                (List.map nodes explore, node)
    List.append list node2
Otherwise, it will be too bugprone
And I would say we have to explicitly ban any sort of shadowing through a lambda.
I think you're right, or else loops would also not really work
Also, instead of the tuple return, probably would map to a special mono node to set a symbol to a new value.
I think internally, we'd want to implement mutability, yes
That's my current thought for canonicalization, is that it should think of it as mutation, even though typechecking treats it as re-assignment
It's a bit nitpicky to get into the particulars, but long story short the "under the hood" won't just be mutation or re-assignment, but a bit of both between canonicalization and codegen
Yeah shadowing and rebinding look similar sometimes, but are different in important ways
Shadowing: the variable with this name has this new value for the rest of this scope (and child scopes)
Rebinding: the variable with this name will have this value for the rest of the scope it was introduced in, and that scope's children (including this one)
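A minimal Rust sketch of the two definitions above, offered only as an analogy: Rust's let in a nested block behaves like the shadowing case, and assigning to a let mut variable from a nested block behaves like the rebinding case. The function and variable names are made up for illustration, and this says nothing about how Roc would implement either.

```rust
// Shadowing: a new binding reuses the name. It lasts only until the end of
// the scope it was declared in, so the outer binding is unaffected.
fn shadowing() -> i32 {
    let count = 1;
    {
        let count = count + 10; // new variable, visible only in this block
        assert_eq!(count, 11);
    }
    count // still the outer binding
}

// Rebinding: the existing variable gets a new value, so the change is
// visible in the scope where the variable was introduced.
fn rebinding() -> i32 {
    let mut count = 1;
    {
        count = count + 10; // assigns to the outer variable
    }
    count
}

fn main() {
    println!("shadowing: {}", shadowing());
    println!("rebinding: {}", rebinding());
}
```

After the inner block ends, shadowing() still sees 1 while rebinding() sees 11.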
In SSA shadowing would create a new var and ensure all references to that name would use the new var for the rest of the scope
In SSA rebinding would….
Brains fried right now. Going to bed :joy:
(By var I mean virtual register, or whatever they call it in LLVM IR)
yeah the way I think of var is that it's "reassignment but not mutation"
in Rust, mut does both - it enables reassignment (e.g. inside a for loop you can reassign something declared outside the loop with mut to have a different value) but also it enables mutation (e.g. if a function has mut on one of its arguments, and you call that function, the thing you passed in may get changed just because you passed it in there)
var does the first thing but not the second thing
so you can put var outside a loop and then reassign it inside the loop, which causes the outer thing to change
but if you declare that a function accepts a var foo as an argument, callers still don't have to worry about passing anything in there potentially resulting in it being changed after the function call
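A small Rust sketch of the two things being separated here, as an illustration rather than a description of how var would be implemented. The function names are hypothetical. In Rust terms, the caller-visible change comes from passing a &mut reference, while a by-value mut parameter is probably the closest analogue to a var argument: it can be reassigned inside the function, but the caller's value stays the same.

```rust
// Reassignment: `mut` lets a loop body give an outer variable a new value.
fn sum_to(limit: u32) -> u32 {
    let mut total = 0;
    for n in 1..=limit {
        total = total + n; // the outer `total` changes
    }
    total
}

// Mutation through a reference: the caller's value can change as a side
// effect of the call.
fn push_zero(items: &mut Vec<u32>) {
    items.push(0);
}

// By-value `mut` parameter: reassignable inside the function, but the
// caller never observes the change.
fn bump_locally(mut n: u32) -> u32 {
    n = n + 1;
    n
}

fn main() {
    assert_eq!(sum_to(4), 10);

    let mut items = vec![1, 2];
    push_zero(&mut items);
    assert_eq!(items, vec![1, 2, 0]); // the caller's value was changed

    let original = 5;
    let bumped = bump_locally(original);
    assert_eq!((original, bumped), (5, 6)); // the caller's value is untouched
}
```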
Richard Feldman said:
so you can put var outside a loop and then reassign it inside the loop, which causes the outer thing to change
This is the thing that "feels" like mutation. I think it'll end up getting implemented as mutation in codegen
totally
the reason I'm avoiding using the term "mutation" to describe it is that usually mutation means two things, and this only enables one of them
It's a good idea to not say mutation
True, we wouldn't want to scare off any of the functional programming purists.
I've heard saying mutation too loud tends to make them scurry back into the dark rocky caves they came from.
well we have opportunistic mutation already
so I guess anyone who's scared of that is already out :big_smile:
Nah, that's mutation, that's locked away and hidden out of sight. It's safe, like seeing mutation at the zoo vs coming face to face with it on the savanna!
Last updated: Jun 16 2026 at 16:19 UTC