Shadowing & Redeclaration · ideas

There are times when being able to shadow a variable, either with the same type or a different type, would be both convenient and perhaps less error prone compared to needing to use multiple identifiers (or alternate techniques, such as magic number indices).

An example of this is incrementally consuming a list of word strings, where you might want to "reassign" the remainder of unconsumed words, at each step, back into a words variable of type List Str.

Another case would be parsing values or formatting values, where you might need to temporarily obtain a Str but ultimately need a U32, or where you have a U32 but ultimately need a Str. While pipelining can satisfy many of these cases, redeclaration after some processing steps might simply be clearer. In other cases, a prior intermediate result may no longer be needed (and juggling multiple [no longer used] names is tricky because naming is hard).

Rust has a redeclaration feature: although it doesn't allow assignment to immutable (default) variables, it does permit new declarations with the same name, and possibly a different type. iiuc, each such declaration essentially introduces an implicit nested scope.
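
For concreteness, a minimal Rust sketch of that redeclaration behavior. The function name `parse_port` and the sample input are made up for illustration.

```rust
// Each `let` introduces a new binding. The earlier binding of the same name
// becomes unreachable from that point on, and the type is free to change.
fn parse_port(raw: &str) -> u32 {
    let port = raw.trim();                 // port: &str
    let port: u32 = port.parse().unwrap(); // port: u32, shadows the &str binding
    let port = port + 1;                   // still u32, shadows again
    port
}

fn main() {
    println!("{}", parse_port(" 8079 ")); // prints 8080
}
```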

@Brendan Hansknecht reasonably suggested that shadowing of top-level declarations, as well as the name of any function which may get called recursively, should be prohibited, as doing so in either of these cases would likely be confusing.

Brian Carroll (Dec 23 2022 at 08:47):

I would like more lenient rules on shadowing between scopes, but I would really not like to have redeclaration. I think having the same name mean different things depending on where you are inside the same scope makes it harder to understand what's going on. I think of it as a design flaw in Rust and feel it's bad practice to use it. I've often thought it should trigger a Clippy warning or something.

Brian Carroll (Dec 23 2022 at 08:47):

On a more practical note, I think it is only really possible to have redeclaration in an imperative language where the sequence of lines of text corresponds to a sequence in time. In Roc and all other declarative, expression-based languages, there are no assignments, only "equations" that can be in any order. I don't know how we could implement redeclaration in the compiler if we wanted to.

Brian Carroll (Dec 23 2022 at 08:54):

But as for shadowing, I do often find myself surprised at the places Roc gives an error about it, and feel it's getting in my way. I wouldn't have expected to have this experience, but I do.

Brian Carroll (Dec 23 2022 at 08:56):

Kevin Gillette (Dec 23 2022 at 09:55):

Static single assignment is a pretty common compiler pass in which each assignment is given a unique name, even if in the original source, they share the same name. Presumably something similar could apply to Roc?

Even though, yes, Roc distills down to one large computation, we could treat each non-top-level let-style expression as introducing a nested scope, permit shadowing, and thus all identifier references would correctly resolve to the latest/deepest scope in which it was (re)introduced.
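
To make the renaming idea concrete, here is a rough Rust sketch of an SSA-style pass over a flat list of definitions. The `Def` representation, the `rename` function, and the `name_N` naming scheme are all assumptions made for illustration; this is not a description of how the Roc compiler represents or resolves definitions.

```rust
use std::collections::HashMap;

/// A definition `name = <expr>`, where we only track which names the
/// expression refers to (hypothetical, simplified representation).
struct Def {
    name: String,
    uses: Vec<String>,
}

/// Helper for building a `Def` from string literals.
fn def(name: &str, uses: &[&str]) -> Def {
    Def {
        name: name.to_string(),
        uses: uses.iter().map(|s| s.to_string()).collect(),
    }
}

/// Give every definition a unique name (`x_0`, `x_1`, ...) and resolve each
/// use to the most recent earlier definition of that name.
fn rename(defs: &[Def]) -> Vec<Def> {
    let mut latest: HashMap<String, usize> = HashMap::new();
    let mut out = Vec::new();
    for d in defs {
        // Resolve uses *before* registering the new definition, so that
        // `x = f x` refers to the previous `x`, not to itself.
        let uses: Vec<String> = d
            .uses
            .iter()
            .map(|u| match latest.get(u) {
                Some(n) => format!("{}_{}", u, n),
                None => u.clone(), // free name, e.g. a function argument
            })
            .collect();
        let next = latest.get(&d.name).map_or(0, |n| n + 1);
        latest.insert(d.name.clone(), next);
        out.push(Def { name: format!("{}_{}", d.name, next), uses });
    }
    out
}

fn main() {
    // words = split input
    // words = dropFirst words
    // count = len words
    let renamed = rename(&[
        def("words", &["input"]),
        def("words", &["words"]),
        def("count", &["words"]),
    ]);
    for d in &renamed {
        println!("{} uses {:?}", d.name, d.uses);
    }
}
```

After the pass every name is unique, so later stages never need to know that the source reused `words`.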

Kevin Gillette (Dec 23 2022 at 10:02):

Stylistically, Roc does already have a notion of temporality/sequencing in the form of pipelines: the source of a pipeline step necessarily must be evaluated before the entirety of the destination may be evaluated (even if parts of the destination expression could be evaluated before the source). Backpassing is an even stronger example: the lines after backpassing syntax notionally happen after the lines before the backpassing syntax.

Generally, it's not a semantically incorrect interpretation that, within a function, earlier lines are evaluated before later lines. While in reality they could be evaluated in a different order, because Roc is a side-effect-free language, then, assuming the compiler is sound, such an interpretation is convenient and hard to challenge.

Richard Feldman (Dec 23 2022 at 11:47):

quick note: I actually think that although today we silently reorder defs, we should start giving a warning for it - https://github.com/roc-lang/roc/issues/4430 - reordering makes sense if it's unobservable, but it can give you misleading dbg output (among other things, e.g. expect and crashes) if those get reordered

Richard Feldman (Dec 23 2022 at 11:47):

that said, I don't think this has much bearing on the design question of whether or not we should allow redeclaration - I think the main consideration here should be to figure out what's most helpful overall

Richard Feldman (Dec 23 2022 at 11:49):

broadly speaking, I appreciate being able to shadow when I'm writing code, and I appreciate knowing that shadowing is disallowed when I'm reading code

Richard Feldman (Dec 23 2022 at 11:50):

if shadowing is disallowed, then if I see something declared in one place and then used later on at the same indentation level (or higher), then I know instantly that the usage connects to that declaration and couldn't have "changed" in between

Richard Feldman (Dec 23 2022 at 11:51):

if it's allowed, then I have to audit it to be sure; I have to scan all the lines in between the declaration and the usage to check whether the name has changed to refer to something else in between

Richard Feldman (Dec 23 2022 at 11:52):

that said, I have found it to be very convenient when writing code and miss it from Rust when I'm writing Roc.

Richard Feldman (Dec 23 2022 at 11:54):

but then again, I spend a lot more time writing Roc code than debugging it. I've very rarely been bitten by shadowing in Rust, but it has happened. I remember one time losing over an hour to something where I had the wrong mental model about what was going on due to shadowing in a particularly complicated function, and then thinking "wow, did this one bug just erase all the time shadowing in Rust has ever saved me in the writing phase?"

Richard Feldman (Dec 23 2022 at 11:54):

Richard Feldman (Dec 23 2022 at 11:56):

so given that code is read more often than it's written, and that right now Roc code bases are all pretty small and mostly only read by the person who wrote them, I think it's good to consider what we'd be giving up by allowing shadowing anywhere

Richard Feldman (Dec 23 2022 at 12:01):

I'm open to the idea btw, and I've independently considered it before. I actually asked Jose Valim what people think of it in Elixir, and he said it gets mixed reviews; a significant number of people like it and a significant number don't

Richard Feldman (Dec 23 2022 at 12:01):

Richard Feldman (Dec 23 2022 at 12:03):

keeping in mind that Elixir supports redeclaration like Rust does, not just when introducing a new scope (like Brian proposed)

Brian Carroll (Dec 23 2022 at 14:32):

Good points Richard. I still find it frustrating to write but maybe it's the best option

Kevin mentioned back passing and it occurred to me that that's not redeclaration but actually shadowing! Everything after the back pass is really the body of a function. So it's a new scope!

Which means if we allowed shadowing in inner scopes, it would read like redeclaration. This weakens my earlier argument!

Richard Feldman (Dec 23 2022 at 15:02):

Kevin Gillette (Dec 23 2022 at 19:22):

I would generally prefer trade-offs which favor readability over trade-offs which favor write-ability, in cases where they're opposed.
However, I'm not sure it's as simple as that in this case, since, as called out before, the absence of shadowing can lead to more names being used, each with, from the perspective of the reader, indefinite lifetimes, compared to a shadowed equivalent that just reuses the same name multiple times (thus appearing to be procedural processing steps). Such processing steps usually are just refinements of the same data (i.e. peeling away words from the same list of words, or iteratively removing extraneous details from a string).

Arguably less mental context is needed for both the reader and the writer to deal with same-type/shape refinements using a small number of variables than juggling a number of variables proportional to the number of processing steps. There are certainly techniques for dealing with this issue, such as splitting functions or alternate transforms with pipelines. That said, novices will likely be less familiar with those other transforms and will have a solution path in mind that will be a lot less satisfactory if the language forces them to consider other approaches without merely telling them what to do ("I see you're trying to extract a substring using a bunch of variables, but it's better to do it this way").

Kevin Gillette (Dec 23 2022 at 19:25):

@Brian Carroll brings up a good point, which is that type-changing of the same name in the same function could lead to unnecessary confusion. It's certainly true that readers of code will often skip around rather than read linearly, and especially if the type of an identifier can change, it'll certainly increase cognitive burden.

Even in Python, where reusing variables and changing types had been fairly common a decade and more ago, with newer optional type checking, that has become far less common (since the type checker may now complain).

Joshua Warner (Dec 23 2022 at 19:47):

One case where I've used shadowing to good effect in rust is let re-binding. In some cases due to how an algorithm is specified, you can sometimes have successive intermediate results that could logically have the same name. It would be a bug if an earlier one of these were accidentally used in place of a later one - and so I prevent that by giving them both the same name, thereby making it impossible to access the earlier one.

Joshua Warner (Dec 23 2022 at 19:49):

That both makes it clear that the misuse isn't happening (when reading), and also makes it less likely to introduce such a misuse when writing/refactoring the code.
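
A small Rust sketch of that pattern, with same-type refinements of one value. The function `normalize` and its steps are invented for illustration.

```rust
// Each step rebinds `text`, so the previous intermediate result can no
// longer be named (and therefore cannot be used by accident).
fn normalize(input: &str) -> String {
    let text = input.trim();
    let text = text.to_lowercase();
    let text = text.replace(' ', "-");
    text
}

fn main() {
    println!("{}", normalize("  Hello World ")); // prints hello-world
}
```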

Joshua Warner (Dec 23 2022 at 19:51):

Of course, in roc you can do much the same thing and make sure the original version isn't in scope just by breaking out each successive step in the computation into its own function. But then the reader is left to verify that things are actually called in the order you expect - and you didn't typo one of the calls in the chain (accidentally skipping a step, for example).

Joshua Warner (Dec 23 2022 at 19:52):

I guess maybe what I want is to be able to declare that the scope of some particular name ends - and doesn't extend farther down the function I'm writing or into nested scopes.

Brian Carroll (Dec 23 2022 at 21:05):

Interesting. Recently I have started doing that a lot in Rust. Putting {} around a group of temporary variables to make it clear they don't escape from that block.
In Roc you could maybe do that with nested declarations!
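
Roughly what the Rust version of this looks like, as a sketch (the function `area_label` and its contents are hypothetical):

```rust
fn area_label(width: u32, height: u32) -> String {
    // Only `area` escapes the block; `w` and `h` go out of scope at the
    // closing brace and cannot be referenced further down.
    let area = {
        let w = u64::from(width);
        let h = u64::from(height);
        w * h
    };
    format!("{} sq units", area)
}

fn main() {
    println!("{}", area_label(3, 4)); // prints 12 sq units
}
```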

Richard Feldman (Dec 23 2022 at 22:12):

Shritesh Bhattarai (Dec 23 2022 at 22:18):

I'd very much prefer shadowing, especially for backpassing. Having to name every intermediate binding is a source of friction and even led to bugs where I incorrectly used the previous binding. Two examples that can be better with shadowing: threading state in a random number generator and consecutive List.walks.

Richard Feldman (Dec 23 2022 at 22:26):

yeah threading state gets nicer but I'm not sure it's actually less error prone because of unused warnings

Richard Feldman (Dec 23 2022 at 22:26):

like if you generate a new seed and don't use it because you accidentally use a stale seed, you'll get an unused warning for the new seed
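
A Rust sketch of the two styles for threading a seed. The generator `next` is a toy with arbitrary constants, and all names are made up for illustration.

```rust
/// Toy linear congruential step: returns (random value, new seed).
fn next(seed: u32) -> (u32, u32) {
    let new_seed = seed.wrapping_mul(1664525).wrapping_add(1013904223);
    (new_seed >> 16, new_seed)
}

/// Numbered names: writing `next(seed0)` on the second line would still
/// compile, but `seed1` would then be unused and rustc would warn about it.
fn two_rolls_numbered(seed0: u32) -> (u32, u32) {
    let (a, seed1) = next(seed0);
    let (b, _seed2) = next(seed1);
    (a, b)
}

/// Shadowed names: after the first line the original seed cannot be named,
/// so there is no stale seed to pick by mistake.
fn two_rolls_shadowed(seed: u32) -> (u32, u32) {
    let (a, seed) = next(seed);
    let (b, _seed) = next(seed);
    (a, b)
}

fn main() {
    println!("{:?}", two_rolls_numbered(42));
    println!("{:?}", two_rolls_shadowed(42));
}
```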

Richard Feldman (Dec 23 2022 at 22:30):

yeah that second example is a good one though! Definitely state would be a less error prone name there than innerState

Richard Feldman (Dec 23 2022 at 22:30):

Richard Feldman (Dec 23 2022 at 22:32):

as Brian noted, you can fix that by extracting it as a named function in a different scope (e.g. top level)

Richard Feldman (Dec 23 2022 at 22:32):

Richard Feldman (Dec 23 2022 at 22:33):

that is, I wouldn't choose to extract it except as a way to work around shadowing being disallowed

Richard Feldman (Dec 23 2022 at 22:34):

but what I struggle with in situations like this is: there is a workaround (extracting the function) for this, whereas there's no workaround for the downsides that come with shadowing - you just have to always be on the lookout for it forever

Kevin Gillette (Dec 23 2022 at 22:53):

Can we enumerate the downsides of shadowing alongside how often we think it'll be an issue or whether we can get away with targeted restrictions? If we can detect cases which are problematic and non-useful, we should restrict them, while permitting cases which are useful and fit into common patterns.

Kevin Gillette (Dec 23 2022 at 22:57):

Would it have worked well enough to call the outer state, which appears to be used just once outerState, while leaving the name state for the inner state, which is used through the remainder of the function?

Kevin Gillette (Dec 23 2022 at 23:15):

If tracking variable lifetimes is a major concern, perhaps we can introduce some syntax, such as a sigil/symbol, to indicate that it must be referred to exactly once. These might be called linear types, though I doubt what I'm describing has all the required properties. For example, declare as %x, and pair with a reference %x. After that later reference, the value can no longer be referred to within that same scope (either the identifier ceases to exist and could be reused, or still exists but cannot be referenced again).

In @Shritesh Bhattarai's example, where the outer state's introduction and use are on adjacent lines, this might work pretty well, but in any case where the declaration and use are separated by many lines, the value will diminish quickly.
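
For comparison, Rust's move semantics give something close to this without a sigil. The sketch below assumes a hypothetical `Seed` type that is deliberately not `Copy`; note that Rust enforces "at most once" rather than "exactly once", so it is only an approximation of the idea.

```rust
/// Hypothetical seed type; not `Copy` or `Clone`, so each value can be
/// passed along (moved) at most once.
struct Seed(u32);

/// Toy generator step with arbitrary constants: consumes the seed and
/// returns a value plus the next seed.
fn step(seed: Seed) -> (u32, Seed) {
    let next = seed.0.wrapping_mul(1664525).wrapping_add(1013904223);
    (next >> 16, Seed(next))
}

fn two_rolls(seed: Seed) -> (u32, u32) {
    let (a, seed1) = step(seed);
    // let (b, _) = step(seed); // would not compile: error[E0382], use of moved value `seed`
    let (b, _seed2) = step(seed1);
    (a, b)
}

fn main() {
    println!("{:?}", two_rolls(Seed(42)));
}
```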

Brendan Hansknecht (Dec 24 2022 at 00:38):

In those cases, it is really easy to write buggy code by accidentally using the wrong X.

Also, using names like this makes changing code really annoying. Sometimes you need to increment so many variable names.

Richard Feldman (Dec 24 2022 at 17:46):

"What's likely to cause more lost debugging time? Wrong mental model when reading code because it shadowed something you didn't realize, or accidentally using a stale variable name when writing code?"

Richard Feldman (Dec 24 2022 at 17:50):

some answers that seem easy but which I think don't hold up very well in practice:

Richard Feldman (Dec 24 2022 at 17:53):

another consideration I hadn't thought of before: having access to the option of shadowing is pretty much strictly better for prototyping; it would make Roc better at that use case

Richard Feldman (Dec 24 2022 at 17:54):

so if we think these are similar in terms of overall impact on time spent debugging, that's a potential tiebreaker

Richard Feldman (Dec 24 2022 at 17:55):

on the other hand, another potential tiebreaker is learning curve: disallowing shadowing is strictly easier to teach. "You can't reuse names, the end."

Richard Feldman (Dec 24 2022 at 17:59):

Richard Feldman (Dec 24 2022 at 18:00):

to be fair, teaching the latter might be as easy as saying "pretend there's always a const there, except you don't have to write it. So basically you're writing const x = ... so of course it doesn't mutate the x in the outer scope!"

Brendan Hansknecht (Dec 24 2022 at 18:30):

    buf
    |> generateDeriveStr types enumType ExcludeDebug
    |> Str.concat "#[repr(u\(reprBits))]\npub enum \(escapedName) {\n"
    |> \b -> walkWithIndex tags b generateEnumTags

    buf
    |> \b -> if discriminantSize > 0 then
            generateDiscriminant b types discriminantName tagNames discriminantSize
        else
            b
    |> Str.concat ...

Brendan Hansknecht (Dec 24 2022 at 18:31):

Yes, these could be named functions, but they are very small and would be weird to name in my opinion. Of course, it depends on the case.

Richard Feldman (Dec 25 2022 at 03:16):

here's another interesting angle to consider: to what extent could the editor mitigate the downsides of shadowing? :thinking:

Richard Feldman (Dec 25 2022 at 03:16):

for example, it could just straight-up tell you when something is shadowed (e.g. syntax highlight it in a different color)

Richard Feldman (Dec 25 2022 at 03:16):

or when looking at a definition, it could have a little icon next to it indicating that this definition is shadowed

Richard Feldman (Dec 25 2022 at 03:17):

maybe when you highlight a named variable, it doesn't just tell you its type, it also tells you if it's referring to a shadowed definition

Richard Feldman (Dec 25 2022 at 03:17):

Brendan Hansknecht (Dec 25 2022 at 03:49):

I think if it was highlighted differently or had a symbol, that would mitigate most of the downside, from my experience. Though if you shadow multiple times, you would probably need to distinguish each of them.

Ayaz Hafiz (Dec 25 2022 at 03:51):

An interesting idea there is if you hover over a variable, what if you get a view of how it was defined rather than just the type?

Ayaz Hafiz (Dec 25 2022 at 03:52):

my sense is that the editor solution only partially mitigates the problem though, and only for readers who are unsure of the definition source. it also doesn't address e.g. reading source code reviews on something like github

Ayaz Hafiz (Dec 25 2022 at 03:52):

I've definitely had this happen to me before, but I think it happens less often than problems observed due to shadowing. like, the situation in which this happens is if you have e.g. xOuter and xInner and you use both xOuter and xInner in the innermost scope - but that seems unlikely, because presumably if you wanted to have xInner shadow xOuter, then xOuter should not be relevant in the scope where xInner lives. And if in the innermost scope you don't reference xInner, then you'll get a warning of an unused variable (which in my experience has saved me from some bugs).

Another possible disadvantage of allowing shadowing is it can lead to disciplines where you both sometimes shadow, and sometimes use unique variable names (e.g. state1, state2) where you would otherwise shadow. I know I've done this many times even though it's not exactly a great discipline for either the reader or the writer (especially the reader), and seems strictly worse than not supporting shadowing IMO

Ayaz Hafiz (Dec 25 2022 at 03:52):

One idea if re-binding/shadowing is allowed: only allow re-binding a variable if the new definition uses the shadowed variable. So e.g. you can do

path = ...
path = Path.toStr path

but not

path = ...
path = "telluride"

This doesn't address the nested-scope problem though, so maybe not the best idea.
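
A minimal sketch of how such a rule could be checked, written in Rust. It assumes a flattened, hypothetical view of one scope as `(name, names used by the right-hand side)` pairs; the function `rejected_rebindings` is invented here and, as noted above, it ignores nested scopes entirely.

```rust
use std::collections::HashSet;

/// Returns the names of definitions that rebind an existing name without
/// referring to it, which the proposed rule would reject.
fn rejected_rebindings(defs: &[(&str, Vec<&str>)]) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut rejected = Vec::new();
    for (name, uses) in defs {
        // A rebinding is fine only if its right-hand side mentions the
        // name it is about to shadow.
        if seen.contains(name) && !uses.contains(name) {
            rejected.push(name.to_string());
        }
        seen.insert(*name);
    }
    rejected
}

fn main() {
    // path = ... ; path = Path.toStr path   (allowed)
    let allowed = vec![("path", vec!["input"]), ("path", vec!["path"])];
    // path = ... ; path = "telluride"        (rejected)
    let disallowed = vec![("path", vec!["input"]), ("path", vec![])];
    println!("{:?}", rejected_rebindings(&allowed));
    println!("{:?}", rejected_rebindings(&disallowed));
}
```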

Ayaz Hafiz (Dec 25 2022 at 03:52):

One other reason in favor of shadowing: suppose someone is writing a module, and in some nested scope they define some name.
But then they add a top-level definition (or one in some higher scope) that is best suited to use that name.
Now, they have to change the inner-scoped-name to something else, even though it may be unrelated to the top-level change they are making. This can increase diff noise for readers and writers.
I've never actually seen this happen though, I don't know if it would

Richard Feldman (Dec 25 2022 at 03:55):

like for example in a JSON decoder, I want to expose Decode.str - now I have a top-level declaration named str, so I can no longer use str as an identifier anywhere else in the module :sweat_smile:

Richard Feldman (Dec 25 2022 at 03:56):

Richard Feldman (Dec 25 2022 at 04:09):

this reminds me of another consideration: in Rust, I don't get as much value out of shadowing because often I'll do let mut state = and then mutate in-place, instead of shadowing

Richard Feldman (Dec 25 2022 at 04:10):

which affects the ratio of "times rebinding bit me" to "times it was useful" - it's useful less often in Rust than it would be in Roc, because a lot of the time, I wouldn't be using state1 = ... state2 = ... etc. in Rust anyway because instead I'd just be reassigning in-place
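
The two Rust styles being contrasted, as a small sketch (function names are made up for illustration):

```rust
// In-place mutation: one binding, reassigned.
fn steps_mut(start: i32) -> i32 {
    let mut state = start;
    state += 1;
    state *= 2;
    state
}

// Rebinding: a fresh immutable binding per step, all sharing one name.
// Without `mut`, this (or state1, state2, ...) is the only option.
fn steps_rebind(start: i32) -> i32 {
    let state = start;
    let state = state + 1;
    let state = state * 2;
    state
}

fn main() {
    println!("{} {}", steps_mut(3), steps_rebind(3)); // prints 8 8
}
```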

Georges Boris (Dec 25 2022 at 04:12):

I usually don't mind if I have to rename lambda variables when creating a new top level definition as it makes it clearer for the reader they're not talking about the same thing.

Thinking about it, the only case where I've used shadowing, and where I'd love to keep it, is in let bindings inside a function. I'd be fine if I couldn't shadow top level definitions but not being able to shadow values defined inside a function (be they in a sequential let binding or in nested lambdas in a pipeline) is where I seem to draw the line.

Richard Feldman (Dec 28 2022 at 02:02):

ok, I want to make a concrete proposal: let's allow full Rust/Elixir-style rebinding.

that is, all of the usual shadowing stuff works, and in addition, this becomes allowed too:

x = 5
x = "blah"

Richard Feldman (Dec 28 2022 at 02:03):

Ayaz Hafiz (Dec 28 2022 at 02:06):

personally, don't feel strongly one way or another, but I would suggest that in a world where re-binding in the same scope is allowed, it is a warning to not use a variable (like the first x in your example) before it is re-bound.

Richard Feldman (Dec 28 2022 at 02:07):

Joshua Warner (Dec 28 2022 at 03:00):

Brian Carroll (Dec 28 2022 at 09:01):

I really don't like it at all, I find rebinding very confusing in Rust and wish it didn't have it. I always get rid of it in any code I have to do any serious work with. I don't like that the same name means different things in the same scope depending on what line of code you're looking at.

Brian Carroll (Dec 28 2022 at 09:01):

This would make Roc feel imperative to me, because binding would be more like an "assignment statement" where order matters. I think of binding as just "giving a name to an expression" but it wouldn't really mean that any more because it would also mean something about the time sequence in which things are executed.

Brian Carroll (Dec 28 2022 at 09:04):

Originally I was in favour of shadowing names in different scopes because all the big problems for me occur within the same scope.

Brian Carroll (Dec 28 2022 at 09:05):

BUT backpassing breaks that. It's an important syntax feature that deliberately makes different scopes look like they're in the same scope.

Brian Carroll (Dec 28 2022 at 09:07):

So my conclusion is that, although it often feels annoying to me to write, we need to keep the current behaviour.

Brendan Hansknecht (Dec 28 2022 at 15:46):

Since nothing is mutable in roc, i think the problems that need redeclaration/shadowing are much much more common than rust.

Brendan Hansknecht (Dec 28 2022 at 15:48):

As such, i think we will run into many many more situations where code is inconvenient and brittle without some form of shadowing or redeclaration. I think the problems often arise in the same scope. That said, in the cases it doesn't arise in the same scope it still feels like it arises in the same scope due to backpassing and lambdas.

Brendan Hansknecht (Dec 28 2022 at 15:48):

If we only limit to different scopes, I think that would be much more confusing than allowing same scope as well

Brendan Hansknecht (Dec 28 2022 at 15:49):

I agree that it makes the language feel much more imperative, but when people see a list of variable declarations within a function, they assume it is imperative anyway. People do not naturally understand the potentially out of order execution.

Brendan Hansknecht (Dec 28 2022 at 15:51):

On top of that we already have many operations that force things to be imperative in execution order: backpassing, pipelining, dbg, expect, data dependencies, and arguably conditionals.

Brendan Hansknecht (Dec 28 2022 at 15:53):

Aside, since internal to the compiler we can rename any variable that is shadowing another, we still can create a true SSA form with a data dependency graph. So it should be just as optimizable as the version with new names at each use.

Kevin Gillette (Dec 28 2022 at 15:59):

semantically, I'd call it "out of order evaluation." The term execution implies a side effect, at least to me.

While Roc advertises its lower level performance behavior considerably more than other functional languages, you still shouldn't need to know these lower level details to know what result the program will have. In the semantic sense, the order of evaluation is entirely irrelevant because it shouldn't be an observable property of the program, short of using a debugger or triggering a core dump.

Brendan Hansknecht (Dec 28 2022 at 16:04):

I think it can only be observable in terms of performance (especially with ordering potentially making something non-unique and leading to copying). That is fair.

Except when you add in dbg and expect where they have side effects and the order is then observable.

Kevin Gillette (Dec 28 2022 at 16:06):

Wouldn't data dependencies be a characteristic that sets pure functional apart from imperative? Data dependencies define a tree as the evaluation partial ordering mechanism, while imperative typically uses line and expression order as the (hopefully) total ordering mechanism; since imperative expressions can have side effects, those languages need more complex definitions of behavior than Roc does.

Brendan Hansknecht (Dec 28 2022 at 16:11):

Code in imperative languages gets reordered as well. Yes, it is more restrictive, but we regularly add in similar restrictions via function calls, backpassing, and Tasks. I wouldn't look at them as fundamentally different in this case. They both boil down to an SSA form with effectful operations that block reordering.

Kevin Gillette (Dec 28 2022 at 16:12):

debug and expect are not representative of the whole language. It's also not clear to me why expect needs to force an evaluation order: the compiler already does not print errors in line order.

In any case, could there not be declaration orderings that force debug to evaluate in non-source order (or force it to buffer?)

x = y + 5
dbg x
y = 7
dbg y

iiuc, since Roc allows out-of-order declarations within functions, the above should be a valid function body.

Brendan Hansknecht (Dec 28 2022 at 16:13):

If it is valid (which it probably is), its output would be very confusing to people and I would argue that it shouldn't be valid.

Brendan Hansknecht (Dec 28 2022 at 16:16):

dbg and expect are very important because they are a direct way to see the execution order of the program. If dbg prints out of order, it could lead to hours of wasted time debugging.

Brendan Hansknecht (Dec 28 2022 at 16:16):

Why is x equal to 7... oh, it's not, the dbg prints were reordered and y was printed before x.

Ayaz Hafiz (Dec 28 2022 at 16:17):

out-of-order declarations are going to become warnings in the future for the reason Brendan mentions

Kevin Gillette (Dec 28 2022 at 16:20):

I agree with that. As such, if it is valid, I think it's an argument for limiting arbitrary declaration order to the global scope.

Within a function, too much confusion could come from writing declarations out of dependency order, and I imagine people naturally write declarations, within functions, in dependency order almost every time anyways (and perhaps many of the times they don't, it's by accident or following a refactor).

Conversely, the global scope should not allow redeclaration (except in the repl), because that seems like it should be undefined, i.e.

# All global scope...

x = x * 5

# Hundreds of unrelated lines

x = x + 2

# Hundreds of unrelated lines

x = 1

Brendan Hansknecht (Dec 28 2022 at 16:21):

Kevin Gillette (Dec 28 2022 at 16:27):

But if dbg and expect force an evaluation order, they are _changing_ the evaluation order, compared to a release build or simply removing those dbg and expect lines: the compiler may well determine it's more optimal to have a different order when dbg and expect are not involved.

I agree they're important for understanding properties of the program, but I disagree that they should be advertised as having any bearing or meaning on understanding the "execution order" of the program, except across tasks, since tasks are the only aspect of Roc that's "executed" (side effectful). We would make dbg and expect force line-order evaluation to avoid confusion but not to provide extra meaning to order, because that meaning would be deceptive (except across tasks).

Kevin Gillette (Dec 28 2022 at 16:32):

And the tutorial lesson (for people used to imperative languages) is that evaluation order respects dependencies and respects tasks, but that's it. Any ordering that achieves the same result could be the ordering that the compiler selects, and while that's somewhat true for compiled high level imperative languages, it's even more so the case for Roc because it has a wider optimization space to work with, or at least fewer (or different) language-induced impediments to get in the way of the optimizer.

Joshua Warner (Dec 28 2022 at 17:45):

Recently I ran into frustration around the current shadowing rules when trying to implement something that essentially uses the "state" pattern - where there should always be one "latest" version of the state that you use - and I shouldn't have to think critically about which one that needs to be (it should always be the latest one / inner-most one!).

Joshua Warner (Dec 28 2022 at 17:46):

toIdParserList : IdBindState, List Parser -> {state: IdBindState, ids: List Id}
toIdParserList = \state, parsers ->
    List.walk parsers {state, ids: []} \{state: state2, ids}, parser ->
        {state: newState2, id} = toIdParser state parser
        {state: newState2, ids: List.append ids id}

Joshua Warner (Dec 28 2022 at 17:46):

Joshua Warner (Dec 28 2022 at 17:49):

Oh and actually funnily enough, there _is_ a bug of exactly that form in that code.

Joshua Warner (Dec 28 2022 at 17:49):

Folkert de Vries (Dec 28 2022 at 17:49):

Joshua Warner (Dec 28 2022 at 17:50):

That would be IMO much more readable (and writable!) if I could just re-use the same state name. e.g.

toIdParserList : IdBindState, List Parser -> {state: IdBindState, ids: List Id}
toIdParserList = \state, parsers ->
    List.walk parsers {state, ids: []} \{state, ids}, parser ->
        {state, id} = toIdParser state parser
        {state, ids: List.append ids id}
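
For comparison with a language that permits this today, a rough Rust analogue of the same shape. The `u32` state, the `&str` parsers, and the `to_id` helper are stand-ins invented for this sketch, not the real `IdBindState`, `Parser`, or `toIdParser`.

```rust
/// Stand-in for `toIdParser`: derives an id from the current state and
/// returns the updated state alongside it.
fn to_id(state: u32, parser: &str) -> (u32, String) {
    (state + 1, format!("{}#{}", parser, state))
}

fn to_id_list(state: u32, parsers: &[&str]) -> (u32, Vec<String>) {
    parsers.iter().fold((state, Vec::new()), |(state, mut ids), parser| {
        // `state` here shadows both the closure parameter and the outer
        // argument, so only the latest state can be passed along.
        let (state, id) = to_id(state, parser);
        ids.push(id);
        (state, ids)
    })
}

fn main() {
    println!("{:?}", to_id_list(0, &["a", "b"]));
}
```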

Folkert de Vries (Dec 28 2022 at 17:50):

only true for linear state passing (where you truly don't want the old state any more)

Joshua Warner (Dec 28 2022 at 17:51):

Folkert de Vries (Dec 28 2022 at 17:54):

generally you should not pass state around so explicitly i think, if you can. you can also refactor this into something less name-y

Folkert de Vries (Dec 28 2022 at 17:54):

Folkert de Vries (Dec 28 2022 at 17:55):

Folkert de Vries (Dec 28 2022 at 17:58):

in this particular case, without going in that haskell direction, I'd go with something like

toIdParserList : IdBindState, List Parser -> {state: IdBindState, ids: List Id}
toIdParserList = \initialState, parsers ->
    List.walk parsers {state: initialState, ids: []} \accum, parser ->
        stepped = toIdParser accum.state parser
        {state: stepped.state, ids: List.append accum.ids stepped.id}

Joshua Warner (Dec 28 2022 at 17:59):

Joshua Warner (Dec 28 2022 at 18:01):

That naming is just more work. Allowing shadowing (and taking advantage of it) actually makes it clear to readers that there's an _absence_ of certain kinds of bugs.

Shritesh Bhattarai (Dec 28 2022 at 18:01):

+1 I’d love to be able to directly pattern match in the accum above instead of having to name it separately

Folkert de Vries (Dec 28 2022 at 18:15):

well what I like about it is that I can find much quicker where a definition comes from.

more generally, I know that not having shadowing causes naming discomfort, and when you just append a number to the variable, that indeed makes it easy to slip up and re-use an old state.

but I like this resistance (in practice I think it makes my code better, even if it takes a bit more effort) and I like the absolute certainty that I have that there are 0 shadowing bugs in my code. I think it works very well in zig and elm.

separately it also makes certain compiler things a bit easier, because names are unique (in a scope)

Joshua Warner (Dec 28 2022 at 18:31):

Joshua Warner (Dec 28 2022 at 18:33):

This is honestly the scope of problem that'd make me want to maintain a fork of roc that allows shadowing. Or just not use roc at all. I find disallowing it to be very very restrictive.

Folkert de Vries (Dec 28 2022 at 18:34):

it happens exactly when the state is not linear. "variable not used" warnings would mostly catch that though I think? haskell diagnostics are not so great

Joshua Warner (Dec 28 2022 at 18:39):

Interesting. I'm having a hard time imagining what that would look like. Is that a thing that only happens in functional languages? (I'm a bit of a beginner in that space...)

Folkert de Vries (Dec 28 2022 at 18:51):

shadowing implies an evaluation order. In functional languages (like haskell or elm) the evaluation order is deliberately unspecified (which can help greatly with optimization). So in a shadowing world, moving some code around can mean that all of a sudden you use a different state than you should.

the same is of course true in rust or zig but there at least the code is already imperative, and you'll write the code with that in mind (still it is error-prone enough that zig also has a no-shadowing rule; you need to mark variables as mut when the value bound to a name can change over the course of a scope)
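
A minimal Rust sketch of the "moving some code around" hazard, assuming nothing beyond plain `let` shadowing (the function names are made up for illustration). The two functions contain the same three lines; only the order differs, and `y` silently picks up a different `x`:

```rust
// Same three bindings in both functions; only the order of the last two differs.
fn increment_then_double() -> (i32, i32) {
    let x = 5;
    let x = x + 1; // shadows the first x
    let y = x * 2; // sees the shadowed x (6)
    (x, y)
}

fn double_then_increment() -> (i32, i32) {
    let x = 5;
    let y = x * 2; // sees the original x (5)
    let x = x + 1; // shadow now happens after y is computed
    (x, y)
}
```

Both versions compile without complaint, which is the point being made: with shadowing, textual order carries meaning.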

Joshua Warner (Dec 28 2022 at 18:54):

Joshua Warner (Dec 28 2022 at 18:57):

This is kind of in tension between the linear and not-linear worlds. In the linear + no-shadowing world, moving some code around can lead you to introduce new bugs accidentally if you don't update the names correctly. In the linear + shadowing world, moving code around "mostly works" without extra effort.
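
A small Rust sketch of the stale-state slip under discussion, assuming a toy counter in place of a real generator or parser state (`Counter` and `next` are hypothetical names):

```rust
// Toy stand-in for a generator or parser state (hypothetical, for illustration).
#[derive(Clone, Copy)]
struct Counter(u64);

// Returns the advanced state together with the value produced.
fn next(state: Counter) -> (Counter, u64) {
    (Counter(state.0 + 1), state.0)
}

// Numbered names: the second call reuses `state1` by mistake.
// rustc would flag `state2` as unused; the lint is silenced here
// only so the sketch compiles cleanly.
#[allow(unused_variables)]
fn numbered_with_slip(seed: u64) -> (u64, u64) {
    let state1 = Counter(seed);
    let (state2, a) = next(state1);
    let (_state3, b) = next(state1); // should have been `state2`
    (a, b)
}

// Shadowed names: the stale state can no longer be named at all.
fn shadowed(seed: u64) -> (u64, u64) {
    let state = Counter(seed);
    let (state, a) = next(state);
    let (_state, b) = next(state);
    (a, b)
}
```

With a seed of 42, the numbered version yields the same value twice, while the shadowed version yields two successive values.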

Folkert de Vries (Dec 28 2022 at 18:58):

it's not really "no shadowing" but "no re-declaration", but the effect is the same: if the value of a name changes over the course of its lifetime, it must be defined as var someName (similar to rust's let mut someName)

Folkert de Vries (Dec 28 2022 at 18:58):

Joshua Warner (Dec 28 2022 at 18:59):

In rust, having successive let bindings with the same name really is shadowing, not mutable updates. The old variables are still alive, just can't be accessed.
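
A short Rust sketch of that distinction (the function name is made up for illustration). A reference taken before the shadowing `let` still reads the original value afterwards, and a shadow introduced inside a block ends with that block:

```rust
fn shadow_demo() -> (i32, i32, i32) {
    let x = 5;
    let original = &x; // borrow of the first binding
    let x = x + 1;     // new binding; the first x is untouched
    let inner = {
        let x = x * 10; // shadow that only lasts for this block
        x
    };
    // `original` still points at 5; `x` here is the second binding.
    (*original, x, inner)
}
```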

Folkert de Vries (Dec 28 2022 at 19:00):

Joshua Warner (Dec 28 2022 at 19:01):

let thing: String = func_that_returns_an_owned_string();
let thing = thing.as_str();

// followed by a bunch of calls that expect `&str` - so I don't have to repeat the .as_str() or & everywhere

Joshua Warner (Dec 28 2022 at 19:02):

Ditto for any other "trivial" transformations, where the original form of the variable is never accessed again.
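
Another sketch of such a "trivial" transformation, this time changing the type from `&str` to `u32` under one name. `parse_port` is a hypothetical helper, not taken from any library:

```rust
use std::num::ParseIntError;

// The name `port` goes from raw text, to trimmed text, to a number.
fn parse_port(port: &str) -> Result<u32, ParseIntError> {
    let port = port.trim();        // still &str, whitespace removed
    let port: u32 = port.parse()?; // now a u32; the text form is no longer reachable
    Ok(port)
}
```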

Folkert de Vries (Dec 28 2022 at 19:02):

Joshua Warner (Dec 28 2022 at 19:03):

Folkert de Vries (Dec 28 2022 at 19:03):

Joshua Warner (Dec 28 2022 at 19:04):

Joshua Warner (Dec 28 2022 at 19:07):

Is there a way to write "state-monad" patterns cleanly in roc, where you don't have to think carefully about the naming of each intermediate? (and be careful not to use the wrong one!)

Joshua Warner (Dec 28 2022 at 19:07):

Brendan Hansknecht (Dec 28 2022 at 20:21):

I agree with the idea that a lot of these things should be refactored, but when you have something with X1, X2, etc....it can be very hard to figure out those refactorings. Even with pipeline and backpassing.

Georges Boris (Dec 28 2022 at 21:13):

Coming from Elm, shadowing is rarely a real issue. Mostly when trying to do complex conditional transformations without leaving the same scope.

It seems to me like shadowing only solves the problem of unintentionally using a previously modified value. However, it makes optimizations harder and makes moving code around more error-prone. Not allowing shadowing may be annoying in these scenarios, but if we share knowledge on how to deal with them idiomatically then we could have the best of both worlds?

(e.g. controlling shadowing explicitly by using more functions instead of keeping everything under a shared scope)

Brendan Hansknecht (Dec 28 2022 at 21:15):

Georges Boris (Dec 28 2022 at 21:19):

just replicating what @Folkert de Vries said as I have zero knowledge around the subject :sunglasses:

Brendan Hansknecht (Dec 28 2022 at 21:21):

I hit it a lot with more complex pipelining. In some cases the dependencies are complex enough that you can't use |>. In other cases, you need |> with a lambda for a small function that isn't worth naming. Can't reuse the name there either.

Also, rust as str is just one example. You may also start with a string and convert to a path or any number of other types later. That often uses redefinition of the same variables.

Folkert de Vries (Dec 28 2022 at 21:23):

my point was not really about optimization, but more about it being a nice property (in the compiler, but for programmers too) that a name means just one thing in a scope

Folkert de Vries (Dec 28 2022 at 21:24):

there are workarounds to make shadowing possible, but it requires more work in the compiler

Richard Feldman (Dec 28 2022 at 21:26):

we actually need to have the compiler support shadowing under the hood for redeclaration in the repl to work, so we can't maintain that property of the compiler regardless :big_smile:

Brendan Hansknecht (Dec 28 2022 at 21:27):

Related question. Is there a cost to using a record for pipelining? My gut feeling is yes, but maybe it should be optimized away.

As in if i am doing a pipeline with state but at one of the stages go from state to { state, tmp1, tmp2 }. Then the next pipeline stage might use that and go to { state, tmp1, tmp3 } then maybe i don't need the temporaries anymore and it collapses to state again.

This is a case where I am not sure if I should use pipelining and it feels a lot messier. If i don't use pipelining, i am stuck with state1, state2, ...

Georges Boris (Dec 28 2022 at 21:28):

I felt the pain of no shadowing in Elm in a few places and I usually end up with the sad state of indexed variables. most of the time you get used to prefixing/suffixing and things are pretty sane.

however, it seems like the downsides in the ecosystem might outweigh the upsides of writing being nicer in a few places, since bugs might still appear with either approach (it being harder to track which variables hold which value, or misusing an outdated value).

wouldn't a language like Roc favor safety over convenience? I think it is easier to circumvent the problems of not having shadowing, while having shadowing everywhere might create more unexpected problems. my 2c.

Joshua Warner (Dec 28 2022 at 21:33):

I don't see it as a safety <-> convenience tradeoff. There are legitimate and common cases where either using shadowing or not using shadowing could lead to accidental bugs.

Joshua Warner (Dec 28 2022 at 21:35):

Unless there's some better tool to address those cases (let's discuss!), then it should really be up to the developer to either shadow or not shadow as the situation dictates.

Joshua Warner (Dec 28 2022 at 21:43):

FWIW, I'd be perfectly happy if I had to sprinkle around a small sigil to indicate that "yes, I'm intentionally shadowing here".

Richard Feldman (Dec 28 2022 at 21:45):

Richard Feldman (Dec 28 2022 at 21:46):

Joshua Warner (Dec 28 2022 at 21:50):

Not that I know of? Unless you count things like := for definitions vs = for later assignments - but that's not really the same thing.

Brendan Hansknecht (Dec 28 2022 at 22:03):

Richard Feldman (Dec 28 2022 at 22:06):

Brendan Hansknecht (Dec 28 2022 at 22:07):

Joshua Warner (Dec 28 2022 at 22:41):

a = 1
b = 2
List.walk list {a, b} \{a,b}, item -> {a: a + item, b: max(b, item)}

(I know that's a bit of a contrived example; but I was writing code pretty close to that yesterday)

Joshua Warner (Dec 28 2022 at 22:42):

List.walk list {a, b} \{shadowed a, shadowed b}, item -> {a: a + item, b: max(b, item)}

Richard Feldman (Dec 28 2022 at 22:59):

Richard Feldman (Dec 28 2022 at 23:00):

Joshua Warner (Dec 28 2022 at 23:49):

Joshua Warner (Dec 28 2022 at 23:50):

Kevin Gillette (Dec 29 2022 at 00:24):

Folkert de Vries (Dec 29 2022 at 00:26):

did you ever see a language use a sigil like that and think that was a good idea?

Folkert de Vries (Dec 29 2022 at 00:26):

Kevin Gillette (Dec 29 2022 at 00:26):

You could also just require that all instances of a name that is or will be shadowed use the sigil, since there's usually nothing particularly special about the first declaration (often it's the _least_ special)

Folkert de Vries (Dec 29 2022 at 00:27):

it's uncommon, so anytime it shows up you have to remember what that was and what it does

Folkert de Vries (Dec 29 2022 at 00:27):

Kevin Gillette (Dec 29 2022 at 00:28):

or maybe the very last declaration of a reused identifier is marked specially (or not marked, to distinguish it) so as to signal "look no further: in this and all descendant scopes, this is the last meaning this name will ever have"

Kevin Gillette (Dec 29 2022 at 00:31):

Agreed. Most of Haskell was very hard (or nearly impossible) to search documentation for a decade ago. If we go the route of using many sigils, we should have a quick reference guide be the first link anyone finds on the documentation part of the Roc website, and the guide would contain a table of all symbols and their meanings, associated abilities, etc.

Brendan Hansknecht (Dec 29 2022 at 00:43):

Brendan Hansknecht (Dec 29 2022 at 00:44):

Ayaz Hafiz (Dec 29 2022 at 01:02):

I don't love the idea of a keyword/sigil, tbh. It feels like another thing for a developer to have to keep in their head, for nebulous value - now I need to care about the semantic value of a variable, and whether it's shadowed or not, sort of like let vs const or mut in some languages. It puts a toll on the reader and I don't see how it's better, from a reader's perspective, than explicitly allowing shadowing

Ayaz Hafiz (Dec 29 2022 at 01:03):

In this example, did you get a warning that state2 was unused? I feel like in most situations where you increment the index of reused variables, at least a mitigating factor is that you typically use the variables in a linear fashion, so if they're unused the tooling can tell you.

Kevin Gillette (Dec 29 2022 at 01:10):

If the intent is to push people away from the keyword, then we shouldn't have this as a feature, except perhaps as a short term experiment to be concluded before the first stable release

Richard Feldman (Dec 29 2022 at 02:05):

hm, so I tried refactoring that example to use shadowing - here it is before and after:

toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \state, parsers ->
    List.walk parsers { state, ids: [] } \{ state: state2, ids }, parser ->
        { state: newState2, id } = toIdParser state parser
        { state: newState2, ids: List.append ids id }

toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \state, parsers ->
    List.walk parsers { state, ids: [] } \{ state, ids }, parser ->
        { state, id } = toIdParser state parser
        { state, ids: List.append ids id }

Brendan Hansknecht (Dec 29 2022 at 02:05):

I have found cases where the alternatives don't work or are much more confusing. So i think it has uses. Just other forms are better when they work.

Richard Feldman (Dec 29 2022 at 02:05):

it's easier to write, but...that's a lot of different meanings of state in a small amount of code :sweat_smile:

Richard Feldman (Dec 29 2022 at 02:10):

toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \initState, parsers ->
    List.walk parsers { state: initState, ids: [] } \{ state, ids }, parser ->
        answer = toIdParser state parser
        { answer & ids: List.append ids answer.id }

Brendan Hansknecht (Dec 29 2022 at 02:11):

Kevin Gillette (Dec 29 2022 at 02:27):

Richard Feldman (Dec 29 2022 at 02:29):

toIdParserList : IdBindState, List Parser -> { state : IdBindState, ids : List Id }
toIdParserList = \initState, parsers ->
    { state, ids }, parser <- List.walk parsers { state: initState, ids: [] }

    answer = toIdParser state parser
    { answer & ids: List.append ids answer.id }

Kevin Gillette (Dec 29 2022 at 02:29):

Richard Feldman (Dec 29 2022 at 02:29):

Richard Feldman (Dec 29 2022 at 02:30):

my gut reaction to reading that code is that it feels to me like a downside of backpassing that it's possible to write it that way :big_smile:

Kevin Gillette (Dec 29 2022 at 02:31):

I see. I worry sometimes that the lambda params will not get noticed, though you're right that the indentation indicates that something interesting is going on

Kevin Gillette (Dec 29 2022 at 02:33):

I see that aspect of backpassing differently: the rest of the function focuses on the next level down, and the outer context has nothing else to offer going forward. In that way, it's a bit like an "inception" operator

Joshua Warner (Dec 29 2022 at 02:50):

@Ayaz Hafiz TBH, probably, but during development I've found roc's 'unused' warnings to be way too noisy to be valuable to pay attention to. Like, I just wrote ~10 functions that are probably all unused because I haven't hooked them up yet and I'm just trying to get things working with small expect unit tests first - which IIRC still cause 'unused' warnings (I think?).

Joshua Warner (Dec 29 2022 at 02:51):

Kevin Gillette (Dec 29 2022 at 03:00):

Shritesh Bhattarai (Dec 29 2022 at 03:35):

oof. backpassing in maps and loops is my favorite Roc syntax. It is only possible to use it in places when the mapping function is "terminal", i.e. the last thing you do in that code block. Indenting there would just be visual noise.

Shritesh Bhattarai (Dec 29 2022 at 03:46):

regarding sigils: Elixir uses the ^ operator in pattern matching to bind to an existing value and prevent shadowing. Not sure how relevant it is to the discussion but I've wanted something similar when doing pattern match over lists (also, can I haz Rust's @ in patterns as well :pleading_face:)

Richard Feldman (Dec 29 2022 at 04:40):

maybe my first impression is wrong, and I should try embracing it and see how I feel after getting used to it... :thinking:

Richard Feldman (Dec 29 2022 at 04:40):

oh I think we should totally have that, just with as instead of @ - e.g. { x: blah } as rec -> ...
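
For reference, a small Rust sketch of the two pattern features mentioned here. `classify` and `matches_expected` are hypothetical names. Rust has no pin operator; a bare identifier in a pattern always creates a new binding, so comparing against an existing value takes a guard:

```rust
// `@` binds the matched value to a name while also testing it against a pattern.
fn classify(n: u32) -> String {
    match n {
        0 => "zero".to_string(),
        small @ 1..=9 => format!("small {small}"),
        big => format!("big {big}"),
    }
}

// Writing `expected => ...` as a pattern would shadow the parameter and match
// anything, so the comparison has to go in a guard instead.
fn matches_expected(expected: u32, actual: u32) -> bool {
    match actual {
        value if value == expected => true,
        _ => false,
    }
}
```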

Richard Feldman (Dec 29 2022 at 05:02):

another interesting way to write the previous example, which would be an option if we have tuples:

toIdParserList : IdBindState, List Parser -> (IdBindState, List Id)
toIdParserList = \init, parsers ->
    (state, ids), parser <- List.walk parsers (init, [])

    toIdParser state parser
    |> Tuple.mapSecond \id -> List.append ids id

Richard Feldman (Dec 29 2022 at 05:04):

toIdParserList : IdBindState, List Parser -> (IdBindState, List Id)
toIdParserList = \init, parsers ->
    List.walk parsers (init, []) \(state, ids), parser ->
        toIdParser state parser
        |> Tuple.mapSecond \id -> List.append ids id
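
For comparison, a rough Rust analogue of the same fold, as a sketch only. `to_id` is a hypothetical stand-in for toIdParser, with a plain counter as the state and strings as ids. Rust permits the accumulator pattern and the inner `let` to reuse the name `state`:

```rust
// Hypothetical stand-in for toIdParser: bumps a counter and produces an id.
fn to_id(state: u32, parser: &str) -> (u32, String) {
    (state + 1, format!("{parser}#{state}"))
}

fn to_id_list(init: u32, parsers: &[&str]) -> (u32, Vec<String>) {
    parsers.iter().fold((init, Vec::new()), |(state, mut ids), parser| {
        // `state` on the left shadows the accumulator's `state`.
        let (state, id) = to_id(state, parser);
        ids.push(id);
        (state, ids)
    })
}
```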

Folkert de Vries (Dec 29 2022 at 19:41):

btw in the { answer & ids: List.append ids answer.id } line above, the type of answer changes. is/should that be allowed?

Folkert de Vries (Dec 29 2022 at 19:41):

Folkert de Vries (Dec 29 2022 at 19:42):

actually, more is wrong. answer does not have an ids field, it has an id field

Richard Feldman (Dec 29 2022 at 19:52):

Richard Feldman (Dec 30 2022 at 22:04):

an interesting thing I wonder about: people have plenty of feature requests in Elm, but shadowing has never been one of them as far as I can remember. I also used Elm before and after the release where shadowing became an error, and I don't remember any complaints about it.

I wonder what's different that leads to so much more interest in it for Roc. :thinking:

Richard Feldman (Dec 30 2022 at 22:07):

also Elm's compiler doesn't warn for unused variables (a separate linter does that, which tends not to get run as often) so I'd expect "accidentally reused stale state" bugs to come up strictly more often in Elm

Folkert de Vries (Dec 30 2022 at 22:11):

e.g. I worked on a bytes parser in elm and there the state problem does come up. But the sort of person that would do that sort of library in elm probably has a bunch of experience in other functional languages (or with elm)

Richard Feldman (Dec 30 2022 at 22:40):

yeah I also wonder if we're disproportionately running into that edge case right now because an unusual percentage of people's time in Roc is literally writing parsers :big_smile:

Richard Feldman (Dec 30 2022 at 22:41):

because that's not something you normally do in application development, but it is in Advent of Code specifically, and it also is when building foundational libraries like JSON and CSV parsing, which comes up more often when a library ecosystem is in its infancy

Brendan Hansknecht (Dec 30 2022 at 23:16):

I think pipelining and trying to focus on data pipelines is what leads me to want shadowing.

Brendan Hansknecht (Dec 30 2022 at 23:18):

Even if data can't actually use |>, but is written in a staged transformation manner. I would want shadowing.

Richard Feldman (Dec 30 2022 at 23:27):

yeah that's a difference between Roc and Elm - in Elm you need parens around a lambda in the middle of a pipeline, so in practice people instead typically end the pipeline there

Richard Feldman (Dec 30 2022 at 23:28):

Richard Feldman (Dec 30 2022 at 23:29):

to be fair, since those lambdas tend to be very small, using a short variable name like b doesn't seem very error prone to me

Richard Feldman (Dec 30 2022 at 23:29):

Richard Feldman (Dec 30 2022 at 23:30):

Brendan Hansknecht (Dec 30 2022 at 23:36):

Brendan Hansknecht (Dec 31 2022 at 00:16):

My current thought is that we seem to have patterns that should help with this. Languages like elm manage just fine. We should try not adding shadowing/redeclaration yet and wait until we have more samples of places we want shadowing.

We should try documenting those cases, seeing if there are nice rewrites to avoid wanting shadowing, and reconsider later.

Kevin Gillette (Dec 31 2022 at 00:24):

In Elm, i/o processing funnels back into a single function using a message union (i.e. central dispatch), and it's very hard to do it any other way.

In Roc, especially with backpassing, task processing is comparatively procedural, and any function can do it.

Both approaches have their own tradeoffs, and one of them for Roc is more of a tendency towards larger functions that "do" more things, and a higher likelihood of wanting to reuse names.

Richard Feldman (Dec 31 2022 at 01:21):

that's true, but also none of the motivating use cases for shadowing we've seen so far involve Task.await, which is the essential difference between how I/O works in Roc and in Elm (that is, Task.await is super common in Roc and super rare in Elm)

Richard Feldman (Dec 31 2022 at 01:22):

so while that is definitely a difference, I don't think it's a big part of the difference in interest for shadowing

Richard Feldman (May 04 2023 at 00:33):

writing down the idea for #ideas > syntax for x = f x pattern gave me an idea: what if shadowing was allowed, but only within the same block of defs?

Richard Feldman (May 04 2023 at 00:33):

x = 5
x = x + 1
x = x + 3

x

Richard Feldman (May 04 2023 at 00:34):

x = 5

List.map nums \num ->
    x = x + num

...because you're shadowing x from different defs, not the defs where x was originally defined

Richard Feldman (May 04 2023 at 00:35):

as I recall, the main request for shadowing is in exactly this situation (where you want to shadow within the same set of defs) and the major downsides are in other situations (where you introduce a new variable without realizing you've shadowed something from the outer scope)

Brendan Hansknecht (May 04 2023 at 01:27):

Brendan Hansknecht (May 04 2023 at 01:28):

What about inside an if or when? Given those are terminal, would it be fine to shadow?

Brendan Hansknecht (May 04 2023 at 01:29):

x = 1

if foo then
    x = x + 2
    ... using x
else
    x = x - 3
    ...

Brendan Hansknecht (May 04 2023 at 01:30):

I think what you said above plus this would be needed to cover most common cases (and would be safe unlike inside List.map)

Sky Rose (May 04 2023 at 13:15):

weirdScoping = \{} ->
    f = \c -> a + c
    a = b + 3
    b = 2
    f 1

It would be much worse if a could be shadowed even within the same scope. I don't expect this to come up often, but if it does come up, it's likely to cause problems.

Brendan Hansknecht (May 04 2023 at 13:19):

I think we talked about this at some point and code like that should at least be a warning, maybe even an error. It makes code more confusing and harder to follow. Code that isn't at the global scope should be in some form of valid topological sort.

Agus Zubiaga (May 04 2023 at 13:27):

Brendan Hansknecht (May 04 2023 at 13:29):

Brendan Hansknecht (May 04 2023 at 13:31):

Also, you could technically pass a function into one of the functions to fix it at local scope, but that is less nice.

Agus Zubiaga (May 04 2023 at 13:31):

Yeah, I guess most mutually recursive functions should live in the global scope anyway. However, part of me, likes that scoping works the same at any level.

Brendan Hansknecht (May 04 2023 at 13:34):

Also, found the comment: we want to test making this an error and see how it turns out in practice. If it doesn't add too much friction, we would keep it.

Brendan Hansknecht (May 04 2023 at 13:36):

also, I don't think a new user would agree with the statement that scoping works the same at any level:

weirdScoping = \{} ->
    f = \c -> a + c
    a = b + 3
    b = 2
    f 1

# This functionally looks the same to a new user.
# `<-` is just a weird form of `=`
brokenScoping = \{} ->
    f = \c -> a + c
    a <- someFunc b
    b = 2
    f 1

Richard Feldman (May 04 2023 at 13:56):

Brendan Hansknecht (May 04 2023 at 14:18):

Should that be superseded by the discussion I linked above and #5078? That discusses making it an error instead of a warning.

Richard Feldman (May 04 2023 at 15:18):

so the only distinction between "warning" and "error" is that error means roc dev will refuse to run the program

Richard Feldman (May 04 2023 at 15:18):

(since you can always force a run regardless of whether it's a warning or an error)

Richard Feldman (May 04 2023 at 15:19):

I think of the distinction being "error means the compiler is inserting a runtime crash somewhere in case this comes up"

Richard Feldman (May 04 2023 at 15:19):

Brendan Hansknecht (May 04 2023 at 15:20):

Brendan Hansknecht (May 04 2023 at 15:21):

Wait, can't you get a warning on roc build but we still complete the compile? Or will a warning stop the compile?

Brendan Hansknecht (May 04 2023 at 15:22):

If so, an error would mean it completely blocks compilation outside of roc dev, but a warning would mean compilation still completes

Richard Feldman (May 04 2023 at 15:29):

well the goal is that compilation always completes; you're never completely blocked from running if you want to (you just might get a crash, possibly as early as "right away" depending on where the error(s) are)

Richard Feldman (May 04 2023 at 15:30):

the point of introducing roc dev was to give you a workflow where you can say "I want to work through all my errors, and then once all that's left are warnings like 'unused variable' and whatnot, which won't cause crashes, then I actually want to run the program"

Brendan Hansknecht (May 04 2023 at 15:39):

Brendan Hansknecht (May 04 2023 at 17:02):

So, @Richard Feldman does this mean that you are fine with us adding both the reordering warning and shadowing in the same scope?

Of course with exceptions for functions in both cases, I think. Functions defined out of order are fine and function shadowing is not?

Richard Feldman (May 04 2023 at 18:22):

Brendan Hansknecht (May 04 2023 at 23:53):

Fair enough. What are your current concerns with shadowing? Like if it was limited to this:

state <- newRand {} |> Task.await
{state, data: x} = randFloat state
{state, data: y} = randFloat state
if hasZ then
    {state, data: z} = randFloat state
    {data: other} = randBool state
    SomeTask other [x,y,z]
else
    {data: other} = randBool state
    SomeTask other [x,y]
    [x,y]

Would be invalid within anything nested (nested definitions, nested functions, anything with a ->, <-, or = essentially).

Richard Feldman (May 05 2023 at 00:28):

allowing any form of shadowing removes a significant language-wide guarantee (namely, that any time you introduce a new name you'll either get an error or else it won't affect any existing code) so I consider it a major change to the language regardless of what restrictions are put on it (unless it's something like a sigil which preserves that guarantee)

Richard Feldman (May 05 2023 at 00:28):

Brendan Hansknecht (May 05 2023 at 00:40):

x = 3
...
y = x + 1

x = 3
...
x = 4 # this x was supposed to be local and not used elsewhere
...
y = x + 1 # now this is using the wrong x

Richard Feldman (May 05 2023 at 00:44):

Richard Feldman (May 05 2023 at 00:45):

another consideration is that Elixir allows reassignment (or "redeclaration") and apparently it has mixed reviews from Elixir users in terms of whether they like it or don't

Richard Feldman (May 05 2023 at 00:45):

neither of which are outright deal-breakers, they're just serious considerations

Richard Feldman (May 05 2023 at 00:47):

I'm still not sure whether shadowing (and if so, in what form) is the right choice for Roc, especially considering Elm doesn't have it and the demand for Elm to introduce it is basically zero (so what's different about Roc that's creating the demand? How much of it is familiarity, how much is different use cases between Elm and Roc, how much of those are due to the language being relatively new - and will those use cases become less and less common over time? etc.)

Brendan Hansknecht (May 05 2023 at 00:57):

How does elm do state and value generation like what we do for roc rand? I think pipeline fixes most things, but not that case.

Brendan Hansknecht (May 05 2023 at 00:59):

I think i really only hit this in situations like that. Maybe they are less common in elm, maybe there is a different solution.

Richard Feldman (May 05 2023 at 00:59):

usually people don't pass around the seed, but rather compose together Random.Generator values (kinda like how you chain Task values together - this can also be done in Roc)
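
One way to picture "compose generators instead of passing the seed" is that the seed threading gets written once, inside a combinator. A hedged Rust sketch follows; `counter`, `map2`, and `pair_of_counters` are hypothetical names, not Elm's or Roc's actual API:

```rust
// A "generator" here is any function from a seed to (value, next seed).
// Toy generator (hypothetical): yields the seed and advances it by one.
fn counter(seed: u64) -> (u64, u64) {
    (seed, seed + 1)
}

// Combine two generators; the seed is threaded here and nowhere else.
fn map2<A, B, C>(
    gen_a: impl Fn(u64) -> (A, u64),
    gen_b: impl Fn(u64) -> (B, u64),
    combine: impl Fn(A, B) -> C,
) -> impl Fn(u64) -> (C, u64) {
    move |seed: u64| {
        let (a, after_a) = gen_a(seed);
        let (b, after_b) = gen_b(after_a);
        (combine(a, b), after_b)
    }
}

// Callers compose generators and supply a seed once, at the very end.
fn pair_of_counters(seed: u64) -> ((u64, u64), u64) {
    let pair = map2(counter, counter, |a, b| (a, b));
    pair(seed)
}
```

Code that uses `map2` never names an intermediate seed, so there is no state1/state2 to mix up.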

Richard Feldman (May 05 2023 at 00:59):

Anton (May 05 2023 at 09:29):

I'm not a fan of allowing reassignment. It seems to go against the functional programming philosophy and it's nice to be absolutely certain you only have one definition.

Anton (May 05 2023 at 09:36):

If we only use it for cases similar to random generation I would expect users to be surprised by it as well, because it would be a rare sight.

Anton (May 05 2023 at 09:40):

Georges Boris (May 05 2023 at 11:36):

This is basically the only thing that ever bit me using Elm because the solution there is to go for thing1 thing2 or even worse thing_ thing__.

Shadowing never caused me any problems in elixir (there is no concept of global variables, only modules and functions, so things are shadowed locally by default)

Anton (May 05 2023 at 11:39):

Can you give an example of code like this? I'm wondering if shadowing is the best solution in cases like this.

Brendan Hansknecht (May 05 2023 at 13:22):

I definitely think going state1, state2, state3 is more error prone than reassigning to the same state variable. Also, i think this is made more complicated because many people will not know the generator pattern. So anything with state and values will hit this. I have already written many bugs because of this.

Richard Feldman (May 05 2023 at 13:22):

Richard Feldman (May 05 2023 at 13:23):

or at least, not often enough for there to be as many requests for shadowing as we're seeing

Brendan Hansknecht (May 05 2023 at 13:23):

Maybe most cases of this are dealt with by libraries? And they all know the generator pattern?

Richard Feldman (May 05 2023 at 13:23):

Anton (May 05 2023 at 13:36):

In situations where you would use state1, state2, state3 would it not be best to write a function like we do with randomList here?

Brendan Hansknecht (May 05 2023 at 13:53):

ar1 = new data

{state: ar2, value: str1} = arbitraryStr ar1
len1 = Str.countUtf8Bytes str1
{state: ar3, value: str2} = arbitraryStr ar2
len2 = Str.countUtf8Bytes str2

{state: ar4, value: reference1} = ratio ar3 1 2
{value: reference2} = ratio ar4 1 2

Brendan Hansknecht (May 05 2023 at 13:55):

End goal was to get 2 random strings, their lengths, and 2 bools of whether or not to do something special related to them. Specific to this function and its API.

Matthias Toepp (May 05 2023 at 14:07):

Generally it makes sense to name things that are different with different names (and it's useful to have this assurance when reading code). As a time saver, it's easier to just write state1, state2 to save from describing the actual difference. Having less descriptive names is convenient but leads to possible errors of selecting the wrong name.

To address this...
Could we have roc do both: 1. insist that different things are named differently and 2. provide some system that would mean that the previous name may no longer be used either.

Anton (May 05 2023 at 14:32):

That's a nice construct but it may be confusingly similar to a record access or tuple. I'd prefer to make the generator scenario work well with current roc or introducing as little complexity and/or new syntax as possible.

Matthias Toepp (May 05 2023 at 14:34):

(syntax was just to get the point across) This could be done with state1, state2 or probably many other ways.

Brendan Hansknecht (May 05 2023 at 14:36):

Matthias Toepp (May 05 2023 at 14:37):

Anton (May 05 2023 at 14:39):

I think it's ok for typing but in my opinion long names make the code look dense and complicated.

Brendan Hansknecht (May 05 2023 at 14:40):

I think they mostly add noise and make the code harder to follow. They are describing what you already see by reading the code. So they essentially are repeating the rest of the line, but in non-tested english names.

Brendan Hansknecht (May 05 2023 at 14:42):

arbitraryGeneratorNew = new data

{state: arbitraryGeneratorAfterFirstStr, value: str1} = arbitraryStr arbitraryGeneratorNew
len1 = Str.countUtf8Bytes str1
{state: arbitraryGeneratorAfterSecondStr, value: str2} = arbitraryStr arbitraryGeneratorAfterFirstStr
len2 = Str.countUtf8Bytes str2

{state: arbitraryGeneratorAfterReferenceBool, value: reference1} = ratio arbitraryGeneratorAfterSecondStr 1 2
{value: reference2} = ratio arbitraryGeneratorAfterReferenceBool 1 2

Brendan Hansknecht (May 05 2023 at 14:43):

stateNew = new data

{state: stateAfterFirstStr, value: str1} = arbitraryStr stateNew
len1 = Str.countUtf8Bytes str1
{state: stateAfterSecondStr, value: str2} = arbitraryStr stateAfterFirstStr
len2 = Str.countUtf8Bytes str2

{state: stateAfterReferenceBool, value: reference1} = ratio stateAfterSecondStr 1 2
{value: reference2} = ratio stateAfterReferenceBool 1 2

Matthias Toepp (May 05 2023 at 14:43):

I think when writing code things tend to seem obvious but long names like that do make it easier for someone to understand what's happening when looking at someone else's code.

Brendan Hansknecht (May 05 2023 at 14:46):

I do not think so at all in this case. This is a background state that would generally be best to thread in the background and forget about.

There is no real value to a reader or difference between stateAfterFirstStr and stateAfterSecondStr.

Also, you know that it is stateAfterFirstStr because you can read the rest of the line of code: {..., value: str1} = arbitraryStr stateNew

Brendan Hansknecht (May 05 2023 at 14:46):

Variables should not encode the order of use or the imperative transitions in code. Their names should just describe what they are.

Brendan Hansknecht (May 05 2023 at 14:47):

This is also brittle because I can't add a new value in the middle without renaming multiple things.

Matthias Toepp (May 05 2023 at 14:49):

Matthias Toepp (May 05 2023 at 14:50):

You could currently write v1, v2, v3... but with the disadvantages you point out.

Matthias Toepp (May 05 2023 at 14:52):

Anton (May 05 2023 at 14:57):

It would be nice for the editor to hide and automatically handle the generators in situations like this but that's not a perfect solution either.

Matthias Toepp (May 05 2023 at 14:59):

@Brendan Hansknecht
Have you already shown how you would like the code to look (with shadowing?)

Brendan Hansknecht (May 05 2023 at 15:00):

state = new data

{state, value: str1} = arbitraryStr state
len1 = Str.countUtf8Bytes str1
{state, value: str2} = arbitraryStr state
len2 = Str.countUtf8Bytes str2

{state, value: reference1} = ratio state 1 2
{value: reference2} = ratio state 1 2
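
For comparison, here is a minimal sketch of the same state-threading pattern in Rust, which already permits redeclaration with `let`. Everything in it is an assumption made for illustration: `State`, `next_u64`, and the two `two_values_*` functions are hypothetical names, and the step function is a simple linear congruential update rather than anything from a real library. It only shows that the numbered and the shadowed versions compute the same thing.

```rust
// Hypothetical stand-in for the generator state discussed above.
#[derive(Debug, Clone, Copy, PartialEq)]
struct State(u64);

// Hypothetical step function: advances the state and returns a value.
// A plain linear congruential update; not meant to be a good RNG.
fn next_u64(state: State) -> (State, u64) {
    let next = state
        .0
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    (State(next), next >> 33)
}

// Without shadowing: every intermediate state needs its own name.
fn two_values_numbered(seed: u64) -> (u64, u64) {
    let state0 = State(seed);
    let (state1, a) = next_u64(state0);
    let (_state2, b) = next_u64(state1);
    (a, b)
}

// With shadowing: each `let` introduces a new binding named `state`,
// so the stale state can no longer be referenced by accident.
fn two_values_shadowed(seed: u64) -> (u64, u64) {
    let state = State(seed);
    let (state, a) = next_u64(state);
    let (_, b) = next_u64(state);
    (a, b)
}
```

Adding a third step in the middle of `two_values_shadowed` would not require renaming anything, which is the brittleness point raised earlier about the numbered form.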

Matthias Toepp (May 05 2023 at 15:13):

Isn't this a pretty special case then where randomness (arguably) negates the difference between variables?

Brendan Hansknecht (May 05 2023 at 15:19):

I think that is a common use case, but you could also be building up or using a state without randomness. Any time you want part of a value/return type, this will come up. Every time you see a pipeline, this is implicitly coming up; it is just that pipeline syntax avoids the naming issue.

Matthias Toepp (May 05 2023 at 15:24):

@Brendan Hansknecht Do you think it's worth it to sacrifice being able to confidently rely on a thing with a certain name having the same value? i.e. that it hasn't changed in the intervening lines.

Matthias Toepp (May 05 2023 at 15:29):

there could also be a simple syntax to indicate that something is shadowed/shadowable: ~state

Brendan Hansknecht (May 05 2023 at 15:34):

I am totally open to some form of specific syntax here. Actually probably could be pretty awesome as part of the name (though maybe really strange to read at first).

Anton (May 05 2023 at 15:38):

Brendan Hansknecht (May 05 2023 at 15:40):

~state = new data

{state: ~state, value: str1} = arbitraryStr ~state
len1 = Str.countUtf8Bytes str1
{state: ~state, value: str2} = arbitraryStr ~state
len2 = Str.countUtf8Bytes str2

{state: ~state, value: reference1} = ratio ~state 1 2
{state: ~state, value: reference2} = ratio ~state 1 2

Brendan Hansknecht (May 05 2023 at 15:40):

Brendan Hansknecht (May 05 2023 at 15:41):

~state = new data
List.map mylist \x ->
    {state: ~state, value: str1} = arbitraryStr ~state
    Str.concat x str1

Matthias Toepp (May 05 2023 at 15:51):

perhaps the first use of a shadowed variable could be used plainly (without the annotation) to make clear it is the first use?

Brendan Hansknecht (May 05 2023 at 15:52):

The reason I made it explicit is that it is essentially me saying, this variable is allowed to be shadowed.
That way, if I see:

state = new data

Brendan Hansknecht (May 05 2023 at 15:53):

Matthias Toepp (May 05 2023 at 15:57):

Yes, but I'm not sure that's necessary if every shadow is annotated... wouldn't it be nice to know where the original var is?

Matthias Toepp (May 05 2023 at 15:58):

Anton (May 05 2023 at 15:58):

Brendan Hansknecht (May 05 2023 at 16:00):

state = 3
...
y = state + 3
...

state = 3
...
~state = state + 1
...
y = state + 3
...

Brendan Hansknecht (May 05 2023 at 16:01):

Essentially, what happens if I miss that it was shadowed and the value changes on me?

Matthias Toepp (May 05 2023 at 16:01):

but that would be an error because the last state has been shadowed out by ~state

Brendan Hansknecht (May 05 2023 at 16:02):

Matthias Toepp (May 05 2023 at 16:02):

Matthias Toepp (May 05 2023 at 16:03):

Brendan Hansknecht (May 05 2023 at 16:05):

That's fair... I kinda think adding the rename once is just an unnecessary inconsistency.

~state = ~state + 1
...

Or the other case where I simply don't care if there is a shadow above. It doesn't affect me:

~state = 7
...

Anton (May 05 2023 at 16:06):

Brendan Hansknecht (May 05 2023 at 16:07):

Richard Feldman (May 05 2023 at 16:15):

Richard Feldman (May 05 2023 at 16:16):

which is to say, shadowing is banned, but you can opt into allowing variables to be reassignable or not on a per-variable basis

Richard Feldman (May 05 2023 at 16:16):

Richard Feldman (May 05 2023 at 16:17):

I'm open to the sigil idea (I used to do Perl long ago, and I have fond memories of the sigils being an extremely concise way to tell things about a particular variable) although I do remember there being some push-back (much) earlier in this thread regarding sigils in general

Brendan Hansknecht (May 05 2023 at 16:21):

I programmed in Go for a while, where capitalization mattered. I thought it was a terrible idea at first, but I came to respect it for things that you would prefer to know at a glance. So I am open to a sigil because of that.

Richard Feldman (May 05 2023 at 16:21):

a thought about the ~ prefix sigil specifically: -x looks fine, but -~x looks very weird to me :sweat_smile:

Richard Feldman (May 05 2023 at 16:22):

maybe a ! suffix since reassignable variables have a kind of imperative feel to them?

Richard Feldman (May 05 2023 at 16:23):

there's also always the old $ prefix, e.g. $foo = ... and then -$foo or !$foo

Richard Feldman (May 05 2023 at 16:23):

which I suppose would feel familiar in that in languages which use the $ prefix, those variables are always reassignable :big_smile:

Matthias Toepp (May 05 2023 at 16:27):

$ is a bit more obvious than exclamation, so it depends on what is wanted with that.

Matthias Toepp (May 05 2023 at 16:28):

Matthias Toepp (May 05 2023 at 16:36):

Matthias Toepp (May 05 2023 at 16:40):

Having a sigil is better than explicitly allowing shadowing because it allows us to keep the guarantees that an unshadowed definition hasn't changed in intervening lines of code. The sigil makes it explicit where shadowing is happening.

Matthias Toepp (May 05 2023 at 16:48):

For people like me, who are essentially opposed to shadowing, having a sigil (or keyword, I suppose) is a nice compromise. I don't have to use it. I get the benefits of having definitions as constants. If someone else uses it, then I'm not surprised by it (because I see the sigil). The sigil can provide all the desired benefits for those who wish to use shadowing.

Richard Feldman (May 05 2023 at 16:53):

based on everything that's been discussed so far, I'm convinced that both of the following are true:

Richard Feldman (May 05 2023 at 16:55):

that said, I don't think it automatically follows that the only design that makes sense is one that supports redefining/shadowing in some contexts and not others; another possible answer is "despite the fact that both of these have significant value, that value doesn't justify the cost of [a particular design]"

Brendan Hansknecht (May 05 2023 at 16:56):

Hey @Richard Feldman, what is the minimal amount of code to convert this to the backpassing-friendly syntax? Also, would it be usable with tasks? Or would the two forms of backpassing conflict (iirc they would type mismatch)?

state = new data

{state, value: str1} = arbitraryStr state
len1 = Str.countUtf8Bytes str1
{state, value: str2} = arbitraryStr state
len2 = Str.countUtf8Bytes str2

{state, value: reference1} = ratio state 1 2
{value: reference2} = ratio state 1 2

Where we want something like (of course with the equivalent of await as needed):

str1 <- arbitraryStr
len1 = Str.countUtf8Bytes str1
str2 <- arbitraryStr
len2 = Str.countUtf8Bytes str2

reference1 <- ratio 1 2
reference2 <- ratio 1 2

Really trying to understand the alternative to shadowing or being stuck with state1, state2, state3.

Matthias Toepp (May 05 2023 at 17:15):

Sigils (or a keyword) are a compromise in terms of complexity, but without them you have to decide which half of the users to make unhappy.

Richard Feldman (May 05 2023 at 17:16):

Matthias Toepp (May 05 2023 at 17:31):

People who don't like sigils may be assuming they can have it all their way in terms of shadowing or no shadowing.

Matthias Toepp (May 05 2023 at 17:35):

Matthias Toepp (May 05 2023 at 17:37):

Having half of people unhappy with shadowing or no shadowing is really a lot of unhappy people!

Richard Feldman (May 05 2023 at 17:40):

thinking about it personally, I think if the sigil approach existed, I would occasionally use it

Richard Feldman (May 05 2023 at 17:42):

I don't think I'd consider it the type of feature that's like "you can use this if you need it, but it's a code smell so try to avoid it" - rather, I imagine putting it in the tutorial as "you should default to not using it, but here are the circumstances where it can improve your code"

Matthias Toepp (May 05 2023 at 17:48):

The really remarkable and wonderful thing about using sigils to mark shadowing is that, as someone who is opposed to all-out shadowing (and I think this goes for @Anton as well), I'm OK with the solution (except that there is a complexity trade-off, I guess, but I'm not in a position to measure the consequences of that). And @Brendan Hansknecht can have all the benefits of shadowing that he is asking for (I believe). I wonder how much @Folkert de Vries and @Ayaz Hafiz (and others with a similar perspective) are opposed to having sigils to mark a name as shadowable or shadowing (and the consequent language complexity), in light of how it does seem to resolve a major divide over shadowing versus no shadowing (i.e. it gives us kind of the best of both worlds: confidence about when variables are constants, and shadowing with awareness of when shadowing is at play).

Matthias Toepp (May 05 2023 at 18:26):

Richard Feldman (May 05 2023 at 18:27):

Matthias Toepp (May 05 2023 at 18:37):

Richard Feldman (May 05 2023 at 18:37):

I wouldn't expect people to just use that all over the place when (for example) top-level declarations couldn't use it

Georges Boris (May 05 2023 at 19:01):

I like the sigil idea... but I fear opening another syntax might have its own side effects. is this the only sigil? are we creating a pattern? what are other things that can eventually use this syntax? what is the mindset the user is creating when seeing a sigil in Roc? just assuming "it's a one time thing" can easily turn into a can of worms in the future (but then again... I like it :sweat_smile: just trying to play the devil's advocate here)

Brendan Hansknecht (May 05 2023 at 19:08):

Brendan Hansknecht (May 05 2023 at 19:13):

Also, I just wrote up both versions of random. The backpassing generator version and the regular version.
Here is what the end user functions look like:

repeatedState = \{} ->
    state0 = new 1234

    (state1, u1) = randU64 state0 10 20
    (state2, f1) = randF64 state1 0 1
    (_, str) = randStr state2

    u1Str = Num.toStr u1
    f1Str = Num.toStr f1

    "\(str): \(u1Str), \(f1Str)"


genState = \{} ->
    generator =
        u1 <- randU64 10 20 |> andThen
        f1 <- randF64 0 1 |> andThen
        str <- randStr |> andThen

        u1Str = Num.toStr u1
        f1Str = Num.toStr f1

        constant "\(str): \(u1Str), \(f1Str)"

    state = new 1234
    (_, out) = generate state generator
    out

Messing with the examples, I think the biggest issue with the generator form is that I don't think it can be used with Tasks.
You can't do:

u1 <- randU64 10 20 |> andThen
maxFloat <- Stdin.readFloat |> Task.await
f1 <- randF64 0 maxFloat |> andThen

It will lead to a type mismatch. So generators work in isolation (with a bit more verbosity), but are limited in what they can represent. So I don't think the generator syntax fixes the issues we are discussing here.

Richard Feldman (May 05 2023 at 19:22):

Agus Zubiaga (May 05 2023 at 19:49):

You could make it composable with Task if you really wanted to, but it wouldn't be the prettiest :upside_down:

Brendan Hansknecht (May 05 2023 at 20:28):

Agus Zubiaga (May 05 2023 at 22:44):

Basically, you can make generators that return tasks: Random.Generator (Task a err)

Agus Zubiaga (May 05 2023 at 22:44):

Agus Zubiaga (May 05 2023 at 22:58):

genState = \{} ->
    generator =
        u1 <- randU64 10 20 |> andThenTask
        maxFloat <- Stdin.readFloat |> Task.await
        f1 <- randF64 0 maxFloat |> andThen

        u1Str = Num.toStr u1
        f1Str = Num.toStr f1

        constant "\(u1Str), \(f1Str)"

    state = new 1234
    (_, out) <- taskGenerate state generator |> Task.await
    out

andThenTask : Generator a, (a -> Task (Generator b) err) -> Generator (Task b err)

taskGenerate : State, Generator (Task a err) -> Task a err

Agus Zubiaga (May 05 2023 at 22:59):

Agus Zubiaga (May 05 2023 at 23:04):

In this case you don't even need this because Stdin.readFloat doesn't use u1, and you could just do it outside of the generator

Brendan Hansknecht (May 05 2023 at 23:05):

Agus Zubiaga (May 05 2023 at 23:11):

Yeah, probably puzzling for a beginner, but something I might be ok doing if I had to generate a lot of numbers that depended on results of effects

Brendan Hansknecht (May 05 2023 at 23:13):

Matthias Toepp (May 13 2023 at 08:24):

Georges Boris (May 13 2023 at 14:05):

Brendan Hansknecht (May 22 2023 at 15:05):

I am trying to write a parser that is in DOD form. So instead of using a recursive node struct, it has a list of nodes and uses U32 to index into it.

Node: [
    Let {ident: Node, expre: Node}
    Ident Str
    ...
]

Node: [
    Let {ident: U32, expre: U32}
    Ident Str
    ...
]

As such, to generate a new node, you have to append it to the node list. This means that you have a mutable node list that needs to be passed into every function of a recursive descent parser and returned mutated back up the stack. That list will get updated multiple times in a single function call. So it requires multiple names and makes the code less readable.

On top of that, you have an error list and a token index, which all are mutable and passed into and out of every function. So many things that all have multiple references in a single function.

Note: part of this pain would be alleviated if #2836 was fixed. Then all the mutable data could be passed around in one record. Until it is fixed, they all need to be separate or the functions take a huge perf hit due to tons of unnecessary copying. That said, in both cases, shadowing would be quite helpful.
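
To make the shape of the problem concrete, here is a rough Rust sketch of an index-based node list that is threaded through functions by value. It is only an illustration under assumptions: the `Num` variant, the `push` and `build_let` helpers, and the `expr` field name are hypothetical and not taken from the parser described above. Rust's `let` shadowing stands in for the shadowing being asked for.

```rust
// Hypothetical flat AST: nodes refer to each other by u32 index into the list.
#[derive(Debug, PartialEq)]
enum Node {
    Let { ident: u32, expr: u32 },
    Ident(String),
    Num(i64), // assumed extra variant, just to have an expression to point at
}

// Appends a node and hands the updated list back together with the new index.
fn push(mut nodes: Vec<Node>, node: Node) -> (Vec<Node>, u32) {
    let index = nodes.len() as u32;
    nodes.push(node);
    (nodes, index)
}

// Builds the nodes for something like `x = 1`.
fn build_let(nodes: Vec<Node>) -> (Vec<Node>, u32) {
    // Shadowing keeps a single name for the list as it is updated three times.
    let (nodes, ident) = push(nodes, Node::Ident("x".to_string()));
    let (nodes, expr) = push(nodes, Node::Num(1));
    push(nodes, Node::Let { ident, expr })
}
```

Without shadowing, `build_let` would need names like `nodes1`, `nodes2`, and so on, and the same again for an error list and a token index.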

drathier (Aug 12 2023 at 14:17):

joshi (Dec 02 2023 at 19:02):

I haven't read this entire thread, but Richard asked in the AoC channel that we share our thoughts, so here are mine:

In general, I agree that shadowing most likely makes code harder to understand in almost all situations, since now there might be multiple places you need to check to figure out where a value might come from, and you also have to parse their scope in your head to actually figure out what is happening (goto-definition saves lives here). This is especially true if you allow shadowing directly on the same level, and when I first learned FP it actually took me a long time to understand the difference between

let x = 1
let x = x + 1

and mutable variables (The first functional language I learned was F#). I think I only really understood the difference after I saw recursive definitions in Haskell (like ones = 1 :: ones).

However, having done Elm for a few years now, I ran into these situations where it would have been really convenient or beneficial to have shadowing:

(Keep in mind that not being able to shadow is a really minor problem with a trivial fix, and the benefits are quite substantial I think - like being able to move definitions around while guaranteeing that everything will still work)

line <- Stdin.line |> Task.await
when line is
    Input line -> # well, I have to come up with a new name now... maybeLine? theLine?
        { before: id, after: data } <- line |> Str.splitFirst " " |> Result.try
        id <- id |> Str.toNat |> Result.try # again, probably going to rename the first one to rawId or idStr
    _ -> crash "whatever"

(result1, state1) = doSomethingComplicated value initialState
(result2, state2) = rememberToUseResultNow result1 state1
(result3, state3) = hopeYouDontMessThisUp result2 state2
(combineResults result1 result2 result3, state3)

I know this case directly contradicts what I said in the beginning, so maybe this is entirely fine, since there is already this feel of "I need to be careful here" around it, at least for me.

Brendan Hansknecht (Dec 02 2023 at 19:12):

LoipesMas (Dec 02 2023 at 19:50):

x = 2
y = 5
z = x * y

z = x * y
x = 2
y = 5

(I'm not sure if there is a term for this, other than just "being declarative")
Shadowing would make this complicated, because shadowing implies order. So, for example:

z = x * y
x = 2
y = 5
x = y + 2
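
The order dependence can be seen in a language that does allow shadowing. The following is a small Rust sketch (the function name `order_matters` is made up for this example) using the same numbers: the product differs depending on whether it is computed before or after `x` is rebound, so the definitions can no longer be freely reordered.

```rust
// Returns the product computed before and after `x` is shadowed.
fn order_matters() -> (i32, i32) {
    let x = 2;
    let y = 5;
    let z_before = x * y; // sees x = 2
    let x = y + 2; // shadows `x` with 7
    let z_after = x * y; // sees x = 7
    (z_before, z_after)
}
```

Calling `order_matters()` gives `(10, 35)`: the same expression `x * y` means two different things depending on its position.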

joshi (Dec 02 2023 at 19:56):

This is probably the biggest benefit of not allowing shadowing - you can re-order and pull things out however you like, and it guarantees that in the end, the result will be the same (as long as it still compiles).

To me, that trumps any annoyance of sometimes having to name variables x2 or idStr.

LoipesMas (Dec 02 2023 at 19:59):

I've just checked how it works in Haskell and for a do block order matters and you can shadow variables, but for where clause the order doesn't matter but you can't shadow variables (so just like Roc). I prefer the Roc approach, it just fits FP better for me.

LoipesMas (Dec 02 2023 at 20:02):

I think this case wouldn't work with shadowing, because you need those results at the end. And if you didn't need them, then you could use pipes to pass the state and result. Not saying such cases don't happen though.

joshi (Dec 02 2023 at 20:04):

I was thinking about only shadowing state, since you definitely should use the previous state at every step. I was maybe not as good as I thought at coming up with a "real-feeling" example while typing this :sweat_smile:

Brendan Hansknecht (Dec 06 2023 at 23:26):

I know it is not advised to try to directly port imperative code to a functional language like Roc, but I wanted to port some benchmarks over and model them as closely as possible on the external benchmarks. This is by far the most I have ever wanted shadowing; this code is so painful to write.

Brendan Hansknecht (Dec 06 2023 at 23:26):

It is managing 3 rngs along with other variables that are also versioned throughout the function.

Anton (Dec 08 2023 at 10:06):

I do find it easy to read, but definitely a good snippet to try out with possible shadowing implementations.

Shritesh Bhattarai (Mar 12 2024 at 17:12):

I was searching for a way to bind to the entire record with an optional field after setting the optional value inside a pattern match... and I ended up back at my own question lol. Is there a way to do this now?

foo = \{id ? "default value"} as rec ->
    doSomethingWith rec

Richard Feldman (Mar 12 2024 at 18:23):

Richard Feldman (Sep 05 2024 at 16:14):

just saw a concrete shadowing benefit happen in Unison live-coding: Paul spent some time debugging a problem that turned out to be because he hadn't shadowed away some stale state, which he realized here:

After fixing the bug he said "I did not follow my own advice, which is to shadow variables that you're no longer using."

Eli Dowling (Dec 08 2024 at 08:43):

Did we ever get a clear verdict on yay or nay for shadowing? Since I've been doing AoC and writing some Roc experiments after a long break, I've been trying to really take note of the places where I'm finding a lot of friction.

Shadowing is probably number one, maybe tied with tag inference errors for frustrations. But shadowing is definitely the number one cause of bugs in my code right now. Exactly what Richard describes here, using stale state because I had no way to "hide the variable that shouldn't be used"

I'd be keen to do the implementation work here if there is some consensus that it's something we want.

Eli Dowling (Dec 08 2024 at 08:51):

Sorry, ignore this. I just saw it was added to the planned breaking changes, with a mention that it's blocked by the canonicalization rewrite. I guess I'll try to improve tag union errors instead :sweat_smile:

Brendan Hansknecht (Dec 08 2024 at 16:44):

As a note, I think shadowing as it is written here for any value is not planned now. I think it is planned to be reserved for variables with an _ after the name.

Anthony Bullard (Dec 08 2024 at 18:31):

So interesting the differences in opinions. Andrew Kelley doesn't want shadowing in Zig because he views it as a major source of bugs. And here a lot of people arguing that NOT shadowing is a major source of bugs. I think both can be true, but how do you know which way to go?

Oskar Hahn (Dec 08 2024 at 18:45):

I think that visual feedback can help. If I can see in my editor, that an identifier shadows another one, I will probably be suspicious and less likely to create a shadow bug.

But you cannot get visual help, that you used the wrong value (acc instead of acc2).

Jasper Woudenberg (Dec 08 2024 at 19:00):

I think Zig might be a bit different in that many cases where a reasonable argument can be made that shadowing would avoid bugs in Roc, are cases where Zig would use mutation. For instance: in a pure function that generates a bunch of random values we might write this in Roc using shadowing:

randomPerson = \init_seed ->
    var seed_ = init_seed
    age, seed_ = random_int seed_
    name, seed_ = random_str seed_ 10
    { name, age }

In Zig you would probably pass a reference to the seed into the random functions and let it mutate. The motivation to shadow would fall away.

Jasper Woudenberg (Dec 08 2024 at 19:07):

I quite like the shadowing approach in the latest proposal. It allows shadowing, but creates just enough friction to using it, with the var keyword and name constraint, that I'd expect most people to avoid shadowing unless they have a strong use case.

Anthony Bullard (Dec 08 2024 at 19:15):

Great point @Jasper. This is why Elixir allows shadowing by default (to them you are just rebinding). It's a bit different for them since the = operator is really just a pattern match that can crash a process. And they also let you "pin" the identifier if you want to say "don't bind this, make sure the matched value EQUALS the value of the existing binding". Which I think is a thing that Roc could use in when (obviously can't do that in normal assigns).

Brendan Hansknecht (Dec 08 2024 at 19:26):

Brendan Hansknecht (Dec 08 2024 at 19:27):

The _ suffix is trying to play the middle ground. Shadowing is off by default but can be opted in

Brendan Hansknecht (Dec 08 2024 at 19:27):

Brendan Hansknecht (Dec 08 2024 at 19:28):

The one piece this is missing that zig mut enforces is that the type stays the same. We probably should enforce that as well

Brendan Hansknecht (Dec 08 2024 at 19:29):

Also, just realized I repeated what @Jasper Woudenberg said. Should have read all replies before replying myself

Anthony Bullard (Dec 08 2024 at 19:45):

Yeah, love that enforced immutability allows for safe use of a very useful practice that's dangerous (or at least confounding) in places with mutability

Eli Dowling (Dec 09 2024 at 01:47):

Ahh yeah, I do remember seeing that. Seems like a fine solution for now, and something that hasn't been explored elsewhere

Anthony Bullard (Dec 09 2024 at 02:14):

I like the idea of the identifier being forced to carry information about being rebindable down the body of the function, but _ does feel like it will be a little overloaded in semantics in the language, since it’s also used alone or as a prefix to signify that something is discarded or unused

Anthony Bullard (Dec 09 2024 at 02:14):

Anthony Bullard (Dec 09 2024 at 02:22):

I think of the rest of the possible symbols, ' is nice, but it more commonly means prime. That’s too close in intent and might be confusing

& might be nice as a prefix, and there is a little bit of an overlap in semantics between this and a reference.

Brendan Hansknecht (Dec 09 2024 at 02:44):

Yeah, it's hard to pick good symbols when balancing aesthetics, typeability (especially on non-us keyboards), and overlap with other languages in ways that would be confusing.

Kilian Vounckx (Dec 09 2024 at 05:45):

In one of Richard's talks I remember him saying something like: 'An underscore at the start means I'm not using this. An underscore at the end means I'm using this more than once'. I like this mnemonic, even though 'using more than once' is really 'assigning more than once' but it convinced me somewhat of using the symbol

Richard Feldman (Dec 09 2024 at 05:55):

yeah the pithy version is "underscore in front means unused, and underscore at the end means reused"

Kilian Vounckx (Dec 09 2024 at 05:56):

Something I just thought of though. We are changing to snake_case. I feel like this will make semantic underscores slightly less readable, but I'd have to see in practice

Anthony Bullard (Dec 09 2024 at 10:49):

Great point @Kilian Vounckx and something I meant to include (given I just did the snake_case work you’d think it would have been top of mind)

Anthony Bullard (Dec 09 2024 at 11:03):

Here's a slightly reworked version of Jasper's example above with all three different realistic symbols using snake_case

random_person = \seed ->
    var init_seed_ = seed
    age, init_seed_ = random_int! init_seed_
    name, init_seed_ = random_str! init_seed_ 10
    { name, age }

random_person = \seed ->
    var &init_seed = seed
    age, &init_seed = random_int! &init_seed
    name, &init_seed = random_str! &init_seed 10
    { name, age }

random_person = \seed ->
    var $init_seed = seed
    age, $init_seed = random_int! $init_seed
    name, $init_seed = random_str! $init_seed 10
    { name, age }

Brendan Hansknecht (Dec 09 2024 at 16:33):

I think trailing _ is still fine. This is meant to be noticeable, but has no need to really stick out.

Kilian Vounckx (Dec 09 2024 at 16:43):

Fair. In any case, most editors can probably be configured to highlight them differently if wanted

Sam Mohr (Dec 09 2024 at 17:06):

Anthony Bullard (Dec 09 2024 at 17:06):

That’s true. You can make that sort of ident in tree-sitter assigned to a different highlight group

Eli Dowling (Dec 09 2024 at 17:07):

Anthony Bullard (Dec 09 2024 at 17:07):

Eli Dowling (Dec 09 2024 at 17:07):

Anthony Bullard (Dec 09 2024 at 17:07):

Sam Mohr (Dec 09 2024 at 17:07):

Anthony Bullard (Dec 09 2024 at 17:08):

Speaking of which, I need to get tree-sitter-roc set up in my neovim config. It’s jarring going over to Zed just for syntax highlighting

Eli Dowling (Dec 09 2024 at 17:09):

Anthony Bullard (Dec 09 2024 at 17:09):

Eli Dowling (Dec 09 2024 at 17:10):

There are not all that many colours available in most colour schemes, so you'd most likely end up with a conflict with something else. Overall, I'd probably emit both and let people do whatever they want

Eli Dowling (Dec 09 2024 at 17:11):

I don't personally use nvim, so feel free to go forth and make a pr with my blessing :sweat_smile:

Anthony Bullard (Dec 09 2024 at 17:11):

So you’d ask people to add their own query for that? So fork it or can we add extra somehow?

Anthony Bullard (Dec 09 2024 at 17:11):

Oh sorry, I don’t run into many tree-sitter enjoyers that aren’t nvim users

Anthony Bullard (Dec 09 2024 at 17:12):

Sam Mohr (Dec 09 2024 at 17:12):

Anthony Bullard (Dec 09 2024 at 17:12):

Sam Mohr (Dec 09 2024 at 17:12):

Anthony Bullard (Dec 09 2024 at 17:12):

Eli Dowling (Dec 09 2024 at 17:12):

No no, I'd make a sensible default, but I'd separate the underscore in the parse tree.
That would let folks customise it if they want to using an override.

Anthony Bullard (Dec 09 2024 at 17:13):

Eli Dowling (Dec 09 2024 at 17:13):

Anthony Bullard (Dec 09 2024 at 17:13):

Anthony Bullard (Dec 09 2024 at 17:14):

Eli Dowling (Dec 09 2024 at 17:14):

Very easy. in nvim you just add a file called something like roc.scm to a folder in your neovim config and I think you need a comment at the top.

Anthony Bullard (Dec 09 2024 at 17:14):

Anthony Bullard (Dec 09 2024 at 17:15):

I’ve written TS grammars, just never customized one from a plugin without forking

Eli Dowling (Dec 09 2024 at 17:15):

Anthony Bullard (Dec 09 2024 at 17:15):

Eli Dowling (Dec 09 2024 at 17:16):

I'm not sure if helix can do overrides or if you just need to copy the whole highlighter.
I actually use neovim for debugging because the TS inspector is fab

Anthony Bullard (Dec 09 2024 at 17:17):

And don’t worry Richard, I think Zed is great for a GUI editor. But my Tmux/nvim workflow is in my soul

Eli Dowling (Dec 09 2024 at 17:19):

Yeah, zed seems cool but tbh... I hate to say it, it's kinda slow...
But everything is kinda slow compared to helix :sweat_smile:
But when my laptop is unplugged and locked at 400mhz on the CPU, helix is like butter and most other things... Still not too bad, but more like butter with sand and little rocks.

Eli Dowling (Dec 09 2024 at 17:22):

Also the helix editing model really does just feel much more intuitive to me.
Maybe if one day zed gets a helix mode I'll consider switching :sweat_smile:

Anthony Bullard (Dec 09 2024 at 17:22):

Rendering ASCII is pretty easy compared to drawing the entire UI pixel buffer. Especially when run in a fast terminal

Eli Dowling (Dec 09 2024 at 17:25):

But helix doesn't have AI integration... What, do they expect me to write CSS by hand, like a cave man???
For work, sadly crapping out mediocre php code is just magnitudes faster with ai assistance. So I'm back to vscode a lot of the time these days.

Brendan Hansknecht (Dec 09 2024 at 17:36):

I really wonder what the numbers on this are. I wouldn't be surprised if the monthly active user counts were similar.

Brendan Hansknecht (Dec 09 2024 at 17:38):

Brendan Hansknecht (Dec 09 2024 at 17:39):

There are a lot of us in roc. I remember first adopting it and no one knew what it was. The numbers have grown much higher since then.

Richard Feldman (Dec 09 2024 at 17:44):

a quick thought about highlighting the underscore on its own - I think that might give the misimpression that it's a separate operator instead of part of the name, so personally I think highlighting the whole name differently would be best :big_smile:

Sam Mohr (Dec 09 2024 at 17:44):

Anthony Bullard (Dec 09 2024 at 17:44):

Brendan Hansknecht (Dec 09 2024 at 17:49):

Does it even need different highlighting? Generally for other languages, they just highlight the equivalent of a mut keyword. So the variables are less distinct than what roc will have with the trailing underscore.

Sam Mohr (Dec 09 2024 at 17:50):

I think longer variable names, now that we're moving towards snake_case, will be less obviously shadowed than an underscore after a camelCase name

Brendan Hansknecht (Dec 09 2024 at 17:51):

Sam Mohr (Dec 09 2024 at 17:51):

Sam Mohr (Dec 09 2024 at 17:52):

You're right that just having the underscore is already an improvement over the status quo

Brendan Hansknecht (Dec 09 2024 at 17:52):

Also, if we have enough colors in default schemes that it is barely different, that sounds fine. I'm worried about it sticking out and needing extra config. I want to remind everyone that originally we were gonna do shadowing without _. I really don't think these need to stick out at all.

Sam Mohr (Dec 09 2024 at 17:54):

Well, that was why I initially suggested just highlighting the final underscore to not make it overly conspicuous

Brendan Hansknecht (Dec 09 2024 at 17:54):

Also, I think it will still be quite noticeable with long names. The important part is assignments. Even if you can't see the trailing _, you will notice the extra spacing at assignment.

Anthony Bullard (Dec 09 2024 at 17:54):

I personally think it would make things like PR reviews a lot easier or more effective. But it’s not a hill I’d die on

Sam Mohr (Dec 09 2024 at 17:55):

Brendan Hansknecht (Dec 09 2024 at 17:56):

:nod: (I really wish Zulip had gif emoticons, this nod emoji is so unsatisfying)

Brendan Hansknecht (Dec 09 2024 at 18:24):

Joshua Warner (Dec 10 2024 at 02:58):

My desire for shadowing is not just restricted to mutable updates. It also comes up when writing code to recursively traverse a tree, for example:

explore = \node ->
  when node is
    Line node -> explore node
    Seq nodes -> List.map nodes explore

I could name that inner item innerNode or whatever, but if I do that, I can't guarantee that the code there _doesn't_ accidentally use the outer node arg. I have some code I was working on recently where I made this very bug (accidentally using node instead of innerNode). This is especially important as the code in the various branches gets more complicated.

This is somewhat similar to mutable rebinding in a for loop, since we are in this case iterating over a _tree_, and trying to update the current "state" variable to point to the next node we'll be looking at.
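For comparison, here is a rough sketch of the same shape in Rust, which does allow a pattern to shadow the function argument. The `Node` type, its `Leaf` variant, and `sum_leaves` are made up for illustration (the original example has no leaf case), so treat this as an assumption-laden sketch rather than anything from the formatter code mentioned above.

```rust
// Hypothetical tree type, loosely mirroring the Line/Seq tags in the example.
// `Leaf` is an added assumption so the recursion has a base case.
enum Node {
    Leaf(u32),
    Line(Box<Node>),
    Seq(Vec<Node>),
}

// Sums all leaf values in the tree.
fn sum_leaves(node: &Node) -> u32 {
    match node {
        Node::Leaf(n) => *n,
        // The pattern binds a new `node` that shadows the parameter, so this
        // arm cannot refer to the outer node by accident.
        Node::Line(node) => sum_leaves(node),
        Node::Seq(nodes) => nodes.iter().map(sum_leaves).sum(),
    }
}
```

With a separate name like `inner_node` in the `Line` arm, the outer `node` would still be in scope there, which is exactly where the accidental-use bug can creep in.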

Richard Feldman (Dec 10 2024 at 03:09):

I think var should work here, because the idea is for patterns to be able to reassign to vars:

explore = \var node_ ->
  when node_ is
    Line node_ -> explore node_
    Seq nodes -> List.map nodes explore

Joshua Warner (Dec 10 2024 at 03:32):

Joshua Warner (Dec 10 2024 at 03:33):

What if I have some code before/after the when, and I'd like that code to definitely always use the outer node?

Joshua Warner (Dec 10 2024 at 03:33):

There's definitely some Rust code I was writing recently for the roc formatter that looks like that

Brendan Hansknecht (Dec 10 2024 at 03:48):

Before the when is fine.....after.....idk what our scope plan is. I think it would have to be the new inner node_. So it would implicitly bind to the outer scope.

Sam Mohr (Dec 10 2024 at 03:50):

I'm pretty sure that in the given example, once you do Line node_ ->, the outer node_ has been covered up. But Seq would still be able to use it.

Joshua Warner (Dec 10 2024 at 03:54):

Brendan Hansknecht (Dec 10 2024 at 03:56):

explore = \var node_ ->
    something = node_
    list =
        when node_ is
            Line node_ -> explore node_
            Seq nodes ->
                unused = node_
                List.map nodes explore
    List.append list node_

explore = \node ->
    something = node
    (list, node2) =
        when node is
            Line innerNode -> (explore innerNode, innerNode)
            Seq nodes ->
                unused = node
                (List.map nodes explore, node)
    List.append list node2

Brendan Hansknecht (Dec 10 2024 at 03:56):

And I would say we have to explicitly ban any sort of shadowing through a lambda.

Sam Mohr (Dec 10 2024 at 03:57):

Brendan Hansknecht (Dec 10 2024 at 03:58):

Also, instead of the tuple return, probably would map to a special mono node to set a symbol to a new value.

Sam Mohr (Dec 10 2024 at 03:58):

Sam Mohr (Dec 10 2024 at 03:59):

My current thought for canonicalization is that it should think of it as mutation, even though typechecking treats it as re-assignment

Sam Mohr (Dec 10 2024 at 04:00):

It's a bit nitpicky to get into the particulars, but long story short the "under the hood" won't just be mutation or re-assignment, but a bit of both between canonicalization and codegen

Anthony Bullard (Dec 10 2024 at 04:00):

Yeah shadowing and rebinding look similar sometimes, but are different in important ways

Shadowing: the variable with this name has this new value for the rest of this scope (and child scopes)

Rebinding: the variable with this name will have this value for the rest of the scope it was introduced in - and that scope's children (including this one)

Anthony Bullard (Dec 10 2024 at 04:03):

In SSA shadowing would create a new var and ensure all references to that name would use the new var for the rest of the scope
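A small Rust sketch of the difference, since Rust happens to support both forms. The function names are hypothetical, and this only illustrates the two definitions above; it says nothing about how Roc would implement either one.

```rust
// Shadowing: a new binding in a child scope; the outer binding is untouched.
fn shadow_in_child_scope() -> u32 {
    let total = 1;
    {
        let total = total + 10; // a new variable, visible only in this block
        assert_eq!(total, 11);
    }
    total // the outer binding still holds 1
}

// Rebinding: assigning in a child scope updates the variable in the scope
// that introduced it.
fn rebind_in_child_scope() -> u32 {
    let mut total = 1;
    {
        total = total + 10; // the same variable as the outer `total`
    }
    total // the outer binding now holds 11
}
```

The two bodies differ only by `let` versus plain assignment inside the block, but the value seen after the block differs.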

Anthony Bullard (Dec 10 2024 at 04:03):

Richard Feldman (Dec 10 2024 at 04:30):

Richard Feldman (Dec 10 2024 at 04:31):

in Rust, mut does both - it enables reassignment (e.g. inside a for loop you can reassign something declared outside the loop with mut to have a different value) but also it enables mutation (e.g. if a function has mut on one of its arguments, and you call that function, the thing you passed in may get changed just because you passed it in there)
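A minimal Rust sketch of the two roles, with hypothetical function names. Note that in Rust the caller-visible case goes through a `&mut` reference, so that is what the second function uses.

```rust
// Role 1: reassignment. `count` is declared outside the loop with `mut`
// and reassigned inside it.
fn count_evens(values: &[u32]) -> u32 {
    let mut count = 0;
    for v in values {
        if v % 2 == 0 {
            count = count + 1;
        }
    }
    count
}

// Role 2: mutation the caller can observe, through a `&mut` parameter.
fn push_zero(list: &mut Vec<u32>) {
    list.push(0);
}
```

After `push_zero(&mut xs)`, the caller's `xs` has one more element, even though the caller never assigned to it.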

Richard Feldman (Dec 10 2024 at 04:31):

Richard Feldman (Dec 10 2024 at 04:32):

so you can put var outside a loop and then reassign it inside the loop, which causes the outer thing to change

Richard Feldman (Dec 10 2024 at 04:32):

but if you declare that a function accepts a var foo as an argument, callers still don't have to worry about passing anything in there potentially resulting in it being changed after the function call
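The closest Rust analogue is probably `mut` on a by-value parameter, sketched below with a hypothetical `clamp_to_limit` function. This is only an analogy under that assumption, not a description of how Roc's var will behave.

```rust
// `mut` on a by-value parameter only makes the local binding reassignable.
// The caller's value is copied in and is never affected.
fn clamp_to_limit(mut value: u32, limit: u32) -> u32 {
    if value > limit {
        value = limit; // reassigns the local binding only
    }
    value
}
```

A caller that passes `original` to `clamp_to_limit` can keep using `original` afterwards with its old value; only the returned result reflects the reassignment.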

Sam Mohr (Dec 10 2024 at 04:33):

This is the thing that "feels" like mutation. I think it'll end up getting implemented as mutation in codegen

Richard Feldman (Dec 10 2024 at 04:33):

Richard Feldman (Dec 10 2024 at 04:34):

the reason I'm avoiding using the term "mutation" to describe it is that usually mutation means two things, and this only enables one of them

Sam Mohr (Dec 10 2024 at 04:35):

Eli Dowling (Dec 10 2024 at 04:53):

True, we wouldn't want to scare off any of the functional programming purists.
I've heard saying mutation too loud tends to make them scurry back into the dark rocky caves they came from.

Richard Feldman (Dec 10 2024 at 04:55):

Eli Dowling (Dec 10 2024 at 04:58):

Nah, that's mutation, that's locked away and hidden out of sight. It's safe, like seeing mutation at the zoo vs coming face to face with it on the savanna!
