Underscores in variable names · contributing

Ajai Nelson (May 21 2023 at 18:55):

I was starting to try to improve some error messages involving ignored arguments and underscores in variable names (including #3987). But then I found #3026: "Allow underscores in field names and variable names." Is it still the plan to start allowing underscores?

Richard Feldman (May 21 2023 at 23:10):

yep!

Ajai Nelson (May 22 2023 at 01:01):

If underscores are allowed in regular variable names, how will we handle unused variables?

Is this the idea?

As before, all unused variables should start with an underscore.
So f = \x -> 3 still produces a warning, and you can still suppress it by adding an underscore: f = \_x -> 3.
But not all variables starting with an underscore have to be unused.
So f = \_x -> _x is allowed and doesn't produce a warning.

Ayaz Hafiz (May 22 2023 at 01:07):

maybe leading underscores are unused variables? though your suggestion also makes sense to me

Richard Feldman (May 22 2023 at 01:34):

yeah I like your suggestion

Richard Feldman (May 22 2023 at 01:34):

part of the motivation here is just knowing that people will commonly serialize things that have underscores in them, and not wanting to create unnecessary friction there

Richard Feldman (May 22 2023 at 01:35):

and it's also plausible that people will serialize things with underscores at the start

Richard Feldman (May 22 2023 at 01:35):

I believe what you suggested is Rust's policy, and it seems to have worked out fine in practice
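
For reference, a minimal Rust sketch of the policy mentioned above. The function names are hypothetical and chosen only for illustration. It relies on rustc's `unused_variables` lint, which warns about an unused binding unless its name starts with an underscore, while still allowing an underscore-prefixed binding to be read.

```rust
// Hypothetical function names, for illustration only.

// Writing the parameter as `x` here would trigger rustc's `unused_variables`
// warning; the leading underscore suppresses it.
fn ignores_arg(_x: i32) -> i32 {
    3
}

// An underscore-prefixed binding can still be read; rustc accepts this
// without a warning.
fn uses_underscored_arg(_x: i32) -> i32 {
    _x
}
```

These correspond to the Roc examples `f = \_x -> 3` and `f = \_x -> _x` from the proposal earlier in the thread.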

Brendan Hansknecht (May 22 2023 at 01:42):

Wouldn't it make more sense to find a way to map names when encoding/decoding? Isn't it just as plausible that a json field name will start with a capital letter, which roc wouldn't allow?

Qqwy / Marten (May 22 2023 at 10:28):

Some languages warn when you use a variable that starts with an underscore, mirroring the warning you get when you leave unused a variable that doesn't start with one. (A 'should not have been used' warning).

Qqwy / Marten (May 22 2023 at 10:40):

Brendan Hansknecht said:

Wouldn't it make more sense to find a way to map names when encoding/decoding? Isn't it just as plausible that a json field name will start with a capital letter, which roc wouldn't allow?

Languages that use UpperCamelCase for field names are rarer than languages that use snake_case or dromedaryCase. (To my knowledge, only C# and C#-adjacent languages follow this convention.) There should be a way to map between them, but the question is what to support as part of the language itself and what to delegate to a separate library.

Richard Feldman (May 22 2023 at 11:06):

Brendan Hansknecht said:

Wouldn't it make more sense to find a way to map names when encoding/decoding? Isn't it just as plausible that a json field name will start with a capital letter, which roc wouldn't allow?

oh yeah, that reminds me that @Luke Boswell's JSON implementation already supports converting to camelCase (possibly already from snake_case?)

Richard Feldman (May 22 2023 at 11:06):

@Ajai Nelson let's actually hold off on https://github.com/roc-lang/roc/issues/3026 for now

Richard Feldman (May 22 2023 at 11:07):

but definitely still want https://github.com/roc-lang/roc/issues/3987 !

Brendan Hansknecht (May 22 2023 at 13:24):

Is the name conversion done via a dict, or is it first class like when the name actually matches the record field?

Brendan Hansknecht (May 22 2023 at 13:48):

Oh, just looked at the source, no intermediate dictionary needed. It looks like we load the full name, convert it as specified (all names must use the same conversion, but you can write your own custom function), and then convert the full string name into a record field update (not exactly sure how; I assume we generate that function automatically and it is just a when on the string name that converts to the field updater).

Brendan Hansknecht (May 22 2023 at 13:57):

So definitely a step forward, but still missing some pieces. At least from my quick reading:

can't change the field name case convention at a fine granularity
will always parse the full field name (so it's easy to DDoS with a large field name with many words that would take a lot of work to convert between cases). It also means we don't generate a more optimal parser that skips things sooner.
the record update lookup has to use the Roc field name. Preferably, we would change the lookup to use the field name in the new case instead of converting the field names of the JSON input. That would fix the DDoS mentioned above.

I think we likely have to look into a way to parametrize this more directly such that the names directly affect decodeRecord no matter the decoder. Also, would be great to expose the names in a way that more optimized lookup could be generated. I get that we don't want macros, so we can't just generate something super optimal at compile time due to knowing all the details, but I think there is still definitely a gap here that will be very measurable.

Richard Feldman (May 22 2023 at 18:24):

this is really tricky with structural types though

Richard Feldman (May 22 2023 at 18:25):

like if you have a nominal record type like Rust does, there's a clear place to put per-field information on how you want it to be serialized

Richard Feldman (May 22 2023 at 18:29):

I do think a possible answer here (which is seeming more and more likely to be the way to go the more we talk about this) is:

if you want maximum performance, don't use Encode and Decode at all; rather, do something like protobuf which explicitly uses its own schema and generates code to serialize individual types directly, while also accounting for backwards compatibility
if you want more convenience at the expense of performance, use Encode and Decode, possibly in conjunction with a converter that changes camelCase into snake_case
if you want something in between (e.g. because you're dealing with a weird JSON schema that you don't control, which - for example - mixes camelCase and snake_case in field names), there are various levers you can pull to do this, such as making part of the type you're decoding into opaque, and giving it a custom Decode implementation, etc.

Brendan Hansknecht (May 22 2023 at 18:30):

Specifically for field names: Isn't it just mostly a matter of allowing a custom definition of the Str mapping in stepField? Instead of matching "fieldName" -> decode... I want "field-name" -> decode....

Richard Feldman (May 22 2023 at 18:31):

oh you mean when writing a Decode implementation by hand?

Richard Feldman (May 22 2023 at 18:32):

I'm really just thinking of the automatic one that comes with every record

Brendan Hansknecht (May 22 2023 at 18:35):

I am thinking about Json.roc when specifying that you want kebab-case, for example. The goal is to avoid the very easy DDoS attack of sending a JSON document with a really long kebab-case name, where JSON decoding would then waste a lot of CPU converting it to camelCase.

Brendan Hansknecht (May 22 2023 at 18:37):

Would rather convert nameIControl (this is the field name in the Roc record) to name-i-control and then use that for comparison than convert some-name-i-dont-control-that-is-very-long-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-a-... to SomeNameIDontControlThatIsVeryLongAAAAAAAAAAAAAAAAAAAAAAAAA... and use that for comparison.
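
A minimal Rust sketch of this idea, under stated assumptions: the helper names (`camel_to_kebab`, `wire_names`, `lookup_field`) are hypothetical and are not taken from Json.roc or Roc's Decode. The known field names are converted to kebab-case once, and each incoming key is then compared as-is, so an attacker-controlled key is never case-converted.

```rust
// Hypothetical helpers, for illustration only; not the Json.roc implementation.

/// Convert a camelCase field name we control (e.g. "nameIControl")
/// into kebab-case ("name-i-control").
fn camel_to_kebab(name: &str) -> String {
    let mut out = String::with_capacity(name.len() + 4);
    for ch in name.chars() {
        if ch.is_ascii_uppercase() {
            // Start a new word, except at the very beginning of the name.
            if !out.is_empty() {
                out.push('-');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Convert every known field name once, up front.
fn wire_names(fields: &[&str]) -> Vec<String> {
    fields.iter().map(|f| camel_to_kebab(f)).collect()
}

/// Find which known field an incoming key refers to using plain string
/// comparison; the incoming key itself is never converted.
fn lookup_field(wire: &[String], incoming_key: &str) -> Option<usize> {
    wire.iter().position(|w| w.as_str() == incoming_key)
}
```

With this shape, the conversion work depends only on the field names in the record type, not on the length of the keys in the input.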

Richard Feldman (May 22 2023 at 18:52):

oh I see

Richard Feldman (May 22 2023 at 18:52):

so basically a way to tell decode "hey right before you're about to compare my field names, first run this conversion on them"

Richard Feldman (May 22 2023 at 18:52):

something like that?

Luke Boswell (May 22 2023 at 20:09):

I might have a look at changing that, good idea Brendan.

Ajai Nelson (May 22 2023 at 21:10):

In R, if you put backticks around a variable name, you can include any character you want as part of the variable name, including dashes and spaces. So `myVar` is the same thing as just myVar, but you can also have a variable called `my var` or `my-var`. This is actually very useful in the context of R, since, for example, it's common to read CSV files where the column names have spaces in them.

I doubt it's the right solution for Roc, but maybe it’s something to keep in mind if people really want something more flexible than the current solution.

Brendan Hansknecht (May 22 2023 at 21:32):

Yeah, exactly, give decode a mapping for field names rather than trying to map all JSON strings to Roc's naming format.

Last updated: Aug 17 2025 at 12:14 UTC