Stream: ideas

Topic: snake_case instead of camelCase


view this post on Zulip Agus Zubiaga (Nov 10 2024 at 01:57):

I always found snake_case a lot nicer to read, and I'd love to use it instead of camelCase for defs, pattern bindings, and type variables

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:00):

I think we should first allow _ to be used everywhere we parse an ident that starts with a lowercase letter, and then make the formatter automatically convert all camelCase ones to snake_case
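A rough sketch of the kind of conversion the formatter would need to do. This is an illustration in Python, not code from the Roc formatter; the function name `camel_to_snake` and the choice to lowercase acronyms are assumptions.

```python
import re

def camel_to_snake(name: str) -> str:
    """Convert a lowercase-initial camelCase identifier to snake_case."""
    # Split an acronym from a following capitalized word: "HTTPStatus" -> "HTTP_Status".
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    # Split a lowercase letter or digit from a following capital: "getHttp" -> "get_Http".
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    # Assumption: acronyms end up lowercase, like every other word.
    return s.lower()

print(camel_to_snake("userLastName"))   # user_last_name
print(camel_to_snake("getHTTPStatus"))  # get_http_status
print(camel_to_snake("call_index"))     # call_index (already snake_case, unchanged)
```

Names that are already snake_case pass through untouched, so running such a conversion repeatedly would be harmless.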

view this post on Zulip Brian Teague (Nov 10 2024 at 02:01):

While I do agree snake case is easier to read, it's more annoying to type. I want fewer key strokes when coding.

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:02):

Well, you can type camelCase and let the formatter do it for you :smiley:

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:02):

I wouldn't make uppercase letters a syntax error but a warning. This would allow you to run the formatter on it and have it clean it up.

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:02):

This will also help migrate all code

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:07):

I did that when I implemented the new modules syntax, and it worked great

view this post on Zulip jan kili (Nov 10 2024 at 02:14):

I'm a big fan of snake_case for uncertain reasons that I'm excited to introspect on and report back here tomorrow when at a keyboard.

view this post on Zulip jan kili (Nov 10 2024 at 02:16):

In languages that prefer camel case but allow snake_case, I use it and enjoy that I can tell instantly which names are mine and which are built-in/third-party. (I believe that says more about my personal preference for snake_case than it says about my desire for the visual distinction.)

view this post on Zulip jan kili (Nov 10 2024 at 02:17):

I will say right now that any coding environment with autocomplete renders almost all _ typing troubles moot.

view this post on Zulip jan kili (Nov 10 2024 at 02:17):

For example, I'd type uslanaTAB for user_last_name.

view this post on Zulip Richard Feldman (Nov 10 2024 at 02:21):

an argument for camelCase is that it's more consistent with the type syntax, which is PascalCase (and I think should stay that way)

view this post on Zulip Richard Feldman (Nov 10 2024 at 02:21):

that said, I also enjoy snake_case aesthetically

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:22):

Yeah, I also think types and module names should remain UpperCamelCase

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:22):

I think that's what every snake_case language I've used does and it seems fine

view this post on Zulip Richard Feldman (Nov 10 2024 at 02:23):

and it's worth noting that Rust has PascalCase for types and tags (or rather Rust's equivalent of tags) and snake_case for values and field names, and it seems fine

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:23):

yeah, python and ruby too

view this post on Zulip Richard Feldman (Nov 10 2024 at 02:24):

I think both options are fine personally, although for some reason I do prefer how function names ending in ! look in snake_case

view this post on Zulip Agus Zubiaga (Nov 10 2024 at 02:25):

Same. I think that might be because the only two languages I've used where ! appears in names also happen to prefer snake_case (Rust and Ruby) :smiley:

view this post on Zulip Brendan Hansknecht (Nov 10 2024 at 02:27):

I think starting or trailing _ does look better with snake case

view this post on Zulip Richard Feldman (Nov 10 2024 at 02:28):

oh yeah that one too :big_smile:

view this post on Zulip Brendan Hansknecht (Nov 10 2024 at 02:33):

I also kinda like variables being distinctly separate from type names (and module). Like more distinct than the difference between PascalCase and camelCase

view this post on Zulip Derin Eryilmaz (Nov 10 2024 at 02:45):

and using _ to ignore a variable fits in better this way

view this post on Zulip Isaac Van Doren (Nov 10 2024 at 04:24):

I like typing camel case more than snake. Snake case is more readable though

view this post on Zulip Kilian Vounckx (Nov 10 2024 at 08:17):

I don't have a preference either way, but would definitely prefer types and modules to remain UpperCamelCase.
In the super rare/probably something wrong case of someone wanting to have a multi-word type variable (like List thisFeelsWrong), I would prefer it to stay lowerCamelCase though. Just for some consistency with other types and distinguishing from normal variables. I can't think of a use case for this, but it is possible I guess

view this post on Zulip Hannes (Nov 10 2024 at 09:42):

I want to suggest something so that I can vote against it: Nim is the only language I know that allows you to use either camelCase or snake_case and the compiler will automatically convert between the two, as in, you can define myFunction and then call it as my_function if you want.

It was something I really disliked about Nim, and I'm not sure I know why, although I'm sure I could've got used to it if I stuck with Nim.
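For readers who haven't used Nim, the behavior described above can be approximated like this. It is a sketch in Python of the comparison rule as I understand it (first character compared exactly, the rest compared ignoring case and underscores); the helper name `nim_style_equal` is made up for illustration.

```python
def nim_style_equal(a: str, b: str) -> bool:
    """Compare two identifiers the way Nim-style 'partial case insensitivity' would."""
    def normalize(ident: str) -> str:
        # Keep the first character as-is; drop underscores and case from the rest.
        return ident[:1] + ident[1:].replace("_", "").lower()
    return normalize(a) == normalize(b)

print(nim_style_equal("myFunction", "my_function"))  # True
print(nim_style_equal("Foo", "foo"))                 # False (first character differs)
```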

view this post on Zulip Jasper Woudenberg (Nov 10 2024 at 09:49):

I also like snake_case, and using different casings for different types of value. Prior art: Ruby and Zig do this too.

view this post on Zulip Sky Rose (Nov 10 2024 at 14:37):

Assuming we have the formatter convert from one to the other (in either direction), one gotcha we'll need to avoid: If you define both fooBar and foo_bar, then the formatter needs to not convert, because that would be a logic change. You'd need a human to rename one to resolve the conflict.

But if you define fooBar and then reference foo_bar, should the compiler tolerate that, and should the formatter rename one to match the other? It'd be similar to the Nim problem, except with the assumption that when you eventually run the formatter there will be one correct version. It'd also be a very common case, for example if you're trying to call a standard library function with the wrong casing.
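The first gotcha can be sketched as a check the formatter runs before renaming anything. This is a Python illustration under assumptions: `to_snake` and `plan_renames` are hypothetical names, and a real formatter would need to do this per scope rather than over a flat list.

```python
import re
from collections import defaultdict

def to_snake(name: str) -> str:
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    return s.lower()

def plan_renames(defined_names):
    """Return (safe renames, conflicts) for a collection of defined names."""
    by_target = defaultdict(set)
    for name in defined_names:
        by_target[to_snake(name)].add(name)

    renames, conflicts = {}, {}
    for target, sources in by_target.items():
        if len(sources) > 1:
            # Two distinct names would merge into one: leave both for a human.
            conflicts[target] = sorted(sources)
        else:
            (source,) = sources
            if source != target:
                renames[source] = target
    return renames, conflicts

renames, conflicts = plan_renames(["fooBar", "foo_bar", "userName"])
print(renames)    # {'userName': 'user_name'}
print(conflicts)  # {'foo_bar': ['fooBar', 'foo_bar']}
```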

view this post on Zulip Norbert Hajagos (Nov 10 2024 at 19:26):

I'm in favor of snake_case, even though camels are more friendly towards birds :)
Since we don't have the feature of _ being a marker where the piped value should go amongst the arguments.
Also, make sure we can't have a definition name be just an _, except if we meant to ignore the value.

view this post on Zulip Richard Feldman (Nov 10 2024 at 22:08):

does anyone have a preference for camelCase? I'd like to hear from that perspective too!

view this post on Zulip Derin Eryilmaz (Nov 10 2024 at 22:17):

Richard Feldman said:

does anyone have a preference for camelCase? I'd like to hear from that perspective too!

i mean, it's faster to type, and the most popular among other languages, so it does have some beginner familiarity for people coming from java or javascript

view this post on Zulip Richard Feldman (Nov 10 2024 at 22:21):

which do you personally prefer?

view this post on Zulip Sam Mohr (Nov 10 2024 at 23:21):

I think that I'm used to snake case and it can improve readability, but between type names and type variables being camelCase, and camelCase being easier to type, I lean towards it

view this post on Zulip Agus Zubiaga (Nov 11 2024 at 01:53):

Do you think that would still matter if you could write camelCase and formatter fixed it?

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:57):

If the auto-formatter fixed it, I'd write snake_case instead

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:58):

It's so minor to me that I'd rather write the right code the first time

view this post on Zulip Agus Zubiaga (Nov 11 2024 at 01:58):

but you could save some time :grinning:

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:58):

speedrunning.mov

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:58):

lol

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:58):

The main point I want to impress is the value of consistency

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:59):

I think Rust is able to communicate that well with TypeNames being in a different case than variable_names

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:59):

But type vars make it a bit trickier in Roc

view this post on Zulip Sam Mohr (Nov 11 2024 at 01:59):

Though by 0.01%, since the number of times you'll see a multi-word type var is extremely small

view this post on Zulip Sam Mohr (Nov 11 2024 at 02:03):

I guess we're gonna do snake_case, then. I'm here for it, I was only pushing against it because it seemed like Richard was surprised that everyone was onboard for a conversion and literally no one was disagreeing

view this post on Zulip Sam Mohr (Nov 11 2024 at 02:03):

But it works well, it's readable, yada yada

view this post on Zulip Richard Feldman (Nov 11 2024 at 02:07):

Sam Mohr said:

I guess we're gonna do snake_case, then. I'm here for it, I was only pushing against it because it seemed like Richard was surprised that everyone was onboard for a conversion and literally no one was disagreeing

well I think making informed decisions requires understanding the major different perspectives, and the status quo hadn't had any advocates in this thread :big_smile:

view this post on Zulip Richard Feldman (Nov 11 2024 at 02:07):

I mean maybe that's because absolutely everyone prefers the change, which can certainly happen, but I'd like to give some time to hear from people who maybe just haven't passed by the discussion first

view this post on Zulip Sam Mohr (Nov 11 2024 at 02:09):

Yeah, not rushing us making a decision! Just making explicit the common surprise that we've been doing camelCase this long and no one has opposed a move to snake_case yet

view this post on Zulip Ajai Nelson (Nov 11 2024 at 02:54):

I also prefer underscores. Here's an interesting article that includes some arguments from both sides, though its conclusion is pro-underscores: https://whatheco.de/2011/02/10/camelcase-vs-underscores-scientific-showdown/.

I think my personal favorite is actually kebab-case, though that might not be a good use of the strangeness budget. (For non-Lisps, the usual counterargument is that the dash could be confused with a minus sign, but I've heard that it hasn't caused any confusion for Pyret, which has been used to teach hundreds of beginning programmers. Pyret requires spaces around operators.)

view this post on Zulip jan kili (Nov 11 2024 at 03:21):

I didn't notice that Rust has that same convention we're considering. Another step toward ROC being Rust, Only Cuter.

view this post on Zulip Sam Mohr (Nov 11 2024 at 03:34):

Ajai Nelson said:

I also prefer underscores. Here's an interesting article that includes some arguments from both sides, though its conclusion is pro-underscores: https://whatheco.de/2011/02/10/camelcase-vs-underscores-scientific-showdown/.

I think my personal favorite is actually kebab-case, though that might not be a good use of the strangeness budget. (For non-Lisps, the usual counterargument is that the dash could be confused with a minus sign, but I've heard that it hasn't caused any confusion for Pyret, which has been used to teach hundreds of beginning programmers. Pyret requires spaces around operators.)

I actually agree that kebab-case is better than either option, but we tend to rule it out for negation reasons, e.g. -three-word-var is easy to miss the - prefix on. If we could make that work for Roc (suggested here), that'd be my vote. This ignores the importance of matching the style of other languages

view this post on Zulip Sam Mohr (Nov 11 2024 at 03:34):

I just assumed it wasn't a real option

view this post on Zulip Eli Dowling (Nov 11 2024 at 04:10):

I am definitely in the pro-snake-case crowd. For my own coding it won very naturally. I used camel-cased languages in my first few years of programming, then slowly moved to some snake-cased ones, and now I naturally always use snake case if possible.
The main motivator is that acronyms and abbreviations are much harder for me to read in camel case.

view this post on Zulip Eli Dowling (Nov 11 2024 at 04:12):

For me the minus sign meaning negation or subtraction is way too deeply ingrained to even consider kebab case.

view this post on Zulip Hannes (Nov 11 2024 at 07:18):

I mildly prefer camel case, I've used snake case languages for most of my career, but I find camel case slightly nicer, but I wouldn't say it's easier to read, just a purely aesthetic preference

view this post on Zulip Sky Rose (Nov 11 2024 at 14:28):

I mildly prefer camelCase. I switch between both in different contexts at work, and I like that camelCase is easier to type and also easier to edit, like changing the second half of a variable name. Similar to how I prefer ML-like spaces between function args rather than commas and spaces, there's just one less piece of punctuation to juggle. But it's minor. snake_case is definitely easier to read. The most important thing is to be consistent. Seems like there's consensus on snake_case and I have no objection.

view this post on Zulip Sky Rose (Nov 11 2024 at 14:29):

I do like kebab-case even more than snake_case in places that allow it, because you don't have to hold shift while typing it. But it would eat up a bit of strangeness budget.

view this post on Zulip Sky Rose (Nov 11 2024 at 14:29):

I also think it's worth trying to keep our variables written in the same way as in the languages that platforms are written in (Rust+Zig), so that if Roc is embedded in a larger application it can use the same variable names. That would mean snake_case.

view this post on Zulip Richard Feldman (Nov 12 2024 at 15:59):

so to summarize, it seems like:

view this post on Zulip Richard Feldman (Nov 12 2024 at 15:59):

does that all sound like a reasonable summary of the thread?

view this post on Zulip Tanner Nielsen (Nov 12 2024 at 16:42):

I think @Sam Mohr's point about type variable casing is an important one that may be overlooked.

I would also push back against multi-word type variable names being a rarity. Sometimes it's good to be more verbose for clarity, so I think this will definitely come up for some folks.

view this post on Zulip Richard Feldman (Nov 12 2024 at 16:51):

hm, I can't think of a time I've ever seen it come up so far. :thinking:

view this post on Zulip Tanner Nielsen (Nov 12 2024 at 16:52):

I've done it in my own code, let me see if I can dig up an example

view this post on Zulip Tanner Nielsen (Nov 12 2024 at 16:56):

Well I thought I did, but I guess I didn't :sweat_smile:. But I think the point stands that there could conceivably be a case where someone wouldn't be satisfied with the clarity afforded by a single letter or word, and would want to use a multi-word type var.

view this post on Zulip Tanner Nielsen (Nov 12 2024 at 16:57):

Maybe type vars staying camelCase would be a good enough solution to the issue though. The one thing I wouldn't want to see is snake_case type vars, as they'd easily get confused with variables

view this post on Zulip Brendan Hansknecht (Nov 12 2024 at 16:59):

I assume type vars stay camelCase and this just changes variable_names and function_names. Though in practice most type vars will probably just be lowercase due to being a single word. Then all types, aliases, and module names as PascalCase

view this post on Zulip jan kili (Nov 12 2024 at 17:04):

I'm okay with that, but leaving one use of camelCase for 0.0:face_with_spiral_eyes:01% of lines of code seems like a confusing surprise for most users, or a weird annoyance for a few unlucky users, or a piece of trivia for power users. It would also be an extra thing to remember to document/implement in several places: "In Roc, variable names use snake_case (unless it's a type variable name, like that a you saw earlier, but don't worry, you don't need to remember that because long ones are rare)."

Would it be that weird for type variable names to use the same casing as non-type variable names? We do that today and I don't think anyone's complained.

view this post on Zulip Tanner Nielsen (Nov 12 2024 at 17:11):

camelCase for type vars seems more consistent to me than snake_case if you think of camelCase as AlmostPascalCase :grinning_face_with_smiling_eyes:. PascalishCase for types, snake_case for variables and functions. I seem to be in the minority regarding the commonality of multi-word type vars, but if people are right that it's super rare, then I can't imagine too many people will be upset either way, and in that case I'd vote for consistency (although we seem to disagree on which is more consistent as well :sweat_smile:)

view this post on Zulip Tanner Nielsen (Nov 12 2024 at 17:13):

In Roc, variable names use snake_case (unless it's a type variable name, like that a you saw earlier, but don't worry, you don't need to remember that because long ones are rare).

I would be surprised to see it framed this way. I see type variables as a special kind of type, not a special kind of variable

view this post on Zulip Richard Feldman (Nov 12 2024 at 17:16):

I think that given that it will almost never come up (again, I've literally never seen it come up organically yet), it would be really strange to see it for the first time in your life after years and be like "oh, camelCase in Roc does exist! TIL!"

view this post on Zulip Richard Feldman (Nov 12 2024 at 17:17):

seems like "lowercase is snake_case and uppercase is PascalCase" is the least surprising convention

view this post on Zulip Brendan Hansknecht (Nov 12 2024 at 17:23):

Fair enough

view this post on Zulip Tanner Nielsen (Nov 12 2024 at 17:25):

I'm clearly in the minority here :sweat_smile:, but I'll just close with saying that keeping the distinction between types and variables as clear as possible seems worthwhile, and to that end, distinct casing is useful. A type variable is closer to a type than a variable in my opinion (others may disagree), and camelCase is closer to PascalCase than snake_case, thus the natural choice seems to be camelCase (for type variables specifically). I'll also point out that if multi-word type vars are so rare, then there's no difference between snake_case and camelCase in the majority of cases!

That said, types and "actual code" don't mix as much in Roc as they do in other languages where annotations are necessary, so maybe the visual distinction is less important than I'm thinking. And (while I somewhat disagree), the consensus seems to be that multi-word type vars are so rare that this probably isn't even worth a discussion.

Thank you for at least hearing my argument :smile:

view this post on Zulip Richard Feldman (Nov 12 2024 at 17:52):

ok, let's do it!

If anyone wants to open an issue (and/or just implement it), I think what needs to be done is:

for now, let's not do any compiler warnings for using the wrong style or anything like that...we can separately discuss introducing them later if desired, but at a minimum it seems clear that both styles should be accepted by the parser so that the formatter can automatically convert to the preferred style

view this post on Zulip Norbert Hajagos (Nov 12 2024 at 18:48):

Yesssss, sssssir #7214 is free for anyone to pick up!
I'm so glad for this change, be it however small.

view this post on Zulip Richard Feldman (Nov 12 2024 at 19:00):

nice! want to link to this discussion in case anyone is wondering where the issue came from?

view this post on Zulip Agus Zubiaga (Nov 12 2024 at 19:03):

let’s gooo

view this post on Zulip Norbert Hajagos (Nov 12 2024 at 19:22):

Richard Feldman said:

nice! want to link to this discussion in case anyone is wondering where the issue came from?

Good call, did it!

view this post on Zulip Richard Feldman (Nov 12 2024 at 19:31):

sweet, thanks!

view this post on Zulip Kasper Møller Andersen (Nov 12 2024 at 21:24):

As someone who’s never written any significant code in snake case, I’ve never understood the appeal of it. Is it just for readability? As someone used to camel case, adding in underscores everywhere feels like extra ceremony for little gain, like adding semi colons everywhere.

Does it play nicer with certain editors people use maybe?

view this post on Zulip Sam Mohr (Nov 12 2024 at 21:25):

It's more readable IMO because the underscores act like spaces

view this post on Zulip Sam Mohr (Nov 12 2024 at 21:25):

Meaning when you read it, you can visually group the words more easily

view this post on Zulip Sam Mohr (Nov 12 2024 at 21:26):

with camelCase, I have to look a little bit longer each time for where one word starts and another ends

view this post on Zulip Sam Mohr (Nov 12 2024 at 21:26):

It's minor, but it's definitely a thing for me

view this post on Zulip Kasper Møller Andersen (Nov 12 2024 at 21:32):

I don’t have the experience to make a fair comparison, but I fail to see the appeal at least. :shrug:

view this post on Zulip jan kili (Nov 12 2024 at 21:35):

Norbert Hajagos said:

I'm so glad for this change, be it however small.

Within the context of someone's first 5 seconds with Roc, this could be in the top 5 biggest changes to the language since its creation!

view this post on Zulip Kilian Vounckx (Nov 12 2024 at 22:25):

Richard Feldman said:

and then automatically combine consecutive underscores into 1 underscore so names never have 2 underscores in a row)

What if you actually want to have 2 underscores? I wouldn't know when though

view this post on Zulip Richard Feldman (Nov 12 2024 at 22:52):

if someone raises a specific use case they have, we can talk about it, but by default that's a mistake the formatter can fix

view this post on Zulip Agus Zubiaga (Nov 12 2024 at 22:56):

I’ve seen that used as a poor man’s namespacing technique. I’m in favor of disallowing that by default, though.

view this post on Zulip Brendan Hansknecht (Nov 12 2024 at 23:21):

Yeah, I have seen it in python as essentially extra namespacing for tests

view this post on Zulip Kasper Møller Andersen (Nov 13 2024 at 06:26):

We use the two underscores a lot in Elm for namespacing. Because Elm discourages nesting, we sometimes use this naming convention for “nested” items (both record items and in constructor names)

view this post on Zulip Sam Mohr (Nov 13 2024 at 07:05):

If it is common practice in Elm to subvert an anti-nesting design decision with variable naming conventions, then we probably should take that as a sign that the anti-nesting decision is hampering good code quality and do the opposite.

A.k.a. we should disallow double underscores and encourage nested data structures where appropriate.

view this post on Zulip Norbert Hajagos (Nov 13 2024 at 09:21):

Kasper Møller Andersen said:

As someone who’s never written any significant code in snake case, I’ve never understood the appeal of it. Is it just for readability? As someone used to camel case, adding in underscores everywhere feels like extra ceremony for little gain, like adding semi colons everywhere.

Does it play nicer with certain editors people use maybe?

It's harder to come across a pathologically unreadable name. Yesterday, I worked on the compiler. Glad I didn't find acallIndex, but rather a call_index variable. The first one I have to parse letter-by-letter to see what's there, while with the second one, I recognise the words "without reading them".
There also aren't dilemmas about acronyms. getHttpStatus is straightforward to me. Some prefer to capitalize HTTP (since that's an acronym), but they are highly discouraged from doing it with camelCase. getHTTPStatus puts me off for a second, since I see HTTPS right there. Minor things, but still, not having to squint every time I read a longer var name is really nice. I also tend to use longer var names, so that's my bias.

view this post on Zulip Kasper Møller Andersen (Nov 13 2024 at 11:00):

Norbert Hajagos said:

It's harder to come across a pathologically unreadable name.

Which is fair, but still feels to me like one of those things where the cure is worse than the disease. As in, to avoid pathological cases, everything becomes more verbose. Generally the cases you mention don’t bother me much at least, but again, I don’t have the experience to compare them fairly. So :shrug:

view this post on Zulip witoldsz (Nov 15 2024 at 00:53):

Sam Mohr said:

anti-nesting design decision

The _anti-nesting_ was something I hated about Elm so much. They tell you: craft your types, compose them, but for some bizarre reason: composing a record of other records was a bad idea. Don't ask why, just "trust-me-bro".

view this post on Zulip Brian Teague (Nov 15 2024 at 01:04):

I concur one reason to use snake_case over camelCase is acronyms. I have seen many real world scenarios in my career where acronyms are very inconsistent between properCamelCase and UPPERCase / innerUPPERCase for the acronym.

view this post on Zulip Anthony Bullard (Nov 20 2024 at 00:30):

I have registered interest in making this my first contribution to Roc in the GH issue. Parsers are the thing I've done most in my PL career to this point, so I feel at home there.

view this post on Zulip Luke Boswell (Nov 20 2024 at 00:43):

Welcome @Anthony Bullard, I assigned the issue to you. Let us know if you need anything. :smiley:

view this post on Zulip Anthony Bullard (Nov 20 2024 at 01:00):

What is the preferred place for discussion of issues in active development? Here or the GH issue?

view this post on Zulip Anthony Bullard (Nov 20 2024 at 01:01):

I guess #compiler development may also be appropriate

view this post on Zulip Sam Mohr (Nov 20 2024 at 01:05):

Anthony Bullard said:

What is the preferred place for discussion of issues in active development? Here or the GH issue?

The GitHub issue is good for anyone that's helping you review the change, otherwise #compiler development is good once we are past the discussion stage and have started trying to implement stuff

view this post on Zulip Brendan Hansknecht (Nov 20 2024 at 01:33):

Yeah, both #compiler development and #contributing work. You are much more likely to get implementation and debugging help quickly on Zulip than on a GitHub issue.

If it is about the design and there is already a thread in #ideas, you can also just continue that thread.

view this post on Zulip Anthony Bullard (Nov 20 2024 at 12:08):

@Richard Feldman I just want to be super clear here. We are allowing underscores in Tags and other uppercase names?

So

SomeTag

and

Some_Tag

are going to be valid?

I only ask because the latter is not exactly aesthetically pleasing, but I guess it does allow one to escape some of the same pitfalls as you would find in lowercase names. This is obviously not my call, but I just like absolute clarity before I begin implementation

view this post on Zulip Richard Feldman (Nov 20 2024 at 12:09):

yeah I think the parser should accept both

view this post on Zulip Richard Feldman (Nov 20 2024 at 12:10):

I'm not really concerned about people adopting Pascal_Snake case :big_smile:

view this post on Zulip Anthony Bullard (Nov 20 2024 at 12:11):

Great, yeah I think the aesthetics of it is its own limiting function :laughing:

view this post on Zulip Anthony Bullard (Nov 20 2024 at 15:16):

Put up the first commit of my first PR. Good progress, but feedback very much appreciated.

view this post on Zulip Luke Boswell (Nov 20 2024 at 19:56):

Is there a risk people use UPPER_CASE for types and modules? Is that ok? I would associate that with a constant.

view this post on Zulip Richard Feldman (Nov 20 2024 at 20:09):

yeah I think if people wanted to use that for anything it'd be values, but that wouldn't even compile since values have to be lowercase

view this post on Zulip Brendan Hansknecht (Nov 20 2024 at 20:23):

Personally I don't see a reason to allow either pascal with underscores or full upper

view this post on Zulip Brendan Hansknecht (Nov 20 2024 at 20:23):

Just allows people to break standards and write code that is really different from everything else

view this post on Zulip Brendan Hansknecht (Nov 20 2024 at 20:24):

Someone will do it and others will have to deal with it

view this post on Zulip Richard Feldman (Nov 20 2024 at 20:41):

I just generally think it's nice to have the parser be more permissive

view this post on Zulip Richard Feldman (Nov 20 2024 at 20:41):

and the formatter less so

view this post on Zulip Isaac Van Doren (Nov 20 2024 at 21:11):

I agree about keeping the parser permissive and the formatter strict in cases where the formatter immediately fixes the mistake the user made, but we couldn’t do that here because it could cause name collisions so I would prefer to not allow Pascal_Snake.

view this post on Zulip Richard Feldman (Nov 20 2024 at 21:17):

oh that's a good point!

view this post on Zulip Richard Feldman (Nov 20 2024 at 21:18):

fair enough - since the formatter can't fix it without potentially introducing compiler errors, let's give a parse error for underscores in uppercase names @Anthony Bullard

view this post on Zulip Anthony Bullard (Nov 20 2024 at 21:29):

I’ll fix up my PR tonight. It’s a small change

view this post on Zulip Richard Feldman (Nov 20 2024 at 21:30):

awesome, thank you!

view this post on Zulip Anthony Bullard (Nov 21 2024 at 11:53):

So I've added some open questions to my PR that I would like to have feedback on from the participants in this topic: https://github.com/roc-lang/roc/pull/7233

view this post on Zulip Richard Feldman (Nov 21 2024 at 12:01):

Do we wish to allow trailing '_' in lowercase idents?

definitely! This is a convention we actually want to make use of in the future :big_smile:

view this post on Zulip Richard Feldman (Nov 21 2024 at 12:04):

Do we want to allow multiple '_'s to parse successfully?

I'd say we should treat this the same way as underscores in uppercase names, and for the same reason: if the formatter quietly fixes it, that could introduce bugs, so instead we should have the compiler complain.

that said, I think both of those scenarios should be a warning - that is, the compiler pushes a Problem but otherwise accepts the name as valid, so it doesn't block you from running your program over a stylistic problem

view this post on Zulip Richard Feldman (Nov 21 2024 at 12:08):

Do we want to be strict on disallowing uppers in snake_case idents?

long term, my default thinking is that we should do the same thing here as what we do for double underscore or underscores in capitalized names (that is, warn but don't block).

However, short term I think we should actually have the formatter change this one for you, because otherwise converting all the existing code from camelCase to snake_case will take forever. :big_smile:
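Taken together, the three answers above amount to a small set of non-blocking checks on an identifier that has already parsed. A Python sketch of that classification follows; the function name and warning strings are made up for illustration and are not the Roc compiler's actual Problem type.

```python
def naming_warnings(ident: str) -> list[str]:
    """Return style warnings for an identifier; an empty list means no complaints."""
    warnings = []
    if "__" in ident:
        warnings.append("consecutive underscores")
    if ident[:1].isupper() and "_" in ident:
        warnings.append("underscore in uppercase name")
    if ident[:1].islower() and any(c.isupper() for c in ident):
        # Short term the formatter would rewrite these rather than warn.
        warnings.append("uppercase letter in lowercase name")
    return warnings

print(naming_warnings("call_index"))         # []
print(naming_warnings("read_file_"))         # [] (a trailing underscore is allowed)
print(naming_warnings("parse_node__expr"))   # ['consecutive underscores']
print(naming_warnings("Some_Tag"))           # ['underscore in uppercase name']
print(naming_warnings("parse_HTTP_header"))  # ['uppercase letter in lowercase name']
```

In every case the name is still accepted, so the program keeps running while the warning is reported.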

view this post on Zulip Anthony Bullard (Nov 21 2024 at 13:35):

Cool, I'll add these answers to the PR description and I'll treat these as invariants for me to track / test. I'll have to dig a bit to look at Problem and how that system works next.

view this post on Zulip jan kili (Nov 21 2024 at 16:32):

What's our plan for acronyms? I remember some folks above liking snake case partly because names like parse_HTTP_response are so readable. Maybe a name shouldn't start with caps like ID_card? Maybe proper noun caps aren't allowed like latest_message_from_Jan? I'd be a little bummed if acronyms must always be lowercase, but it isn't critical.

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:33):

I personally never use uppercase acronyms in snake_case. And I don’t think I’ve seen it

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:34):

But I haven’t worked in many code bases that use snake_case (a little Rust and all of Starlark and Elixir)

view this post on Zulip jan kili (Nov 21 2024 at 16:39):

:thinking: Hmm, I could just be imagining this being a thing other people do too, after years of doing it myself in languages like Typescript, Python, and Terraform.

view this post on Zulip Brendan Hansknecht (Nov 21 2024 at 16:46):

I haven't seen it either

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:50):

I would say we follow whatever norms there are in snake case languages. Python, Elixir, Ruby. In those languages there doesn’t seem to be a language level prohibition on uppers or multiple underscores

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:51):

I think the question is: would we want to be more constrained in the parser, or fall back to linting / formatting to address things that we don’t want?

Of course after there is consensus on exactly what we don’t want

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:52):

So, I’ll post this in separate messages and people can register their feedback with :+1: and :-1:

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:53):

Do we want to allow uppers in snake case identifiers?

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:53):

Do we want multiple _ in the middle of identifiers?

view this post on Zulip Anthony Bullard (Nov 21 2024 at 16:55):

so parse_HTTP_header and parse_node__expr are the two types of identifier we are discussing, just so the examples are clear

view this post on Zulip Agus Zubiaga (Nov 21 2024 at 16:57):

I think both should be allowed in the parser, but:

view this post on Zulip Agus Zubiaga (Nov 21 2024 at 16:58):

Treating either of these as syntax errors is needlessly frustrating

view this post on Zulip Agus Zubiaga (Nov 21 2024 at 16:58):

You should be able to run your program fine

view this post on Zulip Richard Feldman (Nov 21 2024 at 17:21):

yeah for sure none of these should be blocking errors

view this post on Zulip Richard Feldman (Nov 21 2024 at 17:21):

like the parser should always accept them, the question is just whether it also records a warning to display to the user as a nonblocking FYI

view this post on Zulip Anthony Bullard (Nov 21 2024 at 17:30):

Yes and then if we say yes to the latter, I need to decide if this is a Problem pushed during canonicalization by inspecting the identifier after parsing

view this post on Zulip Anthony Bullard (Nov 21 2024 at 17:31):

So maybe we say :warning: for “provide warning” and :check: “let it be”?

view this post on Zulip Richard Feldman (Nov 21 2024 at 17:53):

Anthony Bullard said:

Yes and then if we say yes to the latter, I need to decide if this is a Problem pushed during canonicalization by inspecting the identifier after parsing

yeah this seems like the best time to do it :thumbs_up:
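
A minimal Rust sketch of the "warn but don't block" idea from the messages above. The names `Problem`, `Env`, and `canonicalize_ident` are illustrative assumptions here, not the actual types or functions in the Roc compiler: the identifier is always accepted, and stylistic issues are only recorded so they can be shown as a nonblocking FYI.

```rust
/// Hypothetical warning kinds; the real compiler's `Problem` type will differ.
#[derive(Debug, PartialEq)]
enum Problem {
    UppercaseInSnakeCaseIdent(String),
    RepeatedUnderscore(String),
}

/// Hypothetical canonicalization environment that only collects problems.
#[derive(Default)]
struct Env {
    problems: Vec<Problem>,
}

/// Always accepts `ident` and returns it unchanged; stylistic issues are
/// pushed as warnings instead of being treated as syntax errors.
fn canonicalize_ident<'a>(env: &mut Env, ident: &'a str) -> &'a str {
    if ident.chars().any(|c| c.is_ascii_uppercase()) {
        env.problems
            .push(Problem::UppercaseInSnakeCaseIdent(ident.to_string()));
    }
    if ident.contains("__") {
        env.problems
            .push(Problem::RepeatedUnderscore(ident.to_string()));
    }
    ident
}
```

With this shape, both parse_HTTP_header and parse_node__expr pass through untouched, and each leaves exactly one entry in `env.problems` for the reporting step to display.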

view this post on Zulip Jasper Woudenberg (Nov 21 2024 at 21:19):

Random thought, possibly a terrible idea: What if the compiler internally normalizes identifiers, so that foo_bar, foo__bar, foo_BAR and fooBar are all considered the same identifier? Then the formatter would be able to standardize variable names without fear of introducing naming conflicts or other changes of behavior.

Nim does something like this so that people can use their favorite casing style for variable names. This would be a similar feature with the opposite purpose: letting the formatter help enforce a single naming convention.
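
To make the normalization idea concrete, here is a rough Rust sketch. The function names are made up for illustration, and the rule (drop every underscore, lowercase every ASCII letter) is a deliberately simplistic assumption; it is not Nim's exact rule and not something the Roc compiler does.

```rust
/// Collapses an identifier to a comparison key by dropping underscores and
/// lowercasing ASCII letters. Hypothetical rule, for illustration only.
fn normalize_ident(ident: &str) -> String {
    ident
        .chars()
        .filter(|&c| c != '_')
        .map(|c| c.to_ascii_lowercase())
        .collect()
}

/// Two spellings refer to the same name if their keys are equal.
fn same_ident(a: &str, b: &str) -> bool {
    normalize_ident(a) == normalize_ident(b)
}
```

Under this rule foo_bar, foo__bar, foo_BAR and fooBar all share the key "foobar". The same looseness also makes foo_bar and foob_ar collide, and erases a leading underscore, so a real design would need a more careful key.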

view this post on Zulip Isaac Van Doren (Nov 21 2024 at 21:21):

The issue with that is that if someone does not use the formatter then you‘ll have to read code that uses different formats for the same name which I would definitely like to avoid.

view this post on Zulip Anthony Bullard (Nov 21 2024 at 21:21):

We _could_ do something like that in canonicalization

view this post on Zulip Anthony Bullard (Nov 21 2024 at 21:22):

But I think I agree with Isaac here

view this post on Zulip Jasper Woudenberg (Nov 21 2024 at 21:27):

Do we think many people would forgo the formatter? A while ago I argued against a change that would make the coding experience a bit worse for people like me that (so far) are still LSP-less, but even I use the formatter :sweat_smile:.

view this post on Zulip Richard Feldman (Nov 21 2024 at 21:58):

relevant from earlier in the thread:

Hannes said:

I want to suggest something so that I can vote against it: Nim is the only language I know that allows you to use either camelCase or snake_case and the compiler will automatically convert between the two, as in, you can define myFunction and then call it as my_function if you want.

It was something I really disliked about Nim, and I'm not sure I know why, although I'm sure I could've got used to it if I stuck with Nim.

view this post on Zulip Anthony Bullard (Nov 21 2024 at 22:55):

Definitely feel like I'm far enough down the road where I'd like to pause and get some feedback on the PR. I've absorbed a lot about the compiler in the past 2.5 days, but I want to make sure I'm following the correct patterns and not misunderstanding expectations.

view this post on Zulip Anthony Bullard (Nov 21 2024 at 22:56):

In case it got lost in the thread, here is a link to the PR: https://github.com/roc-lang/roc/pull/7233

view this post on Zulip Jasper Woudenberg (Nov 22 2024 at 06:47):

Yeah, I read Hannes' comment and agree, I'm not a fan of that Nim feature either.

The difference would be that in Nim being able to use a mix of ways to refer to the same variable is the point. In our case we'd use it for the opposite purpose: applying a single style (through the formatter) better than we otherwise could. So it'd be a formatter-facing feature and not a user-facing feature, if that makes sense.

view this post on Zulip Anthony Bullard (Nov 22 2024 at 14:57):

@Jasper Woudenberg I think I get what you are saying, but that requires canonicalization be part of formatting, no? That could significantly slow down the formatter

view this post on Zulip Anthony Bullard (Nov 22 2024 at 14:58):

A very low priority question for @Richard Feldman : What is the plan for the stdlib as regards this change from camel -> snake case? Will we just rip the bandaid off - maybe with a codemod? Or will we alias and deprecate the camelCase members?

view this post on Zulip Richard Feldman (Nov 22 2024 at 16:14):

rip the band-aid off I think

view this post on Zulip Richard Feldman (Nov 22 2024 at 16:15):

I think it would be too big a project to try to make this backwards compatible, especially when the upgrade path is "run roc format and you're done" :big_smile:

view this post on Zulip Anthony Bullard (Nov 22 2024 at 16:22):

I'm concerned _slightly_ that that won't work quite as well as we hope. But moving forward with that approach, getting some feedback and then pivoting if necessary seems reasonable

view this post on Zulip Brendan Hansknecht (Nov 24 2024 at 17:08):

If we want more general Internet opinions and debates....not that we actually do, but:
https://x.com/zack_overflow/status/1860046682018447877?t=FeeTxYmEPO2aMCkjAozTGw&s=19

And probably the most interesting reply in favor of camelCase:
https://gist.github.com/redbar0n/c011f0e0c682a9e1baf3f273fddf730c

view this post on Zulip Richard Feldman (Nov 25 2024 at 03:20):

that gist is actually an interesting example of the context of whitespace vs parens-and-commas calling seeming relevant to me

view this post on Zulip Richard Feldman (Nov 25 2024 at 03:21):

the first line is:

// The following is an example from the language Kitten, but generalizes to other languages.

I'm actually not sure that it does generalize, because the point seems to be that it's harder to tell the whitespace apart from the spaces separating the arguments

view this post on Zulip Richard Feldman (Nov 25 2024 at 03:21):

but if spaces aren't separating the arguments, but rather parens and commas, that seems like a pretty relevant distinction to the point!

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 03:41):

For sure

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 03:42):

I also still prefer snake case for readability in their example

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 03:43):

But also, I never make my text so small that I can't see the underscores

view this post on Zulip Eli Dowling (Nov 25 2024 at 04:05):

Looking at it without syntax highlighting on mobile in a bright environment, I actually agree. The underscores and kebab are much less "separated" than camel case

view this post on Zulip Eli Dowling (Nov 25 2024 at 04:08):

the underscores and kebabs definitely look a lot more like a big sea of words, whereas with camel case each token is much easier to identify and pick up, and it is easier to read generally.

I'd like to see this comparison with syntax highlighting, I'm not sure if the difference would still stand

view this post on Zulip Richard Feldman (Nov 25 2024 at 04:39):

I mean overall, the revealed preference is the most important thing imo

view this post on Zulip Richard Feldman (Nov 25 2024 at 04:40):

it's not like all the excitement in this thread about switching from camelCase to snake_case would go away if people saw a sufficiently logical argument :laughing:

view this post on Zulip Richard Feldman (Nov 25 2024 at 04:40):

that's not how syntax preferences work!

view this post on Zulip Kasper Møller Andersen (Nov 25 2024 at 05:53):

That could also just be confirmation bias with a fairly small sample of people who are susceptible to snake case. I’m personally very unexcited by snake case, but I don’t have any strong arguments over what’s already been presented, so there’s no point in banging that same drum for me. It might be that many others feel the same. Or it might be that most people just genuinely prefer snake case

view this post on Zulip Kasper Møller Andersen (Nov 25 2024 at 06:21):

Generally I’m also concerned about even the idea of a formatter changing names for me (though that doesn’t seem to be the aim anymore at least?) or just getting compiler warnings that I’ve used a “bad” name. If I’ve ended up in a situation where using double underscores feels like a reasonable solution, any compiler warning has to have some damn good reasons for being there. And “it’s not consistent with our preferences” and “you should use nesting” doesn’t feel like that, at least from here. For example, in Elm we might have several custom types relating to the same domain in the same file. My experience says that even if those custom types are technically distinct, the code can still become much more readable from adding a bit of explicit namespacing in the name, so you can tell at a glance what you’re looking at, without needing to refer back to a type signature somewhere. At the same time, I also get thoroughly pissed anytime I’m told to do something that doesn’t seem to make any sense. So any compiler warning needs to work hard to sell why I shouldn’t be doing this, and showing what I should do instead

view this post on Zulip Anthony Bullard (Nov 25 2024 at 10:50):

Kasper Møller Andersen said:

So any compiler warning needs to work hard to sell why I shouldn’t be doing this, and showing what I should do instead

This is actually where I'm at. I highly doubt with a snake_case norm that people would throw out identifiers with multiple underscores willy-nilly - they would do it for a purpose. Warning there is just likely to be frustrating for people just trying to get their job done. I've actually already implemented this warning in my PR locally - and it would be annoying to roll it back - but I would rather do that than deliver a frustrating user experience, especially if it lands before the AOC release.

As for the formatter changing names, I can understand the concern. But I am putting in extra effort to make sure the conversion function creates the snake_case identifier that you would want: not just an "add a _ before an upper and then lowercase the whole thing" type of algorithm, but one that tries to understand where acronyms exist and respects digits as boundaries as well. When I have it complete (I'm traveling currently and have limited time), I'll @ you on the PR and you can check out the test cases and tell me what you think.

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 16:02):

Did we decide to allow capitals in snake case or are those warnings?

Can someone do

  1. some_HTTP_var
  2. MY_CONSTANT
  3. SoMeThInG_hOrRiD

personally, I really want all of those to be warnings.

I have no qualms with __. It might be accidental, but it doesn't hurt consistency in the same way capitals can: equality__nested_failure_test

view this post on Zulip Kasper Møller Andersen (Nov 25 2024 at 16:11):

I think the formatter changing things from camelCase to snake_case is alright, I just got the impression it would do more I guess. Feel free to add me though :smile:

view this post on Zulip Kasper Møller Andersen (Nov 25 2024 at 16:22):

I guess what I don’t understand is why we want to put so much effort into disallowing “weird” names. It’s fair enough to choose either camel or snake case, but beyond that, it feels like needless control to me. It’s the sort of thing where somebody is going to run foul of them even when they have a reasonable use case, and there’s no good reason for it as far as I can tell.

view this post on Zulip Richard Feldman (Nov 25 2024 at 16:42):

I appreciate that perspective, but I'd at least like to have the warnings in place at the outset while we're transitioning the ecosystem.

if people complain about it getting in the way of reasonable use cases in practice we can always reevaluate in the future!

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 16:43):

Roc is an opinionated language in general and enforces a lot of things for consistency. People used to say the same about enforcing a formatter config. Go was the first language to force everyone to use the same formatter (with no config options), and it turned out great. Enforcing consistency often adds value to the community as a whole even if it adds friction for some individuals.


All that said, I don't think the plan has anything too harsh in it. I think the full list of current rules is:

Variable identifiers are lowercase letters and numbers with underscores interspersed. Variables may not start with a number or underscore followed by a number. There can be a leading underscore for unused vars. There can be a trailing underscore for reassignable vars. Repeated underscores are not allowed.


I agree that more constraints could be dropped, but I don't think any of these constraints will significantly hinder someone.
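
The rule list above can be read as a small checker. Here is a sketch in Rust under that reading; `IdentIssue` and `check_ident` are hypothetical names, and treating every violation as a returned value (rather than deciding which are warnings and which are errors) is an assumption made only to keep the example short.

```rust
/// Hypothetical classification of rule violations from the list above.
#[derive(Debug, PartialEq)]
enum IdentIssue {
    Empty,
    InvalidChar(char),
    StartsWithDigit,
    RepeatedUnderscore,
}

/// Checks a variable identifier against the listed rules.
fn check_ident(ident: &str) -> Result<(), IdentIssue> {
    if ident.is_empty() {
        return Err(IdentIssue::Empty);
    }
    // Only lowercase ASCII letters, digits, and underscores are allowed.
    let allowed = |c: char| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_';
    if let Some(bad) = ident.chars().find(|&c| !allowed(c)) {
        return Err(IdentIssue::InvalidChar(bad));
    }
    // A single leading underscore (unused variable) is fine, but what
    // follows it must not start with a digit either.
    let body = ident.strip_prefix('_').unwrap_or(ident);
    if body.starts_with(|c: char| c.is_ascii_digit()) {
        return Err(IdentIssue::StartsWithDigit);
    }
    // A trailing underscore (reassignable variable) needs no special case;
    // only repeated underscores are rejected.
    if ident.contains("__") {
        return Err(IdentIssue::RepeatedUnderscore);
    }
    Ok(())
}
```

For example, `check_ident("_unused")` and `check_ident("count_")` are both accepted, while `check_ident("_1st")` reports `StartsWithDigit` and `check_ident("some_HTTP_var")` reports `InvalidChar('H')`.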

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 16:43):

Especially if some of the constraint failures just lead to warnings which still allow the code to compile.

view this post on Zulip Kasper Møller Andersen (Nov 25 2024 at 19:52):

And I'd be alright with having consistent naming on acronyms personally, but I'm not sure it's a rule that has large enough benefits to be worth the downsides. If you force lowercasing, you can no longer tell the difference between a SoC and a SOC for example. Not that those two exact examples are likely to occur together, but it just feels like a situation that is bound to occur in real life.

As for constants, how does Roc plan to let you match against constants? As in, can I do something like:

MY_CONSTANT = "constant"
...
where myString is
    MY_CONSTANT -> ...
    "someConstantString" -> ...
    ...

In Scala, an upper case first letter allows you to match against a value like this. That's quite subtle in practice, so I wouldn't recommend it necessarily, but it is a potential use case at least.

Finally, I don't really consider SoMeThInG_hOrRiD as a valid example, as that seems very much like "banning something for the sake of banning it", as it's not something I've ever encountered in practice.

Anyway, it's not a huge deal, I just want to be sure that we're not making rules for the sake of making rules. That tends to give more work and pain than it's worth, in my experience.

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 20:53):

Yeah, totally fair points. And good context.

view this post on Zulip Brendan Hansknecht (Nov 25 2024 at 20:54):

Given everything is semantically constant in roc, I don't think capitalizing constants has any real meaning in roc. Would just be capitalizing everything that isn't a function.

view this post on Zulip Eli Dowling (Nov 26 2024 at 05:18):

On the topic of caps:
I like making variables that store an environment variable the same as the environment variable they came from. So I like to have IS_DEV over is_dev.

view this post on Zulip Eli Dowling (Nov 26 2024 at 05:19):

Then again, roc being pure global environment variable flags are not super relevant I guess :sweat_smile:

view this post on Zulip Kasper Møller Andersen (Nov 26 2024 at 06:26):

I do appreciate the goal of not having to think about capitalization of acronyms, but it struck me that I don’t actually care about that in function names and variables. The vast majority of the time, I will be consuming such functions rather than defining new ones, and then my IDE will just tell me what they look like, which means I don’t have to expend any energy deciding what I think they should be called.

Instead, where I really care about them is in module names (or module aliases specifically). Because when I call functions from e.g. a GraphQL module, I always have to stop and think whether we are usually importing that module as GraphQL or Graphql in the rest of the application.

In other words, the vast majority of the time when I am spending decision power on how my acronyms should be capitalized, it’s when I’m importing modules. And it seems like we are not solving for that here, but rather only the (in my opinion) much smaller issue of function and variable names.

view this post on Zulip Kasper Møller Andersen (Nov 26 2024 at 06:32):

Eli Dowling said:

Then again, roc being pure global environment variable flags are not super relevant I guess :sweat_smile:

Roc can read environment variables just fine, so that’s a fair use case I’d say :big_smile:

view this post on Zulip Anthony Bullard (Nov 26 2024 at 12:35):

Trying to get some early feedback on the use of numbers in identifiers, here are some test cases I have and what my first instinct was (which, after running text_syntax's tests, I'm not so sure about any more):

        test_once(&arena, "some123", "some_123");
        // nll lll lll lld _ldd ddd ddn
        test_once(&arena, "thehttpstatus404", "the_http_status_404"); // _ldd
        test_once(&arena, "inthe99thpercentile", "in_the_99th_percentile"); // _ldd ddl dll llu
                                                                            // _lul
        test_once(
            &arena,
            "all400serieserrorcodes",
            "all_400_series_error_codes",
        ); // _ldd _dll

        test_once(&arena, "number4yellow", "number_4_yellow"); // _ldu _dul
        test_once(&arena, "usecases4cobol", "use_cases_4_cobol"); // _lul _ldu _duu
        test_once(&arena, "c3po", "c_3_po"); // _udu _duu

What would everyone here expect these to be formatted to?

view this post on Zulip Anthony Bullard (Nov 26 2024 at 12:37):

My biggest concern being stuff like in our test cases like infinityF32 which is becoming infinity_f_32 but it feels like most people would expect infinity_f32

view this post on Zulip Richard Feldman (Nov 26 2024 at 14:18):

yeah infinity_f32 seems better to me!

view this post on Zulip Anthony Bullard (Nov 26 2024 at 14:26):

Here's the cases in a more presentable format:

CAMELCASE -> SNAKE_CASE_V1
some123 -> some_123
theHTTPStatus404 -> the_http_status_404
inThe99thPercentile -> in_the_99th_percentile
all400SeriesErrorCodes -> all_400_series_error_codes
number4Yellow -> number_4_yellow
useCases4Cobol -> use_cases_4_cobol
c3PO -> c_3_po

view this post on Zulip Anthony Bullard (Nov 26 2024 at 14:29):

And this is without treating digits as boundaries

some123 -> some123
theHTTPStatus404 -> the_http_status404
inThe99thPercentile -> in_the99th_percentile
all400SeriesErrorCodes -> all400_series_error_codes
number4Yellow -> number4yellow
useCases4Cobol -> use_cases4_cobol
c3PO -> c3_po

view this post on Zulip Paul Stanley (Nov 26 2024 at 14:58):

I think I prefer not treating the digits as boundaries. How aggressively will this format? Will it undo boundaries, so that it would convert some_123 to some123, or is it just going to leave some_123 alone? If it will leave some_123 unchanged then it's an easy one I think.

At any rate I'd have thought number suffixes are common enough that they should normally not be boundaries. isValidUtf8 should not become is_valid_utf_8 but is_valid_utf8.

view this post on Zulip Paul Stanley (Nov 26 2024 at 15:01):

(byTheWayReallyHappyToSeeSnakeCaseComingInAndCamelCaseTakingItsHorribleUnreadableSelfBackToWhereItBelongs. Now if only someone would sacrifice ? as an operator to allow it in identifiers, we'd be in a perfect world ...)

view this post on Zulip Brendan Hansknecht (Nov 26 2024 at 15:42):

Question mark in identifiers? What for?

view this post on Zulip Brendan Hansknecht (Nov 26 2024 at 15:42):

I don't think I have ever seen that

view this post on Zulip Anthony Bullard (Nov 26 2024 at 15:44):

Brendan Hansknecht said:

Question mark in identifiers? What for?

Ruby and Elixir I think use it for predicate functions, just to say “this returns a Boolean”

view this post on Zulip Brendan Hansknecht (Nov 26 2024 at 15:45):

Interesting. I guess that is the equivalent to an is prefix used in some styles

view this post on Zulip Brendan Hansknecht (Nov 26 2024 at 15:47):

I feel like in Roc, results are more common than booleans. So a ? operator for early return would be used more than a ? at the end of a function name, but I guess it is just a style difference/decision

view this post on Zulip Anthony Bullard (Nov 26 2024 at 15:50):

Wouldn’t argue that. Though an early return from an effect function would look ugly give_me_something!?

view this post on Zulip Richard Feldman (Nov 26 2024 at 15:59):

in the parens and commas world, that will be foo!()?

view this post on Zulip Paul Stanley (Nov 26 2024 at 16:08):

I wasn't seriously suggesting it -- I think I've seen it in Scheme, and I always thought it was rather elegant, because the isX convention looks more like a statement than a question. Where the convention is used it instantly signals what the function is going to do.

(In fact, it wouldn't be a bad heuristic for a Result type name ... because everyone would immediately know from the name that List.get? would return a Result, whereas List.takeLast does not. But I'm still not suggesting it, mostly because every language is short of easily typed symbols, and ? has better uses.)

view this post on Zulip jan kili (Nov 26 2024 at 16:56):

Kilian Vounckx said:

JanCVanB said:

Unless you're looking for something deeper than equality, I believe that already works with myConstant since every def is constant and every Eq-able type is matchable.

It wouldn't I think? It would just match everything (same as underscore). The only difference is that it would actually bind the thing to the variable. You would get a shadow error if the name is bound somewhere already, and a variable not used error if it isn't.

Hmm, I don't understand what you're communicating here, so I infer that it involves a deep concept that I haven't perceived before. Hopefully someone more experienced can answer your question.

view this post on Zulip jan kili (Nov 26 2024 at 17:07):

Anthony Bullard said:

without treating digits as boundaries

I prefer this, not because your examples look better on the lower right side (they look best on the upper right to me) but because sometimes a word will mix letters and numbers
internationalization_handler
i18n_handler

view this post on Zulip jan kili (Nov 26 2024 at 17:08):

or suffixes like @Paul Stanley said above
u8_converter

view this post on Zulip jan kili (Nov 26 2024 at 17:12):

I'm okay if the formatter misses some ambiguous cases like my501c3nonprofit that should be my_501c3_nonprofit

view this post on Zulip jan kili (Nov 26 2024 at 17:16):

If someone's intentionally coding in camel case where a language wants snake case, it seems reasonable that a couple of variable names will be suboptimal. For the mass conversion of existing codebases, we could just double check all variable names with numbers in them (or actually all variable names, at the small scale we're at) to make sure the results feel right.

view this post on Zulip jan kili (Nov 26 2024 at 17:17):

I volunteer to manually read every variable name in GitHub.com/*/*/**/*.roc if the formatter update generates a CSV of proposed conversions :nerd:

view this post on Zulip Kasper Møller Andersen (Nov 26 2024 at 19:10):

I just wanna highlight this again for opinions, since it seems like it got lost in the conversation :grinning_face_with_smiling_eyes: https://roc.zulipchat.com/#narrow/channel/304641-ideas/topic/snake_case.20instead.20of.20camelCase/near/484442330

view this post on Zulip Brendan Hansknecht (Nov 26 2024 at 20:04):

Ok, I think messages are cleaned up now and named constants in matching was moved to #ideas > pattern matching on named constants which has all of the old context as well.

view this post on Zulip Brendan Hansknecht (Nov 26 2024 at 20:11):

As for the message above on module names. That is an interesting point. I definitely see fights over acronyms in names.

I think what I see most often for readability is that only the first letter of acronyms are capitalized

ParseHtml in pascal case. Or GraphQl....

view this post on Zulip Kasper Møller Andersen (Nov 27 2024 at 06:12):

If it’s important to create more consistency around names with these rules in general, and likening it to a formatter that is not configurable, then module names are the names with the highest impact to me at least.

Making decisions about how I need to name a module alias with acronyms is probably 97% of the cases where I need to make naming decisions on acronyms in our large code base at work. Which is also why I don’t really care about this rule in function names: the impact for me personally is quite minuscule in that space.

view this post on Zulip Brendan Hansknecht (Nov 27 2024 at 06:29):

Is that still true with the static dispatch proposal, where it will be much less common to see module names? Many function calls will use method syntax instead of being qualified

view this post on Zulip Anthony Bullard (Nov 27 2024 at 11:29):

Here's something that doesn't have to do with the actual format itself. Having the formatter fix the casing of identifiers means that this is somewhere where the AST will be changed by the formatter, which is currently not allowed. I can think of two ways of handling this:

I'm leaning towards the latter as it is simpler. We could also probably add a confirmation prompt for this. I think that means the formatter would have to pass configuration for this around and only perform the migration when the flag is passed. This could be powerful for parens-and-commas migration in the future (@Richard Feldman ).

view this post on Zulip Richard Feldman (Nov 27 2024 at 11:55):

seems reasonable! I'm curious what others think.

view this post on Zulip Anthony Bullard (Nov 27 2024 at 12:07):

The issue with the latter is we implemented most of the formatting functionality as a trait. So in order to accomplish this we will have to add yet another parameter to format_with_options, maybe a struct that will hold all of the immutable (per run) options.

struct FmtOptions {
  snakify: bool,
  // Others like parensAndCommas will come later....
}

view this post on Zulip Anthony Bullard (Nov 27 2024 at 12:08):

And this will need to be drilled down all the way
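
A sketch of what "drilled down all the way" could look like in Rust. Everything here is a simplified assumption for illustration: the `Format` trait, the `Ident` and `Def` node types, and the naive per-uppercase conversion are stand-ins, not the real formatter trait or AST in the Roc codebase. The point is only that every node's impl has to accept the options struct and hand it on to its children.

```rust
/// Per-run, immutable formatter options (mirrors the struct sketched above).
#[derive(Clone, Copy, Default)]
struct FmtOptions {
    snakify: bool,
}

/// Hypothetical, much-simplified formatting trait.
trait Format {
    fn format_with_options(&self, buf: &mut String, opts: &FmtOptions);
}

/// Hypothetical AST nodes: an identifier and a `name = value` definition.
struct Ident(String);
struct Def {
    name: Ident,
    value: Ident,
}

impl Format for Ident {
    fn format_with_options(&self, buf: &mut String, opts: &FmtOptions) {
        if !opts.snakify {
            buf.push_str(&self.0);
            return;
        }
        // Naive migration: underscore before every non-leading uppercase.
        for (i, c) in self.0.chars().enumerate() {
            if c.is_ascii_uppercase() {
                if i > 0 {
                    buf.push('_');
                }
                buf.push(c.to_ascii_lowercase());
            } else {
                buf.push(c);
            }
        }
    }
}

impl Format for Def {
    fn format_with_options(&self, buf: &mut String, opts: &FmtOptions) {
        // The parent does nothing with `opts` itself; it only passes it down.
        self.name.format_with_options(buf, opts);
        buf.push_str(" = ");
        self.value.format_with_options(buf, opts);
    }
}
```

Bundling the flags into one struct means a later option (such as the parens-and-commas migration mentioned above) would add a field rather than yet another parameter on every impl.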

view this post on Zulip Sky Rose (Nov 27 2024 at 13:44):

What if the formatter treated numbers as word boundaries except in a hardcoded list of special cases: f32, utf8, and whatever else shows up in the builtins? It seems like most examples are better with the extra word boundary, so that with as many exceptions as we can think of should cover nearly every case.

view this post on Zulip Richard Feldman (Nov 27 2024 at 14:22):

hm, I don't think the formatter needs to have a concept of "word boundaries" :thinking:

view this post on Zulip Richard Feldman (Nov 27 2024 at 14:24):

I think it's sufficient to have the rules be:

view this post on Zulip Richard Feldman (Nov 27 2024 at 14:24):

I don't think we need any other rules than that!

view this post on Zulip Anthony Bullard (Nov 27 2024 at 14:38):

Richard Feldman said:

I think it's sufficient to have the rules be:

My experience writing test cases is that a lot of camelCase identifiers will have some very terrible snake_case analogs. Any sort of uppercase acronym will come out like _h_t_t_p instead of the more logical _http.

So in my implementation I track the previous, current, and next letter so that a series of uppers will only emit a single underscore. If you don’t want to take this approach please let me know now so I can adjust

view this post on Zulip Brendan Hansknecht (Nov 27 2024 at 15:58):

Does the formatter actually need to adjust the ast? Can't it just print the identifiers "wrong" such that they are in snake case?

view this post on Zulip Anthony Bullard (Nov 27 2024 at 16:20):

Brendan Hansknecht said:

Does the formatter actually need to adjust the ast? Can't it just print the identifiers "wrong" such that they are in snake case?

The formatter checks the file afterwards and ensures the AST was the same between runs

view this post on Zulip Richard Feldman (Nov 27 2024 at 16:32):

Anthony Bullard said:

Richard Feldman said:

I think it's sufficient to have the rules be:

My experience writing test cases is that a lot of camelCase identifiers will have some very terrible snake_case analogs. Any sort of uppercase acronym will come out like _h_t_t_p instead of the more logical _http.

So in my implementation I track the previous, current, and next letter so that a series of uppers will only emit a single underscore. If you don’t want to take this approach please let me know now so I can adjust

that seems fine :thumbs_up:
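
For readers following along, here is one way the previous/current/next approach could look in Rust. This is a sketch written for this thread, not the code in the PR: the function name `to_snake_case` and the exact boundary rule are assumptions, and digits are not treated as boundaries (the second of the two variants discussed earlier).

```rust
/// Converts a camelCase identifier to snake_case, emitting one underscore
/// per run of uppercase letters. Digits are never treated as boundaries.
fn to_snake_case(ident: &str) -> String {
    let chars: Vec<char> = ident.chars().collect();
    let mut out = String::with_capacity(ident.len() + 4);

    for (i, &c) in chars.iter().enumerate() {
        if !c.is_ascii_uppercase() {
            out.push(c);
            continue;
        }
        let prev = if i > 0 { Some(chars[i - 1]) } else { None };
        let next = chars.get(i + 1).copied();
        let starts_word = match prev {
            // A leading uppercase letter never gets an underscore.
            None => false,
            // lower -> Upper or digit -> Upper starts a new word.
            Some(p) if p.is_ascii_lowercase() || p.is_ascii_digit() => true,
            // Inside an uppercase run, only the last letter starts a new
            // word, and only when a lowercase letter follows (HTTPStatus).
            Some(p) if p.is_ascii_uppercase() => {
                next.map_or(false, |n| n.is_ascii_lowercase())
            }
            // After an existing underscore (or anything else), add nothing.
            Some(_) => false,
        };
        if starts_word {
            out.push('_');
        }
        out.push(c.to_ascii_lowercase());
    }
    out
}
```

With this rule theHTTPStatus404 becomes the_http_status404, infinityF32 becomes infinity_f32, and isValidUtf8 becomes is_valid_utf8, while an identifier that is already snake_case comes back unchanged.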

view this post on Zulip Kasper Møller Andersen (Nov 27 2024 at 21:22):

Brendan Hansknecht said:

Is that still true with the static dispatch proposal, where it will be much less common to see module names? Many function calls will use method syntax instead of being qualified

It would definitely impact it, but I don't think the overall importance would change much. For example, in Elm I might write

someSelectionSet |> MyGraphQLModule.runQuery

where someSelectionSet is an opaque type defined in the GraphQL library we're using, whereas we have our own function for actually making it into an HTTP request. So in Roc, with static dispatch, we'd write it as

someSelectionSet.pass_to(MyGraphQLModule.runQuery)

which I think would be a normal pattern? You can imagine the same thing for any package you're using, which might build up some API information, which it then leaves to you to send out over your acronymly-named protocol of choice.

view this post on Zulip Brendan Hansknecht (Nov 27 2024 at 21:24):

Makes sense.

view this post on Zulip Kasper Møller Andersen (Nov 27 2024 at 21:55):

I guess the next question is how people feel about module names being snake cased too? Is there any technical reason not to do it, or is it purely a preference thing?

view this post on Zulip Anthony Bullard (Nov 28 2024 at 14:13):

Pretty close to having my PR accomplish all of the above AND passes all of the tests. I have one annoying warning that I can't seem to get rust analyzer to stop emitting short of an explicit annotation. Anyone ever have a test helper function just always say that it is unused even though it's very plainly used many times?

view this post on Zulip Sam Mohr (Nov 28 2024 at 14:25):

It might be because it's only enabled with a feature

view this post on Zulip Anthony Bullard (Nov 28 2024 at 14:29):

I mean it's a test helper, so it's only used in functions with the #[test] annotation

view this post on Zulip Anthony Bullard (Nov 28 2024 at 14:31):

You can see it here: https://github.com/roc-lang/roc/pull/7233/files#diff-1e2afbfc20f8b630b4bb8a987e1e88e148662ff6dfefc9cdcc58d6fc27f11e03R378

view this post on Zulip Sam Mohr (Nov 28 2024 at 14:36):

I think it's because rust analyzer doesn't understand stuff only used in tests

view this post on Zulip Sam Mohr (Nov 28 2024 at 14:36):

There's proooobably a config option for it you can set in your LSP settings

view this post on Zulip Sam Mohr (Nov 28 2024 at 14:37):

https://stackoverflow.com/questions/32900809/how-to-suppress-function-is-never-used-warning-for-a-function-used-by-tests

view this post on Zulip Anthony Bullard (Nov 28 2024 at 14:39):

If it was just my LSP I couldn't care less, but it's when you build or test as well. And no other test helper seems to have this issue.

view this post on Zulip Anthony Bullard (Nov 28 2024 at 14:43):

Thanks for that, read it all the way through and then checked back on what we do elsewhere, missed having #[cfg(test)] at the top of the test module
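
As an illustration of the fix mentioned above, here is a small sketch of a test module gated with `#[cfg(test)]`. The names `is_lowercase_ident` and `assert_lowercase` are made up for this example and do not come from the Roc codebase. The point is only that a helper called solely from `#[test]` functions should live in a module that is itself compiled only for tests.

```rust
/// Production code: compiled in every build.
/// (Hypothetical function, used here only to give the tests something to call.)
pub fn is_lowercase_ident(ident: &str) -> bool {
    ident.chars().next().is_some_and(|c| c.is_ascii_lowercase())
}

// Without this attribute, a non-test build still compiles `assert_lowercase`
// but none of its `#[test]` callers, so rustc reports the helper as never used.
#[cfg(test)]
mod tests {
    use super::*;

    // Test helper: only exists when compiling tests.
    fn assert_lowercase(ident: &str) {
        assert!(is_lowercase_ident(ident), "{ident} should start lowercase");
    }

    #[test]
    fn accepts_snake_case() {
        assert_lowercase("some_selection_set");
    }
}
```

With the attribute in place, a plain `cargo build` skips the whole module, so the unused-function warning should not appear there.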

view this post on Zulip Anthony Bullard (Nov 29 2024 at 11:22):

The PR is now ready for review: https://github.com/roc-lang/roc/pull/7233

view this post on Zulip Anthony Bullard (Nov 29 2024 at 11:31):

@Anton Could I ask for checks to be run against this PR?

view this post on Zulip Anton (Nov 29 2024 at 12:15):

Sure, can you take a look at the merge conflicts first? CI cannot be triggered if there are conflicts.

view this post on Zulip Anthony Bullard (Nov 29 2024 at 12:17):

OK. That might take me a bit. I'm working on this right now

view this post on Zulip Anthony Bullard (Nov 29 2024 at 12:18):

And then I'll be traveling back to Chicago

view this post on Zulip Anton (Nov 29 2024 at 12:25):

Feel free to @ me when you're done :)

view this post on Zulip Kasper Møller Andersen (Nov 30 2024 at 12:00):

I’m not sure what to make of the silence on module names. It is my clear experience that most decisions on acronym casing happen when importing modules, and I don’t think static dispatch will change that significantly. So if everyone was fond of snake case for solving acronym casing (among other things) for functions and variables, why does no one want to discuss solving it for modules?

Do you disagree with my analysis? Or was the acronym casing not all that important to begin with perhaps?

view this post on Zulip Anton (Nov 30 2024 at 12:45):

It may be nice to keep module names camel case so that they are easily identifiable in the code. I also think it's fine to just wait to see how the currently proposed changes feel in practice and iterate step by step vs making multiple changes at once.

view this post on Zulip Anthony Bullard (Nov 30 2024 at 12:47):

Sorry, what I've implemented allows for snake_case in ALL lowercase identifiers, so packages/platforms could be lowercase, but modules - which are uppercase identifiers - will remain camelCase

view this post on Zulip Kasper Møller Andersen (Nov 30 2024 at 12:56):

For reference, Rust has snake case module names. But instead of using dot between a module name and a function name, it uses :: of course, which makes it easier to differentiate them.

view this post on Zulip Anthony Bullard (Nov 30 2024 at 13:00):

Rust uses :: for all static member access. I'd personally prefer modules - being records, aka values - to have lowercase identifiers (maybe it's the Gopher in me?). But modules in some ways are types as well.

view this post on Zulip Kasper Møller Andersen (Nov 30 2024 at 13:32):

I do think it’s fair to just push this discussion to a later time, if it’s because it would be nicer to break it into steps :smile:

view this post on Zulip Richard Feldman (Nov 30 2024 at 14:55):

Anthony Bullard said:

Rust uses :: for all static member access. I'd personally prefer modules - being records, aka values - to have lowercase identifiers (maybe it's the Gopher in me?). But modules in some ways are types as well.

if we did that and kept . for module access, then you could never name a variable list or num or str (etc.) because then you'd be shadowing the (lowercase) module name. That doesn't sound enjoyable to me. :sweat_smile:

view this post on Zulip Richard Feldman (Nov 30 2024 at 14:58):

we could change to :: for module access like Rust does, but then . for autocomplete works less consistently, beginners have to learn when it's . and when it's :: (the latter would probably also have to be used for custom tag unions, like in Rust) etc.

view this post on Zulip Richard Feldman (Nov 30 2024 at 14:59):

also it still wouldn't address the "how to uppercase acronyms" question because tags would still be uppercase, so the question would remain for how to uppercase acronyms in tags

view this post on Zulip Richard Feldman (Nov 30 2024 at 15:07):

my personal view on this is:

view this post on Zulip Kasper Møller Andersen (Nov 30 2024 at 15:20):

Types will still need acronym casing decisions, but the reason I focus on module names is that, in my Elm experience, the vast, vast majority of the time where I need to actually decide on how I want acronyms to be cased is when writing module names and aliases. For types and functions, I will mostly be consuming them, so I’m not expending decision power. But we always import modules with an alias at work, so I need to make a decision on casing there over and over again.

Anyway, in the grand scope of things, it’s definitely minor. It just struck me that people wanted to solve the casing problem for function names and variables only, when that is, in my view, the least impactful part of the language to solve it in :blush:

view this post on Zulip Richard Feldman (Nov 30 2024 at 15:35):

sure, but it's no worse than the status quo haha

view this post on Zulip Richard Feldman (Nov 30 2024 at 15:35):

I agree that it's not much of a selling point for snake_case

view this post on Zulip Kasper Møller Andersen (Nov 30 2024 at 15:55):

At least you could actually have consistent casing before, so in that sense it feels like a regression. It’s really just trading one kind of consistency for another I guess :sweat_smile:

view this post on Zulip Brendan Hansknecht (Nov 30 2024 at 20:32):

I’m not sure what to make of the silence on module names.

I haven't said anything cause I don't know if I have a useful opinion. In most languages with a feature like static dispatch, I find that I don't type module names too often, and when I do it is in an import line that will autocomplete to the correct capitalization. So I'm personally not too worried. I feel like it is up to each codebase and that is ok. I feel much more strongly about function and variable names than I do about module names.

view this post on Zulip Anthony Bullard (Dec 01 2024 at 12:52):

@Anton My PR is ready for a test run now, and I also took the opportunity to make it a single _signed_ commit after setting up commit signing.

view this post on Zulip Anton (Dec 01 2024 at 12:58):

Thanks @Anthony Bullard, I can do a quick review and test run tomorrow. Because we use some self-hosted CI servers we do need to check for malicious code before approving tests

view this post on Zulip Anthony Bullard (Dec 01 2024 at 12:58):

Sounds totally reasonable to me

view this post on Zulip Anthony Bullard (Dec 01 2024 at 12:59):

I'll look for more issues to tackle while I wait :-)

view this post on Zulip Anton (Dec 01 2024 at 12:59):

Awesome!

view this post on Zulip Anthony Bullard (Dec 03 2024 at 11:43):

I saw my PR failed in some of the CI Manager checks, looks like when I update to resolve conflicts, I need to also fix this up:

<                Slice { start: 0, length: 0 },
>                Slice<roc_parse::ast::CommentOrNewline> { start: 0, length: 0 },

view this post on Zulip Anton (Dec 03 2024 at 11:47):

Make sure to do the clippy check as well before you want to run CI:

cargo clippy --workspace --tests -- --deny warnings

view this post on Zulip Anthony Bullard (Dec 04 2024 at 02:53):

Clippy ran, all OK. All tests pass. Build succeeds. All merge conflicts resolved (again; this time it was pretty painful). Test run requested

view this post on Zulip Sam Mohr (Dec 04 2024 at 02:56):

Done

view this post on Zulip Anthony Bullard (Dec 04 2024 at 03:06):

:fingers_crossed:

view this post on Zulip Anton (Dec 04 2024 at 10:29):

All tests passed :tada: , I'll review today

view this post on Zulip Anthony Bullard (Dec 04 2024 at 15:54):

This has been merged, can you mark as complete @Agus Zubiaga ?

view this post on Zulip Anthony Bullard (Dec 04 2024 at 15:55):

I guess the next task would be to update the tutorial and the builtins, and then the roc-lang org owned platforms?

view this post on Zulip Anton (Dec 04 2024 at 16:04):

Maybe we should hold off on that until after advent of code?

view this post on Zulip Richard Feldman (Dec 04 2024 at 16:19):

I think we could get the PRs ready and just hold off on merging?

view this post on Zulip Richard Feldman (Dec 04 2024 at 16:20):

in the case of the platforms, could actually go ahead and land the change + release since the current releases would be unaffected

view this post on Zulip Anton (Dec 04 2024 at 16:32):

I think we could get the PRs ready and just hold off on merging?

They could accumulate substantial conflicts

view this post on Zulip Richard Feldman (Dec 04 2024 at 16:35):

fair

view this post on Zulip Richard Feldman (Dec 04 2024 at 16:36):

I don't think there's any downside to updating platforms and packages, is there?

view this post on Zulip Anton (Dec 04 2024 at 16:37):

That should be good, but they probably should be based on purity-inference branches if those exist for that package/platform

view this post on Zulip Anthony Bullard (Dec 04 2024 at 16:48):

I definitely would not leave PRs lying around in the compiler repo until PI lands and we are ready for it

view this post on Zulip Anthony Bullard (Dec 04 2024 at 16:49):

I think updating the tutorial to mention that snake case is now preferred for lowercase idents would be good

view this post on Zulip Anton (Dec 04 2024 at 16:55):

I think updating the tutorial to mention that snake case is now preferred for lowercase idents would be good

Hmm, snake case has not been extensively tested on platforms and packages. There is also no example code using it, so those seem like good reasons to wait on that.

view this post on Zulip Anthony Bullard (Dec 04 2024 at 16:58):

Fair

view this post on Zulip Luke Boswell (Dec 04 2024 at 19:25):

they probably should be based on purity-inference branches if those exist for that package/platform

Re purity inference, I haven't started on basic-webserver... I've been holding off until we figure out if the basic-cli thing is purity inference or platform related. I'm starting to think it's specific to the platform.

Also I think basic-webserver needs some love in the upgrade to purity inference to remove/cleanup a bunch of the glue types. It makes the most sense to do all this at the same time and they can also align with the improvements made in basic-cli (i.e. handling IO errors).

view this post on Zulip Anthony Bullard (Dec 04 2024 at 19:29):

Anything I can do to help there, let me know

view this post on Zulip Luke Boswell (Dec 04 2024 at 19:36):

I haven't been wanting to rush it, and have been prioritising AoC rn.

I'm definitely tracking it and it shouldn't take very long. Just need a few hours free to make it happen.

I've got a nice Christmas break coming up so plan on getting lots of stuff done then (if I don't get distracted :upside_down:)


Last updated: Jun 16 2026 at 16:19 UTC