so today, this doesn't parse:
a = {
    b: c,
    d: {
        e: f,
    },
}
this is for a pretty simple reason: defs are defined to end when you outdent
so the closing } at the end of this code is a parse error, because it's outdented
if you instead wrote it like this:
a =
    {
        b: c,
        d:
            {
                e: f,
            },
    }
...then it parses successfully, because there's no outdent at the end
everything within the a definition is indented more than the letter a itself, unlike in the first example where the } at the end is at the same indentation level as a, indicating that the definition has ended
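to make the current rule concrete, here's a minimal Python sketch. It is not the actual Roc parser: the function names, the line-based approach, and the 4-space indentation of the sample input are all assumptions for illustration. It just finds where a def ends by looking for the first line indented no further than the line the def started on.

```python
def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def def_extent_current_rule(lines):
    """Return how many lines belong to the def that starts on lines[0].

    Current rule (as described in this thread): the def ends at the first
    non-blank line that is indented no further than the def's first line.
    """
    base = indent_of(lines[0])
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() and indent_of(line) <= base:
            return i  # outdent: the def ended on the previous line
    return len(lines)


first = [
    "a = {",
    "    b: c,",
    "    d: {",
    "        e: f,",
    "    },",
    "}",
]
second = [
    "a =",
    "    {",
    "        b: c,",
    "        d:",
    "            {",
    "                e: f,",
    "            },",
    "    }",
]

print(def_extent_current_rule(first))   # 5: the final "}" falls outside the def
print(def_extent_current_rule(second))  # 8: every line stays inside the def
```

in the first input the def is cut off before its closing brace, which is why it can't parse; in the second nothing ever outdents back to the column of a.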
so why is this rule important?
consider this example:
a = foo
bar baz
is this a = foo bar baz? or is it this expression:
a = foo
bar baz
today, we know it's the latter, because of the outdent
but if we don't recognize outdents as the end of expressions, then it becomes ambiguous
so the upside of the current rule is that it's simple, but the downside is that with delimiters like {, [, and (, formatting gets pretty widely spaced out (like the second example above, which is what the formatter does today)
it's also surprising to people that the first example doesn't work, because that's how it looks in most languages
so I want to explore an idea here of making the rule more complex, for the sake of allowing that
so let's say this parsed:
a = {
    b: c,
    d: {
        e: f,
    },
}
what would that require?
well, one idea is that we assume that if you outdent but are missing a closing delimiter, it means you're not done yet
in other words, "outdent with unclosed delimiter" is no longer an automatic parse error, but rather means "this expression must not be done yet, so continue parsing even though there was an outdent"
that would solve the above case, but would also allow this:
a = {
    b: c,
    d: {
        e: f,
    },
} foo bar
that one is sort of "obviously wrong" since records aren't functions, and you can't pass them arguments, but here's another example that actually could make sense:
a = foo {
    b: c,
    d: {
        e: f,
    },
} bar baz
should that be allowed? with the modified "keep parsing if there's an unclosed delimiter" rule, it would be allowed. Is that okay?
one argument is that it's fine, even if it looks a bit weird
another argument is that the parser should accept it, and then the formatter should rewrite it to something that looks nicer - but then there's a reasonable question: specifically what would the formatter format that to that looks better?
another question here is: what about this one?
a = foo {
    b: c,
    d: {
        e: f,
    },
} bar
baz
according to the "unclosed delimiter means keep parsing" rule, this should be accepted, but should be different from the previous one in that baz is no longer an argument to foo
because we had an outdent (with respect to a) but there was no unclosed delimiter preventing it from continuing to be a part of a's definition
so that's unambiguous, but perhaps surprising
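a minimal Python sketch of the "unclosed delimiter means keep parsing" idea, under loud assumptions: it is not Roc's parser, `split_def_relaxed` is a hypothetical name, the sample input is assumed to use 4-space indentation, and delimiters are counted naively character by character (strings and comments are ignored).

```python
OPEN, CLOSE = "{[(", "}])"


def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def split_def_relaxed(lines):
    """Split lines into (the def starting at lines[0], whatever follows it).

    Relaxed rule: an outdent back to the def's own column only ends the def
    when no delimiter is still open at that point.
    """
    base = indent_of(lines[0])
    depth = 0  # delimiters opened but not yet closed on earlier lines
    for i, line in enumerate(lines):
        if i > 0 and line.strip() and indent_of(line) <= base and depth == 0:
            return lines[:i], lines[i:]
        depth += sum(ch in OPEN for ch in line) - sum(ch in CLOSE for ch in line)
    return lines, []


record = ["a = foo {", "    b: c,", "    d: {", "        e: f,", "    },"]

_, rest = split_def_relaxed(record + ["} bar baz"])
print(rest)  # []: bar and baz are still part of a's definition

_, rest = split_def_relaxed(record + ["} bar", "baz"])
print(rest)  # ['baz']: nothing is open anymore, so this outdent ends the def
```

note that this sketch ignores any amount of outdent while a delimiter is open; only allowing an outdent to exactly the def's own column would be a separate design choice.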
one more concern here: what does it do to parsing errors?
an unclosed delimiter is a common mistake to make
right now if we have an unclosed delimiter, we know pretty soon where it happened - the end of the def, which we detect as an outdent
if the rule changed, then we'd potentially not be able to detect it until later on
although, granted, if you had another def next (e.g. b = ... after a =), or an outdent to even further than a (indicating that the original def expression is now done) then maybe that wouldn't be too bad
also using the editor should prevent unclosed delimiter parse errors
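a rough Python sketch of that error-reporting concern, comparing where a forgotten closing brace would be noticed under each rule. Everything here is hypothetical: `unclosed_noticed_at`, the `DEF_START` regex standing in for "another def starts here", and the naive delimiter counting are assumptions for illustration, not how the Roc parser works.

```python
import re

OPEN, CLOSE = "{[(", "}])"
# Hypothetical stand-in for "this line starts a new def", e.g. `b = ...`
DEF_START = re.compile(r"^\s*\w+\s*=")


def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def unclosed_noticed_at(lines, relaxed):
    """Index of the line where an unclosed delimiter in the def starting at
    lines[0] would be noticed. len(lines) means "only at end of input";
    None means the def ended with every delimiter closed.
    """
    base = indent_of(lines[0])
    depth = 0  # delimiters still open after the lines seen so far
    for i, line in enumerate(lines):
        if i > 0 and line.strip():
            indent = indent_of(line)
            if relaxed:
                # Same-column outdents are tolerated while something is open,
                # unless the line is clearly the start of another def.
                starts_def = DEF_START.match(line) is not None
                ended = indent < base or (indent == base and (depth == 0 or starts_def))
            else:
                ended = indent <= base
            if ended:
                return i if depth > 0 else None
        depth += sum(ch in OPEN for ch in line) - sum(ch in CLOSE for ch in line)
    return len(lines) if depth > 0 else None


# The outer closing brace has been forgotten after the fifth line.
broken = ["a = {", "    b: c,", "    d: {", "        e: f,", "    },"]

print(unclosed_noticed_at(broken + ["b = 1"], relaxed=False))             # 5
print(unclosed_noticed_at(broken + ["b = 1"], relaxed=True))              # 5
print(unclosed_noticed_at(broken + ["foo bar", "b = 1"], relaxed=False))  # 5
print(unclosed_noticed_at(broken + ["foo bar", "b = 1"], relaxed=True))   # 6
```

when another def follows immediately, both rules notice the problem on the same line; the relaxed rule only reports it later when something that isn't a def comes in between.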
okay, those are all the tradeoffs i can think of here!
in summary, the question boils down to: should the parser allow this?
a = {
    b: c,
    d: {
        e: f,
    },
} foo bar
...or require that it be something more like this?
a =
    {
        b: c,
        d:
            {
                e: f,
            },
    } foo bar
...given all the considerations above :big_smile:
any thoughts on this welcome!
Is it worth taking a step back and asking what we want Roc's formatting "culture"/"vibe"/"approach"/"system" to be? Is flexibility prioritized over consistency? Is formatting a task for bots? What are the hard and fast rules that newbies can learn on day one that will guide their expectations?
oh, one other scenario I just thought of:
a = {
b: c,
d: {
e: f,
},
} foo bar
does unclosed delimiter mean outdents get ignored completely? or just that you're allowed specifically to outdent to the same level as a but no further?
the parser should be able to give a good error here
(maybe those questions are already answered or out of scope, and this is just a spot fix)
it should recognize both that it is parsing a record, and something is wrong with the indentation
also given the editor, I'd prefer the parser to be simple and a bit on the strict side
Is it worth taking a step back and asking what we want Roc's formatting "culture"/"vibe"/"approach"/"system" to be? Is flexibility prioritized over consistency? Is formatting a task for bots? What are the hard and fast rules that newbies can learn on day one that will guide their expectations?
I think formatting should be done by the formatter, not by humans
and the formatter should never have any configuration options
Some types of flexibility greatly complicate the parser, so that's also something to consider.
(side question, does the Roc CLI provide Prettier-style file formatting today? If not, is that on the way?)
I would really like that
don't think anything is blocking that? might just try it
We have cargo run format dir/file.roc working, or do you have something else in mind?
Ooh! Does that get bundled into the build as roc format dir/file.roc? I almost exclusively use the built CLI executable.
I think so
Great. Recursive directory-wide formatting would be nice, but this will help a lot. Thank you!
cc @Chad Stearns @Joshua Warner - this discussion may be of interest!
Some types of flexibility greatly complicate the parser, so that's also something to consider.
definitely, which also makes it harder for humans to understand the rules.
this is the one case I can think of in the current parser where the rule is surprising to people in practice
I think formatting should be done by the formatter, not by humans
Should the formatter accept input that the parser can't, in order to clean it up? Should the formatter have warning/error messages of its own?
oh, I forgot to mention: the way other languages with ML syntax (e.g. Elm, Haskell) deal with this is to format it like the following:
a =
    { b: c
    , d:
        { e : f
        }
    } foo bar
so you have leading commas instead of trailing commas, and never outdent to the level of the initial def
however, I ruled this out early on
surprising to people
This could be alleviated with a clear order-of-operations lesson in a tutorial - the reason it's surprising is that in most languages delimiters have higher precedence than indentation
because I have seen people literally lose interest in learning Elm just because this looks so alien
and although I don't think that's a good reason to walk away from a language, as many Lisp people will report - people not using a language because the syntax looks too aesthetically displeasing is a very real thing
and I don't think this is worth that cost
so I don't think leading commas are the way to go, for that reason alone
in most languages delimiters have higher precedence than indentation
huh, interesting! I actually hadn't thought about this - the only languages I know of where indentation matters are Python, CoffeeScript, Elm, and Haskell
I guess in Python and CoffeeScript that's the rule? :thinking:
"indentation matters" =?= "indentation overrules delimiters"
I have to admit - personally, having tried the "outdent is the only rule, ignore unclosed delimiters" rule for a couple of years now, I still aesthetically prefer how the more typical formatting looks :big_smile:
where the final } is at the same indentation level as a =
so I'm genuinely open to changing the rule here!
but I'd like to get some other perspectives on it
I have the grammar setup to accept the following def:
a = {
    b: c,
    d: {
        e: f,
    },
}
Those parsing rules are not very complicated but I'm not sure how well it would work for errors.
whoa, nice!
the current parser on trunk won't accept that, but maybe it's a quick change to accept it? :thinking:
Well, I'm not sure :p I've looked at the current parser quite a bit but I don't feel I understand it well. For the grammar I also rely on the tokenizer, adding the tokenizer to the current parser would take some work.
here's another case to consider: should this parse?
button : {
    label : Elem state,
    onPress : state, PressEvent -> Action state,
}
-> Elem state
if so, I think that requires backtracking (could be wrong), which is bad for parsing performance
alternatively, could require that it be:
button : {
    label : Elem state,
    onPress : state, PressEvent -> Action state,
} -> Elem state
but then what if the return value is multiline?
it would have to be:
button : {
    label : Elem state,
    onPress : state, PressEvent -> Action state,
} -> {
    blah : Str,
    thing : Etc,
}
which I don't think looks great :sweat_smile:
maybe the formatter could reformat it to something else, but I'm not sure what that would be!
I also have to admit: Though I default to writing multilines like
a = {
    ...
}
I don't think it's crazy or ugly for Roc to require
a =
    {
        ...
    }
since it's clear and extensible with pre-record defs like
a =
    b = foo
    c = bar
    {
        d: b + c,
        ...
    }
However, would the same pattern extend to type defs?
button :
    {
        label : Elem state,
        onPress : state, PressEvent -> Action state,
    } -> {
        blah : Str,
        thing : Etc,
    }
Its consistency with value defs is nice, but it doesn't benefit from the same "extensible with pre-return defs" pattern... unless Roc has some crazy type def block feature like
button :
    a :
        {
            label : Elem state,
            onPress : state, PressEvent -> Action state,
        }
    b :
        {
            blah : Str,
            thing : Etc,
        }
    a -> b
the more I look at it, the more I'm fine with "values and types" being consistent
e.g.
-> Blah
vs.
} -> Blah
actually read pretty similarly even though the -> isn't quite at the beginning of the line in the latter
My general thoughts: I'm fairly ambivalent on exactly what the "correct according to the formatter" way to indent things is - but I think it's pretty important for beginners that the parser is as forgiving as it can reasonably be. I've put down more than one language because I spent hours fighting with nuances of the syntax and got frustrated.
For this reason, I'd personally be willing to bend over backwards in the parser to accept as many of these indentation/newline combinations as possible - optionally issuing warnings and/or autocorrects in case things might be ambiguous.
is there a chance of someone calling a function and passing its arguments over multiple lines? such as:
myRealSweetFunction {
    foo: 1,
    bar: 2
} secondArg thirdArg
this Elm snippet compiles:
foobar : { a : Int } -> Int -> Int -> Int
foobar _ _ _ = 0
magicValue : Int
magicValue =
    foobar {
        a = 99
    } 1 2
that should parse, although I think I'd want the formatter to put the second and third args on their own lines
Joshua Warner said:
I think it's pretty important for beginners that the parser is as forgiving as it can reasonably be. I've put down more than one language because I spent hours fighting with nuances of the syntax and got frustrated.
I would second this, IF we would write Roc code in a normal text editor and then hand it off to the parser/compiler. But I expect that our editor will be smart enough to detect and correct wrong indents on the fly and even explain what was wrong and how to avoid such mistakes in the future.
Yep, agree that the editor can be smarter here. However, I think making a normal text editor "inviting" will be a critical path for onboarding new users. IMO, the "normal text editor" experience of Roc needs to be on par with other languages - and the roc-editor experience needs to be even better.
I don't think having the roc-editor is an acceptable excuse for making the normal-text-editor experience painful.
I think parsing and formatting to..
a = {
    b: c,
    d: {
        e: f,
    },
} foo bar
..sounds good.
And maybe illegal for the closing brace to have an indent level lower than the line of the opening brace.
And then maybe
a = foo {
    b: c,
    d: {
        e: f,
    },
} bar
baz
should format to..
a = foo
{
b: c,
d: {
e: f,
},
}
bar
baz
.. because some kind of multi-line rule will kick in, that requires "if any part of this expression is multiline, then the whole thing should be"
Just my impression. Does that all make sense?
I accidentally violate this same-line syntax rule for brackets every time I approach some Roc code, just because of what I am familiar with in other languages, so I definitely see the value in making a syntax rule exception for opening and closing braces like this in order to make the language more accessible to new people. And given that it's more accessible, why not just make that the default all the time?
a = foo {
    b: c,
    d: {
        e: f,
    },
} bar
baz
I think this one has to parse differently than the first one, because the baz at the end would be treated as the expression at the end of a def
Oh yeah. That makes sense.
Last updated: Jun 16 2026 at 16:19 UTC