I've been experimenting with a new formatter on the side: https://github.com/roc-lang/roc/pull/7011
Not quite ready for prime time, but far enough along that it's worth getting some feedback on.
Lots of niceties from the old parser aren't implemented yet, so beware!
TL;DR:
is_multiline and format_with_options and keep them in sync.
At this point, I'd be interested in having some discussion around:
Are there key cases where we don't ever want to omit newlines?
Do we currently get rid of all newlines and introduce our own?
In most current situations, we honor the original user newlines. As a motivating example, here's something that's "stable" under the current formatter (i.e. it doesn't change when formatting):
when b is
    1
    | 2
    | 3 ->
        4
    5 | 6 | 7 ->
        8
    9
    | 10 -> 11
    12 | 13 ->
        when c is
            14 | 15 -> 16
            17
            | 18 -> 19
    20 -> 21
And here's what the new formatter gives:
when b is
    1 | 2 | 3 -> 4
    5 | 6 | 7 -> 8
    9 | 10 -> 11
    12 | 13 ->
        when c is
            14 | 15 -> 16
            17 | 18 -> 19
    20 -> 21
So much neater, as long as we eventually line wrap.
Cause sometimes 1 match per line is a lot cleaner/clearer.
We do eventually line wrap. Here's what happens when I add a bunch of long numbers:
when b is
    1
    | 2
    | 3
    | 123412341234123412341234
    | 123412341234
    | 123412341234
    | 12341234123412
    | 123412341234 -> 4
    5 | 6 | 7 -> 8
    9 | 10 -> 11
    12 | 13 ->
        when c is
            14 | 15 -> 16
            17 | 18 -> 19
    20 -> 21
Yeah, exactly that
Maybe in this case it'd be cleaner if the branch result was forced to be on its own line
(if the pattern is multiline, I mean)
Anyway - that's actually pretty simple to do in this system, if we want
Awesome!
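For anyone skimming, the wrapping behavior in the examples above can be sketched roughly as follows. This is a minimal illustration, not code from the PR: `format_patterns` and `max_width` are hypothetical names, and the actual width limit the new formatter uses is not stated in this thread.

```rust
/// Hypothetical helper: lay out the alternatives of a single `when` branch.
/// `max_width` is an assumed column limit, not a value taken from the PR.
fn format_patterns(patterns: &[&str], max_width: usize) -> Vec<String> {
    let one_line = patterns.join(" | ");
    if one_line.len() <= max_width {
        // Everything fits: keep the alternatives together on one line.
        return vec![one_line];
    }
    // Too wide: one alternative per line, with a leading `|` on all but the first.
    patterns
        .iter()
        .enumerate()
        .map(|(i, p)| if i == 0 { p.to_string() } else { format!("| {p}") })
        .collect()
}
```

With a small enough `max_width`, `format_patterns(&["1", "2", "123412341234"], 10)` yields the three lines `1`, `| 2`, `| 123412341234`, while short alternatives such as `5`, `6`, `7` stay on one line as `5 | 6 | 7`.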
I'm mostly thinking about our Rust code around ast and symbol. We have a lot of really dense matches, with various formatting, that are of this style.
Personally, I really like that the current formatter honors line breaks. I remember loving that feature on elm-format after using prettier for years.
There are definitely a few cases in Rust where we would prefer the Rust formatter to honor line breaks. I know of some lines in the roc code base that are just // with nothing else, in order to force a line break.
So I get both sides for sure. I feel like 99% of the time, not honoring line breaks leads to more readable code, but the other 1% can really hurt.
One of the things I was planning on doing (but not implemented yet) is always honoring blank lines - i.e. if you've explicitly put one or more blank lines in your code, those will remain in the formatted output.
That's not exactly the same as "should we honor the user's newlines" - but it is an important subset.
@Agus Zubiaga I'd be interested to hear more about your experience. Do you have some examples you could share where the elm formatter honoring newlines was important to maintain the readability of the code?
It's not so much about the readability of a fully-written snippet of code, but about the experience for partially-written code.
I find it really annoying when I'm writing a function (or case branch, if, list, etc) that I know is going to be long in a minute but currently isn't, so when I save (to run tests or something) the formatter collapses it, and then I have to immediately introduce the line break again to continue writing it.
That experience is so much better with a formatter that honors newlines, because I naturally introduce them ahead of time when they're needed.
Cool, that's helpful
Spitballing for a moment...
There are two potentially significant pieces of information here: where the user put newlines, but also where the user may have intentionally _not_ put newlines. (e.g. the user may intentionally not put a line break after every element of a list, to make it more compact or to show some structure)
Right now the current formatter will completely disregard any such decisions from the user.
What I'd like to find is what the right "happy medium" is in terms of what "newline" information we preserve and what we ignore.
(it could be that the best "happy medium" is pretty close to the current behavior - but for the sake of discussion I would like to open the possibilities a bit more)
Your point about code you're actively writing was particularly insightful
I've noticed this before when coding in Rust: I might have just written a function definition with a pair of empty braces that I'm going to fill in later. But then I format the code and the formatter collapses the braces onto one line (with the function definition).
I do find that mildly annoying.
Is that the sort of thing you're thinking about?
(I have format-on-save turned on, and I decided to hit save after writing the braces)
I do like the behavior of collapsing empty braces onto the same line as the function definition when it's not in code I've just written, so it doesn't feel super clear cut.
What if the formatter had two different modes? A command line/batch mode that does a more relaxed version of formatting and preserves more of the user's newlines, and an interactive mode that explicitly ignores most user newlines. The latter would require an additional input of "where is the user's cursor" - it would only format the surrounding definition. That can then be an explicitly triggered command: "this code is really ugly, please clean it up for me."
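To make the two-mode idea a bit more concrete, here is a rough sketch. All of the names (`FormatMode`, `DefSpan`, `defs_to_reflow`) are made up for illustration and are not from the Roc code base. It also assumes the cursor is passed as a byte offset and that top-level definitions are available as byte ranges.

```rust
/// Hypothetical mode switch; none of these names come from the Roc code base.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FormatMode {
    /// CLI/batch: relaxed formatting that keeps most user newlines.
    Batch,
    /// Explicit editor command: ignores most user newlines, but only inside
    /// the definition that contains the cursor (assumed to be a byte offset).
    Interactive { cursor: usize },
}

/// Byte range (start inclusive, end exclusive) of a top-level definition.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DefSpan {
    start: usize,
    end: usize,
}

/// Returns the definitions whose user newlines should be disregarded.
fn defs_to_reflow(mode: FormatMode, defs: &[DefSpan]) -> Vec<DefSpan> {
    match mode {
        // Batch mode never reflows aggressively.
        FormatMode::Batch => Vec::new(),
        // Interactive mode reflows only the definition under the cursor.
        FormatMode::Interactive { cursor } => defs
            .iter()
            .copied()
            .filter(|d| d.start <= cursor && cursor < d.end)
            .collect(),
    }
}
```

Everything outside the returned spans would go through the relaxed path, so an explicit "clean this up" command only touches the code the user is pointing at.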
I guess at this point this post should probably be in #ideas
personally my preference is:
my reason for the preference of full control over newlines in handwritten code is basically that there are some situations where I have several chunks of code where I want newlines treated the same way (e.g. branches in a when), and when the formatter automatically forces one of them to look different from the others because it's slightly longer, that outcome is so bad that for me it outweighs all the upside of the feature :sweat_smile:
the #1 thing I like better about current roc format compared to cargo fmt is that it never enforces making code look worse like that
that said, for generated code such as inferred types (e.g. in error messages or to display inferred types in editors), and also in a future replay debugging feature (where you can generate a .roc file which reproduces the steps you just took manually, so you can make customizable tests out of them) - I very much want it for those use cases!
when the formatter automatically forces one of them to look different from the others because it's slightly longer
This is something we could explicitly try to preserve - e.g. "like things have similar formatting decisions"
(that is, even if we disregard user newlines)
e.g. we decide whether branch results of a given when will be on the same line as the pattern or a new line, and we force that decision to be the same for all the branches of the when.
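A rough sketch of that "like things get the same decision" rule, assuming branches are already rendered to text. The `Branch` type, `results_on_new_line`, `format_when`, and `max_width` are all hypothetical names for illustration, not the PR's API. It also folds in the earlier suggestion that a multiline pattern forces the result onto its own line.

```rust
/// One `when` branch, already rendered to text. Hypothetical type for illustration.
struct Branch {
    /// More than one entry means the pattern itself wrapped.
    pattern_lines: Vec<String>,
    result: String,
}

/// Decide once per `when`: if any branch needs its result on a new line
/// (multiline pattern, or too wide for `max_width`), every branch gets that layout.
fn results_on_new_line(branches: &[Branch], max_width: usize) -> bool {
    branches.iter().any(|b| {
        let last = b.pattern_lines.last().map_or(0, |l| l.len());
        b.pattern_lines.len() > 1 || last + " -> ".len() + b.result.len() > max_width
    })
}

/// Render all branches using the single shared decision.
fn format_when(branches: &[Branch], max_width: usize) -> String {
    let split = results_on_new_line(branches, max_width);
    let mut out = String::new();
    for b in branches {
        let pattern = b.pattern_lines.join("\n");
        if split {
            out.push_str(&format!("{pattern} ->\n    {}\n", b.result));
        } else {
            out.push_str(&format!("{pattern} -> {}\n", b.result));
        }
    }
    out
}
```

The point of the design is that the layout choice lives at the level of the whole `when`, so one slightly longer branch changes all of its siblings together instead of standing out on its own.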
I'd be interested to try that!
if we could succeed at making it feel helpful across the board, that would be awesome :smiley:
Ok, cool. So I think the right way to proceed here will be to (1) have as an initial goal that this new formatter must match the current formatter's output (potentially modulo very obscure changes), (2) prove out the improved flexibility/hackability and (hopefully) negligible perf cost, (3) get that reviewed/merged, and (4) then start experimenting with stricter formatting defaults.
Last updated: Jul 06 2025 at 12:14 UTC