New pretty-printing formatter · compiler development

Joshua Warner (Aug 20 2024 at 03:57):

Not quite ready for prime time, but far enough along that it's worth getting some feedback on.

Anton (Aug 20 2024 at 08:33):

Joshua Warner (Aug 20 2024 at 23:59):

In most current situations, we honor the original user newlines. As a motivating example, here's something that's "stable" under the current formatter (i.e. it doesn't change when formatting):

when b is
    1
    | 2
    | 3 ->
        4

    5 | 6 | 7 ->
        8

    9
    | 10 -> 11

    12 | 13 ->
        when c is
            14 | 15 -> 16
            17
            | 18 -> 19

    20 -> 21

With the new formatter, that becomes:

when b is
    1 | 2 | 3 -> 4
    5 | 6 | 7 -> 8
    9 | 10 -> 11
    12 | 13 ->
        when c is
            14 | 15 -> 16
            17 | 18 -> 19
    20 -> 21

Brendan Hansknecht (Aug 21 2024 at 00:01):

Brendan Hansknecht (Aug 21 2024 at 00:02):

Joshua Warner (Aug 21 2024 at 00:04):

We do eventually line wrap. Here's what happens when I add a bunch of long numbers:

when b is
    1
    | 2
    | 3
    | 123412341234123412341234
    | 123412341234
    | 123412341234
    | 12341234123412
    | 123412341234 -> 4
    5 | 6 | 7 -> 8
    9 | 10 -> 11
    12 | 13 ->
        when c is
            14 | 15 -> 16
            17 | 18 -> 19
    20 -> 21
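
A minimal sketch, in Rust, of the wrapping behavior shown above: the alternatives of a pattern stay on one line while they fit, and otherwise go one per line with a leading `|`. The function name `layout_alternatives` and the `indent` and `max_width` parameters are assumptions made for illustration, not the formatter's actual API or limits.

```rust
/// Lay out the alternatives of a single `when` branch pattern.
/// Returns the pattern's lines, without the trailing ` ->`.
/// `indent` and `max_width` are hypothetical knobs, not real formatter settings.
fn layout_alternatives(alts: &[&str], indent: usize, max_width: usize) -> Vec<String> {
    let pad = " ".repeat(indent);
    let flat = alts.join(" | ");

    // Keep everything on one line if it fits, reserving room for " ->".
    if indent + flat.len() + " ->".len() <= max_width {
        return vec![format!("{pad}{flat}")];
    }

    // Otherwise put each alternative on its own line, prefixing all but the
    // first with "| ".
    alts.iter()
        .enumerate()
        .map(|(i, alt)| {
            if i == 0 {
                format!("{pad}{alt}")
            } else {
                format!("{pad}| {alt}")
            }
        })
        .collect()
}
```

With an indent of 4 and a width of 20, `["5", "6", "7"]` stays on one line, while a list containing `123412341234123412341234` is split one alternative per line.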

Brendan Hansknecht (Aug 21 2024 at 00:04):

Joshua Warner (Aug 21 2024 at 00:04):

Maybe in this case it'd be cleaner if the branch result was forced to be on its own line

Joshua Warner (Aug 21 2024 at 00:04):

Joshua Warner (Aug 21 2024 at 00:05):

Brendan Hansknecht (Aug 21 2024 at 00:05):

I'm mostly thinking about our Rust code around ast and symbol. We have a lot of really dense matches with various formatting that are of this style.

Agus Zubiaga (Aug 21 2024 at 00:14):

Personally, I really like that the current formatter honors line breaks. I remember loving that feature on elm-format after using prettier for years.

Brendan Hansknecht (Aug 21 2024 at 00:24):

There are definitely a few cases in Rust where we would prefer for the Rust formatter to honor line breaks. I know of some lines in the Roc code base that are just // with nothing else in order to force a line break.

Brendan Hansknecht (Aug 21 2024 at 00:25):

So I get both sides for sure. I feel like this is one of those cases where, 99% of the time, not honoring line breaks leads to more readable code, but that 1% can really hurt if the formatter doesn't honor line breaks.

Joshua Warner (Aug 21 2024 at 00:31):

One of the things I was planning on doing (but haven't implemented yet) is always honoring blank lines - i.e. if you've explicitly put one or more blank lines in your code, those will remain in the formatted output.

Joshua Warner (Aug 21 2024 at 00:32):

That's not exactly the same as "should we honor the user's newlines" - but it is an important subset.
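
One way the "always honor blank lines" idea could be sketched in Rust: record how many blank lines the user put before each item, normalize the item itself, and re-emit the same number of blank lines. The `Item` struct and the `split_items` and `render` functions are hypothetical, and trimming trailing whitespace is only a stand-in for real formatting.

```rust
/// A source line paired with the number of blank lines the user put before it.
struct Item {
    text: String,
    blank_lines_before: usize,
}

/// Split the source into non-blank lines, remembering the blank lines
/// that preceded each one.
fn split_items(source: &str) -> Vec<Item> {
    let mut items = Vec::new();
    let mut blanks = 0;
    for line in source.lines() {
        if line.trim().is_empty() {
            blanks += 1;
        } else {
            items.push(Item {
                // Stand-in for real formatting: drop trailing whitespace.
                text: line.trim_end().to_string(),
                blank_lines_before: blanks,
            });
            blanks = 0;
        }
    }
    items
}

/// Re-emit the items, restoring exactly the blank lines the user wrote.
fn render(items: &[Item]) -> String {
    let mut out = String::new();
    for item in items {
        for _ in 0..item.blank_lines_before {
            out.push('\n');
        }
        out.push_str(&item.text);
        out.push('\n');
    }
    out
}
```

The formatter is free to change what happens inside each item; only the blank-line counts between items are carried over from the input.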

Joshua Warner (Aug 21 2024 at 00:33):

@Agus Zubiaga I'd be interested to hear more about your experience. Do you have some examples you could share where the elm formatter honoring newlines was important to maintain the readability of the code?

Agus Zubiaga (Aug 21 2024 at 00:51):

It's not so much about the readability of a fully-written snippet of code, but about the experience for partially-written code.
I find it really annoying when I'm writing a function (or case branch, if, list, etc.) that I know is going to be long in a minute, but it currently isn't, so when I save (to run tests or something) the formatter collapses it, and then I have to immediately introduce the line break again to continue writing it.

Agus Zubiaga (Aug 21 2024 at 00:53):

That experience is so much better with a formatter that honors newlines because I naturally introduce them ahead of time when they're needed

Joshua Warner (Aug 21 2024 at 01:59):

Joshua Warner (Aug 21 2024 at 02:02):

There are two potentially significant pieces of information here: where the user put newlines, but also where the user may have intentionally _not_ put newlines. (e.g. the user may intentionally not put a line break after every element of a list, to make it more compact or to show some structure)

Right now the current formatter will completely disregard any such decisions from the user.

Joshua Warner (Aug 21 2024 at 02:03):

What I'd like to find is what the right "happy medium" is in terms of what "newline" information we preserve and what we ignore.

Joshua Warner (Aug 21 2024 at 02:03):

(it could be that the best "happy medium" is pretty close to the current behavior - but for the sake of discussion I would like to open the possibilities a bit more)

Joshua Warner (Aug 21 2024 at 02:05):

Joshua Warner (Aug 21 2024 at 02:06):

I've noticed this before when coding in Rust: I might have just written a function definition with a pair of empty braces that I'm going to fill in later. But then I format the code and the formatter collapses the braces onto one line (with the function definition).

Joshua Warner (Aug 21 2024 at 02:06):

Joshua Warner (Aug 21 2024 at 02:07):

(I have format-on-save turned on, and I decided to hit save after writing the braces)

Joshua Warner (Aug 21 2024 at 02:10):

I do like the behavior of collapsing empty braces onto the same line as the function definition when it's not in code I've just written, so it doesn't feel super clear cut.

Joshua Warner (Aug 21 2024 at 02:13):

What if the formatter had two different modes? A command-line/batch mode, where it does a more relaxed version of formatting that preserves more of the user's newlines, and an interactive mode that explicitly ignores most user newlines. The latter would explicitly require an additional input of "where is the user's cursor" - it would only format the surrounding definition. That could then be an explicitly triggered command: "this code is really ugly, please clean it up for me."
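
A rough Rust sketch of how the two modes might be modeled. Everything here is an assumption for illustration: the `Mode` enum, the `strict_targets` function, and the idea that definitions are available as byte ranges are not taken from the actual formatter.

```rust
use std::ops::Range;

/// Hypothetical formatter modes.
enum Mode {
    /// Command-line/batch: relaxed formatting that keeps more user newlines.
    Batch,
    /// Interactive: strictly reformat only the definition under the cursor.
    Interactive { cursor: usize },
}

/// Return the byte ranges of the definitions that should be reformatted
/// strictly (ignoring most user newlines). In batch mode there are none.
fn strict_targets(defs: &[Range<usize>], mode: &Mode) -> Vec<Range<usize>> {
    match mode {
        Mode::Batch => Vec::new(),
        Mode::Interactive { cursor } => defs
            .iter()
            .filter(|d| d.contains(cursor))
            .cloned()
            .collect(),
    }
}
```

In this sketch a cursor that falls between definitions selects nothing, so an interactive request there would leave the file unchanged.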

Joshua Warner (Aug 21 2024 at 02:13):

Richard Feldman (Aug 21 2024 at 02:27):

Richard Feldman (Aug 21 2024 at 02:31):

my reason for the preference of full control over newlines in handwritten code is basically that there are some situations where I have several chunks of code that I want to treat newlines the same way (e.g. branches in a when), and when the formatter automatically forces one of them to look different from the others because it's slightly longer, that outcome is so bad that for me it outweighs all the upside of the feature :sweat_smile:

Richard Feldman (Aug 21 2024 at 02:32):

the #1 thing I like better about current roc format compared to cargo fmt is that it never enforces making code look worse like that

Richard Feldman (Aug 21 2024 at 02:36):

that said, for generated code such as inferred types (e.g. in error messages or to display inferred types in editors), and also in a future replay debugging feature (where you can generate a .roc file which reproduces the steps you just took manually, so you can make customizable tests out of them) - I very much want it for those use cases!

Joshua Warner (Aug 21 2024 at 02:37):

This is something we could explicitly try to preserve - e.g. "like things have similar formatting decisions"

Joshua Warner (Aug 21 2024 at 02:38):

Joshua Warner (Aug 21 2024 at 02:39):

e.g. we decide whether branch results of a given when will be on the same line as the pattern or a new line, and we force that decision to be the same for all the branches of the when.
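
A minimal Rust sketch of that rule, assuming single-line patterns and results: the inline-versus-new-line decision is made once per `when`, and every branch follows it. The `Branch` struct, the `format_branches` function, and the `indent` and `max_width` parameters are hypothetical.

```rust
/// One branch of a `when`: pattern and result text (both single-line here).
struct Branch<'a> {
    pattern: &'a str,
    result: &'a str,
}

/// Format all branches with one shared decision: results go inline only if
/// every branch fits within `max_width`; otherwise every result gets its own line.
fn format_branches(branches: &[Branch<'_>], indent: usize, max_width: usize) -> String {
    let pad = " ".repeat(indent);

    // Single decision for the whole `when`.
    let all_fit = branches
        .iter()
        .all(|b| indent + b.pattern.len() + " -> ".len() + b.result.len() <= max_width);

    let mut out = String::new();
    for b in branches {
        if all_fit {
            out.push_str(&format!("{pad}{} -> {}\n", b.pattern, b.result));
        } else {
            // Result on its own line, indented one extra level (4 spaces).
            out.push_str(&format!("{pad}{} ->\n{pad}    {}\n", b.pattern, b.result));
        }
    }
    out
}
```

A single long branch therefore moves every result onto its own line, which avoids the case where one branch looks different from its siblings only because it is slightly longer.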

Richard Feldman (Aug 21 2024 at 02:39):

if we could succeed at making it feel helpful across the board, that would be awesome :smiley:

Joshua Warner (Aug 21 2024 at 02:46):

Ok, cool. So I think the right way to proceed here will be to (1) have as an initial goal that this new formatter must match the current formatter's output (potentially modulo very obscure changes), (2) prove out the improved flexibility/hackability and (hopefully) negligible perf cost, (3) get that reviewed/merged, and (4) then start experimenting with stricter formatting defaults.
