s-expression format · compiler development

Luke Boswell (Jun 24 2025 at 10:22):

I've been down a rabbit hole on the API for our s-expressions... this is what I have so far.

(can_ir
    (d_let :idx=#87
        (p_assign (8:1-8:6) :ident="main!" :idx=#73)
        (e_lambda (8:9-12:2) :idx=#86
            (args
                (p_underscore (8:10-8:11) :idx=#74))
            (e_block (8:13-12:2)
                (s_let (9:2-9:17)
                    (p_assign (9:2-9:7) :ident="world" :idx=#75)
                    (e_string (9:10-9:17) :idx=#77
                        (e_literal (9:11-9:16) :string="World")))
                (e_call (11:2-11:31)
                    (e_lookup_external
                        (ext_decl (11:2-11:14) :qualified="pf.Stdout.line!" :module="pf.Stdout" :local="line!" :kind="value" :type_var=#79))
                    (e_string (11:15-11:30)
                        (e_literal (11:16-11:29) :string="Hello, world!"))))))
    (s_import (6:1-6:17) :module="pf.Stdout" :idx=#72
        (exposes)))

Luke Boswell (Jun 24 2025 at 10:23):

I'm trying to make it easier to read and clearer by adding attributes :name=value to nodes instead of just hardcore nesting

Luke Boswell (Jun 24 2025 at 10:24):

Anthony Bullard (Jun 24 2025 at 10:27):

Anthony Bullard (Jun 24 2025 at 10:32):

It would seem that the newline rule should be a newline for each child node, and all attribute nodes are first and on the same line as the tag for the sexpr

(tag (line:col-line:col) (attr1 val) (attr2 val) ...
    (child_1_tag (line:col-line:col) (attr1 val) ...)
    (child_2_tag) ...
    ...)
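A minimal Python sketch of that newline rule, for illustration only: the `Node` class and `render` function are hypothetical names, not anything from the compiler. It assumes the region and all attributes go on the tag's line and every child starts a new line indented by four spaces.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Hypothetical in-memory node, only for demonstrating the layout rule.
    tag: str
    region: Optional[str] = None  # e.g. "6:1-6:17", printed right after the tag
    attrs: List[Tuple[str, str]] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def render(node: Node, indent: int = 0) -> str:
    """Tag, region and attributes share one line; each child gets its own line."""
    head = [node.tag]
    if node.region is not None:
        head.append(f"({node.region})")
    head.extend(f"({name} {value})" for name, value in node.attrs)
    lines = [" " * indent + "(" + " ".join(head)]
    for child in node.children:
        lines.append(render(child, indent + 4))
    # The closing paren is attached to the last line, as in the dumps in this thread.
    return "\n".join(lines) + ")"


example = Node(
    "s_import",
    "6:1-6:17",
    [("module", '"pf.Stdout"'), ("idx", "#72")],
    [Node("exposes")],
)
print(render(example))
```

Because attributes are rendered before any child, a reader can tell them apart from children purely by position on the line.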

Luke Boswell (Jun 24 2025 at 10:33):

Anthony Bullard (Jun 24 2025 at 10:33):

(tag (line col line col) (attr1 val) (attr2 val) ...
    (child_1_tag (line col line col) (attr1 val) ...)
    (child_2_tag) ...
    ...)

Luke Boswell (Jun 24 2025 at 10:34):

(can_ir
    (d_let (idx #87)
        (p_assign (8:1-8:6) (ident "main!") (idx #73))
        (e_lambda (8:9-12:2) (idx #86)
            (args
                (p_underscore (8:10-8:11) (idx #74)))
            (e_block (8:13-12:2)
                (s_let (9:2-9:17)
                    (p_assign (9:2-9:7) (ident "world") (idx #75))
                    (e_string (9:10-9:17) (idx #77)
                        (e_literal (9:11-9:16) (string "World"))))
                (e_call (11:2-11:31)
                    (e_lookup_external
                        (ext_decl (11:2-11:14) (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type_var #79)))
                    (e_string (11:15-11:30)
                        (e_literal (11:16-11:29) (string "Hello, world!")))))))
    (s_import (6:1-6:17) (module "pf.Stdout") (idx #72)
        (exposes)))

Anthony Bullard (Jun 24 2025 at 10:34):

Luke Boswell (Jun 24 2025 at 10:35):

(can_ir
    (d_let (id 87)
        (p_assign (8 1 8 6) (ident "main!") (id 73))
        (e_lambda (8 9 12 2) (id 86)
            (args
                (p_underscore (8 10 8 11) (id 74)))
            (e_block (8 13 12 2)
                (s_let (9 2 9 17)
                    (p_assign (9 2 9 7) (ident "world") (id 75))
                    (e_string (9 10 9 17) (id 77)
                        (e_literal (9 11 9 16) (string "World"))))
                (e_call (11 2 11 31)
                    (e_lookup_external
                        (ext_decl (11 2 11 14) (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type_var 79)))
                    (e_string (11 15 11 30)
                        (e_literal (11 16 11 29) (string "Hello, world!")))))))
    (s_import (6 1 6 17) (module "pf.Stdout") (id 72)
        (exposes)))

Anthony Bullard (Jun 24 2025 at 10:37):

Maybe tag the Pos node and put the line/col pairs in ()s? Like (pos (6 1) (6 17))?

Luke Boswell (Jun 24 2025 at 10:37):

Anthony Bullard (Jun 24 2025 at 10:37):

Anthony Bullard (Jun 24 2025 at 10:39):

you want to do the least amount of parsing possible, while still being readable to someone unfamiliar with the specific format (but who has a base level of understanding of sexpr)
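To make "least amount of parsing" concrete, here is a rough sketch of a generic s-expression reader in Python. The `parse` function is a hypothetical helper written for this illustration, not part of the compiler. With the space-separated form, every field comes out as its own atom, whereas a region written as `8:1-8:6` arrives as one opaque atom that needs a second, format-specific parse.

```python
import re

# One token is a paren, a double-quoted string, or a run of other non-space characters.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')


def parse(text):
    """Parse a single s-expression into nested lists of string atoms."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            raise ValueError("unexpected ')'")
        if tok != "(":
            return tok  # atom: symbol, number, or quoted string
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            items.append(read())
        if pos >= len(tokens):
            raise ValueError("missing ')'")
        pos += 1  # consume the closing paren
        return items

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the first expression")
    return tree


print(parse('(p_assign (8 1 8 6) (ident "main!") (id 73))'))
print(parse('(p_assign (8:1-8:6) (ident "main!") (idx #73))'))
```

Atoms are left as strings here; turning `73` into an integer or stripping the quotes from `"main!"` would be a separate step.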

Luke Boswell (Jun 24 2025 at 10:39):

Anthony Bullard (Jun 24 2025 at 10:39):

are we saving the ids so that we can recover the IR from the Sexpr representation?

Luke Boswell (Jun 24 2025 at 10:40):

I got a bit carried away there. Only some nodes (patterns I think?) need to show them.

Luke Boswell (Jun 24 2025 at 10:40):

Luke Boswell (Jun 24 2025 at 10:41):

I figure any node that is referenced (or could be) by another probably should show its id

Anthony Bullard (Jun 24 2025 at 10:41):

Luke Boswell (Jun 24 2025 at 10:41):

Luke Boswell (Jun 24 2025 at 10:42):

I guess we use the Region info to map back to the source, and the type information will reference a pattern in a declaration maybe.
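As a sketch of mapping a region back to the source, the Python below slices text by line and column. It assumes lines and columns are 1-based, columns count characters, and the end column is exclusive; that reading is inferred from the dumps above, where `"main!"` (5 characters) spans 8:1-8:6, and is not confirmed anywhere in this thread. `slice_region` is a hypothetical helper, and the source line in the example is a guess that happens to match the reported columns, not the actual file.

```python
def slice_region(source, region):
    """Return the text covered by (start_line, start_col, end_line, end_col).

    Assumes 1-based lines and columns with an exclusive end column.
    """
    start_line, start_col, end_line, end_col = region
    lines = source.split("\n")
    if start_line == end_line:
        return lines[start_line - 1][start_col - 1 : end_col - 1]
    first = lines[start_line - 1][start_col - 1 :]
    middle = lines[start_line : end_line - 1]  # whole lines strictly between start and end
    last = lines[end_line - 1][: end_col - 1]
    return "\n".join([first, *middle, last])


# Hypothetical source: eight empty lines, then a line 9 consistent with the regions in the dump.
source = "\n" * 8 + '\tworld = "World"'

print(slice_region(source, (9, 2, 9, 7)))
print(slice_region(source, (9, 10, 9, 17)))
```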

Anthony Bullard (Jun 24 2025 at 10:42):

Ok, as far as region you can just have it be a standard s-expr symbol if you start with @

Luke Boswell (Jun 24 2025 at 10:43):

Anthony Bullard (Jun 24 2025 at 10:43):

Anthony Bullard (Jun 24 2025 at 10:44):

Anthony Bullard (Jun 24 2025 at 10:45):

I think @ is helpful for understanding that it's a region (or related to where it's "at")

Luke Boswell (Jun 24 2025 at 10:45):

(can_ir
    (d_let (id 87)
        (p_assign @8!1-8!6 (ident "main!") (id 73))
        (e_lambda @8!9-12!2 (id 86)
            (args
                (p_underscore @8!10-8!11 (id 74)))
            (e_block @8!13-12!2
                (s_let @9!2-9!17
                    (p_assign @9!2-9!7 (ident "world") (id 75))
                    (e_string @9!10-9!17 (id 77)
                        (e_literal @9!11-9!16 (string "World"))))
                (e_call @11!2-11!31
                    (e_lookup_external
                        (ext_decl @11!2-11!14 (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type_var 79)))
                    (e_string @11!15-11!30
                        (e_literal @11!16-11!29 (string "Hello, world!")))))))
    (s_import @6!1-6!17 (module "pf.Stdout") (id 72)
        (exposes)))

Anthony Bullard (Jun 24 2025 at 10:45):

Anthony Bullard (Jun 24 2025 at 10:46):

Luke Boswell (Jun 24 2025 at 10:47):

(p_assign @8-1-8-6 (ident "main!") (id 73))
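For reference, a small Python sketch of reading a region symbol in this form back into numbers. The `Region` type and `parse_region` function are hypothetical names, and the field order (start line, start column, end line, end column) is assumed from the earlier `8:1-8:6` notation.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    start_line: int
    start_col: int
    end_line: int
    end_col: int


# "@" followed by four dash-separated non-negative integers.
REGION = re.compile(r"@(\d+)-(\d+)-(\d+)-(\d+)")


def parse_region(symbol: str) -> Region:
    """Turn a symbol such as "@8-1-8-6" into a Region, or raise ValueError."""
    match = REGION.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not a region symbol: {symbol!r}")
    return Region(*map(int, match.groups()))


print(parse_region("@8-1-8-6"))
```

Since the whole region is one ordinary symbol, a generic s-expression reader never has to know about it; only tools that care about positions need this extra step.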

Luke Boswell (Jun 24 2025 at 10:51):

Anthony Bullard (Jun 24 2025 at 10:52):

(p_assign @8-1-8-6 (ident "main!") (id 73))

Anthony Bullard (Jun 24 2025 at 10:52):

Just the smallest of comments here, but canonically, sexpr symbols should be in kebab case, not snake

Anthony Bullard (Jun 24 2025 at 10:53):

but if we are using the string value of a zig tag directly, it might not be worth a conversion
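If a conversion were wanted, it could be as small as the Python sketch below, which rewrites only the symbol that directly follows an opening paren. `kebab_tags` is a hypothetical helper for post-processing a dump, not how the compiler does it. One known limitation of this regex approach: a string literal containing something like `(foo_bar` would be rewritten too.

```python
import re

# A tag is the identifier that immediately follows an opening paren.
TAG = re.compile(r"\(([A-Za-z_][A-Za-z0-9_]*)")


def kebab_tags(sexpr: str) -> str:
    """Rewrite snake_case tags to kebab-case, leaving everything else as it is."""
    return TAG.sub(lambda m: "(" + m.group(1).replace("_", "-"), sexpr)


snake = '(e_lookup_external\n    (ext_decl @11-2-11-14 (local "line!") (type_var 79)))'
print(kebab_tags(snake))
```

The trade-off is that the printed name no longer matches the tag name in the source character for character, so a search for one will not find the other.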

Luke Boswell (Jun 24 2025 at 10:56):

Anthony Bullard (Jun 24 2025 at 11:02):

Luke Boswell (Jun 24 2025 at 11:10):

(app @1-1-3-57
    (provides @2-3-2-10
        (exposed-lower-ident (text "main!")))
    (record-field @3-28-3-55 (name "pf")
        (e-string @3-41-3-54
            (e-string-part @3-42-3-53 (raw "../main.roc"))))
    (packages @3-2-3-57
        (record-field @3-4-3-27 (name "somePkg")
            (e-string @3-13-3-26
                (e-string-part @3-14-3-25 (raw "../main.roc"))))
        (record-field @3-28-3-55 (name "pf")
            (e-string @3-41-3-54
                (e-string-part @3-42-3-53 (raw "../main.roc"))))))

Luke Boswell (Jun 24 2025 at 11:11):

Anthony Bullard (Jun 24 2025 at 11:13):

Luke Boswell (Jun 24 2025 at 11:23):

(can-ir
    (d-let (id 87)
        (p-assign @8-1-8-6 (ident "main!") (id 73))
        (e-lambda @8-9-12-2 (id 86)
            (args
                (p-underscore @8-10-8-11 (id 74)))
            (e-block @8-13-12-2
                (s-let @9-2-9-17
                    (p-assign @9-2-9-7 (ident "world") (id 75))
                    (e-string @9-10-9-17 (id 77)
                        (e-literal @9-11-9-16 (string "World"))))
                (e-call @11-2-11-31
                    (e-lookup-external
                        (ext-decl @11-2-11-14 (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type-var 79)))
                    (e-string @11-15-11-30
                        (e-literal @11-16-11-29 (string "Hello, world!")))))))
    (s-import @6-1-6-17 (module "pf.Stdout") (id 72)
        (exposes)))

Anthony Bullard (Jun 24 2025 at 11:26):

Luke Boswell (Jun 24 2025 at 11:36):

Random side question... does anyone know how to rename a file so git is happy? I changed the casing on this but I'm not sure git has carried it through.

Kiryl Dziamura (Jun 24 2025 at 11:38):

Luke Boswell (Jun 24 2025 at 11:40):

Joshua Warner (Jun 24 2025 at 15:29):

FWIW I think greppability of the s-expr names (i.e. exact match between what a thing is called in the s-expr and what it's called in the zig source) is really useful for ramping up in an unfamiliar codebase. Agree it's a bit more readable with kebab-case, but that does give me a bit of pause...
