I've been down a rabbit hole on the API for our s-expressions... this is what I have so far.
(can_ir
  (d_let :idx=#87
    (p_assign (8:1-8:6) :ident="main!" :idx=#73)
    (e_lambda (8:9-12:2) :idx=#86
      (args
        (p_underscore (8:10-8:11) :idx=#74))
      (e_block (8:13-12:2)
        (s_let (9:2-9:17)
          (p_assign (9:2-9:7) :ident="world" :idx=#75)
          (e_string (9:10-9:17) :idx=#77
            (e_literal (9:11-9:16) :string="World")))
        (e_call (11:2-11:31)
          (e_lookup_external
            (ext_decl (11:2-11:14) :qualified="pf.Stdout.line!" :module="pf.Stdout" :local="line!" :kind="value" :type_var=#79))
          (e_string (11:15-11:30)
            (e_literal (11:16-11:29) :string="Hello, world!"))))))
  (s_import (6:1-6:17) :module="pf.Stdout" :idx=#72
    (exposes)))
I'm trying to make it easier to read and clearer by adding attributes (:name=value) to nodes instead of just hardcore nesting
Doing this simplifies the rule for newlines, basically one for each child
why is :ident="main!" better than (ident "main!")?
It would seem that the newline rule should be a newline for each child node, and all attribute nodes are first and on the same line as the tag for the sexpr
(tag (line:col-line:col) (attr1 val) (attr2 val) ...
(child_1_tag (line:col-line:col) (attr1 val) ...
(child_2_tag) ...
...)
This was the key change... I'll give the formatting a go with parens again
I would probably go full sexpr and have no custom microformats to parse
(tag (line col line col) (attr1 val) (attr2 val) ...
(child_1_tag (line col line col) (attr1 val) ...
(child_2_tag) ...
...)
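Here is a minimal sketch of that newline rule in Python, just to pin it down: region and attributes stay on the same line as the tag, and every child node starts a new line. The `Node` class, the `render` function, and the two-space indent are all assumptions made for illustration; this is not the compiler's actual printer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node: a tag, an optional region, attributes, children."""
    tag: str
    region: Optional[tuple] = None                 # (line, col, line, col)
    attrs: list = field(default_factory=list)      # (name, value) pairs
    children: list = field(default_factory=list)   # nested Node values


def fmt_value(value):
    # Strings are quoted, everything else is printed as-is.
    return f'"{value}"' if isinstance(value, str) else str(value)


def render(node, depth=0):
    # The tag, the region, and every attribute share the first line.
    parts = [node.tag]
    if node.region is not None:
        parts.append("(" + " ".join(str(n) for n in node.region) + ")")
    parts.extend(f"({name} {fmt_value(value)})" for name, value in node.attrs)
    text = "  " * depth + "(" + " ".join(parts)
    # Each child goes on its own line, one level deeper.
    for child in node.children:
        text += "\n" + render(child, depth + 1)
    return text + ")"


tree = Node("s_let", (9, 2, 9, 17), [], [
    Node("p_assign", (9, 2, 9, 7), [("ident", "world")]),
    Node("e_string", (9, 10, 9, 17), [], [
        Node("e_literal", (9, 11, 9, 16), [("string", "World")]),
    ]),
])
print(render(tree))
```

Keeping attributes and children as separate fields means the printer never has to guess whether something like (exposes) is an attribute or a child.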
(can_ir
  (d_let (idx #87)
    (p_assign (8:1-8:6) (ident "main!") (idx #73))
    (e_lambda (8:9-12:2) (idx #86)
      (args
        (p_underscore (8:10-8:11) (idx #74)))
      (e_block (8:13-12:2)
        (s_let (9:2-9:17)
          (p_assign (9:2-9:7) (ident "world") (idx #75))
          (e_string (9:10-9:17) (idx #77)
            (e_literal (9:11-9:16) (string "World"))))
        (e_call (11:2-11:31)
          (e_lookup_external
            (ext_decl (11:2-11:14) (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type_var #79)))
          (e_string (11:15-11:30)
            (e_literal (11:16-11:29) (string "Hello, world!")))))))
  (s_import (6:1-6:17) (module "pf.Stdout") (idx #72)
    (exposes)))
ok, i don't think the # is necessary
(can_ir
  (d_let (id 87)
    (p_assign (8 1 8 6) (ident "main!") (id 73))
    (e_lambda (8 9 12 2) (id 86)
      (args
        (p_underscore (8 10 8 11) (id 74)))
      (e_block (8 13 12 2)
        (s_let (9 2 9 17)
          (p_assign (9 2 9 7) (ident "world") (id 75))
          (e_string (9 10 9 17) (id 77)
            (e_literal (9 11 9 16) (string "World"))))
        (e_call (11 2 11 31)
          (e_lookup_external
            (ext_decl (11 2 11 14) (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type_var 79)))
          (e_string (11 15 11 30)
            (e_literal (11 16 11 29) (string "Hello, world!")))))))
  (s_import (6 1 6 17) (module "pf.Stdout") (id 72)
    (exposes)))
10 messages were moved here from #compiler development > casual conversation by Luke Boswell.
Maybe tag the Pos node and put the line/col pairs in ()s? Like (pos (6 1) (6 17))?
It's an improvement over the current I think https://github.com/roc-lang/roc/blob/main/src/snapshots/hello_world_with_block.md#canonicalize
Definitely
you want to do the least amount of parsing possible, while still being readable to someone unfamiliar with the specific format (but who has a base level of understanding of sexpr)
I'm thinking something custom for Regions is unavoidable
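To make the "least amount of parsing" point concrete: with no custom microformats, a generic reader is all a consumer needs. A minimal sketch in Python (the `parse` name and the nested-list representation are assumptions for illustration, and string escapes are not handled):

```python
import re

# A token is a paren, a double-quoted string (no escapes), or a bare atom.
TOKEN = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def parse(text):
    """Read one s-expression into nested lists of symbols, strings and ints."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume the closing paren
            return items
        # String literals keep their quotes so they stay distinct from symbols.
        return int(tok) if tok.isdigit() else tok

    return read()


node = parse('(p_assign (8 1 8 6) (ident "main!") (id 73))')
print(node)
```

Nothing here knows about regions, ids, or attributes; they are all just lists, which is the appeal of going "full sexpr".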
are we saving the ids so that we can recover the IR from the Sexpr representation?
I got a bit carried away there. Only some nodes (patterns I think?) need to show them.
Other nodes reference them by NodeIdx. Also TypeVars are NodeIdx's
I figure any node that is referenced (or could be) by another probably should show its id
I'm just trying to understand if that's useful for debugging
I don't know though... just experimenting with it
I guess we use the Region info to map back to the source, and the type information will reference a pattern in a declaration maybe.
Ok, as far as the region goes, you can just have it be a standard s-expr symbol if you start with @
How?
@6!1-6!17 is a valid S-expr symbol
And that can just be an atom
I think @ is helpful for understanding that it's a region (or related to where it's "at")
(can_ir
  (d_let (id 87)
    (p_assign @8!1-8!6 (ident "main!") (id 73))
    (e_lambda @8!9-12!2 (id 86)
      (args
        (p_underscore @8!10-8!11 (id 74)))
      (e_block @8!13-12!2
        (s_let @9!2-9!17
          (p_assign @9!2-9!7 (ident "world") (id 75))
          (e_string @9!10-9!17 (id 77)
            (e_literal @9!11-9!16 (string "World"))))
        (e_call @11!2-11!31
          (e_lookup_external
            (ext_decl @11!2-11!14 (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type_var 79)))
          (e_string @11!15-11!30
            (e_literal @11!16-11!29 (string "Hello, world!")))))))
  (s_import @6!1-6!17 (module "pf.Stdout") (id 72)
    (exposes)))
But unfortunately : is not usually valid there
Hmm. I'm surprised with the syntax highlighting
Maybe (@ (8 1) (8 16)) is better?
Yeah I'm trying to find something nice
What about this?
(p_assign @8-1-8-6 (ident "main!") (id 73))
I think that's my favourite so far.
That looks fine to me
(p_assign @8-1-8-6 (ident "main!") (id 73))
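Since the region is now a single atom, the only custom parsing left is splitting that atom. A rough sketch in Python, assuming the four numbers are start line, start column, end line, end column (the same order as the earlier line:col-line:col form); `Region`, `parse_region` and `format_region` are hypothetical names:

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    start_line: int
    start_col: int
    end_line: int
    end_col: int


# Matches atoms such as @8-1-8-6.
REGION_ATOM = re.compile(r"@(\d+)-(\d+)-(\d+)-(\d+)")


def parse_region(atom):
    """Split a region atom into its four numbers."""
    match = REGION_ATOM.fullmatch(atom)
    if match is None:
        raise ValueError(f"not a region atom: {atom!r}")
    return Region(*(int(group) for group in match.groups()))


def format_region(region):
    """Inverse of parse_region."""
    return (f"@{region.start_line}-{region.start_col}"
            f"-{region.end_line}-{region.end_col}")


print(parse_region("@8-1-8-6"))
```

The leading @ also gives a reader a cheap test for "is this atom a region?" without looking at its position in the node.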
Just the smallest of comments here, but canonically, sexpr symbols should be in kebab case, not snake
but if we are using the string value of a zig tag directly, it might not be worth a conversion
I might do a pass through each s-expr node and change that as I go.
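For illustration only, the rename itself is mechanical. The sketch below does it as a text pass in Python over already-printed output; the real change would presumably happen where the Zig code emits each tag, and `kebab_tags` is a hypothetical helper. It only rewrites the symbol right after an opening paren, so quoted values are left alone (a string that itself contains "(foo_bar" would be a false match, which this sketch ignores).

```python
import re

# The tag is the identifier directly after an opening paren.
TAG = re.compile(r"\(([A-Za-z_][A-Za-z0-9_]*)")


def kebab_tags(text):
    """Rewrite snake_case node tags to kebab-case, leaving values untouched."""
    return TAG.sub(lambda m: "(" + m.group(1).replace("_", "-"), text)


before = '(ext_decl @11-2-11-14 (local "line!") (type_var 79))'
print(kebab_tags(before))
```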
up to you, just calling it out
you've been doing a great job with these and the snapshots!
Here's an updated PARSE section
(app @1-1-3-57
  (provides @2-3-2-10
    (exposed-lower-ident (text "main!")))
  (record-field @3-28-3-55 (name "pf")
    (e-string @3-41-3-54
      (e-string-part @3-42-3-53 (raw "../main.roc"))))
  (packages @3-2-3-57
    (record-field @3-4-3-27 (name "somePkg")
      (e-string @3-13-3-26
        (e-string-part @3-14-3-25 (raw "../main.roc"))))
    (record-field @3-28-3-55 (name "pf")
      (e-string @3-41-3-54
        (e-string-part @3-42-3-53 (raw "../main.roc"))))))
The kebab case makes it more searchable using ctrl-F
I think this looks great
And this is the app we started with, Can section
(can-ir
  (d-let (id 87)
    (p-assign @8-1-8-6 (ident "main!") (id 73))
    (e-lambda @8-9-12-2 (id 86)
      (args
        (p-underscore @8-10-8-11 (id 74)))
      (e-block @8-13-12-2
        (s-let @9-2-9-17
          (p-assign @9-2-9-7 (ident "world") (id 75))
          (e-string @9-10-9-17 (id 77)
            (e-literal @9-11-9-16 (string "World"))))
        (e-call @11-2-11-31
          (e-lookup-external
            (ext-decl @11-2-11-14 (qualified "pf.Stdout.line!") (module "pf.Stdout") (local "line!") (kind "value") (type-var 79)))
          (e-string @11-15-11-30
            (e-literal @11-16-11-29 (string "Hello, world!")))))))
  (s-import @6-1-6-17 (module "pf.Stdout") (id 72)
    (exposes)))
Beautiful
Random side question... does anyone know how to rename a file so git is happy? I changed the casing on this, but I'm not sure git has carried it through.
maybe this would help
https://adamj.eu/tech/2022/12/09/git-change-case-of-filenames/
tldr: git mv
Thank you.
FWIW I think greppability of the s-expr names (i.e. exact match between what a thing is called in the s-expr and what it's called in the zig source) is really useful for ramping up in an unfamiliar codebase. Agree it's a bit more readable with kebab-case, but that does give me a bit of pause...
Last updated: Jul 06 2025 at 12:14 UTC