Stream: compiler development

Topic: snapshot tests


view this post on Zulip Luke Boswell (Oct 08 2024 at 22:02):

I'd like to upgrade the roc_cli tests to use snapshots... because I'm going a bit crazy chasing minor changes. Is it ok to use https://rust-exercises.com/advanced-testing/02_snapshots/00_intro which describes the use of the insta library?

view this post on Zulip Luke Boswell (Oct 08 2024 at 22:03):

Ah, actually it looks like we are already using it elsewhere... I'm going to give it a go

view this post on Zulip Richard Feldman (Oct 08 2024 at 22:05):

sounds good to me! I've liked insta in the places we've used it :thumbs_up:

view this post on Zulip Luke Boswell (Oct 08 2024 at 22:12):

I think it's going to dramatically simplify that test module too which is super nice

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:14):

Should the snapshots be getting checked for diffs in CI?

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:15):

It seems to be a common source of merge conflicts

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:15):

We should be checking these in the PR review

view this post on Zulip Kiryl Dziamura (Jun 30 2025 at 22:16):

aren't they already checked for diffs? or what do you mean?

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:16):

They are checked for any diffs in CI

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:17):

And they are also easy to get merge conflicts

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:17):

I'm not sure it's a big problem or anything... just fishing for ideas/opinions

view this post on Zulip Richard Feldman (Jun 30 2025 at 22:34):

I actually would love to just give them a naming convention that's enforced, like "if the snapshot's name ends with _err2 then it should report 2 errors , and similar with like _warn2err5 etc." and we enforce that in ci

view this post on Zulip Richard Feldman (Jun 30 2025 at 22:35):

basically so if you make a change that causes snapshots to change the number of errors or warnings being reported, that's a CI failure

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:35):

Do we want fuzz crashes and the like in snapshot tests?

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:36):

I'm wondering if we should be maintaining a smaller number of higher quality snapshots, as opposed to the shotgun we currently have

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:40):

Richard Feldman said:

I actually would love to just give them a naming convention that's enforced, like "if the snapshot's name ends with _err2 then it should report 2 errors , and similar with like _warn2err5 etc." and we enforce that in ci

I'm thinking of taking this idea a little further... adding "tags" or similar flags we can use in the META section to express the intent for the snapshot.

view this post on Zulip Luke Boswell (Jun 30 2025 at 22:46):

One issue I know I want to fix is we print the Idx for patterns and type vars etc... except this isnt stable. If we change the number of nodes etc we get a really noisy diff.

I want to change the s-expressions to use Regions I think as the unifying thing as that is stable. So the type vars etc still use idx's under the hood but when displaying they lookup the region for the referenced node and use that.

view this post on Zulip Richard Feldman (Jun 30 2025 at 23:27):

I think those will still change quite a bit when the node structure changes

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:32):

Maybe I didn't explain it very well. I've made the change and I think it looks much nicer.

SOURCE

module [foo]

foo : Str
foo = "one"

CANONICALIZE

(can-ir
    (d-let
        (p-assign @4.1-4.4 (ident "foo"))
        (e-string @4.7-4.12
            (e-literal @4.8-4.11 (string "one")))
        (annotation @4.1-4.4
            (declared-type
                (ty @3.7-3.10 (name "Str"))))))

TYPES

(inferred-types
    (defs
        (d-assign @4.1-4.4 (type "Str")))
    (expressions
        (expr @4.7-4.12 (type "Str"))))

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:33):

Now we don't have any node id's. I think this will reduce the noise a lot.

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:42):

@Jared Ramirez in the TYPES section above, would d-assign be clearer as pattern?

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:42):

Or even patt

view this post on Zulip Jared Ramirez (Jun 30 2025 at 23:43):

Imo pattern of or d-pat is clearer

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:44):

I like patt it's the same length as expr so everything lines up.

view this post on Zulip Jared Ramirez (Jun 30 2025 at 23:45):

I also have a todo in there for this: sometimes it's nice to print out the type var IDs and the entire type store when debugging. This definitely should not be committed but it would be cool if there was a way to easily enable that with a flag.

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:45):

d-pat would be similar, but we already have defs as the parent, so maybe we don't need the d- prefix

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:45):

Should be trivial to add a flag to debug print the type store. We have --verbose on the snapshot tool already.

view this post on Zulip Jared Ramirez (Jun 30 2025 at 23:46):

ptrn? patt is fine tho

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:47):

Also what do you think of flattening it out and removing the type as well?

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:48):

(inferred-types
    (defs
        (d-assign @4.1-4.4 "Str"))
    (expressions
        (expr @4.7-4.12 "Str")))

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:51):

Nvm I remembered we are using the parens for attributes, only region info is displayed without enclosing parens because it's so obvious.

view this post on Zulip Luke Boswell (Jun 30 2025 at 23:56):

@Joshua Warner this should also make the side-by-side HTML version your working on much easier I think.

view this post on Zulip Luke Boswell (Jul 01 2025 at 00:13):

And it's magical... we're back to diffs in PR's that are related to the changes :clap: :smiley:

view this post on Zulip Brendan Hansknecht (Jul 01 2025 at 03:19):

Luke Boswell said:

Do we want fuzz crashes and the like in snapshot tests?

I think we definitely should keep them. They are minimal repros for bugs that were not caught elsewhere.

view this post on Zulip Brendan Hansknecht (Jul 01 2025 at 03:20):

Unless snapshot tests take too long (which I don't think should ever happen), we should keep as many as possible, especially small ones that reproduce issues we find

view this post on Zulip Luke Boswell (Jul 01 2025 at 03:21):

@Brendan Hansknecht did you have any thoughts on my upper bound perf idea? I mentioned you in the snapshot summary PR

view this post on Zulip Brendan Hansknecht (Jul 01 2025 at 03:21):

Luke Boswell said:

Richard Feldman said:

I actually would love to just give them a naming convention that's enforced, like "if the snapshot's name ends with _err2 then it should report 2 errors , and similar with like _warn2err5 etc." and we enforce that in ci

I'm thinking of taking this idea a little further... adding "tags" or similar flags we can use in the META section to express the intent for the snapshot.

That's a good idea.

view this post on Zulip Anton (Jul 01 2025 at 10:18):

Once the reference counting pass gets added to the snapshots, the exact position of the incs and decs is definitely important. Example

view this post on Zulip Brendan Hansknecht (Jul 01 2025 at 15:13):

Yes, though those are maybe better tested with valgrind and running code.... That said, having both is totally reasonable


Last updated: Jul 06 2025 at 12:14 UTC