Stream: beginners

Topic: pattern matching bug


view this post on Zulip Nick Hallstrom (Mar 13 2023 at 17:00):

Shouldn't this work?

scanFinal : [Integer Str, Float Str, Ident Str, Slash, Comment Str, String Str, Not, Eq, Lt, Gt, Start] -> [Number Str, Ident Str, Slash, Comment Str, String Str, Not, Eq, Lt, Gt, Start]
scanFinal = \state ->
    when state is
        Start -> Newline
        Integer n | Float n -> Number n
        Ident a -> Ident a
        Slash -> Slash
        Comment c -> Comment c
        String s -> String s
        Not -> Not
        Eq -> Eq
        Lt -> Lt
        other -> other

If I simply change other -> other to Gt -> Gt then it works. The error I'm getting is:

This other value is a:

    [Comment Str, Eq, Float Str, Gt, Ident Str, Integer Str, Lt, Not,
    Slash, Start, String Str]

But the type annotation on scanFinal says it should be:

    [Comment Str, Eq, Ident Str, Lt, Newline, Not, Number Str, Slash,
    String Str]

view this post on Zulip Folkert de Vries (Mar 13 2023 at 17:02):

if you look at the types, then no, it should not work

view this post on Zulip Nick Hallstrom (Mar 13 2023 at 17:03):

Can you help me understand then? It seems like changing other -> other to Gt -> Gt should be equivalent here, no?

view this post on Zulip Folkert de Vries (Mar 13 2023 at 17:03):

because of how tag unions are represented, it is very likely that the memory representation changes by this function

view this post on Zulip Nick Hallstrom (Mar 13 2023 at 17:04):

I'm just trying to avoid having to have all the ones that are the same on the left and right side of the -> and it seems like other -> other should cover all those

view this post on Zulip Folkert de Vries (Mar 13 2023 at 17:07):

intuitively, yes.

But the Gt tag can look very differently on either side of the arrow, in practice. Consider

foo : [ Gt, Lit Str ] -> [ Gt, Lit U8 ]
foo = \input ->
    when input is
        Lit str -> Lit (Num.intCast (Str.countBytes str))
        Gt -> Gt

The [ Gt, Lit Str ] value requires 24 + 1 = 25 bytes of storage (which is rounded up to 32 because of alignment. In contrast [ Gt, Lit U8 ] requires just 2 bytes of storage. That means the value from the left-hand side cannot just be copied to be used on the right-hand side.

view this post on Zulip Nick Hallstrom (Mar 13 2023 at 17:13):

That's only if other -> other means to blindly copy the value from left side to right side. It seems like other should be equivalent to "whatever tags in the union we haven't matched yet". I can see how this wouldn't work if the input was an open tag union, but for a closed tag union it seems like this could be made to work

view this post on Zulip Brendan Hansknecht (Mar 13 2023 at 17:54):

It definitely is an interesting question which should be the default. currently other -> other links the input and output type. Which make sense given it is an identity function. I think we need to add some sort of helper to allow explicit conversion in cases like this.

view this post on Zulip Brendan Hansknecht (Mar 13 2023 at 17:55):

If we make it implicit, it means that we could have tons of data movement and copying for something that should be free. Would be a very odd performance bug to nail down for end users.

view this post on Zulip Folkert de Vries (Mar 13 2023 at 17:56):

I think zig's inline switch does something like this

view this post on Zulip Folkert de Vries (Mar 13 2023 at 17:57):

where you can do x -> x but really what it does is inline all of the actual things that x could be

view this post on Zulip Richard Feldman (Mar 13 2023 at 18:07):

@Ayaz Hafiz has an issue (I can't find it at the moment) discussing how we could make other -> other Just Work - and yeah it would have an unavoidable runtime performance cost because it would have to create a new one from scratch based on the old one

view this post on Zulip Richard Feldman (Mar 13 2023 at 18:07):

I think that would be a reasonable default

view this post on Zulip Richard Feldman (Mar 13 2023 at 18:08):

it seems like it would be an extremely extremely rare performance case where not only was changing this significant, but there was actually room for performance improvement in practice!

view this post on Zulip Richard Feldman (Mar 13 2023 at 18:08):

like the number of scenarios where you write other -> other and there's a significantly more performant way to have your code do what it needs to do seems extremely small :sweat_smile:

view this post on Zulip Richard Feldman (Mar 13 2023 at 18:09):

and if you can't meaningfully improve it anyway, it may not be all that beneficial knowing what it's doing from a perf perspective to begin with

view this post on Zulip Brendan Hansknecht (Mar 13 2023 at 18:13):

Tags live on the stack by default, right?

view this post on Zulip Richard Feldman (Mar 13 2023 at 18:14):

yeah

view this post on Zulip Ayaz Hafiz (Mar 13 2023 at 18:20):

i don’t know if i have an issue for it, but i do have a blog post about how to do this and the trade offs: https://ayazhafiz.com/articles/22/simple-flow-refinement-of-anonymous-sum-types


Last updated: Jul 05 2025 at 12:14 UTC