Stream: ideas

Topic: Request for ideas on enum decoding


view this post on Zulip Eli Dowling (Apr 07 2024 at 08:36):

So I've been working on getting the LSP spec working in Roc and honestly it's been pretty painful; enums and unions are really pretty rough. Here is an example of what's needed to decode a single enum:

interface CompletionItemKind
    exposes [
        CompletionItemKind,
    ]
    imports [
        DecodeUtils,
    ]

CompletionItemKind := [
    Text,
    Method,
    Function,
    Constructor,
    Field,
    Variable,
    Class,
    Interface,
    Module,
    Property,
    Unit,
    Value,
    Enum,
    Keyword,
    Snippet,
    Color,
    File,
    Reference,
    Folder,
    EnumMember,
    Constant,
    Struct,
    Event,
    Operator,
    TypeParameter,
]
    implements [
        Decoding {
            decoder: decodeCompletionItemKind,
        },
        Encoding {
            toEncoder: encodeCompletionItemKind,
        },
    ]
get = \@CompletionItemKind val -> val
from = \val -> @CompletionItemKind val

decodeCompletionItemKind =
    ok = \tag -> Ok (@CompletionItemKind tag)
    DecodeUtils.wrapDecode \val ->
        when val is
            1 -> ok Text
            2 -> ok Method
            3 -> ok Function
            4 -> ok Constructor
            5 -> ok Field
            6 -> ok Variable
            7 -> ok Class
            8 -> ok Interface
            9 -> ok Module
            10 -> ok Property
            11 -> ok Unit
            12 -> ok Value
            13 -> ok Enum
            14 -> ok Keyword
            15 -> ok Snippet
            16 -> ok Color
            17 -> ok File
            18 -> ok Reference
            19 -> ok Folder
            20 -> ok EnumMember
            21 -> ok Constant
            22 -> ok Struct
            23 -> ok Event
            24 -> ok Operator
            25 -> ok TypeParameter
            _ -> Err TooShort

encodeCompletionItemKind = \@CompletionItemKind val ->
    num =
        when val is
            Text -> 1
            Method -> 2
            Function -> 3
            Constructor -> 4
            Field -> 5
            Variable -> 6
            Class -> 7
            Interface -> 8
            Module -> 9
            Property -> 10
            Unit -> 11
            Value -> 12
            Enum -> 13
            Keyword -> 14
            Snippet -> 15
            Color -> 16
            File -> 17
            Reference -> 18
            Folder -> 19
            EnumMember -> 20
            Constant -> 21
            Struct -> 22
            Event -> 23
            Operator -> 24
            TypeParameter -> 25
    Encode.u32 num

I was hoping someone might have some ideas how this might be improved. If no such improvement exists I think a discussion of some improved syntax for defining number and string enum encodings might be warranted.

view this post on Zulip Eli Dowling (Apr 07 2024 at 08:48):

For comparison here is the same definition in C#:

    public enum CompletionItemKind
    {
        Text = 1,
        Method = 2,
        Function = 3,
        Constructor = 4,
        Field = 5,
        Variable = 6,
        Class = 7,
        Interface = 8,
        Module = 9,
        Property = 10,
        Unit = 11,
        Value = 12,
        Enum = 13,
        Keyword = 14,
        Snippet = 15,
        Color = 16,
        File = 17,
        Reference = 18,
        Folder = 19,
        EnumMember = 20,
        Constant = 21,
        Struct = 22,
        Event = 23,
        Operator = 24,
        TypeParameter = 25,
    }

view this post on Zulip Richard Feldman (Apr 07 2024 at 12:42):

yeah I've thought about this in the past

view this post on Zulip Richard Feldman (Apr 07 2024 at 12:43):

one idea is to allow something like this:

CompletionItemKind := [
    Text = 1,
    Method = 2,
    ...etc
]

and then only allow that syntax specifically when defining opaque types

view this post on Zulip Richard Feldman (Apr 07 2024 at 12:44):

and not when defining type aliases or anonymous tag union types

view this post on Zulip Richard Feldman (Apr 07 2024 at 12:47):

there are some follow-up questions there though, such as:

view this post on Zulip Richard Feldman (Apr 07 2024 at 12:49):

another case where this comes up is http response codes

view this post on Zulip Richard Feldman (Apr 07 2024 at 12:49):

like wanting to be able to say NotFound and have that turn into 404
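
For comparison only (this is Python, not Roc, and not a proposed Roc API), the standard library's `http.HTTPStatus` already behaves the way this idea describes: the number lives in the enum definition, so both directions come for free. The helper name `status_from_code` is made up for this sketch.

```python
from http import HTTPStatus

# Tag -> number: the number is part of the enum definition itself.
not_found_code = HTTPStatus.NOT_FOUND.value


def status_from_code(code: int):
    """Number -> tag. Returns None when no tag has that number."""
    try:
        return HTTPStatus(code)
    except ValueError:
        return None
```

An unknown code comes back as `None` here, which plays roughly the role that an `Err` result would play in a Roc decoder.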

view this post on Zulip Eli Dowling (Apr 07 2024 at 12:53):

I was thinking something similar.
Would it be remotely possible within type aliases?
With an opaque type we do still have to annoyingly convert to and from using the get and from methods.

I was thinking, could we perhaps attach some extra field to the type at compile time and then use that when generating the decoder/encoder for those tags?
So you can just write: CompletionKind:[Text=1,Method=2]

view this post on Zulip Eli Dowling (Apr 07 2024 at 12:57):

I'd say a definite no to "enum tags" with payloads. It's hard to imagine how that would work in most of the "key value" style encodings.

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:03):

the problem with doing it in type aliases is that there could be conflicts when unions merge

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:04):

like if I have [Foo, Bar] and that gets unioned with [Foo, Baz], that results in [Foo, Bar, Baz]

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:05):

but what if in the first one I said Foo is 1 and in the second one I said Foo is 3...in the combined union, what is Foo?
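
The conflict can be sketched outside Roc. The Python below is only an illustration of the question being asked: it models each union as a tag-to-number mapping, and `merge_tag_numbers` is a hypothetical helper, not anything the Roc compiler does.

```python
def merge_tag_numbers(left: dict, right: dict) -> dict:
    """Merge two tag -> number mappings, refusing conflicting assignments."""
    merged = dict(left)
    for tag, number in right.items():
        if tag in merged and merged[tag] != number:
            raise ValueError(
                f"{tag} is {merged[tag]} in one union and {number} in the other"
            )
        merged[tag] = number
    return merged


# Without explicit numbers, merging is just a set union and nothing can clash.
plain_merge = sorted({"Foo", "Bar"} | {"Foo", "Baz"})
```

With `Foo = 1` on one side and `Foo = 3` on the other, the only options are to raise (as here), to pick a winner, or to drop the numbers, which are the alternatives the thread goes on to discuss.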

view this post on Zulip Eli Dowling (Apr 07 2024 at 13:06):

Ahh, yeah I see what you mean. It does seem like something that should probably only exist in annotations, these types are at the edges of your program after all. Like if your unions merge that info gets dropped.

Could it be required that it be a closed union? Would that fix it?

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:06):

right now since you can't specify, the compiler is free to choose whatever numbers it wants and there's no ambiguity

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:06):

hm maybe, but I think it could still be confusing

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:08):

opaque types seem like a better fit for this because they don't get unioned etc.

view this post on Zulip Eli Dowling (Apr 07 2024 at 13:14):

Do we plan to encode standard un-opaque tag unions as numbers or strings?

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:14):

no plans, but if there's some motivating use case we could discuss it

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:23):

one idea for how to go to/from numbers would be abilities:

ToU8 implements {
    toU8 : val -> U8
        where val implements ToU8
}
FromU8 implements {
    fromU8 : U8 -> Result val [OutOfRange]
        where val implements FromU8
}

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:27):

and then auto implement those for opaque types which use that syntax

view this post on Zulip Eli Dowling (Apr 07 2024 at 13:31):

Richard Feldman said:

opaque types seem like a better fit for this because they don't get unioned etc.

I do think not having to wrap and unwrap the opaque type is a significant ergonomic improvement when working with these types.

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:39):

I just don't see how it can work though haha

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:41):

like if I write:

Even : [Foo = 2, Bar = 4]
Odd : [Foo = 1, Baz = 5]

getAnEven : Stuff -> Even
getAnOdd : Stuff -> Odd

x =
    if something then
        getAnEven stuff1
    else
        getAnOdd stuff2

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:41):

what is the type of x?

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:42):

or is the idea that [Foo = 1] is a totally separate type from [Foo] and now we track numbers as part of the type?

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:42):

I guess that's an option

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:43):

in which case x would be a type mismatch, whereas it would work fine if you removed the type annotations or removed the = 1 =2 etc

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:47):

as a brief aside, one of the notable things about having this baked into the language is that there can be a performance benefit compared to the status quo

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:48):

so there's the ergonomic piece of having the conversion to/from number be auto-derived instead of having to write it out (like you do today), but then separately there's the piece of having the in-memory representation already be the number, so converting it to the desired number is free at runtime

view this post on Zulip Eli Dowling (Apr 07 2024 at 13:49):

Basically, but potentially you could have it so that when you merge the tag unions it just drops the numbers, also Foo could automatically cast to Foo=1

I'll make some examples and play with the idea a bit more tomorrow when I'm at my computer.

Basically I was thinking you have to do these annotations at the decode encode edges of your program and it all gets automatically converted to and from normal tag unions.

view this post on Zulip Richard Feldman (Apr 07 2024 at 13:50):

isn't that essentially the opaque type design? :thinking:

view this post on Zulip Jasper Woudenberg (Apr 07 2024 at 16:38):

How would you feel about storing the mapping in a dictionary, with maybe two helpers encoderFromDict and decoderFromDict?

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 17:17):

In a case like this it would probably be better to just use a list and make the index be the integer. Dict would be unnecessarily slow for something this small.

view this post on Zulip Eli Dowling (Apr 07 2024 at 21:16):

I have tried that, but it's messy and I dislike it. It's easy to add another tag and then forget to add it to the list and then get a panic
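
To make that failure mode concrete, here is a small Python sketch (not Roc; `Kind`, `KINDS_BY_INDEX`, `encode`, and `decode` are made-up names). A tag is added to the type but not to the hand-maintained list, and nothing complains until the forgotten tag is actually encoded.

```python
from enum import Enum, auto


class Kind(Enum):
    TEXT = auto()
    METHOD = auto()
    FUNCTION = auto()  # newly added tag


# Hand-maintained lookup list: wire number = index + 1.
# FUNCTION was never added here, and nothing flags the omission.
KINDS_BY_INDEX = [Kind.TEXT, Kind.METHOD]


def decode(number: int):
    """Return the tag for a wire number, or None if it is out of range."""
    index = number - 1
    if 0 <= index < len(KINDS_BY_INDEX):
        return KINDS_BY_INDEX[index]
    return None


def encode(kind: Kind) -> int:
    # list.index raises ValueError for a tag missing from the list,
    # which is the runtime failure described above.
    return KINDS_BY_INDEX.index(kind) + 1
```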

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:18):

Yeah, sorry, I was saying use a list instead of a dict (cause perf), probably still isn't nice to use a list in general

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:28):

I feel like enabling this feature on type aliases will likely cause problems due to type inference.

Like with default values, this can't be defined only in the type system. It has to be defined even if the user never specified types.

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:31):

That's where I expect the complexity of this feature to come in. That and by default everything is ordered alphabetically, not in definition order.

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:32):

The default order will make autoderive not useful by default.

view this post on Zulip Eli Dowling (Apr 07 2024 at 21:32):

Well it wouldn't make any sense if the user didn't specify types. I'd say in this case it doesn't make sense to infer it. Like, you don't know what values to assign to each tag, sometimes enums have gaps or start at 0. The user should be made to annotate the type or just not use the feature

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:34):

Yeah, so then it would be an opaque type feature

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:38):

Though, maybe we can accept that some features are only allowed when adding a type definition. That's what I want us to do for default values as well. Only allow them in type definitions and require the type definition to use them at all.

The big issue with that is it means you can't comment out the type alias and leave roc to infer the type. It will change the semantics of the program.

view this post on Zulip Eli Dowling (Apr 07 2024 at 21:46):

True. But you can say the same about opaque types. If this feature replaces the need for them in this case and improves ergonomics... have we given up anything?

I'm not saying we should, I'd just like to explore all options so we can be aware of the compromises we are making by picking one :)

I think being able to define a type like this:

Message : {
    messageType : [Text = 1, Email = 2],
    body : Str,
}

Is about as good as the ergonomics could be, and far superior to an opaque type where you need these functions:

get: ....
from: ...
# and maybe
text: ...
email: ...

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:49):

Yeah, I 100% agree. Which is why I also want us to put default values into the type definition. I think giving extra power if you add type definitions is super valuable.

At least so far, it just isn't a tradeoff that has been accepted.

view this post on Zulip Eli Dowling (Apr 07 2024 at 21:53):

Cool, well maybe as we accumulate more cases where it would be an improvement it might even out that tradeoff more :).
I do like that both these features wouldn't at all prevent you from never annotating a type. Just give extra power if you do

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 21:58):

I think the big difference with opaque types is that the compiler can infer them even if you comment out all type info.

This is due to the @MyOpaque someData that clearly adds the type info even though types are not explicitly specified.

So this is inferable with the types commented out

MyOpaque := [
    Foo = 7,
    Bar = 3
] implements [
    ToU8, # auto derive impl
]

# This program does the same thing with this commented out
# Foo is always 7
# someFunc : MyOpaque -> List U8
someFunc = \myOpaque ->
    Encode.encode myOpaque

main =
    ....
    # This adds in the type info due to the `@MyOpaque`
    someFunc (@MyOpaque Foo)

This changes semantics:

MyAlias : [
    Foo = 7,
    Bar = 3
]

# This encodes Foo as 1 if commented out and as 7 if typed.
# someFunc : MyAlias -> List U8
someFunc = \myAlias ->
    Encode.encode myAlias

main =
    ....
    # No added type info to connect to MyAlias
    someFunc Foo

view this post on Zulip Richard Feldman (Apr 07 2024 at 22:58):

a relevant distinction between this and default record fields is whether the program is still runnable without the annotation

view this post on Zulip Richard Feldman (Apr 07 2024 at 22:59):

as opposed to giving a type mismatch at compile time (due to missing fields and not knowing what to use as a default value) and then having to crash at runtime if the program is run anyway

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 23:37):

I think a compiler error is the better case. So I would label default values as safer. It blocks compilation instead of compiling a subtly incorrect program.

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 23:44):

Specifically I am imagining the ci and debugging story (if it gets into prod).

view this post on Zulip witoldsz (Apr 08 2024 at 08:11):

I have never come across a case where I had to map enums to integers (or almost never); it is always an enum to a string instead. If anything from the discussion goes into Roc, will it be limited to enum↔integer mappings?

view this post on Zulip Eli Dowling (Apr 08 2024 at 08:15):

If we did implement some alternative syntax I would like it to work for both strings and ints.
C and CPP and C# and Java all use ints for enums. I've used quite a lot of APIs that use that. String enums are a very JS thing in my mind

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 14:42):

I have seen limited use of string enums in printing/parsing. I mean that is even what decode does by default, but the name exactly matches the tag.

view this post on Zulip Isaac Van Doren (Apr 08 2024 at 23:19):

We convert enums to strings all the time in our Java code base at work. I can’t think of any instances of converting them to numbers.

view this post on Zulip Eli Dowling (Apr 08 2024 at 23:24):

Ah okay, my bad. I should have checked that, I thought I remembered it being the same but it seems like that's a C, CPP, C# thing.

view this post on Zulip Isaac Van Doren (Apr 08 2024 at 23:29):

You know I thought that there were automatically assigned numbers for enums in Java but looks like there actually aren’t

view this post on Zulip Brendan Hansknecht (Apr 09 2024 at 00:24):

So it sounds like both numbers and strings are common.

Obviously due to type checking, it is easy to go from enum to number/string and ensure you haven't missed anything (even if it is a bit verbose, it is just a simple when ... is). Going the other way is where most of the pain comes in. No clean way to match. No clean way to ensure you didn't actually miss something.

Of course both could be automatically generated, but the general concern may be solvable even in a simpler manner.

view this post on Zulip Jasper Woudenberg (Apr 09 2024 at 07:09):

Suppose a person defines a toEnum function with a when statement:

toEnum = \tag ->
    when tag is
        Foo -> 0
        Bar -> 1
        ...

Another approach might be for the standard library to include Decode.enum or Encode.enum helpers, that take a function like the above and generate an encoder or decoder from it.

encoder = Encode.enum toEnum

decoder = Decode.enum toEnum

The implementation of Decode.enum would have to 'reverse' the implementation of toEnum, which it could do by calling toEnum once with every value in the tag. Don't know how tricky this would be, but I imagine it might be possible to implement in the standard library?

To support strings, we could instead have stdlib helpers Encode.strEnum, Encode.intEnum, and similar decoders.
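
A rough Python sketch of the "reverse the function" idea, under the assumption that every tag can be enumerated (which is exactly the part Roc does not offer in userland today). `Tag`, `to_enum`, and `decoder_from` are names invented for this sketch; only the one-way mapping is hand-written.

```python
from enum import Enum, auto


class Tag(Enum):
    FOO = auto()
    BAR = auto()
    BAZ = auto()


def to_enum(tag: Tag) -> int:
    """The hand-written one-way mapping (tag -> wire number)."""
    if tag is Tag.FOO:
        return 0
    if tag is Tag.BAR:
        return 1
    if tag is Tag.BAZ:
        return 5
    raise ValueError(tag)


def decoder_from(to_number, all_tags):
    """Build number -> tag by calling to_number once for every tag."""
    table = {}
    for tag in all_tags:
        number = to_number(tag)
        if number in table:
            raise ValueError(f"{tag} and {table[number]} both map to {number}")
        table[number] = tag
    return table.get  # the returned lookup yields None for unknown numbers


decode = decoder_from(to_enum, Tag)
```

The table is built once, so decoding afterwards is a single lookup. The duplicate check also catches two tags accidentally sharing a number.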

view this post on Zulip Eli Dowling (Apr 09 2024 at 07:14):

Oh hey, that's a really cool idea! It's certainly a lot simpler than adding new syntax and stuff. Great suggestion!
We could definitely do that using the "macros" within the compiler(like the way deriving decoding is implemented).

view this post on Zulip Brendan Hansknecht (Apr 09 2024 at 14:47):

Could it be implemented in a reasonable way though? I'd assume it would either be a super brittle pattern match or it would have a large runtime cost.

view this post on Zulip timotree (Apr 09 2024 at 16:38):

Brendan Hansknecht said:

Going the other way is where most of the pain comes in. No clean way to match. No clean way to ensure you didn't actually miss something.

Another approach to helping you ensure you didn't miss something could be making that easy to test. Currently if you wanted to test "for each possible tag t of my tag union, decode (encode t) == Ok t" how would you write that?

view this post on Zulip Brendan Hansknecht (Apr 09 2024 at 18:05):

We don't have a good way sadly. Would be a list that has to be manually updated as well.
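
For reference, this is what the property looks like in a language where the tags of an enum can be iterated. It is a Python sketch with a shortened `CompletionItemKind`; `encode`, `decode`, and `round_trips` are names chosen for this example.

```python
from enum import IntEnum


class CompletionItemKind(IntEnum):
    TEXT = 1
    METHOD = 2
    FUNCTION = 3


def encode(kind: CompletionItemKind) -> int:
    return int(kind)


def decode(number: int):
    """Return the matching tag, or None if no tag has that number."""
    try:
        return CompletionItemKind(number)
    except ValueError:
        return None


def round_trips() -> bool:
    # "for each possible tag t, decode (encode t) == t"
    return all(decode(encode(kind)) is kind for kind in CompletionItemKind)
```

Because the loop walks the type itself rather than a separate list, a newly added tag is covered without touching the test.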

view this post on Zulip witoldsz (Apr 09 2024 at 22:55):

Richard Feldman said:

one idea is to allow something like this:

CompletionItemKind := [
    Text = 1,
    Method = 2,
    ...etc
]

and then only allow that syntax specifically when defining opaque types

Many languages have a possibility to annotate different things, like enums and then serialization libraries use it to decode/encode automatically. This is the simplest way, because you define everything once and in-place. So, following @Richard Feldman we could define:

CompletionItemKind := [
    Text = 1,
    Method = 2,
    ...etc
]

or

CompletionItemKind := [
    Text = "text",
    Method = "method",
    ...etc
]

and voilà!

But... there is a case to be made that we should not mix types (application logic) and some "trivial matters" like serialization. Or we simply cannot do it (when e.g. the enums are too generic or from other packages)... or maybe when we would have to use same enums in different serialization contexts!

Having said all that, I think we should allow for flexibility. The best of both worlds would be to let simple case be simple while not limiting other scenarios:

It would be gorgeous if it would be dead-easy to provide just one-way mapping, like @Jasper Woudenberg suggested, and the other direction would be derived :heart_eyes:

view this post on Zulip Brendan Hansknecht (Apr 09 2024 at 23:25):

One very important note:

Tags are more than just enums. Each tag can contain data. That should hopefully fit seamlessly into whatever design we pick.

view this post on Zulip witoldsz (Apr 10 2024 at 00:12):

Brendan Hansknecht said:

One very important note:

Tags are more than just enums. Each tag can contain data. That should hopefully fit seamlessly into whatever design we pick.

Currently, my colleagues are working with APIs and have to decode and encode JSON back and forth, and "enums" (i.e. tag unions without any additional payload) are super common. Most popular languages (the rumors of them being dead are sadly still premature) do not have tag unions anyway, and this is probably why most of the APIs (and so the JSON) we have to deal with have nothing that would fit them. It is always simple fields of strings, numbers and "enums".

So, it would be really great if we could:

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 00:32):

I don't see any reason to restrict to only one. Poor man's tagged enums are in a lot of APIs.

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 00:38):

I have seen many APIs in roughly the form

{
message: Enum (often string enum in json),
data: varying data type that is specific to the message variant,
}

This is a tagged union just in a poor form.
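
A small Python sketch of decoding that shape into a proper tagged value. The field names `message` and `data` come from the shape above; the variants `text` and `resize` and their payload fields are invented for illustration.

```python
import json


def decode_message(raw: str):
    """Turn {"message": <kind>, "data": <payload>} into a (tag, *payload) tuple."""
    obj = json.loads(raw)
    kind, data = obj["message"], obj["data"]
    # The enum field decides how the payload is interpreted.
    if kind == "text":
        return ("Text", str(data["body"]))
    if kind == "resize":
        return ("Resize", int(data["width"]), int(data["height"]))
    raise ValueError(f"unknown message kind: {kind!r}")
```

The enum-to-tag step and the payload step are separate, which is why an enum mapping alone covers only half of this kind of API.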

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 00:41):

Also, more properly typed protocols like protobuf have something that maps directly to tagged unions.

view this post on Zulip Luke Boswell (Apr 10 2024 at 01:57):

I wanted to check I understood what we are talking about here, so I made the below example which is a simple encoding and decoding of a list of CompletionItemKind. This may be helpful for others so sharing here. The below app will print the following to stdout:

$ roc dev example.roc
(@CompletionItemKind Text)
(@CompletionItemKind Method)
(@CompletionItemKind Function)
(@CompletionItemKind Constructor)
(@CompletionItemKind Field)
(@CompletionItemKind Variable)
(@CompletionItemKind Class)
(@CompletionItemKind Interface)
(@CompletionItemKind Module)

app "example"
    packages {
        pf: "https://github.com/roc-lang/basic-cli/releases/download/0.8.1/x8URkvfyi9I0QhmVG98roKBUs_AZRkLFwFJVJ3942YA.tar.br",
        json: "https://github.com/lukewilliamboswell/roc-json/releases/download/0.6.3/_2Dh4Eju2v_tFtZeMq8aZ9qw2outG04NbkmKpFhXS_4.tar.br",
    }
    imports [pf.Stdout.{line},json.Core.{json}]
    provides [main] to pf

main =

    input : List U8
    input = ['[','1',',','2',',','3',',','4',',','5',',','6',',','7',',','8',',','9',']']

    itemKinds : List CompletionItemKind
    itemKinds = Decode.fromBytes input json |> Result.withDefault []

    itemKinds
    |> List.map Inspect.toStr
    |> Str.joinWith "\n"
    |> Stdout.line

CompletionItemKind := [
    Text,
    Method,
    Function,
    Constructor,
    Field,
    Variable,
    Class,
    Interface,
    Module,
] implements [
    Decoding { decoder: decodeThing },
    Encoding { toEncoder: encodeThing },
    Inspect,
]

decodeThing : Decoder CompletionItemKind fmt
decodeThing = Decode.custom \bytes, _ ->

    ok : _, List U8 -> DecodeResult CompletionItemKind
    ok = \tag, rest -> {result: Ok (@CompletionItemKind tag), rest}

    when bytes is
        ['1', .. as rest] -> ok Text rest
        ['2', .. as rest] -> ok Method rest
        ['3', .. as rest] -> ok Function rest
        ['4', .. as rest] -> ok Constructor rest
        ['5', .. as rest] -> ok Field rest
        ['6', .. as rest] -> ok Variable rest
        ['7', .. as rest] -> ok Class rest
        ['8', .. as rest] -> ok Interface rest
        ['9', .. as rest] -> ok Module rest
        _ -> {result: Err TooShort, rest: bytes}

encodeThing : CompletionItemKind -> Encoder fmt
encodeThing = \@CompletionItemKind tag -> Encode.custom \bytes, _ ->

    append : U8 -> List U8
    append = \u8 -> List.append bytes u8

    # Append the ASCII digit so the output matches what decodeThing reads
    when tag is
        Text -> append '1'
        Method -> append '2'
        Function -> append '3'
        Constructor -> append '4'
        Field -> append '5'
        Variable -> append '6'
        Class -> append '7'
        Interface -> append '8'
        Module -> append '9'

view this post on Zulip Luke Boswell (Apr 10 2024 at 02:00):

It's basically the same thing as what @Eli Dowling posted at the start, just another version I guess and I've cut some of the Tags out for brevity.

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 02:32):

Yeah, so we are talking about auto generating the encode/decode implementations that you specified manually in that example.

Then it expanded scope to say you might also want to auto generate a string mapping as well

Text -> "text"
Method -> "method"
...

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 02:33):

Specifically the question is can we add information to the type such that we can auto generate encode/decode and ensure we never miss a case.

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 02:34):

The last piece on top of that is can the implementation be flexible enough to also make for easy support of tag unions that contain data rather than just enums with no data.

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 02:34):

I think that is the full set of questions being looked at here

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 02:37):

In your example, if you add a new enum variant, you will get a type error that leads to updating encode, but no help to update decode.

view this post on Zulip Eli Dowling (Apr 10 2024 at 03:01):

Well, if we do go with this design I would consider it completely separate to tag unions with data inside. Those encoders and decoders can be decided by the format. We can look at how other languages with tag unions encode this. But for json you might have something like:

"myUnion":{
  "Tag1":{
    //..tag content...
}}
//Or:
"myUnion":{
  "tag":"Tag1",
  "value":{
     //..tag contents...
}}

Enum tags are just a completely different thing
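
The two JSON shapes above can be sketched as a pair of encoders and decoders. This is Python, purely for illustration; the function names are invented here, and the `"tag"`/`"value"` keys follow the second shape in the post.

```python
import json


def encode_external(tag: str, value) -> str:
    """First shape: the tag name is the single key of the object."""
    return json.dumps({tag: value})


def encode_adjacent(tag: str, value) -> str:
    """Second shape: the tag name and the payload sit in sibling fields."""
    return json.dumps({"tag": tag, "value": value})


def decode_external(raw: str):
    obj = json.loads(raw)
    if len(obj) != 1:
        raise ValueError("expected exactly one key naming the tag")
    tag, value = next(iter(obj.items()))
    return tag, value


def decode_adjacent(raw: str):
    obj = json.loads(raw)
    return obj["tag"], obj["value"]
```

Either way the format decides the wrapping, while the tag name and payload encoding are supplied separately.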

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 03:11):

For sure, I guess you can't autoderive a tag union with data encoder in many formats, but we should be able to autoderive the two core pieces still. We should be able to autoderive the tag encoding and autoderive the contained data encoding. It would still be up to the user to decide how that wires into the final output.

view this post on Zulip Eli Dowling (Apr 10 2024 at 03:14):

I think it should be implemented in the format. We just provide the encoding for the name of the tag and the contents and the format decides how to encode and decode the two. Just like we do for record encoding and decoding. Then we just implement something like I suggested above.

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 03:15):

Which is what we do currently, right?
tag : Str, List (Encoder fmt) -> Encoder fmt where fmt implements EncoderFormatting

The Str is the encoding of the tag. The List (Encoder fmt) is how to encode each field of data.

view this post on Zulip Eli Dowling (Apr 10 2024 at 03:17):

Oh, yeah exactly :sweat_smile:. I'm on mobile, I would have checked otherwise, oops .

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 03:17):

So if that is the case, it just means whatever syntax we pick, we need to make sure that it works with tags with data.

CompletionItemKind := [
    Text = 1,
    Method = 2,
    ...etc
]

CompletionItemKindWithData := [
    Text Str = 1,
    Method U64 I32 = 2,
    ...etc
]

view this post on Zulip Eli Dowling (Apr 10 2024 at 03:19):

Ahh, I see, good point. Well that makes my bottom suggestion much more appealing

view this post on Zulip Eli Dowling (Apr 10 2024 at 03:20):

I was actually initially thinking we might just want to not allow these enum style tag annotations on tags with data at all. But it does seem like it'd work fine

view this post on Zulip Luke Boswell (Apr 10 2024 at 03:31):

Could something like comptime help us here? Or maybe some kind of code-gen?

view this post on Zulip Eli Dowling (Apr 10 2024 at 03:39):

Well that's basically what we're talking about. Autoderived decoding for records and tuples currently uses what are basically macros/comptime but using hand written roc expressions in the compiler. Think of it as a macro system with the worst syntax imaginable :sweat_smile:. That's what my null decoding PR is updating

view this post on Zulip Luke Boswell (Apr 10 2024 at 03:40):

I was thinking in roc userland

view this post on Zulip Eli Dowling (Apr 10 2024 at 03:44):

Well I agree, even just in compiler land it would be handy. It's a big addition but I think it would enable some really cool stuff. Like testing out syntax changes very easily. And doing super fun stuff like automatic generation of types for JSON. Which would make scripting in roc super fun. Basically you provide a slice of sample json and the types get generated and added to your program then you can use it to do transformations on the Json data. F# has this and it's amazingly useful for scripts

view this post on Zulip Eli Dowling (Apr 13 2024 at 07:19):

I had a cool idea in this area for automated encode and decode for tag unions.
Obviously the most general solution to all of this is macros/comptime, but I believe we can make some good progress with this
I often want my tag unions to just be the names but camel cased, or snake cased or pascal cased.
Sometimes I want tag unions to be unions but not tagged to interop with JS, they should just try decoding each tag until one matches and then also encode with no tag info.
We could use custom formatters for that that wrap an existing formatter like this:

interface UnionTags
    exposes [
        unionTags
    ]

    imports [
    ]

UnionTags fmt := {otherFormatter:fmt } where fmt implements EncoderFormatting
    implements [
        EncoderFormatting {
            u8: encodeU8,
            u16: encodeU16,
            u32: encodeU32,
            u64: encodeU64,
            u128: encodeU128,
            i8: encodeI8,
            i16: encodeI16,
            i32: encodeI32,
            i64: encodeI64,
            i128: encodeI128,
            f32: encodeF32,
            f64: encodeF64,
            dec: encodeDec,
            bool: encodeBool,
            string: encodeString,
            list: encodeList,
            record: encodeRecord,
            tuple: encodeTuple,
            tag: encodeTag,
        },
    ]

unionTags = \fmt -> @UnionTags { otherFormatter: fmt }

encodeTag : Str, List (Encoder _) -> Encoder _
encodeTag = \name, encoders ->
    Encode.custom \bytes, @UnionTags { otherFormatter } ->
        when encoders is
            [only] ->
                bytes |> Encode.appendWith only otherFormatter
            # Roc uses `crash` (not `panic`) to abort with a message
            _ -> crash "cannot encode multi arg tags as unions"

forward = \n ->
    Encode.custom \bytes, @UnionTags { otherFormatter } ->
        bytes |> Encode.append n otherFormatter

# all the other functions just forward

Basically we ignore the tag part of the tag union in this encoder.
Currently this crashes the compiler, probably because it doesn't like the EncoderFormatting type to be generic. But with module params it won't have to be and will probably work quite well :)

You could just implement this encoder/decoder in your opaque type and then


Last updated: Jun 16 2026 at 16:19 UTC