I'm evaluating using Roc in a specific system, where one of the needs is to encode/decode structures as unions discriminated on a particular key that I may or may not want reflected in the type being encoded or decoded. As a simple example, suppose our system already defines a "Status" message with the rough JSON schema of
status : { "kind": "success", "message": string } | { "kind": "failure", "code": number, "message": string }
perhaps we'd like to encode this message as the Roc type
Status := [Success { message: Str } , Failure { code : U64, message : Str }] has [Encoding, Decoding]
Unfortunately, the standard derived encoding and decoding for this opaque type, with the current JSON formatting implementation, would expect JSON messages that look like
{ "Success": [{ "message": string }] } | { "Failure": [{ "message": string, "code": number }] }
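To make the mismatch concrete, here is a minimal Python sketch of the two shapes. The `(tag_name, payloads)` tuple representation and both function names are assumptions made for illustration; this is not how Roc's derived encoders are implemented.

```python
import json

# Hypothetical in-memory representation of a tag: (tag_name, [payloads]).
success = ("Success", [{"message": "ok"}])

def encode_wrapped(tag):
    """Derived-style shape: the payloads become an array keyed by the tag name."""
    name, payloads = tag
    return {name: payloads}

def encode_inline(tag, key="kind"):
    """Desired shape: the discriminant is stored as a field next to the payload's fields."""
    name, payloads = tag
    (payload,) = payloads  # assumes exactly one record payload
    return {key: name.lower(), **payload}

print(json.dumps(encode_wrapped(success)))  # {"Success": [{"message": "ok"}]}
print(json.dumps(encode_inline(success)))   # {"kind": "success", "message": "ok"}
```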
One way to deal with this is to define a JSON formatting implementation that has some kind of configuration option to specify how certain discriminants should be encoded. For example, maybe the JSON implementation defines
interface Json exposes [JsonFormatting, TagEncoding, format]
JsonFormatting := { encodingsForTags: Dict Str TagEncoding, <other fields...> }
TagEncoding : [
    Default, # wrap the tag payloads as an array keyed by the tag name, like `{ "Success": [{ "message": string }] }`
    InlineAsSingleton { key: Str, value: Str }, # inline the discriminant in the payload, if it's unary, like `{ "type": "success", "message": string }`
]
format : Dict Str TagEncoding -> JsonFormatting
this is okay, but it has a few drawbacks. A few off the top of my head:
- If InlineAsSingleton is used for a tag that does not have a singleton payload, then the failure to inline the tag would have to be a runtime error; it can't be made into a compile-time error.
- Status on its own vs a larger structure that contains Status somewhere nested inside of it.
Another option is to write a custom JSON encoder/decoder for this type, but that loses the advantages of the Encoding/Decoding abilities and usage of JSON encoding/decoding packages in the ecosystem.
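To illustrate the first drawback above (a non-singleton payload can only be rejected at runtime), here is a rough Python sketch. `TagEncoding` mirrors the hypothetical config type from the message; `encode_with_config` and its parameters are made up for this example.

```python
from enum import Enum

class TagEncoding(Enum):
    DEFAULT = "default"
    INLINE_AS_SINGLETON = "inline_as_singleton"

def encode_with_config(name, payloads, encoding, key="kind", value=None):
    """Encode a tag according to a per-tag configuration (hypothetical API)."""
    if encoding is TagEncoding.DEFAULT:
        # Wrap the payloads as an array keyed by the tag name.
        return {name: list(payloads)}
    # Inlining only makes sense when there is exactly one record payload.
    # Nothing about the configuration lets a compiler check this ahead of time,
    # so the mismatch can only surface here, while encoding.
    if len(payloads) != 1 or not isinstance(payloads[0], dict):
        raise ValueError(f"cannot inline tag {name!r}: payload is not a single record")
    return {key: value if value is not None else name, **payloads[0]}

ok = encode_with_config(
    "Success", [{"message": "ok"}], TagEncoding.INLINE_AS_SINGLETON,
    key="kind", value="success",
)
```

Calling `encode_with_config("Pair", [1, 2], TagEncoding.INLINE_AS_SINGLETON)` raises `ValueError`, which is the runtime-only failure mode described in the first bullet.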
Another option is to define the Roc type as something like
Status := { type : Str, code : Result U64 {}, message : Str }
but this loses the desired type safety.
I wonder if there is something we can do at the language level to better support this kind of pattern, as I believe it occurs quite often.
For context, here's how I might define this in typescript, using literal types
type Status = { type : 'success', message: string } | { type : 'failure', code: number, message: string }
in Python:
import typing as t

class Success(t.TypedDict):
    type: t.Literal["success"]
    message: str

class Failure(t.TypedDict):
    type: t.Literal["failure"]
    code: int
    message: str

Status = t.Union[Success, Failure]
in Rust with serde:
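use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
#[serde(tag = "type")] // internally tagged: the discriminant is stored in a "type" field
enum Status {
    #[serde(rename = "success")]
    Success { message: String },
    #[serde(rename = "failure")]
    Failure { message: String, code: u64 },
}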
One option is to extend the Encoding/Decoding ability API to take a parameter like the TagEncoding described above in the JSON module, for example
# [tag tagName payloads tagEncoding] encodes a tag variant and its payload types
tag : Str, List (Encoder fmt), TagEncoding -> Encoder fmt | fmt has EncoderFormatting
and then have some kind of new annotation syntax you can use for the purposes of deriving Encoding/Decoding for a type, a-la serde's use of Rust annotations or Go's tags on struct fields. For example
Status := [Success { message: Str } , Failure { code : U64, message : Str }] has [Encoding with {
    tagEncoding: InlineDiscriminant,
    renameTags : when tag is
        Success -> "success",
        Failure -> "failure",
}]
which are only applicable for builtin types. I haven't really thought this through, but on the surface it doesn't strike me as a good idea.
what about this?
TagNameTransformation : [
    ToSnakeCase,
    ToKebabCase,
    ToCamelCase,
    None,
]
TagNameStrategy : [
    # e.g. [Success { message : Str }] ==> { "Success": [{ "message": string }] }
    BecomeFieldName TagNameTransformation,
    # e.g. [Success { message : Str }] ==> { "kind": "Success", "message": string }
    StoredInField Str TagNameTransformation,
]
format : TagNameStrategy -> JsonFormatting
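As a rough illustration of how a formatter might apply one such payload-wide strategy, here is a Python sketch. The tuple encoding of `TagNameStrategy`, the string names for the transformations, and the helpers `transform` and `encode_with_strategy` are all assumptions for this example, not part of any Roc package.

```python
import re

def transform(tag_name, how):
    """Apply a TagNameTransformation ("snake", "kebab", "camel" or "none")."""
    if how == "none":
        return tag_name
    # Split a PascalCase tag name such as "NotFound" into ["not", "found"].
    words = [w.lower() for w in re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", tag_name)]
    if how == "snake":
        return "_".join(words)
    if how == "kebab":
        return "-".join(words)
    if how == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    raise ValueError(f"unknown transformation: {how!r}")

def encode_with_strategy(tag_name, payload, strategy):
    """Encode a tag with a single record payload under one payload-wide strategy."""
    kind, *args = strategy
    if kind == "BecomeFieldName":
        (how,) = args
        return {transform(tag_name, how): [payload]}
    if kind == "StoredInField":
        field, how = args
        return {field: transform(tag_name, how), **payload}
    raise ValueError(f"unknown strategy: {kind!r}")

wrapped = encode_with_strategy("Success", {"message": "ok"}, ("BecomeFieldName", "none"))
inlined = encode_with_strategy("Success", {"message": "ok"}, ("StoredInField", "kind", "snake"))
# wrapped == {"Success": [{"message": "ok"}]}
# inlined == {"kind": "success", "message": "ok"}
```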
in my experience at least, it's most common to encounter a given JSON payload that has a consistent policy like this
e.g. all field names are kebab-case or snake_case or camelCase, discriminants are handled the same way throughout the payload, etc.
yeah, that works as long as you use different formattings when the behavior changes
right, but in my experience a given payload is going to be consistent within itself and not mix & match
and you already need to specify a formatting per payload that you want to encode or decode
In the specific case I'm considering, there are instances where there are unions discriminated by a key, and cases where they are not (the unions are just disjoint without a common key name)
within the same payload?
They are not the same type, but they compose like
MyType1 : ... # discriminate on "type"
MyType2 : ... # no discriminant
..
MyCompoundType : ... # deeper references to MyType1 and MyType2
So the payloads are disjoint, but they end up being used as part of a larger structure - if that answers your question
hm, do you have JSON of MyCompoundType? or just separate JSON payloads for MyType1 and MyType2, and then the outputs of those get put into MyCompoundType on the TypeScript side?
Yeah, I would like to encode/decode MyCompoundType as a whole
what if the discriminant was there but got ignored on the TypeScript side?
that is, Roc encodes the discriminant but TS ignores it
Roc would not be able to decode it if the discriminant was missing in that case, though - if you used the derived ability and the single tag encoding configuration
hm, but how would Roc decode it without a discriminant anyway? :thinking:
One option is backtracking - for example if you have [Foo {a : Str}, Bar {b : Str}], if you don't see an "a" key in the message, fallback on decoding the "Bar" variant by looking for the "b" key
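A minimal Python sketch of that backtracking idea, assuming each variant is described only by the keys its record payload requires (`decode_untagged` and `VARIANTS` are names made up for this example):

```python
def decode_untagged(message, variants):
    """Try each variant in order and return the first whose required keys are all present."""
    for tag_name, required_keys in variants:
        if all(key in message for key in required_keys):
            return (tag_name, {key: message[key] for key in required_keys})
    raise ValueError("message does not match any variant")

# [Foo {a : Str}, Bar {b : Str}] described by required keys only.
VARIANTS = [("Foo", ["a"]), ("Bar", ["b"])]

decoded = decode_untagged({"b": "hello"}, VARIANTS)
# decoded == ("Bar", {"b": "hello"}): there is no "a" key, so we fall back to Bar.
```

Note that the order of the variants matters whenever two of them could match the same message, which is one reason this approach is harder to reason about than an explicit discriminant.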
Another practical option is to have your platform encode/decode the type appropriately if it's a case like this. Since in practice you'll probably be getting these messages over a network or other effectful operation.
well you can always handwrite a decoder at that point
If you handwrite a decoder, you would either need the Decoding API to be extended to take a tag configuration option, or you wouldn't be able to parameterize over Decoding formatters arbitrarily, right? Like, you would have to write a decoder specifically for the JSON case, unless Decode.tag took a tag configuration option
right, but I think having the platform encode/decode the type has the same drawback, yeah?
yeah
is there some way the Decoding API could be changed to make this possible without either of the following being true?
yeah I'm not sure. Naively I would say do what serde does and have some way to specify whether it's tagged/untagged in Decode.tag/Encode.tag, but I'm not sure that generalizes to arbitrary formattings. Maybe formattings can ignore that if it's not applicable.
serde also relies on nominal types
one of the things I think about with encoding/decoding designs is "if you want to use this, do you have to convert all your structural types into nominal types just to get the encoding/decoding behavior you want?"
ideally not, of course :big_smile:
well, changing the decoding API doesn't need to relate to nominal/structural typing at all. For example, similar to the API I first described, maybe you can define Encode.tag as
Encode.tag : Str, List (Encoder fmt), TagEncodingStrategy, TagNaming -> Encoder fmt | fmt has EncoderFormatting
this API works equally well for structural and nominal types, except that derived implementations must choose a default encoding and naming strategy (but I don't see how they could do otherwise without language changes)
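A toy Python sketch of the idea that the tag-encoding hint is passed to every formatter but only honored where it applies. Both formatter classes, the `"inline"`/`"wrapped"` strategy strings, and the one-byte discriminant layout are assumptions for illustration, not a real Roc or JSON package API.

```python
import json

class JsonFormat:
    """A formatter that honors the tag-encoding hint."""
    def tag(self, name, payload, strategy):
        if strategy == "inline":
            return json.dumps({"kind": name, **payload})
        return json.dumps({name: [payload]})

class BinaryFormat:
    """A formatter for which the hint is not applicable, so it ignores it."""
    TAG_IDS = {"Success": 0, "Failure": 1}

    def tag(self, name, payload, strategy):
        # `strategy` is deliberately unused: this toy format always writes a
        # one-byte integer discriminant followed by the payload bytes.
        return bytes([self.TAG_IDS[name]]) + json.dumps(payload).encode("utf-8")

payload = {"message": "ok"}
as_json = JsonFormat().tag("Success", payload, "inline")
as_bytes = BinaryFormat().tag("Success", payload, "inline")
# as_json == '{"kind": "Success", "message": "ok"}'
# as_bytes starts with the byte 0 regardless of the strategy passed in.
```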
Could this be solved with a code generation tool? You give it a type, it generates Roc decoders/encoders for you. Then if you need to tweak it, it's easy since it's just user code at that point.
that's an interesting direction...I've been thinking about how roc glue could be expanded to work on arbitrary interface modules instead of just platform modules
and then you could give it a glue spec to generate client encoders and decoders (e.g. in Elm or TypeScript)
and those could be specific to your application and its types
downside being that you'd have to incorporate it into your build process
Ayaz Hafiz said:
well, changing the decoding API doesn't need to relate to nominal/structural typing at all.
what I meant is that the serde approach relies on nominal types
it's like "here's my type, and also here's how I want to encode its fields"
that only works for nominal types, since with structural types you'd be saying "here's the shape of my type, and any time in the entire program that any type happens to have this shape, encode it as follows" which (even if abilities worked that way) wouldn't be a great design :sweat_smile:
but then again, to be fair - because of how opaque types work in Roc, it's pretty easy to define an opaque wrapper just for serialization/deserialization and then say "ok now unwrap it and that's the type I'll actually use"
so maybe that's not actually a real concern
I guess it's annoying if you have a nested data structure which stores a structural type, and you want to serialize the whole data structure while getting some custom behavior for one of its nested structural types
but you want to specify that custom encoding/decoding behavior in terms of the nested type, not the nested data structure that contains it
I feel like that would only really come up during prototyping though. If these are messages over a network you’re likely to explicitly type their structure, at which point yeah there might be little overhead to use an opaque type there
in any case, this distinction for serde’s case only matters with regard to annotations that tell it how to encode/decode the type, right? which would have to be a new language addition to Roc if it were to be reflected, which I think we would prefer not to do?
I'd definitely prefer not to do it
I'm open to the idea in some form, but it feels like a Pandora's Box to open
a thing I also don't love about it is that it feels like something to add to the language almost exclusively for JSON specifically, which is a category of language design decision that tends not to age well (e.g. Scala having baked-in syntax for XML, which was about as popular a serialization format at the time as JSON is now)
like binary formats don't care about this, and I assume XML would have explicit discriminants...feels like a JSON-specific thing
yeah I have the same feeling.
binary formats might care about it but yeah they’re usually more standardized
I assume a binary format would use an integer discriminant
Richard Feldman said:
downside being that you'd have to incorporate it into your build process
What I had in mind was an editor tool. You'd create a type representing the external data you want to decode, and then you press some hotkey to generate a JSON decoder for it. After that it's just user code so you don't regenerate it unless you decide later you want to throw away your old decoder and start again. In other words, no complicated build process.
Last updated: Jun 16 2026 at 16:19 UTC