I'd like to get some opinions on how to handle json's null type.
I've got an Option type that decodes to @Option None when given a 0 byte decode input. That lets me handle missing fields, but what about fields that are null, e.g. {"name":null,"age":20} vs {"age":20}?
Some ideas I had:
A) Add a special check in the JSON record decoder that sends the decoder [] instead of null
B) Add a check in the Option decoder that decodes null to None, tying it to JSON
C) Make some new opaque type that specifically decodes null and []
D) Some other wizardry
I like A the most because it's simple, but it does obscure the idea of "null" vs "undefined". This would put us in line with most other decoding libraries, and we could make it optional in case the user wants to implement their own extra opaque type for null vs undefined.
I like the sound of A too. I originally had it decoding null into a valid Str "null" which wasn't a good call. @Richard Feldman had thoughts on that as he is using it at Vendr for some things and found that bug the hard way. So I definitely would be interested to know what he thinks too. I haven't used JSON much in anger, so for me it's more of a theoretical concept and staring at a spec to guess what we should do.
There is also a similar question around encoding. I have currently made it so that when a record field's encoder returns [] the field is omitted, and the same goes for elements in a list and in a tuple. I've then added an option to put in a null instead.
Although it's unconventional, I'm leaning towards having it put nulls in by default, or at least making that the default for lists and tuples. I want lists and tuples to put in nulls, because if you encode [Some 1, None, Some 2] to [1,2] and then decode it, you get [Some 1, Some 2], which feels like an annoying "gotcha" moment. I'd rather get [1,null,2] so it correctly decodes to [Some 1, None, Some 2].
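The round trip described above can be sketched outside of Roc. This is only an illustration in TypeScript: the `Option` type and the helper names (`encodeOmitting`, `encodeWithNulls`, `decodeOptions`) are made up for the example and are not part of any Roc or roc-json API.

```typescript
// Hypothetical stand-in for Roc's Option: a small tagged union.
type Option<T> = { tag: "Some"; value: T } | { tag: "None" };

const some = <T>(value: T): Option<T> => ({ tag: "Some", value });
const none = <T>(): Option<T> => ({ tag: "None" });

// Strategy 1: drop None elements entirely (the "omit" behaviour).
function encodeOmitting(items: Option<number>[]): string {
  const kept: number[] = [];
  for (const item of items) {
    if (item.tag === "Some") kept.push(item.value);
  }
  return JSON.stringify(kept);
}

// Strategy 2: write None as an explicit JSON null.
function encodeWithNulls(items: Option<number>[]): string {
  return JSON.stringify(
    items.map((item) => (item.tag === "Some" ? item.value : null)),
  );
}

// Shared decoder: null becomes None, any number becomes Some.
function decodeOptions(json: string): Option<number>[] {
  const raw = JSON.parse(json) as (number | null)[];
  return raw.map((x) => (x === null ? none<number>() : some(x)));
}

const original: Option<number>[] = [some(1), none<number>(), some(2)];
const omitted = encodeOmitting(original); // "[1,2]"
const withNulls = encodeWithNulls(original); // "[1,null,2]"
const afterOmitting = decodeOptions(omitted); // two elements, the None is lost
const afterNulls = decodeOptions(withNulls); // three elements, same as original
```

With the omitting encoder the list shrinks and element positions shift, so the decoded value no longer matches the original; with explicit nulls the round trip is lossless.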
I was talking to a friend about this recently and he mentioned that for some APIs there is an important difference between null and undefined in JSON. Sending a message that has a single field set to null and the rest undefined is a way to null out that one field.
With my current implementation there is no way to encode both concepts at once. I was able to make a type that does handle both, but it is very coupled to json because it only decodes a null when it gets exactly "null" and encodes exactly "null" on encode.
interface OptionOrNull
    exposes [none, some, null]
    imports [
        Encode,
        Core,
    ]

OptionOrNull val := [Some val, None, Null]
    implements [
        Eq,
        Decoding {
            decoder,
        },
        Encoding {
            toEncoder,
        },
    ]

none = \{} -> @OptionOrNull None
some = \a -> @OptionOrNull (Some a)
null = \{} -> @OptionOrNull Null

nullChars = "null" |> Str.toUtf8

toEncoder = \@OptionOrNull val ->
    Encode.custom \bytes, fmt ->
        when val is
            Some contents -> bytes |> Encode.append contents fmt
            None -> bytes
            Null -> bytes |> List.concat (nullChars)

decoder = Decode.custom \bytes, fmt ->
    when bytes is
        [] -> { result: Ok (@OptionOrNull None), rest: [] }
        ['n', 'u', 'l', 'l', .. as rest] -> { result: Ok (null {}), rest: rest }
        _ ->
            when bytes |> Decode.decodeWith (Decode.decoder) fmt is
                { result: Ok res, rest } -> { result: Ok (@OptionOrNull (Some res)), rest }
                { result: Err a, rest } -> { result: Err a, rest }
OptionTest2 : { maybe : OptionOrNull U8, y : U8, maybe2 : OptionOrNull U8 }

# Decode a record with an explicit null and a missing field
expect
    decoded : Result OptionTest2 _
    decoded =
        """
        {"y":1,"maybe":null}
        """
        |> Str.toUtf8
        |> Decode.fromBytes (Core.jsonWithOptions { emptyEncodeAsNull: Bool.false, nullAsUndefined: Bool.false })

    expected = Ok ({ y: 1u8, maybe: null {}, maybe2: none {} })
    isGood =
        when (decoded, expected) is
            (Ok a, Ok b) ->
                a == b

            _ -> Bool.false

    isGood == Bool.true

# Encode Option None record with null
expect
    encoded =
        dat : OptionTest2
        dat = { maybe2: none {}, maybe: null {}, y: 1 }
        Encode.toBytes dat (Core.jsonWithOptions { emptyEncodeAsNull: Bool.false })
        |> Str.fromUtf8

    expected = Ok
        """
        {"maybe":null,"y":1}
        """
    expected == encoded
I would love to hear what some other folks with a lot of experience with web stuff and json apis think of this.
Should we maybe have our encoders and decoders have an inbuilt concept of Null vs Undefined so that this implementation isn't tied to JSON? The builtin decoder could define opaque types for Null and Undefined and expect a DecodingFormat to implement them somehow.
@Richard Feldman @Luke Boswell Thoughts?
I think the null/undefined distinction is reasonably specific to JSON. Some other serialization formats have it too, but not all of them. For instance, the rvn format I recently worked on doesn't have null, and neither does TOML or XML. I think it'd be unintuitive for a TOML encoder/decoder to have to deal with null if neither TOML nor Roc support it. Based on that, null being a JSON-specific thing makes a certain amount of sense to me.
Makes sense to me to have something like ValueOrNullOrUndefined lying around in the json library though. I don't like it when an API makes the difference between a null field and a missing field significant, but a helper type like the one you have defined would make it super transparent in an implementation what shenanigans the API is up to.
Yeah I think I mostly agree with that, a type like that as part of the Json library would mostly solve that.
XML does kind of have a null/undefined distinction: <Maytag/> could be thought of as null.
This is one of those cases where I really want to say: don't support it. APIs that need to distinguish both are really poorly designed. Just merge them into one concept.....
But real-life use cases may make my opinion above impractical.
It looks bad from our world, but as a JS Dev I think it's natural to think of null and undefined as different things that should have distinct usage.
Thinking back I realise I've actually used an API like this. I'm pretty sure Salesforce has this. Unfortunately I think not supporting it would be a real sore spot for a lot of people.
Yeah, I think it is required, but I have seen a number of bugs from it.
I might be misunderstanding something, but JSON doesn't have undefined, just null (although JavaScript has both)
are we using "undefined" here to mean "missing field" or something similar?
I think a js undefined will get encoded as the string "undefined"
Right?
Oh, nvm
Undefined -> field missing
Null -> explicit null value
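For reference, this is how JavaScript's own `JSON.stringify` and `JSON.parse` treat the two cases. It is a small TypeScript check using only the standard `JSON` object; the variable names are just for illustration.

```typescript
// An undefined property is dropped from objects; it is never written as "undefined".
const encodedObject: string = JSON.stringify({ name: undefined, age: null });

// Inside an array, undefined is written as null so the positions are kept.
const encodedArray: string = JSON.stringify([1, undefined, 2]);

// After parsing, "missing" and "explicit null" are still distinguishable.
const parsed = JSON.parse('{"age":null}') as Record<string, unknown>;
const nameIsMissing: boolean = !("name" in parsed);
const ageIsNull: boolean = "age" in parsed && parsed["age"] === null;
```

Here `encodedObject` is `{"age":null}` and `encodedArray` is `[1,null,2]`, so on the wire "undefined" only ever shows up as a field that is not there.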
@Richard Feldman I feel like there is a lot of confusion here. I'll have a think and try to create some user-story-type explanations of why I think the null/undefined distinction is important for JSON (not for building into Roc or anything, just within JSON encoding and decoding).
fair enough, sounds good! :+1:
Brendan Hansknecht said:
This is one of those cases where I really want to say: don't support it. APIs that need to distinguish both are really poorly designed. Just merge them into one concept.....
I can provide a real life example, described initially here: https://github.com/thoth-org/Thoth.Json/issues/91
[...] imagine creating a JSON payload of MongoDB update. It's big difference between:
{ $set: { firstname: "…", lastname: null, somethingElse: null } }
and
{ $set: { firstname: "…", somethingElse: null } }
First one will change firstname and erase both lastname and somethingElse, the latter will not alter lastname at all.
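The difference between the two payloads can be shown with a toy update function. This is a hypothetical in-memory stand-in written in TypeScript, not the MongoDB driver or its real `$set` semantics: `applySet` and the sample document are invented for the example, and "erase" is modelled here as storing null.

```typescript
type Doc = Record<string, string | null>;

// Toy "$set": every key present in the update overwrites the stored value,
// including keys whose value is null. Keys absent from the update are untouched.
function applySet(doc: Doc, update: Doc): Doc {
  const result: Doc = { ...doc };
  for (const [key, value] of Object.entries(update)) {
    result[key] = value;
  }
  return result;
}

const stored: Doc = { firstname: "Ann", lastname: "Lee", somethingElse: "x" };

// Payload 1: lastname is present and null, so it gets nulled out.
const first = applySet(
  stored,
  JSON.parse('{"firstname":"Bo","lastname":null,"somethingElse":null}') as Doc,
);

// Payload 2: lastname is missing, so the stored value survives.
const second = applySet(
  stored,
  JSON.parse('{"firstname":"Bo","somethingElse":null}') as Doc,
);
```

After these calls `first["lastname"]` is null while `second["lastname"]` is still "Lee", which is exactly the distinction an encoder has to be able to express.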
@witoldsz thank you. I was going to write out something very similar for a Salesforce database, basically the same API.
The accompanying Roc code might be:
PersonUpdate : {
    age : ValOrNullOrUndefined U8,
    firstName : ValOrNullOrUndefined Str,
    lastName : ValOrNullOrUndefined Str,
}

updatePerson = \update -> encodeAndSendToMongo update
In essence, if I am decoding or encoding a record type "person" {age, firstName, lastName}, each field could have one of three states:
It could not exist (undefined),
It could be null,
It could have a value.
Whether or not we want to encourage roc users to distinguish between null and undefined, some pretty significant existing systems do, and so to interoperate with them fully we would need to be able to represent all possible states.
Therefore, whilst most roc code shouldn't need to distinguish between null and undefined we should think about supporting distinguishing between them when encoding and decoding.
I think this may just be a misalignment of semantics :sweat_smile:
I have a suggestion: what if instead of talking about "undefined" we talk about "missing" if those are the intended semantics?
I really cannot overemphasize how much of a mistake I think it would be for the word "undefined" to appear anywhere in any Roc JSON library :big_smile:
like if we mean "this is a value which indicates that the field will not be included" then we should choose a word that means that, which undefined only does in one programming language in history
or like "omitted" maybe
there is definitely a distinction between a field being omitted and the field being present and set to null in JSON, and plenty of APIs rely on that distinction, so I agree that it's important that we support a distinction there
I'm totally happy to use whatever term, 'missing' works fine for me :+1:. To be clear though, I used 'undefined' because JSON is JavaScriptObjectNotation and so it seemed obvious to use JavaScript's term for "a field that is missing".
I might be missing some context, but wouldn't default values be better than 'undefined' ? most of the time I use missing fields for this, and it's (probably) trivial to add your own "Undefined" type if needed
Undefined, or missing, field is very important - that means omitting of data. You shouldn't just slap any default value in the missing field's place unless maybe you explicitly define it in the Encode ability declaration of your type
Optionality and nullability should be composable, like
{
    age: Optional (Nullable U8),
    firstName: Optional Str,
    lastName: Nullable Str
}
as they are in TypeScript:
{
    age?: number | null,
    firstName?: string,
    lastName: string | null
}
Yeah, in Roc terms, you probably would want the type to be:
{
    age: Result U8 [Null, Undefined],
    firstName: Result Str [Undefined],
    lastName: Result Str [Null]
}
I think collapsing it into a single error tag is the cleanest way to represent this in Roc.
And as mentioned above, Undefined could be Missing or Omitted. Just a name.
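As a rough illustration of the three states, here is a TypeScript sketch of a field type with separate Null and Missing cases, plus helpers that read it from and write it to a plain JSON object. The names `Field`, `readField` and `writeField` are hypothetical and do not correspond to anything in Roc's Encode/Decode.

```typescript
// Three possible states for a field: a value, an explicit null, or not present.
type Field<T> =
  | { tag: "Value"; value: T }
  | { tag: "Null" }
  | { tag: "Missing" };

// Decoding side: inspect the parsed object without collapsing null and missing.
function readField(obj: Record<string, unknown>, key: string): Field<unknown> {
  if (!(key in obj)) return { tag: "Missing" };
  const raw = obj[key];
  if (raw === null) return { tag: "Null" };
  return { tag: "Value", value: raw };
}

// Encoding side: Missing leaves the key out, Null writes an explicit null.
function writeField(
  out: Record<string, unknown>,
  key: string,
  field: Field<unknown>,
): void {
  if (field.tag === "Value") {
    out[key] = field.value;
  } else if (field.tag === "Null") {
    out[key] = null;
  }
}

const source = JSON.parse('{"age":null,"lastName":"Lee"}') as Record<string, unknown>;
const age = readField(source, "age"); // Null
const firstName = readField(source, "firstName"); // Missing
const lastName = readField(source, "lastName"); // Value "Lee"

const rebuilt: Record<string, unknown> = {};
writeField(rebuilt, "age", age);
writeField(rebuilt, "firstName", firstName);
writeField(rebuilt, "lastName", lastName);
const reencoded = JSON.stringify(rebuilt);
```

Because both directions keep the states apart, re-encoding gives back the same JSON text that was decoded.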
Would that work for Encode and Decode?
Not presently but that's why this discussion exists. To get ideas as to how we'd like it to work and then see how close we can get/if we need new language features etc
You probably could hack it to work by making the JSON encoder/decoder special-case the tags Null and Undefined. More likely, some form of opaque type for smarter handling is needed.
@Karakatiza I have a fork of roc and the roc-json package where this is implemented as an opaque type.
So, like Optional a [Null, Omitted]?
@Eli Dowling just curious, with current roc, why couldn't it be implemented on an opaque type. What caused the need for the fork?
@Brendan Hansknecht There is no way to decode a record with a field that is missing. I needed to change the internal decoding derive code so that it tries to decode any missing fields to see if they are okay with decoding an empty array [].
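The idea behind that change can be sketched in TypeScript. This is not the Roc derive code: it is a minimal model that assumes a field decoder receives the field's raw JSON text, or an empty string when the field is absent, and all names (`FieldDecoder`, `requiredNumber`, `optionalNumber`, `decodeRecord`) are invented for the example.

```typescript
type DecodeResult<T> = { ok: true; value: T } | { ok: false; error: string };

// A field decoder gets the raw JSON text of the field, or "" if the field is absent.
type FieldDecoder<T> = (input: string) => DecodeResult<T>;

// Rejects empty input, so a missing field is an error for this decoder.
const requiredNumber: FieldDecoder<number> = (input) => {
  if (input === "") return { ok: false, error: "missing field" };
  const n = Number(input);
  if (Number.isNaN(n)) return { ok: false, error: "not a number" };
  return { ok: true, value: n };
};

// Accepts empty input, so a missing field decodes to undefined.
const optionalNumber: FieldDecoder<number | undefined> = (input) => {
  if (input === "") return { ok: true, value: undefined };
  return requiredNumber(input);
};

function decodeRecord(
  json: string,
  decoders: Record<string, FieldDecoder<unknown>>,
): DecodeResult<Record<string, unknown>> {
  const raw = JSON.parse(json) as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const [field, decode] of Object.entries(decoders)) {
    // Absent fields are offered to their decoder as empty input
    // instead of failing the whole record straight away.
    const input = field in raw ? JSON.stringify(raw[field]) : "";
    const result = decode(input);
    if (!result.ok) return { ok: false, error: `${field}: ${result.error}` };
    out[field] = result.value;
  }
  return { ok: true, value: out };
}

const decoders = { y: requiredNumber, maybe: optionalNumber };
const good = decodeRecord('{"y":1}', decoders); // ok, maybe is undefined
const bad = decodeRecord('{"maybe":2}', decoders); // fails on the required field y
```

Only the field's own decoder decides whether absence is acceptable, which is what lets an option-like type opt in without the record decoder knowing about it.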
Last updated: Jul 05 2025 at 12:14 UTC