Stream: contributing

Topic: How to handle json null decoding


view this post on Zulip Eli Dowling (Mar 20 2024 at 11:39):

I'd like to get some opinions on how to handle json's null type.

I've got an Option type that decodes to @Option None when given a 0 byte decode input. That lets me handle missing fields, but what about fields that are null, e.g. {"name":null,"age":20} vs {"age":20}?
Some ideas I had:
A) Add a special check in the json record decoder that sends the decoder [] instead of null
B) Add a check in the Option decoder that decodes null to None tying it to json
C) Make some new Opaque type that specifically decodes null and []
D) some other wizardry

I like A the most because it's simple, but it does obscure the idea of "null" vs "undefined". This would put us in line with most other decoding libraries, and we could make it optional in case the user wants to implement their own extra opaque type for null vs undefined.
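To make the trade-off in option A concrete, here is a minimal Python sketch (not Roc, and not the roc-json implementation). It shows that the two documents from the question really do differ after parsing, and that a decoder in the spirit of option A would collapse both cases into a single "no value". The helper name `decode_name` is hypothetical.

```python
import json

with_null = json.loads('{"name": null, "age": 20}')
without_name = json.loads('{"age": 20}')

# The parsed documents differ: one has the key (set to None), the other lacks it.
assert "name" in with_null and with_null["name"] is None
assert "name" not in without_name


def decode_name(obj):
    """Option A in spirit: a null field and a missing field both become None."""
    return obj.get("name")


# After decoding, the caller can no longer tell the two inputs apart.
assert decode_name(with_null) is None
assert decode_name(without_name) is None
```

The loss of information in the last two lines is the "null" vs "undefined" blurring mentioned above; it is harmless for most consumers and a problem only for APIs that treat the two differently.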

view this post on Zulip Luke Boswell (Mar 20 2024 at 22:43):

I like the sound of A too. I originally had it decoding null into a valid Str "null" which wasn't a good call. @Richard Feldman had thoughts on that as he is using it at Vendr for some things and found that bug the hard way. So I definitely would be interested to know what he thinks too. I haven't used JSON much in anger, so for me it's more of a theoretical concept and staring at a spec to guess what we should do.

view this post on Zulip Eli Dowling (Mar 20 2024 at 23:56):

There is also a similar question around encoding. I have currently made it so that when a record field's encoder returns [], the field is omitted; the same goes for elements in a list and in a tuple.
I've then added an option to put in a null instead.

Although it's unconventional, I'm leaning towards having it put nulls in by default, or at least making that the default for lists and tuples. I want tuples and options to put in nulls because if you encode [Some 1, None, Some 2] to [1,2] and then decode it, you get [Some 1, Some 2], which feels like an annoying "gotcha" moment. I'd rather get [1,null,2] so it correctly decodes to [Some 1, None, Some 2].
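The round-trip problem with lists can be sketched in a few lines of Python, with None standing in for Roc's None tag. The function names `encode_omitting` and `encode_with_nulls` are hypothetical and only illustrate the two encoding strategies being compared.

```python
import json

values = [1, None, 2]  # stands in for [Some 1, None, Some 2]


def encode_omitting(items):
    """Drop empty elements entirely, as when an empty encoding means 'omit'."""
    kept = [item for item in items if item is not None]
    return json.dumps(kept, separators=(",", ":"))


def encode_with_nulls(items):
    """Write an explicit null for every empty element."""
    return json.dumps(items, separators=(",", ":"))


lossy = encode_omitting(values)       # '[1,2]'
faithful = encode_with_nulls(values)  # '[1,null,2]'

# Only the null-preserving encoding survives a round trip.
assert json.loads(lossy) != values
assert json.loads(faithful) == values
```

Omitting is safe for record fields because the field name still identifies each value; in a list or tuple the position is the only identifier, so dropping an element shifts everything after it.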

view this post on Zulip Eli Dowling (Apr 02 2024 at 04:00):

I was talking to a friend about this recently and he mentioned that for some APIs there is an important difference between null and undefined in JSON. Sending a message that has a single field set to null and the rest undefined is a way to null out that one field.
With my current implementation there is no way to encode both concepts at once. I was able to make a type that does handle both, but it is very coupled to JSON because it only decodes a null when it gets exactly "null", and it writes exactly "null" on encode.

interface OptionOrNull
    exposes [none, some, null]
    imports [
        Encode,
        Core,
    ]

OptionOrNull val := [Some val, None, Null]
    implements [
        Eq,
        Decoding {
            decoder,
        },
        Encoding {
            toEncoder,
        },
    ]
none = \{} -> @OptionOrNull None
some = \a -> @OptionOrNull (Some a)
null = \{} -> @OptionOrNull Null

nullChars = "null" |> Str.toUtf8

toEncoder = \@OptionOrNull val ->
    Encode.custom \bytes, fmt ->
        when val is
            Some contents -> bytes |> Encode.append contents fmt
            None -> bytes
            Null -> bytes |> List.concat (nullChars)

decoder = Decode.custom \bytes, fmt ->
    when bytes is
        [] -> { result: Ok (@OptionOrNull None), rest: [] }
        ['n', 'u', 'l', 'l',.. as rest] -> { result: Ok (null {}), rest: rest }
        _ ->
            when bytes |> Decode.decodeWith (Decode.decoder) fmt is
                { result: Ok res, rest } -> { result: Ok (@OptionOrNull (Some res)), rest }
                { result: Err a, rest } -> { result: Err a, rest }

OptionTest2 : { maybe : OptionOrNull U8, y : U8, maybe2 : OptionOrNull U8 }

expect
    decoded : Result OptionTest2 _
    decoded =
        """
        {"y":1,"maybe":null}
        """
        |> Str.toUtf8
        |> Decode.fromBytes (Core.jsonWithOptions { emptyEncodeAsNull: Bool.false, nullAsUndefined: Bool.false })

    expected = Ok ({ y: 1u8, maybe: null {}, maybe2: none {} })
    isGood =
        when (decoded, expected) is
            (Ok a, Ok b) ->
                a == b

            _ -> Bool.false
    isGood == Bool.true

# Encode Option None record with null
expect
    encoded =
        dat : OptionTest2
        dat = { maybe2: none {}, maybe: null {}, y: 1 }
        Encode.toBytes dat (Core.jsonWithOptions { emptyEncodeAsNull: Bool.false })
        |> Str.fromUtf8

    expected = Ok
        """
        {"maybe":null,"y":1}
        """
    expected == encoded

I would love to hear what some other folks with a lot of experience with web stuff and json apis think of this.

Should we maybe have our encoders and decoders have an inbuilt concept of Null vs undefined so that this implementation isn't tied to json? The builtin decoder could define opaque types for Null and Undefined and expect a DecodingFormat to implement them somehow.

@Richard Feldman @Luke Boswell Thoughts?

view this post on Zulip Jasper Woudenberg (Apr 02 2024 at 05:55):

I think the null/undefined distinction is reasonably specific to JSON. Some other serialization formats have it too, but not all of them. For instance, the rvn format I recently worked on doesn't have null, and neither does TOML or XML. I think it'd be unintuitive for a TOML encoder/decoder to have to deal with null if neither TOML nor Roc support it. Based on that, null being a JSON-specific thing makes a certain amount of sense to me.

view this post on Zulip Jasper Woudenberg (Apr 02 2024 at 06:00):

Makes sense to me to have something like ValueOrNullOrUndefined lying around in the json library though. I don't like it when an API makes the difference between a null field and a missing field significant, but a helper type like the one you have defined would make it super transparent in an implementation what shenanigans the API is up to.

view this post on Zulip Eli Dowling (Apr 02 2024 at 06:03):

Yeah, I think I mostly agree with that; a type like that as part of the Json library would mostly solve it.

XML does kind of have a null/undefined distinction: <Maytag/> could be thought of as null.

view this post on Zulip Brendan Hansknecht (Apr 02 2024 at 14:44):

This is one of those cases where I really want to say, "Don't support it." APIs that need to distinguish both are really poorly designed. Just merge them into one concept.....

But real-life use cases may make my opinion above impractical.

view this post on Zulip Eli Dowling (Apr 02 2024 at 14:52):

It looks bad from our world, but as a JS Dev I think it's natural to think of null and undefined as different things that should have distinct usage.
Thinking back I realise I've actually used an API like this. I'm pretty sure Salesforce has this. Unfortunately I think not supporting it would be a real sore spot for a lot of people.

view this post on Zulip Brendan Hansknecht (Apr 02 2024 at 15:22):

Yeah, I think it is required, but I have seen a number of bugs from it.

view this post on Zulip Richard Feldman (Apr 02 2024 at 15:37):

I might be misunderstanding something, but JSON doesn't have undefined, just null (although JavaScript has both)

view this post on Zulip Richard Feldman (Apr 02 2024 at 15:37):

are we using "undefined" here to mean "missing field" or something similar?

view this post on Zulip Brendan Hansknecht (Apr 02 2024 at 17:01):

I think a js undefined will get encoded as the string "undefined"

view this post on Zulip Brendan Hansknecht (Apr 02 2024 at 17:01):

Right?

view this post on Zulip Brendan Hansknecht (Apr 02 2024 at 17:02):

Oh, nvm

view this post on Zulip Brendan Hansknecht (Apr 02 2024 at 17:02):

Undefined -> field missing
Null -> explicit null value

view this post on Zulip Eli Dowling (Apr 03 2024 at 11:36):

@Richard Feldman I feel like there is a lot of confusion here. I'll have a think and try to create some user-story-type explanations of why I think the null/undefined distinction is important for JSON (not for building into Roc or anything, just within JSON encoding and decoding).

view this post on Zulip Richard Feldman (Apr 03 2024 at 11:37):

fair enough, sounds good! :+1:

view this post on Zulip witoldsz (Apr 03 2024 at 11:46):

Brendan Hansknecht said:

This is one of those cases where I really want to say, "Don't support it." APIs that need to distinguish both are really poorly designed. Just merge them into one concept.....

I can provide a real life example, described initially here: https://github.com/thoth-org/Thoth.Json/issues/91

[...] imagine creating a JSON payload of MongoDB update. It's big difference between:
{ $set: { firstname: "…", lastname: null, somethingElse: null } }
and
{ $set: { firstname: "…", somethingElse: null } }
First one will change firstname and erase both lastname and somethingElse, the latter will not alter lastname at all.
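The two payloads can be produced from the same code path if each field is allowed three states. The Python sketch below assumes a hypothetical `MISSING` sentinel and a hypothetical `encode_update` helper; "Ada" is a placeholder value, since the original example elides the name.

```python
import json


class _Missing:
    """Sentinel type for 'leave this field out of the payload'."""

    def __repr__(self):
        return "MISSING"


MISSING = _Missing()


def encode_update(fields):
    """Build a $set payload: MISSING fields are omitted, None becomes null."""
    present = {key: value for key, value in fields.items() if value is not MISSING}
    return json.dumps({"$set": present}, separators=(",", ":"))


# Erases lastname and somethingElse.
erase_both = encode_update({"firstname": "Ada", "lastname": None, "somethingElse": None})

# Leaves lastname untouched, erases only somethingElse.
keep_lastname = encode_update({"firstname": "Ada", "lastname": MISSING, "somethingElse": None})

assert '"lastname":null' in erase_both
assert "lastname" not in keep_lastname
```

A two-state optional cannot express both calls, which is why an encoder that maps "no value" to only one of null or omission falls short for this kind of API.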

view this post on Zulip Eli Dowling (Apr 03 2024 at 11:53):

@witoldsz thank you. I was going to write out something very similar for a Salesforce database; it's basically the same API.
The accompanying Roc code might be:

PersonUpdate : {
    age : ValOrNullOrUndefined U8,
    firstName : ValOrNullOrUndefined Str,
    lastName : ValOrNullOrUndefined Str,
}
updatePerson = \update -> encodeAndSendToMongo update

In essence, if I am decoding or encoding a record type "person" {age, firstName, lastName}, each field could have one of three states:
It could not exist (undefined),
It could be null,
It could have a value.

Whether or not we want to encourage Roc users to distinguish between null and undefined, some pretty significant existing systems do, and so to interoperate with them fully we would need to be able to represent all possible states.
Therefore, whilst most Roc code shouldn't need to distinguish between null and undefined, we should think about supporting the distinction when encoding and decoding.

view this post on Zulip Eli Dowling (Apr 03 2024 at 13:07):

I think this may just be a misalignment of semantics :sweat_smile:

view this post on Zulip Richard Feldman (Apr 03 2024 at 14:48):

I have a suggestion: what if instead of talking about "undefined" we talk about "missing" if those are the intended semantics?

view this post on Zulip Richard Feldman (Apr 03 2024 at 14:49):

I really cannot overemphasize how much of a mistake I think it would be for the word "undefined" to appear anywhere in any Roc JSON library :big_smile:

view this post on Zulip Richard Feldman (Apr 03 2024 at 14:49):

like if we mean "this is a value which indicates that the field will not be included" then we should choose a word that means that, which undefined only does in one programming language in history

view this post on Zulip Richard Feldman (Apr 03 2024 at 14:50):

or like "omitted" maybe

view this post on Zulip Richard Feldman (Apr 03 2024 at 14:50):

there is definitely a distinction between a field being omitted and the field being present and set to null in JSON, and plenty of APIs rely on that distinction, so I agree that it's important that we support a distinction there

view this post on Zulip Eli Dowling (Apr 03 2024 at 20:48):

I'm totally happy to use whatever term; 'missing' works fine for me :+1:. To be clear though, I used 'undefined' because JSON is JavaScript Object Notation, and so it seemed obvious to use JavaScript's term for "a field that is missing".

view this post on Zulip Pei Yang Ching (Apr 07 2024 at 22:37):

I might be missing some context, but wouldn't default values be better than 'undefined'? Most of the time I use missing fields for this, and it's (probably) trivial to add your own "Undefined" type if needed.

view this post on Zulip Karakatiza (Apr 08 2024 at 01:16):

An undefined, or missing, field is very important: it means the data was omitted. You shouldn't just slap a default value in the missing field's place, unless maybe you explicitly define it in the Encode ability declaration of your type.

view this post on Zulip Karakatiza (Apr 08 2024 at 01:19):

Optionality and nullability should be composable, like

{
    age: Optional (Nullable U8),
    firstName: Optional Str,
    lastName: Nullable Str
}

as they are in Javascript:

{
    age?: number | null,
    firstName?: string,
    lastName: string | null
}

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:23):

Yeah, in roc terms, you probably would want the type to be:

{
    age: Result U8 [Null, Undefined],
    firstName: Result Str [Undefined],
    lastName: Result Str [Null]
}

I think collapsing it into a single error tag is the cleanest way to represent this in Roc.

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:24):

And as mentioned above Undefined could be Missing or Omitted. Just a name.
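As a rough illustration of the three-state result, the Python sketch below classifies a single field of a parsed JSON object. Tuples stand in for Roc's Result tags, the tag name "Missing" follows the naming suggested earlier in the thread, and `decode_field` is a hypothetical helper rather than anything from roc-json.

```python
import json


def decode_field(obj, key):
    """Classify one field as a value, an explicit null, or a missing key."""
    if key not in obj:
        return ("Err", "Missing")
    if obj[key] is None:
        return ("Err", "Null")
    return ("Ok", obj[key])


person = json.loads('{"age": null, "lastName": "Lovelace"}')

assert decode_field(person, "age") == ("Err", "Null")
assert decode_field(person, "firstName") == ("Err", "Missing")
assert decode_field(person, "lastName") == ("Ok", "Lovelace")
```

A field typed as Result Str [Null] would simply treat the "Missing" case as a decoding failure, and a field typed as Result Str [Undefined] would do the same for "Null", which is what makes the two notions composable.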

view this post on Zulip Karakatiza (Apr 08 2024 at 01:25):

Would that work for Encode and Decode?

view this post on Zulip Eli Dowling (Apr 08 2024 at 01:30):

Not presently, but that's why this discussion exists: to get ideas as to how we'd like it to work, and then see how close we can get and whether we need new language features, etc.

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:30):

You probably could hack it to work by making the JSON encoder/decoder special-case the tags Null and Undefined. More likely, some form of opaque type for smarter handling is needed.

view this post on Zulip Eli Dowling (Apr 08 2024 at 01:34):

@Karakatiza I have a fork of roc and the roc-json package where this is implemented as an opaque type.

view this post on Zulip Karakatiza (Apr 08 2024 at 01:34):

So, like Optional a [Null, Omitted] ?

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:39):

@Eli Dowling just curious, with current Roc, why couldn't it be implemented on an opaque type? What caused the need for the fork?

view this post on Zulip Eli Dowling (Apr 08 2024 at 01:51):

@Brendan Hansknecht There is no way to decode a record with a field that is missing. I needed to change the internal decoding derive code so that it tries to decode any missing fields, to see whether they are okay with decoding an empty array [].
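The idea behind that change can be sketched in Python. This is only an analogy under stated assumptions, not the actual derive code in the fork: the record decoder hands empty input to the decoder of any field that is absent, and each field decoder decides whether empty input is acceptable. All function names here are hypothetical.

```python
import json


def decode_int(raw: bytes) -> int:
    """A required field: empty input means the field was missing, so fail."""
    if raw == b"":
        raise ValueError("required field is missing")
    value = json.loads(raw)
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("expected an integer")
    return value


def decode_optional(inner):
    """An optional field: empty input decodes to None instead of failing."""
    def decode(raw: bytes):
        return None if raw == b"" else inner(raw)
    return decode


def decode_record(text: str, field_decoders: dict) -> dict:
    """Decode a JSON object, offering empty bytes to decoders of absent fields."""
    obj = json.loads(text)
    result = {}
    for name, decode in field_decoders.items():
        raw = json.dumps(obj[name]).encode() if name in obj else b""
        result[name] = decode(raw)
    return result


decoders = {"y": decode_int, "maybe": decode_optional(decode_int)}
record = decode_record('{"y": 1}', decoders)
assert record == {"y": 1, "maybe": None}
```

With this shape, whether a missing field is an error stays a property of the field's type rather than of the record decoder, which matches the Option type described at the start of the thread that decodes a 0 byte input to None.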


Last updated: Jul 05 2025 at 12:14 UTC