Builtin Option type for decoding · ideas

Eli Dowling (Apr 02 2024 at 10:20):

I think it'd be good to either:
A) Provide some documentation on the approved way to handle decoding data that has optional fields without an Optional type, or
B) Consider adding an optional type to builtins.

With the discussion around JSON null decoding, it became evident that roc-json should probably have an option-style type for decoding JSON, which almost always has optional fields. I'm concerned that this will lead to many different optional types being created by different libraries.
F# and OCaml both have this problem to some degree: a bunch of common things are not provided and end up being redefined in incompatible ways. OCaml's many incompatible standard libraries, and F#'s result types and FSharpPlus libraries, are good examples of this.

Brendan Hansknecht (Apr 02 2024 at 14:40):

Can we just make it Result SomeType [WasNull] and keep it a result like everything else?

Eli Dowling (Apr 02 2024 at 14:57):

I suppose it's probably possible, I'll have a play around with what it looks like as a user and get back to you with some thoughts. I think my main concern would be how to implement it into the decoder. You can't define decoders and encoders outside opaque types.

Also what would it look like to encode null?
Does that mean that where you would normally have a value like myval:{Err{Null:{}}}, we magically change it to not be tags and instead encode it as myval:null?

Richard Feldman (Apr 02 2024 at 15:43):

Richard Feldman (Apr 02 2024 at 15:48):

it's simple and self-descriptive, and I'm not sure it'll be all that valuable in practice to have helper functions like withDefault and the like

Richard Feldman (Apr 02 2024 at 15:49):

it at least seems like the most straightforward thing to try first! If there's a problem in practice we can always try a different approach

Eli Dowling (Apr 02 2024 at 22:08):

@Richard Feldman Are you suggesting we have a special handling in the compiler's decoding implementation for these tags?
I would personally find options without try, withDefault, and map pretty annoying to use; I use those all the time in Rust, F#, and OCaml. Those are very basic operations I would always use when validating optional data. In fact, in the LSP library I'm working on, which is what I need optionals for, I immediately defined and used all of the above :sweat_smile:

@Brendan Hansknecht I had a think about your Options as results idea, I actually quite like it. We could implement it like this:

ResultOption val := Result val [Null]
    implements [
        Eq,
        Decoding {
            decoder: decoderRes,
        },
        Encoding {
            toEncoder: toEncoderRes,
        },
    ]
# Constructors and accessors for the opaque wrapper.
resNull = \{} -> @ResultOption (Err Null)
resSome = \val -> @ResultOption (Ok val)
get = \@ResultOption val -> val
from = \val -> @ResultOption val

# Bytes written for a null value. Not defined in the original snippet;
# assumed here to be the JSON literal.
nullChars = "null" |> Str.toUtf8

toEncoderRes = \@ResultOption val ->
    Encode.custom \bytes, fmt ->
        when val is
            Ok contents -> bytes |> Encode.append contents fmt
            Err Null -> bytes |> List.concat nullChars

# An empty byte list decodes to null; anything else is decoded as the wrapped value.
decoderRes = Decode.custom \bytes, fmt ->
    when bytes is
        [] -> { result: Ok (resNull {}), rest: [] }
        _ ->
            when bytes |> Decode.decodeWith Decode.decoder fmt is
                { result: Ok res, rest } -> { result: Ok (resSome res), rest }
                { result: Err a, rest } -> { result: Err a, rest }

Luke Boswell (Apr 02 2024 at 22:37):

Eli Dowling (Apr 02 2024 at 22:40):

Yup, I had a version for that too, but I think it'd be better to have a standard one without undefined and a special Json one with. I do agree with what was mentioned in the other thread about the distinction being a niche use case, and somewhat unique to Json

Luke Boswell (Apr 02 2024 at 23:21):

If that is the case, then perhaps the best way to document this would be to include an example using something like the ResultOption you have above, so others can see a good way to handle decoding and encoding.

Eli Dowling (Apr 02 2024 at 23:56):

@Luke Boswell
I was suggesting that we could include just the type above as a builtin to avoid duplicating any of the existing Result methods.
That way, once the type is decoded, you call option.get to un-opaque it into a Result myval [Null], at which point you can call Result.try etc. to do other validation and such.
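
A minimal sketch of what working with the un-opaqued value could look like. The alias MaybeName and the helper names are hypothetical, the Result functions use the camelCase builtin names current at the time of this thread, and the module header is omitted because its syntax depends on the Roc version:

```roc
# Hypothetical alias for a field after calling `get` on the wrapper.
MaybeName : Result Str [Null]

# Fall back to a default when the field was null.
nameOrDefault : MaybeName -> Str
nameOrDefault = \name -> Result.withDefault name "anonymous"

# Further validation: here an empty string is treated like null.
nonEmpty : MaybeName -> MaybeName
nonEmpty = \name ->
    Result.try name \s ->
        if Str.isEmpty s then Err Null else Ok s

# Transform the value when present; `Err Null` passes through untouched.
shout : MaybeName -> MaybeName
shout = \name -> Result.map name \s -> Str.concat s "!"

expect nameOrDefault (Err Null) == "anonymous"
expect nameOrDefault (Ok "eli") == "eli"
expect nonEmpty (Ok "") == Err Null
expect shout (Ok "eli") == Ok "eli!"
```

Because the unwrapped value is an ordinary Result, none of these helpers would need to be redefined for the wrapper type itself.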

Luke Boswell (Apr 03 2024 at 01:05):

It makes sense to include in json so it is easy for people to reach for, and if someone needs it they can implement locally specific to their decoding use case.

Luke Boswell (Apr 03 2024 at 01:07):

Luke Boswell (Apr 03 2024 at 01:12):

What I take from that is that the intent is to encourage users to write more descriptive models of their data using tags.

So in the context of this discussion, I feel like having a good example and encouraging people to tailor this to their data or decoding use case would be a good approach.

Luke Boswell (Apr 03 2024 at 01:29):

Is it because, if we have it in builtins, we can use the same opaque type everywhere, which makes it more transferable across packages? You might have a library which expects a Nullable data field, and it would represent that using this type, so the encoding and decoding implementation is also available.

Luke Boswell (Apr 03 2024 at 01:30):

I don't have a strong opinion here. Just really interested to explore the idea, as the above looks really useful

Eli Dowling (Apr 03 2024 at 03:26):

@Luke Boswell
My argument for having it in builtin is as follows:
I am a user of Roc who has made a web API that uses optional types and sends JSON.
I would like to switch to a binary format for my data to save bandwidth, so I switch to MessagePack or gRPC or some other format.
My API has optional fields.

If an Optional type is builtin, I change the serialisation part of my server and everything works.

If Optional is part of the JSON package, I have to change every optional field to use the optional type from the MessagePack package.

If I wanted to have a flag that changes the server from one to another.... Well that's either extremely painful or impossible depending on your tolerance for pain.

I think optional data is a very fundamental part of all serialisation and de-serialization.

As for the goal of not including optional in the language:
I don't disagree with it. That's why I would put the type in the decoding package, and we could also make it return a Result when "un-opaqued" (as shown above). I think that would strongly indicate that optional is a tool for decoding and encoding, for modelling the outside world. We can also make the intended use clear in the docs and example code; convention is a powerful tool.

Brendan Hansknecht (Apr 03 2024 at 03:43):

I think trying to force sharing between the JSON serde type and the gRPC serde type is more of a slippery slope than you think. These types are highly likely to diverge and each have their own quirks. On top of that, a lot of versioning complexity often has to be encoded into binary formats, but is dropped in JSON and other more flexible formats.

Brendan Hansknecht (Apr 03 2024 at 03:45):

Not against the idea, but I have essentially never seen a binary encoded piece of data that doesn't build up cruft over time and with versioning. As such, sharing may be lacking. On top of that, as you already mentioned, json is special with undefined as well, which removes its ability to share.

Brendan Hansknecht (Apr 03 2024 at 03:46):

I do think we should try to enable easy type conversion or some sort of sharing if possible, but I'm not sure how much this will help.

Brendan Hansknecht (Apr 03 2024 at 03:53):

I feel like my comment above may be more relevant to sharing the exact same type than to sharing part of the type in the form of ResultOption. I do think that ResultOption is a primitive that can probably be shared with most formats, though it may need a special encoding/decoding ability function. I'm not sure it's actually encodable in a generic way that would map to all formats, so it may need a unique function per format, or it may need to be defined per format to encode correctly.

Eli Dowling (Apr 03 2024 at 05:32):

I was going to say something similar. I'd say the Json grpc issue is more a design decision (maybe not a good one) up to the user.
I can't say for sure, but I haven't yet thought of a format that couldn't encode it using the method I've outlined in my optional record field PR (encoding None outputs an empty byte array, and decoding an empty byte array is interpreted as None).

The Json issue isn't that relevant I'd say. I think most users would prefer to treat null and undefined as the same thing and we can provide an option to turn that behaviour off for the few who want it. In that case you really are relying on fairly Json specific features and so I think it's fair to need to use a special JsonValueOrNullOrUndefined type.
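
To make the distinction concrete, a three-state field type along those lines might look roughly like this. This is only a sketch: the alias NameField, its tag names, and the helper are hypothetical and do not come from roc-json, and the module header is omitted:

```roc
# Hypothetical three-state type: the key is absent, present with null,
# or present with a value.
NameField : [Missing, Null, Value Str]

# The default most users would probably want: treat missing and null alike.
toResult : NameField -> Result Str [Null]
toResult = \field ->
    when field is
        Value v -> Ok v
        Null | Missing -> Err Null

expect toResult (Value "eli") == Ok "eli"
expect toResult Null == Err Null
expect toResult Missing == Err Null
```

Only code that genuinely cares about the difference would need to match on Missing and Null separately.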

Eli Dowling (Apr 03 2024 at 05:42):

Also, my example was less about exposing both in prod and more about debugging: when using binary protocols like MessagePack, it's common to switch to JSON for debugging purposes, because decoding binary payloads by hand is a pain. E.g. in your dev environment everything talks JSON, and you use MessagePack when testing and when deployed. Microsoft's implementation for ASP.NET does that by default, I believe.

Richard Feldman (Apr 03 2024 at 10:39):

Richard Feldman (Apr 03 2024 at 10:41):

can someone help me out with this? Like if you write { "foo": undefined } - that is invalid JSON and doesn't parse

Richard Feldman (Apr 03 2024 at 10:42):

if you write JSON.stringify({ foo: undefined }) in JavaScript, it evaluates to the JSON string "{}" but that has everything to do with the semantics of JavaScript's JSON.stringify method and is irrelevant to how any other programming language implements JSON serialization

Richard Feldman (Apr 03 2024 at 10:44):

so I'm just very confused about why the word "undefined" is any part of a conversation about JSON (a serialization format which doesn't have undefined) in a language that also doesn't have undefined!

Richard Feldman (Apr 03 2024 at 10:52):

Eli Dowling (Apr 03 2024 at 10:56):

@Richard Feldman
Okay, firstly, we got a bit derailed; the conversation I wanted to have was about having a builtin Optional type.

However, to address your point, I would say JSON does have undefined, if you think of undefined as "a value that doesn't exist", which is how I think most would define it.
eg:
User = {"name":"eli","age":10} name is a string
User = {"name":null,"age":10} name is null
User = {"age":10} name is undefined
JS certainly thinks so
[screenshot: image.png]

Eli Dowling (Apr 03 2024 at 10:57):

I'm happy to use another term if you think there is a more correct/intuitive way to talk about a field being "not present" vs "null" :). I guess because JSON is from the JS world I'm using JS terms

Richard Feldman (Apr 03 2024 at 11:18):

I think this is a really important distinction: the undefined in that screenshot is coming from JavaScript and not JSON

Richard Feldman (Apr 03 2024 at 11:20):

Richard Feldman (Apr 03 2024 at 11:21):

this is an important distinction because "a value that doesn't exist" is represented in Roc as "the record field doesn't exist"

Richard Feldman (Apr 03 2024 at 11:21):

so I don't think JSON serialization in Roc should have any special concept of "field missing"

Richard Feldman (Apr 03 2024 at 11:25):

(unless we want to get into optional/default values in the case that a field is missing, which is separate from undefined)

Eli Dowling (Apr 03 2024 at 11:28):

Firstly, could we maybe move this to another thread? I feel like we are a long way away from my original topic.

witoldsz (Apr 03 2024 at 11:28):

I have just found that the feature of optional fields is severely limited in practice, please take a look here: optional field problem

Eli Dowling (Apr 03 2024 at 11:29):

Richard Feldman (Apr 03 2024 at 11:42):

@witoldsz yeah by design optional record fields are not intended to be for use cases like this - they're designed to be limited in use to default parameters in functions, but I understand why that has been unclear!

Eli Dowling (Apr 03 2024 at 12:10):

Richard Feldman (Apr 03 2024 at 14:52):

so I appreciate the point about withDefault being convenient in general, but I'm not sure how important it is in this specific use case

Richard Feldman (Apr 03 2024 at 14:52):

maybe it's a big deal in practice, or maybe it's fine to just use a when on the field

Richard Feldman (Apr 03 2024 at 14:53):

so I don't want to assume that's a problem before we've confirmed it one way or the other

Richard Feldman (Apr 03 2024 at 14:53):

having it be Result would work, but I agree with the point that something seems off about it

Richard Feldman (Apr 03 2024 at 14:53):

Richard Feldman (Apr 03 2024 at 14:54):

one of the things I like about [Null, NotNull a] is that it hints that "we got this value from JSON and we should probably turn it into something more reasonable"

Richard Feldman (Apr 03 2024 at 14:55):

whereas if I see a Result in my data structures, my first thought is "wait, what is that Result doing there?"
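
For illustration, converting a [Null, NotNull a] field into something more descriptive with a plain when might look roughly like this. The record shape and the names Decoded and greeting are hypothetical, and the module header is omitted:

```roc
# Hypothetical record as it might come out of a JSON decoder.
Decoded : { name : [Null, NotNull Str], age : U64 }

# Turn the nullable field into a plain value with a `when` on the field,
# without any Option-style helper functions.
greeting : Decoded -> Str
greeting = \user ->
    when user.name is
        NotNull name -> Str.concat "Hello, " name
        Null -> "Hello, stranger"

expect greeting { name: NotNull "eli", age: 10 } == "Hello, eli"
expect greeting { name: Null, age: 10 } == "Hello, stranger"
```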

Richard Feldman (Apr 03 2024 at 14:55):

Richard Feldman (Apr 03 2024 at 14:57):

I appreciate this point, but I think the cure might be worse than the symptoms in this case :big_smile:

Richard Feldman (Apr 03 2024 at 14:58):

as in, if Option or Optional is a builtin, there is a 100% chance it will get used for lots of things other than serialization

Richard Feldman (Apr 03 2024 at 14:59):

and there will end up needing to be guides written about when to use it and when not to use it, when to use Result instead, when to use tag unions, like this entire talk

Richard Feldman (Apr 03 2024 at 14:59):

so I don't think it's a given that the JsonOptional/YamlOptional/etc. world is actually worse than having a builtin!

Richard Feldman (Apr 03 2024 at 15:00):

(maybe it is, maybe it isn't - I just want to point out that neither is obviously better than the other to me)
