I just discovered the hard way that our JSON decoder decodes null into the string "null"
:joy:
only if you expect to decode it to a string, right?
somebody somewhere would call it a feature
I think this is because we hadn't decided on a convention for what it should do with nulls
I think [Null, NotNull a]
seems like the frontrunner because it's very explicit, and also means you can do stuff like { email: Null }
and have it generate the right JSON
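roughly like this (a sketch of the idea; assumes tag union decoding works, which it doesn't yet):

```roc
# Sketch only: illustrative, not working code today.
Contact : { email : [Null, NotNull Str] }

# Decoding: {"email": null}          -> { email: Null }
#           {"email": "a@b.example"} -> { email: NotNull "a@b.example" }
# Encoding: { email: Null }          -> {"email":null}
```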
@Luke Boswell do you think you'd have bandwidth to add null handling to your json package?
it's the first Roc in production blocker we've run into at work (which is, in one sense, a milestone for the language! :tada::sweat_smile::tada:)
oh wait, @Ayaz Hafiz - Decoding doesn't support tag unions yet, does it?
we would need that to be able to decode into [Null, NotNull Str]
or whatever
I should have some time today to look at it. Might be helpful to make a summary of things not yet supported. We also don't support Dicts yet.
nice! Actually another one that should be unblocked is being able to decode objects with extra fields
From memory I think tags are good to go. Do you think it shouldn't be possible to decode a null or true or false into a Str? The alternative is these fail, and must use tags instead.
e.g. if I want to decode { a, b } from a JSON object that has "a", "b", and also "c" as fields, that fails currently
yeah I think they should fail
like if we get a null and we expected a string, that should fail
if we think null could happen instead of string, we should decode into [Null, NotNull Str]
I thought extra fields were just ignored. I can look at that too.
sweet, thank you so much! :heart_eyes:
btw the PascalCase field mapping came up right away and was super useful, so thanks for that! :smiley:
Decoding doesn't support tag unions yet, does it?
Yeah it doesn't
I forget - were there unsolved design questions there?
or is it unblocked but just not implemented
Oh right, we currently encode tags, but can't decode them.
yeah that makes sense
So if we don't have tags, would you like me to update and have roc-json fail when trying to decode a Null into a string?
That is an easy change, just remove this when statement
decodeString = Decode.custom \bytes, @Json {} ->
when bytes is
['n', 'u', 'l', 'l', ..] ->
{ result: Ok "null", rest: List.dropFirst bytes 4 }
the specific scenario we ran into at work is needing to distinguish between null and non-null things
so Bool errors out on null at the moment, which is actually a blocker :sweat_smile:
so what we need is to be able to tell the difference, not to have it error out (or change it to a default value)
in other words, we need tags
It's easy enough to default null to false when decoding a Bool
unfortunately in this use case we specifically need to be able to distinguish between null, true, and false :big_smile:
but I think accepting unused object fields should be unblocked, yeah?
I think it should be working with jsonWithOptions = \{ fieldNameMapping ? Default, skipMissingProperties ? Bool.true }
Setting skipMissingProperties: Bool.true
should be working
ahh interesting
can we default that to true?
It is currently
## Returns a JSON `Encoder` and `Decoder`
json = @Json { fieldNameMapping: Default, skipMissingProperties: Bool.true }
## Returns a JSON `Encoder` and `Decoder` with configuration options
##
## `skipMissingProperties` - if `True` the decoder will skip additional properties
## in the json that are not present in the model. (Default: `True`)
jsonWithOptions = \{ fieldNameMapping ? Default, skipMissingProperties ? Bool.true } ->
@Json { fieldNameMapping, skipMissingProperties }
And here is the test for it which passes
# Test decode of partial record with multiple additional fields
expect
input = Str.toUtf8 "{\"extraField\":2, \"ownerName\": \"Farmer Joe\", \"extraField2\":2 }"
actual : DecodeResult { ownerName : Str }
actual = Decode.fromBytesPartial input json
expected = Ok { ownerName: "Farmer Joe" }
result = actual.result
result == expected
oh but what about Decode.decode?
or is it unblocked but just not implemented
Yeah, it's this. We have a design for it here: https://github.com/roc-lang/roc/issues/3816. In principle ("principle") it should mostly be a copy-paste of what's needed for records
awesome! Does anyone want to take a shot at implementing that? Could be a fun self-contained way to make a change to that part of the compiler :smiley:
(with outside help and guidance of course!)
if nobody else is interested, I'll pick it up, but I figured it might be a good introductory project to compiler type system stuff
@Richard Feldman if you haven't started on it yet I'd be interested in doing it. I have a good chunk of time I could spend on it tomorrow. That said, if it's very time sensitive to get it done for your work then I'll hold off.
that's awesome, go for it! :smiley:
Okay sweet! Any guidance before I dig in?
re-reading the notion doc linked in the issue, I don't remember what the Nat in discriminant does, or how this would work when specifically decoding the bytes null into a tag union :sweat_smile:
@Ayaz Hafiz do you remember how those would work?
it returns the index of the matched discriminant given the list of discriminants
the formatter could match null on the empty discriminant
hm, I still don't follow :sweat_smile: - how would that look in practice?
so when Json encounters a null it calls DecoderFormatting.discriminant [[]]? But in that case, how would that get translated to a [Null, NotNull a] on the decoded side?
No, the point is that the formatter implements discriminant right
So JSON implements that function
So given [[A],[B],[]] the formatter would return a decoder that when matched on null returns 2
And then that index is used to figure out what “sequence” function should be called
which for the index 2 case here, would return the variant named Null
[[A],[B],[]]
is this the type of a tag union?
No, the list of discriminants passed to the discriminant function
The relevant tag union would be “A …, B…, Null”, I suppose
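As a sketch of the flow, with illustrative values (none of this is the actual API surface):

```roc
# Sketch: how the formatter's discriminant function would drive decoding
# of [A ..., B ..., Null]. Names and shapes here are assumptions.
discriminants = [['A'], ['B'], []] # the last entry is the empty discriminant

# discriminant discriminants yields a decoder returning the matched index:
#   input "A..."  -> Ok 0
#   input "B..."  -> Ok 1
#   input "null"  -> Ok 2   (matched the empty discriminant)
# The derived decoder then dispatches on that index: 0 and 1 go on to
# decode their payloads via "sequence", while 2 just produces Null.
```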
ah gotcha - @Isaac Van Doren does that make sense? Happy to answer questions if it would help!
I haven't looked at the implementation yet but I think this will be enough for me to get started tomorrow :check:
What is the status of decoding tuples from json? I see that it's implemented but haven't gotten it to work
do you have an example you've tried?
As I was typing up a compact example, I got it to work. Looks like the reason it wasn't working before is because the file I was reading in had whitespace in it.
For example, this fails
expect
actual = Decode.fromBytes ("[ 123,456]" |> Str.toUtf8) Core.json
expected = Ok (123,456)
actual == expected
but it passes if I remove the space before 123. So I guess just a bug with whitespace then
ah nice! can you open an issue for that?
Will do!
https://github.com/lukewilliamboswell/roc-json/issues/5
Also, what's the status for removing the rest of TotallyNotJson from the builtins? Is it blocked or ready to proceed? I see that there is a partial PR from June to do so
Another thought, it might be nice to link to roc-json somewhere on the Decode page so that it's easier to find. I had to dig around in Zulip for a while before I could figure out where Json was moved to.
I think the tests need to be updated to use a shim implementation, but I don't think there's any blocker. I think @Luke Boswell has the most context
On another note, this approach to decoding is so cool :star_struck:. Very slick to be able to do it without any type annotations if desired
yeah it's crazy, right
still blows my mind
Ayaz Hafiz said:
still blows my mind
and you made it!!! :rolling_on_the_floor_laughing:
Re removing TotallyNotJson I couldn't figure out how to inline the module into the rust tests at the time. I described the issue in a zulip thread I think, or maybe the issue. On my phone rn, I can dig into it later. I don't think there are any issues, it was just beyond my rust skills and probably a bit ambitious for someone with less than a month rust exposure :sweat_smile:
This #5543 is the inactive PR I was working on. I haven't looked at this in a while, and don't plan to any time soon. I probably should close this PR as I imagine it is pretty stale now.
@Isaac Van Doren while you are looking at that part of the code base, there is another issue #5294 which would be super useful. This would mean we can decode json straight into dicts.
i.e. {"Mickey Mouse":"Disney", "Donald Duck":"Disney", "Daffy Duck":"Warner Brothers", "Fred Flintstone":"Hanna Barbera"}
could be decoded into Dict Str Str
I don't know how hard that is to implement. But just thought I would mention in case you are interested.
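For reference, a hypothetical usage once that lands (sketch only, not working today):

```roc
# Hypothetical: decoding a JSON object straight into a Dict (issue #5294).
input = Str.toUtf8 "{\"Mickey Mouse\":\"Disney\",\"Donald Duck\":\"Disney\"}"
actual : DecodeResult (Dict Str Str)
actual = Decode.fromBytesPartial input json
# actual.result would be Ok of a Dict mapping each name to its studio
```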
Okay sounds good. Thanks for mentioning it, I’ll keep it in mind if I get that far! It would be very nice to have
Trying to wrap my head around how this stuff works, so I could be totally off here. Do we need to have a
sequence : state, (state -> [Keep (Decoder state fmt), Skip]), (state -> Result val DecodeError) -> Decoder val fmt | fmt has DecoderFormatting
when we already have a
tuple : state, (state, Nat -> [Next (Decoder state fmt), TooLong]), (state -> Result val DecodeError) -> Decoder val fmt where fmt implements DecoderFormatting
?
It seems like perhaps it could serve the same purpose.
They seem to have different goals. Sequence looks to collect or skip items from a list while tuple takes a list of a specific length. So it would never skip anything and must fail if there are too few or many items.
Though probably could be merged if really wanted. Would need a combined tag and more complex state machine.
Also, what final type does sequence generate? I assume a list instead of a tuple?
I think they both would take a list of a specific length. Because the appropriate call to sequence would be made based on the particular tag that is trying to be decoded into. So in that case the arity of the tag (i.e. the length of the list) is known.
I think the usage of Keep and Skip may have been borrowed from the record decoder. I don’t see why you would want to skip some values if you’re trying to decode into a tag
Also, what final type does sequence generate? I assume a list instead of a tuple?
The idea is to use sequence to decode into a Tag with payload so basically the same thing as a tuple
ah sequence is for tags. I'm not sure then cause a tag isn't really a tuple. A tag is a enum and then anything (record, list, tuple, etc)
Right I should have been more clear. What I mean is that T a b c holds basically the same payload as (a,b,c).
ah yeah, totally.
Yeah i think using Tuple is fine
I wrote that document before tuples existed :upside_down:
Okay cool
@Isaac Van Doren how are things going on this? I'm happy to help with anything! :smiley:
I’ve got most of the boilerplate for a module to derive tag unions, implemented discriminant, and mostly figured out how the decoding stuff fits together.
The next step is getting a sample implementation of what the derived decoder should be for an example tag Union so that I can test it and make sure I’m on the right track before writing all the code to actually generate it.
Originally I thought I would be able to use the tuple decoder for the payload of a tag, but I realized that won't work because it needs to work for empty tuples and single-element tuples.
So I’ll need to implement sequence as Ayaz had originally thought might be necessary.
Once I have that I’ll have the example implementation done.
I’ve had some issues with building my branch which I don’t yet understand. Specifically, cargo check succeeded but cargo build hung on roc_load. Do you have an idea about what could cause this? Also have not gotten rust analyzer and cargo to play together nicely yet.
If you need this done more promptly for work I am happy to let you take over.
oh wow, awesome!
it's not urgent, and I'm excited to see your progress! :smiley:
if you git bisect
, can you find out which commit introduced the hang? I might have some ideas based on that
Also have not gotten rust analyzer and cargo to play together nicely yet.
The devtools flake should fix that
Does build hang around roc_load? If so I suspect it’s due to the builtins not compiling. you can try “cargo build -vvv” which should be much more verbose and show if that’s the issue. I suspect you may need to add the discriminant/sequence implementation to the Json module in the stdlib if you haven’t already
Okay good to know, I’ll try that. I have added the implementation to the Json module but there could certainly still be an error
do you mean TotallyNotJson in the roc/ repo?
The devtools flake should fix that
Yes I need to get the setup! Right now I have the LSP building in a different directory which solved the main problem it seems
do you mean TotallyNotJson in the roc/ repo?
Yes
I can probably figure something out once I get a chance to look into it more with the verbose output
cool, let us know if you get stuck, and we can try some other things!
I have an example implementation of a derived decoder for a tag union:
example : Decoder [Zero, One a, Two b c] fmt where a implements Decoding, b implements Decoding, c implements Decoding, fmt implements DecoderFormatting
example =
nameDecoder =
["Zero", "One", "Two"]
|> List.map Str.toUtf8
|> discriminant
Decode.custom \bytes, fmt ->
{ result: nameResult, rest: bytesAfterName } = Decode.decodeWith bytes nameDecoder fmt
when nameResult is
Err _ -> { result: Err TooShort, rest: bytesAfterName }
Ok index ->
when index is
0 ->
state = {}
stepper = \_, _ -> TooLong
finalizer = \_ -> Ok Zero
Decode.decodeWith bytesAfterName (Decode.tuple state stepper finalizer) fmt
1 ->
state = { e0: Err TooShort }
stepper = \s, i ->
when i is
0 ->
Next
(
Decode.custom \b, f ->
when Decode.decodeWith b Decode.decoder f is
{ result, rest } ->
{ result: Result.map result \val -> { s & e0: Ok val }, rest }
)
_ -> TooLong
finalizer = \s ->
when s is
{ e0: Ok val } -> Ok (One val)
_ -> Err TooShort
Decode.decodeWith bytesAfterName (Decode.tuple state stepper finalizer) fmt
2 ->
state = { e0: Err TooShort, e1: Err TooShort }
stepper = \s, i ->
when i is
0 ->
Next
(
Decode.custom \b, f ->
when Decode.decodeWith b Decode.decoder f is
{ result, rest } ->
{ result: Result.map result \val -> { s & e0: Ok val }, rest }
)
1 ->
Next
(
Decode.custom \b, f ->
when Decode.decodeWith b Decode.decoder f is
{ result, rest } ->
{ result: Result.map result \val -> { s & e1: Ok val }, rest }
)
_ -> TooLong
finalizer = \s ->
when s is
{ e0: Ok v0, e1: Ok v1 } -> Ok (Two v0 v1)
_ -> Err TooShort
Decode.decodeWith bytesAfterName (Decode.tuple state stepper finalizer) fmt
_ -> { result: Err TooShort, rest: bytesAfterName }
It type checks (:tada:) but I have not been able to test it.
When I run this (using cargo)
expect
actual = Decode.decodeWith ("Two:[3,\"foo\"]" |> Str.toUtf8) example json
actual.result == Ok (Two 3 "foo")
I get
thread '<unnamed>' panicked at 'unspecialized lambda sets left over during resolution: LambdaSet([] + (<22014>FlexAble(a, [`Decode.Decoding`, `Bool.Eq`]):`Decode.decoder`:1), ^<22018>), UlsOfVar(VecMap { keys: [19831, 22014, 22367, 22462], values: [VecSet { elements: [19830] }, VecSet { elements: [22016] }, VecSet { elements: [22369] }, VecSet { elements: [22464] }] })', crates/compiler/mono/src/layout.rs:2074:17
Any suggestions about getting the testing to work or about the code in general? I figured it would be good to get feedback before I get too far in to implementing the code to generate it.
if it's a lambda set problem, my general approach is "cross my fingers and hope @Ayaz Hafiz knows" :sweat_smile:
wow, is there 22K type variables in the derived implementation? or is that placed into a roc program?
have you added tests in the derive crate? there are some checks there that make sure the type variables are well formed
otherwise, if you can run a derive test with ROC_PRINT_UNIFICATIONS and ROC_TRACE_COMPACTION the output might give some pointers to us
This is just an example I wrote myself to figure out what the compiler should derive. I had it in the TotallyNotJson module to test with the other functions there. Given that, are there still tests I could add in the derive crate that would be helpful?
oh interesting
I’m out right now but I can take a deeper look in a bit. you might be running into what i lovingly call the “higher region restriction”
there are some notes on the possible patterns that cause it and workarounds here: https://github.com/roc-lang/roc/issues/3724
Okay cool, I’ll check that out when I get a chance
Can you share your branch?
Yes here you go https://github.com/isaacvando/roc/tree/decode-tags-2
Alright, I don't know exactly why this is happening but it's not due to the HRR. But I suspect that it shouldn't be a blocker for the implementation of auto-derivation. I think this is a bug with type inference, but in the auto-deriver, the type variables will be explicitly linked together, and so things should "just compile"
Thanks for looking into it! I’ll go ahead with implementing the deriving for this
I would like to construct a FlatDecodableKey::TagUnion(Vec<(TagName, u16)>) here to be used in deriving the decoder. The u16 is the arity of the tag in question. I'm not sure where I can get the arity, any suggestions?
here's how it's determined for Encode: https://github.com/roc-lang/roc/blob/531af182899835a9298efd758b927e5ba3e2f64a/crates/compiler/derive_key/src/encoding.rs#L80
Great, thanks!
I've started working on generating code for the decoder. Actually generating the code is pretty straightforward but I'm not sure what do with all of the type variables, when to unify them, and which ones to unify. I don't know how unification works in detail so perhaps I need to get more familiar with it.
That being said, I think I can probably get pretty far by using similar approaches to those used by the other decoder derivers.
Is there a good way to test the unification part incrementally? Right now I have a test in test_derive which is useful because I can see the code that gets generated, but I'm not sure about the unification piece.
This is my branch https://github.com/isaacvando/roc/tree/decode-tags-2
I'm also currently getting an error with the generated code about builtin functions not being found:
---- decoding::tag_union stdout ----
thread 'decoding::tag_union' panicked at 'Derived does not typecheck:
── UNRECOGNIZED NAME ──────────────────────────────────────────────── Test.roc ─
The List module does not expose anything by the name `map`.
── UNRECOGNIZED NAME ──────────────────────────────────────────────── Test.roc ─
The Str module does not expose anything by the name `toUtf8`.
Derived def:
#Derived.decoder_[A 2,B 1] = discriminant (List.map Str.toUtf8 ["A", "B"])
I'm not sure if this is because I'm referencing them wrong or maybe because the derived code doesn't type check?
This is how I'm writing the calls:
let list_map_fn = Box::new((
env.subs.fresh_unnamed_flex_var(),
Loc::at_zero(Expr::Var(
Symbol::LIST_MAP,
env.subs.fresh_unnamed_flex_var(),
)),
env.subs.fresh_unnamed_flex_var(),
env.subs.fresh_unnamed_flex_var(),
));
let list_map_to_utf8_call = Expr::Call(
list_map_fn,
vec![
(
env.subs.fresh_unnamed_flex_var(),
Loc::at_zero(Expr::Var(
Symbol::STR_TO_UTF8,
env.subs.fresh_unnamed_flex_var(),
)),
),
(
env.subs.fresh_unnamed_flex_var(),
Loc::at_zero(list_of_variants),
),
],
CalledVia::Space,
);
all the fresh_unnamed_flex_vars are placeholders for now
@Ayaz Hafiz do you know if there’s any good way to test parts of a derived decoder prior to having a full implementation?
The uitest for deriving?
They show the intermediate specialization info that is chosen
Not really. You can try writing a test in the test_derive crate (check out for example the tests for encoding/decoding in that crate)
They will check that the derived implementation doesn't have type errors
It looks like the addition of those tests was missed for inspect.
Okay thanks
I have a test written there which is useful to see the code that is generated but doesn’t help with the variables when I’m just checking a small chunk
Another question, on Expr::Call there are three variables, one for the function, one for the closure, and one for function return type. What is the closure var and how does that differ from the function var?
The closure var is the so called "lambda set"
Each function type is of the form arg1, ..., argn -[clos]-> ret. The closure var is the "clos" part
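e.g., written out (a sketch; myFn is a made-up name):

```roc
# A function type with its lambda set shown explicitly:
#   Str, U32 -[[myFn]]-> Bool
# In Expr::Call, the "function var" covers the whole arrow type,
# while the "closure var" is just the -[[..]]-> part in the middle.
```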
Okay cool, thanks
Hi folks, working on this (adding support for decoding into tag unions) has become a drain recently so I am going to stop. If anyone would like to pick up on my progress, it is here. I implemented discriminant, the necessary boilerplate, an example derived decoder, and some of the code to generate decoders. Thanks for the help along the way! If anyone does want to pick it up I’m happy to answer questions about my approach.
fair enough, thanks for taking it this far - really appreciate it! :smiley:
Thanks for the opportunity! It was very cool to work on
Last updated: Jul 05 2025 at 12:14 UTC