Stream: ideas

Topic: Encoding that can fail


view this post on Zulip Fabian Schmalzried (Mar 13 2024 at 19:04):

I was considering BSON as a potential example for encoding, but I noticed that unlike JSON, BSON does not permit bare values. While 5 is valid JSON, it cannot be converted to valid BSON without being wrapped in a list. Therefore, attempting to convert 5 to BSON should fail. However, this limitation renders the current encoding ability unusable. I am uncertain how to handle this situation, but I think it is typical for an encoder to fail if the output format cannot support a particular value.

view this post on Zulip Brendan Hansknecht (Mar 13 2024 at 20:55):

Encode should be able to fail. We should update the implementation. At the same time, for cases like this specifically, you probably just want to deal with promotion to a 1 element list when encoding.
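Editor's sketch (Rust, since the thread models Encode on serde): the two options above, failing on a bare top-level value versus promoting it to a 1 element list, might look roughly like this. `Value`, `EncodeError`, and the text output are hypothetical stand-ins, not the Roc API and not real BSON bytes.

```rust
/// A tiny value model (hypothetical, unrelated to any real BSON library).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    List(Vec<Value>),
}

/// The single failure this toy format can report.
#[derive(Debug, PartialEq)]
enum EncodeError {
    BareValueAtTopLevel,
}

/// Renders a value at any nesting depth. Integers nested in a list are fine.
fn render(value: &Value) -> String {
    match value {
        Value::Int(n) => n.to_string(),
        Value::List(items) => {
            let parts: Vec<String> = items.iter().map(render).collect();
            format!("[{}]", parts.join(","))
        }
    }
}

/// Strict variant: a bare value at the top level is an error.
fn encode_strict(value: &Value) -> Result<String, EncodeError> {
    match value {
        Value::List(_) => Ok(render(value)),
        Value::Int(_) => Err(EncodeError::BareValueAtTopLevel),
    }
}

/// Lenient variant: a bare value is promoted to a 1 element list.
fn encode_promoting(value: &Value) -> String {
    match value {
        Value::List(_) => render(value),
        bare => render(&Value::List(vec![bare.clone()])),
    }
}
```

Note that only the top-level entry point decides whether a bare value is acceptable; the nested renderer is shared by both variants.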

view this post on Zulip Brendan Hansknecht (Mar 13 2024 at 21:02):

To clarify, I assume we should support failure cause serde does. I assume they have a good reason for it (probably cause not all types need to be supported by all serialization formats).

view this post on Zulip Richard Feldman (Mar 13 2024 at 21:02):

yeah encoding not being able to fail was an oversight...it should be possible for it to fail!

view this post on Zulip Fabian Schmalzried (Mar 13 2024 at 21:55):

How difficult would this be to change? Also would it be possible for an EncoderFormatting to have custom errors, or can there be only one generic EncodingError?

view this post on Zulip Richard Feldman (Mar 13 2024 at 23:03):

just one, same as decoding

view this post on Zulip Richard Feldman (Mar 13 2024 at 23:04):

shouldn't be too hard to change, since decoding already uses Result, so there are already examples of how to do it :grinning:

view this post on Zulip Richard Feldman (Mar 13 2024 at 23:04):

I'd be happy to provide guidance if anyone wants to try implementing that change!

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 01:43):

Do we have any way to support custom errors here in general? For example, an encoding format could have many possible errors: integer too large for the format, strings not supported in general, etc.

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 01:43):

Giving each specific encoder control over the error type would be very useful

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 01:44):

Though I'm not sure if it is actually possible in roc

view this post on Zulip Richard Feldman (Mar 14 2024 at 02:14):

not without a type system change, e.g. associated types for abilities

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 04:06):

Kinda sad...but that's what I thought. Would it be worth having an error type that allows a string in it? Or do we just want to enumerate common cases? Or just a generic error?

view this post on Zulip Richard Feldman (Mar 14 2024 at 04:16):

yeah we could have a string, or maybe a tag union of a string or some raw bytes (if that's desirable)

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 04:25):

We should probably survey what are common errors for serde in rust

view this post on Zulip Jasper Woudenberg (Mar 14 2024 at 06:55):

If adding failing encoders, would there still be a possibility to write an encoder that the type system knows is guaranteed to succeed? I imagine that a lot of common encoders will always succeed (thinking JSON, YAML, TOML, XML, ...), and it'd be nice if users wouldn't need to handle errors when using them.

view this post on Zulip Fabian Schmalzried (Mar 14 2024 at 07:55):

That's a good point, the question is if it is worth it to have a second UnsafeEncoding trait? I would assume that requires a lot of duplication.

view this post on Zulip Fabian Schmalzried (Mar 14 2024 at 12:54):

Assuming that I want bare integer encoding to fail, but records should work. I think that would not work with the current API. When encoding the record, I have to use the same encoding function for integer fields that is also used for bare integers.

view this post on Zulip Asier Elorz (he/him) (Mar 14 2024 at 13:10):

This could be represented by returning a result where the error type is Never/Void. And there would also be a mapping from Result a Never -> a which I don't know if already exists but it probably should.
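Editor's sketch (Rust): the "error type is Never" idea corresponds to `std::convert::Infallible` in Rust, an enum with no values. The mapping from `Result a Never` to `a` can then be written without any panic path. `encode_always` is a hypothetical encoder used only for illustration.

```rust
use std::convert::Infallible;

/// Collapses a result whose error type has no values into the success value.
/// The `Err` arm matches on an empty enum, so it needs no body and can never run.
fn unwrap_infallible<T>(result: Result<T, Infallible>) -> T {
    match result {
        Ok(value) => value,
        Err(never) => match never {},
    }
}

/// Hypothetical encoder that is known at the type level to always succeed.
fn encode_always(n: u32) -> Result<Vec<u8>, Infallible> {
    Ok(n.to_string().into_bytes())
}
```

A caller can write `unwrap_infallible(encode_always(5))` and never handle an error, which is the ergonomics Jasper asked about for formats that always succeed.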

view this post on Zulip Fabian Schmalzried (Mar 14 2024 at 13:17):

If I understood correctly we cannot have custom error types, so this is not possible?

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 14:56):

This is definitely one argument for more powerful abilities with associated types. But that is a full other discussion.

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 15:04):

> Assuming that I want bare integer encoding to fail, but records should work. I think that would not work with the current API.

I was looking at serde in rust cause that is what we modeled after and it supports bson. I think bson just happens to be a weird format that requires an extra level of indirection, but still supports serde and raw integer types.

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 15:09):

It works by:

  1. You can serialize any type into a bson enum. The bson enum is not a full bson document. Instead an integer is just a Bson::Int32(value) or Bson::Int64(value). This uses serde and should never fail.
  2. A user can take that Bson enum value and put it into a document
  3. The document can be serialized to the actual binary format
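Editor's sketch (Rust) of the three steps above. The `Bson` enum and `Document` type here are simplified, hypothetical stand-ins and not the real bson crate API. The byte layout follows my reading of the BSON format (little-endian total length, a type byte of 0x10 for int32 or 0x12 for int64, a NUL-terminated key, and a trailing NUL); treat it as an assumption to check against the spec.

```rust
/// Step 1 target: an in-memory value, not yet a full document.
#[derive(Debug, Clone, PartialEq)]
enum Bson {
    Int32(i32),
    Int64(i64),
}

/// Step 1: turning an integer into the enum cannot fail.
fn to_bson(n: i64) -> Bson {
    if n >= i32::MIN as i64 && n <= i32::MAX as i64 {
        Bson::Int32(n as i32)
    } else {
        Bson::Int64(n)
    }
}

/// Step 2: the user places enum values into a document of named fields.
struct Document {
    fields: Vec<(String, Bson)>,
}

impl Document {
    /// Step 3: only a whole document is serialized to bytes.
    /// Keys are assumed to contain no NUL bytes; this sketch does not check.
    fn to_bytes(&self) -> Vec<u8> {
        let mut body = Vec::new();
        for (key, value) in &self.fields {
            // Type byte first, then the NUL-terminated key, then the value.
            body.push(match value {
                Bson::Int32(_) => 0x10,
                Bson::Int64(_) => 0x12,
            });
            body.extend_from_slice(key.as_bytes());
            body.push(0);
            match value {
                Bson::Int32(n) => body.extend_from_slice(&n.to_le_bytes()),
                Bson::Int64(n) => body.extend_from_slice(&n.to_le_bytes()),
            }
        }
        body.push(0); // document terminator
        // The length prefix counts itself (4 bytes) plus the body.
        let total = (body.len() + 4) as i32;
        let mut out = total.to_le_bytes().to_vec();
        out.extend_from_slice(&body);
        out
    }
}
```

Because a bare integer only ever becomes a `Bson` value, and bytes only come from a `Document`, the "bare value" problem never reaches the byte encoder.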

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 15:13):

This works with barely needing errors to encode to the bson enum. The errors they still have are:

view this post on Zulip Fabian Schmalzried (Mar 14 2024 at 15:23):

Interesting. If I understand correctly, then the same mechanism would only work in roc if it would be possible to encode to something else than just List U8, which would require associated types. Probably not worth it, just for this one weird format?

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 15:35):

Doesn't require associated types. Inspect can do this

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 15:36):

But it does point out a potentially large deficiency in our encode api

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 15:36):

Encode and decode may need to be more powerful like inspect.

view this post on Zulip Richard Feldman (Mar 14 2024 at 17:28):

hm interesting! What would that API look like? :thinking:

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 18:59):

Same as inspect.

Encode.encode someData |> Bson.asBytes

encode and decode generate a type that fulfills an ability. Turning concrete happens when using it with the Bson/Json/etc module.

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 19:00):

Would even allow us to get custom error types, but does make implementing the api more complex (and it is always 2 calls instead of just Encode.encode)

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 19:02):

Cause you could say:

BsonResult := Result Bson CustomError implements ...

As part of the implements you would define encoder on it. Then you would make a Bson function to do the unwrapping.

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 19:03):

Extra note, each individual module can of course define their own single method call version:

# in Bson module
encode = \data -> Encode.encode data |> Bson.asBytes

view this post on Zulip Richard Feldman (Mar 14 2024 at 19:20):

yooooo

view this post on Zulip Richard Feldman (Mar 14 2024 at 19:20):

that's a great idea!

view this post on Zulip Richard Feldman (Mar 14 2024 at 19:20):

that's actually even more convenient for the end user

view this post on Zulip Richard Feldman (Mar 14 2024 at 19:21):

so then instead of:

user = Decode.decode bytes Json.format

...it would become:

user = Json.decode bytes

...and it would have a decoding error that's custom to JSON

view this post on Zulip Richard Feldman (Mar 14 2024 at 19:22):

right?

view this post on Zulip Richard Feldman (Mar 14 2024 at 19:22):

(at the expense of the Decode.decode type becoming more complex, among other things, but in this world only encoder/decoder authors would be working with those types anyway)

view this post on Zulip Fabian Schmalzried (Mar 14 2024 at 19:28):

That's cool. Let me check if I get this right, it would be

Encode.encode : val -> encoded
    where val implements Encoding, encoded implements Encoded

And then I can have EncodedBson implements Encoded and Bson.toBytes: EncodedBson -> Result (List U8) CustomBsonErrors

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 19:32):

Yeah, I believe so. Would need to double check the exact inspect implementation, but that looks right to me.

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 19:34):

And the EncodedBson type would probably be a := Result Bson CustomBsonErrors. So it would try to build up the Bson and if anything fails, it would just propagate the error instead. Of course it can have more context if wanted.

For json, it might be a := Result (List U8) CustomJsonErrors still always going directly to bytes instead of an intermediate enum style type. Or if json encoding can't fail for roc, it would just be a := List U8.
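Editor's sketch (Rust) of the two-call design discussed here: a generic `encode` builds a format-owned value, and each format module exposes its own unwrapping function with its own error type. All names (`Encoded`, `Encoding`, `EncodedBson`, `BsonError`, `EncodedJson`) are hypothetical analogues of the Roc names in the thread, and the bytes produced are placeholders, not real BSON.

```rust
/// Implemented once per format: how to build that format's encoded value.
trait Encoded: Sized {
    fn from_u64(n: u64) -> Self;
    fn from_list(items: Vec<Self>) -> Self;
}

/// Implemented once per data type: format-agnostic encoding.
trait Encoding {
    fn encode<E: Encoded>(&self) -> E;
}

impl Encoding for u64 {
    fn encode<E: Encoded>(&self) -> E {
        E::from_u64(*self)
    }
}

impl<T: Encoding> Encoding for Vec<T> {
    fn encode<E: Encoded>(&self) -> E {
        E::from_list(self.iter().map(|item| item.encode::<E>()).collect())
    }
}

#[derive(Debug, PartialEq)]
enum Bson {
    Int64(i64),
    Array(Vec<Bson>),
}

/// Errors specific to this format.
#[derive(Debug, PartialEq)]
enum BsonError {
    IntTooLarge(u64),
    BareValueAtTopLevel,
}

/// Builds up a Bson value, or carries the first error instead.
struct EncodedBson(Result<Bson, BsonError>);

impl Encoded for EncodedBson {
    fn from_u64(n: u64) -> Self {
        if n <= i64::MAX as u64 {
            EncodedBson(Ok(Bson::Int64(n as i64)))
        } else {
            EncodedBson(Err(BsonError::IntTooLarge(n)))
        }
    }

    fn from_list(items: Vec<Self>) -> Self {
        // Collecting into a Result stops at the first failed element.
        let built: Result<Vec<Bson>, BsonError> =
            items.into_iter().map(|item| item.0).collect();
        EncodedBson(built.map(Bson::Array))
    }
}

/// Placeholder byte writer: little-endian integers, concatenated.
fn write_placeholder(value: &Bson, out: &mut Vec<u8>) {
    match value {
        Bson::Int64(n) => out.extend_from_slice(&n.to_le_bytes()),
        Bson::Array(items) => {
            for item in items {
                write_placeholder(item, out);
            }
        }
    }
}

/// The format's own unwrapping call, with its own error type.
fn bson_to_bytes(encoded: EncodedBson) -> Result<Vec<u8>, BsonError> {
    match encoded.0? {
        Bson::Int64(_) => Err(BsonError::BareValueAtTopLevel),
        array => {
            let mut out = Vec::new();
            write_placeholder(&array, &mut out);
            Ok(out)
        }
    }
}

/// A format that cannot fail just wraps its output directly.
struct EncodedJson(String);

impl Encoded for EncodedJson {
    fn from_u64(n: u64) -> Self {
        EncodedJson(n.to_string())
    }

    fn from_list(items: Vec<Self>) -> Self {
        let parts: Vec<String> = items.into_iter().map(|item| item.0).collect();
        EncodedJson(format!("[{}]", parts.join(",")))
    }
}

fn json_to_string(encoded: EncodedJson) -> String {
    encoded.0
}
```

The same `Encoding` implementations serve both formats: the BSON path returns a `Result` with BSON-specific errors, while the JSON path returns a plain `String` with nothing to handle.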

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 19:35):

I do think this all should work, though I also expect it to be easier for it to break the compiler.

view this post on Zulip Richard Feldman (Mar 14 2024 at 19:45):

given that Inspect already uses the same technique, seems like it should work here too!

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 19:55):

Yeah, I just mean that I know that implementing an inspector "wrong" can easily lead to a compiler crash. So probably would be more common in encode with this change.

view this post on Zulip Richard Feldman (Mar 14 2024 at 20:00):

ah fair :big_smile:

view this post on Zulip Richard Feldman (Mar 14 2024 at 20:01):

but in terms of overall design, that sounds like a worthwhile improvement!

view this post on Zulip Richard Feldman (Mar 14 2024 at 20:44):

I bet this could be prototyped out today, like @Brendan Hansknecht did with the Inspect ability before that landed in builtins!

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 20:44):

100%

view this post on Zulip Brendan Hansknecht (Mar 14 2024 at 20:47):

I think the encode version would be almost identical to inspect. It wouldn't autoderive for everything and would have a slightly changed API, but would be 99% the same as inspect.

Of course the larger work is porting something like Json to this.

view this post on Zulip Luke Boswell (Mar 14 2024 at 21:30):

Might be a good excuse to have another go at "fast" json using the simdjson ideas

view this post on Zulip Luke Boswell (Mar 14 2024 at 21:30):

At least for the decoding side

view this post on Zulip Ayaz Hafiz (Mar 16 2024 at 01:24):

I'm probably missing something but how does this look in practice? Does it mean that the encoding ability must be defined as

# current version
Encoder fmt := List U8, fmt -> List U8 where fmt implements EncoderFormatting

Encoding implements
    toEncoder : val -> Encoder fmt where val implements Encoding, fmt implements EncoderFormatting

# new version
Encoder fmt := fmt -> fmt where fmt implements EncoderFormatting

Encoding implements
    toEncoder : val -> Encoder fmt where val implements Encoding, fmt implements EncoderFormatting

if so, are we sure that this composes in the way that auto-derived implementations must compose? It's also worth noting that this requires at least one call into a function exposed by the specific EncoderFormatting to actually unwrap the value

view this post on Zulip Brendan Hansknecht (Mar 16 2024 at 01:27):

If it works for inspect and inspect can generate any arbitrary type, I think it should work for encode. It really is just inspect with some minor API tweaks. Oh and some autoderive tweaks (e.g. no autoderive for opaques).

view this post on Zulip Brendan Hansknecht (Mar 16 2024 at 01:27):

And yeah, requires one call from the specific encoder type library

view this post on Zulip Ayaz Hafiz (Mar 16 2024 at 19:38):

i’m pretty sure you still want autoderivation for opaque types. That’s a useful feature.

view this post on Zulip Brendan Hansknecht (Mar 16 2024 at 19:40):

I mean you want opt-in auto derive. A user needs to write implements [ Encode ]. With inspect they don't need to write implements at all.


Last updated: Jun 16 2026 at 16:19 UTC