encoding/decoding Num * · ideas · Zulip Chat Archive

I'm thinking about what the behavior for deriving encoding/decoding for Num * should be. Unlike certain low-level operations like Num.toStr : Num * -> Str, implementors of the encoders/decoders don't have access to the compiler's code-generation procedure. That means they can't decide how to encode/decode a number after monomorphization.

Actually, for encoding, it might be okay, since you could maybe piggyback off numeric operations to figure out a number's width, or use Num.toStr. But for decoding it would not work - if your serialization format represents integers as fixed-width bytes, how can you decode an arbitrary-width integer?

decodeNum = \decodeFmt, bytes ->
    when Decode.fromBytes bytes decodeFmt is
        Ok num ->
            # num is only ever passed to Num.toStr, so its type is never narrowed beyond Num *
            strNum = Num.toStr num
            "Your number was \(strNum)"
        Err _ -> "I couldn't decode a number"

However, the implications of this are more serious for serialization formats where numbers are sent as fixed-width bytes rather than strings. For example, if we're decoding a buffer that has a number serialized as 4 big-endian bytes into a Num *, the compiler deciding that we are really decoding into an I64 is no good. These are the kind of bugs that would be really difficult to debug, especially if you're not meticulously looking at the types in your program.
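To make the fixed-width problem concrete, here is a minimal sketch in Python (not Roc), using the standard struct module. The buffer layout (two unsigned 32-bit fields back to back) is an assumption made up for illustration; it is not taken from any real format discussed here.

```python
import struct

# Assumed layout: two unsigned 32-bit integers, big-endian, back to back.
buf = struct.pack(">II", 1, 2)  # 8 bytes: 00 00 00 01 00 00 00 02

# Decoding with the intended 4-byte width reads the first field correctly.
(as_u32,) = struct.unpack_from(">I", buf, 0)

# Decoding the same position as a signed 64-bit integer silently consumes
# both fields and produces an unrelated value.
(as_i64,) = struct.unpack_from(">q", buf, 0)

# With only 4 bytes available, the 64-bit read fails outright.
try:
    struct.unpack(">q", buf[:4])
    short_read_failed = False
except struct.error:
    short_read_failed = True
```

Either outcome (a wrong value or a failed read) comes purely from the width the decoder assumed, which is why a silently defaulted width is hazardous for binary formats.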

With that, I'm inclined to think we should make this a warning or error in the compiler. To get some advantages in development, we could issue a warning/error, but also infer the default number type in dev builds - that way if something goes wrong, the developer has a hint of where to look. They'll need to add an explicit type annotation before productionizing the code anyway, to get rid of the error. Thoughts?

Brian Carroll (Aug 17 2022 at 16:52):

Yeah I think it should be a warning or error. If you're running CI on your Roc app it should fail.

Brian Carroll (Aug 17 2022 at 16:53):

Brian Carroll (Aug 17 2022 at 17:04):

Even for JSON, where 'it might work on a good day', the app developer should be forced to resolve that ambiguity. I think it's one of the key features of languages like Roc/Elm/Haskell that they force you to deal with stuff like this.

Richard Feldman (Aug 17 2022 at 17:59):

I think warning but not error makes the most sense for decoding - e.g. it says "warning: since this number is only ever used with other Num * values, it will be decoded into an I64 by default - but this default behavior may not be what you want! You could be more specific about the number type you expect by adding a type annotation, or by passing it to a function which requires a more specific number type than Num *."

(And maybe a similar warning for Int *, Frac *, etc.) - so it doesn't block you if you're just prototyping an implementation that decodes from JSON, but it does warn you about it so you know how to make it more explicit

Richard Feldman (Aug 17 2022 at 18:01):

I think the same case could be made for encoding, too - e.g. if I'm encoding to a binary format and I just give it Num *, and it defaults to encoding with I64, is I64 what the recipient of the encoded data expects? Maybe, maybe not - the only way to be sure is to be explicit
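The encoding side can be sketched the same way. This is a Python illustration, not Roc; the struct format codes stand in for I64, U32 and U8, and the value 42 is arbitrary.

```python
import struct

value = 42

# The same number produces different byte strings depending on the
# width the encoder picks.
as_i64 = struct.pack(">q", value)  # stands in for I64: 8 bytes
as_u32 = struct.pack(">I", value)  # stands in for U32: 4 bytes
as_u8 = struct.pack(">B", value)   # stands in for U8: 1 byte
```

A recipient expecting 4 bytes would misread the 8-byte form, so the width chosen by default is part of the wire format whether or not the author thought about it.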

Ayaz Hafiz (Aug 17 2022 at 18:03):

Ayaz Hafiz (Aug 17 2022 at 18:05):

I don't think it's feasible, either, to add an anyNumber : Num * -> Encoder (Num *) fmt | fmt has EncoderFormatting member to the EncoderFormatting ability, for the reasons described above - the implementer of this ability member would not have a good way of determining how wide the passed number is, without perhaps some clever hacks

Richard Feldman (Aug 17 2022 at 18:12):

Ayaz Hafiz (Aug 17 2022 at 18:15):

actually, I guess in principle, there could be a new Num.toBigEndianBytes : Num * -> List U8 and Num.toLittleEndianBytes : Num * -> List U8 that could resolve the anyNumber problem
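A rough Python sketch of what such helpers might compute. The function names mirror the hypothetical Num.toBigEndianBytes and Num.toLittleEndianBytes above; the explicit width and signed parameters are an assumption standing in for information that, in Roc, would only be known from the number's concrete type after monomorphization.

```python
def to_big_endian_bytes(value: int, width: int, signed: bool) -> bytes:
    # width is the byte size of the concrete number type (e.g. 4 for U32).
    return value.to_bytes(width, byteorder="big", signed=signed)


def to_little_endian_bytes(value: int, width: int, signed: bool) -> bytes:
    # Same bytes as above, in reverse order.
    return value.to_bytes(width, byteorder="little", signed=signed)
```

For example, to_big_endian_bytes(258, 4, False) gives the bytes 00 00 01 02, and the little-endian variant gives 02 01 00 00.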

Richard Feldman (Aug 17 2022 at 18:15):

Ayaz Hafiz (Aug 17 2022 at 18:16):

might be fine after constant lists though, since the list can then be represented with a zero capacity and the elements are just the number. so no allocations

Ayaz Hafiz (Aug 17 2022 at 18:18):

although I still don't think it would be worth adding for just Encode even if that's possible. Better to be consistent between encode and decode where we can be

Stream: ideas

Topic: encoding/decoding Num *

Ayaz Hafiz (Aug 17 2022 at 16:20):