Stream: beginners

Topic: Number formats


view this post on Zulip Luke Boswell (Apr 23 2023 at 02:16):

Should Str.toU128 support e? Roc currently supports this for F32 and F64, but not Dec or Int *. This has come up when adding tests for Json decoding of integers, as large numbers can be validly represented using an exponent in Json, but then cannot be decoded into any Roc type except F64.

» Num.toStr 340_000_000_000_000_000_000_000_000_000_000_000_000u128

"340000000000000000000000000000000000000" : Str
                         # val20

» Str.toU128 "340000000000000000000000000000000000000"

Ok 340000000000000000000000000000000000000 : Result U128 [InvalidNumStr]
                         # val21

» Str.toU128 "34e37"

Err InvalidNumStr : Result U128 [InvalidNumStr]
                         # val22

view this post on Zulip Luke Boswell (Apr 23 2023 at 02:16):

Also, should we support E and +, e.g. 12e+2, 12E+2? These are currently not supported, but are commonly used across various programming languages and number formats (XML, Python, etc.).

moved to a separate idea thread
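For reference, the current behavior per the messages above (a sketch of the expected REPL output; exact formatting may differ):

» Str.toF64 "12e2"

Ok 1200 : Result F64 [InvalidNumStr]

» Str.toF64 "12e+2"

Err InvalidNumStr : Result F64 [InvalidNumStr]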

view this post on Zulip Luke Boswell (Apr 23 2023 at 02:16):

And a related question: json numbers technically should only ever be encoded as a double-precision float, and therefore should be a maximum of 21 bytes. However, if we encode a U128 using Roc it could be "340282366920938463463374607431768211455", which is 39 bytes long. So when decoding a json number, should we support a maximum length matching a double-precision float (21 bytes), OR matching a naive Roc Num.toStr of the max U128 (39 bytes)?
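The 39-byte figure can be checked directly (a sketch; it assumes the Num.maxU128 and Str.countUtf8Bytes builtins):

expect Str.countUtf8Bytes (Num.toStr Num.maxU128) == 39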

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:23):

Shouldn't json decode all types to f64 then convert to the correct type?

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:26):

Cause the only json native type is f64. Of course you need to check errors when converting from f64 to other types

view this post on Zulip Luke Boswell (Apr 23 2023 at 02:28):

I'm not sure. My first thought was to collect the bytes that make up a valid json number, and then try to convert them to the desired Roc type. What you suggest may be easier and more reliable. Wouldn't that mean there are two conversions, though: Str -> F64 -> U128, etc.?

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:32):

That is fair. I guess it isn't needed, but is 7.000 a valid U128? In JS it is the same thing as 7. Idk. Also, I think the cost of converting f64 to another type will be small compared to parsing bytes.

view this post on Zulip Luke Boswell (Apr 23 2023 at 02:34):

Is this what you mean, @Brendan Hansknecht?

Current

decodeU16 = Decode.custom \bytes, @Json {} ->
    { taken, rest } = takeJsonNumber bytes

    result =
        taken
        |> Str.fromUtf8
        |> Result.try Str.toU16
        |> Result.mapErr \_ -> TooShort

    { result, rest }

Proposed

decodeU16 = Decode.custom \bytes, @Json {} ->
    { taken, rest } = takeJsonNumber bytes

    result =
        taken
        |> Str.fromUtf8
        |> Result.try Str.toF64
        |> Result.map Num.round
        |> Result.map Num.toU16
        |> Result.mapErr \_ -> TooShort

    { result, rest }

view this post on Zulip Luke Boswell (Apr 23 2023 at 02:35):

I guess one downside is we can't support anything larger than an F64 can represent, but I guess that is a limitation of JSON. A workaround could be to use a Str or something and handle it in Roc. I think what you have suggested is probably the right way to go.
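A minimal sketch of that workaround, with the large number carried as a json string and converted by the Str builtins (the helper name here is hypothetical):

# the wire format stays within json's interoperable range by using a
# string; Roc does the U128 conversion after decoding
bigU128FromStr : Str -> Result U128 [InvalidNumStr]
bigU128FromStr = \str -> Str.toU128 str

expect bigU128FromStr "340282366920938463463374607431768211455" == Ok 340282366920938463463374607431768211455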

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:35):

No. I think if you need to round, that would be an error.

view this post on Zulip Luke Boswell (Apr 23 2023 at 02:36):

Even if I am specifically trying to decode into a U16?

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:37):

Yeah, cause 7.3 is not a U16

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:37):

That should definitely be a decoding failure.

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:40):

I would think that if an end user specifically needs a big int, they will have to deal with conversions: store it in a string or in multiple ints. I don't think the json decoder should automatically deal with it, but I may be wrong. I have not worked a ton with json decoding and specs.

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 02:41):

I guess for encoding, this is where it really gets problematic. You don't want encoding a U64 to fail or lose data because it can't fit losslessly in an F64.
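Concretely (Num.toF64 rounds to the nearest representable value, per IEEE 754):

# 2^53 + 1 is the first integer an F64 cannot represent exactly; it
# rounds down to 2^53, so two different U64s map to the same F64
expect Num.toF64 9_007_199_254_740_993u64 == Num.toF64 9_007_199_254_740_992u64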

view this post on Zulip Luke Boswell (Apr 23 2023 at 03:01):

This works, but is it too hacky? We could make a builtin that checks for a fraction part in a float:

decodeU16 = Decode.custom \bytes, @Json {} ->
    { taken, rest } = takeJsonNumber bytes

    result =
        taken
        |> Str.fromUtf8
        |> Result.try Str.toF64
        |> Result.try hasNoFractionPart
        |> Result.map Num.round
        |> Result.map Num.toU16
        |> Result.mapErr \_ -> TooShort

    { result, rest }

hasNoFractionPart : F64 -> Result F64 [HasFractionPart]
hasNoFractionPart = \a ->
    fraction = Num.floor((a-Num.toFrac(Num.floor(a/1.0))*1.0)*1000)

    if fraction == 0 then
        Ok a
    else
        Err HasFractionPart

expect
    result = hasNoFractionPart 12.0

    Result.isOk result

expect
    result = hasNoFractionPart 12.1

    Result.isErr result
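For comparison, a less arithmetic-heavy version of the same check (a sketch; it compares the value against its own floor, and assumes the input is small enough that the floor's integer result does not overflow):

hasNoFractionPart : F64 -> Result F64 [HasFractionPart]
hasNoFractionPart = \a ->
    # flooring and converting back detects a fractional part directly,
    # without the scale-by-1000 trick
    if Num.toFrac (Num.floor a) == a then
        Ok a
    else
        Err HasFractionPart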

view this post on Zulip Luke Boswell (Apr 23 2023 at 03:05):

Actually, this has a problem ... we need to use Result.try Num.toU16Checked instead
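A sketch of the pipeline with that fix (Num.toU16Checked returns Err OutOfBounds rather than wrapping; the round remains only as a Frac-to-Int step, which is exact after hasNoFractionPart, though a huge F64 would still need a range guard before it):

result =
    taken
    |> Str.fromUtf8
    |> Result.try Str.toF64
    |> Result.try hasNoFractionPart
    |> Result.map Num.round
    |> Result.try Num.toU16Checked
    |> Result.mapErr \_ -> TooShort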

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 03:05):

I don't think you need the round anymore.

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 03:06):

That said, as I am thinking about this more, especially in the context of encode, I am a lot less sure which approach is better. I think we should definitely look at what other tools do, for example serde_json.

view this post on Zulip Luke Boswell (Apr 23 2023 at 03:10):

I might just leave this for now, and add some TODO comments for a later deep dive. I'm focusing right now on adding more test coverage and identifying issues like this. Don't want to get too off course here.

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 03:16):

Also, I would advise making an idea thread specifically for adding e to parsing with integer types. I feel like this one has been pretty derailed at this point.

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 03:25):

Maybe relevant: comments from a serde_json issue on u128 and i128. Also, comments from an issue on precision loss and representing numbers as strings.

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 03:28):

They look to lean into how JS defines a number and do not explicitly support any large numbers, so you hit issues with anything larger than JS's max safe integer (2^53 - 1, i.e. the max of a 54-bit signed int). They also do not support u128 and i128 by default.

view this post on Zulip Luke Boswell (Apr 23 2023 at 03:38):

For decoding, it defers to the Str builtins, e.g. Str.toI128. It takes the bytes of a valid json number (nominally a double-precision float) and then attempts to convert them to the desired Roc number type. If that fails, it is a decoding failure.

The story for encoding is less compliant with json right now: we just use Num.toStr, which works fine 90% of the time, but will produce far more digits than interoperable json allows if a large number like a U128, a Dec, or a high-precision float is used. I'm not sure what our preferred behaviour in these situations should be; we don't have any error channel and can't fail when encoding. Would we want to panic in this situation?

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 03:58):

Wait, decode gets errors, but not encode? I'm sure we made this decision for a reason, but it sounds strange. I'm sure that with certain output formats, encoding will definitely have error cases that should get reported.

view this post on Zulip Ajai Nelson (Apr 23 2023 at 04:42):

At least according to Wikipedia, json doesn't really specify anything about number precision:

Numbers in JSON are agnostic with regard to their representation within programming languages. While this allows for numbers of arbitrary precision to be serialized, it may lead to portability issues. For example, since no differentiation is made between integer and floating-point values, some implementations may treat 42, 42.0, and 4.2E+1 as the same number, while others may not. The JSON standard makes no requirements regarding implementation details such as overflow, underflow, loss of precision, rounding, or signed zeros, but it does recommend expecting no more than IEEE 754 binary64 precision for "good interoperability". (https://en.wikipedia.org/wiki/JSON)

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 18:00):

but it does recommend expecting no more than IEEE 754 binary64 precision for "good interoperability".

This line is exceptionally important.

If you encode data in json as an arbitrary precision number, for example {"myint": 9223372036854775807}, that precision will be lost in all browsers. myint may claim to be 9223372036854775807, but in reality, when it is loaded on the frontend, it will be 9223372036854776000. This can be a nasty footgun.

So even though json does not specify precision, we should take it into account in order to build robust systems. Numbers that are too large should not be encoded as numbers in json. They should be strings or some sort of special large number format.

view this post on Zulip Richard Feldman (Apr 23 2023 at 19:21):

we should change encode to return a Result so it's allowed to fail based on the value

view this post on Zulip Richard Feldman (Apr 23 2023 at 19:21):

because I64 can also be too big to represent in F64 without precision loss

view this post on Zulip Brendan Hansknecht (Apr 23 2023 at 19:58):

With all of this, here is my current thought on something that could work nicely:

1. By default, always take F64 restrictions into account.

When encoding, make sure the value can fit into an f64 without precision loss. If that is the case, encode it. Otherwise, return an error due to loss of precision (see the sketch after this message).

When decoding, essentially decode to f64, then make sure the value can successfully convert to the correct number type without precision loss. So 7.0 is fine as a u16, but 7.3 is not.

Note: Dec may make this complicated. We probably should just always encode Dec as a Str. Given precision is very important to Dec, we need to be extra careful. We don't want 10.30 dollars to become 10.300000001 dollars or similar.

2. Have parameterization to allow ease of use.

Essentially, parameterize json encoding and decoding with a number of options.

One of those options would be to enable an arbitrary precision mode. In that mode, all number types that could lose precision when converting to f64 (maybe all number types in general?) would just be encoded as strings.

Another option could be to ignore precision loss and just convert all numbers to F64 without failing on precision loss.

3. Have a way to enable a specific field to be encoded as a string even though it is a number.

This is probably not gonna work super nicely in Roc, but it will still likely be important for some applications. The options to support this that I can think of are either to let the user do it manually by converting the type to a Str, or to add some sort of opaque wrapper around a type that is the version that encodes as a string. Neither of these sounds great; maybe someone else has a better way we could support this. I think an opaque wrapper is the only way in Roc to get a custom encode for a type.

This is nice in Rust, for example, because you can do it with an attribute on the field.
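A minimal sketch of the lossless-F64 check from point 1 (the helper name is hypothetical; every integer up to 2^53 is exactly representable as an F64, so this conservative version accepts only that range, rejecting larger values even when, like 2^60, they happen to be representable):

# Hypothetical helper, not a builtin: conservatively report whether a
# U64 can round-trip through F64. All integers up to 2^53 are exact.
fitsInF64Losslessly : U64 -> Bool
fitsInF64Losslessly = \n -> n <= 9_007_199_254_740_992

expect fitsInF64Losslessly 9_007_199_254_740_992 # 2^53 is exact
expect !(fitsInF64Losslessly 9_223_372_036_854_775_807) # Num.maxI64 is not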

