Should `Str.toU128` support `e`? Roc currently supports this for `F32` and `F64`, but not `Dec` or `Int *`. This has come up when adding tests for Json decoding of integers, as large numbers can be validly represented using an exponent in Json, but then cannot be decoded into any Roc type except `F64`.
» Num.toStr 340_000_000_000_000_000_000_000_000_000_000_000_000u128
"340000000000000000000000000000000000000" : Str # val20
» Str.toU128 "340000000000000000000000000000000000000"
Ok 340000000000000000000000000000000000000 : Result U128 [InvalidNumStr] # val21
» Str.toU128 "34e37"
Err InvalidNumStr : Result U128 [InvalidNumStr] # val22
Also, should we support `E` and `+`, e.g. `12e+2`, `12E+2`? These are currently not supported but are commonly used across various programming languages and number formats (XML, Python, etc.).
moved to a separate idea thread
And a related question: json numbers technically should only ever be encoded as a double-precision float and therefore should be a maximum of 21 bytes. However, if we encode a U128 using Roc it could be "340282366920938463463374607431768211455", which is 39 bytes long. So should we support a max number of bytes in a json string for Decoding of the max for a double-precision float (21 bytes), OR the max for a naive Roc (`Num.toStr`) encoding of the max U128 (39 bytes)?
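For reference, the 39-byte figure is just the decimal digit count of the U128 maximum; a quick check (Python here purely for the arithmetic):

```python
# Max U128 is 2^128 - 1; its decimal form is 39 ASCII bytes long.
max_u128 = 2**128 - 1
print(max_u128)            # 340282366920938463463374607431768211455
print(len(str(max_u128)))  # 39
```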
Shouldn't json decode all types to f64 then convert to the correct type?
Cause the only json native type is f64. Of course you need to check errors when converting from f64 to other types
I'm not sure. My first thought was to collect the bytes that make up a valid json number, and then try to convert to the desired Roc type. What you suggest may be easier and more reliable. Wouldn't that mean there are two conversions, though: `Str -> F64 -> U128` etc.?
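To make the two-conversion idea concrete, here is a hypothetical sketch in Python (the function name and error tags are made up for illustration, not Roc's API): parse the number text to a 64-bit float first, then check that it converts to the integer target without a fraction part or overflow.

```python
def decode_u128(text: str):
    # Step 1: Str -> F64 (Python floats are 64-bit doubles).
    try:
        as_float = float(text)
    except ValueError:
        return ("Err", "InvalidNumStr")
    # Step 2: F64 -> U128, rejecting fraction parts and out-of-range values.
    as_int = int(as_float)
    if as_int != as_float or not (0 <= as_int < 2**128):
        return ("Err", "InvalidNumStr")
    return ("Ok", as_int)

print(decode_u128("34e37"))  # Ok: 3.4e38 fits in a U128
print(decode_u128("7.3"))    # Err: has a fraction part
```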
That is fair. I guess it isn't needed, but is `7.000` a valid U128? In js it is the same thing as `7`. Idk. Also, i think the cost of f64 to other type conversion will be small compared to parsing bytes
Is this what you mean @Brendan Hansknecht
decodeU16 = Decode.custom \bytes, @Json {} ->
    { taken, rest } = takeJsonNumber bytes
    result =
        taken
        |> Str.fromUtf8
        |> Result.try Str.toU16
        |> Result.mapErr \_ -> TooShort

    { result, rest }
decodeU16 = Decode.custom \bytes, @Json {} ->
    { taken, rest } = takeJsonNumber bytes
    result =
        taken
        |> Str.fromUtf8
        |> Result.try Str.toF64
        |> Result.map Num.round
        |> Result.map Num.toU16
        |> Result.mapErr \_ -> TooShort

    { result, rest }
I guess one downside is we can't support anything larger than an F64, but I guess that is a limitation with JSON. A workaround could be to use a Str or something and handle it in roc. I think what you have suggested is probably the right way to go
No. I think if you need to round, that would be an error.
Even if I am specifically trying to decode into a U16?
Yeah, cause `7.3` is not a U16
That should definitely be a decoding failure.
I would think that if an end user specifically needs a big int, they will have to deal with conversions. Store in a string or multiple ints. I don't think the json decoder should automatically deal with it, but i may be wrong. I have not worked a ton with json decoding and specs.
I guess for encoding, this is where it really gets problematic. You don't want encoding a U64 to fail or lose data because it can't fit losslessly in a F64.
This works, but is it too hacky? We could make a builtin that checks for a fraction part in a float
decodeU16 = Decode.custom \bytes, @Json {} ->
    { taken, rest } = takeJsonNumber bytes
    result =
        taken
        |> Str.fromUtf8
        |> Result.try Str.toF64
        |> Result.try hasNoFractionPart
        |> Result.map Num.round
        |> Result.map Num.toU16
        |> Result.mapErr \_ -> TooShort

    { result, rest }

hasNoFractionPart : F64 -> Result F64 [HasFractionPart]
hasNoFractionPart = \a ->
    fraction = Num.floor ((a - Num.toFrac (Num.floor (a / 1.0)) * 1.0) * 1000)
    if fraction == 0 then
        Ok a
    else
        Err HasFractionPart

expect
    result = hasNoFractionPart 12.0
    Result.isOk result

expect
    result = hasNoFractionPart 12.1
    Result.isErr result
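As an aside, the same check can be expressed more directly: a value has no fraction part exactly when it equals its own floor. A Python sketch of that idea (not Roc code, just the arithmetic):

```python
import math

def has_no_fraction_part(a: float) -> bool:
    # A float is a whole number exactly when flooring changes nothing.
    return a == math.floor(a)

print(has_no_fraction_part(12.0))  # True
print(has_no_fraction_part(12.1))  # False
```

One caveat with the `* 1000` approach above: fractions smaller than 0.001 (e.g. `12.0001`) floor to zero after scaling, so they would be reported as having no fraction part.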
Actually, this has a problem ... we need to use `Result.try Num.toU16Checked` instead
I don't think you need the round anymore.
That said, as i am thinking about this more especially in the context of encode, i am a lot less sure which approach is better. I think we definitely should look at what other tools do. For example serde_json.
I might just leave this for now, and add some TODO comments for a later deep dive. I'm focussing right now on adding more test coverage and identifying issues like this. Don't want to get too off course here.
Also, i would advise making an idea thread specifically for adding `e` to parsing with integer types. I feel like this one has been pretty derailed at this point.
Maybe relevant comments from a serde_json issue on u128 and i128. Also comments from an issue on precision loss and representing as strings
They look to lean into how JS defines a number and not explicitly support any large numbers. So you hit issues with anything larger than the max i54. They also do not support u128 and i128 by default.
For Decoding, it defers to the `Str` builtins for this, e.g. `Str.toI128`. It takes the bytes for a valid json number (double-precision float) and then attempts to convert them to the desired Roc number type. If that fails, then it is a decoding failure.
The story for Encoding is less compliant with Json right now: we just use `Num.toStr`, which works fine 90% of the time, but will include far too many bytes for valid json if a large number like a U128, Dec, or precise float is used. I'm not sure what our preferred behaviour in these situations should be; we don't have any errors and can't fail when encoding. Would we want to panic in this situation?
Wait decode gets errors, but not encode? I'm sure we made this decision for a reason, but sounds strange. I'm sure with certain output formats, encoding will definitely have error cases that should get reported.
At least according to Wikipedia, json doesn't really specify anything about number precision:
Numbers in JSON are agnostic with regard to their representation within programming languages. While this allows for numbers of arbitrary precision to be serialized, it may lead to portability issues. For example, since no differentiation is made between integer and floating-point values, some implementations may treat 42, 42.0, and 4.2E+1 as the same number, while others may not. The JSON standard makes no requirements regarding implementation details such as overflow, underflow, loss of precision, rounding, or signed zeros, but it does recommend expecting no more than IEEE 754 binary64 precision for "good interoperability". (https://en.wikipedia.org/wiki/JSON)
but it does recommend expecting no more than IEEE 754 binary64 precision for "good interoperability".
This line is exceptionally important.
If you encode data in json as an arbitrary-precision number, for example `{"myint": 9223372036854775807}`, that precision will be lost in all browsers. `myint` may claim to be `9223372036854775807`, but in reality, when it is loaded on the frontend, it will be `9223372036854776000`. This can be a nasty footgun.
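The round trip is easy to reproduce (Python below; JS displays the resulting double as 9223372036854776000, Python prints its exact integer value):

```python
# i64::MAX does not survive a round trip through a 64-bit float,
# which is the only number type a browser's JSON.parse produces.
original = 9223372036854775807   # 2**63 - 1
roundtripped = int(float(original))
print(roundtripped)              # 9223372036854775808
print(roundtripped == original)  # False
```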
So even though json does not specify precision, we should take it into account in order to build robust systems. Numbers that are too large should not be encoded as numbers in json. They should be strings or some sort of special large number format.
we should change encode to return a `Result` so it's allowed to fail based on the value
because `I64` can also be too big to represent in `F64` without precision loss
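A sketch of what that encode-side check could look like (Python; the function name and error tag are illustrative, not a proposed Roc API): emit the integer as a JSON number only if it round-trips through an f64 unchanged.

```python
def encode_int(n: int):
    # Emit as a JSON number only if n survives int -> f64 -> int.
    if int(float(n)) == n:
        return ("Ok", str(n))
    return ("Err", "LossOfPrecision")

print(encode_int(2**53))      # Ok -- 2**53 is exactly representable
print(encode_int(2**63 - 1))  # Err -- would lose precision
```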
With all of this, here is my current thought of something that could work nicely:
Take `F64` restrictions into account. When encoding, make sure the value can fit into an f64 without precision loss. If that is the case, encode it. Otherwise, return an error due to loss of precision.
When decoding, essentially decode to f64, then make sure the value can successfully convert to the correct number type without precision loss. So `7.0` is fine as a u16, but `7.3` is not.
Note: Dec may make this complicated. We probably should just always encode Dec as a Str. Given precision is very important to Dec, we need to be extra careful. We don't want 10.30 dollars to become 10.300000001 dollars or similar.
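Python's `Decimal` can stand in for Roc's `Dec` to show the hazard: building the value from a binary float picks up noise, while building it from the decimal text stays exact.

```python
from decimal import Decimal

via_string = Decimal("10.30")  # exact: constructed from the decimal text
via_float = Decimal(10.30)     # inexact: constructed from a binary double

print(via_string)                # 10.30
print(via_float == via_string)   # False -- the float carries extra digits
```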
Essentially, parameterize json encoding and decoding with a number of options.
One of those options would be to enable arbitrary-precision mode. In that mode, all number types that could lose precision when converting to f64 (maybe all number types in general?) would just be encoded as strings.
Another option could be ignoring precision loss and just converting all numbers to F64 without failure on precision loss.
This is probably not gonna work super nicely in Roc, but still will likely be important for some applications. The options to support this that I can think of are either to let the user do it manually by converting the type to a String, or adding some sort of opaque wrapper around a type that is the version that encodes as a string. Neither of these sounds great; maybe someone else has a better way that we could support this. I think an opaque wrapper is the only way in roc to get a custom encode for a type.
This is nice in rust for example cause you can do it with an annotation to the field.
Last updated: Jul 05 2025 at 12:14 UTC