Stream: ideas

Topic: Builtin Binary Encoding/Decoding


view this post on Zulip Luke Boswell (Dec 27 2023 at 06:43):

Follow on from this discussion

Builtin Binary Encoding/Decoding

Bool.toBytes : Bool -> (U8)
Bool.fromBytes : (U8) -> Bool

Num.u8ToBytes : U8 -> (U8)
Num.u8FromBytes : (U8) -> U8

Num.u16ToBytes : U16, [BE, LE] -> (U8, U8)
Num.u16FromBytes : (U8, U8), [BE, LE] -> U16

Num.u32ToBytes : U32, [BE, LE] -> (U8, U8, U8, U8)
Num.u32FromBytes : (U8, U8, U8, U8), [BE, LE] -> U32

Num.u64ToBytes : U64, [BE, LE] -> (U8, U8, U8, U8, U8, U8, U8, U8)
Num.u64FromBytes : (U8, U8, U8, U8, U8, U8, U8, U8), [BE, LE] -> U64

Num.u128ToBytes : U128, [BE, LE] -> (U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8)
Num.u128FromBytes : (U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8), [BE, LE] -> U128

Num.i8ToBytes : I8 -> (U8)
Num.i8FromBytes : (U8) -> I8

Num.i16ToBytes : I16, [BE, LE] -> (U8, U8)
Num.i16FromBytes : (U8, U8), [BE, LE] -> I16

Num.i32ToBytes : I32, [BE, LE] -> (U8, U8, U8, U8)
Num.i32FromBytes : (U8, U8, U8, U8), [BE, LE] -> I32

Num.i64ToBytes : I64, [BE, LE] -> (U8, U8, U8, U8, U8, U8, U8, U8)
Num.i64FromBytes : (U8, U8, U8, U8, U8, U8, U8, U8), [BE, LE] -> I64

Num.i128ToBytes : I128, [BE, LE] -> (U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8)
Num.i128FromBytes : (U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8), [BE, LE] -> I128

IEEE-754-2008 binary32
Num.f32ToBytes : F32, [BE, LE] -> (U8, U8, U8, U8)
Num.f32FromBytes : (U8, U8, U8, U8), [BE, LE] -> F32

IEEE-754-2008 binary64
Num.f64ToBytes : F64, [BE, LE] -> (U8, U8, U8, U8, U8, U8, U8, U8)
Num.f64FromBytes : (U8, U8, U8, U8, U8, U8, U8, U8), [BE, LE] -> F64

Num.decToBytes : Dec, [BE, LE] -> (U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8)
Num.decFromBytes : (U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8, U8), [BE, LE] -> Dec
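
As a rough illustration of what the BE/LE argument means in the signatures above, here is a minimal sketch in Python (not Roc, since these builtins are only proposals). The function names u32_to_bytes, u32_from_bytes and f64_to_bytes are hypothetical stand-ins for the proposed Num functions, and the strings "big" and "little" stand in for the BE and LE tags.

```python
import struct

# Hypothetical stand-in for Num.u32ToBytes: split a U32 into four bytes.
def u32_to_bytes(value: int, order: str) -> tuple:
    # order is "big" (BE, most significant byte first) or "little" (LE)
    return tuple(value.to_bytes(4, order))

# Hypothetical stand-in for Num.u32FromBytes: rebuild the U32 from four bytes.
def u32_from_bytes(parts: tuple, order: str) -> int:
    return int.from_bytes(bytes(parts), order)

# Hypothetical stand-in for Num.f64ToBytes: IEEE 754 binary64 layout.
def f64_to_bytes(value: float, order: str) -> tuple:
    fmt = ">d" if order == "big" else "<d"
    return tuple(struct.pack(fmt, value))

print(u32_to_bytes(0x12345678, "big"))     # (18, 52, 86, 120)
print(u32_to_bytes(0x12345678, "little"))  # (120, 86, 52, 18)
```

The same four bytes come back in reverse order for LE, and a round trip through the matching from-bytes function returns the original number.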

Not included

I think these can be managed using just the Encode and Decode abilities, so they are not required.

view this post on Zulip Fabian Schmalzried (Dec 27 2023 at 18:37):

I can see basically two use cases for this:

  1. I want to get / manipulate a given byte. Then a tuple would be the right choice.
  2. Encoding/Decoding to a specific binary format to send to disk or over the wire (protobuf, BSON, etc.). Since the Encode/Decode abilities work with List U8, getting a List U8 directly would be more ergonomic.

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:03):

Yeah, it is definitely a tradeoff between ergonomics and avoiding allocations

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:06):

@Richard Feldman do we have any ideas or plans around going from tuples to lists and back?

I think the biggest issue is that tuples of two different sizes are not the same type...so how would we generate that? I would guess it would have to be syntax sugar if we support it at all....which isn't great.

Just cause plans around that could shape this API.

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:07):

The API above is the performant, allocation-avoiding API

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:10):

Another option is to use lists and simplify the API at the cost of always allocating:

Num.toBytes : Num a, [BE, LE] -> List U8
Num.fromBytes : List U8, [BE, LE] -> Result (Num a) [WrongNumberOfBytes]
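
To make the Result in that signature concrete, here is a minimal sketch in Python of a length-checked decode. Everything here is an assumption for illustration: the name from_bytes, the explicit width argument (Roc would infer the width from the number type), and the ("Ok", ...) / ("Err", ...) pairs standing in for Roc's Result.

```python
# Hypothetical stand-in for the list-based Num.fromBytes.
def from_bytes(data: bytes, width: int, order: str):
    # A list can have any length, so the decode has to be able to fail.
    if len(data) != width:
        return ("Err", "WrongNumberOfBytes")
    return ("Ok", int.from_bytes(data, order))

print(from_bytes(b"\x01\x02", 2, "big"))  # ('Ok', 258)
print(from_bytes(b"\x01", 2, "big"))      # ('Err', 'WrongNumberOfBytes')
```

The tuple-based signatures avoid this error case entirely, because the tuple type already fixes the number of bytes.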

view this post on Zulip Fabian Schmalzried (Dec 27 2023 at 19:10):

Could we have both? Num.toBytesTuple and Num.toBytesList

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:12):

I mean, we could...though I really wouldn't be a fan of the API as a whole.

view this post on Zulip Richard Feldman (Dec 27 2023 at 19:12):

suppose this API is just tuples and not lists - what specific use cases are painful if there's no way to go from tuples to lists?

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:13):

Also, in the Encode case, you don't actually want toBytes to create a list. Cause you probably want to take the bytes and append them to an existing list.

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:13):

Cause generally encode is building up a buffer as a whole. No reason to waste an allocation on the extra list

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:14):

For decode, you definitely want to be able to take a seamless slice as input, to go from a List U8 buffer to a number. Of course you can always use when ... is to extract the tuple of values from the list or return an error, but that is quite inconvenient.
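
A minimal sketch in Python of decoding a number straight out of a larger buffer at an offset, without first copying the bytes into a separate list. The name decode_u32_at and the ("Ok", ...) / ("Err", ...) pairs are hypothetical; struct.unpack_from is the real standard-library call that reads from a buffer at a given offset.

```python
import struct

# Hypothetical helper: read a U32 from `buffer` starting at `offset`.
def decode_u32_at(buffer: bytes, offset: int, order: str):
    fmt = ">I" if order == "big" else "<I"
    # Fail if fewer than four bytes are available at the offset.
    if offset < 0 or offset + 4 > len(buffer):
        return ("Err", "WrongNumberOfBytes")
    (value,) = struct.unpack_from(fmt, buffer, offset)
    return ("Ok", value)

message = bytes([0xAA, 0x00, 0x00, 0x01, 0x00, 0xBB])
print(decode_u32_at(message, 1, "big"))     # ('Ok', 256)
print(decode_u32_at(message, 1, "little"))  # ('Ok', 65536)
```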

view this post on Zulip Fabian Schmalzried (Dec 27 2023 at 19:17):

(b0, b1, b2, b3, b4, b5, b6, b7) = Num.f64ToBytes myFloat LE
List.concat buffer [b0, b1, b2, b3, b4, b5, b6, b7]

I assume that's how it would be used?

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 19:21):

Sadly, in practice.....yeah....which means the temporary list would still be generated. Cause really it should be:

(b0, b1, ..., b7) = ...
buffer
|> List.reserve 8
|> List.append b0
|> List.append b1
|> ...
|> List.append b7

view this post on Zulip Fabian Schmalzried (Dec 27 2023 at 19:25):

Maybe a helper function for exactly that? Num.appendBytesToList : Num a, [LE, BE], List U8 -> List U8
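
A minimal sketch in Python of what such a helper would do from the caller's point of view. The name append_bytes_to_list, the buffer-first argument order, and the explicit width argument are assumptions for illustration. Note that this Python version still builds a temporary bytes object internally, so it mirrors only the interface, not the allocation-free behaviour a Roc builtin could have.

```python
# Hypothetical stand-in for the proposed append helper.
def append_bytes_to_list(buffer: bytearray, value: int, width: int, order: str) -> bytearray:
    # Append the encoded bytes directly onto the existing buffer.
    buffer.extend(value.to_bytes(width, order))
    return buffer

buf = bytearray(b"\x01")
append_bytes_to_list(buf, 0x0203, 2, "big")
append_bytes_to_list(buf, 0x0405, 2, "little")
print(list(buf))  # [1, 2, 3, 5, 4]
```

An encoder can chain calls like this to build up one output buffer without handling the individual bytes itself.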

view this post on Zulip Richard Feldman (Dec 27 2023 at 20:24):

that seems reasonable to me :thumbs_up:

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 20:32):

Will people see/know to use that?

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 20:32):

My guess is most people would just use toBytes and most encoders would be slower than they need to be.

view this post on Zulip Fabian Schmalzried (Dec 27 2023 at 20:35):

When toBytes gives me a tuple, and I need to append to a list, I would look for a way to get a list. Maybe that's just me, but I probably would have found it.

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 20:40):

Oh, if it is just the tuple version and the append to list version, that probably would cause people to use the right version.

view this post on Zulip Brendan Hansknecht (Dec 27 2023 at 20:40):

Yeah. I like that api. Just no direct to list version.

view this post on Zulip Fabian Schmalzried (Dec 27 2023 at 20:42):

And if you actually need a list, you can always use the append to an empty list.

view this post on Zulip Agus Zubiaga (Dec 27 2023 at 20:42):

What's a use case where the tuple is useful? Manipulating a given byte in a number?

view this post on Zulip Brendan Hansknecht (Dec 28 2023 at 15:02):

hashing is one example.

view this post on Zulip Brendan Hansknecht (Dec 28 2023 at 15:04):

That said, I would guess that most people would never touch any of these functions except when writing binary format encode and decode functions

view this post on Zulip Agus Zubiaga (Dec 28 2023 at 15:05):

For binary format encoding you’re probably fine with the appendBytesToList function unless I’m missing something

view this post on Zulip Brendan Hansknecht (Dec 28 2023 at 15:31):

Some binary formats remove leading zero bytes from integers for example

view this post on Zulip Brendan Hansknecht (Dec 28 2023 at 15:32):

So append bytes wouldn't work correctly
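
A minimal sketch in Python of that kind of format, to show why an encoder needs access to the individual bytes rather than a fixed-width append. The name encode_minimal_be and the rule of keeping one byte for zero are assumptions for illustration, not taken from any specific format.

```python
# Hypothetical encoder: big-endian U32 with leading zero bytes dropped.
def encode_minimal_be(value: int) -> bytes:
    full = value.to_bytes(4, "big")   # fixed-width form, like the tuple API
    stripped = full.lstrip(b"\x00")   # drop leading zero bytes only
    return stripped or b"\x00"        # assumption: zero is kept as one byte

print(list(encode_minimal_be(0x1234)))      # [18, 52]
print(list(encode_minimal_be(0x00010000)))  # [1, 0, 0]
```

A fixed-width append would always write all four bytes, so the encoder has to inspect the bytes before deciding how many to write.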

view this post on Zulip Luke Boswell (Jan 02 2024 at 10:38):

So to clarify: what I understand from the above is that we would like to have tuple to/from bytes functions for each number type, and a generic append-to-list function.

add
Num.appendBytesToList : List U8, Num a, [LE, BE] -> List U8

... others as above
Num.u8ToBytes : U8 -> (U8)
Num.u8FromBytes : (U8) -> U8

view this post on Zulip Isaac Van Doren (Jan 02 2024 at 13:48):

Is there a reason to have the conversions for U8? Also, is that intended to be a real one element tuple, or are the parentheses there to stylistically match the other functions?

view this post on Zulip Brendan Hansknecht (Jan 02 2024 at 16:01):

Yeah probably should skip U8

view this post on Zulip Brendan Hansknecht (Jan 02 2024 at 16:01):

Otherwise, yeah, that all sounds correct

view this post on Zulip Brian Carroll (Jan 09 2024 at 06:07):

@Isaac Van Doren there's no such thing as a single element tuple in Roc so that's either a stylistic choice or a mistake, not sure which. But I don't think we need those functions anyway.


Last updated: Jun 16 2026 at 16:19 UTC