Determine Endianness for Implementing SHA-1 · beginners

Sam Mohr (Jun 09 2024 at 03:25):

I was trying to implement SHA-1 using Wikipedia's pseudocode and realized that I have no way to know whether the machine is running in big- or little-endian mode. I need it for the UUID library I'm writing (specifically UUID v5). If I'm trying to convert a U128 to a series of big-endian U32 "words", how would I do that in Roc?

Sam Mohr (Jun 09 2024 at 03:38):

Worst case, I take an { endianness: [Big, Little] } module param on the SHA-1 module. Not ideal, but it's better than assuming little endian.

Luke Boswell (Jun 09 2024 at 03:56):

I think this is by design, but I'm not sure. I guess it is so that Roc code executes the same everywhere.

Luke Boswell (Jun 09 2024 at 03:57):

Would it help if the platform provided a primitive that told you what the underlying system was?

Sam Mohr (Jun 09 2024 at 03:58):

I'm not a pro on endianness and its implementation in different languages, but that's actually the worry here: I'm not confident that Roc will execute the same way everywhere. If it can be either big or little endian, then byte-level operations may have different results on different machines.

Sam Mohr (Jun 09 2024 at 03:59):

Yes, I think that is necessary unless Roc makes a guarantee of which endianness it runs, which I'd be surprised by, since the architecture of the machine you're running on is optimized for one or the other.

Sam Mohr (Jun 09 2024 at 04:02):

Another reason I'd see that being an issue: since the Roc code is already compiled to a .so, it might be distributed in a way that lets the platform build it for either big or little endian without ever setting the kind of flag you're proposing. Not sure if that's possible, but it could be a problem.

Brendan Hansknecht (Jun 09 2024 at 05:31):

I think you should be able to implement this without worrying about endianness in Roc.

Brendan Hansknecht (Jun 09 2024 at 05:31):

Sure, you could get the info from a platform, but I don't think it is needed here.

Brendan Hansknecht (Jun 09 2024 at 05:32):

Bit shifting is the magical way to convert from your SHA big-endian numbers to a native-endian number and back.

Sam Mohr (Jun 09 2024 at 05:34):

Okay, so yeah, if I bit shift (which I'm already doing), that'll normalize the result. If that's true, which I think it is, then no need to worry about this anymore.

Brendan Hansknecht (Jun 09 2024 at 05:34):

If you have the bytes [ 0x00, 0x00, 0x00, 0x07 ] as your 32-bit big-endian number input, you do:

when bytes is
    [b3, b2, b1, b0, ..] ->
        # b3 is the most significant byte, b0 the least significant
        num = (b3 << 24) | (b2 << 16) | (b1 << 8) | b0
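For anyone who wants to check the shift-and-OR pattern above, here is a minimal sketch. It is written in Python rather than Roc only so that it can be run as-is, and the helper names `be_bytes_to_u32` and `chunk_to_words` are made up for illustration. It assumes the input bytes are already in big-endian order (most significant byte first), as in the message above.

```python
def be_bytes_to_u32(b3: int, b2: int, b1: int, b0: int) -> int:
    """Combine four bytes, most significant first, into one 32-bit value."""
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0


def chunk_to_words(chunk: bytes) -> list[int]:
    """Split bytes (length a multiple of 4) into big-endian 32-bit words."""
    if len(chunk) % 4 != 0:
        raise ValueError("chunk length must be a multiple of 4")
    # Each group of four consecutive bytes becomes one word.
    return [be_bytes_to_u32(*chunk[i:i + 4]) for i in range(0, len(chunk), 4)]


print(be_bytes_to_u32(0x00, 0x00, 0x00, 0x07))  # 7
print([hex(w) for w in chunk_to_words(bytes([0x12, 0x34, 0x56, 0x78, 0, 0, 0, 7]))])
```

The second helper is the same idea applied to a longer run of bytes, which is roughly what splitting a SHA-1 chunk into 32-bit words needs.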

Brendan Hansknecht (Jun 09 2024 at 05:34):

Sam Mohr (Jun 09 2024 at 05:40):

Well, just to make sure we're on the same page, I need to go the other way. SHA-1 requires big endian, and I "don't know" what I'm running on. So how do I get from native to big, not the other way?

Brendan Hansknecht (Jun 09 2024 at 05:42):

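num = 7
# Shift the byte you want down to the low 8 bits, then truncate to U8
b0 = Num.toU8 (num)        # least significant byte
b1 = Num.toU8 (num >> 8)
b2 = Num.toU8 (num >> 16)
b3 = Num.toU8 (num >> 24)  # most significant byte

# Big endian out
[b3, b2, b1, b0]

# Little endian out
[b0, b1, b2, b3]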
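The same extraction also answers the original U128 question. The following is a rough sketch in Python (not Roc), under the assumption that the U128 is treated as a plain unsigned value in the range 0 to 2^128 - 1; the names `u32_to_be_bytes` and `u128_to_be_words` are hypothetical. Explicit masks stand in for the truncation that a fixed-width conversion would do.

```python
MASK32 = 0xFFFFFFFF


def u32_to_be_bytes(num: int) -> list[int]:
    """Extract the four bytes of a 32-bit value, most significant first."""
    return [(num >> 24) & 0xFF, (num >> 16) & 0xFF, (num >> 8) & 0xFF, num & 0xFF]


def u128_to_be_words(value: int) -> list[int]:
    """Split a 128-bit value into four 32-bit words, most significant first."""
    if not 0 <= value < (1 << 128):
        raise ValueError("value does not fit in 128 bits")
    # Shift each 32-bit group down to the low bits, then mask it off.
    return [(value >> shift) & MASK32 for shift in (96, 64, 32, 0)]


print(u32_to_be_bytes(7))  # [0, 0, 0, 7]
words = u128_to_be_words(0x00112233445566778899AABBCCDDEEFF)
print([hex(w) for w in words])
```

Nothing here asks how the number is laid out in memory; the shifts operate on the value itself.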

Sam Mohr (Jun 09 2024 at 05:44):

Richard Feldman (Jun 09 2024 at 14:11):

Richard Feldman (Jun 09 2024 at 14:13):

as Luke mentioned earlier, it’s a design goal that Roc code should give the same answers regardless of what target it’s running on, and the fact that bit shifts give different answers depending on native endianness breaks that

Richard Feldman (Jun 09 2024 at 14:14):

so I think it would be best not to depend on that behavior, because I’d like to try to figure out a way to change it!

Brendan Hansknecht (Jun 09 2024 at 15:45):

Brendan Hansknecht (Jun 09 2024 at 15:47):

It always gives the same answer, which is why you can use it to extract bytes from native endian. Then you can order them as big or little endian

Brendan Hansknecht (Jun 09 2024 at 15:48):

So the Roc user doesn't know if the number is actually stored in big or little endian. They can just extract the bytes, then order them as they please.
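One way to convince yourself of this is to compare shift-based extraction against a raw native-order dump of the same number. This is only a sketch in Python (not Roc), with a hypothetical helper name; `struct.pack("=I", ...)` gives the host's native byte order and `">I"` gives big endian.

```python
import struct
import sys


def u32_to_be_bytes(num: int) -> bytes:
    """Extract bytes by shifting, most significant first."""
    return bytes([(num >> 24) & 0xFF, (num >> 16) & 0xFF, (num >> 8) & 0xFF, num & 0xFF])


num = 0x0A0B0C0D
shifted = u32_to_be_bytes(num)   # built from shifts on the value
native = struct.pack("=I", num)  # raw layout in the host's byte order

# The shift-based result matches big endian on any host.
assert shifted == struct.pack(">I", num)

# Only the raw native layout varies with the machine.
print(sys.byteorder, native.hex(), shifted.hex())
```

The shifted bytes come out as 0a0b0c0d whatever `sys.byteorder` reports; only the native dump changes between hosts.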
