Bug with shiftRightZfBy? · beginners

Stream: beginners

Topic: Bug with shiftRightZfBy?

Matthieu Pizenberg (Jun 06 2024 at 14:15):

So I was chasing a problem which boiled down to what I think is a bug in the implementation of Num.shiftRightZfBy. When I do shift right by 128 on a U128 number, instead of getting 0, it seems I’m getting the original number. Weirder, this behavior is inconsistent between roc repl and roc test. Here is a screenshot of the repl and the test.

Matthieu Pizenberg (Jun 06 2024 at 14:16):

image.png

Matthieu Pizenberg (Jun 06 2024 at 14:19):

I’m using macos with an M3 chip

Brendan Hansknecht (Jun 06 2024 at 14:35):

Difference between llvm and the dev backend

Brendan Hansknecht (Jun 06 2024 at 14:35):

I guess we have to pick the functionality we actually want

Brendan Hansknecht (Jun 06 2024 at 14:35):

I think we want the repl behaviour

Matthieu Pizenberg (Jun 06 2024 at 14:36):

the repl behavior (0) I think is the only correct one no?

Matthieu Pizenberg (Jun 06 2024 at 14:36):

I mean it’s not a wrapping shift

Matthieu Pizenberg (Jun 06 2024 at 14:40):

hum ... actually, for 64 bit integers, the repl behaves like roc test for 128, and not like the repl for 128

» Num.shiftRightZfBy 7_u64 63

0 : U64
» Num.shiftRightZfBy 7_u64 64

7 : U64

Matthieu Pizenberg (Jun 06 2024 at 14:41):

And for 32, and lower it’s back to behaving "normally" (returning 0)

» Num.shiftRightZfBy 7_u32 31

0 : U32
» Num.shiftRightZfBy 7_u32 32

0 : U32
» Num.shiftRightZfBy 7_u16 15

0 : U16
» Num.shiftRightZfBy 7_u16 16

0 : U16

Matthieu Pizenberg (Jun 06 2024 at 14:43):

So basically, in the repl, the behavior is consistently returning 0 for a shift the size of the integer, except for 64bit, where it’s returning the number itself.

Matthieu Pizenberg (Jun 06 2024 at 14:45):

And in tests (llvm backend?) its more inconsistent:

[U256.roc:36] Num.shiftRightZfBy 7_u128 128 = 7
[U256.roc:38] Num.shiftRightZfBy 7_u64 64 = 7
[U256.roc:40] Num.shiftRightZfBy 7_u32 32 = 7
[U256.roc:42] Num.shiftRightZfBy 7_u16 16 = 0
[U256.roc:44] Num.shiftRightZfBy 7_u8 8 = 0

Matthieu Pizenberg (Jun 06 2024 at 14:48):

I can open an issue on the repo if you want.

Brendan Hansknecht (Jun 06 2024 at 14:55):

llvm considers it undefined behavior to shift equal to more than the bit width.

Brendan Hansknecht (Jun 06 2024 at 14:56):

They do whatever is fastest and move on with life

Brendan Hansknecht (Jun 06 2024 at 14:56):

Technically the dev backend does the same (bit with different instructions)

Brendan Hansknecht (Jun 06 2024 at 14:57):

Changing target or optimizing the build can change results

Matthieu Pizenberg (Jun 06 2024 at 14:59):

Damn ok. I got bitten by that because I was implementing shifts for a custom U256.

shiftLeftBy : U256, U8 -> U256
shiftLeftBy = \{ high, low }, shift ->
    if shift >= 128 then
        { high: Num.shiftLeftBy low (shift - 128), low: 0 }
    else
        {
            high: Num.bitwiseOr (Num.shiftLeftBy high shift) (Num.shiftRightZfBy low (128 - shift)),
            low: Num.shiftLeftBy low shift,
        }

I guess I got to special case 0 then because otherwise the right part of my high is shifting by 128 - 0 == 128.

Brendan Hansknecht (Jun 06 2024 at 14:59):

For roc, we probably want it to always return zero, but kinda sucks to add a conditional and cmov around every bitshift.

This is why llvm makes in undefined behaviour. They will do whatever is fastest on the specific platform and ignore the "obvious bug" of bitshifting by the bit width or greater.

Matthieu Pizenberg (Jun 06 2024 at 15:03):

Okay, so what’s your take, should I open an issue with my problem and our discussion? or keep it in zulip.

Brendan Hansknecht (Jun 06 2024 at 15:25):

Haha....llvm: https://godbolt.org/z/qsMncer9W
Just doesn't emit any instructions for a shift right that is too large. This is with optimizations.

gcc explicitly zeros: https://godbolt.org/z/h5bGdsoMo

Of coures, for both, if they can't figure out the shift value at compile time will always emit the instructions, like shr rdx, cl.

The semantics of that is to mask to the right number of bits (throwing away all the higher bits) and then shift. So U64 >> 64 is equivalent to U64 >> 0, U64 >> 70 is equivalent to U64 >> 6.

Brendan Hansknecht (Jun 06 2024 at 15:25):

Just kinda noting the existing semantics.....not really sure what roc should do.

Brendan Hansknecht (Jun 06 2024 at 15:26):

@Richard Feldman any thoughts on perf vs correctness and undefined behaviour for bit shifting?

Brendan Hansknecht (Jun 06 2024 at 15:31):

I guess I got to special case 0 then because otherwise the right part of my high is shifting by 128 - 0 == 128.

Just to note, this is actually the assembly that llvm or clang have to generate for this. They need condition and cmov if the shift amount is equal to the max. That said, they also don't handle if you overshift a U128 or U256. U128 >> 128 is still undefined behaviour for both and will likely return "wrong" results.

Richard Feldman (Jun 06 2024 at 16:31):

yeah we should not have UB in Roc

Richard Feldman (Jun 06 2024 at 16:31):

so I think we need the conditional, but I'd expect it to often (usually?) get optimized away anyway, right?

Brendan Hansknecht (Jun 06 2024 at 16:55):

As long as the shift is by a constant it should get optimized away

Brendan Hansknecht (Jun 06 2024 at 16:57):

Aside: this is a case where I really like what zig does. They allow for arbitrarily sized int and then require the correct sized int when shifting. U32 >> U5 is the signiture for 32bit. You literally can't pass in a shift that is too large. And requiring to convert to U5 forces the user to explicitly think about the behaviour they want.

Richard Feldman (Jun 06 2024 at 17:45):

yeah that's super cool

Kiryl Dziamura (Jun 06 2024 at 17:48):

speaking of that. does it make sense for roc to support arbitrary sized ints?

Richard Feldman (Jun 06 2024 at 20:31):

I wrote up some thoughts on that! :big_smile:

https://www.roc-lang.org/faq#arbitrary-numbers

Matthieu Pizenberg (Jun 06 2024 at 21:10):

I opened an issue: https://github.com/roc-lang/roc/issues/6789

Kiryl Dziamura (Jun 07 2024 at 06:21):

I meant like in zig, not bigints specifically. e.g. U3, U5

Last updated: Aug 17 2025 at 12:14 UTC