avoiding UB in bit shifts · compiler development

Stream: compiler development

Topic: avoiding UB in bit shifts

Folkert de Vries (Jun 28 2024 at 09:20):

why do we have UB in our test suite?

    assert_evals_to!("Num.shiftRightBy 0b0100_0000u8 12", 0b0000_0000u8, u8);

Folkert de Vries (Jun 28 2024 at 09:21):

ah this one's even better

    // LLVM in release mode returns 0 instead of -1 for some reason
    if !is_llvm_release_mode {
        assert_evals_to!("Num.shiftRightBy 0b1000_0000u8 12", 0b1111_1111u8, u8);
    }

the anwer is UB: shifting by more bits than the integer has is UB in LLVM

Richard Feldman (Jun 28 2024 at 12:38):

so I guess we need to either wrap those in a conditional that crashes if you give it too many, or do a silent modulo or something?

Folkert de Vries (Jun 28 2024 at 13:42):

well to me shifting a u8 by 12 bits has an intuitive interpretation, which is that that should just return 0. So I'd be in favor of making the behavior defined on our end and do some if shift > 8 then 0 else x >> shift

Folkert de Vries (Jun 28 2024 at 13:42):

that may need to be >= 8 I always forget

Brendan Hansknecht (Jun 28 2024 at 15:04):

Yeah, I think a conditional to generate zero is what we should do here

Brendan Hansknecht (Jun 28 2024 at 15:04):

Oh wait, have to be careful though

Brendan Hansknecht (Jun 28 2024 at 15:04):

Cause this is the version of shift right that doesn't zero fill

Brendan Hansknecht (Jun 28 2024 at 15:04):

It sign extends

Brendan Hansknecht (Jun 28 2024 at 15:05):

Really, we probably should redo our bitshift to be based on the number type instead of based on calling the zero fill version of the method or not.

Brendan Hansknecht (Jun 28 2024 at 15:09):

But yeah, should 100% make it deterministic.....

Real question is, should a sign extending bit shift ever return zero for a negative number. Cause -1 is 0b1111_1111....which if you imagine shifting right by one with the sign extending, is still 0b1111_1111

Brendan Hansknecht (Jun 28 2024 at 15:10):

I think that's what happening in the ub example above at least for non-release.....

Brendan Hansknecht (Jun 28 2024 at 15:18):

Yeah, reading into this more online, I think the reasonably sane, but also strange answer is that signed shifting with too many bits on a negative number should become -1 and on a positive number should become 0.

Zero fill/logical/unsigned shifting should become zero if over shifted.

Brendan Hansknecht (Jun 28 2024 at 15:20):

Basically, in both cases, you should be able to just do a max with the bitwidth - 1 and shift. That should give the correct right shift answer.

Brendan Hansknecht (Jun 28 2024 at 15:21):

Wikipedia also has a good quote on this:

Shifting right by n bits on a two's complement signed binary number has the effect of dividing it by 2n, but it always rounds down (towards negative infinity). This is different from the way rounding is usually done in signed integer division (which rounds towards 0). This discrepancy has led to bugs in a number of compilers.

Brendan Hansknecht (Jun 28 2024 at 15:25):

Personally, this is where I love zigs solution the most. Don't allow too large of a number by forcing the user to convert to the correct integer bit width. U3 for this shift

Folkert de Vries (Jun 28 2024 at 15:30):

Brendan Hansknecht said:

Basically, in both cases, you should be able to just do a max with the bitwidth - 1 and shift. That should give the correct right shift answer.

for shifting right that works I think. for shifting left you need to explicitly return 0 when shifting too far I think

Folkert de Vries (Jun 28 2024 at 15:30):

because 1 << 7 is 128, not 0 which I'd expect for 1 << 42 for instance

Brendan Hansknecht (Jun 28 2024 at 15:31):

Ah yeah, for right shifts

Last updated: Jul 26 2025 at 12:14 UTC