Stream: beginners

Topic: Branchless helper bug?


view this post on Zulip Luke Boswell (Aug 14 2023 at 11:29):

I've been experimenting with different ways to count the number of backslashes in a large file. I wrote a function which I hope will be branchless (but haven't checked the dissasembly yet) however it is giving me an incorrect answer and I am not sure if I am doing something wrong, or maybe we have a bug somewhere.

The below seems to work ok, the below tests pass etc... however when I run it on a large file of UTF-8 bytes that I know has 1230 backslashes it gives me an incorrect answer of 24_642_180.

Can anyone see anything obviously wrong with this helper?

countSlashesBranchlessHelp : Nat, U8 -> Nat
countSlashesBranchlessHelp = \count, byte ->
    byte
    |> Num.bitwiseXor '\\'          # XOR will be 0 if values are equal
    |> Num.subWrap 0b0000_0001      # MSB 1 if values are equal, 0 otherwise
    |> Num.shiftRightBy 7u8         # Shift MSB to LSB
    |> Num.toNat                    # Convert to Nat
    |> Num.addWrap count            # Add to running count

expect List.walk ['1','2','3'] 0 countSlashesBranchlessHelp == 0
expect List.walk ['1','2','\\','3','\\'] 0 countSlashesBranchlessHelp == 2

view this post on Zulip Anton (Aug 14 2023 at 11:56):

It could be valuable to progressively half the input file string with a function to see at which point the count starts going wrong.

view this post on Zulip Ayaz Hafiz (Aug 14 2023 at 13:59):

I think the xor/subtraction logic is incorrect. For example, for <, which is ascii code 60 0b0111100, the xor against \=92 will be 0b1011000 xor 0b0111100 = 0b1100100. The subtraction and right shift will then end up with 1 in the LSB, if I did my math right.

view this post on Zulip Luke Boswell (Aug 15 2023 at 07:56):

Thank you @Ayaz Hafiz :thank_you:

view this post on Zulip Luke Boswell (Aug 15 2023 at 08:36):

Update to my previous. I changed it to the below which now gives the expected answer!

countSlashesBranchlessHelp : Nat, U8 -> Nat
countSlashesBranchlessHelp = \count, byte ->

    # XOR will be 0 if values are equal
    a =  Num.bitwiseXor byte '\\'

    # Use shifts to propogate 1's
    b = Num.bitwiseOr a (Num.shiftRightBy a 4)
    c = Num.bitwiseOr b (Num.shiftRightBy b 2)
    d = Num.bitwiseOr c (Num.shiftRightBy c 1)

    # Invert the bits
    e = Num.bitwiseNot d

    # Mask for the LSB
    f = Num.bitwiseAnd e 1

    # Add to running count
    f |> Num.toNat |> Num.addWrap count

view this post on Zulip Luke Boswell (Aug 15 2023 at 08:43):

Unfortunately, though I think it looks nicer, if I write it out using a pipeline it is roughly 25% slower. I assume this is because it is because the lambdas are not being inlined in the function?

countSlashesBranchlessHelp : Nat, U8 -> Nat
countSlashesBranchlessHelp = \count, byte ->
    byte
    |> Num.bitwiseXor  '\\'                         # XOR will be 0 if values are equal
    |> \c -> Num.bitwiseOr c (Num.shiftRightBy c 4) # Use shifts to propogate 1's
    |> \c -> Num.bitwiseOr c (Num.shiftRightBy c 2)
    |> \c -> Num.bitwiseOr c (Num.shiftRightBy c 1)
    |> \c -> Num.bitwiseNot c                       # Invert the bits
    |> \c -> Num.bitwiseAnd c 1                     # Mask for the LSB
    |> Num.toNat                                    # Convert to Nat
    |> Num.addWrap count                            # Add to running count

view this post on Zulip Luke Boswell (Aug 15 2023 at 08:44):

I guess my question here is should I expect the pipeline version to run at the same speed? or is that unreasonable?

view this post on Zulip Folkert de Vries (Aug 15 2023 at 08:51):

we'd have to do our own inlining to guarantee it

view this post on Zulip Folkert de Vries (Aug 15 2023 at 08:51):

though LLVM should get pretty close here

view this post on Zulip Richard Feldman (Aug 15 2023 at 10:35):

we do want to do our own inlining at some point, but it's not a priority in the near term

view this post on Zulip Brendan Hansknecht (Aug 15 2023 at 14:50):

Llvm not inlining this surprised me. I wonder if something else is wrong


Last updated: Jul 05 2025 at 12:14 UTC