Stream: contributing

Topic: setjmp/longjmp on aarch64


view this post on Zulip Folkert de Vries (Jul 29 2023 at 14:18):

for the dev backends (and the repl), I need to see the assembly of the setjmp and longjmp functions for all architectures that we support. I have this working on x86_64 linux, and a basic implementation for windows (untested). But I don't have an arm machine, so it would be helpful if someone on one of the newer macs could follow the instructions below and provide the output of the disassemble --frame calls.

take this c file, save it as setlongjmp.c and run zig build-exe setlongjmp.c -lc

#include <stdlib.h>
#include <stdio.h>
#include <setjmp.h>

jmp_buf env;

int foo() {
  int i;

  i = setjmp(env);
  printf("i = %d\n", i);

  if (i != 0) return 1;

  longjmp(env, 2);

  return 0;
}

int main() {
    foo();

    return 0;
}

This should give you a setlongjmp executable.

next, run lldb setlongjmp and execute these commands

b foo
r
disassemble --frame

so we set a breakpoint on the "foo" function, run (till we hit that breakpoint) and then disassemble the frame. Here is what that looks like on my machine.

> lldb setlongjmp
(lldb) target create "setlongjmp"
Current executable set to '/home/folkertdev/c/setlongjmp' (x86_64).
(lldb) b foo
Breakpoint 1: where = setlongjmp`foo + 6 at setlongjmp.c:10:7, address = 0x0000000000201e26
(lldb) r
Process 45115 launched: '/home/folkertdev/c/setlongjmp' (x86_64)
Process 45115 stopped
* thread #1, name = 'setlongjmp', stop reason = breakpoint 1.1
    frame #0: 0x0000000000201e26 setlongjmp`foo at setlongjmp.c:10:7
   7    int foo() {
   8      int i;
   9
-> 10     i = setjmp(env);
   11     printf("i = %d\n", i);
   12
   13     if (i != 0) return 1;
(lldb) disassemble --frame
setlongjmp`foo:
    0x201e20 <+0>:  push   rbp
    0x201e21 <+1>:  mov    rbp, rsp
    0x201e24 <+4>:  push   rbx
    0x201e25 <+5>:  push   rax
->  0x201e26 <+6>:  lea    rdi, [rip + 0x5073]       ; env
    0x201e2d <+13>: call   0x20238c                  ; setjmp
    0x201e32 <+18>: mov    ebx, eax
    0x201e34 <+20>: lea    rdi, [rip - 0x1c6b]
    0x201e3b <+27>: xor    eax, eax
    0x201e3d <+29>: mov    esi, ebx
    0x201e3f <+31>: call   0x2023b7                  ; printf
    0x201e44 <+36>: test   ebx, ebx
    0x201e46 <+38>: je     0x201e54                  ; <+52> at setlongjmp.c:15:3
    0x201e48 <+40>: mov    eax, 0x1
    0x201e4d <+45>: add    rsp, 0x8
    0x201e51 <+49>: pop    rbx
    0x201e52 <+50>: pop    rbp
    0x201e53 <+51>: ret
    0x201e54 <+52>: lea    rdi, [rip + 0x5045]       ; env
    0x201e5b <+59>: mov    esi, 0x2
    0x201e60 <+64>: call   0x202364                  ; longjmp
    0x201e65 <+69>: ud2b

next run

ni
si
disassemble --frame

ni (next instruction) brings us to the call to setjmp, then si (step instruction) steps into that call.
You may need more or fewer ni calls, because the generated assembly can differ. Here we need one to go from

->  0x201e26 <+6>:  lea    rdi, [rip + 0x5073]       ; env
    0x201e2d <+13>: call   0x20238c                  ; setjmp

to

    0x201e26 <+6>:  lea    rdi, [rip + 0x5073]       ; env
->  0x201e2d <+13>: call   0x20238c                  ; setjmp

so just keep using ni and disassemble until the arrow points at the call to setjmp

now, the disassemble --frame should show the implementation of setjmp.

setlongjmp`setjmp:
->  0x20238c <+0>:  mov    qword ptr [rdi], rbx
    0x20238f <+3>:  mov    qword ptr [rdi + 0x8], rbp
    0x202393 <+7>:  mov    qword ptr [rdi + 0x10], r12
    0x202397 <+11>: mov    qword ptr [rdi + 0x18], r13
    0x20239b <+15>: mov    qword ptr [rdi + 0x20], r14
    0x20239f <+19>: mov    qword ptr [rdi + 0x28], r15
    0x2023a3 <+23>: lea    rdx, [rsp + 0x8]
    0x2023a8 <+28>: mov    qword ptr [rdi + 0x30], rdx
    0x2023ac <+32>: mov    rdx, qword ptr [rsp]
    0x2023b0 <+36>: mov    qword ptr [rdi + 0x38], rdx
    0x2023b4 <+40>: xor    eax, eax
    0x2023b6 <+42>: ret

now run

finish

to exit the setjmp function. Now we need to ni again until we get to the call to longjmp.

    0x201e4d <+45>: add    rsp, 0x8
    0x201e51 <+49>: pop    rbx
    0x201e52 <+50>: pop    rbp
    0x201e53 <+51>: ret
    0x201e54 <+52>: lea    rdi, [rip + 0x5045]       ; env
    0x201e5b <+59>: mov    esi, 0x2
->  0x201e60 <+64>: call   0x202364                  ; longjmp
    0x201e65 <+69>: ud2b

then again

si
disassemble --frame

which gives the code for longjmp

(lldb) di --frame
setlongjmp`longjmp:
->  0x202364 <+0>:  xor    eax, eax
    0x202366 <+2>:  cmp    esi, 0x1
    0x202369 <+5>:  adc    eax, esi
    0x20236b <+7>:  mov    rbx, qword ptr [rdi]
    0x20236e <+10>: mov    rbp, qword ptr [rdi + 0x8]
    0x202372 <+14>: mov    r12, qword ptr [rdi + 0x10]
    0x202376 <+18>: mov    r13, qword ptr [rdi + 0x18]
    0x20237a <+22>: mov    r14, qword ptr [rdi + 0x20]
    0x20237e <+26>: mov    r15, qword ptr [rdi + 0x28]
    0x202382 <+30>: mov    rsp, qword ptr [rdi + 0x30]
    0x202386 <+34>: jmp    qword ptr [rdi + 0x38]
    0x202389 <+37>: int3
    0x20238a <+38>: int3
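
As a rough reading of the two x86_64 listings above, the offsets that setjmp writes and longjmp reads can be written down as a C struct, and the first three longjmp instructions (xor, cmp, adc) as a small function. This is only a sketch: the struct and function names are hypothetical, and the layout is inferred from this one disassembly, not from any libc header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name. Fields mirror the [rdi + offset] slots in the listings. */
struct x86_64_jmp_buf_sketch {
    uint64_t rbx; /* [rdi + 0x00] */
    uint64_t rbp; /* [rdi + 0x08] */
    uint64_t r12; /* [rdi + 0x10] */
    uint64_t r13; /* [rdi + 0x18] */
    uint64_t r14; /* [rdi + 0x20] */
    uint64_t r15; /* [rdi + 0x28] */
    uint64_t rsp; /* [rdi + 0x30]: rsp + 8, the stack pointer after setjmp returns */
    uint64_t rip; /* [rdi + 0x38]: return address loaded from [rsp] */
};

static_assert(offsetof(struct x86_64_jmp_buf_sketch, rsp) == 0x30, "rsp slot");
static_assert(offsetof(struct x86_64_jmp_buf_sketch, rip) == 0x38, "rip slot");

/* Models: xor eax, eax ; cmp esi, 0x1 ; adc eax, esi */
uint32_t x86_64_longjmp_return_value(uint32_t esi) {
    uint32_t eax = 0;            /* xor eax, eax */
    uint32_t carry = esi < 1u;   /* cmp esi, 1 sets the carry flag only when esi == 0 */
    return eax + esi + carry;    /* adc eax, esi */
}
```

Under this reading, a longjmp value of 0 comes back from setjmp as 1, and any other value comes back unchanged, which matches the behaviour the C standard requires of longjmp.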

view this post on Zulip Luke Boswell (Jul 29 2023 at 22:13):

Cool, I'll have a look. I'm on an M2 mac.

view this post on Zulip Luke Boswell (Jul 30 2023 at 02:21):

First snag, just investigating now

% zig build-exe setlongjmp.c -lc


thread 1025751 panic: Darwin is handled separately via std.zig.system.darwin module
Unable to dump stack trace: debug info stripped
zsh: abort      zig build-exe setlongjmp.c -lc

view this post on Zulip Luke Boswell (Jul 30 2023 at 02:22):

Looks related to #5590 which @Anton added

view this post on Zulip Luke Boswell (Jul 30 2023 at 03:02):

Got it working with zig 0.10.1.

I get the following, which I don't think is what you want:

setlongjmp`setjmp:
->  0x100000670 <+0>: adrp   x16, 8
    0x100000674 <+4>: ldr    x16, [x16, #0x10]
    0x100000678 <+8>: br     x16

ChatGPT tells me the following

This code is an example of an indirect function call. Instead of calling setjmp directly, it computes the address of the function at runtime and then calls it. This is commonly used for function pointers and virtual methods, but in this case it's probably being used because the actual setjmp code is part of the system libraries, not the setlongjmp binary itself. The address of setjmp needs to be looked up at runtime, so the binary includes code to do that lookup and then branch to the correct location.

I feel like I need zig to compile so that the assembly doesn't look up or use a system library. I'm not sure but will continue investigating.
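
The three-instruction stub above (adrp, ldr, br) can be pictured in C as loading a function pointer from a table slot and jumping through it. The sketch below is only an illustration under assumptions: all names are hypothetical, the table is filled in statically instead of by the dynamic loader, and index 2 corresponds to byte offset 0x10 only if pointers are 8 bytes.

```c
#include <assert.h>

typedef int (*int_fn)(int);

/* Hypothetical stand-in for the real function living in a system library. */
int real_impl(int x) { return x + 1; }

/* Hypothetical pointer table. At run time the loader would fill such slots;
   here slot 2 is set up front so the example is self-contained. */
int_fn slot_table[3] = { 0, 0, real_impl };

/* Plays the role of the stub: load the target address, then branch to it. */
int stub(int x) {
    int_fn target = slot_table[2]; /* like: ldr x16, [x16, #0x10] */
    return target(x);              /* like: br x16 */
}
```

Calling stub(41) ends up in real_impl, which is why disassembling the stub alone does not show the body of the function being called.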

view this post on Zulip Ayaz Hafiz (Jul 30 2023 at 03:08):

what if you follow the call to the indirect branch (i.e. where it goes after 0x100000678)? you can do this by typing step a few times.

on macos setjmp/longjmp is linked in at runtime so it might not be visible until then.

view this post on Zulip Luke Boswell (Jul 30 2023 at 03:10):

(lldb) si
Process 98397 stopped
* thread #1, stop reason = instruction step into
    frame #0: 0x0000000100000678 setlongjmp`setjmp + 8
setlongjmp`setjmp:
->  0x100000678 <+8>: br     x16
    0x10000067c:      adrp   x17, 8
    0x100000680:      add    x17, x17, #0x18           ; =0x18
    0x100000684:      stp    x16, x17, [sp, #-0x10]!
(lldb) si
Process 98397 stopped
* thread #1, stop reason = instruction step into
    frame #0: 0x00000001a2244b24
->  0x1a2244b24: pacibsp
    0x1a2244b28: stp    x21, x30, [x0]
    0x1a2244b2c: mov    x21, x0
    0x1a2244b30: orr    w0, wzr, #0x1
(lldb) disassemble --frame
->  0x1a2244b24: pacibsp
    0x1a2244b28: stp    x21, x30, [x0]
    0x1a2244b2c: mov    x21, x0
    0x1a2244b30: orr    w0, wzr, #0x1
    0x1a2244b34: mov    x1, #0x0
    0x1a2244b38: add    x2, x21, #0xb0            ; =0xb0
    0x1a2244b3c: bl     0x1a22495ec
    0x1a2244b40: mov    x0, x21

view this post on Zulip Anton (Jul 30 2023 at 11:59):

I'll try it on my pi

view this post on Zulip Anton (Jul 30 2023 at 15:11):

Is this the code for setjmp (at the end)?
output.txt

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:17):

not obviously, to me

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:17):

maybe? but I don't really recognize the structure

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:18):

on the pi, you should actually just have objdump, right?

view this post on Zulip Anton (Jul 30 2023 at 15:36):

Yeah, I tried that, but I believe it just shows a reference to the dynamically linked code:
disassem.s

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:36):

eh, does musl work on that platform?

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:37):

zig has some options

  "aarch64_be-linux-musl",
  "aarch64-linux-musl",
  "armeb-linux-musleabi",
  "armeb-linux-musleabihf",
  "arm-linux-musleabi",
  "arm-linux-musleabihf",

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:37):

not quite sure which one but zig targets | grep "musl" gives a bunch

view this post on Zulip Anton (Jul 30 2023 at 15:38):

I'll give those a try

view this post on Zulip Anton (Jul 30 2023 at 15:53):

aarch64-linux-musl worked, is this it?

0000000000211570 <__setjmp>:
  211570:   a9005013    stp x19, x20, [x0]
  211574:   a9015815    stp x21, x22, [x0, #16]
  211578:   a9026017    stp x23, x24, [x0, #32]
  21157c:   a9036819    stp x25, x26, [x0, #48]
  211580:   a904701b    stp x27, x28, [x0, #64]
  211584:   a905781d    stp x29, x30, [x0, #80]
  211588:   910003e2    mov x2, sp
  21158c:   f9003402    str x2, [x0, #104]
  211590:   6d072408    stp d8, d9, [x0, #112]
  211594:   6d082c0a    stp d10, d11, [x0, #128]
  211598:   6d09340c    stp d12, d13, [x0, #144]
  21159c:   6d0a3c0e    stp d14, d15, [x0, #160]
  2115a0:   d2800000    mov x0, #0x0                    // #0
  2115a4:   d65f03c0    ret
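
Read as a data layout, the offsets in this __setjmp listing suggest the following buffer shape. This is a sketch with hypothetical names, derived only from the immediates in the disassembly above (the slot at offset 96 is simply not written by this listing); it is not taken from a musl header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name. Offsets follow the [x0, #imm] operands in the listing. */
struct aarch64_jmp_buf_sketch {
    uint64_t x19_to_x30[12]; /* #0 .. #88: six stp pairs, x19/x20 through x29/x30 */
    uint64_t not_written;    /* #96: no store in the listing targets this slot */
    uint64_t sp;             /* #104: stack pointer, copied via x2 */
    uint64_t d8_to_d15[8];   /* #112 .. #168: four stp pairs of 64-bit d registers */
};

static_assert(offsetof(struct aarch64_jmp_buf_sketch, sp) == 104, "sp slot");
static_assert(offsetof(struct aarch64_jmp_buf_sketch, d8_to_d15) == 112, "d8 slot");

/* Byte offset of general-purpose register xN (19 <= N <= 30) in the sketch. */
size_t aarch64_gpr_offset(unsigned reg) {
    return (size_t)(reg - 19u) * sizeof(uint64_t);
}
```

For example, aarch64_gpr_offset(29) gives 80, matching the stp x29, x30, [x0, #80] line.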

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:55):

that looks plausible

view this post on Zulip Anton (Jul 30 2023 at 15:55):

And longjmp:

0000000000211534 <_longjmp>:
  211534:   a9405013    ldp x19, x20, [x0]
  211538:   a9415815    ldp x21, x22, [x0, #16]
  21153c:   a9426017    ldp x23, x24, [x0, #32]
  211540:   a9436819    ldp x25, x26, [x0, #48]
  211544:   a944701b    ldp x27, x28, [x0, #64]
  211548:   a945781d    ldp x29, x30, [x0, #80]
  21154c:   f9403402    ldr x2, [x0, #104]
  211550:   9100005f    mov sp, x2
  211554:   6d472408    ldp d8, d9, [x0, #112]
  211558:   6d482c0a    ldp d10, d11, [x0, #128]
  21155c:   6d49340c    ldp d12, d13, [x0, #144]
  211560:   6d4a3c0e    ldp d14, d15, [x0, #160]
  211564:   7100003f    cmp w1, #0x0
  211568:   1a9f1420    csinc   w0, w1, wzr, ne  // ne = any
  21156c:   d61f03c0    br  x30
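
The last three instructions of this _longjmp listing compute the value that setjmp appears to return. A minimal C model of the cmp/csinc pair, with a hypothetical function name, might look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Models: cmp w1, #0x0 ; csinc w0, w1, wzr, ne
   csinc picks w1 when the condition (ne) holds, otherwise wzr + 1. */
uint32_t aarch64_longjmp_return_value(uint32_t w1) {
    uint32_t wzr = 0;                 /* the zero register always reads as 0 */
    return (w1 != 0) ? w1 : wzr + 1;  /* 0 becomes 1, anything else passes through */
}
```

The result is left in w0 and the final br x30 jumps to the link register that was restored from the buffer, so execution resumes right after the original setjmp call.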

view this post on Zulip Brendan Hansknecht (Jul 30 2023 at 15:56):

Arm with its fancy instructions for storing/loading 2 registers at the same time

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:56):

well that is where I'm out of my depth

view this post on Zulip Brendan Hansknecht (Jul 30 2023 at 15:57):

But yeah, those look right to me

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:57):

guess I'll have to get a 64-bit pi somehow (also useful for PTP, at work so I guess I can justify it)

view this post on Zulip Folkert de Vries (Jul 30 2023 at 15:58):

@Brendan Hansknecht how likely is it that this would also work on mac? Is the calling convention (much) different?

view this post on Zulip Brendan Hansknecht (Jul 30 2023 at 15:59):

Calling convention is mostly the same. May just work, but I think there is one register treated differently....would need to double check


Last updated: Jul 06 2025 at 12:14 UTC