Stream: compiler development

Topic: Understanding IR Layout


view this post on Zulip Anthony Bullard (Jan 22 2025 at 15:58):

I'm trying to understand this Debug output for a Layout:

Function(
    [],
    LambdaSet {
        set: [
            ( Test.2, [InLayout(STR)]),
        ],
        args: [],
        ret: InLayout(STR),
        representation: InLayout(STR),
        full_layout: InLayout(
            22,
        ),
    },
    InLayout(STR),
)

I took this as a function layout with no args, returning a string. But the lambda set has a function that has a string argument. Am I understanding this wrong (without reading all 10k+ lines of mono/src/ir.rs)?

view this post on Zulip Anthony Bullard (Jan 22 2025 at 15:59):

This is a RawFunctionLayout to be clear

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:00):

The source of the entire program is:

            app "test" provides [main] to "./platform"

            main : Str
            main =
                g : Str
                g = "hello world"

                get_g : () -> Str
                get_g = || g

                get_g()

And this layout is for get_g, a zero-arg closure defined inside main

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:02):

My assumption is that the layout should make sure to include the argument for the string which is captured - assuming that since there is a single capture we don't wrap it in a struct/tuple/record.

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:06):

It feels like for a zero-arg closure, we have to ensure that the arg layout of the function _always_ includes the capture value (here a string, but often a struct)

view this post on Zulip Ayaz Hafiz (Jan 22 2025 at 16:07):

The lambda set lists all function values that could be referenced by a value of a function type. In your case, get_g is a function with a lambda set that is a singleton that contains just get_g. The lambda set also encodes any captures of the function, and get_g captures g so you can think of it as [get_g Str] if it were a tag.

Function(
    [],
    LambdaSet {
        set: [
            ( Test.2, [InLayout(STR)]), # get_g, with capture Str
        ],
        args: [],
        ret: InLayout(STR), # return type
        representation: InLayout(STR), # representation of the lambda set at runtime - the get_g function and captures (unwrapped to just Str because there is only one function and one capture)
        full_layout: InLayout(
            22,
        ),
    },
    InLayout(STR), # return type
)
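
A minimal sketch of the idea described above, assuming a much-simplified model: the `Layout`, `Lambda`, `captures_layout`, and `representation` names are hypothetical and are not the real `roc_mono` types. Only the singleton, single-capture case (unwrapped to just `Str`) comes from the explanation above; the no-capture and multi-lambda arms are illustrative guesses.

```rust
// Hypothetical model of a lambda set; not the real roc_mono types.
#[derive(Debug, Clone, PartialEq)]
enum Layout {
    Str,
    Unit,
    Struct(Vec<Layout>),
    Union(Vec<Layout>),
}

/// One lambda in the set: its name plus the layouts of the values it captures.
struct Lambda {
    name: &'static str,
    captures: Vec<Layout>,
}

/// Payload layout for one lambda's captures.
fn captures_layout(captures: &[Layout]) -> Layout {
    match captures {
        [] => Layout::Unit,                    // assumption: nothing to store
        [only] => only.clone(),                // one capture: no wrapper struct
        many => Layout::Struct(many.to_vec()), // assumption: several captures become a struct
    }
}

/// Runtime representation of the whole lambda set.
fn representation(set: &[Lambda]) -> Layout {
    match set {
        [] => Layout::Unit,
        // A single lambda needs no tag, so only its captures remain.
        [single] => captures_layout(&single.captures),
        // Assumption: several lambdas become a tag union, like `[get_g Str, ...]`.
        many => Layout::Union(many.iter().map(|l| captures_layout(&l.captures)).collect()),
    }
}

fn main() {
    let set = [Lambda { name: "get_g", captures: vec![Layout::Str] }];
    let repr = representation(&set);
    println!("{} is represented as {:?}", set[0].name, repr);
}
```

In this model, the set containing only `get_g` with one `Str` capture collapses to `Str`, which mirrors `representation: InLayout(STR)` in the Debug output.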

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:07):

I know that there are like 3, maybe 4 people on the core team that deeply understand this code, but I'd like to get to the point where I could be a +1 to that tally in six months to a year

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:08):

Interesting

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:08):

What you are saying makes sense, but with this mono layout, you get this in gen_llvm:

Error in alias analysis: error in module ModName("UserApp"), function definition FuncName("\x10\x00\x00\x00\x00\x00\x00\x00l\x08\x12\xd4g1\x9bl"), definition of value binding ValueId(4): expected type '((heap_cell,),)', found type '()'

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:09):

And here is the morphic program:

program {
  mod "UserApp" {
    const "\x06\x00\x00\x00\x0e\x00\x00\x00": type_1 = {
      let val_0 = new_heap_cell ();
      let val_1 = make_tuple (val_0);
      val_1
    } where {
      type type_0 = heap_cell;
      type type_1 = (type_0);
    }

    const "THIS IS A STATIC LIST": type_4 = {
      let val_0 = new_heap_cell ();
      let val_1 = empty_bag<type_0> ();
      let val_2 = make_tuple (val_0, val_1);
      val_2
    } where {
      type type_0 = ();
      type type_1 = ();
      type type_2 = heap_cell;
      type type_3 = bag<type_1>;
      type type_4 = (type_2, type_3);
    }

    fn "" (val_0: type_3) -> type_3 {
      let val_6 = choice {
        case {
          let val_1 = make_tuple ();
          let val_2 = make_union<type_1, type_0> 0 (val_1);
          let val_3 = unwrap_union 1 (val_2);
          let val_4 = call["\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"] "UserApp"::"\x10\x00\x00\x00\x00\x00\x00\x00l\x08\x12\xd4g1\x9bl" (val_3);
          let val_5 = unknown_with<type_2> (val_4);
          val_5
        },
      } ();
      val_6
    } where {
      type type_0 = ();
      type type_1 = ();
      type type_2 = ();
      type type_3 = ();
    }

    fn "\x10\x00\x00\x00\x00\x00\x00\x00l\x08\x12\xd4g1\x9bl" (val_0: type_0) -> type_2 {
      let val_1 = const_ref "UserApp"::"\x06\x00\x00\x00\x0e\x00\x00\x00" ();
      let val_2 = recursive_touch (val_1);
      let val_3 = make_tuple ();
      let val_4 = call["\x01\x00\x00\x00"] "UserApp"::"\x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf]" (val_3);
      val_4
    } where {
      type type_0 = ();
      type type_1 = heap_cell;
      type type_2 = (type_1);
    }

    fn "\x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf]" (val_0: type_2) -> type_4 {
      let val_1 = get_tuple_field 0 (val_0);
      val_1
    } where {
      type type_0 = heap_cell;
      type type_1 = (type_0);
      type type_2 = (type_1);
      type type_3 = heap_cell;
      type type_4 = (type_3);
    }
  }

  entry_point "" = "UserApp"::"";
}

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:10):

And if I'm reading things even close to right, \x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf] is the function that corresponds to get_g in the source

view this post on Zulip Ayaz Hafiz (Jan 22 2025 at 16:10):

right so that's a bug in the morphic construction

view this post on Zulip Ayaz Hafiz (Jan 22 2025 at 16:10):

      let val_3 = make_tuple ();
      let val_4 = call["\x01\x00\x00\x00"] "UserApp"::"\x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf]" (val_3);

this isn't passing the capture

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:11):

(val_3) isn't the capture?

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:11):

Oh, it's a plain tuple!

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:12):

Should it be make_tuple (val_2)?

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:13):

I see that it's expecting val_0 to be a (heap_cell), and then it unwraps the first element.
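
A rough sketch of the mismatch being discussed, assuming a tiny hypothetical type language (`Ty`, `show`, and `check_call` are made up for illustration and are not morphic's API). It only compares the type the callee declares with the type the caller passes, and formats the result to resemble the error text quoted earlier.

```rust
// Hypothetical mini type language; not morphic's real types.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    HeapCell,
    Tuple(Vec<Ty>),
}

/// Render a type in a style resembling the error text quoted in the thread.
fn show(ty: &Ty) -> String {
    match ty {
        Ty::HeapCell => "heap_cell".to_string(),
        Ty::Tuple(items) => {
            let inner: String = items.iter().map(|t| format!("{},", show(t))).collect();
            format!("({})", inner)
        }
    }
}

/// Compare the parameter type a callee declares with the argument a caller passes.
fn check_call(expected: &Ty, found: &Ty) -> Result<(), String> {
    if expected == found {
        Ok(())
    } else {
        Err(format!(
            "expected type '{}', found type '{}'",
            show(expected),
            show(found)
        ))
    }
}

fn main() {
    // The callee reads field 0 of its argument, so it expects a tuple wrapping the capture.
    let expected = Ty::Tuple(vec![Ty::Tuple(vec![Ty::HeapCell])]);
    // `make_tuple ()` with no operands produces the empty tuple.
    let passed = Ty::Tuple(vec![]);
    println!("{:?}", check_call(&expected, &passed));
    // A tuple that wraps the capture satisfies the check.
    let fixed = Ty::Tuple(vec![Ty::Tuple(vec![Ty::HeapCell])]);
    println!("{:?}", check_call(&expected, &fixed));
}
```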

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:13):

I need to read up on what a "recursive_touch" is. I'm assuming it's a ref increment recursively?

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:14):

I really have to go to work, but I'll check back in at lunch

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:15):

Thanks for your help Ayaz, it makes me feel better knowing that it is on the morphic generation side, and not all the way back in mono still

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:50):

@Sam Mohr fyi

view this post on Zulip Ayaz Hafiz (Jan 22 2025 at 16:58):

i would check that the mono IR looks right first

view this post on Zulip Ayaz Hafiz (Jan 22 2025 at 16:58):

you can also run with ROC_CHECK_MONO_IR=1 which runs a type checker over the IR and spits out if there is a bug

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:58):

Sweet. Thanks

view this post on Zulip Anthony Bullard (Jan 22 2025 at 16:59):

It’d be cool if we had an omnibus flag for debugging all things in a phase

view this post on Zulip Anthony Bullard (Jan 22 2025 at 21:30):

Here's the mono::Proc:

Proc {
    name: LambdaName {
        name: `#UserApp.get_g`,
        niche: Niche(
            Captures(
                [
                    InLayout(STR),
                ],
            ),
        ),
    },
    args: [
        (
            InLayout(
                22,
            ),
            `#UserApp.g`,
        ),
    ],
    body: Ret(
        `#UserApp.g`,
    ),
    closure_data_layout: Some(
        InLayout(
            22,
        ),
    ),
    ret_layout: InLayout(STR),
    is_self_recursive: NotSelfRecursive,
    is_erased: false,
}

view this post on Zulip Anthony Bullard (Jan 22 2025 at 21:31):

Running with ROC_CHECK_MONO_IR=1 didn't output anything different

view this post on Zulip Anthony Bullard (Jan 22 2025 at 21:32):

This looks like get_g should take an arg. The closure_data_layout looks curious to me though

view this post on Zulip Anthony Bullard (Jan 22 2025 at 21:32):

But it does line up with what I see in the morphic program...

view this post on Zulip Anthony Bullard (Jan 22 2025 at 21:33):

So it looks like part of building the IR thinks it will take a string arg for the captures, but something else is creating a new layout for it

view this post on Zulip Sam Mohr (Jan 22 2025 at 21:37):

get_g should take the lambdaset as its only arg

view this post on Zulip Sam Mohr (Jan 22 2025 at 21:37):

Which it looks like is the case

view this post on Zulip Sam Mohr (Jan 22 2025 at 21:38):

I presume that InLayout is a single-variant union with the payload of a STR

view this post on Zulip Sam Mohr (Jan 22 2025 at 21:38):

Which is why it would not say InLayout(STR) but instead InLayout(22)

view this post on Zulip Anthony Bullard (Jan 22 2025 at 21:42):

I wish it were easy to figure out what that layout is

view this post on Zulip Anthony Bullard (Jan 22 2025 at 21:47):

After changing the visibility of a method and wrapping a call in unsafe:

Layout 22: Layout {
    repr: Direct(
        LambdaSet(
            LambdaSet {
                set: [
                    ( Test.2, [InLayout(STR)]),
                ],
                args: [],
                ret: InLayout(STR),
                representation: InLayout(STR),
                full_layout: InLayout(
                    22,
                ),
            },
        ),
    ),
    semantic: None,
}

view this post on Zulip Anthony Bullard (Jan 22 2025 at 22:26):

Hmmm...

                let call = self::Call {
                    call_type: CallType::ByName {
                        name: proc_name,
                        ret_layout: function_layout.result,
                        arg_layouts: function_layout.arguments,
                        specialization_id: env.next_call_specialization_id(),
                    },
                    arguments: field_symbols,
                };

view this post on Zulip Anthony Bullard (Jan 22 2025 at 22:27):

Do we really need to explicitly add the closure struct to field_symbols?

view this post on Zulip Anthony Bullard (Jan 22 2025 at 22:32):

But call_specialized_proc thinks we should do this: since field_symbols is empty, it assumes we are getting get_g here

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:30):

We figured this one out

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:30):

Now....

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:30):

Can someone ELI5 borrow signatures to me?

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:31):

This panics on not having a borrow signature:

            app "test" provides [main] to "./platform"

            Effect a := () -> a

            succeed : a -> Effect a
            succeed = |x| @Effect(|| x)

            run_effect : Effect a -> a
            run_effect = |@Effect(thunk)| thunk()

            foo : Effect F64 # <----- This has no borrow signature
            foo =
                succeed(1.23)

            main : F64
            main =
                run_effect(foo)

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:33):

I have the high level understanding that this is how we determine something about Refcounting around a closure?

view this post on Zulip Brendan Hansknecht (Jan 22 2025 at 23:44):

Basically it is a way to avoid tons of extra refcount increments and decrements, especially in hot loops.

view this post on Zulip Brendan Hansknecht (Jan 22 2025 at 23:45):

All it does is track whether a list is passed into a function that requests ownership (anything that wants to update the list in place). If so, it is owned. If not, it is borrowed. If it is borrowed, we can pass it all the way down the call stack without ever incrementing or decrementing the refcount. Just keeping the single refcount at the top of the stack

view this post on Zulip Brendan Hansknecht (Jan 22 2025 at 23:45):

I think all non-list types are always owned/ignored currently

view this post on Zulip Brendan Hansknecht (Jan 22 2025 at 23:47):

So basically a dirty bit per arg to note if it is used in a way that requires ownership
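
A minimal sketch of the "dirty bit per arg" idea, assuming a made-up mini-IR: `Ownership`, `Use`, `Func`, and `infer` are hypothetical names, and this is not how `infer_borrow_signatures` is actually written. Every parameter starts as borrowed and is flipped to owned if the body mutates it in place or forwards it to a callee parameter that is already owned, repeating until nothing changes.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Ownership {
    Borrowed,
    Owned,
}

/// What a function body does with one of its parameters (hypothetical mini-IR).
enum Use {
    /// Only reads the list, e.g. takes its length.
    Read,
    /// Wants to update the list in place, so it needs ownership.
    MutateInPlace,
    /// Forwards the parameter to argument `arg` of function `callee`.
    PassTo { callee: &'static str, arg: usize },
}

struct Func {
    name: &'static str,
    /// One list of uses per parameter.
    params: Vec<Vec<Use>>,
}

/// Fixpoint inference of one ownership bit per parameter.
fn infer(funcs: &[Func]) -> HashMap<&'static str, Vec<Ownership>> {
    let mut sigs: HashMap<&'static str, Vec<Ownership>> = funcs
        .iter()
        .map(|f| (f.name, vec![Ownership::Borrowed; f.params.len()]))
        .collect();
    let mut changed = true;
    while changed {
        changed = false;
        for f in funcs {
            for (i, uses) in f.params.iter().enumerate() {
                let needs_owned = uses.iter().any(|u| match u {
                    Use::Read => false,
                    Use::MutateInPlace => true,
                    // Indexing panics if the callee has no signature in the map.
                    Use::PassTo { callee, arg } => sigs[callee][*arg] == Ownership::Owned,
                });
                if needs_owned && sigs[f.name][i] == Ownership::Borrowed {
                    sigs.get_mut(f.name).unwrap()[i] = Ownership::Owned;
                    changed = true;
                }
            }
        }
    }
    sigs
}

fn main() {
    let funcs = vec![
        Func { name: "len_plus_one", params: vec![vec![Use::Read]] },
        Func { name: "push_zero", params: vec![vec![Use::MutateInPlace]] },
        Func {
            name: "wrapper",
            params: vec![
                vec![Use::PassTo { callee: "len_plus_one", arg: 0 }],
                vec![Use::PassTo { callee: "push_zero", arg: 0 }],
            ],
        },
    ];
    let sigs = infer(&funcs);
    println!("{:?}", sigs["wrapper"]); // prints [Borrowed, Owned]
}
```

Here the first parameter of `wrapper` stays borrowed all the way down, so no refcount changes are needed for it, while the second becomes owned because it eventually reaches an in-place update.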

view this post on Zulip Brendan Hansknecht (Jan 22 2025 at 23:47):

I think we would eventually want to expand this to be tracked for things that wrap lists rather than just raw lists (for example dict). Not sure about the state of borrow signatures and tags.

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:48):

I’m getting the same panic in an example that is not wrapped in an opaque type

view this post on Zulip Brendan Hansknecht (Jan 22 2025 at 23:49):

Interesting. Would have guessed the opaque type was the cause

view this post on Zulip Brendan Hansknecht (Jan 22 2025 at 23:50):

Might just be related to storing a closure. That might disconnect the borrow signature from the original function?

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:53):

I have trouble finding where these signatures are inserted in the first place

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:53):

I see only one place

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:53):

Which is infer_borrow_signatures

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:53):

But I haven’t gone deep on it

view this post on Zulip Sam Mohr (Jan 22 2025 at 23:54):

We do it per proc name and layout

view this post on Zulip Sam Mohr (Jan 22 2025 at 23:55):

Each name+function layout pair is assumed to have been specialized at this point
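
A small sketch of what keying by name plus layout implies, assuming hypothetical stand-in types (`ProcLayout`, `BorrowSignature`, `Signatures`, and `lookup` are invented here and are not the types in borrow.rs). A lookup only succeeds when both the name and the specialized layout match an entry that was inserted earlier.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a specialized function layout.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ProcLayout {
    args: Vec<&'static str>,
    ret: &'static str,
}

/// One flag per argument: `true` means the callee needs ownership.
type BorrowSignature = Vec<bool>;

/// Signatures are keyed by proc name and layout together.
type Signatures = HashMap<(&'static str, ProcLayout), BorrowSignature>;

fn lookup<'a>(
    sigs: &'a Signatures,
    name: &'static str,
    layout: &ProcLayout,
) -> Option<&'a BorrowSignature> {
    sigs.get(&(name, layout.clone()))
}

fn main() {
    let mut sigs = Signatures::new();
    let str_thunk = ProcLayout { args: vec!["Str"], ret: "Str" };
    sigs.insert(("thunk", str_thunk.clone()), vec![false]);

    // Same name, different specialization: no entry was ever inserted for it.
    let f64_thunk = ProcLayout { args: vec!["F64"], ret: "F64" };
    println!("{:?}", lookup(&sigs, "thunk", &str_thunk)); // prints Some([false])
    println!("{:?}", lookup(&sigs, "thunk", &f64_thunk)); // prints None
}
```

If calling code assumes the entry exists and unwraps the result, a missing specialization shows up as a panic rather than as a type error.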

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:58):

Yeah I think it is, but for some reason its borrow signature just isn’t there

view this post on Zulip Anthony Bullard (Jan 22 2025 at 23:58):

I’ll figure it out tonight or in the morning. Thanks for the context

view this post on Zulip Sam Mohr (Jan 22 2025 at 23:58):

That's why I think the function didn't get multiple specializations

view this post on Zulip Sam Mohr (Jan 22 2025 at 23:59):

We assume there's one for each specialized return type on line 352 of borrow.rs


Last updated: Jul 06 2025 at 12:14 UTC