Understanding IR Layout · compiler development

I took this as a function layout with no args, returning a string. But the lambda set has a function that has a string argument. Am I understanding this wrong (without reading all 10k+ lines of mono/src/ir.rs)?

Anthony Bullard (Jan 22 2025 at 16:00):

            app "test" provides [main] to "./platform"

            main : Str
            main =
                g : Str
                g = "hello world"

                get_g : () -> Str
                get_g = || g

                get_g()

Anthony Bullard (Jan 22 2025 at 16:02):

My assumption is that the layout should make sure to include the argument for the string which is captured - assuming that since there is a single capture we don't wrap it in a struct/tuple/record.

Anthony Bullard (Jan 22 2025 at 16:06):

It feels like for a zero-arg closure, we have to ensure that the arg layout of the function _always_ includes the captured value (here a string, but often a struct).

Ayaz Hafiz (Jan 22 2025 at 16:07):

The lambda set lists all function values that could be referenced by a value of a function type. In your case, get_g is a function with a lambda set that is a singleton that contains just get_g. The lambda set also encodes any captures of the function, and get_g captures g so you can think of it as [get_g Str] if it were a tag.

Function(
    [],
    LambdaSet {
        set: [
            ( Test.2, [InLayout(STR)]), # get_g, with capture Str
        ],
        args: [],
        ret: InLayout(STR), # return type
        representation: InLayout(STR), # representation of the lambda set at runtime - the get_g function and captures (unwrapped to just Str because there is only one function and one capture)
        full_layout: InLayout(
            22,
        ),
    },
    InLayout(STR), # return type
)
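To make the "[get_g Str] if it were a tag" reading concrete, here is a minimal Rust sketch of the idea, not the compiler's own code. The names GetGSet, get_g_proc, call_tagged and call_unwrapped are hypothetical. It assumes the capture is handed to the specialized procedure as an extra argument, and shows why a singleton lambda set with one capture can be represented by the capture alone.

```rust
// Hypothetical illustration only; none of these names exist in the Roc compiler.

/// The singleton lambda set viewed as a tag union: one variant per
/// function in the set, each variant carrying that function's captures.
enum GetGSet {
    GetG(String),
}

/// The specialized procedure for get_g. It declares no source-level
/// arguments, but the captured string arrives as an extra argument.
fn get_g_proc(captured_g: String) -> String {
    captured_g
}

/// Calling through the tagged form: dispatch on the variant, then pass
/// the captures to the matching procedure.
fn call_tagged(f: GetGSet) -> String {
    match f {
        GetGSet::GetG(g) => get_g_proc(g),
    }
}

/// With a single function and a single capture there is nothing to
/// dispatch on and nothing to wrap, so the value is just the capture.
type GetGUnwrapped = String;

/// Calling through the unwrapped form: the closure value is the capture.
fn call_unwrapped(f: GetGUnwrapped) -> String {
    get_g_proc(f)
}
```

Both call paths hand the same string to get_g_proc, which is the sense in which the representation above is InLayout(STR) while args stays empty.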

Anthony Bullard (Jan 22 2025 at 16:07):

I know that there are like 3, maybe 4 people on the core team who deeply understand this code, but I'd like to get to the point where I could be a +1 to that tally in six months to a year.

Anthony Bullard (Jan 22 2025 at 16:08):

What you are saying makes sense, but with this mono layout, you get this in gen_llvm:

Error in alias analysis: error in module ModName("UserApp"), function definition FuncName("\x10\x00\x00\x00\x00\x00\x00\x00l\x08\x12\xd4g1\x9bl"), definition of value binding ValueId(4): expected type '((heap_cell,),)', found type '()'

Anthony Bullard (Jan 22 2025 at 16:09):

program {
  mod "UserApp" {
    const "\x06\x00\x00\x00\x0e\x00\x00\x00": type_1 = {
      let val_0 = new_heap_cell ();
      let val_1 = make_tuple (val_0);
      val_1
    } where {
      type type_0 = heap_cell;
      type type_1 = (type_0);
    }

    const "THIS IS A STATIC LIST": type_4 = {
      let val_0 = new_heap_cell ();
      let val_1 = empty_bag<type_0> ();
      let val_2 = make_tuple (val_0, val_1);
      val_2
    } where {
      type type_0 = ();
      type type_1 = ();
      type type_2 = heap_cell;
      type type_3 = bag<type_1>;
      type type_4 = (type_2, type_3);
    }

    fn "" (val_0: type_3) -> type_3 {
      let val_6 = choice {
        case {
          let val_1 = make_tuple ();
          let val_2 = make_union<type_1, type_0> 0 (val_1);
          let val_3 = unwrap_union 1 (val_2);
          let val_4 = call["\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"] "UserApp"::"\x10\x00\x00\x00\x00\x00\x00\x00l\x08\x12\xd4g1\x9bl" (val_3);
          let val_5 = unknown_with<type_2> (val_4);
          val_5
        },
      } ();
      val_6
    } where {
      type type_0 = ();
      type type_1 = ();
      type type_2 = ();
      type type_3 = ();
    }

    fn "\x10\x00\x00\x00\x00\x00\x00\x00l\x08\x12\xd4g1\x9bl" (val_0: type_0) -> type_2 {
      let val_1 = const_ref "UserApp"::"\x06\x00\x00\x00\x0e\x00\x00\x00" ();
      let val_2 = recursive_touch (val_1);
      let val_3 = make_tuple ();
      let val_4 = call["\x01\x00\x00\x00"] "UserApp"::"\x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf]" (val_3);
      val_4
    } where {
      type type_0 = ();
      type type_1 = heap_cell;
      type type_2 = (type_1);
    }

    fn "\x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf]" (val_0: type_2) -> type_4 {
      let val_1 = get_tuple_field 0 (val_0);
      val_1
    } where {
      type type_0 = heap_cell;
      type type_1 = (type_0);
      type type_2 = (type_1);
      type type_3 = heap_cell;
      type type_4 = (type_3);
    }
  }

  entry_point "" = "UserApp"::"";
}

Anthony Bullard (Jan 22 2025 at 16:10):

And if I'm reading things even close to right, \x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf] is the function that corresponds to get_g in the source.

Ayaz Hafiz (Jan 22 2025 at 16:10):

      let val_3 = make_tuple ();
      let val_4 = call["\x01\x00\x00\x00"] "UserApp"::"\x10\x00\x00\x00\x02\x00\x00\x00\xac]\xc2\'o)\xaf]" (val_3);

Anthony Bullard (Jan 22 2025 at 16:13):

I see that it's expecting val_0 to be a (heap_cell), and then it unwraps the first element.

Anthony Bullard (Jan 22 2025 at 16:13):

I need to read up on what a "recursive_touch" is. I'm assuming it's a ref increment recursively?

Anthony Bullard (Jan 22 2025 at 16:15):

Thanks for your help Ayaz, it makes me feel better knowing that it is on the morphic generation side, and not all the way back in mono still.

Ayaz Hafiz (Jan 22 2025 at 16:58):

You can also run with ROC_CHECK_MONO_IR=1, which runs a type checker over the IR and reports it if there is a bug.
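As a rough illustration of what a type checker over an IR can catch, here is a toy Rust sketch. It is not the checker behind ROC_CHECK_MONO_IR; Layout, ProcSig, Call and check_call are hypothetical, heavily simplified stand-ins. It assumes each procedure records its full argument layouts (captures included) and compares them with what a by-name call site passes.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a layout (hypothetical).
#[derive(Clone, Debug, PartialEq)]
enum Layout {
    Str,
    F64,
}

/// Signature of a specialized procedure: argument layouts, with any
/// captured values included, plus the return layout.
struct ProcSig {
    args: Vec<Layout>,
    ret: Layout,
}

/// What a by-name call site claims about the procedure it calls.
struct Call {
    callee: String,
    arg_layouts: Vec<Layout>,
    ret_layout: Layout,
}

/// Returns Ok(()) when the call site agrees with the callee's signature,
/// otherwise a message describing the first mismatch found.
fn check_call(procs: &HashMap<String, ProcSig>, call: &Call) -> Result<(), String> {
    let sig = procs
        .get(&call.callee)
        .ok_or_else(|| format!("unknown proc `{}`", call.callee))?;
    if sig.args != call.arg_layouts {
        return Err(format!(
            "call to `{}`: expected args {:?}, found {:?}",
            call.callee, sig.args, call.arg_layouts
        ));
    }
    if sig.ret != call.ret_layout {
        return Err(format!(
            "call to `{}`: expected return {:?}, found {:?}",
            call.callee, sig.ret, call.ret_layout
        ));
    }
    Ok(())
}
```

In this toy model, a get_g procedure that expects one Str argument for its capture, called with an empty argument list, is rejected with an "expected args" message. That is the same shape of mismatch as the '((heap_cell,),)' versus '()' error above, only reported one stage earlier.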

Anthony Bullard (Jan 22 2025 at 21:30):

Proc {
                name: LambdaName {
                    name: `#UserApp.get_g`,
                    niche: Niche(
                        Captures(
                            [
                                InLayout(STR),
                            ],
                        ),
                    ),
                },
                args: [
                    (
                        InLayout(
                            22,
                        ),
                        `#UserApp.g`,
                    ),
                ],
                body: Ret(
                    `#UserApp.g`,
                ),
                closure_data_layout: Some(
                    InLayout(
                        22,
                    ),
                ),
                ret_layout: InLayout(STR),
                is_self_recursive: NotSelfRecursive,
                is_erased: false,
            }

Anthony Bullard (Jan 22 2025 at 21:32):

This looks like get_g should take an arg. The closure_data_layout looks curious to me though

Anthony Bullard (Jan 22 2025 at 21:33):

So it looks like part of building the IR thinks it will take a string arg for the captures, but something else is creating a new layout for it

Anthony Bullard (Jan 22 2025 at 21:47):

Layout 22: Layout {
    repr: Direct(
        LambdaSet(
            LambdaSet {
                set: [
                    ( Test.2, [InLayout(STR)]),
                ],
                args: [],
                ret: InLayout(STR),
                representation: InLayout(STR),
                full_layout: InLayout(
                    22,
                ),
            },
        ),
    ),
    semantic: None,
}

Anthony Bullard (Jan 22 2025 at 22:26):

                let call = self::Call {
                    call_type: CallType::ByName {
                        name: proc_name,
                        ret_layout: function_layout.result,
                        arg_layouts: function_layout.arguments,
                        specialization_id: env.next_call_specialization_id(),
                    },
                    arguments: field_symbols,
                };

Anthony Bullard (Jan 22 2025 at 22:32):

But call_specialized_proc thinks we should do this: because field_symbols is empty, it thinks we are just getting get_g here.

Anthony Bullard (Jan 22 2025 at 23:31):

            app "test" provides [main] to "./platform"

            Effect a := () -> a

            succeed : a -> Effect a
            succeed = |x| @Effect(|| x)

            run_effect : Effect a -> a
            run_effect = |@Effect(thunk)| thunk()

            foo : Effect F64 # <----- This has no borrow signature
            foo =
                succeed(1.23)

            main : F64
            main =
                run_effect(foo)

Anthony Bullard (Jan 22 2025 at 23:33):

I have the high-level understanding that this is how we determine something about refcounting around a closure?

Brendan Hansknecht (Jan 22 2025 at 23:44):

Basically it is a way to avoid tons of extra refcount increments and decrements, especially in hot loops.

Brendan Hansknecht (Jan 22 2025 at 23:45):

All it does is track whether a list is passed into a function that requests ownership (anything that wants to update the list in place). If so, it is owned. If not, it is borrowed. If it is borrowed, we can pass it all the way down the call stack without ever incrementing or decrementing the refcount, just keeping the single refcount at the top of the stack.

Brendan Hansknecht (Jan 22 2025 at 23:47):

So basically a dirty bit per arg to note if it is used in a way that requires ownership
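The "dirty bit per arg" description can be sketched in a few lines of Rust. This is a simplified model under stated assumptions, not the compiler's borrow inference: Ownership, Use, infer_arg and infer_signature are hypothetical names, and the model only looks at direct uses of each argument.

```rust
/// Ownership recorded for one argument in a borrow signature.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ownership {
    Borrowed,
    Owned,
}

/// How a function body uses one of its arguments (hypothetical model).
enum Use {
    /// Only read, for example taking the length of a list.
    Read,
    /// Passed as argument `index` to a callee with the given signature.
    PassedTo {
        callee_sig: Vec<Ownership>,
        index: usize,
    },
}

/// An argument is Owned as soon as any single use requires ownership;
/// otherwise it stays Borrowed and needs no refcount traffic.
fn infer_arg(uses: &[Use]) -> Ownership {
    let needs_ownership = uses.iter().any(|u| match u {
        Use::Read => false,
        Use::PassedTo { callee_sig, index } => {
            callee_sig.get(*index) == Some(&Ownership::Owned)
        }
    });
    if needs_ownership {
        Ownership::Owned
    } else {
        Ownership::Borrowed
    }
}

/// The borrow signature is one such bit per argument.
fn infer_signature(uses_per_arg: &[Vec<Use>]) -> Vec<Ownership> {
    uses_per_arg.iter().map(|uses| infer_arg(uses)).collect()
}
```

For a function whose first argument is passed on to a callee that wants to update it in place and whose second argument is only read, this yields [Owned, Borrowed]. Only the first argument then needs refcount increments and decrements at call sites.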

Brendan Hansknecht (Jan 22 2025 at 23:47):

I think we would eventually want to expand this to be tracked for things that wrap lists rather than just raw lists (for example dict). Not sure about the state of borrow signatures and tags.

Brendan Hansknecht (Jan 22 2025 at 23:50):

Might just be related to storing a closure. That might disconnect the borrow signature from the original function?

Sam Mohr (Jan 22 2025 at 23:55):

Each name+function layout pair is assumed to have been specialized at this point
