Roc Codegen Ignoring `Task` · compiler development

Turns out that roc is simply generating the wrong type for the main function and totally ignoring that their is a Task in the api.

Header : { name : Str, value: List U8 }
Method : [GET, POST, HEAD, DELETE, PUT, OPTIONS]
Request : { method : Method, url : Str, headers : List Header, body: List U8 }
Response : { status : U16, headers : List Header, body: List U8 }

mainForHost : Request -> Task Response []

void @roc__mainForHost_1_exposed(ptr nocapture writeonly sret({ %list.RocList, %list.RocList, i16 }) %0, ptr nocapture readonly %1)

The return type is specifically { %list.RocList, %list.RocList, i16 }. That is the return type of RocResponse, but the function is not just returning a RocResponse, it is returning a Task RocResponse []. Which would become a closure. And a closure just returns closure capture data as a List U8.

So specifically the issue is that when we directly return a task with Task.ok and never await anything, it doesn't actually generate a closure. Instead the raw Task is return. This is a different API from when we call Task.await even once.

I think we need to make sure to always wrap returned Tasks in a closure to keep the API consistent.

I was able to confirm that this isssue is the missing closure by changing the code from:

Task.ok { status: 200, headers: [], body: "The Answer" |> Str.toUtf8 }

x <- Task.ok { status: 200, headers: [], body: "The Answer" |> Str.toUtf8 } |> Task.await
Task.ok x

Richard Feldman (Nov 06 2023 at 04:05):

Richard Feldman (Nov 06 2023 at 04:06):

Brendan Hansknecht (Nov 06 2023 at 04:06):

Roc still returns as a RocResponse. So it never returns any sort of closure or capture.

Brendan Hansknecht (Nov 06 2023 at 04:07):

Richard Feldman (Nov 06 2023 at 04:07):

Brendan Hansknecht (Nov 06 2023 at 04:12):

Brendan Hansknecht (Nov 06 2023 at 04:17):

    res <- main req |> Task.await
    Task.ok res

works in optimized builds but fails in normal builds....that is just super strange to me. Optimizations fixing instead of breaking something

Brendan Hansknecht (Nov 06 2023 at 04:18):

The webserver generates correctly and works in general in optimized builds. It also consistently fails in normal builds

Brendan Hansknecht (Nov 06 2023 at 04:31):

Ok. Looking at the differences between the optimized and non-optimized code. This is my current understanding.

Either way, main directly returns the RocResponse which has the same layout as a Task RocResponse [].
We store that in a structure that thinks it is a closure_data: roc_std::RocList<u8>. This is incorrect, but doesn't end up matter.

Either way, we pass the pointer to the closure data pointer back to roc. Roc knows the pointer is a RocResponse, not a RocList<u8>, interprets it correctly and returns the data.

Since in rust, we are only store enough space for a RocList<u8>, the entire RocResponse does not fit. In the debug build, this overflowing data ends up getting overwritten. This now totally unrelated data leads to the crash. In optimized, we just happen to not overwite the data. As such, even though we are accessing memory that shouldn't be accessed, everything still works fine.

Brendan Hansknecht (Nov 06 2023 at 04:33):

So fundamentally, I think this is still a roc codegen bug. A Task ... should always return a closure data. closure data should always be stored as a RocList<u8>. So we are missing the wrapping and store of the RocResponse into a RocList<u8> of closure data.

Brendan Hansknecht (Nov 06 2023 at 04:34):

So we are missing a layer of indirection to make closure data always consistent layout as a RocList<u8>

Ayaz Hafiz (Nov 06 2023 at 04:38):

Do you have the IR for the program? (you can generate it with ROC_PRINT_IR_AFTER_RESET_REUSE=1 on debug)

Ayaz Hafiz (Nov 06 2023 at 04:38):

Task.ok { status: 200, headers: [], body: "The Answer" |> Str.toUtf8 }

Ayaz Hafiz (Nov 06 2023 at 04:39):

x <- Task.ok { status: 200, headers: [], body: "The Answer" |> Str.toUtf8 } |> Task.await
Task.ok x

Ayaz Hafiz (Nov 06 2023 at 04:39):

So the first case becomes a lambda that is just a function pointer, but the latter is the function pointer + closure data

Brendan Hansknecht (Nov 06 2023 at 04:40):

The await was a red herring. Either way it is still just returning the RocResponse directly.

Brendan Hansknecht (Nov 06 2023 at 04:46):

procedure : `pf..mainForHost` {{List U8, List {Str, List U8}, U16}, {}}
procedure = `pf..mainForHost` (`pf..req`: {List U8, List {Str, List U8}, Str, U8}):
    let `pf..5` : {List U8, List {Str, List U8}, U16} = CallByName `#UserApp.main` `pf..req`;
    let `pf..6` : {} = Struct {};
    let `pf..4` : {{List U8, List {Str, List U8}, U16}, {}} = CallByName `pf.Task.await` `pf..5` `pf..6`;
    ret `pf..4`;

It is eagerly running and directly returning the Task Response [] as a {{List U8, List {Str, List U8}, U16}, {}}
It should be returning closure_data and then actually running the execution in the roc_caller function, at least that is how I understand the model.

That said, we could just make it eagerly run, but then we need to change glue to expect the result type rather than a closure capture and needing to run a caller function.

Brendan Hansknecht (Nov 06 2023 at 04:53):

@Richard Feldman Did we change to main returning a List<U8> if it is a closure? I think that may be the real issue. Main just returns the closure data. So we need to change the glue code to properly store that captured data in a List<U8> cause roc doesn't do that automatically for us.

Brendan Hansknecht (Nov 06 2023 at 04:55):

Yeah....I think this actually is just a glue bug. Roc currently expects glue to generate a spot to store the closure capture bytes and pass a pointer to the location into the main task. The rust glue is not doing that. It is instead expecting roc to return a List<U8>.

Brendan Hansknecht (Nov 06 2023 at 05:06):

Yeah, probably a glue issue (at least with how current roc generates closure handling code).

pub fn mainForHost(arg0: RocRequest) -> RocFunction_88 {
    extern "C" {
        fn roc__mainForHost_1_exposed_generic(
            _: *mut u8,
            _: &mut core::mem::ManuallyDrop<RocRequest>,
        );
        fn roc__mainForHost_1_exposed_size() -> i64;
    }

    unsafe {
        let size = roc__mainForHost_1_exposed_size();
        let mut ret = RocFunction_88 {
            closure_data: RocList::with_capacity(size as usize),
        };
        roc__mainForHost_1_exposed_generic(
            ret.closure_data.as_mut_ptr(),
            &mut core::mem::ManuallyDrop::new(arg0),
        );

        ret
    }
}

Though really, there is no need for RocFunction_88 to contain a RocList at all. It could use a rust vector or even better a small vector type. In a perfect world, it might even use a stack allocated array of bytes.

Brendan Hansknecht (Nov 06 2023 at 05:06):

Maybe this is related to changes that were planned for how we implemented closures but that have not happened yet?

Brendan Hansknecht (Nov 07 2023 at 17:15):

@Richard Feldman looping back to this. Does all of the final conclusion make senses? Is there a specific reason that RocFunction_88 contains a RocList<u8> instead of just using a Vec<u8> (or not existing at all and just being inlined?)

I think RocFunction_88 probably shouldn't exist in general (at least with current roc) cause calling fn88.force_thunk() twice is not guaranteed to work. A roc closure is only safe to call once (without incrementing the refcount of everything in the closure capture, which the host knows none of the details of).

Stream: compiler development

Topic: Roc Codegen Ignoring `Task`

Brendan Hansknecht (Nov 06 2023 at 03:58):

Richard Feldman (Nov 06 2023 at 04:05):

Richard Feldman (Nov 06 2023 at 04:06):

Brendan Hansknecht (Nov 06 2023 at 04:06):

Brendan Hansknecht (Nov 06 2023 at 04:06):

Brendan Hansknecht (Nov 06 2023 at 04:07):

Richard Feldman (Nov 06 2023 at 04:07):

Brendan Hansknecht (Nov 06 2023 at 04:12):

Brendan Hansknecht (Nov 06 2023 at 04:17):

Brendan Hansknecht (Nov 06 2023 at 04:18):

Brendan Hansknecht (Nov 06 2023 at 04:31):

Brendan Hansknecht (Nov 06 2023 at 04:33):

Brendan Hansknecht (Nov 06 2023 at 04:34):

Ayaz Hafiz (Nov 06 2023 at 04:38):

Ayaz Hafiz (Nov 06 2023 at 04:38):

Ayaz Hafiz (Nov 06 2023 at 04:39):

Ayaz Hafiz (Nov 06 2023 at 04:39):

Brendan Hansknecht (Nov 06 2023 at 04:40):

Brendan Hansknecht (Nov 06 2023 at 04:46):

Brendan Hansknecht (Nov 06 2023 at 04:53):

Brendan Hansknecht (Nov 06 2023 at 04:55):

Brendan Hansknecht (Nov 06 2023 at 05:06):

Brendan Hansknecht (Nov 06 2023 at 05:06):

Brendan Hansknecht (Nov 07 2023 at 17:15):