Stream: contributing

Topic: Implement RocRefcounted for RocResult


view this post on Zulip Luke Boswell (Jul 14 2024 at 11:17):

Having a crack at this and I'm a little stuck. Just wondering if there is something obvious I'm missing here.

no method named `is_refcounted` found for reference `&ManuallyDrop<T>` in the current scope
found the following associated functions; to be used as methods, functions must have a `self` parameter
impl<T, E> RocRefcounted for RocResult<T, E>
where
    T: RocRefcounted,
    E: RocRefcounted,
{
    fn inc(&mut self) {
        match self.as_result_of_refs() {
            Ok(payload) if payload.is_refcounted() => {
                payload.inc();
            }
            Err(payload) if payload.is_refcounted() => {
                payload.inc();
            }
            _ => {}
        }
    }
    fn dec(&mut self) {
        todo!()
    }
    fn is_refcounted() -> bool {
        todo!()
    }
}

view this post on Zulip Luke Boswell (Jul 14 2024 at 11:33):

I think I'm closer...

impl<T, E> RocRefcounted for RocResult<T, E>
where
    T: RocRefcounted,
    E: RocRefcounted,
{
    fn inc(&mut self) {
        unsafe {
            match self.tag {
                RocResultTag::RocOk => (*self.payload.ok).inc(),
                RocResultTag::RocErr => (*self.payload.err).inc(),
            }
        }
    }
    fn dec(&mut self) {
        unsafe {
            match self.tag {
                RocResultTag::RocOk => (*self.payload.ok).dec(),
                RocResultTag::RocErr => (*self.payload.err).dec(),
            }
        }
    }
    fn is_refcounted() -> bool {
        unsafe {
            match self.tag {
                RocResultTag::RocOk => (self.payload.ok).is_refcounted(),
                RocResultTag::RocErr => (self.payload.err).is_refcounted(),
            }
        }
    }
}

Screenshot-2024-07-14-at-21.32.50.png

pub trait RocRefcounted {
    /// Increments the refcount.
    fn inc(&mut self);

    /// Decrements the refcount potentially freeing the underlying allocation.
    fn dec(&mut self);

    /// Returns true if the type is actually refcounted by roc.
    fn is_refcounted() -> bool;
}

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 16:18):

is_refcointed should have a branch

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 16:18):

Should be true if either the ok or err is refcounted

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 16:54):

fn is_refcounted() -> bool {
        T::is_refcounted() || E::is_refcounted()
}

view this post on Zulip Luke Boswell (Jul 14 2024 at 23:30):

Also need to implement this

impl roc_std::RocRefcounted for SQLiteValue {
    fn inc(&mut self) {
        unimplemented!();
    }
    fn dec(&mut self) {
        unimplemented!();
    }
    fn is_refcounted() -> bool {
        true
    }
}

view this post on Zulip Luke Boswell (Jul 14 2024 at 23:31):

This was generated by roc glue. I might try hand implementing first.

view this post on Zulip Luke Boswell (Jul 14 2024 at 23:45):

I think I know where the problem might be for basic-cli...

We have a lot of hand-rolled interface types like the following.

#[repr(C)]
pub struct InternalResponse {
    body: RocList<u8>,
    metadata: Metadata,
    variant: RocStr,
}

#[repr(C)]
pub struct Metadata {
    headers: RocList<Header>,
    url: RocStr,
    statusCode: u16,
    statusText: RocStr,
}

Do we need to do/change anything to manage the refcounting in these correctly?

view this post on Zulip Luke Boswell (Jul 14 2024 at 23:48):

Like if roc gives the host one of these, or the host gives roc one, do we need to use the RocRefcounted fn's at all?

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:21):

Do we need to do/change anything to manage the refcounting in these correctly?

If they aren't put in lists, nothing should change

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:22):

So only if you give a List InternalResponse To/from roc.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:22):

SQLiteValue

Must have missed that, I thought it would fall under a struct style tag, but I guess it is a non-recursive tag which needs smarter rules

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:23):

impl should just be a match statement and then calling inc/dec on the values within it.

view this post on Zulip Luke Boswell (Jul 15 2024 at 00:23):

OK.

Well, I'm currently porting some of the changes we made in basic-cli across to basic-webserver so there is less glue there and trying to implement these RefCounted things.

view this post on Zulip Luke Boswell (Jul 15 2024 at 00:23):

I haven't found the issue in basic-cli

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:24):

I haven't found the issue in basic-cli

I think likely the bug is on the roc internal side and not platform related.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:25):

Also, is args.roc the only one with issues?

view this post on Zulip Luke Boswell (Jul 15 2024 at 00:27):

I'm not sure, it's the first one on the list and fails in the expect script

view this post on Zulip Luke Boswell (Jul 15 2024 at 00:28):

Ok, figured you can run the expect scripts manually using

$ EXAMPLES_DIR=examples/ expect ci/expect_scripts/args.exp
spawn examples//args div -n 5 -d 20
args(82068,0x201c58c00) malloc: Heap corruption detected, free list is damaged at 0x600000f48080
*** Incorrect guard value: 145982044012671
args(82068,0x201c58c00) malloc: *** set a breakpoint in malloc_error_break to debug

Error: output was different from expected value.

view this post on Zulip Luke Boswell (Jul 15 2024 at 00:30):

Just ran them all manually, only args.roc is failing

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:31):

Args with valgrind, core issue:

# Probably reading heap size on a list allocated without a heap size?
=56308== Invalid read of size 8
==56308==    at 0x1CDF43: ??? (roc_app:0)
==56308==    by 0x1D1E03: ??? (roc_app:0)
==56308==    by 0x1AAAB7: List_walkTryHelp_147f9a29f721bc6a47c431ffbf897368d53af6834eae9416f984da7fc2dd89 (roc_app:0)


# Double free
==56308==  Address 0x4b6d4b0 is 0 bytes inside a block of size 9 free'd
==56308==    at 0x48469E4: free (in /nix/store/78zxvxfy6klvwfc2s95y71y0b284fd6v-valgrind-3.22.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==56308==    by 0x1D4882: roc_dealloc (lib.rs:81)
==56308==    by 0x1CE2CA: decrement_refcounted_ptr_8 (roc_app:0)
==56308==    by 0x1CE27D: ??? (roc_app:0)
==56308==    by 0x18DB36: #Attr_#generic_rc_by_ref_2_dec (roc_app:0)
==56308==    by 0x18DB36: Arg.Parser_constructSetOfOptions_3655121d7ba98da90331d38b92032d451b515e695221ee2a9d34aa692cd2c (roc_app:0)

# Use after free
==56308== Invalid write of size 8
==56308==    at 0x1CDF64: ??? (roc_app:0)
==56308==    by 0x1D1E03: ??? (roc_app:0)
==56308==    by 0x1AAAB7: List_walkTryHelp_147f9a29f721bc6a47c431ffbf897368d53af6834eae9416f984da7fc2dd89 (roc_app:0)
==56308==    by 0x194BAE: List_walkTry_1d346aa605d9beef3d3c8b59d397858b899a3c7efdcede8d0efd4953334f936 (roc_app:0)

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:31):

So almost certainly a refcounting bug of some sort.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:31):

Just need to figure out how to debug it.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:31):

Add some sort of minimal repro to gen_refcount

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 00:32):

All the cli parser combinators are pretty complex, so would prefer something more minimal....

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 01:28):

I think I've found a more minimal repro, but some reason I can't get it to run as a gen_refcount test. So still need to mess with debugging more.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 01:57):

Much more minimal repro of at least one refcounting issue:

app [main] {
    pf: platform "../platform/main.roc",
}

import pf.Stdout
import pf.Task exposing [Task]

Arg : [
    Short Str,
    ShortGroup { names : List Str, complete : [Complete, Partial] },
    Long { name : Str, value : Result Str [NoValue] },
    Parameter Str,
]

constructSetOfOptions : Str -> Arg
constructSetOfOptions = \combined ->
    options =
        combined
        |> Str.toUtf8
        |> List.keepOks \c -> Str.fromUtf8 [c]

    when options is
        [alone] -> Short alone
        other -> ShortGroup { names: other, complete: Complete }

main =
    ["-d"]
    |> List.map \x ->
        when Str.splitFirst x "-" is
            Ok {after} ->
                constructSetOfOptions after
            Err _ ->
                Parameter "err"
    |> List.len
    |> Num.toStr
    |> Stdout.line

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 02:04):

Seems like the bug is somehow with pattern matching extracting a value...no idea how though. Maybe I am missing something:

constructSetOfOptions : Str -> Arg
constructSetOfOptions = \combined ->
    options =
        combined
        |> Str.toUtf8
        |> List.keepOks \c -> Str.fromUtf8 [c]

    dbg options
    when options is
        [alone] ->
            dbg alone
            Short alone
        other -> ShortGroup { names: other, complete: Complete }

main =
    ["-d"]
    |> List.map \x ->
        when Str.splitFirst x "-" is
            Ok {after} ->
                out = constructSetOfOptions after
                dbg out
                out
            Err _ ->
                Parameter "err"
    |> List.len
    |> Num.toStr
    |> Stdout.line
[examples/args.roc:22] options = ["d"]
[examples/args.roc:25] alone = "d"
[examples/args.roc:35] out = (Short "๏ฟฝ")

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 02:05):

Or maybe treating something as a large string/slice incorrectly.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 02:08):

Oh, interesting, I think the issue might actually be pattern matching directly... (I guess I don't know how pattern matching does refcounting).
This is fine:

    if List.len options == 1 then
        when List.get options 0 is
            Ok alone -> Short alone
            _ -> crash "impossible"
    else
        ShortGroup { names: options, complete: Complete }

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 02:16):

Ok even more minimal:

main =
    in = [Str.fromUtf8 ['d']]
    out =
        when in is
            [alone] -> [alone]
            other -> other
    out
    |> List.len
    |> Num.toStr
    |> Stdout.line

view this post on Zulip Luke Boswell (Jul 15 2024 at 04:13):

Ok, so I've done a fair bit of refactoring for basic-webserver for these changes. It looks like everything is happy, the platform and all the examples check and build ok, but whenever I run the server and actually call into roc I get the following error from inside Roc.

thread 'tokio-runtime-worker' panicked at crates/roc_host/src/roc.rs:42:5:
The Roc app crashed with: NoImplementationNamed { def_symbol: `38.IdentId(2)` }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at crates/roc_host/src/http_server.rs:87:26:
called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(10), ...)

view this post on Zulip Luke Boswell (Jul 15 2024 at 04:16):

Never mind, I found the thing that causes that is when you have a type annotation but have provided the implementation. Oops on my part.

view this post on Zulip Luke Boswell (Jul 15 2024 at 04:25):

Ok, so I think I've completed the refactor for basic-webserver. I added all the changes to the refactor-host PR. :tada:

I have some failing expect scripts, but I can verify it's behaving correctly manually... so I think it's just a macos versus linux CI thing.

Before this will pass in CI it also needs a new nightly with the Refcount changes, and a release of basic-cli. But otherwise looks like everything is running ok on my macos.

I can also test it on my linux machine later.

view this post on Zulip Luke Boswell (Jul 15 2024 at 04:58):

I've been testing the updated/refactored basic-webserver with the demo web app I've been working on, so am pretty confident it's working well.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 16:13):

The RC bug looks to be llvm specific (which makes sense, cause llvm rolls its own rc while other backends generate refcount procs).

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 16:44):

Yep, a use after free for sure in llvm. The list ['d'] was converted to a Str, but still got freed anyway.

+ allocated u8@60000166c040 (1 bytes with alignment 8)
+ allocated u8@600001a600c0 (32 bytes with alignment 8)
decref list: (8,32,true,fn(?[*]u8) callconv(.C) void@1049a9420)
freeing underlying elements from: (u8@600001a600d0, 1)
| decrement isize@60000166c040: 1 - 1 = 0!
๐Ÿ’€ freed u8@60000166c040
| decrement isize@600001a600c8: 1 - 1 = 0!
๐Ÿ’€ freed u8@600001a600c0
+ allocated u8@600001a600c0 (32 bytes with alignment 8)
decref list: (8,32,true,fn(?[*]u8) callconv(.C) void@1049a9420)
freeing underlying elements from: (u8@600001a600d0, 1)
| decrement isize@60000166c040: 9223523571393151041 - 1 = 9223523571393151040!
| decrement isize@600001a600c8: 1 - 1 = 0!
๐Ÿ’€ freed u8@600001a600c0

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 16:44):

60000166c040 specifically

view this post on Zulip Anton (Jul 15 2024 at 17:15):

Is this output from a tool?

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 17:18):

Debug prints in our zig builtins

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 17:18):

Some behind flags that already exist. A few new.

view this post on Zulip Luke Boswell (Jul 15 2024 at 22:47):

Testing the changes on my x64 linux machine -- in short basic-webserver refactor host branch is good to go.

The above free issue prevents us from using basic-cli.

08:44:37 ~/Documents/Github/basic-cli refactor-host$ roc --prebuilt-platform examples/args.roc -- div 5 2
free(): invalid next size (fast)

This prevents us from using the build script for basic-webserver as it uses argument parsing.

08:44:11 ~/Documents/Github/basic-webserver refactor-host$ roc --prebuilt-platform build.roc
free(): invalid pointer

Running through the build steps manually I can verify that basic-webserver is working well.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 23:11):

Yeah, hopefully I can narrow down the bug and then everything will be green.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 23:51):

Ok. I think it is a bug in drop specialization or something related to the dec to decref conversion.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 23:55):

After refcounting ir is correct:

procedure : `#UserApp.main` Str
procedure = `#UserApp.main` ():
    let `#UserApp.16` : U8 = 100i64;
    let `#UserApp.15` : List U8 = Array [`#UserApp.16`];
    let `#UserApp.inStr` : [C {U64, U8}, C Str] = CallByName `Str.fromUtf8` `#UserApp.15`;
    let `#UserApp.in` : List [C {U64, U8}, C Str] = Array [`#UserApp.inStr`];
    joinpoint `#UserApp.8` `#UserApp.out`:
        let `#UserApp.7` : U64 = CallByName `List.len` `#UserApp.out`;
        dec `#UserApp.out`;
        let `#UserApp.6` : Str = CallByName `Num.toStr` `#UserApp.7`;
        ret `#UserApp.6`;
    in
    let `#UserApp.12` : U64 = lowlevel ListLenUsize `#UserApp.in`;
    let `#UserApp.13` : U64 = 1i64;
    let `#UserApp.14` : Int1 = lowlevel Eq `#UserApp.12` `#UserApp.13`;
    if `#UserApp.14` then
        let `#UserApp.11` : U64 = 0i64;
        let `#UserApp.alone` : [C {U64, U8}, C Str] = lowlevel ListGetUnsafe `#UserApp.in` `#UserApp.11`;
        inc `#UserApp.alone`;
        dec `#UserApp.in`;
        let `#UserApp.9` : List [C {U64, U8}, C Str] = Array [`#UserApp.alone`];
        jump `#UserApp.8` `#UserApp.9`;
    else
        jump `#UserApp.8` `#UserApp.in`;

After drop specialization ir is incorrect cause it is missing the increment of #UserApp.alone. For lists, there should not longer be a difference between dec and decref or at least, that is my understanding of how my PR changes things, but drop specialization (some other pass after refcounting) doesn't know that....Though maybe there should still be a difference... Cause we could save on the increment here if there was a difference.

procedure : `#UserApp.main` Str
procedure = `#UserApp.main` ():
    let `#UserApp.16` : U8 = 100i64;
    let `#UserApp.15` : List U8 = Array [`#UserApp.16`];
    let `#UserApp.inStr` : [C {U64, U8}, C Str] = CallByName `Str.fromUtf8` `#UserApp.15`;
    let `#UserApp.in` : List [C {U64, U8}, C Str] = Array [`#UserApp.inStr`];
    joinpoint `#UserApp.8` `#UserApp.out`:
        let `#UserApp.7` : U64 = CallByName `List.len` `#UserApp.out`;
        dec `#UserApp.out`;
        let `#UserApp.6` : Str = CallByName `Num.toStr` `#UserApp.7`;
        ret `#UserApp.6`;
    in
    let `#UserApp.12` : U64 = lowlevel ListLenUsize `#UserApp.in`;
    let `#UserApp.13` : U64 = 1i64;
    let `#UserApp.14` : Int1 = lowlevel Eq `#UserApp.12` `#UserApp.13`;
    if `#UserApp.14` then
        let `#UserApp.11` : U64 = 0i64;
        let `#UserApp.alone` : [C {U64, U8}, C Str] = lowlevel ListGetUnsafe `#UserApp.in` `#UserApp.11`;
        decref `#UserApp.in`;
        let `#UserApp.9` : List [C {U64, U8}, C Str] = Array [`#UserApp.alone`];
        jump `#UserApp.8` `#UserApp.9`;
    else
        jump `#UserApp.8` `#UserApp.in`;

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 23:56):

Thinking about this more, I don't think there can be this optimization anymore. Cause decrementing the list may be freeing tail elements that are in the capacity of the ilst but not in the specific slice being freed. So while it may be correct for the slice elements, it won't be correct for the underlying elements.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 23:56):

So I think I just need to block this optimization.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 23:57):

I guess it could still be allowed if there was some sort of check on slice vs list though.

view this post on Zulip Brendan Hansknecht (Jul 15 2024 at 23:57):

cc: @Folkert de Vries in case he has any comments around this optimization and how it maps to the new list refcounting

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 00:02):

Oh, I think this optimization may only apply to lists with known lengths. So maybe we can make it safe.

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 00:11):

Ok. I think I have a fix

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 00:16):

If anyone has thoughts on drop specialization for lists, left a big todo with info on the refcounting change. Currently just disabled it, but not sure if it should be fully removed or not: https://github.com/roc-lang/roc/pull/6907/files#r1678554981 (extra context in todo right above comment)

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 00:43):

cc @J.Teeuwissen who worked on drop specialization

view this post on Zulip Luke Boswell (Jul 16 2024 at 00:44):

Test results for https://github.com/roc-lang/roc/pull/6907 using

macos aarch64

linux x64

view this post on Zulip Luke Boswell (Jul 16 2024 at 01:00):

Ok, I think we are ready for the next steps @Anton

  1. a new test nightly for roc
  2. pre-release of basic-cli
  3. release of roc nightly & update examples
  4. latest release of basic-cli & basic-webserver

The following platforms have been updated and compatible with these changes;

The following packages/platforms should be updated at some point soon, not blocking

view this post on Zulip J.Teeuwissen (Jul 16 2024 at 06:21):

So, if my understanding of the new lists is correct, a dec of a list does something like this:

if (unique list)
  dec items
  free list
else
  decref list

If this is the case, inc item; dec list (with list being a singleton), it could be optimized as follows

if (unique list)
  free list
else
  inc item
  decref list

view this post on Zulip J.Teeuwissen (Jul 16 2024 at 06:27):

Iirc the previous list dec simply decremented all items. I think all drop specializations for lists (with a known length) would have to be updated with a uniqueness check (and free/decref).
I'm not sure how "dead" references factor in, I would assume they do not matter as their reference count should not be touched either way.

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 06:27):

That is correct with the caveat that seamless slices can make it more complex

str = "some super long string that will definitely be allocated"
list = [str, str]
subList = List.dropFirst 1

# keep sublist alive and use it later

In this case, we are left with 2 allocations. 1 for the Str and 1 for the List.
The Str has a refcount of 2.
The slice has a size of 1, but knows how to get back to the initial allocation so it can free the entire original list.

When subList goes out of scope, it will end up decrementing the refcount to str twice and freeing both it and the underlying list allocation.

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 06:31):

So more accurate refcount note to add to the code would be:

if (unique list)
  dec items (this includes all items in the entire allocation, they don't have to be in the current sublist)
  free list
else
  decref list

view this post on Zulip J.Teeuwissen (Jul 16 2024 at 06:57):

Ah okay. Using the suggested slice/list check would be a simple fix. It would only make sense to drop specialise a slice if the size of the underlying list size is known at compile time. But even then, you would have to inline both the dec for the slice and the dec for the list. (Assuming the slice one calls the list one).

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 07:03):

Makes sense. Thanks for the info.

view this post on Zulip Anton (Jul 16 2024 at 16:21):

linux x64 basic-cli refactor-host :check:

@Luke Boswell did you run this inside nix?
I'm now hitting:

โฏ ./examples/args div -n 5 -d 20
free(): invalid next size (fast)
Aborted (core dumped)

view this post on Zulip Brendan Hansknecht (Jul 16 2024 at 16:25):

Did you delete and rebuild the host for surgical linking? I think we hit that as well if the surgical linker pulled in the old host (should pass with --linker=legacy` or if the surgical linker host is properly rebuilt

view this post on Zulip Anton (Jul 16 2024 at 16:33):

Thanks @Brendan Hansknecht :heart:
that fixed it :)


Last updated: Jul 06 2025 at 12:14 UTC