Stream: compiler development

Topic: Host Refcounting of List with Size on Heap


view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 14:38):

So one piece that is gonna be more annoying to design around is the host language specific libraries. One important piece of this change is that knowledge around refcounting is even more ingrained into the handling of a list. My thought is to essentially add a trait (rust)/comptime function (zig) (will need to be generated by glue) on each roc type to specify if it is refcounted or not. It will also contain an inc and dec function. This is definitely a case where it will suck to duplicate this logic into each host language. It would be nicer if roc could expose the inc and dec functions for every single type exposed via the host api (just to makes sure the function is always correct), but that is probably a lot more work to enable.

Note: lists in host specific libraries should really know about this anyway. It is required if you ever want to properly interact with a list of refcounted roc things instead of simply deallocate it.

Any general thoughts on this?

Relatedly, once we have explicit allocators, a drop function in rust won't be possible anymore. All drops will need the allocator passed in, which is incompatible with rusts drop api.

Part of me thinks that it may be worth analyzing the host api and exposing all of the primitives from roc to the host. That way the logic is all centralized in roc. Don't need to expose all functions, but the complex underlying pieces around refcounting and getting access to the data.

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 14:38):

I hit this cause freeing a list now requires knowing whether or not the elements in the list are refcounted.

view this post on Zulip Richard Feldman (Apr 05 2024 at 18:01):

Brendan Hansknecht said:

Relatedly, once we have explicit allocators, a drop function in rust won't be possible anymore. All drops will need the allocator passed in, which is incompatible with rusts drop api.

that's a great point - they should probably all be ManuallyDrop anyway

view this post on Zulip Richard Feldman (Apr 05 2024 at 18:01):

but exposing dealloc methods for convenience, which accept an allocator

view this post on Zulip Richard Feldman (Apr 05 2024 at 18:02):

Brendan Hansknecht said:

My thought is to essentially add a trait (rust)/comptime function (zig) (will need to be generated by glue) on each roc type to specify if it is refcounted or not. It will also contain an inc and dec function.

couldn't the inc/dec functions always be there, but sometimes be no-ops?

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 18:05):

It's fine if the functions are always there, but not particularly useful. We need to know if the elements are refcounted. So it would then be inc, dec, and is_refcounted on every type.

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 18:06):

Is the element type is refcounted, we have to store the full list length on the heap so that a seamless slice can free the elements in the underlying list.

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 18:06):

Cause the slice may not reference all elements.

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 18:07):

Instead of changing all lists to have space to store the full list length on the heap, only lists with refcounted elements are changed.

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 18:24):

Huh, I don't think rust can do what I wanted without an unstable feature. I was thinking of essentially:

impl<T> Drop for RocList<T>
where
    T: RocRefcounted,
{
   // impl with refcounting.
}

impl<T> Drop for RocList<T>
{
   // impl without refcounting.
}

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 18:47):

There is a way to make this work in stable rust, but it is definitely odd... https://github.com/dtolnay/case-studies/blob/master/autoref-specialization/README.md

view this post on Zulip Brendan Hansknecht (Apr 05 2024 at 19:26):

If I understand correctly, which I totally might not, the autoref thing doesn't work generic types that implement other traits. So it wouldn't work with RocList<T>.

So I guess my current proposal here to simply get something in is:
1) Add a RocRefcounted trait

pub trait RocRefcounted {
    fn inc(&mut self, n: usize);
    fn dec(&mut self);
    const fn is_refcounted() -> bool;
}

2) Force all list element types to implement the trait

pub struct RocList<T>
where
    T: RocRefcounted
{
    // ...
}

3) Use a macro to implement RocRefcounted as a no-op for all of the stardard rust types (integers mostly, but anything that might be in a RocList that is part of the rust std).

4) Write implementations of RocRefcounted for the types in roc_std (RocStr, RocList, etc)

5) Update RustGlue.roc to generate an implementation for all generated types.

6) With RocRefcounted implemented everywhere, types will be usable with RocList and RocList will be able to proper deal with all refcounting and deallocation stuff.

view this post on Zulip Brendan Hansknecht (Apr 06 2024 at 00:52):

This is what requiring RocRefcounted looks like: https://github.com/roc-lang/roc/commit/05fb172348b9520a8500ed5ca618d9423068dc41

I am not exactly a fan, but it seems to be the most practical solution currently....

Ideas very welcome.

view this post on Zulip Brendan Hansknecht (Apr 06 2024 at 00:53):

Oh, also with this change, only 19 failing tests in the full test_gen for llvm:

1277 tests run: 1258 passed, 19 failed, 19 skipped

Still probably many memory leaks/mistakes/other bugs...and two other backends, but almost to parity for llvm backend.

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 18:42):

Progress... :hundred:% passing test-gen-llvm

Summary [ 167.284s] 1277 tests run: 1277 passed, 19 skipped

Still need to:

  1. Update RustGlue.roc to generate RocRefcounted for all types. Still kinda want a different solution than RocRefcounted, but it works.
  2. Make sure all the examples work without memory leaks.
  3. Update the dev backend
  4. Update the wasm backend

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 18:43):

Will be interesting once I have this wired up E2E and can actually measure the perf difference for something like Dict Str U64. Should be huge!

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 15:06):

So I just manually did some wiring up to get the examples working and make it so that I can test the perf gains.

Wrote a really simple app. Just counts word occurrences in text. The text is 43MB. I basically wrote an implementation of this blog: https://benhoyt.com/writings/count-words/ in roc. Though I skipped the lowercasing and I didn't try to optimize it at all.

Anyway....the perf diff:

Current main: 59min
This branch: 1.6s

So ~2,200 times faster.

For more reasonable perf rough benchmark, I ran an equivalent go and python implementation. For this test though, I removed the sorting (we have a bug that hurts sorting perf). Just counting and printing. Loading the file in bulk in all languages. So pretty apples to apples...we do eh, ok-ish.

Benchmark 1: ./count-go
  Time (mean ± σ):     348.6 ms ±   3.4 ms    [User: 366.8 ms, System: 47.4 ms]
  Range (min … max):   344.4 ms … 353.6 ms    10 runs

Benchmark 2: ./count-roc
  Time (mean ± σ):     769.8 ms ±   7.0 ms    [User: 746.0 ms, System: 19.5 ms]
  Range (min … max):   760.3 ms … 782.5 ms    10 runs

Benchmark 3: python /tmp/count.py
  Time (mean ± σ):      1.026 s ±  0.030 s    [User: 0.930 s, System: 0.085 s]
  Range (min … max):    0.991 s …  1.086 s    10 runs

Summary
  './count-go' ran
    2.21 ± 0.03 times faster than './count-roc'
    2.94 ± 0.09 times faster than 'python /tmp/count.py'

Haven't done any analysis as to why we are 2.2x slower than Go. Might be a bad implementation of splitWhiteSpace on my part (probably this I think I am doing extra unicode verification that is costing, why a builtin may be needed here). Might be needing borrow inference to get some refcounting out of hot loops. Might be other dictionary perf issues.

We do at least use way less memory:
Roc: 85MB
Go: 213MB
Python: 588MB

view this post on Zulip Richard Feldman (Apr 08 2024 at 15:07):

Screenshot-2024-04-08-at-11.07.27AM.png
:rolling_on_the_floor_laughing:

view this post on Zulip Richard Feldman (Apr 08 2024 at 15:08):

amazing work @Brendan Hansknecht!!!!

view this post on Zulip Richard Feldman (Apr 08 2024 at 15:09):

@Folkert de Vries do you have borrow inference far enough on a branch that we could try it with this?

view this post on Zulip Folkert de Vries (Apr 08 2024 at 15:12):

maybe? there are bugs but maybe it works for some examples

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 15:17):

I don't think it would help yet cause it wouldn't work on Dict. Cause Dict is not a List, it is a container that holds a List. So far it is just implemented for raw Lists and Strs.

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 18:22):

So got a bit of a breakdown of the perf (all precentages are of total execution time for the program):

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 18:24):

In case I am missing something and there is a better way to do this. Here is the current impl to count words

countWords impl

view this post on Zulip Brendan Hansknecht (Apr 10 2024 at 02:53):

Just an update on status of implementation:

llvm -> looks all good e2e (need to double check potential mem leaks)
dev -> not implemented
wasm -> not implemented

RustGlue.roc -> needs update to generate RocRefcounted for all types (probably the biggest part of this work if I implement it for every single type, could leave some of it as todo and still merge the PR)

view this post on Zulip Brendan Hansknecht (Apr 12 2024 at 00:23):

Down to 47 failing dev wasm tests. Most (if I'm lucky, all) because I haven't updated the auto generated refcounting.

Dev...may be a rougher lift. I have been updating it along with wasm, but it has many more failures, 204. All sig aborts. My gut feeling is that some of the auto generated reference counting function pointers are broken/done wrong. So anytime zig tries to increment a refcount of an element in a list it is crashing.

view this post on Zulip Brendan Hansknecht (Apr 13 2024 at 03:19):

In an interesting turn of events, I update the autogenerated refcounting functions and the test numbers essentially switched between wasm and dev

wasm: 229 failing
dev: 45 failing

IIUC, most of the wasm tests are failing in roc's wasm interpreter. Not sure if I am revealing an interpreter bug or if this is a bug in my code gen. Theoretically the new functions are just calling into zig. So feels strange that they would be wrong. But I easily could have mis-wired something

Most wasm errors look like this now:

panicked at 'index out of bounds: the len is 1254 but the index is 1254', crates/wasm_module/src/lib.rs:460:34

view this post on Zulip Brian Carroll (Apr 23 2024 at 19:30):

That particular line of code is in trace_live_functions which is used for dead code elimination. It's possible to turn off DCE using the DEBUG_SETTINGS struct here.

view this post on Zulip Brian Carroll (Apr 23 2024 at 19:34):

That particular error means that it's trying to trace a function call but the function index is out of bounds, so the function doesn't exist, or the index is wrong somehow.

view this post on Zulip Brendan Hansknecht (Apr 23 2024 at 19:44):

Thanks for the info. Will take a look

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 18:03):

I am looking at this PR a bit again. One thing I am unsure of that maybe @Folkert de Vries or @Brian Carroll know. For the refcounting function code gen. I can generate a symbol for a proc. (In this case a proc to decrement the elements in a list). Do we have a way to load that symbol into a pointer for passing to zig?

context: https://github.com/roc-lang/roc/blob/46be989f5b42a59a331f1f63b150088735c86723/crates/compiler/mono/src/code_gen_help/refcount.rs#L1019-L1032

I know both dev backends have ways to do this, but not sure if anything is actually exposed such that I could do that here. Maybe I just need to somehow give the function symbol a pointer layout?

view this post on Zulip Folkert de Vries (Jul 07 2024 at 18:21):

I think we only do that for lowlevels so it's baked into the backends, not something we can currently do in mono

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 18:23):

Ok

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 18:23):

I should be fine to add it as a low level?

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 18:23):

Just function symbol arg to pointer?

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 18:24):

Or I guess I could make this whole chunk of list recounting code call a low-level that the backends deal with

view this post on Zulip Folkert de Vries (Jul 07 2024 at 18:24):

yea some ToFunctionPointer should be fine

view this post on Zulip Brian Carroll (Jul 07 2024 at 18:50):

If this is for decrementing elements of lists then we should know the type of those elements at compile time in mono, so we should know what function to call. So why is it a function pointer at runtime and not just a call to a known function?

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 18:51):

Only cause the logic is complex enough now that I didn't want to port it all to mono. Instead just passing off to the zig builtin

view this post on Zulip Brian Carroll (Jul 07 2024 at 18:53):

OK

view this post on Zulip Brian Carroll (Jul 07 2024 at 18:53):

Where's the logic now?

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 18:55):

On main, it is written in mono (and I guess duplicated in zig). With this branch, it become more complex so I want to have one source of truth in zig if possible.

view this post on Zulip Brian Carroll (Jul 07 2024 at 18:55):

OK. Well we do already have lowlevels for casting integer to pointer and back. But I don't think we have a lowlevel for function pointers because they were always known at compile time.

view this post on Zulip Brian Carroll (Jul 07 2024 at 19:01):

But as you mentioned, the wasm dev backend does pass function indices to Zig already.
This seems like a good place to refer to.

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 19:01):

Thanks

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 23:41):

hey @Folkert de Vries, do you recall symbols from debug_symbol in the dev backend need to be freed? My understanding was that it generates a unique symbol just with whatever name as like a prefix, so freeing shouldn't be needed. But maybe I am incorrect about how that is implemented.

view this post on Zulip Brendan Hansknecht (Jul 07 2024 at 23:52):

Also, one thing that really confuses me with list-size-on-heap is that a lot of the newly failing gen_dev tests are not even touching lists and the refcount path. I don't know what I broke, but really feels like I may have accidentally modify something unrelated to lists for that backend (same for gen_wasm) I guess. Hopefully, that means the fix is unrelated and pretty minor. The fact that llvm is 100% working only gen_wasm/gen_dev have issues definitely a tad annoying. Want to get this merged for people to be able to use it.

view this post on Zulip Brendan Hansknecht (Jul 10 2024 at 20:59):

I decided to open a Draft PR for the list size on heap work(#6894). If anyone has time to look over the code changes in gen_dev, gen_wasm or mono/code_gen_help, it would be much appreciated. Given the PR is passing for the llvm backend, it suggests that the issues I am still hitting are coming from those folders. Extra eyes to leave comments, attempt to debug, or give general suggestions of code I should check over again would be quite helpful.

Hopefully I can figure out the last chunk of issues and finally get this submitted. It is a huge perf gain for any lists of refcounted types.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 00:26):

Oh! Just had a big breakthrough on wasm. I was generating a bunch of direct refcounting functions when I needed indirect ones!. Changing over fixed a number of tests. That said, also revealed some incorrect args I need to update.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 00:34):

Eh, less big than I thought, but still a chunk better. That said, there is definitely a DCE bug. approximately 120 tests fail with DCE enabled. I assume it is related to passing function pointers to zig and those being DCE'd when they shouldn't be.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 01:22):

Hmm, for wasm it seems that some indirect refcounting function my not be getting added to the function table in general. (No idea why currently). As such, we end up passing an incorrect function pointer around. Maybe it is related to the nested nature (like outer function getting added but not inner or something similar)

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 01:51):

Ok. Can confirm issue is caused by indirect helper procs calling direct helper procs. Wasm backend is only emitting one of these procs and not the other. I attempted to naively recursively generate helper procs, but that clearly isn't correct. Some functions are generating forever.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 03:10):

Hmm, we also seem to have a chicken and egg problem. Even if I explicitly generate the element helper procs when generating the list helper procs, they won't be known when generating the helper procs. This is because the helper procs are taken from the code generator. So they don't exist when building the helper procs themselves. Only when generating the main procs.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 03:49):

I think I fixed one of the main recursion cases leaded to inner procs not actually being generated. Down from 200ish dev wasm failures to 50ish.

Most failures still look to be due to missing helper procs leading to index out of bounds when trying to load a proc...not sure why at this point though.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 04:08):

Oh haha, actually those issues were due to forgetting to pass an arg to the function. Not sure why it manifested that way. Down to 18 gen wasm tests failing.

view this post on Zulip Brian Carroll (Jul 11 2024 at 04:20):

Sounds like good progress!
I will have a look in a couple of hours.
What do you mean by "indirect ref counting function"?

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 04:31):

Indirect: take a pointer to the thing to refcount and call the direct refcounting function

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 04:31):

Essentially all the refcount functions we pass to zig are indirect.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 04:35):

So at this point, the failures are down to two types:

  1. cases where the refcounting proc I am trying to load a function pointer to isn't found and we panic on an unwrap.
  2. gen_refcount functions that are no longer correct in how they load the rc. Cause lists with refcounted elements are now |size|rc|elements ...|. So gen_refcount is loading the size on heap instead of the rc.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 04:44):

So for the gen_refcount tests. All of the the sizes on heap never actually get initialized. So, by zeroing out the the allocation with calloc instead of malloc, all of the lists end up having their refcount read as Constant cause it is reading the 0 size on heap. I guess I can special case that. Definitely a bit jank, but should be consistent and a good test.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 05:05):

Ok, fixed all the gen_refcount tests to recognize when the load the size on heap (it's always positive) and if so load the value after it as the refcount.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 05:34):

Oh wow, down to 2 in gen wasm. Both due to not having a matching refcount proc for a layout. Though for 1, I think the layouts are the same, it's just that they have to be check recursively to be known equal. I'm not sure why this ends up happening. Like they have different internal InLayouts, but the runtime reprs of those InLayouts are the same. So the types really are the same.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 06:07):

So yeah, we have 1 test where we request 1 refcount layout, but the function takes a different layout (that I'm pretty sure is equavalent:

[crates/compiler/gen_wasm/src/backend.rs:2123:25] self.layout_interner.runtime_representation(inner) = Union(
    NonRecursive(
        [
            [
                InLayout(STR),
                InLayout(
                    27,
                ),
            ],
            [
                InLayout(
                    30,
                ),
                InLayout(UNIT),
            ],
        ],
    ),
)
[crates/compiler/gen_wasm/src/backend.rs:2123:25] layout_repr = Union(
    NonRecursive(
        [
            [
                InLayout(STR),
                InLayout(
                    29,
                ),
            ],
            [
                InLayout(
                    32,
                ),
                InLayout(UNIT),
            ],
        ],
    ),
)

Then we have a test where we are generating the correct refcount function, but it seems to be missing when we try to loading it. Maybe it just isn't being registered in the wasm backend for some reason... There is a function of the same name with a different layout for this second test. Given this is for a recursive tag, I think that function is for refcounting the inner elements. So I think that is why it has the same name.

view this post on Zulip Luke Boswell (Jul 11 2024 at 22:51):

@Brendan Hansknecht it's awesome seeing your progress with this. Hype for the extra :racecar: boost.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 22:52):

fingers crossed that I can actually close it out. Really hoping that ironing out wasm (which is more debuggable) will pave the path for fixing the dev backend as well.

view this post on Zulip Brendan Hansknecht (Jul 11 2024 at 23:02):

Should List.map* take ownership of the input lists?

I guess the theory is that if:

  1. one of the lists has the same element size as the output element size
  2. That list has a refcount of 1

The zig builtin could reuse that allocation instead of allocating an output list.

Currently we don't do any of these sort of optimizations. On top of that, except for maybe List.map (the single element version) I would assume they would be exceptionally rare to trigger.

Just realizing that we have to pass in way more args if List.map* are also responsible for free every single list that is passed in.

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 00:10):

Hmm...actually with I'm a bit confused. Setting it to Borrowed or Owned doesn't seem to chaing our recfcount codegen...something feels off here. Either way, it seems to be Borrowed The lists are passed in and then decrefed after the call....

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 01:15):

@Folkert de Vries do you know where we ensure that closure passed to List.map* have the correct borrow signiture? As in, I noticed that in one List.map2 case, the closure was dealing with incrementing the element refcount, but in another List.map case, the closure was not dealing with incrementing the refcount.

I definitely prefer for the closure to assume it is borrowing all args and be required to increment the refcount if the arg is used, but I'm not sure where this logic lives.

To get concrete, these are the examples: The first test here the List.map closure does not increment the refcount. In the second case, the closure is incrementing the refcount.

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 01:28):

Lol, forgot to post the link: https://github.com/roc-lang/roc/blob/d4880cc438decdbe1bb5588706e682ff33daa24f/crates/compiler/test_gen/src/gen_refcount.rs#L127-L167

view this post on Zulip Richard Feldman (Jul 12 2024 at 02:59):

Brendan Hansknecht said:

Should List.map* take ownership of the input lists?

I guess the theory is that if:

  1. one of the lists has the same element size as the output element size
  2. That list has a refcount of 1

The zig builtin could reuse that allocation instead of allocating an output list.

Currently we don't do any of these sort of optimizations. On top of that, except for maybe List.map (the single element version) I would assume they would be exceptionally rare to trigger.

I'd like to do it for single-element List.map for that reason, but I agree that for the others it's not likely to come up

view this post on Zulip Richard Feldman (Jul 12 2024 at 02:59):

but as an example, a common thing is to List.map to transform a list of strings into other strings

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 03:33):

Also, don't know if it was @Brian Carroll, @Folkert de Vries, or someone else. But many thanks to whoever made gen_refcount for the wasm backend. Very useful for adding extra test cases and debugging my changes.

view this post on Zulip Brian Carroll (Jul 12 2024 at 05:31):

That was me. Glad you like it!

view this post on Zulip Brian Carroll (Jul 12 2024 at 05:49):

Also, periodic reminder that we have a HTML page in test_gen/ that can load the Wasm file for a test into the browser so we can use their debug tools. Whenever I had a really hard Wasm bug, it always ended up being helpful.

view this post on Zulip Folkert de Vries (Jul 12 2024 at 10:10):

do you know where we ensure that closure passed to List.map* have the correct borrow signiture

hmm, should we maybe just move that over to roc itself finally? then lambda sets (or whatever else we use) will just handle that automatically

view this post on Zulip Folkert de Vries (Jul 12 2024 at 10:10):

because the actual answer is that I may have missed that in the recent borrow inference changes (and none of our test cases catch that?!)

view this post on Zulip Richard Feldman (Jul 12 2024 at 11:07):

I'd still rather not add special syntax for that :sweat_smile:

view this post on Zulip Richard Feldman (Jul 12 2024 at 11:07):

I know it would make this sort of thing nicer, but I think if it's exposed in syntax then there will be requests for moving it to userspace

view this post on Zulip Folkert de Vries (Jul 12 2024 at 13:45):

syntax for what? I'm not proposing syntax?

view this post on Zulip Folkert de Vries (Jul 12 2024 at 13:46):

we can write List.map in pure roc in the standard library. There is a minor performance hit (or there was, years ago now) but it really simplifies the backends

view this post on Zulip Richard Feldman (Jul 12 2024 at 14:31):

oh nm I misunderstood

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 15:20):

Ah, so move List.map into roc and let roc generate recounting for calling and what not automatically. That said, any other zig builtin that take lambdas may still hit this. Like sort with

Also, I'm gonna guess that this was broken before your recent change, but I can double check.

view this post on Zulip Richard Feldman (Jul 12 2024 at 15:22):

Folkert de Vries said:

we can write List.map in pure roc in the standard library. There is a minor performance hit (or there was, years ago now) but it really simplifies the backends

I could see doing this for now, and then in the future finding a way to improve the perf (e.g. optimizations that apply to that roc code, maybe introducing a new builtins-only primitive that can speed it up, etc.)

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 21:33):

Yeah, I guess we want the inplace optimization, so moving to roc maybe doesn't make sense. We will just end up moving it back out eventually.

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 21:34):

I guess it could be represented in pure roc if we add a built-in to check the uniqueness. If unique, can read and write directly to the single list

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 21:34):

So actually that sounds fine

view this post on Zulip Brendan Hansknecht (Jul 12 2024 at 21:34):

Yeah, probably gonna move it to full roc.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 00:37):

Moved List.map* to roc and fixed a refcounting bug in dev_wasm.
I also added new tests (so that has increased the number of failing tests), but currently at:

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 00:53):

dev wasm at 100% passing!

view this post on Zulip Richard Feldman (Jul 13 2024 at 00:57):

yooooooo

view this post on Zulip Richard Feldman (Jul 13 2024 at 00:57):

this is so exciting! :heart_eyes:

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 01:28):

dev native 100% passing!

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 01:30):

So now all that is left is updating RustGlue.roc. Should be easy but tedious.

view this post on Zulip Luke Boswell (Jul 13 2024 at 01:41):

Wow, moving quickly

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 02:14):

Looks like there is also some sort of memory issue in parser-movies-csv.roc on linux

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 02:39):

looks like we have some invalid writes during decrement_refcounted_ptr_8. Probably means I am either treating something like a list when I shouldn't or the reverse.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 02:41):

Ah wait, might actually be use after free. So bad refcounting.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 02:41):

Anyway, guess that also needs to be fixed

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 03:33):

Cool, just a minor mistake in List.sublist leading to a use after free. All good now.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 04:26):

CI is green!!!

So that means all that is left is:

  1. update glue and re-enable the 2 failing glue tests
  2. Maybe do something about -ld_classic. Cause I think for many people with updated macs, they will need that flag. (EDIT: might need to do something like this where we query xcode version: https://github.com/godotengine/godot/pull/83088/files)

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 04:26):

@Anton just curious if you have any thoughts about -ld_classic and macos versioning.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 06:30):

Glue generation is not very robust for refcounting (mostly unimplemented) and I haven't dealt with -ld_classic, but I think #6894 is technically mergable now (minus whichever test I just broke...).

view this post on Zulip Anton (Jul 13 2024 at 15:12):

might need to do something like this where we query xcode version

Yeah, I would use the same approach, add -ld_classic for xcode 15 and up

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 16:59):

So with this change, it mostly doesn't affect the ecosystem. That said, it will subtly break a few effects in various platforms. Anything with refcounted things in lists will now be incorrect. So dirList in basic-cli and basic-webserver. Also sqliteExecute in basic-webserver. Both take/return nested lists.

So they will need glue regenerated. Or, at a minimum some copy over the new roc_std/ and potentially do a tiny bit of manual tweaking.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 17:02):

My thought is that we should merge this and follow quickly with updates to basic-cli and basic-webserver.

Given the small amount of effects that it will affect, I am not too worried about just getting this in. We can post an announcement that warns users of the breaking change and asks them to stay on an older version of nightlies (if they run into it) until these platforms are updates.

view this post on Zulip Anton (Jul 13 2024 at 17:40):

My thought is that we should merge this and follow quickly with updates to basic-cli and basic-webserver.

Yeah, we can hold off on uploading new nightlies until the new releases are ready. I can make the releases a top priority on monday.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 17:41):

Hopefully updating plays smoothly. If not, let me know. Most likely issues are with things missing the RocRefcounted trait.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 18:45):

Ok, I'm happy with #6894.

Glue could be updated more, but I am fine with leaving complex uncommon types with some unimplement RocRefcounted methods.

view this post on Zulip Luke Boswell (Jul 13 2024 at 19:57):

Can we lump the breaking change in with the refactor host and task as builtin PRs? Then all the breaking changes are in one release

view this post on Zulip Luke Boswell (Jul 13 2024 at 19:58):

As far as I know those changes are gtg right?

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 19:59):

We could delay nightlies for that grouping together. Sure.

I just want to get this in before anything else moves too much. Don't want it to live too long and get stale.

view this post on Zulip Luke Boswell (Jul 13 2024 at 19:59):

Totally.

view this post on Zulip Luke Boswell (Jul 13 2024 at 20:00):

Also, the glue code gen package is kind of the source for a bunch of platforms which have build scripts. So we can update there and it should be straightforward to update the others.

view this post on Zulip Luke Boswell (Jul 13 2024 at 20:03):

I can update the code gen package in a few hours time, and I'll update the refactor host PRs with the latest glue and align and test with your branch.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:03):

Yeah, I updated glue bar some more complex recounting that I don't think any platform uses.

view this post on Zulip Luke Boswell (Jul 13 2024 at 20:04):

The glue code gen package literally just delivers a vendored roc_std and zig builtins for now.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:05):

Ah, I see. Makes sense

view this post on Zulip Richard Feldman (Jul 13 2024 at 20:05):

ooo, this means we can pass allocation size to roc_dealloc, right?

view this post on Zulip Richard Feldman (Jul 13 2024 at 20:05):

because now even slices know it

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:05):

Yeah, I think we could.

view this post on Zulip Richard Feldman (Jul 13 2024 at 20:05):

niiiiice

view this post on Zulip Richard Feldman (Jul 13 2024 at 20:06):

that means Rust hosts will be able to use the global allocator for dealloc!

view this post on Zulip Richard Feldman (Jul 13 2024 at 20:06):

which requires being passed that size

view this post on Zulip Richard Feldman (Jul 13 2024 at 20:06):

that will be valuable for rust hosts that aren't just using malloc/free

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:08):

Actually, no, this still doesn't fix it.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:09):

Cause it is number of refcounted elements rather than total allocation size.

view this post on Zulip Richard Feldman (Jul 13 2024 at 20:12):

ah ok

view this post on Zulip Luke Boswell (Jul 13 2024 at 20:12):

So will older basic-cli work with these changes? I think not

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:14):

Thought:
Pass in an optional to alloc/dealloc size. If it is none, the allocator can allocate a bit of extra space to store the size.

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:14):

Old basic CLI will work with size on heap except for dirList. I think that is the only effect that will break

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:15):

This change only affects lists of refcounted things (list list, list str, dict str, set str).

view this post on Zulip Luke Boswell (Jul 13 2024 at 20:19):

I'll see if it breaks the roc build scripts, but it sounds like it wont which makes things easier.

I definitely would prefer we keep the current 0.12.0 basic-cli release (even if it stays in pre-release), and jump to 0.13 with this update, builtin task, and refactor host.

view this post on Zulip Luke Boswell (Jul 13 2024 at 20:21):

Is there any chance we could make another branch on roc that includes both this and task as builtin changes?

view this post on Zulip Brendan Hansknecht (Jul 13 2024 at 20:42):

Personally, I would just merge them into main, main is unstable and that is fine. If we need, we can delay nightlies or explicitly tell users to pin to the nightly from yesterday until everything else is ready.

view this post on Zulip Richard Feldman (Jul 13 2024 at 21:06):

yeah I think it's fine to merge now and disable nightlies temporarily

view this post on Zulip Richard Feldman (Jul 13 2024 at 21:09):

that way nobody who happens to try out roc for the first time has a bad experience

view this post on Zulip Richard Feldman (Jul 13 2024 at 21:10):

(and based on multiple people joining Zulip daily, people trying out Roc every day seems to be a thing that now happens!)

view this post on Zulip Luke Boswell (Jul 14 2024 at 07:57):

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:01):

For roc-wasm4 -- I had to downgrade back to zig 0.11.0 from 0.12.0. I started trying to update the builtins to 0.13.0 and thought I might contribute that back but got stuck on some non-trivial things I wasn't sure how to change. I could get through the many

package/zig-builtins/str.zig:1946:9: note: consider using 'const'
package/zig-builtins/str.zig:1962:9: error: local variable is never mutated

But things like below seemed too important and so I stopped.

package/zig-builtins/dec.zig:317:62: error: expected error union type, found 'u128'
        const self_u128 = @as(u128, @intCast(@abs(self_i128) catch {

Screenshot-2024-07-14-at-18.00.15.png

I would really like it if we could upgrade roc to 0.12.0 or 0.13.0 soon. :pray:

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:04):

Probably depends on updating inkwell and llvm

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:04):

Should be quite doable though

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:06):

Also, how do I fix this?

error[E0277]: the trait bound `Header: RocRefcounted` is not satisfied
   --> crates/roc_host/src/lib.rs:760:14
    |
760 |     headers: RocList<Header>,
    |              ^^^^^^^^^^^^^^^ the trait `RocRefcounted` is not implemented for `Header`
    |
    = help: the following other types implement trait `RocRefcounted`:
              ()
              (A, B)
              (A, B, C)
              (A, B, C, D)
              (A, B, C, D, E)
              (A, B, C, D, E, F)
              (A, B, C, D, E, F, G)
              (A, B, C, D, E, F, G, H)
            and 29 others
note: required by a bound in `RocList`

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:06):

Header is

#[repr(C)]
pub struct Header {
    key: RocStr,
    value: RocStr,
}

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:07):

Is there a way to auto derive like we would in Roc?

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:09):

Glue should generate correctly for that type

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:09):

Anyway, it should be super simple for that type

Recounted true
Inc increments both strings
Dec decrements both strings

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:10):

There is no glue in basic-cli anymore... so I had a crack

impl roc_std::RocRefcounted for Header {
    fn inc(&mut self) {
        self.key.inc();
        self.value.inc();
    }
    fn dec(&mut self) {
        self.key.dec();
        self.value.dec();
    }
    fn is_refcounted() -> bool {
        true
    }
}

I think I figured it out

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:10):

Theoretically it could auto derive, but I don't know how to write an auto derive macro in rust

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:10):

Thank you rust-analyser

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:10):

Yeah, perfect impl

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:10):

And we've killed RocDict?

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:11):

I think you may have mentioned it before, the plan is to do them later?

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:11):

It was being used in roc_fx_envDict

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:11):

The roc_std type was totally wrong and dict changed a lot (plus is mildly complex)

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:11):

My current thought is pass in a list of tuples and have roc turn it into a dict

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:12):

Ok, I'll switch it to a RocList<(RocStr, RocStr)> I guess

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:15):

Ok, I've got basic-cli working again

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:17):

I'll see if I can make a pre-release by cross-compiling from my mac

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:18):

Because it's broken, we have to manually step through the build steps by hand

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:19):

Though I'm not sure if we need the prebuilt surgical host's for CI to pass on main roc. I think we do which could be a challenge

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 08:20):

I don't think the main roc repo tests depended on anything in basic cli that broke.

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:25):

Curious... I can run basic scripts, but the build.roc script in basic cli isn't happy using the current 0.12.0

$ roc build.roc
roc_app_binary(53940,0x201c58c00) malloc: *** error for object 0x60000348f698: pointer being freed was not allocated
roc_app_binary(53940,0x201c58c00) malloc: *** set a breakpoint in malloc_error_break to debug

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:29):

NVM -- figured that out. PEBKAC

view this post on Zulip Luke Boswell (Jul 14 2024 at 08:52):

CI is failing because it needs a pre-release of basic-cli to pass.

@Anton I added a new file in this PR called jump-start.sh which enables us to build basic-cli locally from a breaking change in roc.

view this post on Zulip Luke Boswell (Jul 14 2024 at 09:02):

:sad:

$ roc --prebuilt-platform examples/args.roc div -n 5 -d 20
roc_app_binary(61803,0x201c58c00) malloc: Heap corruption detected, free list is damaged at 0x60000229c110
*** Incorrect guard value: 185616897523983
roc_app_binary(61803,0x201c58c00) malloc: *** set a breakpoint in malloc_error_break to debug

view this post on Zulip Luke Boswell (Jul 14 2024 at 09:02):

For some reason, the args example is broken

view this post on Zulip Luke Boswell (Jul 14 2024 at 09:06):

@Brendan Hansknecht or if anyone else would like to reproduce this, you just need to checkout the refactor-host branch on basic-cli, then run bash jump-start.sh, and then roc --prebuilt-platform examples/args.roc div -n 5 -d 20.

view this post on Zulip Luke Boswell (Jul 14 2024 at 09:28):

I've staged the update for basic-ssg and tested locally with everything looking good. However, if the above arg issue is serious and we need to regenerate/change roc_std then I will have to do another release so I'll just wait to see how we go with basic-cli.

view this post on Zulip Luke Boswell (Jul 14 2024 at 11:03):

I think this may be something we missed

error[E0599]: no method named `inc` found for struct `RocResult` in the current scope

view this post on Zulip Luke Boswell (Jul 14 2024 at 11:36):

error[E0277]: the trait bound `SQLiteBindings: RocRefcounted` is not satisfied

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 17:03):

Oh, looks like there may be some missed updates on initializing lists in RocRefcounted

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 17:05):

Ah, specifically with elems_with_capacity

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 17:06):

hmm, nvm, that looks correct

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 17:35):

Probably need to run things with valgrind and debug. Trying to fix my linux machine. Broke it yesterday with an update

view this post on Zulip Brendan Hansknecht (Jul 14 2024 at 22:43):

So will debug more later, but looks like we have a refcounting bug that is leading to use after free. Maybe in List.walkTry? Maybe in somt other builtin running before the call to List.walkTry. That or I missed that heapsize offset somewhere.

view this post on Zulip Anton (Jul 15 2024 at 10:47):

and jump to 0.13 with this update, builtin task, and refactor host.

Would bundling all of them not make debugging much harder because there are many more possible causes for a bug?


Last updated: Jul 06 2025 at 12:14 UTC