Stream: platform development

Topic: Use of different allocators in the same platform


view this post on Zulip Oskar Hahn (Dec 20 2024 at 07:07):

I have a question about the planned change that allocators will be passed in explicitly.

With this change, it will be possible for different calls to Roc to use different allocators, for example different buffers of pre-allocated memory. This could introduce new kinds of memory bugs when the memory of different values lives in different buffers.

For example, suppose the platform uses a Model, like the wasm4 platform or roc-ray.

The Model could be a List Str (or a List of another refcounted type), and the memory of each Str could be in a different memory buffer. What happens if Roc tries to deallocate the list? Currently, it calls roc_dealloc for the list itself and for each element in the list. But since the elements could have been created with a different allocator, roc_dealloc would not be able to deallocate them.

How should we handle situations like this?

view this post on Zulip Oskar Hahn (Dec 20 2024 at 07:13):

Should the rule be that when you call Roc, all arguments have to be created with the same allocator?

So for example the following code would be illegal, since addStr gets called with allocator1 but one of its arguments was created with allocator2.

var allocator1 = RocAllocator();
var allocator2 = RocAllocator();
var myModel : Model = undefined;
roc__initModel_1_exposed_generic(allocator1, &myModel);

var myStr = RocStr(allocator2, "hello world");
roc__addStr_1_exposed_generic(allocator1, &myModel, &myModel, &myStr);

view this post on Zulip Oskar Hahn (Dec 20 2024 at 07:19):

Or would it be possible, and enough, for allocators not only to be passed around, but also to be attached to a type?

For example, in Zig you have ArrayList. Most of the time, it gets used like this:

var list = ArrayList(u8).init(allocator);
defer list.deinit();
try list.append('H');
try list.append('e');
try list.append('l');
try list.append('l');
try list.append('o');
try list.appendSlice(" World!");

So the allocator is stored with the array list, and each call to append, as well as to deinit, uses that same allocator. If we want to do this, all std container types (like List, Dict, Set, etc.) could keep using the allocator they were created with. But it would also be necessary that when you add an element that was allocated with a different allocator to a container, it gets copied into the container's allocator.
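A rough sketch of what that copy-on-insert could look like on the host side (Zig's managed ArrayList stores its allocator in its allocator field; appendOwnedCopy is a made-up helper, not part of Zig's std):

const std = @import("std");

// Hypothetical helper: before appending a string that may have been allocated
// elsewhere, duplicate it with the list's own allocator, so that deinit (and any
// later frees) go through the allocator that owns the container.
fn appendOwnedCopy(list: *std.ArrayList([]const u8), str: []const u8) !void {
    const owned = try list.allocator.dupe(u8, str);
    try list.append(owned);
}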

view this post on Zulip Oskar Hahn (Dec 20 2024 at 07:21):

So the call to addStr in the example above would copy myStr to allocator1.

view this post on Zulip Oskar Hahn (Dec 20 2024 at 07:28):

What I would like to do is update the kingfisher platform to use an arena allocator for each request, so each request gets handled by its own allocator. But the kingfisher platform uses a Model, so parts of a request (a request header or the request body) could be saved in the Model. Since each request's allocator gets deallocated after the request finishes, Roc would need to make sure that the memory used in the Model is copied to an allocator that stays around for the whole runtime of the program.
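As a rough sketch of what I have in mind on the host side (assuming the planned API where each Roc call receives an explicit allocator; RocAllocator, Request, Response, Model, Connection and the roc__* symbol are made-up names):

const std = @import("std");

fn handleOneRequest(request: *const Request, model: *Model, conn: *Connection) !void {
    // One arena per request: everything Roc allocates while handling this
    // request lives here and is freed in one shot when the request ends.
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();

    var roc_allocator = RocAllocator.init(arena.allocator());

    var response: Response = undefined;
    roc__handleRequest_1_exposed_generic(&roc_allocator, &response, request, model);

    // Send the response before the arena is torn down. Anything that must
    // outlive the request (e.g. data that ends up in the Model) is exactly
    // the problem described above.
    try conn.writeResponse(&response);
}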

Do you think this could be possible in a future version of Roc?

view this post on Zulip Richard Feldman (Dec 20 2024 at 12:14):

I think we could facilitate this with Brendan's idea of exposing functions from the application to the host which provide operations on its types

view this post on Zulip Richard Feldman (Dec 20 2024 at 12:15):

we were talking about it in the context of builtins, but we could also do it for application-specific types like Model

view this post on Zulip Richard Feldman (Dec 20 2024 at 12:16):

and one of the operations we could expose is "clone" - where you pass it a roc_alloc (and it's potentially different from the one that was used to allocate the original value) and then it uses that roc_alloc for the new (cloned) value and everything inside it
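in host terms, the exposed operation might look something like this (purely illustrative names, nothing like this exists yet):

// Hypothetical signature for an application-exposed "clone" operation on Model.
// The host passes the allocator the clone should live in; it may be different
// from the allocator the original value was allocated with. The clone and
// everything nested inside it uses the given allocator.
extern fn roc__cloneModel_1_exposed_generic(
    allocator: *RocAllocator,
    dest: *Model,
    src: *const Model,
) void;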

view this post on Zulip Brendan Hansknecht (Dec 20 2024 at 16:32):

Yeah, I have thought about this a bit and I'm not sure if it is better to have the roc types hold onto the allocator or have the platform guarantee the allocator is used correctly. My gut feeling is that the platform needs to be responsible and in control. If a roc list holds onto an allocator and the platform frees the underlying allocator, that roc list is now broken. So I don't think that really solves the problem

view this post on Zulip Brendan Hansknecht (Dec 20 2024 at 16:40):

So each request gets handled by its own allocator. But the kingfisher platform uses a Model, so parts of a request (a request header or the request body) could be saved in the Model.

For kingfisher, I think you just need to run decodeModel and handleWriteRequest with a special allocator. Everything that only reads the model (which hopefully would be most requests) can just use an arena allocator.
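In other words (a hedged sketch; long_lived_allocator, arena_allocator, and isWriteMethod are illustrative names), the host would pick the allocator based on whether the call can produce data that ends up in the Model:

// Writes (decodeModel, handleWriteRequest) go through the long-lived allocator,
// so anything stored in the Model survives the request. Pure reads can use the
// per-request arena, which is freed as soon as the response is sent.
const roc_allocator = if (isWriteMethod(request.method))
    long_lived_allocator
else
    arena_allocator;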

view this post on Zulip Brendan Hansknecht (Dec 20 2024 at 16:41):

How does kingfisher deal with race conditions and data sharing across threads to update the model?

view this post on Zulip Brendan Hansknecht (Dec 20 2024 at 16:42):

Does it have a way to work around the lack of atomic refcounts?

view this post on Zulip Oskar Hahn (Dec 21 2024 at 16:57):

The idea to handle race conditions is that the refcount of the Model gets set to infinity for read requests (GET requests), and that there can only be one write request (POST) at a time (one write request OR many read requests). I had problems with the lack of atomics in read requests.

view this post on Zulip Oskar Hahn (Dec 21 2024 at 16:58):

Brendan Hansknecht said:

So each request gets handled by its own allocator. But the kingfisher platform uses a Model, so parts of a request (a request header or the request body) could be saved in the Model.

For kingfisher, I think you just need to run decodeModel and handleWriteRequest with a special allocator. Everything that only reads the model (which hopefully would be most requests) can just use an arena allocator.

This could work. But of course, I would be a fan of the allocator being attached to the data types, so write requests can also use an arena allocator.

view this post on Zulip Oskar Hahn (Dec 21 2024 at 16:59):

If the allocator is not attached to the data types, it will be a source of possible errors. I think attaching the allocator would be a nicer experience for platform developers.

view this post on Zulip Oskar Hahn (Dec 21 2024 at 17:04):

Richard Feldman said:

and one of the operations we could expose is "clone" - where you pass it a roc_alloc (and it's potentially different from the one that was used to allocate the original value) and then it uses that roc_alloc for the new (cloned) value and everything inside it

I don't think this will work. If your Model is something like Model : { list : List Str } and the application code does newModel = {model & list: List.append model.list "not a small string"}, then it is not a clone; it has to be ensured that the string, and (if there is not enough capacity) the reallocated List, get allocated with the Model's allocator. That is not a clone of the Model.

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 17:31):

I think the clone suggestion was that newModel = {model & list: List.append model.list "not a small string"} would be allocated into the request-local arena allocator. On return from Roc, the host would call model.clone(), which would recursively clone the model into a different location to keep it alive longer.
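Roughly (illustrative names, building on the hypothetical clone entry point sketched above):

// Handle the write request with the per-request arena allocator...
roc__handleWriteRequest_1_exposed_generic(&arena_allocator, &response, &new_model, &request, &model);

// ...then recursively clone the resulting Model into a long-lived allocator
// before the arena is torn down, so the kept Model no longer references any
// arena-owned memory.
var kept_model: Model = undefined;
roc__cloneModel_1_exposed_generic(&long_lived_allocator, &kept_model, &new_model);

arena.deinit();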

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 17:35):

so write requests can also use an arena allocator.

I don't think attaching the allocator to the data type solves this. You have no guarantees if the user continues to use the same container or replaces the container with a totally new container. Also, any new data added to the container would potentially use a different allocator.

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 17:36):

I had problems with the lack of atomics in read requests.

Yeah, in current Roc, you either need to fully understand the model data type to recursively set the refcounts to constant, or you need atomics.

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 17:41):

If we gave every container an allocator, that would mean that every container holds an extra pointer, which would not be great. A list of Str would now need to be 64 bits longer per string in the container.
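To make the cost concrete, a purely illustrative layout (not Roc's actual ABI) of a string header that carries its allocator:

// Illustrative only: a string header that also carries its allocator. The extra
// pointer is the 64 bits per element mentioned above.
const RocStrWithAllocator = extern struct {
    bytes: ?[*]u8,
    length: usize,
    capacity: usize,
    allocator: *RocAllocator,
};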

view this post on Zulip Oskar Hahn (Dec 21 2024 at 17:56):

Brendan Hansknecht said:

so write requests can also use an arena allocator.

I don't think attaching the allocator to the data type solves this. You have no guarantees if the user continues to use the same container or replaces the container with a totally new container. Also, any new data added to the container would potentially use a different allocator.

I think you are right. This would make it necessary to call all functions that can manipulate the Model with the same allocator. The idea from @Richard Feldman could work, but would create potentially unnecessary copies.

view this post on Zulip Richard Feldman (Dec 21 2024 at 17:57):

how does updating the model work?

view this post on Zulip Richard Feldman (Dec 21 2024 at 17:57):

like what's the API - is there an effect any request handler can run to change it?

view this post on Zulip Oskar Hahn (Dec 21 2024 at 17:58):

Brendan Hansknecht said:

If we gave every container an allocator, that would mean that every container holds an extra pointer, which would not be great. A list of Str would now need to be 64 bits longer per string in the container.

Are there tricks so that pointers only need 32 bits on a 64-bit system? If so, the refcount field could be used: it's a usize value that will never store more than 32 bits. For lists that have heap-allocated elements, there is even another usize value that points to the original list.
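A rough sketch of the packing I mean (hypothetical; this is not how Roc currently stores refcounts):

// Hypothetical packing: reuse the usize refcount slot to hold both a 32-bit
// refcount and a 32-bit allocator index (e.g. an offset into a table of
// allocators), so no extra word is needed per heap allocation.
const PackedRefcount = packed struct(u64) {
    refcount: u32,
    allocator_id: u32,
};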

view this post on Zulip Oskar Hahn (Dec 21 2024 at 18:03):

Richard Feldman said:

like what's the API - is there an effect any request handler can run to change it?

In the current version, Roc exposes the function handleWriteRequest : Request, Model -> (Response, Model).

But I am playing with the API. One idea is to have only one handleRequest function, where the HTTP method type looks like this:

RequestMethod : [
    Options,
    Get,
    Post SaveEvent,
    Put SaveEvent,
    Delete SaveEvent,
    Head,
    Trace,
    Connect,
    Patch SaveEvent,
]

And SaveEvent is an effect that updates the Model. So if you want to update the Model, you have to unpack the method with

when request.method is
    Get ->
        ...
    Post saveEvent ->
        ...

view this post on Zulip Oskar Hahn (Dec 21 2024 at 18:07):

This uses Roc's type system so that you can only update the Model on write requests, but you don't have to update the Model if it did not change, for example when there is an unauthenticated POST request. In the current version, the host cannot know whether the Model changed and has to save it in any case. With the new API, it only needs to save it when SaveEvent was called.

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 18:24):

Oskar Hahn said:

Are there tricks so that pointers only need 32 bits on a 64-bit system? If so, the refcount field could be used: it's a usize value that will never store more than 32 bits. For lists that have heap-allocated elements, there is even another usize value that points to the original list.

A small string (or an eventual small List U8) would still need to store the allocator, so I don't think there is any way around the full 64 bits being on the stack.

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 18:28):

Anyway, how I see this playing out in practice:

  1. We get explicit allocators per request, not per container.
  2. We attempt to use them and figure out smart patterns that avoid any memory issues (probably adding a recursive clone for easy testing).
  3. We evaluate whether that is enough, or whether the allocator needs to be stored on each container type.

view this post on Zulip Richard Feldman (Dec 21 2024 at 20:07):

Oskar Hahn said:

Richard Feldman said:

like what's the API - is there an effect any request handler can run to change it?

In the current version, Roc exposes the function handleWriteRequest : Request, Model -> (Response, Model).

yeah this innately has a race condition in it, unfortunately. :big_smile:

what can happen is that two different requests start getting handled in parallel: one increments a "total requests handled" counter in the model from 5 to 6, finishes, and saves the new model; then the other finishes and also increments "total requests handled" from 5 to 6, because it was 5 when both handler functions started running

view this post on Zulip Richard Feldman (Dec 21 2024 at 20:08):

Elm doesn't have this problem bc only update can change the model, and it only ever runs on one thread

view this post on Zulip Richard Feldman (Dec 21 2024 at 20:10):

the SaveEvent design would have the same race condition

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 20:21):

SaveEvent could have an RWLock: each request grabs a reader lock over the model, and only that function grabs a writer lock and then mutates the model in place. You still have to be really careful, because either you need to hold the lock for the entire request, or you need to ensure that no thread has a reference to any data within the model before releasing the lock.
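A minimal sketch of that locking scheme using Zig's std.Thread.RwLock (the surrounding structure is assumed, not kingfisher's actual code):

const std = @import("std");

var model_lock: std.Thread.RwLock = .{};

// Read request: hold a shared lock for the whole request, so no writer can free
// or mutate model data while this thread still references it.
fn handleReadRequest() void {
    model_lock.lockShared();
    defer model_lock.unlockShared();
    // ... call into Roc with read-only access to the Model ...
}

// Write request (the SaveEvent path): take the exclusive lock, then update the
// Model in place.
fn handleWriteRequest() void {
    model_lock.lock();
    defer model_lock.unlock();
    // ... call into Roc and replace/mutate the Model ...
}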

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 20:21):

Probably would do it per request in this case

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 20:24):

Also, just realized that without locking for the whole request you definitely will hit the increment bug you mentioned above.

view this post on Zulip Brendan Hansknecht (Dec 21 2024 at 20:24):

Still could model it with a save effect, but the save effect would need to start from loading the existing model after grabbing the writer lock

view this post on Zulip Richard Feldman (Dec 21 2024 at 20:47):

yeah, which would be pretty different from the current API :sweat_smile:

view this post on Zulip Oskar Hahn (Dec 21 2024 at 21:20):

There is an RWLock, but this lock is in the host. The host also checks the HTTP method and only lets one write request through at a time.

So there cannot be a race condition on write requests, since there is no race.

I know that this has other problems. For example, you can send many unauthorized POST requests. But this limitation is fine for me.

view this post on Zulip Richard Feldman (Dec 22 2024 at 00:47):

if we had https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/STM.20.28software.20transactional.20memory.29.20builtin I think that could address the race conditions by providing access to Model as a Store Model

view this post on Zulip Oskar Hahn (Dec 22 2024 at 08:56):

I read the topic, but I don't get it.

Would the Store/Actor be something the platform provides with each call, like handleRequest : Request, Store a -> (Response, Store a), or is it something that could live in some sort of global space in each Roc app and is therefore platform independent?

If it's the second case, then I would not need the kingfisher platform anymore. I could just use basic-webserver. All I want is a way to store state between requests. If this is something Roc can always do, it would be perfect.

But I think this global thing is not possible in a pure language, so you probably mean that Store/Actor is something the platform has to provide on each call.

