horrible performance problem · beginners

Artur Swiderski (Dec 18 2023 at 17:48):

Hi, I see there is a performance problem. I am working on the app https://github.com/salarii/peek/ and it has a Regex.roc file, which is an interface. Inside this file there is an availableRegex field which should be created once, because it acts like a static const field.
What I see is that this field is created every time I call parseStr. That is horrible from a performance point of view, because constructing this field is expensive. How is it even possible that the application can't figure out that it should be made only once?

Brendan Hansknecht (Dec 18 2023 at 18:06):

Roc doesn't have any sort of compile time execution to generate that as a constant currently.

Roc also essentially makes all top levels into closures. The issue is that Roc doesn't have control over initializing constants, due to how hosts load Roc apps. We probably need to find a way to avoid this. @Richard Feldman do we have any plans around this? Top levels that should be constants being closures instead is really unintuitive and can easily become a huge perf issue, as noted above.

Brendan Hansknecht (Dec 18 2023 at 18:07):

Currently it 100% depends on the magic of LLVM and inlining to make these low cost. They should always be free.
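The cost of a top-level value that behaves like a closure can be made visible with a small Rust analogy. This is only a sketch, not how the Roc compiler is implemented: `available_regex`, `parse_str`, and `BUILD_COUNT` are hypothetical names, and the "regex table" is a placeholder for any expensive value.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times the "constant" is actually built.
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

// Stand-in for an expensive top-level value such as `availableRegex`.
// Because it is a function (a thunk), every use rebuilds it.
fn available_regex() -> Vec<String> {
    BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
    vec!["[a-z]+".to_string(), "[0-9]+".to_string()]
}

// Hypothetical entry point: it only checks whether `pattern` is a known one.
fn parse_str(_input: &str, pattern: &str) -> bool {
    available_regex().iter().any(|p| p == pattern)
}

fn main() {
    for _ in 0..3 {
        parse_str("abc", "[a-z]+");
    }
    // Three calls, three rebuilds of the table.
    println!("built {} times", BUILD_COUNT.load(Ordering::SeqCst));
}
```

Unless an optimizer happens to inline and hoist the construction, the work is repeated on every call, which matches the behavior reported above.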

Richard Feldman (Dec 18 2023 at 18:09):

this is exactly why I think we need compile-time evaluation of top-level constants :sweat_smile:

see this thread for more:

Richard Feldman said:

as an aside, I didn't say it explicitly earlier, but one of the things I like about the dev backend for top-level constant evaluation idea is that it essentially adds nothing to the build+run cost:

if we build and run them at compile time, then there's zero cost at runtime

if we build them at compile time and then run them as soon as the program starts (which isn't really a thing in a Roc application, since the host is in charge of when the compiled Roc code gets called, but let's pretend it was a thing) then we pay the same cost immediately on program start that we saved from not doing it during the build

so in terms of dev builds, where you basically always build and then immediately run, I think the cost of evaluating them at compile-time ends up being essentially zero. Really it's the difference between however we end up storing them (after evaluating them) at build time versus at runtime, but if anything that probably makes the build-time version faster because it always gets to bump allocate the entire evaluation, whereas the runtime version only can in some situations (depending on the platform)

this hasn't been implemented yet though

Brendan Hansknecht (Dec 18 2023 at 18:12):

Cool, we do have a plan at least

Brendan Hansknecht (Dec 18 2023 at 18:36):

This would also be good to document in https://www.roc-lang.org/plans

Artur Swiderski (Dec 18 2023 at 18:48):

But what I am complaining about is not that it is not evaluated at compile time, which is merely a nice-to-have feature. What bothers me is that it is evaluated every time. Does "closures" mean that all variables like that are turned into functions and executed in place? This should be fixed first; evaluation at compile time is nice, but a completely different story.

Brendan Hansknecht (Dec 18 2023 at 19:02):

Yeah, while compile time execution fixes this issue, running once at runtime and caching would also fix the issue.

Brendan Hansknecht (Dec 18 2023 at 19:03):

That said, it has more complexity due to not knowing the context Roc is run in. It has to be safe even if called many times from various threads all at once, which would at a minimum require some form of locking, and that isn't great.

Brendan Hansknecht (Dec 18 2023 at 19:08):

As I see it, there are 3 options:

1. Expose a roc_init function that must be called before other Roc functions and initializes all globals.
2. Add a mutex around all globals that allows the first calling thread to take the writer lock. All other threads would notice the lock is taken and wait. The first thread would initialize the value and set a flag that it is initialized. Future threads would see it is initialized and just use the value.
3. Evaluate all top-level constants at compile time. They are then statically stored in the binary and safe to use from the get-go.
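The three options can be sketched side by side in Rust, purely as an analogy. None of this is Roc's actual design: `app_init` stands in for the proposed `roc_init`, `expensive` is a placeholder computation, and `std::sync::OnceLock` is used as one possible form of the "first thread initializes, others wait" locking described in the second option.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

// Stand-in for an expensive computation; counts how often it runs.
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);
fn expensive() -> u64 {
    INIT_RUNS.fetch_add(1, Ordering::SeqCst);
    (1..=10u64).product()
}

// Option 1: an explicit init call the host must make before anything else.
// Assumes the host calls it from a single thread during startup.
static EAGER: OnceLock<u64> = OnceLock::new();
fn app_init() {
    if EAGER.get().is_none() {
        let _ = EAGER.set(expensive());
    }
}
fn eager_value() -> u64 {
    *EAGER.get().expect("app_init must be called first")
}

// Option 2: lazy initialization guarded by a lock. The first caller runs
// the initializer, concurrent callers block until it has finished.
static LAZY: OnceLock<u64> = OnceLock::new();
fn lazy_value() -> u64 {
    *LAZY.get_or_init(expensive)
}

// Option 3: evaluated by the compiler and stored in the binary.
const fn factorial(n: u64) -> u64 {
    let mut acc = 1;
    let mut i = 2;
    while i <= n {
        acc *= i;
        i += 1;
    }
    acc
}
const STATIC_VALUE: u64 = factorial(10);

fn main() {
    app_init();
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(lazy_value)).collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), STATIC_VALUE);
    }
    assert_eq!(eager_value(), STATIC_VALUE);
    // One run for option 1 and one for option 2, regardless of thread count.
    println!("initializer ran {} times", INIT_RUNS.load(Ordering::SeqCst));
}
```

The trade-offs show up directly: option 1 fails if the host forgets the init call, option 2 pays a synchronization check on every access, and option 3 has no runtime cost but requires the value to be computable at build time.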

Richard Feldman (Dec 18 2023 at 19:45):

maybe it's time to start working on compile time evaluation!

Brendan Hansknecht (Dec 18 2023 at 20:30):

Interpreter or trying to use the dev backend?

Richard Feldman (Dec 18 2023 at 20:54):

we should probably discuss that on the other thread - I don't think we conclusively resolved that :thinking:

Shaiden Spreitzer (Dec 20 2023 at 14:23):

Brendan Hansknecht said:

As I see it, there are 3 options:

1. Expose a roc_init function that must be called before other Roc functions and initializes all globals.

2. Add a mutex around all globals that allows the first calling thread to take the writer lock. All other threads would notice the lock is taken and wait. The first thread would initialize the value and set a flag that it is initialized. Future threads would see it is initialized and just use the value.

3. Evaluate all top-level constants at compile time. They are then statically stored in the binary and safe to use from the get-go.

I thought concurrency was a platform issue?

Shaiden Spreitzer (Dec 20 2023 at 14:48):

Artur Swiderski said:

But what I am complaining about is not that it is not evaluated at compile time, which is merely a nice-to-have feature. What bothers me is that it is evaluated every time. Does "closures" mean that all variables like that are turned into functions and executed in place? This should be fixed first; evaluation at compile time is nice, but a completely different story.

How is the compiler supposed to know which function to cache? Should there be a database containing all functions with all input params and their return values? What if a function is called with "aaaaa" 10,000 times and then with "bbbbb" once? Should "aaaaa" be thrown out? What if a function is called with a new param every time?

The 'correct' 'solution' to the above problem (imo) is to use an Elm-style model-view approach:
In Elm:

type alias Model = { availableRegex : String }

parseStr : String -> String -> Model -> (Model, Result ParsingResultType String)
parseStr str pattern model =
   ...

In Roc (I'm guessing, use at your own risk!):

Model : { availableRegex : Str }

parseStr : Str, Str, Model -> (Model, Result ParsingResultType Str)
parseStr = \str, pattern, model ->

Now in parseStr you check if availableRegex has already been created and if not: run the full function...
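The same model-threading idea, written out as a runnable Rust sketch. Everything here is illustrative: the `Model` fields, `build_available_regex`, and the string results are assumptions made for the example, not the peek project's real types.

```rust
// Hypothetical model: the cached table lives in state owned by the caller.
#[derive(Debug, Default)]
struct Model {
    available_regex: Option<Vec<String>>,
    builds: u32, // only here to make the caching observable
}

// Stand-in for the expensive construction.
fn build_available_regex() -> Vec<String> {
    vec!["[a-z]+".to_string(), "[0-9]+".to_string()]
}

// Takes the model and returns the (possibly updated) model plus the result,
// mirroring the `(Model, Result ...)` return type above.
fn parse_str(_input: &str, pattern: &str, mut model: Model) -> (Model, Result<String, String>) {
    // Build the table only if an earlier call has not already done so.
    if model.available_regex.is_none() {
        model.available_regex = Some(build_available_regex());
        model.builds += 1;
    }
    let known = model
        .available_regex
        .as_ref()
        .map_or(false, |table| table.iter().any(|p| p == pattern));
    let result = if known {
        Ok(format!("matched {pattern}"))
    } else {
        Err(format!("unknown pattern {pattern}"))
    };
    (model, result)
}

fn main() {
    let model = Model::default();
    let (model, first) = parse_str("abc", "[a-z]+", model);
    let (model, second) = parse_str("123", "\\d+", model);
    println!("{first:?} {second:?} builds={}", model.builds);
}
```

The cost of this approach is that every caller has to carry the model through its own signature, which is exactly the bookkeeping a true top-level constant would avoid.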

Brendan Hansknecht (Dec 20 2023 at 15:20):

I thought concurrency was a platform issue?

Kinda. Roc needs to not break when called concurrently.

Brendan Hansknecht (Dec 20 2023 at 15:23):

How is the compiler supposed to know which function to cache?

You misunderstand, this is not about arbitrary functions. This is about top level constants.

For example, if I write:

x = 127

y = someFn 8

someFn = \num -> ...

x and y are both top level constants. In most programming languages they would be initialized before main is called.

Brendan Hansknecht (Dec 20 2023 at 15:24):

Roc doesn't control when it is called (and can not guarantee when code will run), which is why there are more complexities here.

Last updated: Jul 26 2025 at 12:14 UTC