Is there a limit on the size of an expression? For example, what if I write a function that includes an expression with thousands of boolean and arithmetic operations?
I'm looking to write a function List U32 -> List [A,B,C,D] that is efficient, and I figured branchless would give a fixed number of calculations per U32. The use case is mapping Unicode code points to grapheme cluster break classes.
The other idea was to create a top-level Dict U32 [A,B,C,...], but I'm not sure that would be more efficient.
I can write a benchmark, but thought I might ask here first in case there is an obvious solution to reach for.
Like a when expression?
I just need something more concrete to understand.
Or a doc with some technical info on what you want to do would be good
Yeah, I could explain that better. I mean something like a huge sequence of comparisons, e.g. isControlClass = (u32 > 0x0 && u32 <= 0x9) || (u32 >= 0xB && u32 <= 0xC) || (u32 == 0x61C) || ...
I might be mixing operators there... but I mean boolean.
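To make the shape of that expression concrete, here is a minimal sketch in Python rather than Roc. It uses only the three ranges quoted in the message, not the full property table; the function name is hypothetical, and treating the lower bound 0x0 as inclusive is an assumption.

```python
# Illustrative subset only: these three ranges come from the message above,
# not from the complete GraphemeBreakProperty.txt table.
def is_control_class(cp: int) -> bool:
    """Return True if cp falls in one of the sample Control ranges."""
    return (
        (0x0 <= cp <= 0x9)      # assumption: lower bound is inclusive
        or (0xB <= cp <= 0xC)
        or cp == 0x61C
    )
```

A real classifier would chain one such expression per class, so the work per code point grows with the number of ranges in the table.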
I have wondered whether it would be efficient to use some kind of lookup table, but that seems like a lot of memory. I've done some research on that, and maybe the memory can be reduced by using multiple lookups in sequence. The goal is U32 -> [A,B,C,...N].
I can imagine creating the lookup tables by sampling every possible value from 0x0 to 0x10FFFF, and then generating the equivalent Roc List [A,B,C,...].
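One way the "multiple lookups in sequence" idea could work is a two-stage table: split the code point range into fixed-size blocks, store each distinct block once, and keep a small index from block number to stored block. The sketch below is in Python rather than Roc and rests on assumptions: the block size of 256, the names build_tables and lookup, and the toy classify function (which knows only a few Control ranges) are all hypothetical stand-ins for the real data.

```python
BLOCK = 256  # hypothetical block size; it divides 0x110000 evenly


def classify(cp: int) -> str:
    """Toy stand-in for the real property table (a few Control ranges only)."""
    if 0x0 <= cp <= 0x9 or 0xB <= cp <= 0xC or cp == 0x61C:
        return "Control"
    return "Other"


def build_tables(classify_fn):
    """Sample every code point and store each distinct block only once."""
    stage1 = []  # block number -> index into stage2
    stage2 = []  # unique blocks, each a tuple of BLOCK class names
    seen = {}    # block contents -> index in stage2
    for start in range(0, 0x110000, BLOCK):
        block = tuple(classify_fn(cp) for cp in range(start, start + BLOCK))
        if block not in seen:
            seen[block] = len(stage2)
            stage2.append(block)
        stage1.append(seen[block])
    return stage1, stage2


def lookup(stage1, stage2, cp: int) -> str:
    """Two array reads: first pick the block, then the entry inside it."""
    return stage2[stage1[cp // BLOCK]][cp % BLOCK]


stage1, stage2 = build_tables(classify)
```

Because large stretches of the code space share the same class, many blocks are identical and collapse into a single stored copy, which is where the memory saving would come from. How well that works for the real table depends on the block size and the actual data.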
https://www.unicode.org/Public/UCD/latest/ucd/auxiliary/GraphemeBreakProperty.txt
This doc maps Unicode code points to a grapheme cluster break property. The rules for where to split text into extended grapheme clusters are determined using these property values.
Ah yeah, I would do the giant Boolean expression and not waste the memory on something that big
That said, if it can be represented as a chain of very small lookups or a few medium lookups, that could be faster.
It depends mostly on the size of the lookup and what it optimizes to.
The Boolean expression should optimize just fine
Whether it is faster than the lookup mostly depends on whether the lookup will be in cache and how many instructions the expression compiles down to.
Last updated: Jul 05 2025 at 12:14 UTC