Hey all, I picked up an issue to work on a built-in but I am not yet much of an expert in the functional programming world. Is there anyone that would care to review some ideas I had to fix the issue? I'd like to follow conventions as much as possible and to learn along the way :smile:
Just pick something and post about it here and someone will chat with you about it!
As a note, if you want to do a video call or pair with someone on an issue, a number of us are generally open if you request it. That said, not sure how holidays affect availability. I know that I don't have a setup/the current availability for the holidays, but maybe others will.
Ok! I am trying to implement List.split
in Roc and here is my first attempt:
split = \items, splitter ->
if List.len items > 1 then
splitHelper = \itemList, splitChar, acc, all ->
when itemList is
[] -> List.append all acc
[item] ->
if item != splitChar then
splitHelper [] splitChar (List.append acc item) all
else
splitHelper [] splitChar acc all
[head, .. as tail] ->
if head != splitChar then
splitHelper tail splitChar (List.append acc head) all
else
splitHelper tail splitChar [] (List.append all acc)
splitHelper items splitter [] []
else
[items]
I've done only a tiny bit of functional programming so go easy!
What are the expected semantics of the function? Just not sure what exact API we decided on.
I wasn't entirely positive so I tried to match the behavior of Str.split
as much as possible as I believe that was the root cause of the issue itself
I don't think the outer if statement is correct, but otherwise the function looks fine. Cause if you call List.split [5] 5
, I would not expect [[5]]
. I would expect [[], []]
.
Overall the function is probably fine.
One improvement would be to use guard clauses instead of nested ifs. Also, I would switch the ifs to be the positive case, I think it is clearer. Something like:
[head, .. as tail] if head == split har ->
...
#then the else case
[head, .. as tail] ->
...
Also, may be worth looking at writing the version of the algorithm that uses List.walk
instead of recursion, may be cleaner....but either is probably fine.
Thanks for taking a look! I'll give the guards a try. In addition, are there any performance difference between recursion vs using List.walk
typically?
I got way too curious about this. You are just fine with eiter. But if you want to know...
TLDR: Use the one that makes the code the most readable. They are about the same, I think.
I am probably just as clueless as you, but the quest is great! I have little experience in this field, but have touched the compiler before, so I was able to dump some Intermediate Representation of two functions that copy each element of a list into another list one-by-one. One with a walk and one with a recursive implementation. I don't know if LLVM optimises some of the differences away, but I highly doubt.
helper = \l, l2 ->
List.walk l l2 \state, elem ->
List.append state elem
helperRec= \l, l2 -> when l is
[] -> l2
[a] -> List.append l2 a
[a, .. as b] -> helperRec b (List.append l2 a)
Walking should be faster than head-tail based recursion. Not by much though. Looking at the implementation of walk, it is just a recursive fn as well. Here it is
Getting into the nitty-gritty: Walk uses an unchanging list and an index to iterate the elements. Num.addWrap does not panic on owerflow so that check is skipped after the add operation, so it is cheaper than a regular add (and I'd imagine faster than a regular subtract with an underflow check...? we are at instruction level, so idk). You basically use the .. as tail
as an index to keep track of the next list element. Instead of accepting a list and an index, you alway provide a list of which you get the first element. There must be an operation to create that new list. It is just a slice (no copying of the underlying list happens). A slice needs a pointer to the original list's bytes (incremented by an offset to start at the desired index, so there is a little bit of computation for you) and a length. The length will be the original list's length -1 (that is a checked subtraction (meaning it will panic instead of underflow) that the pattern match introduces). Pretty sure that's slower than the unchecked / wrapping addition. Not a big deal, but still something.
@Brendan Hansknecht Out of curiosity, couldn't this subtraction be Num.subWrap, since we know at compile time that for that branch, the list's length is always sufficently large?
There is also the question of what happens to ref counting. When you create a sublist you increment the ref count of the input list. In the example I tried, it almost immediately decremented the refcount , before the next recursive call, but those are still operations. I have no idea if those instructions will get to the final binary or not, or how ref counting of slices work.
Notice that these are related to the .. as tail
approac, not to the recursion inherently. You can create a recursive fn that takes a list and an index, which you increment. That being said, at that point you should just use walk. It is a well know function, which makes it easier to understand than a custom recursive function with an index in it.
This is definitely a micro-optimization, which you don't need. I just hope you found something interesting in it
I'll keep the Roc IR around in case you are interested, but I think this was already too much information. Cheers!
Very good summary
And sounds correct for the wrapping subtraction
Wow well done on digging up those details!
Yes generally the built-in functions should perform similarly to recursion because both just compile to loops.
In some cases there might be some trick we managed to do in the built-in that the compiler couldn't do with pure Roc.
From a style point of view if there's a built-in that fits your use case then you should generally use it. If you have something complicated where they don't fit, then you need the full generality of recursion.
yeah great stuff! :smiley:
nice catch on the place where we could use Num.subWrap
- happy to accept a PR for that!
Thanks a bunch everyone! @Norbert Hajagos that explanation was awesome, much appreciated. Admittedly a bit over my head when getting down in the IR, but would love to know how you managed to do that!
Thanks for the great feedback! I created an issue for the Num.subWrap
It was a journey for me as well @mdznta , and was over my head too. First of all, you need to be able to build the compiler from source. After that, what I did:
cargo test --package test_gen -- <name of your test case> --nocapture
to run a test. With the --nocapture
flag, cargo will let the tests print to stdout. @Norbert Hajagos Wow that is a great tip! I'll certainly keep that in mind when I need to compare implementations. Thanks a bunch for the info and explaining the process!
mdznta said:
Norbert Hajagos Wow that is a great tip! I'll certainly keep that in mind when I need to compare implementations. Thanks a bunch for the info and explaining the process!
It was my pleasure! Have a good one playing with the concept! And in case you celebrate it, happy Christmas!
Last updated: Jul 05 2025 at 12:14 UTC