Functionally starting out · beginners

Hey all, I picked up an issue to work on a built-in but I am not yet much of an expert in the functional programming world. Is there anyone that would care to review some ideas I had to fix the issue? I'd like to follow conventions as much as possible and to learn along the way :smile:

Brian Carroll (Dec 19 2023 at 23:08):

Just pick something and post about it here and someone will chat with you about it!

Brendan Hansknecht (Dec 19 2023 at 23:19):

As a note, if you want to do a video call or pair with someone on an issue, a number of us are generally open if you request it. That said, not sure how holidays affect availability. I know that I don't have a setup/the current availability for the holidays, but maybe others will.

mdznta (Dec 20 2023 at 03:14):

split = \items, splitter ->
    if List.len items > 1 then
        splitHelper = \itemList, splitChar, acc, all ->
            when itemList is
                [] -> List.append all acc
                [item] ->
                    if item != splitChar then
                        splitHelper [] splitChar (List.append acc item) all
                    else
                        splitHelper [] splitChar acc all
                [head, .. as tail] ->
                    if head != splitChar then
                        splitHelper tail splitChar (List.append acc head) all
                    else
                        splitHelper tail splitChar [] (List.append all acc)
        splitHelper items splitter [] []
    else
        [items]

Brendan Hansknecht (Dec 20 2023 at 04:07):

What are the expected semantics of the function? Just not sure what exact API we decided on.

mdznta (Dec 21 2023 at 02:14):

I wasn't entirely positive so I tried to match the behavior of Str.split as much as possible as I believe that was the root cause of the issue itself

Brendan Hansknecht (Dec 21 2023 at 04:25):

I don't think the outer if statement is correct, but otherwise the function looks fine. Cause if you call List.split [5] 5, I would not expect [[5]]. I would expect [[], []].

Brendan Hansknecht (Dec 21 2023 at 04:28):

One improvement would be to use guard clauses instead of nested ifs. Also, I would switch the ifs to be the positive case, I think it is clearer. Something like:

[head, .. as tail] if head == split har ->
    ...
#then the else case
[head, .. as tail] ->
    ...

Brendan Hansknecht (Dec 21 2023 at 04:29):

Also, may be worth looking at writing the version of the algorithm that uses List.walk instead of recursion, may be cleaner....but either is probably fine.

mdznta (Dec 21 2023 at 19:59):

Thanks for taking a look! I'll give the guards a try. In addition, are there any performance difference between recursion vs using List.walk typically?

Norbert Hajagos (Dec 21 2023 at 23:00):

I got way too curious about this. You are just fine with eiter. But if you want to know...

TLDR: Use the one that makes the code the most readable. They are about the same, I think.
I am probably just as clueless as you, but the quest is great! I have little experience in this field, but have touched the compiler before, so I was able to dump some Intermediate Representation of two functions that copy each element of a list into another list one-by-one. One with a walk and one with a recursive implementation. I don't know if LLVM optimises some of the differences away, but I highly doubt.

helper = \l, l2 ->
    List.walk l l2 \state, elem ->
        List.append state elem

helperRec= \l, l2 -> when l is
    [] -> l2
    [a] -> List.append l2 a
    [a, .. as b] -> helperRec b (List.append l2 a)

Walking should be faster than head-tail based recursion. Not by much though. Looking at the implementation of walk, it is just a recursive fn as well. Here it is
Getting into the nitty-gritty: Walk uses an unchanging list and an index to iterate the elements. Num.addWrap does not panic on owerflow so that check is skipped after the add operation, so it is cheaper than a regular add (and I'd imagine faster than a regular subtract with an underflow check...? we are at instruction level, so idk). You basically use the .. as tail as an index to keep track of the next list element. Instead of accepting a list and an index, you alway provide a list of which you get the first element. There must be an operation to create that new list. It is just a slice (no copying of the underlying list happens). A slice needs a pointer to the original list's bytes (incremented by an offset to start at the desired index, so there is a little bit of computation for you) and a length. The length will be the original list's length -1 (that is a checked subtraction (meaning it will panic instead of underflow) that the pattern match introduces). Pretty sure that's slower than the unchecked / wrapping addition. Not a big deal, but still something.

@Brendan Hansknecht Out of curiosity, couldn't this subtraction be Num.subWrap, since we know at compile time that for that branch, the list's length is always sufficently large?

There is also the question of what happens to ref counting. When you create a sublist you increment the ref count of the input list. In the example I tried, it almost immediately decremented the refcount , before the next recursive call, but those are still operations. I have no idea if those instructions will get to the final binary or not, or how ref counting of slices work.

Notice that these are related to the .. as tail approac, not to the recursion inherently. You can create a recursive fn that takes a list and an index, which you increment. That being said, at that point you should just use walk. It is a well know function, which makes it easier to understand than a custom recursive function with an index in it.

This is definitely a micro-optimization, which you don't need. I just hope you found something interesting in it
I'll keep the Roc IR around in case you are interested, but I think this was already too much information. Cheers!

Brendan Hansknecht (Dec 21 2023 at 23:13):

Brendan Hansknecht (Dec 21 2023 at 23:14):

Brian Carroll (Dec 21 2023 at 23:14):

Yes generally the built-in functions should perform similarly to recursion because both just compile to loops.

In some cases there might be some trick we managed to do in the built-in that the compiler couldn't do with pure Roc.

From a style point of view if there's a built-in that fits your use case then you should generally use it. If you have something complicated where they don't fit, then you need the full generality of recursion.

Richard Feldman (Dec 21 2023 at 23:39):

nice catch on the place where we could use Num.subWrap - happy to accept a PR for that!

mdznta (Dec 22 2023 at 01:22):

Thanks a bunch everyone! @Norbert Hajagos that explanation was awesome, much appreciated. Admittedly a bit over my head when getting down in the IR, but would love to know how you managed to do that!

Norbert Hajagos (Dec 22 2023 at 13:46):

It was a journey for me as well @mdznta , and was over my head too. First of all, you need to be able to build the compiler from source. After that, what I did:

mdznta (Dec 23 2023 at 01:13):

@Norbert Hajagos Wow that is a great tip! I'll certainly keep that in mind when I need to compare implementations. Thanks a bunch for the info and explaining the process!

Norbert Hajagos (Dec 23 2023 at 08:44):

It was my pleasure! Have a good one playing with the concept! And in case you celebrate it, happy Christmas!

Stream: beginners

Topic: Functionally starting out

mdznta (Dec 19 2023 at 22:21):

Brian Carroll (Dec 19 2023 at 23:08):

Brendan Hansknecht (Dec 19 2023 at 23:19):

mdznta (Dec 20 2023 at 03:14):

Brendan Hansknecht (Dec 20 2023 at 04:07):

mdznta (Dec 21 2023 at 02:14):

Brendan Hansknecht (Dec 21 2023 at 04:25):

Brendan Hansknecht (Dec 21 2023 at 04:28):

Brendan Hansknecht (Dec 21 2023 at 04:29):

mdznta (Dec 21 2023 at 19:59):

Norbert Hajagos (Dec 21 2023 at 23:00):

Brendan Hansknecht (Dec 21 2023 at 23:13):

Brendan Hansknecht (Dec 21 2023 at 23:14):

Brian Carroll (Dec 21 2023 at 23:14):

Richard Feldman (Dec 21 2023 at 23:39):

Richard Feldman (Dec 21 2023 at 23:39):

mdznta (Dec 22 2023 at 01:22):

Norbert Hajagos (Dec 22 2023 at 13:46):

mdznta (Dec 23 2023 at 01:13):

Norbert Hajagos (Dec 23 2023 at 08:44):