Stream: beginners

Topic: ✔ Splitting a string into graphemes/chars (or number into...


Lance Wicks (Mar 22 2024 at 21:06):

Hi,
I am trying to implement a function that takes a number (eg 12345), and creates a list of [1,2,3,4,5]
I am not finding the tool for the job and hoping this is the right place to ask.

In Perl for example I could do the_list = split "", "12345" (assuming I treat the number as a string)
Reading the roc docs, split "" "12345" would not work.

If someone could point me in the direction of the right idea I'd appreciate it.

Hristo (Mar 22 2024 at 21:25):

If you want to split it into characters, you could do something like the following:

12345 |> Num.toStr |> Str.toUtf8
# [49, 50, 51, 52, 53] : List U8

Jonas Schell (Mar 22 2024 at 21:29):

Surprisingly difficult. Perhaps there is a better way?

separate = \str ->
    Str.walkUtf8
        str
        []
        (\state, element ->
            [element]
            |> Str.fromUtf8
            |> Result.withDefault ""
            |> Str.toU64
            |> (\maybeNum -> List.appendIfOk state maybeNum)
        )

expect
    separate "123456" == [1, 2, 3, 4, 5, 6]

Lance Wicks (Mar 22 2024 at 21:34):

Thanks... trying both now.

Hristo (Mar 22 2024 at 21:35):

Yes, just as @Jonas Schell has pointed out, if you want your output to be unsigned integers, you'd have to go a bit further, like so:

12345
|> Num.toStr
|> Str.toUtf8
|> List.chunks 1
|> List.map (\c -> Str.fromUtf8 c |> Result.withDefault "" |> Str.toU64)
# Note: Str.toU64 returns a Result, so this is a list of Results rather than plain U64s

Btw, my understanding has been that the input is supposed to be an integer itself.

Richard Feldman (Mar 22 2024 at 21:50):

https://github.com/roc-lang/unicode will make this sort of thing easier, but it's still WIP at the moment :sweat_smile:

Luke Boswell (Mar 22 2024 at 22:00):

Grapheme.split works well for this use case, though we don't have a release yet that you can easily use from a URL.

Brendan Hansknecht (Mar 22 2024 at 23:08):

For a number specifically, I would do one of these two:

Input already a string:

"12345" |> Str.toUtf8 |> List.map \c -> c - '0'

If you need validation, use wrapping subtraction and check all outputs are less than 10.
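
The validation idea above can be sketched roughly as follows. This is a minimal sketch, not code from the thread: `digitsFromStr` and the `InvalidDigit` tag are hypothetical names, and it assumes the camelCase builtins (`Num.subWrap`, `List.all`) as they were named around the time of this discussion.

```roc
# Hypothetical helper, not from the thread.
digitsFromStr : Str -> Result (List U8) [InvalidDigit]
digitsFromStr = \str ->
    digits =
        str
        |> Str.toUtf8
        |> List.map (\c -> Num.subWrap c '0')

    # Bytes below '0' wrap around to large U8 values, so a single
    # `< 10` check rejects bytes on either side of the '0'..'9' range.
    if List.all digits (\d -> d < 10) then
        Ok digits
    else
        Err InvalidDigit

expect digitsFromStr "12345" == Ok [1, 2, 3, 4, 5]
expect digitsFromStr "12a45" == Err InvalidDigit
```

With plain subtraction, a byte such as '-' (45) would underflow when '0' (48) is subtracted; the wrapping version turns it into 253, which the `< 10` check then rejects.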

Input is an actual int:

digits = \num ->
    helper = \list, n ->
        if n == 0 then
            list
        else
            list
            |> List.append (Num.rem n 10)
            |> helper (n // 10)
    helper [] num
    |> List.reverse

Hristo (Mar 23 2024 at 19:37):

Or you could go with an adventurous solution like this one :joy:

toDigits = \n ->
    div = Num.divTrunc n 10
    mod = Num.rem n 10
    when Num.compare div 1 is
        LT -> [mod]
        _ -> List.append (toDigits div) mod

expect
    toDigits 1234567890 == [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

Brendan Hansknecht (Mar 23 2024 at 20:20):

Personally I wouldn't use less than due to negative numbers, but none of these methods deal with the sign if it has one
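
One possible way to deal with the sign is to report it separately from the digits. The following is only a sketch under assumptions: `signedDigits` is a hypothetical name, the record shape is an arbitrary choice, and it goes through `Num.toStr` rather than the arithmetic approaches above so that no negation of the input is needed.

```roc
# Hypothetical helper, not from the thread.
# The '-' byte, if present, is dropped before converting bytes to digit values.
signedDigits : I64 -> { negative : Bool, digits : List U8 }
signedDigits = \num ->
    digits =
        num
        |> Num.toStr
        |> Str.toUtf8
        |> List.dropIf (\c -> c == '-')
        |> List.map (\c -> c - '0')

    { negative: num < 0, digits }

expect signedDigits 1234 == { negative: Bool.false, digits: [1, 2, 3, 4] }
expect signedDigits (-120) == { negative: Bool.true, digits: [1, 2, 0] }
```

Unlike the recursive versions, this one returns `[0]` rather than an empty list for an input of 0, because `Num.toStr 0` is "0".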

Hristo (Mar 23 2024 at 20:34):

Yeah, that's a good point!

My implicit assumption has been that the input would only be from the set of natural numbers, as the sign cannot be meaningfully interpreted as a digit.

Btw, I've just realised that my solution looks a lot like yours, Brendan :man_facepalming: I thought I was being original, as it was something I came up with whilst listening to a podcast earlier today, as they were discussing something remotely related (I did read yours yesterday but I completely forgot about it :man_facepalming: I should've properly re-read the thread before posting :pray:).

Brendan Hansknecht (Mar 23 2024 at 21:21):

I mean yours trades off stack space to avoid the reverse call. Both have merits

Hristo (Mar 23 2024 at 21:28):

I think in your solution, append could be replaced with prepend to avoid the reverse call. I've just learned about prepend by checking the standard library (I was looking for insert/insertAt which aren't there; I think I was getting confused, in terms of available function names, with Dict.insert).

Norbert Hajagos (Mar 23 2024 at 21:36):

The one with reverse seems better, because Roc uses array-backed lists, not linked lists. Appending to the end is 1 operation, but prepending requires shifting the rest of the array 1 place to the right. The shift is probably implemented as a single operation, moving a block of memory in 1 swoop rather than copying the elements 1 by 1, but append plus reverse is still probably better performance-wise. Then again, your input size is so small that it doesn't matter. But it's a good thing to remember when dealing with bigger datasets.

Brendan Hansknecht (Mar 23 2024 at 21:37):

Was about to say the same thing. Append and reverse should be faster
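
For reference, the two variants being compared can be written side by side. This is a sketch only: `digitsAppend` and `digitsPrepend` are hypothetical names restating the approaches discussed above, and no timing is claimed here. Both produce the same list; they differ in whether each step adds at the end (followed by one reverse) or shifts the existing elements to add at the front.

```roc
# Hypothetical names, restating the two variants from the discussion.

# Append each digit at the end, then reverse once.
digitsAppend = \num ->
    helper = \list, n ->
        if n == 0 then
            list
        else
            helper (List.append list (Num.rem n 10)) (n // 10)

    helper [] num |> List.reverse

# Prepend each digit at the front; no reverse, but every
# prepend has to shift the elements already in the list.
digitsPrepend = \num ->
    helper = \list, n ->
        if n == 0 then
            list
        else
            helper (List.prepend list (Num.rem n 10)) (n // 10)

    helper [] num

expect digitsAppend 12345 == [1, 2, 3, 4, 5]
expect digitsPrepend 12345 == [1, 2, 3, 4, 5]
```

As with the earlier recursive version, both return an empty list for an input of 0.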

Hristo (Mar 23 2024 at 21:40):

Right, thank you both - understood and much appreciated! I admit I wasn't familiar with the internals to an extent that would allow me to reason properly about efficiency.
I confirm that the array vs linked-list trade-offs do make perfect sense, yes!

Norbert Hajagos (Mar 23 2024 at 21:43):

I've learnt these things thanks to Roc being performance-aware as well. Pretty cool that a high lvl language (and the community) can teach you these things in my opinion.

Notification Bot (Mar 28 2024 at 23:57):

Lance Wicks has marked this topic as resolved.


Last updated: Jul 06 2025 at 12:14 UTC