So I'm currently working on putting together an ISO date/time parsing Roc package, and just hit a major milestone: full date support (no time or date/time yet). Big ask, but just wondering if anyone would be interested in giving me a code review for the work so far?
Awesome!
In parseCalendarDateBasic I would recommend pattern matching on the strs. Then you can express that it should be 3 elements and bring all of those elements into scope at the same time. That way you can avoid having to get each element and unwrapping
For example:
parseCalendarDateBasic : Str -> Result Utc [InvalidDateFormat]
parseCalendarDateBasic = \str ->
when splitStrAtIndices str [4, 6] is
[year, month, day] -> doStuff year month day
_ -> Err InvalidDateFormat
If a function takes a lambda as the last argument you don't need to wrap it in parens. I'd say that's even the idiomatic way to write.
So you can do things like this and they look visually cleaner
List.map myList \elem ->
elem * 2
for the tests, you can avoid all the unwrap
s by comparing to Ok
, e.g.
expect unwrap (parseDate "2024-01-23") "Could not parse!" == @Utc (Num.toU128 ((19_723 + 22) * secondsPerDay * nanosPerSecond))
...becomes
expect parseDate "2024-01-23" == Ok (@Utc (Num.toU128 ((19_723 + 22) * secondsPerDay * nanosPerSecond)))
or, alternatively:
expect parseDate "2024-01-23" == ((19_723 + 22) * secondsPerDay * nanosPerSecond) |> Num.toU127 |> @Utc |> Ok
also, pattern matching on tuples can be used instead of if allOk [...] then
followed by unwrap
, e.g.
when (year, month, day) is
(Ok y, Ok m, Ok d) ->
numDaysSinceEpoch { year: y, month: m, day: d }
|> daysToNanos |> @Utc |> Ok
(_, _, _) -> Err InvalidDateFormat
that's a performance win because List
requires a heap allocation, but tuples don't! :smiley:
btw thanks for making this! At work we already know of a future use case where we're going to want to be able to parse ISO-8601 datetime strings in Roc, so this is very nice to see :heart_eyes_cat:
I started to work on one at some point but did not get as far as you have already!
oh! Also, I think Str.split
can be used in place of splitStrAtDelimiter
if I'm not mistaken
A few minor suggestions:
unwrap
from your library overall. For a library, there is almost always a better choice for how to handle/propagate errors.List.all results Result.isOk
hereList U8
, split with List.sublist/takeFirst/dropFirst
then convert back to a string. It will lead to simpler code.&&
) instead of if conditions and explicitly returning Bool.true/false
List.range
will materialize the entire list, probably fine, but may be slow/wasteful for large number of years.List.get
functions. It is not good style. Instead, use list pattern matching with when ... is
.Nat
. It is actually slotted for removal in the language. I believe for this package, U16
should be good, but U32
or U64
would be fine if wanted.Wow, thank you all for the code reviews. Can't say enough how much I appreciate everyone's time to look through my code and provide feedback. This is all awesome feedback. :star_struck:
Isaac and Richard, the tips on pattern matching multiple elements here are a huge help. Both of these will lead to much cleaner code and better error handling than my solution, and if we can get more performant with a tuple, awesome!!
Brendan, thanks for the in depth list of improvements. All of these are great tips. I'll be diving into making all these improvements later today! :working_on_it:
Btw Richard, great to hear this is something that could be useful at your work! I noticed there didn't seem to be any ISO date/time packages out there, and thought it could be a useful addition to ecosystem. Awesome to hear this could be filling a needed gap!
Also... Not sure how I missed Str.split
in the API... :sweat_smile: I think I started out looking for a splitStrAtIndex
equivalent, and when I didn't find that, it just went into my brain that there wan't a split function for Str. Derp!
Okay, all recommendations have been implemented! Thanks again for the review everyone!
While removing unwrap, I also removed all crash keyword usages in general. There is one instance in splitStrAtIndicesRecur where I think a crash may actually be the best failure mode, but I'm curious what others think. For now it will simply ignore the failure (which should never occur) and try to continue.
Latest code is in the repo :octopus:
If it truly should never occur. Thus, it would be a bug against the library if it occurs, crash is probably the right choice.
That said, in this case, you do have a bug:
expect
res = splitStrAtIndex "🔥🔥" 1
res == ["🔥", "🔥"]
If the case was a crash, it would be hit here.
And this is actually exactly why we don't have Str.splitAtIndex
in the standard library.
Not sure if that can be hit through the public api of your package, but that is the concern.
if you're up for it, I think an approach worth trying is to start by immediately converting to List U8
as the very first thing, and then doing all parsing in terms of that
that has the benefit of also working on raw byte streams, which is what Decode
uses
so you could support parsing them from both Str
as well as from List U8
to do that you'd convert things like "-"
to '-'
because single quote literals just compile directly to numbers (specifically a unicode code point)
I think we should probably start recommending parsing List U8
as a best practice, even if the input is a string! :big_smile:
Brendan Hansknecht said:
Not sure if that can be hit through the public api of your package, but that is the concern.
Good call here - It was possible previously to hit this by putting a multibyte character in one of the character positions where a digit would have been expected in the date string. I have now added validation to ensure the date string contains no multi-byte characters.
@Richard Feldman Yeah, I'm totally up for it. I'm beginning work on making those adaptations now. Looks like it might be in my best interest to figure out Decoders. Any pointers on where I can go for a code sample of custom decoders in use?
that's a good question! I forget if we have any material on that - maybe @Luke Boswell or @Brendan Hansknecht might know?
Doesn’t even need to be learning material, if there is any “production” code that uses decoders, I can start parsing through that…
I don't think a decoder would work here? They are more meant for decoding to/from a generic format like json or protobif.
Also, look up TotallyNotJson in the repo.
you could do something like make an opaque Timestamp
type which implements Decoding
for strings
that would give someone a way to decode a record from JSON while also decoding some of the fields as ISO-8601 timestamps
I'm not sure if that's actually the best design, but it's a thought!
Yeah, definitely can do that. Then a timestamp can be decoded from a string in any generic data format. Useful for end users, not useful for making the parsing nicer in this library.
This has come up a few times with Decode... I wonder if It is named poorly. I wonder if the serde folks regularly get the same question.
Thanks Brendon. I’ll take a look at it. From a brief scan at the Decode docs page, I thought there might be a more generic application with the custom decoders.
Richard, that is an interesting idea. Sounds like it could make for a pretty streamlined API for a common use case of ISO strings. Just to clarify, would timestamp be an opaque type which replaces the Utc opaque I’m currently using?
it's a good question, I'm not sure what would be more ergonomic for end users :thinking:
I think so, yeah
like Timestamp := Utc implements [Decoding { ...etc }]
Is there any reason I could not (or should not) simply define Utc := U128 implements [Decoding { …etc }]
?
you can, but the trick is that you have to pick 1 way to decode it
so if you do it that way, you're saying Utc
only decodes from an ISO-8601 timestamp as opposed to an integer
which maybe is what you want, but personally a thing I like about UTC is that I can treat it as an ordinary integer :big_smile:
Gotcha. Yeah, that doesn’t make a lot of sense to do… :rolling_eyes:
Just out of curiosity, why was Utc implemented as unsigned? Seems like a date pre-epoch could be useful to store. But maybe in terms of typical comparable dates, recent dates are almost exclusively to be expected, and older dates likely only need a textual representation…
yeah actually I can see an argument for signed! :+1:
Hmm well I just plan to stick with U128 for compatibility. Keeps things simpler for me too.
Feel free to push a PR to the roc projects to change UTC to signed
Brendan Hansknecht Anywhere I should make PRs besides roc-lang/basic-cli and roc-lang/basic-webserver?
I think just those two
So while working to migrate the basic-webserver to use I128
instead of U128
for Utc, I discovered a bug in InternalDateTime.epochMillisToDateTimeHelp
which causes it to fail to correctly convert millis to DateTime for the last day of a month or year.
I opened an issue on the repo, and submitted a corresponding PR.
Brendan Hansknecht said:
Feel free to push a PR to the roc projects to change UTC to signed
I've opened a PR for the issue on the basic-cli repo. Changes to the webserver are more involved, but should be ready for a PR soon(TM).
Okay, a matching PR for the webserver is up.
We're looking for some additional input on a change in this PR where the type of Utc
was changed from U128
to I128
.
deltaAsNanos
calculates the absolute difference between two Utc timestamps. The signature of deltaAsNanos
was changed from Utc, Utc -> U128
to Utc, Utc -> I128
. This is because Num.absDiff
returns the same type as its arguments (Num a, Num a -> Num a
).
absDiff
will never actually return a negative number. I'd like to make this behavior clear in the signature of deltaAsNanos
and cast the result of absDiff
with Num.toU128
. I believe this prevents confusion cause someone may otherwise see the I128 and wonder how this delta can be negative. It may also lead to wrong assumptions about how this function works. In general I also think it's good practice to avoid overly broad types.
On the other hand, the cast is not strictly necessary and it is kind of nice to use U128 everywhere and never need to cast for calculations.
Thoughts?
Yeah, I wish the roc type system had a bit more power so you could specify a same bit width number but signed or unsigned.
Personally, I would probably force them both into U128 even if they are negative and then run abs diff.
Should get the same abs diff, avoid the chance of abs overflowing, and get a more accurate type overall.
If only one is negative the abs diff would be different though
...ah yeah, you have to do some extra handling cause the negative number are put above the positive ones. (Would have to double check, but I think it may just be converting then flipping the first bit)
Cause imagining in I8/U8...
I8 inputs
-7 -> 11111001
12 -> 00001100
As U8 with same bit pattern
249
12
Flip first bit to put positives above negatives
121 -> 01111001
140 -> 10001100
Get abs diff:
19
Yeah, so I would do that
Is it really worth the increase in complexity to reduce the probability of overflow, given that we're dealing with timestamps? With an I128 we're still very comfortable to represent the age of the universe in nanoseconds.
Guess I didn't realize the scale of an I128.
That said, I really don't think it is that complex:
aCast = Num.bitwiseXor (Num.toU128 a) (Num.shiftLeftBy 1 127)
bCast = Num.bitwiseXor (Num.toU128 b) (Num.shiftLeftBy 1 127)
Num.absDiff aCast bCast
Personally though, I would really love if this could be done in Num.absDiff
internally and it could automatically return the unsized type to ensure that it will never panic.
Thanks for the input here guys. Brendan, I'll take your solution for handling oposing signs in the delta functions, and update both my PRs accordingly.
Changes are up, just waiting on CI.
Anton said:
We're looking for some additional input on a change in this PR where the type of
Utc
was changed fromU128
toI128
.
deltaAsNanos
calculates the absolute difference between two Utc timestamps. The signature ofdeltaAsNanos
was changed fromUtc, Utc -> U128
toUtc, Utc -> I128
. This is becauseNum.absDiff
returns the same type as its arguments (Num a, Num a -> Num a
).
absDiff
will never actually return a negative number. I'd like to make this behavior clear in the signature ofdeltaAsNanos
and cast the result ofabsDiff
withNum.toU128
. I believe this prevents confusion cause someone may otherwise see the I128 and wonder how this delta can be negative. It may also lead to wrong assumptions about how this function works. In general I also think it's good practice to avoid overly broad types.On the other hand, the cast is not strictly necessary and it is kind of nice to use U128 everywhere and never need to cast for calculations.
Thoughts?
Maybe I have an unusual perspective on time, having been a pulsar astronomer and having worked on Astropy's Time objects (which use a pair of double-precision numbers to provide sufficient precision), but are times before the UNIX epoch not of interest? Wouldn't a signed number of nanoseconds actually be useful in it's own right?
I realise that date formatting gets complicated for the distant past, but honestly no more so than the present era with our leap seconds. And how long before someone tries to parse the date of the Moon landing or the "shot heard round the world"?
Yeah, we just changed it to signed.
Last updated: Jul 26 2025 at 12:14 UTC