Num.mod is a trap? · ideas · Zulip Chat Archive

Stream: ideas

Topic: Num.mod is a trap?

Kevin Gillette (Apr 14 2022 at 04:13):

Working on the non-div portions of https://github.com/rtfeldman/roc/issues/2826, I have, for the first time in my career, been confronted by the difference between rem and mod. Roc's rem ("truncating remainder") appears to be the vastly more common default operation in programming languages, and indeed, in instruction sets (at least x86 has truncating remainder, as does wasm, yet neither have an instruction for a flooring remainder), according to https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages.

Nonetheless, the operation Roc calls rem is commonly labeled "mod" in imperative languages in the C family (and in that language family, the term "rem(ainder)" is rarely or never used.

Certainly both have the same behavior when both operands are non-negative, but my concern is that newcomers to Roc coming from languages calling this "mod" will reach for mod even though their desired semantics match Roc's rem--or they otherwise don't care which because they don't expect to pass negative values, but by choosing mod, they'll be choosing slower implementations on at least x86 and wasm.

What if we rename Num.mod to be Num.remFloor, and document rem as "being what you probably are looking for if you're coming from a C-style language" while documenting remFloor as "being equivalent to to mod in most functional languages, but that rem is more performant"? I suspect this will have the effect of encouraging most people to use rem unless they specifically know they want remFloor.

Nikita Mounier (Apr 14 2022 at 12:24):

Fully agree. And to go even further than that, I think we should outright remove the %% operator which currently desugars to Num.mod. In other languages like R, though, %% refers to the remainder and not modulus. Remainder is what people actually want most of the time, so we should remove %% for less ambiguity.

Richard Feldman (Apr 14 2022 at 12:29):

yeah I like the idea of removing %% :thumbs_up:

Nikita Mounier (Apr 14 2022 at 12:32):

yeah I like the idea of removing %% :thumbs_up:

Shall I make that into an issue and tag it as a good first issue?

Nikita Mounier (Apr 14 2022 at 12:33):

I'll probably end up doing it but just in case anyone else wants to do it

Richard Feldman (Apr 14 2022 at 12:33):

Roc's rem ("truncating remainder") appears to be the vastly more common default operation in programming languages

unfortunately lots of popular languages do this differently - e.g. in JavaScript and C, % means remainder, but in Python and Ruby it means modulo

Richard Feldman (Apr 14 2022 at 12:34):

for this reason Elm actually doesn't have a % operator, it just has named functions for mod and remainder so you have to spell out which one you want

Richard Feldman (Apr 14 2022 at 12:34):

so there's a real risk with % that if you port some Python code to Roc and the way it handles negatives matters, the ported code will have a bug

Richard Feldman (Apr 14 2022 at 12:35):

but I think if we have % then that's what people will use rather than calling Num.mod directly

Richard Feldman (Apr 14 2022 at 12:35):

because that definitely seems to be what people do in languages that have %

Richard Feldman (Apr 14 2022 at 12:36):

so I'm not worried about people reaching for Num.mod by mistake - I think they'll reach for %

Richard Feldman (Apr 14 2022 at 12:37):

and if they know they actually want modulo, mod is definitely the name that's used (even in mathematics!) for that operation, so I'd be reluctant to give it a different name

Nikita Mounier (Apr 14 2022 at 12:37):

so there's a real risk with % that if you port some Python code to Roc and the way it handles negatives matters, the ported code will have a bug

What if we limit Num.rem and % to Nat? Aka use the type system to enforce correctness

Richard Feldman (Apr 14 2022 at 12:37):

@Nikita Mounier feel free to make the issue and also feel free to go for it!

Richard Feldman (Apr 14 2022 at 12:40):

Nikita Mounier said:

so there's a real risk with % that if you port some Python code to Roc and the way it handles negatives matters, the ported code will have a bug

What if we limit Num.rem and % to Nat? Aka use the type system to enforce correctness

this isn't quite the same - the remainder operation is actually well-defined to work a certain way on negative numbers, so I don't think disallowing negative numbers is what we want

Nikita Mounier (Apr 14 2022 at 12:41):

Right, sorry, I got a little mixed up there.

Nikita Mounier (Apr 14 2022 at 12:43):

And we should definitely add it in the Num.mod documentation that Num.rem is probably what they're looking for, and detail the downsides of using Num.mod on certain platforms.

Kevin Gillette (Apr 14 2022 at 13:41):

I'm touching Num.mod in a PR already anyways, and can make a first attempt at such clarification (or if someone has particular wording in mind, I can wrap up my PR to make way for another person to do that doc clarification work).

Richard Feldman (Apr 14 2022 at 13:56):

I'm gonna revise the docs in the future to make them clearer anyway; I'll take care of it then!

Kevin Gillette (Apr 14 2022 at 15:25):

Richard Feldman said:

and if they know they actually want modulo, mod is definitely the name that's used (even in mathematics!) for that operation, so I'd be reluctant to give it a different name

Yes, it certainly appears to be called mod(ulo), but I'm not sure Roc's mod operation matches the mathematical convention. From the Congruence section of that same Wikipedia link, it suggests that the Euclidean interpretation is being used, and we can see that's indeed the case: https://gist.github.com/extemporalgenome/33cecccc7a5d998ac6c21a22838fc8fb

Dart uses Euclidean modulo, and it's the only one out of Euclidean, Flooring, and Truncating, that replicates the congruences listed in the article. https://en.wikipedia.org/wiki/Modular_arithmetic makes no mention of anything but the Euclidean variant, and flooring modulo appears to be attributed to Donald Knuth only as late as 1972 (relatively new compared to Euclid, ~300 BC).

In summary (based on my non-exhaustive research):

In mathematics, it appears that the Euclidean interpretation is customary when discussing modular arithmetic. Roc does not provide a function matching this interpretation, though few languages do.
In programming, "mod" has been ambiguously used to describe any of truncating, flooring, Euclidean, and other interpretations.
Euclidean, flooring, and truncating all agree when both operands are non-negative.
ML-family languages tend to use "mod" to mean flooring, but other languages (the vast majority of languages, and thus programmers) generally use "mod" to mean truncating.
There are a few family exceptions: Python deliberately broke from its family and chose flooring, while Dart broke from its family and chose Euclidean.
Richard: regarding your claim, albeit my research wasn't exhaustive, I haven't found any evidence that "mod" refers to flooring division in terms of customary mathematics, nor popularly in programming (since ML languages aren't [yet] particularly popular),

I do agree that people will reach for % instead of mod, but I believe many will reach for List.map mod thinking that they're getting the same behavior as List.map (\x, y -> x % y). Many Roc programmers might never even bother to learn/confirm/rememer what % desugars to, if they ever even learn that operators are desugared, and just assume it matches whatever term their earlier language encouraged as the default. I can't imagine that debugging this (at least for anyone with more imperative terminology background than ML functional terminology background) will be quick or fun.

I'll reassert that if we want to avoid confusion, we might distance ourselves from the "mod" terminology altogether, and that remFloor may be less likely to result in false friends bugs. Alternatively modTrunc and modFloor could disambiguate, but as discussed elsewhere, floor can easily be misinterpreted as "round towards zero."

Brendan Hansknecht (Apr 14 2022 at 18:08):

My anchor point for mod has always been powers of 2. Mod of a power of 2 should be equivalent anding with the power of two brought one closer to zero and keeping the original sign. This gives a clear performance story and maps to hardware well.

Brendan Hansknecht (Apr 14 2022 at 18:09):

I think if we have a % it is very important that this property holds true. Which would suggest it should always err towards 0.

Brendan Hansknecht (Apr 14 2022 at 18:10):

This is both expected by most low level bit twiddling algorithms

Brendan Hansknecht (Apr 14 2022 at 18:14):

In more concrete terms. x % 8 is equivalent to Num.bitwiseAnd x 0b0000_0111. assuming they type was an U8

Brendan Hansknecht (Apr 14 2022 at 18:29):

If the type is I8 it would is more complicated because we need to preserve the sign but it still is just anding, some shifting, and a subtraction. which is way faster than any sort of multiplication or real modulo operation.

Last updated: Jul 23 2026 at 13:15 UTC