replace Nat · ideas · Zulip Chat Archive

so I think there's a strong case to be made that we should replace Nat with U64

Richard Feldman (May 05 2023 at 01:04):

not to change any data structures, mind you - just any builtin function that currently accepts or returns Nat instead uses U64

Richard Feldman (May 05 2023 at 01:05):

so 64-bit targets would have no change, and on 32-bit targets there would be a cast behind the scenes from 32-bit to 64-bit integers (which after LLVM optimization would usually be no difference in practice I suspect)

Richard Feldman (May 05 2023 at 01:07):

so an obvious motivation (but not the main one, at least to me) is that it's one less concept to learn in the language, and also it means that when teaching the language to beginners, they don't need to encounter the concept of compilation targets (which isn't a thing you have to learn at all in lots of languages, e.g. the 3 most popular languages - Python, Java, JavaScript)

Richard Feldman (May 05 2023 at 01:09):

Richard Feldman (May 05 2023 at 01:11):

the reason for both of these is that you can use Nat to write Roc code that runs differently on different targets, because it overflows at different points - so you can do Num.addChecked 1nat 2^32 and on 32-bit targets it will return Err but on 64-bit targets it will return Ok

Richard Feldman (May 05 2023 at 01:11):

Richard Feldman (May 05 2023 at 01:12):

also, code that casts Nat to U64 today is lossless, but in a hypothetical future where 128-bit targets are a thing, that code will all silently become lossy and can cause bugs
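A rough Rust sketch of that point (not from the thread; the function names are made up for illustration, and `u128` stands in for a hypothetical 128-bit Nat): widening to 64 bits never loses information, but narrowing from 128 bits with a plain cast silently drops the high bits.

```rust
/// Widening a 32-bit length to 64 bits: always lossless.
pub fn widen_32_to_64(len: u32) -> u64 {
    u64::from(len)
}

/// Plain cast from a hypothetical 128-bit Nat: silently keeps only the low 64 bits.
pub fn narrow_unchecked(len: u128) -> u64 {
    len as u64
}

/// Checked alternative: reports when the value does not fit in 64 bits.
pub fn narrow_checked(len: u128) -> Option<u64> {
    if len <= u64::MAX as u128 {
        Some(len as u64)
    } else {
        None
    }
}
```

With these definitions, `narrow_unchecked((1 << 64) + 5)` quietly yields 5, while `narrow_checked` returns `None` for the same input.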

Richard Feldman (May 05 2023 at 01:13):

similarly, the code isTarget32bit = Result.isErr (Num.addChecked 1nat 2^32) is accurate today, but inaccurate if we introduce a 16-bit target

Richard Feldman (May 05 2023 at 01:14):

(likewise for isTarget64bit = Result.isOk (Num.addChecked 1nat 2^32) and 128-bit targets)
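To make the two probes above concrete, here is a minimal Rust emulation (not from the thread). It assumes a Nat that is `bits` wide and models it with `u128` arithmetic; `add_checked_nat`, `is_target_32bit` and `is_target_64bit` are hypothetical names mirroring the Roc expressions.

```rust
/// Emulates a checked add on a target whose Nat is `bits` wide (up to 128).
pub fn add_checked_nat(bits: u32, a: u128, b: u128) -> Option<u128> {
    // Largest value a `bits`-wide unsigned integer can hold.
    let max = if bits >= 128 { u128::MAX } else { (1u128 << bits) - 1 };
    let sum = a.checked_add(b)?;
    if sum <= max { Some(sum) } else { None }
}

/// Mirrors `Result.isErr (Num.addChecked 1nat 2^32)`.
pub fn is_target_32bit(bits: u32) -> bool {
    add_checked_nat(bits, 1, 1u128 << 32).is_none()
}

/// Mirrors `Result.isOk (Num.addChecked 1nat 2^32)`.
pub fn is_target_64bit(bits: u32) -> bool {
    add_checked_nat(bits, 1, 1u128 << 32).is_some()
}
```

Under this model the probes are right for 32 and 64 bits, but `is_target_32bit(16)` and `is_target_64bit(128)` both return true, which is the misclassification described above.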

Richard Feldman (May 05 2023 at 01:17):

granted, it's not a given that 128-bit targets will ever be a thing (since hardware today only permits using 48 of the 64 bits in practice, and we'd presumably want to start using all 64 bits before there would be demand for upgrading to 128) but in a future where 128-bit targets never happen (or perhaps by the time we want addresses that big, it's so far in the future that something more fundamental has changed about operating systems, and we need to rethink various representations of things anyway), and we only ever have 32-bit and 64-bit targets like today...then what was Nat getting us?

We could have just had U64 and actual target-independence and it would have been fine

Richard Feldman (May 05 2023 at 01:25):

supposing 128-bit targets do happen, at least having hardcoded List.len and friends to return U64 gives us options; we can, for example, do any one of these:

Richard Feldman (May 05 2023 at 01:27):

anyway, all things considered, it seems to me that replacing Nat with U64 is the right choice for the language

Richard Feldman (May 05 2023 at 01:27):

Brendan Hansknecht (May 05 2023 at 03:05):

Sky Rose (May 05 2023 at 05:26):

What about the size cost of using larger values everywhere? If roc starts supporting 16bit platforms and I want to write a program for a low-resource machine (pico-8?) would forcing all Nats to be 4 times as big create problems with using too much memory?

(Don't weigh my comment heavily. I don't have experience with that sort of programming and don't know if that'd be an issue, and it's all just hypothetical anyway.)

Anton (May 05 2023 at 09:10):

Folkert de Vries (May 05 2023 at 09:21):

well part of the idea here is that in data structures, we still use the pointer size. I think that mitigates most of the downsides

Folkert de Vries (May 05 2023 at 09:22):

yes you do have to do some stuff with 64-bit values that could strictly speaking be done with a smaller value, but I suspect that the cost is marginal.

Richard Feldman (May 05 2023 at 13:24):

yep! Although just to be clear, the builtin functions would use U64 but under the hood the data structures would still be usize internally - so for example on 32-bit targets we wouldn't be wasting memory storing 64-bit lengths in memory
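A small Rust sketch of that split (not from the thread; `RocList` is a hypothetical stand-in, backed by a `Vec` whose length is pointer-sized): the stored length stays `usize`, and only the public functions speak `u64`.

```rust
/// Hypothetical list type: storage uses the target's pointer-sized length.
pub struct RocList<T> {
    items: Vec<T>, // Vec keeps its length and capacity as usize
}

impl<T> RocList<T> {
    pub fn new() -> Self {
        RocList { items: Vec::new() }
    }

    pub fn push(&mut self, item: T) {
        self.items.push(item);
    }

    /// Public API always returns u64; the widening happens only here.
    pub fn len(&self) -> u64 {
        self.items.len() as u64
    }

    /// Public API accepts a u64 index and narrows it at the boundary.
    pub fn get(&self, index: u64) -> Option<&T> {
        // An index that does not fit in usize cannot be in bounds.
        if index > usize::MAX as u64 {
            return None;
        }
        self.items.get(index as usize)
    }
}
```

On a 32-bit target the struct still stores 32-bit lengths; only values crossing the API boundary are 64 bits wide.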

Brendan Hansknecht (May 05 2023 at 13:25):

I was assuming builtins would all use usize everywhere and just add casting at the boundaries?

Richard Feldman (May 05 2023 at 13:26):

Brendan Hansknecht (May 05 2023 at 13:29):

I think this will hurt if we ever have 16bit. I don't think it will hurt too bad on 32bit.

Richard Feldman (May 05 2023 at 13:30):

I haven't verified on godbolt, but I expect that the fact that we convert builtins to llvm bitcode will make a lot of the casts go away

Richard Feldman (May 05 2023 at 13:30):

Richard Feldman (May 05 2023 at 13:31):

Brendan Hansknecht (May 05 2023 at 13:32):

Richard Feldman (May 05 2023 at 13:32):

Richard Feldman (May 05 2023 at 13:33):

so I'm pretty sure LLVM will see all that and realize that the casts from 16-bit to 64-bit and then back to 16-bit are not necessary, and will drop them
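For reference, the round trip being discussed looks like this in Rust (a sketch, not from the thread; `u16` emulates a 16-bit target's stored length and both function names are made up). It only shows that widening and then narrowing is the identity, which is what would allow an optimizer to drop the casts once everything is inlined; whether LLVM actually does so is not verified here.

```rust
/// API shape under the proposal: the stored length is returned as a 64-bit value.
pub fn len_as_u64(stored_len: u16) -> u64 {
    u64::from(stored_len)
}

/// A caller on the emulated 16-bit target that immediately narrows again.
pub fn round_trip(stored_len: u16) -> u16 {
    // Lossless: the value started out as a u16, so it always fits.
    len_as_u64(stored_len) as u16
}
```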

Brendan Hansknecht (May 05 2023 at 13:33):

Richard Feldman (May 05 2023 at 13:33):

sure, but I think that tends to describe the way these functions are used in practice

Richard Feldman (May 05 2023 at 13:34):

for example, I think if I were writing some sort of SoA thing in pure Roc, I wouldn't use Nat anyway

Richard Feldman (May 05 2023 at 13:34):

Brendan Hansknecht (May 05 2023 at 13:34):

I don't think so. Many builtins don't get inlined or at least not fully. Even something like List.set is large enough and used in enough places that it doesn't get inlined (though maybe this one got fixed when I increased the inline threshold). List.set is not that large of a function.

Brendan Hansknecht (May 05 2023 at 13:35):

Richard Feldman (May 05 2023 at 13:35):

Brendan Hansknecht (May 05 2023 at 13:37):

If you catch everything and a user is careful to define types (including intermediate values inside of a function), i guess it could be ok. I still think this is likely to cause weird perf edges.

Brendan Hansknecht (May 05 2023 at 13:38):

That said, I really only think it will hurt when dropping to 16 bit. Where a 64 bit value is giant and often you don't have a lot of memory so inlining everything and bloating the executable is bad.

Richard Feldman (May 05 2023 at 13:40):

Richard Feldman (May 05 2023 at 13:41):

it would be cool in theory that you could just write 16-bit Roc code in the same style as 32-bit or 64-bit, I just have a hard time imagining that would work in practice :big_smile:

Brendan Hansknecht (May 05 2023 at 13:43):

I think we have already pointed out a number of reasons that roc isn't necessarily gonna be great for embedded. If we accept that embedded long term will remain second class for roc (which i think is totally fine, still could be fun to use on embedded and people manage to hack python to work on embedded), i think this is a totally fine change.

The biggest other concern being accidental register pressure leading to more spilling to cache and potential perf hits. Just maybe you have to add a lot of casts to random variables that you normally wouldn't have typed in roc.

Brendan Hansknecht (May 05 2023 at 13:44):

I do agree that the overall concern is pretty minor, but it can just randomly be wasteful and is easy to miss when many variables within functions in roc never get a type added to them.

Brendan Hansknecht (May 05 2023 at 13:46):

Aside, overall, i think i am for this change and saying that roc is focused on desktop and server systems. As such, though 32 bit and 16bit may be supported (even very well), they are not the core focus. In all cases, high perf should be possible, but smaller bit systems are slightly lower class.

Richard Feldman (May 05 2023 at 13:47):

Brendan Hansknecht (May 05 2023 at 13:47):

Hmm...what if i want to use a 32 bit hash on 32 bit systems and for that to be faster? Will we expose a built-in to check system info?

Brendan Hansknecht (May 05 2023 at 13:47):

Brendan Hansknecht (May 05 2023 at 13:48):

I guess we kinda have to expose similar info anyway if we want to expose a good simd chunk size to the user

Richard Feldman (May 05 2023 at 13:52):

hm, is that true? I thought we could do simd by offering primitives that are sort of agnostic to the chunk size

Richard Feldman (May 05 2023 at 13:52):

either that or we offer primitives that say "this will operate on an exact chunk size of N, but if that's not available in hardware, we'll automatically emulate it in software"

Richard Feldman (May 05 2023 at 13:53):

basically so the "simd" logic always works exactly the same way no matter the target - it just might be more or less efficient

Richard Feldman (May 05 2023 at 13:54):

I think this only specifically comes up when hashing collection lengths, which seems like it should be a small enough percentage of all hashes being performed that I'm not worried about it

Richard Feldman (May 05 2023 at 13:54):

it doesn't come up when hashing pointers because we never hash the pointer itself, but rather dereference its contents and hash that instead

Brendan Hansknecht (May 05 2023 at 13:57):

No, you misunderstand. I want to change a huge chunk of my code to use u32 and different algorithms instead of u64. So full replacement for faster code. I am not talking about just hashing a length.

Brendan Hansknecht (May 05 2023 at 13:58):

For example, maybe I want to use sha256 on 32bit systems and sha512 on 64bit systems.

Richard Feldman (May 05 2023 at 13:59):

Brendan Hansknecht (May 05 2023 at 14:00):

This is a contrived example, but imagine that we would prefer sha512 everywhere, but it is too slow on 32bit systems.

Richard Feldman (May 05 2023 at 14:00):

Brendan Hansknecht (May 05 2023 at 14:01):

Let's just keep to the contrived example, assume sha512 is way faster on 64bit systems.

Brendan Hansknecht (May 05 2023 at 14:01):

Brendan Hansknecht (May 05 2023 at 14:03):

I want my code to be fast on both systems, so I need a way to distinguish the systems and pick different code.

Richard Feldman (May 05 2023 at 14:07):

do such algorithms exist though? Like is there any algorithm where the same algorithm runs faster on 64-bit systems but slower on 32-bit systems, and there's another algorithm which runs faster on 32-bit systems but slower on 64-bit systems?

Richard Feldman (May 05 2023 at 14:07):

Brendan Hansknecht (May 05 2023 at 14:08):

Brendan Hansknecht (May 05 2023 at 14:09):

generally 64bit also has more memory. So it can use more cache and may have different levels of dependencies that make sense.

Brendan Hansknecht (May 05 2023 at 14:10):

Also, since hashing is the base for sets and dictionaries and some algorithms, it indirectly affects all of those uses as well.

Brendan Hansknecht (May 05 2023 at 14:10):

Also, in some cases you may opt for more overflow safety on 64 bit systems, but give that up intentionally on 32 bit systems to save memory.

Richard Feldman (May 05 2023 at 14:13):

hm, ok so I wonder what's the specific application scenario where someone wants this

Richard Feldman (May 05 2023 at 14:14):

it has to be something where I'm writing an application that gets deployed both to 32-bit and to 64-bit targets, and hashing is a significant part of performance

Brendan Hansknecht (May 05 2023 at 14:14):

Also, all integer operations will be slower if you use a U64 instead of U32 on a 32bit system. So if you want to not pessimize on 32bit systems, some code may want to use U32 instead of U64 on 32bit systems. Though I guess you could argue that you should use U32 on both because it is faster on both due to less memory pressure.

Brendan Hansknecht (May 05 2023 at 14:14):

Richard Feldman (May 05 2023 at 14:14):

yeah I think that's just a specific case of "use the smallest integer type you can get away with"

Richard Feldman (May 05 2023 at 14:14):

Richard Feldman (May 05 2023 at 14:15):

so I guess the "I want to have a library that does something different on 32-bit vs 64-bit targets" is kind of a separate discussion from Nat itself

Richard Feldman (May 05 2023 at 14:16):

and the main question on my mind there is whether it's worth it to enable that at the cost of giving up "Roc code gives the same outputs for the same inputs on all targets, so you never need to run roc test on different targets"

Richard Feldman (May 05 2023 at 14:17):

and my feeling right now is that it's not worth it, and target-aware Roc code shouldn't be a thing, even if that means we give up some perf in the specific scenario where you have one hashing algorithm that runs faster on 32-bit target and another that would run faster on 64-bit targets

Brendan Hansknecht (May 05 2023 at 14:19):

Also, if your goal is to fix "so you never need to run roc test on different targets". Enabling doing different things based on 32 vs 64 would add this problem right back in. Hopefully library code is written well and only they need to test on multiple systems, but it could definitely affect end users. So I think saying we will remove Nat, but maybe add doing different things based on 32 vs 64, it feels like you haven't gained much.

Brendan Hansknecht (May 05 2023 at 14:21):

Though I guess most places will probably only ever run code on 32 bit or 64 bit computation systems (note, wasm is a 64bit computation system), not both, so maybe it is just fine.

Richard Feldman (May 05 2023 at 14:22):

oh that's what I'm saying - I don't think we should enable doing different things based on 32 bit vs 64 bit :big_smile:

Richard Feldman (May 05 2023 at 14:22):

Brendan Hansknecht (May 05 2023 at 14:24):

I guess what I am trying to say is that Nat doesn't have much value, but having the best perf when using hashing, dictionaries, and sets matters a lot. So I think those should be the focus of the discussion around roc test being consistent on all platforms.

Richard Feldman (May 05 2023 at 14:27):

Richard Feldman (May 05 2023 at 14:28):

we can have them do whatever we want under the hood, and it won't be observable in userspace because we've already made their hashing functions unobservable so we can upgrade them as a nonbreaking change :grinning:

Richard Feldman (May 05 2023 at 14:28):

Brendan Hansknecht (May 05 2023 at 14:29):

Currently it wouldn't, but personally I don't think that solves the issue. What if I want to hash a file?

Brendan Hansknecht (May 05 2023 at 14:29):

Richard Feldman (May 05 2023 at 14:30):

Brendan Hansknecht (May 05 2023 at 14:30):

Also, we may want to expose changing the hash for dict in the long term. That or we want to implement at least 4 different hashing algorithms in the standard library dictionary.

Richard Feldman (May 05 2023 at 14:31):

interesting! what would be the use cases there? some hashing algorithms being faster for some key types than others?

Brendan Hansknecht (May 05 2023 at 14:31):

2 per target. One for short and one for long data. That would at least be a rough approximation.

Richard Feldman (May 05 2023 at 14:31):

Brendan Hansknecht (May 05 2023 at 14:32):

I agree that we can make an extremely fast standard library dictionary for most types. I still think that having fast and flexible userland hashing is important.

Brendan Hansknecht (May 05 2023 at 14:32):

Brendan Hansknecht (May 05 2023 at 14:33):

Only the standard library dict is fast. No other userland datastructures can get close because they can't use the magic hash function in the standard library that is impossible to write in roc.

Richard Feldman (May 05 2023 at 14:35):

hm, ok I think it would help to walk through a specific example - what are some specific pairs of hashing algorithms where:

Brendan Hansknecht (May 05 2023 at 14:37):

wyhash operates on larger blocks and uses 64bit math. wyhash32 uses smaller blocks and 32bit math

Brendan Hansknecht (May 05 2023 at 14:38):

The larger blocks and low cost of 64bit math make wyhash faster on 64bit systems. The cheaper math makes wyhash32 faster on 32bit systems

Brendan Hansknecht (May 05 2023 at 14:38):

Richard Feldman (May 05 2023 at 14:38):

Brendan Hansknecht (May 05 2023 at 14:39):

Also, wyhash has a few more variants if you want to tailor more to arm cpus that are missing features or occasionally have extra ones.

Richard Feldman (May 05 2023 at 14:40):

so in that situation, always using wyhash64 would be optimal as long as it's being run on 64-bit CPUs, even if you're running on a 32-bit target like wasm32 (so long as it's actually running on a 64-bit CPU)

Richard Feldman (May 05 2023 at 14:41):

so the issue would be the specific scenario where I have an application that wants to be used on a machine with a 64-bit CPU and also on a machine with a 32-bit CPU, like a raspberry pi

Richard Feldman (May 05 2023 at 14:46):

libraries aside (which could presumably be vendored if absolutely necessary), if I really wanted wyhash32 on one and wyhash64 on the other, it's still possible to do this by:

obviously that's much less ergonomic than having userspace selection between the two targets, but it would work. If I really really needed that performance, I could get it.

Richard Feldman (May 05 2023 at 14:47):

but in that specific scenario I also wonder about: if my application runs well on a rpi, it can't be too CPU-intensive in general...so is it really going to be a noticeable problem on 64-bit targets if I just choose wyhash32 always?

Brendan Hansknecht (May 05 2023 at 14:49):

If someone told you that all hashing would be half as fast, would you accept that?

Brendan Hansknecht (May 05 2023 at 14:49):

Richard Feldman (May 05 2023 at 14:51):

Brendan Hansknecht (May 05 2023 at 14:51):

Also, I do specifically expect this to come up the most with libraries. We totally could standardize on maintaining two versions of all functions where this perf matters. Then just manually change? Seems like terrible ergonomics for an easy to solve problem.

when Target.registerWidth is
    32 ->
    64 ->

Richard Feldman (May 05 2023 at 14:52):

like if you have someone run two versions of the program, one using wyhash64 and one using wyhash32, and they can't tell a difference, I don't think there's a problem

Brendan Hansknecht (May 05 2023 at 14:53):

Do you ever expect a roc desktop or cli app to hash files that are large (or many of them so it adds up to a lot of data)? You will feel the 2x there.

Richard Feldman (May 05 2023 at 14:53):

Brendan Hansknecht (May 05 2023 at 14:53):

Richard Feldman (May 05 2023 at 14:54):

Brendan Hansknecht (May 05 2023 at 14:56):

Richard Feldman (May 05 2023 at 14:56):

oh I mean like a specific CLI app that you'd want to run on both a desktop and on a rpi

Richard Feldman (May 05 2023 at 14:56):

Brendan Hansknecht (May 05 2023 at 14:58):

Brendan Hansknecht (May 05 2023 at 15:00):

Essentially anything that interacts with files and wants to be able to short circuit by using a hash.

Richard Feldman (May 05 2023 at 15:01):

:thinking: for files specifically, platforms could offer a function for hashing them

Richard Feldman (May 05 2023 at 15:01):

that could be target-specific, since platforms can already run target-specific host code
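As a sketch of what such target-specific host code could look like (not from the thread): a platform written in Rust can pick an implementation at compile time with `cfg(target_pointer_width)`. `host_hash` is a hypothetical name, and FNV-1a is swapped in here purely because it is short; it is not the wyhash discussed above.

```rust
/// 64-bit targets: FNV-1a with 64-bit arithmetic.
#[cfg(target_pointer_width = "64")]
pub fn host_hash(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV-1a 64-bit prime
    }
    hash
}

/// Other targets: FNV-1a with 32-bit arithmetic, widened only at the end.
#[cfg(not(target_pointer_width = "64"))]
pub fn host_hash(bytes: &[u8]) -> u64 {
    let mut hash: u32 = 0x811c9dc5; // FNV-1a 32-bit offset basis
    for &byte in bytes {
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(0x01000193); // FNV-1a 32-bit prime
    }
    u64::from(hash)
}
```

Note that the two variants return different values for the same bytes, so anything that persists or compares such hashes across targets would see the difference; that is the host's decision rather than something Roc code itself branches on.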

Brendan Hansknecht (May 05 2023 at 15:02):

Richard Feldman (May 05 2023 at 15:03):

Brendan Hansknecht (May 05 2023 at 15:05):

I think you should think about sockets, the postgres library recently built in roc, and general platform fragmentation.

all of that postgres library could be in a platform, but we want it to be possible in roc.
I think this general class should fall in the same category.

Richard Feldman (May 05 2023 at 15:05):

Richard Feldman (May 05 2023 at 15:06):

but another good question to ask is: suppose everyone in the Roc ecosystem does 64-bit hashing, what are the specific bad things that happen?

Brendan Hansknecht (May 05 2023 at 15:09):

Many 32bit systems are multiple times slower any time they hash. Any app that depends decently on hashing has a significant unnecessary slowdown. All applications that use hashing take more battery life due to increased computation. We can never do hashing on 16 bit systems (well we can, but the perf would make it non-viable)

Richard Feldman (May 05 2023 at 15:09):

so specifically, if someone builds a roc app that compiles to 32-bit raspberry pi, and does hashing of large files, then it will run slower on raspberry pi unless they're using a platform that exposes a primitive for reading and hashing files (which uses 32-bit or 64-bit under the hood depending on target), or they go out of their way to accept worse ergonomics specifically for their hashing function (and possibly vendoring libraries if need be) in that they need to swap out the file that does the hashing when building for rpi

Richard Feldman (May 05 2023 at 15:11):

I think it's important to note that "it has to run massively slower and everyone just has a bad experience" is only true if the application author is willing to go to the trouble of building and distributing 32-bit binaries for rpi but not willing to do a custom build to swap out the hashing function

Richard Feldman (May 05 2023 at 15:13):

that's an important distinction to me, because there is a big difference in my mind between:

Brendan Hansknecht (May 05 2023 at 15:13):

Why add the friction though? Also, many apps are death by 1000 papercuts. Repeat a small slowdown in a bunch of places (or just bad architecture from the beginning) and you end up with the piles of slow apps that exist today. Everyone ends up waiting longer or with a jankier experience.

Richard Feldman (May 05 2023 at 15:14):

well the reason to add the friction is to have the language-wide guarantee that Roc code gives the same answers regardless of target

Brendan Hansknecht (May 05 2023 at 15:14):

Also, I don't see the value in roc test being the same on all targets. You are still building on top of a platform that needs to be tested. So you fundamentally need to test on all the targets you deploy to.

Richard Feldman (May 05 2023 at 15:15):

to flip that around, "why sacrifice that language-wide guarantee for the entire ecosystem for the sake of making a build step a bit more convenient for the 0.0001% of Roc programmers who are building for both 32-bit raspberry pi and desktop applications and doing enough hashing that target-specific hashing makes a noticeable performance difference"

Richard Feldman (May 05 2023 at 15:16):

Richard Feldman (May 05 2023 at 15:17):

Richard Feldman (May 05 2023 at 15:18):

like concretely I started thinking about this because of wanting to use Roc at work by calling Roc functions from NodeJS via WebAssembly

Richard Feldman (May 05 2023 at 15:19):

which means all of our top-level expects, everything we try out in the repl...it might actually do something different in production

Richard Feldman (May 05 2023 at 15:20):

like there might be bugs because somewhere in some Roc code path there's a conditional based on the target, and it's just doing something different - and there's a bug in that much-less-tested code path

Richard Feldman (May 05 2023 at 15:20):

and we don't notice it until we've done a production deploy and got bitten by it

Richard Feldman (May 05 2023 at 15:21):

Richard Feldman (May 05 2023 at 15:22):

and more to the point, this is not a concern in JavaScript, Java, Python, Elm, ...

Richard Feldman (May 05 2023 at 15:23):

it's a case where Roc has a category of potential production errors that similar languages don't have

Richard Feldman (May 05 2023 at 15:23):

Richard Feldman (May 05 2023 at 15:24):

and generally speaking we're trying to remove entire categories of errors compared to other high-level languages, rather than introducing them :sweat_smile:

Richard Feldman (May 05 2023 at 15:24):

but at the same time, we also want to run faster than them, so there is definitely tension here!

Richard Feldman (May 05 2023 at 15:25):

especially considering we also want to be useful on a variety of targets - e.g. Golang can say "this is designed for servers" but our scope isn't that limited

Brendan Hansknecht (May 05 2023 at 15:29):

I think this is a case of boundaries. When something like Target.registerWidth is used correctly, it should never visibly do something different in a way that affects tests. That is the same with all of the standard library. Even though small string size is different or the hashing algorithm, it shouldn't cause userspace effects based on target. That is also generally the case with platforms.

What I am trying to say is that I do think the cases where something like Target.registerWidth should be used are extremely limited, but they tend to be the kind of stuff where performance really matters, and a whole chain of things depend on it. Hopefully because of the huge chain of dependencies, you can trust the code and not worry about cross platform issues.

This happens all the time: v8, cpython, numpy, any python library that calls c code, any java library that calls c code, the jvm itself. I bet multiple of these have had bugs that affect specific targets. Nonetheless, people still use them, test them, and ignore the platform constraints (while getting platform specific speed ups).

Roc aims to have higher peak performance than all of these languages. As such, I think there will be some libraries where this matters. I agree with you that the rest of Roc should not need to care about this. They should just use the library, get amazing speeds, and never think about the fact it is optimized based on the target. I think removing Nat is completely reasonable, but if we do so, I think that we should add in some sort of Target.registerWidth.

Other languages and tools aren't immune to the issues you list. Also, roc still wouldn't be immune due to being on top of a platform.

Richard Feldman (May 05 2023 at 15:32):

Richard Feldman (May 05 2023 at 15:33):

I wonder if there's some way to sort of "limit the blast radius" here, or maybe make it more visible what subset of the program is target-specific, kinda like Rust unsafe

Richard Feldman (May 05 2023 at 15:33):

for example, maybe instead of having it be an expression-level thing, it's that packages can use one of two different modules depending on target

Richard Feldman (May 05 2023 at 15:34):

so you could tell right in the package's module header whether it was doing this

Richard Feldman (May 05 2023 at 15:34):

Brendan Hansknecht (May 05 2023 at 15:35):

interesting idea. Yeah something of that nature could potentially be reasonable.

Brendan Hansknecht (May 05 2023 at 15:36):

Maybe a bit more bug prone though. I would assume that would make it more likely that they don't get updated together

Brendan Hansknecht (May 05 2023 at 15:36):

Brendan Hansknecht (May 05 2023 at 15:38):

Also, other thought on testing, if we only expose 32 vs 64 register width, it totally can be fully tested on one system. So instead of testing for wasm, you are testing for 32bit and just using only 32bit assembly. So running a 32bit application on a 64bit system

Richard Feldman (May 05 2023 at 15:40):

Richard Feldman (May 05 2023 at 15:41):

theoretically we could even consider doing it automatically for all the relevant targets, if we could cheaply enough (e.g. using the module dependency graph) detect which code paths could possibly give different answers per target :thinking:

Richard Feldman (May 05 2023 at 15:46):

it would be pretty cool to get an error like "hey this test passes on 64-bit targets but not on 32-bit targets" after running roc test normally, not even thinking about it

Brendan Hansknecht (May 05 2023 at 15:49):

Richard Feldman (May 05 2023 at 15:49):

it's good that this is decoupled from Nat, since it means figuring out what to do (e.g. per-target modules vs builtin constant vs etc.) doesn't block changing the Nat APIs

Brendan Hansknecht (May 05 2023 at 15:50):

Yeah. I totally agree. I think all of these Nat changes are completely great assuming we recognize what restrictions it's adding and work on a plan to alleviate that.

Bryce Miller (May 05 2023 at 18:40):

Sky Rose (May 05 2023 at 22:00):

I like this idea of being able to choose which size you use, but in an obvious explicit way instead of the compiler silently choosing for you. And especially the idea of being able to run the 32 bit version to test it on a 64bit machine (or vice versa). Most people will never think about it, so they get all the benefits of removing Nat.

Sky Rose (May 05 2023 at 22:00):

Another idea for if we get rid of Nat: What if by default List functions always return u64, but if you know you're on a platform where the performance/size matters, and you can guarantee that the list will never be too big, you could use a List8 that returned u8, or a Dict32 that used a 32bit hash.

It's explicit and opt in, so most people get the benefits of not using Nat. It would have consistent behavior no matter which target you compile for. It allows optimizing which implementation to use independent of the target (what if you want the performance of a List8 in one place, and capacity of a ~~List64~~ List in another), and is pretty future proof for adding 16bit or 128bit targets. And it combines well with the other idea of explicitly checking the target, if you want to use that to decide which implementation to use.

The downside is that it's a step towards the land of Java offering 13 different implementations of Dictionaries.
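A rough Rust sketch of what a List8 along those lines might look like (hypothetical, not from the thread; a `Vec` stands in for the storage, whereas a real List8 would presumably store a one-byte length): the capacity limit is explicit and identical on every target.

```rust
/// Hypothetical list whose length always fits in a u8.
pub struct List8<T> {
    items: Vec<T>, // stand-in storage for this sketch
}

impl<T> List8<T> {
    pub fn new() -> Self {
        List8 { items: Vec::new() }
    }

    /// Rejects the push (handing the item back) once 255 elements are stored.
    pub fn push(&mut self, item: T) -> Result<(), T> {
        if self.items.len() >= u8::MAX as usize {
            return Err(item);
        }
        self.items.push(item);
        Ok(())
    }

    /// Never truncates, because push keeps the length at 255 or below.
    pub fn len(&self) -> u8 {
        self.items.len() as u8
    }
}
```

Because the limit comes from the type rather than the target, the 256th push fails the same way on 16-bit, 32-bit and 64-bit builds.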

Brendan Hansknecht (May 05 2023 at 22:02):

Just to clarify, in the current proposal, on a 32bit system, you would get what you have named a List32 by default. It is just that when we return a size or index to the user, we upcast it to a U64

Brendan Hansknecht (May 05 2023 at 22:02):

Sky Rose (May 05 2023 at 22:09):

Georges Boris (May 05 2023 at 22:33):

So, if I'm building for a 32bit system and I use, say List.length the type will be U64? wouldn't that confuse the end user? Is that why we have Nat currently? for typing system-bound values?

Georges Boris (May 05 2023 at 22:38):

as a complete noob in system development, I would prefer to have something explicitly dynamic, knowing that it could break at different points in different systems - and if I wanted to deploy in a system different from my own I would probably need to test my application in that system as well.

wouldn't it still make for a simple experience for the 99.9% of systems that are being developed and deployed on similar hardware? (and the example given of CI using 64bit and then production using 32bit -- would you really catch these types of errors in unit tests? and wouldn't it be really advised to test your application on a similar machine?) - seems like solving it in Roc is what is at hand, but ideally this would be an infrastructure problem, am I tripping?

Brendan Hansknecht (May 05 2023 at 22:59):

Why is that confusing? List.length always returns a U64. That seems like a very clear contract. If you don't know systems dev, you will not know that returning U64 is a little strange. If you do know systems dev, you may question it, but fundamentally a U32 fits in a U64.

Brendan Hansknecht (May 05 2023 at 23:00):

Brendan Hansknecht (May 05 2023 at 23:03):

Probably, though roc being a sandboxed language does have some ability to make this unnecessary. Think of python or java or js. You don't feel the need to test on an rpi and on your windows desktop and on your mac laptop. You just run the tests on your linux ci server and move on.

Georges Boris (May 06 2023 at 11:25):

Georges Boris (May 06 2023 at 13:27):

how do python, js and java deal with this particular problem? or do they not, because their scope is narrower? (embedded systems for instance)

I totally hear what Richard said about maybe our use case falls out of embedded systems most of the time but would things like console game development also face these issues?

what is the best ergonomic strategy currently used for the code-once-run-anywhere when there are multiple system architectures involved?

is the code-once approach not the best ergonomic in these scenarios since you would need to optimize for the opaque system-optimizer algorithm instead of just accessing the different strategies for each system directly for your own, very specific, use case?

Brendan Hansknecht (May 06 2023 at 14:24):

Looking at Nat specifically, none of them have it. Python has an infinitely growing integer type. Java uses a signed 32-bit int for indexing (so you can't have an array with more than 2 billion-ish elements). JS has its weird number type (and is limited to 32 bits worth of elements in an array)

For general optimization based on platform, I think essentially no code written in these languages does that. They may call C code that is optimized for the platform, but generally that is in large, well tested libraries that are assumed to be correct on all platforms. Of course in the case of Java and JS, the jit can optimize for the target machine (though that will be limited by data types), and python only gets any architecture specific optimizations added to the interpreter.
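
As a rough illustration of the difference between an unbounded integer and a fixed-width one, here is a small Python sketch. The function `add_checked` is a hypothetical stand-in for a checked addition at a given bit width; it is not any language's real builtin.

```python
from typing import Optional

def add_checked(a: int, b: int, bits: int) -> Optional[int]:
    """Hypothetical checked add: return the sum, or None if it needs more than `bits` bits."""
    total = a + b  # Python ints grow as needed, so this addition itself never overflows
    return total if total < 2**bits else None

# The same expression gives different answers depending on the width it runs at.
result_32 = add_checked(1, 2**32, 32)  # does not fit in 32 bits
result_64 = add_checked(1, 2**32, 64)  # fits comfortably in 64 bits
```

Here `result_32` is `None` while `result_64` is the actual sum, which is the kind of width-dependent behavior a target-sized integer type exposes and a fixed or unbounded type avoids.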

Brendan Hansknecht (May 06 2023 at 14:27):

Consoles shouldn't hit issues with this, but will probably hit issues with reference counting and memory related costs. They are basically modern desktops with slightly limited resources where you want to use every drop of performance. Roc should have no issue replacing parts of games where lua would be used, but a pure roc console game might be very allocation heavy depending on how it is written (so would a bad c++ program though)

As an aside on embedded: embedded systems are honestly really powerful. MicroPython can run on a lot of embedded devices. It has a cost, but it is usable. I believe that they mostly just lock down python types and restrict them to more basic C-like types (e.g. no growing ints, just i32)

Brendan Hansknecht (May 06 2023 at 14:30):

Write zig/c with good data oriented design principles and architecture generic simd/swar. Compile it for every target you care about.

Yeah, I don't think there is a real answer here. Nothing I know of truly targets both 16-bit machines and 64-bit machines. Many target 32 and 64, but even those systems have restrictions because of it. Like java having restricted array size.
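
For anyone unfamiliar with the SWAR term above (SIMD within a register), here is a minimal sketch of the idea, written in Python only for readability; real code would be zig/c as suggested. It uses the well-known "has a zero byte" bit trick on a 64-bit word, and the name `has_zero_byte` is just an illustrative choice.

```python
MASK64 = 2**64 - 1
LOW_BITS = 0x0101010101010101   # 0x01 in each of the eight bytes
HIGH_BITS = 0x8080808080808080  # 0x80 in each of the eight bytes

def has_zero_byte(word: int) -> bool:
    """Check all eight bytes of a 64-bit word for a zero byte in one pass."""
    # Subtracting 1 from every byte sets the high bit of bytes that were zero;
    # `& ~word` discards bytes whose high bit was already set in the input.
    # The final mask keeps the result within 64 bits, as a real register would.
    return ((word - LOW_BITS) & ~word & HIGH_BITS & MASK64) != 0
```

The point is that one plain integer operation processes eight bytes at once, with no architecture-specific SIMD instructions required.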

Brendan Hansknecht (May 06 2023 at 14:35):

I think Mike Acton in his DOD talk went over this some. If you want to make something with good performance, you need to know all of the details of the concrete set of systems that you are optimizing for. Of course, it would be better to target individual systems and design for each of those specifically, but a small or similar enough group of systems is also fine. Having a vague or large set of systems is simply not possible to optimize for.

In the case of libraries/core language design, generally scope is small enough that you can relatively well design each feature to run well on most systems. This does require picking specific code to run based on the systems though.

Kevin Gillette (May 24 2023 at 13:17):

Does every reasonably common 64-bit arch include compatibility support for a 32-bit ISA? I know that's the case for x86_64, but iirc, that's not the case for all others... Or do you mean compiling to use 32-bit registers and ops despite being on a 64-bit arch?

Aren't these properties we can just prove or constrain in the compiler for Roc code (assuming the platform code is doing the right thing)?

