One thing I wish Elm had was some way of not needing to pattern match all the outputs of a function if the inputs are all known at compile time.
Here's an example of what I mean:
Here I have an opaque Email type that can only be created via Email.fromString to ensure the Email is always valid.
myEmail =
case Email.fromString "myEmail@email.com" of
Just email -> email
Nothing -> --This should never happen
Because this function only takes values that are known at compile time (a string literal in this case) I shouldn't need to handle the Nothing pattern.
In Elm, because I'm forced to handle the Nothing case anyway, I end up handling Nothings, or Errs, or some other patterns in every place I use values like this. If instead the compiler could detect that only one pattern can be reached then I could remove the Nothing pattern and safely unwrap my Email from the Maybe type.
As a bonus, if I were to mess up my email (Email.fromString "myEmail@email,com" perhaps) then this becomes a compiler error since now it returns Nothing and my pattern match doesn't handle that. As a result I can't accidentally start my program with a value that's going to return Nothing everywhere it's used.
I mention this for Roc because it seems especially important if there are editor plugins. It seems likely that people will make editor plugins for creating emails, phone numbers, or some other data that needs parsing and then want to place that parsed data in their code. It seems silly that after the plugin has guaranteed that the data is valid, we still need to have it wrapped in a Maybe or Result.
That does sound useful, it also seems complicated to implement :)
One relatively simple approach I tried implementing for Elm was to have an "unreachable" function defined like this
unreachable : () -> a
unreachable () = unreachable ()
that would be used like this in the above example
myEmail =
case Email.fromString "myEmail@email.com" of
Just email -> email
Nothing -> unreachable ()
and then I'd have a tool that would find the unreachable call, walk up the AST recursively finding all the functions relevant to determining if it's reachable. So for example, the unreachable is inside a case...of, so Email.fromString is relevant. And that takes "myEmail@email.com" as a parameter. And since that's a literal, there's nothing more to check. Since all the leaf nodes are literals, we then compute "myEmail@email.com" (which is just itself), followed by Email.fromString "myEmail@email.com" followed by seeing if the output reaches the pattern unreachable is in. If there was a leaf node in the AST that isn't a literal and is needed to run the computation, then the tool can't guarantee that the unreachable is unreachable and reports an error.
The advantage with this approach is, it requires no special syntax, or changes to the type system. In fact, if the programmer never makes mistakes then you don't even need the tool! Of course if the programmer does make a mistake then the program hangs so it's nice to have the tool double checking that unreachable really is unreachable and even nicer if this tool is integrated into the rest of the compiler (but not necessary).
It seems like this same approach can work for Roc?
This sounds a lot like Zig's comptime feature almost. Not sure if Roc would support it but it is cool in Zig
There are ideas around something similar where top level values are evaluated at comptime
and yes very similar to zig comptime from what I remember
Wolfgang Schuster said:
This sounds a lot like Zig's
comptimefeature almost. Not sure if Roc would support it but it is cool in Zig
Cool, I didn't know about comptime before. If I understand correctly, it looks like it does what I want and a lot of other stuff too. I just want to be able to safely unwrap values known at compile time and I'm not interested in precomputing in order to improve runtime performance.
This topic was moved here from #beginners > Compile time computation by Richard Feldman
so for this to work, it would require full compile-time evaluation (e.g. consider Regex.fromStr "..." - it would need to actually parse the regex to know if it was valid or not), and it would also need to do this before exhaustiveness checking could run
so this would mean that, for example, whenever the editor is determining if there are any errors to display, it would need to run arbitrary interpreted code (which could potentially crash with a stack overflow etc) in a bunch of places in the program
not saying it's impossible, but there is definitely tension between wanting that feature and wanting a fast feedback loop on compile errors! :big_smile:
(macros share this downside, incidentally, and I've definitely experienced feedback loop pain around them in Rust, which otherwise is usually fast at type checking itself)
Maybe performance issues could be avoided through a combination of caching and restricting how much computation can be done before the compiler gives up? Another idea is that the compiler only checks if a unreachable function is unreachable when the user triggers it, when a production build is made, or if the user reaches one while they're running their program locally.
those could potentially improve things, but the amount of compiler complexity we're talking about goes up and up the more performance fixes we have to apply to avoid having a slow feedback loop :sweat_smile:
I also don't want to have classes of compile time errors that only happen on optimized builds!
True. I guess the other alternative is to do what I (tried) doing with Elm. Just making it a completely stand alone tool and that user is responsible for remembering to run or write a script so that it runs during builds.
yeah I think that's probably the best default way to go!
Richard Feldman said:
so this would mean that, for example, whenever the editor is determining if there are any errors to display, it would need to run arbitrary interpreted code (which could potentially crash with a stack overflow etc) in a bunch of places in the program
You don't need to run code at compile time to determine the type of something, so the editor can type check. However it won't check unreachable clauses that can't be determined by simple static analysis.
Roc then would provide an intrinsic function like comptime : a -> a to force the compiler to execute the code during compilation and abort when reaching unreachable.
the problem would be the compile-time execution of code as a prerequisite for displaying compile-time errors in the editor; wrapping them in a hypothetical comptime function would limit the scope of the compile-time execution, but no more than macros do - and I've already seen those cause significant compile-time slowdowns! :big_smile:
Maybe show type errors first and then other compile errors based on comptime? I think it's best to cache the comptime function result alongside other compiled code (handled by the compiler), rather than using another tool to cache the result.
I think it's important that all compile-time errors show up at the same time - I really wouldn't want to introduce a tier system where you think you have all the errors fixed in the editor, but actually there are other secret errors that you have to know how to access :sweat_smile:
and if you don't need to do something separate to access them, then the slowness downside applies!
that said, I do think there are two interesting points to note on the topics of exhaustiveness checking and compile-time evaluation, which don't have these issues
one is this: https://github.com/rtfeldman/roc/issues/576
basically letting you not specify a branch if there's a [] in its type (Roc's equivalent of Never in Elm) - this only requires type information, not compile-time evaluation!
separately, we can do compile-time evaluation of all top-level values and store the result directly in the binary
e.g. at the top level of a module:
foo = if blah then 1 + baz else 2 + baz
if that compiles, then it means blah and baz must also be top-level values, so we have all the info we need to evaluate this at compile time
so we can just do it!
separately, any value in the middle of a function which doesn't include any of the function's arguments can be "hoisted" to the top-level and also evaluated at compile time
as long as no compiler feedback (e.g. errors) depends on this, it can be purely a feature for code generation, which only applies when you're actually about to run the program
in Elm, values like that would be run on startup anyway, so evaluating them at compile time right before running the program doesn't necessarily have to increase the total time you spend waiting for the program to do its thing - it would just be that some of the evaluation time would move from the beginning of runtime to (near) the end of compile time
and potentially faster since we can cache those values at compile time to avoid recomputing them on subsequent runs if their ASTs and dependencies haven't changed
The semantic of comptime is "the following code is total"
Maybe comptime isn't the right keyword here
Richard Feldman said:
separately, we can do compile-time evaluation of all top-level values and store the result directly in the binary
If the compiler is fast enough, then sure.
Again, comptime should only ensure that the following code path has no panic/exit(1) or whatever side effect that isn't reflected in the type system
In Roc it should be something like comptime : Result a b -> a
trading compile time and run time isn't the purpose of comptime; it's optimization
ensure that the following code path has no panic/exit(1) or whatever side effect that isn't reflected in the type system
unfortunately, this is lots of things :sweat_smile:
for example:
so even if Roc did have a hypothetical feature for checking whether an expression was total, it would be incredibly frustrating to use because so few expressions would actually turn out to be total! :big_smile:
That's the point of compile time execution
oh you mean it would execute it at compile time in order to ensure that it doesn't crash?
yes
got it - so once we know that it didn't crash, what do we do with that information?
you just let the program compile
again, comptime isn't probably the right keyword
maybe ensuretotal is better
oh - but now we're back to "we have to do compile-time evaluation in order to display compile-time errors" which I don't think we should do :big_smile:
I think it's really important to have a fast feedback loop on compile-time errors
So what if someone wants compile-time regex
this gets into another topic, but the idea for that ties into testing
So you need a test for every compile-time thing you use
and it happens before you compile the real program
which is the same except more work for the developer
because you need to write the test and include the computed result to the source code in some way
Richard Feldman said:
I think it's really important to have a fast feedback loop on compile-time errors
Then show errors incrementally, like all existing compilers do
so the design for tests is an expect keyword (which is currently in the parser but not implemented beyond that), which could be used something like this:
myRegex =
result = Regex.fromStr "regex-goes-here"
expect Result.isOk result
Result.withDefault Regex.none regex
so when I run tests, this expect will get evaluated and run as a test
so if the regex is invalid:
so what's the different between this and compile time execution?
it's not delaying all compile-time errors, just tests
it's the same thing
it checks the regex is valid before the program runs
and now you have a potentially slower feedback loop
In LSP, you can show errors one at a time
All compilers do that too when invoked from the terminal
even if they don't have compile time execution
Like, when using gcc, it prints the first error/warning, then the second, then the third, and so on
I see your point
here's an interesting idea:
myRegex =
when Regex.fromStr "regex-goes-here" is
expected Ok regex -> regex
_ -> Regex.none
so this hypothetical syntax gets you the test in basically the same amount of code as the idea at the start of the discussion via syntax sugar alone
(here, expected would be a new keyword that's short for "I expect this pattern to be the one that gets matched")
another idea: an unexpected keyword which crashes if it ever gets run, but automatically generates a test which fails if it gets run - so to modify the original example:
myEmail =
when Email.fromString "myEmail@email.com" is
Ok email -> email
Err _ -> unexpected
the characteristics of this would be:
unexpected keyword inside any expression containing a function argument (which would mean it can't be evaluated without actually calling the surrounding function, which would require knowing what arguments to call it with, etc.)Email.fromString returns Err, I'll get a test failureso to get a production crash from this design, you'd have to skip running tests, and also not even try running the program once locally (where you'd immediately see the crash)
this would also let you shorten it to:
myEmail = Email.fromString "myEmail@email.com" |> Result.withDefault unexpected
I can't use the unexpected keyword inside any expression containing a function argument (which would mean it can't be evaluated without actually calling the surrounding function, which would require knowing what arguments to call it with, etc.)
I don't understand this part
Richard Feldman said:
this would also let you shorten it to:
myEmail = Email.fromString "myEmail@email.com" |> Result.withDefault unexpected
Seems about right, although I think it should be a separate phase from other tests, and show a warning even when the result is not used.
Also the compiler need to know to optimize it away when it did not fail. That means it has to store the result from generated test case somewhere for compilation.
How about Result.withDefault []? Is that valid in Roc?
Locria Cyber said:
I can't use the unexpected keyword inside any expression containing a function argument (which would mean it can't be evaluated without actually calling the surrounding function, which would require knowing what arguments to call it with, etc.)
I don't understand this part
ah, a simpler rule would be "it can't be used inside a function definition" - so, only at the top level. That's probably a better rule anyway because it's easier to understand, even though technically it could be a bit less restrictive. :big_smile:
Locria Cyber said:
How about
Result.withDefault []? Is that valid in Roc?
yeah [] is already an empty list literal, like in Elm, so that's already taken ([] at the type level means "empty tag union" in Roc, which is equivalent to Never in Elm)
Richard Feldman said:
Locria Cyber said:
I can't use the unexpected keyword inside any expression containing a function argument (which would mean it can't be evaluated without actually calling the surrounding function, which would require knowing what arguments to call it with, etc.)
I don't understand this part
ah, a simpler rule would be "it can't be used inside a function definition" - so, only at the top level. That's probably a better rule anyway because it's easier to understand, even though technically it could be a bit less restrictive. :big_smile:
I think it is a reasonable choice. And it's probably better to make a new keyword and force anyone to put it like variable = <keyword> (Regex.fromStr "..."). It's more readable
Richard Feldman said:
I definitely think this is a promising idea, but I think the way to do it is with something more like a linter rule. Any Roc expression that has only top-level dependencies (e.g. it doesn't depend in any way on user input) can be safely evaluated at compile time, and then we can have a linter rule that validates the output and tells us if it's valid.
as an application author, I benefit from being informed about the problem (ideally at build time rather than runtime); it doesn't necessarily matter to me whether I find out from a type-checker or a linter!
Would keeping these checks out of the compiler be a way of furthering the goal of partial program compilation/evaluation, or is it for some other reason (compilation speed)?
Rust certainly follows the C++ tradeoffs between compilation speed and runtime performance. I saw in some of the Roc talks that Roc is targeting quick compile times at the expense of arbitrary optimization (a tradeoff I certainly agree with), but I believe it's a bit more subtle. If a main function's first operation during initialization is an attempt to parse hardcoded markdown, exiting the program if the parse fails, then in a really practical sense, a compiler with the capability to parse that markdown might as well do so, since the development cycle isn't really shortened in that case by deferring the parse until runtime (and indeed, any production deploy will be faster as a bonus, because the result of the parse is now pre-computed static data).
There are various techniques for amortizing the compile-time cost of pre-computation (though I personally have no practical experience with any of them):
Aside: I've seen "linter" in some languages as being used to describe a tool with a false-positive rate, i.e. if you can prove it's wrong, it's not linting. I've seen other terminology ("static analysis," "verification," "vetting," "compiler") used for cases that can be proven without false positives. What is the terminology prominently used in Roc today for these concepts?
@Kevin Gillette this thread has an existing discussion on a very similar idea, so moved your comment to here!
Kevin Gillette said:
I've seen "linter" in some languages as being used to describe a tool with a false-positive rate, i.e. if you can prove it's wrong, it's not linting. I've seen other terminology ("static analysis," "verification," "vetting," "compiler") used for cases that can be proven without false positives. What is the terminology prominently used in Roc today for these concepts?
we don't really have a term yet. I've been casually using the term "linter" to mean something like elm-review - that is, a tool that's part of the roc CLI but separate from compilation, where Roc programmers can specify custom rules based on traversing the AST (with type information) of their compiled program in order to report things they want disallow
I think it sort of goes hand-in-hand with editor plugins; you should be able to do the same sorts of things in an editor plugin (write Roc code to traverse your project's AST, report about things), but some of those things you'll want to run on CI in order to transform them from a one-time thing into something that will fail the build if the rule is ever violated in the future
that's a whole topic unto itself...but there aren't really any design specifics yet, just something it's safe to assume we should do.
(basically every language ends up with either a first-party linter, or, in the absence of that, a third-party one comes up; I'd rather we design Roc for it so code can be shared between the editor and the linter - or whatever we end up calling it, if not "linter")
Richard Feldman said:
we don't really have a term yet. I've been casually using the term "linter" to mean something like
elm-review- that is, a tool that's part of therocCLI but separate from compilation, where Roc programmers can specify custom rules based on traversing the AST (with type information) of their compiled program in order to report things they want disallow
I like that vision.
I wrote up a proposal for this!
https://docs.google.com/document/d/1hp12UnSdS0cIIaUjLhp2NzmzUewwBxp2Mtwk3CNtnI4/edit?usp=sharing
in addition to reporting the crash (still at build time) ... every entrypoing which references that constant should compile to a function which immediately crashes.
Could the runtime crash be delayed until the point that the constant is referenced, like how other compilation errors won't prevent running the code as far as possible?
I am +1 to this idea in general, but I'm really worried about the complexity and arch-specific behavior of evaluating toplevels and running them on the local machine, and then reading them back to the compiled binary.
That said, there's the concern that an interpreter might get different answers than a dev build. Arguably this could be desirable (e.g. it might avoid floating-point precision issues by calculating things differently), but also it could be undesirable in that you could take code which gives a certain expected answer in dev builds and tests, run a build with --optimize, and now it silently gives a different (unexpected) answer.
I think the same thing can be said for the readback approach. A specific machine might give an answer for its arch that's not reflected on another. Floating point operations are a good example of this. We could attempt to say that all inconsistent floating point operations will be emulated in software, but this feels like something that will need to be chased down potentially indefinitely. A similar thing could happen for mis-compilations on one target platform compared, that are then propagated to all target platforms. We could suppose this never happens, but it likely will. In these scenarios, I think having an abstract machine is extremely useful, because it tells you what the semantics are precisely, without having to chase down how dependencies were compiled.
it's still possible that an interpreter could be valuable for optimized builds, in order to do partial evaluations which simplify certain expressions and reduce the amount of runtime work that would need to be done
If this is going to be an option in the future, I think it's important to try to reduce the implementation scope. I'll note the performance considerations later on, but I don't think that having both an interpreter and readback approach is preferable to having one or the other, suppose all other tradeoffs are roughly equal. And I think the readback approach is way more complicated than the interpreter, especially if both are to eventually be supported. The empirical evidence for this is the current REPL - although it's relatively stable now, it took a very long time to get there, and there are still bugs. That implementation is 1500 lines today (excluding the arch-specific memory readback code, and the "expect" style readback that this would have to do). The IR-checker is 700 LoC (excluding type definitions). In my experience, evaluators tend to be simpler than checkers (at least naive ones).
Finally, regarding performance - I agree that a compile-and-eval-on-machine + readback approach is more performant than any interpreter we could write. But I'm not sure that tradeoff is the most important one here.
During code gen, in the common case where this is a dev build and the target is the same as the current machine, then we can keep almost all of the (pre-linking) machine code we generated for the constants.
If this is the common case, don't you get the same experience from running the binary as-is, without constant eval, and evaluating all constants (memoized) at runtime?
Of course, one wrinkle there is that in a world where constant eval is cached, you don't get the caching behavior. But I think there are other ways to achieve that behavior, that might even be more preferable than constant eval from the compiler. One is to write a script that generates the data that you want to output, dump it to a file, and then load that file as bytes into your Roc program that you deserialize as a top-level constant, or at run time. This enables you to build your binary out-of-band of the constant, if it is in fact expensive enough to compute that you cannot compute it reasonably in-band of the application compilation. And, if you choose an arch-independent serializer, you do not need to worry about arch-specific differences causing a hinderance to your users.
Getting the value out of a comptime list and I to the binary is definitely more complex than a memcpy especially considering pointers and cross compilation. In practice, I bet we would want to turn these back into constants in rust in the compiler and then regenerate code to load them as if they were a constant original specified in the mono ir.
Personally, I would highly suggest starting with an IR interpreter that directly interpreters mono ir nodes and generates new mono ir nodes.
I think it will be many times less buggy and easier to implement/maintain. Will also be useful for generic constant expression folding (though we technically could leave this to llvm in many cases).
On top of that, an IR interpreter will likely only be slower if there is significant recursion (which we can't understand in a way to turn into a rust for loop). Anything without recursion should always run faster in an IR interpreter. One scan of the mono ir will generate the final result. To generate dev backend assembly requires at least two scans of mono ir, shuffling of bytes to build and object file, linking a shared library, dumping to disk, and loading it before execution.
In the common case the IR interpreter may actually be faster.
Of course, in the bad case, it will be many times slower (or at least I hope it will be, dev backend is super unoptimized currently, so :shrug:).
One little detail of the design is that it suggests using a bump allocator for compile time evaluation.
But that might not be suitable for all cases. It runs out of memory more easily than any other kind of allocator, so it's a worst-case in that sense. If we use a normal allocator that frees things as they go out of scope, we reduce the number of programs that run out of memory.
Would this make it possible that there is a program that does not compile on 32bit machines (because a huge string is getting constructed in calculating the constant) but can cross compile from 64bit to 32bit (because actual constant value is small)?
Yes it would. I don't think that's a concern though.
@Ayaz Hafiz I agree that an interpreter is simpler. Does it help with floating point though? whatever (software) implementation you pick, it could still be different from what the target machine would do at runtime right?
Brendan Hansknecht said:
On top of that, an IR interpreter will likely only be slower if there is significant recursion (which we can't understand in a way to turn into a rust for loop). Anything without recursion should always run faster in an IR interpreter. One scan of the mono ir will generate the final result. To generate dev backend assembly requires at least two scans of mono ir, shuffling of bytes to build and object file, linking a shared library, dumping to disk, and loading it before execution.
In the common case the IR interpreter may actually be faster.
I hadn't considered this! It's a very good point, especially considering we can only pick one or the other; either we make the (expected to be) common case faster and the uncommon case slower, or we optimize for the worst case at the expense of the common case.
Given that, a good argument for choosing the interpreter path is that there's a workaround to minimize the impact of the worst case (whereas the reverse is not true):
Ayaz Hafiz said:
But I think there are other ways to achieve that behavior, that might even be more preferable than constant eval from the compiler. One is to write a script that generates the data that you want to output, dump it to a file, and then load that file as bytes into your Roc program that you deserialize as a top-level constant, or at run time. This enables you to build your binary out-of-band of the constant, if it is in fact expensive enough to compute that you cannot compute it reasonably in-band of the application compilation.
based on this, I'm sold on interpreter being most likely a better choice for performance overall, which makes me think it's the way to go here after all :thumbs_up:
Given we allow for cross compilation, do we have any fix for floats?
do we have specific documentation of different modern CPUs actually doing different things with floats?
At a minimum, the rounding mode can be changed at runtime
I know it used to be a common issue, but I am not confident it still is based on conversations I've had - other than in the specific case of trig functions where the best practice is to implement them in software anyway
sure, but I don't think rounding mode is fixable
or to say it another way, "if your host sets a different rounding mode, you're going to get different answers"
But yeah, I have only personally seen issues related to ordering. I have never seen an issue specific to CPU, but I don't generally compare results across many machines
ordering? :thinking:
Associativity giving different results
a + b != b + a
Question, what does llvm use when folding constant floats. Like do we even have float guarantees currently?
probably not
I mean in general the guiding principle of floats is "reach for them over Dec only if you're okay with imprecision"
but even with that in mind, it's still desirable to have the property that no matter which machine you build on, and which machine you're targeting, you get the exact same program
Yeah, so reading this post, I don't think we have the guarantee today due to depending on llvm:
https://discourse.llvm.org/t/fp-constant-folding-of-floating-point-operations/73138
I mean we can defensively set rounding mode and what not in the compiler and the interpreter should generate the same program (though that still isn't guaranteed due to things like op fusion according the the link above)
Some cpus can more a floating point multiplication and add into a single instructure
So depending on the target the compiler is built for, might use merged float ops that lead to slightly different results.
but even with that in mind, it's still desirable to have the property that no matter which machine you build on, and which machine you're targeting, you get the exact same program
You mean no matter which machine you build on, if you target the same machine, you get the exact same program? Cause obviously different target means different assembly means different program.
Either way, I think depending on llvm in practice means that we can't have this behaviour.
If llvm sees sin(2.4583) it is not guaranteed to generate the same result as the version of sin that zig generates for our bitcode (even with the same rounding mode). They could be generated with totally different algorithms.
Brendan Hansknecht said:
but even with that in mind, it's still desirable to have the property that no matter which machine you build on, and which machine you're targeting, you get the exact same program
You mean no matter which machine you build on, if you target the same machine, you get the exact same program? Cause obviously different target means different assembly means different program.
yeah, sorry - what I mean is that ideally if I call the roc CLI with the same arguments and the same input source files, it should produce the same program no matter what computer I'm running it on
otherwise cross-compilation becomes some amount of unreliable
LLVM giving different answers is obviously not ideal because it means --optimize can produce a different program from debug builds
that said, there's a path to fixing that --optimize behavior long-term, which is doing not only our own inlining but also our own constant folding, at which point we can make sure to use the same algorithms we'd use at runtime
that may or may not be worth it, but it's an option that's available at least
Yep. And we can then control the rounding mode and such. Though not 100% sure this would fix everything. Cause the compiler even if it is calling our zig builtins, it would be calling float intrinsics optimized for whatever target the compile was built on. So it would still be running compiler target with llvm optimizations of the algorithm instead of the final target machine optimized version.
But overall should be much much closer and probably generally identical with only a few exceptions.
We should ask the zig folks if they deal with this somehow in comptime.
Brendan Hansknecht said:
Though not 100% sure this would fix everything. Cause the compiler even if it is calling our zig builtins, it would be calling float intrinsics optimized for whatever target the compile was built on.
yeah, although we can also control that too if we want to (also not sure it will be worth it, but it's an option that's available)
Yeah, I think accepting floats as potentially noisy with optimizations is probably the right call. And luckily, it can be re-evaluated in the future.
I don’t have the example off the top of my head but at least as of 2022 different hardware FP instructions is definitely an issue in practice. At my previous job I had a bug where a quant ran a model on two different boxes and got different result because of some trig function
My point regarding this though is mostly to say it’s not a reason we should prefer the read back approach. The read back approach is not any more portable or reliable than an IR based one, and in general is less so. That doesn’t mean the IR based approach is reliable.
An interpreter seems like it'd be generally valuable to have, and evaluation via compilation, and evaluation via interpretation aren't mutually exclusive. Some miscellaneous thoughts:
repl could benefit from a builtin interpreter, especially in contexts where real compilation is unavailable, file write access is blocked (even for temporary files), or subprocess exec is prohibited. For many common repl expressions, interpretation will probably feel faster.math.h like PI2 (to mean 2*PI) are wholly unnecessary, because constant arithmetic has way more than 64-bits at compile time, and thus precision loss on irrationals is not a concern.I’m pretty sure the IR interpreter would be slower than the readback approach, at least in terms of wall time (what you experience compiling +running your program, even if you only partially run it). I think it’s pretty unlikely we bridge the gap without a stack/register based VM or a JIT for anything that goes beyond a couple function calls, but that is complicated and super close to AOT anyway. But I think the cost concern just isn’t a big deal. I think most of the time constants are trivial to evaluate (suspicion: <5 calls in depth). Looking at how I use const/constexpr in Rust in C++, this would be my guess. If it turns out to be a problem in practice, I still would have rather tried the IR interpreter first (because it has other benefits and use cases) and fall back to implementing a specialized AOT approach if it’s needed
Doesn't linking take a fair bit of time, or is that just for non-dev builds? In terms of interpretation speed often being faster, I had in mind simple expressions (i.e. constant time math expressions, etc).
It is really fast with the surgical linker (currently limited to linux, but eventually we want to use it everywhere)
And for repl or constant eval long term we technically could skip linking all together and do a solution similar to what a jit does (just call bytes in memory).
Ayaz Hafiz said:
I don’t have the example off the top of my head but at least as of 2022 different hardware FP instructions is definitely an issue in practice. At my previous job I had a bug where a quant ran a model on two different boxes and got different result because of some trig function
yeah sorry, I should have drawn a distinction between trig and non-trig; my understanding is that both of the following are true:
Super naive question, is it possible to compile const functions to WebAssembly, like dtolnay/watt is doing with Rust macros ?
As in use webassembly as our comptime executor?
Sure. Though what would be the gain.
:thinking: if we go with the IR evaluation approach to compile-time evaluation, wouldn't that take care of the let-generalization challenges that @Ayaz Hafiz wrote up?
certainly top-level constants would get evaluated all the way into literals, at which point we could inline those literals into each specialization
At the top level it would....well kinda....still would be 2x the compile time if we have to run it twice due to let generalization
but wouldn't we be doing the same amount of work?
(for top-level constants; other bindings are a separate question)
hm I guess actually we wouldn't want to inline it, normally we'd just generate a reference to it
so then we're generating static data in the binary twice and that's it?
Like if let generalization fails today. It is a type mismatch and the user decides how to deal with it.
They can turn it into a function. If so, it is N times the runtime. Once for each different type.
If we just run everything at compile time and let it specialize multiple times, the user never gets a type mismatch. They just get N times the compile time for each different type.
I'm assuming the top level constants are being generated from compile time execution
yeah true
I'd argue that compile time compilation should be fully explicit. As a user, I wouldn't want a global variable that's a list with a million elements to be generated at compile time if I don't want it to, because that bloats the binary and slows down compilation pretty dramatically.
I would propose an operator similar to dbg, try, or expect. It could be called "eval" but that's just off the top of my head; I'm sure someone else could come up with a better name for it.
The basic idea is that it would run an expression at compile time when all the values involved (functions, arguments, numbers etc) are known at compile time. This would be purely for optimizations and speedups like in the regex case mentioned earlier in this channel.
Example:
list = [1, 2, 3] |> List.map \x -> x + 1
evaledList = eval list
# would be the same as writing
evaledList = eval [1, 2, 3] |> List.map \x -> x + 1
# which would be the same as writing
evaledList = [2, 3, 4]
firstNum = eval "1, 2, 3" |> Str.split "," |> List.first
# would be the same as writing
firstNum = Ok "1"
Now one question in the example above is whether the type of firstNum should be [Ok Str] or Result Str. I would lean towards having Result Str since I don't think eval should change the type of a variable since that would greatly complicate the speed guarantees of the type checker. But I think you could fix this by making a Result.unwrap function similar to Rust which just crashes if there's an error.
firstNum = eval "1, 2, 3" |> Str.split "," |> List.first |> Result.unwrap
# is the same as writing
firstNum = "1"
If a crash is ever called at compile time, that should be, well, a compile time error. I assume there'd be some special handling needed to do this.
Another advantage of having a keyword to force compile time evaluation (other than the issues caused by the halting problem being unsolvable) is that this wouldn't only be limited to top-level definitions. You could run the firstNum stuff (or the Email.fromString example that started this channel) at the top level or deep inside a function and still have the return type Email with the guarantee that at runtime it'd be the same as if you just explicitly made an Email instance.
The ideas on caching, floating point precision, debugging, and loop iteration limits would still apply to this proposal.
Some more examples of things that would and wouldn't be allowed:
# not allowed - x isn't known here
addOne = \x -> eval x + 1
# allowed
addOne = \x -> x + 1
eightPlusOne = eval addOne 8
# allowed
addMagicNumber = \x ->
magicKey = 7
x + eval magicNumberGen magicKey
# even this would be fine
addMagicNumber = \x ->
Stdout.line! "Running the magic number function!"
magicKey = List.sum [2, 3, 2]
x + eval magicNumberGen (eval magicKey)
# would be turned into x plus a constant at compile time
# should you need a second eval here? i'm not sure
# but not this, because it depends on runtime input
addMagicNumber = \x ->
input = Stdin.line! "Running the magic number function!"
magicKeyInput = Str.toU32 input |> Task.fromResult!
x + eval magicNumberGen magicKey
I'm not sure whether evals should be allowed to return functions or structs that contain them. And I feel like they definitely shouldn't be able to return Tasks. (Someone smarter than me should probably think more about this.) From what I can tell, there's already a lot of inlining and reduction done on functions, and it's not clear how their speed and size guarantees work either, so I don't think it would be a benefit to the programmer. But all tags, strings, lists, dicts, numbers, etc and most user-defined structs should be returnable.
There's still a lot to think about here, but I think something like this could be useful in a few cases and is a lot better than evaluating every top-level declaration. And I think it's important that we try to keep this as digestible as possible for beginners.
Also, I'm not sure how to add syntax coloring to code snippets on this site - if someone lets me know, I'll go back and update my message to have that
Also, I'm not sure how to add syntax coloring to code snippets on this site - if someone lets me know, I'll go back and update my message to have that
We usually use elm, haskell or coffeeescript after the triple backticks, whatever looks best
Derin Eryilmaz said:
I'd argue that compile time compilation should be fully explicit. As a user, I wouldn't want a global variable that's a list with a million elements to be generated at compile time if I don't want it to, because that bloats the binary and slows down compilation pretty dramatically.
it's easy to opt out of though; just wrap it in a function, done! :big_smile:
Derin Eryilmaz said:
If a crash is ever called at compile time, that should be, well, a compile time error. I assume there'd be some special handling needed to do this.
Another advantage of having a keyword to force compile time evaluation (other than the issues caused by the halting problem being unsolvable) is that this wouldn't only be limited to top-level definitions. You could run the firstNum stuff (or the
Email.fromStringexample that started this channel) at the top level or deep inside a function and still have the return type Email with the guarantee that at runtime it'd be the same as if you just explicitly made an Email instance.
I think both of these apply to just doing it for you automatically - that is, crash becomes a compile-time error, and any expressions which are "effectively top-level" (in that they don't depend on any function parameters, even indirectly) can and should be compile-time evaluated too
I appreciate the idea, but it seems like the main consequence of making the optimization opt-in is that it would only get applied a small fraction of the time, instead of all the time
it seems both simpler and more effective to me to make it opt-out in the (very rare, I would imagine) situations where it actually causes a noticeable problem in practice :big_smile:
oh yeah, another important distinction: the current behavior is nonobvious, I think, and it's hard to fix that in any way other than evaluating top-level constants by default
that is, if I write a top-level foo = 5 then foo is a constant that compiles to 5
but if I write foo = bar + 1 at the top level, then that gets compiled to a function which gets automatically called (and therefore reevaluated) at runtime, every time you reference foo
usually that's not noticeable, but it's definitely surprising that if you put a dbg in there (for example), it doesn't run "on startup" like it would in many languages where you can declare things at the top-level (e.g. JavaScript, Python, ...) but rather it runs lazily, and multiple times, wherever you reference it
Richard Feldman said:
that is, if I write a top-level
foo = 5thenfoois a constant that compiles to 5
Fair enough, but is a foo = 5 inside a function not the same thing? Should there not be some kind of mechanism to do calculations with that at compile time in functions too instead of having to push everything to the top level which generally scrambles the code around a little?
making compile-time evaluation opt-in via eval necessarily preserves this surprising behavior because, unlike those languages, Roc doesn't have a "startup time" (by design!)
Derin Eryilmaz said:
Richard Feldman said:
that is, if I write a top-level
foo = 5thenfoois a constant that compiles to 5Fair enough, but is a
foo = 5inside a function not the same thing? Should there not be some kind of mechanism to do calculations with that at compile time in functions too instead of having to push everything to the top level which generally scrambles the code around a little?
that's a fair point, although the surprise happens right away and consistently
e.g. if we silently "hoist" foo = bar + 1 from inside a function to the top-level, because it doesn't depend on any function parameters, then if I put a dbg in there, I will always see it at compile time
whereas today, I wouldn't see anything when the program starts up, and only might potentially see anything once a code path gets taken that actually references the thing I put the dbg in
so both are surprising, but one of them is only surprising in that I got my answer sooner than I expected, whereas the status quo is surprising in that I get some combination of not seeing my answer at all, or seeing it later than expected, or seeing it multiple times when I expected to see it exactly once
Richard Feldman said:
e.g. if we silently "hoist"
foo = bar + 1from inside a function to the top-level, because it doesn't depend on any function parameters, then if I put adbgin there, I will always see it at compile time
Well, if you did foo = eval dbg bar + 1 then you would also see that debug only once.
also, if I'm doing a build-and-run command (as opposed to a check for example), then it actually will feel more like it's being executed "at startup"
because what I'm seeing is:
which is the same order I'd see if I were building and then running a TypeScript or Python app
it's just that technically the compile-time eval output is happening at the end of the "still compiling" side of the line, as opposed to on the other side of that line in "finished compiling and now running"
Derin Eryilmaz said:
Richard Feldman said:
e.g. if we silently "hoist"
foo = bar + 1from inside a function to the top-level, because it doesn't depend on any function parameters, then if I put adbgin there, I will always see it at compile timeWell, if you did
foo = eval dbg bar + 1then you would also see that debug only once.
oh sure, I'm just comparing status quo to the "always-on" design for compile-time evaluation. My point is that they are both potentially surprising, but I think the compile-time evaluation approach is much less surprising
I agree with that for sure
my main question is about how you'd replicate some of these evals within functions in the current proposal, or whether that's even wanted as a language feature
oh I think we can just do that automatically
as we're compiling, we already build dependency graphs between definitions to detect and report cyclic definitions
if we expand that to detect when arbitrary subexpressions depend on definitions, then at the end of canonicalization we can filter that down to "which subexpressions do not depend on any function arguments, including indirectly?" and evaluate all of those at compile time
so it's like we eval everything that could possibly be eligible for eval
automatically
Doesn't that make compilation a lot slower though?
it's a fair question - I'm not sure, but if it does turn out to do that, we can consider only doing it in --optimize builds
it's possible we could do the "is this effectively top-level" inference during type-checking because there's some overlap between work we already do for type inference and how that analysis would work
sure. my only concern is the halting problem--how can the compiler know whether some list is gonna have ten million items (probably shouldn't be comptime evaluated) or two items
it definitely can't!
but I don't think it's a given that this is a problem in practice
hm fair enough
like for example, if I write something today that doesn't terminate, I generally notice as soon as I run my program
when it hangs
it's a tricky problem for sure
if instead compilation hangs, especially if we have a separate thread that watches and says "hey btw this is taking awhile...we're currently evaluating compile-time constants, so FYI there may be an infinite loop somewhere" it seems like debugging it would be (if anything) easier than debugging it at runtime
but again, that could be very slow
i think automatically doing compile time evaluation on top level stuff is a pretty good idea
well the "set a timeout for 5 seconds and print something to stderr reporting that compile-time eval is taking a long time" approach doesn't add anything to compile times really
This is why zig uses backwards branches as a metric. Enables limiting compile time and avoiding hangs
yeah there's a possible design we can get into that guarantees detecting infinite loops at compile times, but it's complicated so I don't want to start out with it
basically since we know we're only evaluating pure functions, at the point of each backward branch (or recursion) we can record snapshots all the inputs that can change, and then if we see the same inputs that we've recorded in the past, it's definitely an infinite loop and we report that and bail out
since that's expensive, there are two performance optimizations we can do to make it less bad in common cases:
so that system would mean that long-running loops (over 1k iterations, or whatever heuristic cutoff we use) would resolve to either:
but like I said, I'm not sure if it will end up being such a problem in practice that we need to do any of that :big_smile:
and I definitely don't want to impose some caps on like number of iterations that are allowed
either the code succeeds or it fails, but I don't want to end up in a position where you're like "hey heads up, to build this roc project you gotta pass a special CLI flag to bump the limit because this has a big compile-time loop"
I really want to avoid any kind of "bump a cli flag or env var to make it compile" - it should either compile and succeed or not
I am so excited by this thread! I had never seen it before today. It seems like most of the discussion has revolved around the "How should Roc do comptime?" question. @Richard Feldman do you have thoughts on things you would want available for use (or not) in the comptime environment? Specifically I am thinking of a couple categories:
Ryan Barth said:
In-lining data into the roc program, think Rust's include_str!
we already have this! :smiley:
https://www.roc-lang.org/tutorial#importing-files
we already have our own generics system, and I don't think we'd want to change that
I think inlining, unrolling, and simd are all unrelated to this...inlining and unrolling would be (if anything) some sort of keyword attribute
and simd is a whole can of worms - see #ideas > simd
regarding type system info, that's also something that if we wanted to, we could expose it, but right now we have that basically scoped to Encode, Decode and Inspect
but we could potentially expand it in other ways if it makes sense
so that system would mean that long-running loops (over 1k iterations, or whatever heuristic cutoff we use) would resolve to either:
- infinite, in which case we deterministically detect it and report it
- not infinite, but they run slower than they would be if we didn't do the checking bc of all the extra state logic
This approach wouldn't catch infinite loops that never repeat the same arguments:
foo = \list ->
foo (List.append list 1)
But I'm sure it would still catch many mistakes
true, although that one is statically detectable
"recursion that isn't behind a conditional"
if you have both that check as well as the check described above, I'm not sure if there are any remaining scenarios where an infinite loop could fail to be caught :thinking:
But that would mean we'd be solving the halting problem, right?
To be more concrete, I could write a pure function that checks if each even number is the sum of two prime numbers. If it encounters a number that is not the sum of two prime numbers, then it terminates. If the compiler could detect that this function would loop forever, then it would solve Goldbach's Conjecture which is unsolved despite a huge amount of effort.
Isaac Van Doren said:
But that would mean we'd be solving the halting problem, right?
no, I mean that we do the compile-time check for recursion that isn't behind a conditional - we already do this, actually:
── DEFINITION ONLY USED IN RECURSION in examples/helloWorld.roc ────────────────
This definition is only used in recursion with itself:
8│> foo = \arg ->
9│> foo (List.append arg "")
If you don't intend to use or export this definition, it should be
removed!
────────────────────────────────────────────────────────────────────────────────
and then on top of that we do the runtime state tracking thing when evaluating constants
the halting problem states that it's impossible to prove that a program terminates without running it, but in this case we'd be running it :big_smile:
the question I'm not sure about is whether if you have both the state-tracking thing and also the static "no looping without conditionals" check, there are any remaining scenarios where an infinite loop could actually happen at runtime
but what if the programmer really wants to run a long loop that will halt for a reason the compiler can't understand? I don't know how that would be possible without at least being able to turn off the infinite loop check
the halting problem states that it's impossible to prove that a program terminates without running it, but in this case we'd be running it :big_smile:
It's still not possible to prove that a program will terminate while running it. How would you determine that the Goldbach's conjecture program I mentioned above would terminate or not?
It just gives a warning that says, hey this has been running a long time.. but continue.
There's no requirement in the halting problem that you can't run the program. A function attempting to determine if a program halts or not could contain an interpreter but that doesn't help determine if the program will halt
Derin Eryilmaz said:
but what if the programmer really wants to run a long loop that will halt for a reason the compiler can't understand? I don't know how that would be possible without at least being able to turn off the infinite loop check
my contention is that this case needs to be supported - in other words, if we can be sure this is an infinite loop (e.g. pure function that encountered a previous state during recursion/looping; that will deterministically end up back at this state again later!), then we stop and error, but if we haven't proven for sure that we're in an infinite loop, we keep going
Isaac Van Doren said:
To be more concrete, I could write a pure function that checks if each even number is the sum of two prime numbers. If it encounters a number that is not the sum of two prime numbers, then it terminates. If the compiler could detect that this function would loop forever, then it would solve Goldbach's Conjecture which is unsolved despite a huge amount of effort.
oh, to clarify - when I say infinite loop, I mean that it's never making progress
in that case, it would be continuing to do work, and continue to make progress, it would just presumably run out of system resources before it terminated
but that's already something that in the general case we have to allow: if your system has insufficient resources for doing this, then obviously we can't compile it :big_smile:
so maybe I should be more precise and say that we try to detect that the program is looping without making progress
and give an error if you end up in that state, and otherwise, as long as you're making progress, continue the computation
so we don't know for sure that it will terminate (it might run out of system resources first)
but I think the past-state-checking system tells us if we end up in the "we are no longer making progress" situation, at which point we can definitely error out because there is no chance it was going to terminate
Ah okay, yeah erroring on loops that aren’t making progress definitely makes sense :smiley:
What if we are making really slow progress e.g. every 10, 100, 1000… iteration? We are making progress, but exponentially slower
then progress will be really slow :big_smile:
same as if you did it at runtime
Ahh because at least one counter would be counting up and that would be the state change
Last updated: Jun 16 2026 at 16:19 UTC