Having worked with functional error handling a bunch in Elm and Scala, I feel confident in saying that it is quite good at discerning what went wrong and adding context at different layers. But functional error handling is fairly poor at discerning where the error happened, and certainly much inferior to the stack trace you get when throwing an exception (regardless of any other deficiencies exceptions have).
I'm not suggesting that stack traces are the end-all-be-all of error locations, but I do think they would be a big step up from what we have today. I'll write out a bit about what's wrong with the current setup further down, but I'll write my proposal first:
We have talked a bunch about how to manage errors with try, ?, ??, and map_err, all centered around Result where you can put in anything as an error. But what if the error variant, instead of being Err e, was Err (Error e), where Error is a custom type provided by the standard library that can contain extra information, like a stack trace?
The reason we need to store the stack trace is because we're going to build it as we pass errors around with the various operators. I'll admit that I'm only confident in what semantics map_err has and the others I just find confusing at the moment, so I'm going to suggest my own operators here to avoid getting tangled up in existing semantics. But in the real world, they should just be merged into the existing operators.
Operator 1, aka new_error (just pretend it's an operator and not a keyword):
The new_error operator takes any value e and creates an Error e type from it. When the compiler sees the new_error operator, it creates the string of the location of current file and line, e.g. "MyRocProject/MyCode.roc:234" and it inserts that as the first element of the stack trace in the Error, which contains the stack trace as just a List String.
After that comes operator 2, aka propagate_error:
The propagate_error operator works like new_error, except it doesn't create a new error but just propagates an existing one. But when the compiler sees this operator, it will still create the next level of the stack trace and append it to the Error.
Then there's operator 3, map_error_operator (named to not be confused with the existing function):
Again it's only special in that it builds the stack trace behind the scenes when used, but otherwise it just maps the contents of the given error as usual.
In other words, Error can only be constructed and mapped by using operators that build out the stack trace, and the existing map_err function would be removed. Users can still return an existing Error normally which would not build out the stack trace, but I think that could produce a warning when building.
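As a rough illustration of the three operators, here is a minimal sketch in Rust rather than Roc (Roc has no such feature today). The `Error` struct and the `new`, `propagate`, and `map` methods are hypothetical stand-ins for `new_error`, `propagate_error`, and `map_error_operator`; Rust's `#[track_caller]` attribute plays the part of the compiler inserting the call-site location.

```rust
use std::panic::Location;

/// Hypothetical stand-in for the proposed standard-library `Error e` type.
#[derive(Debug)]
struct Error<E> {
    payload: E,
    /// Call-site locations, oldest first.
    trace: Vec<&'static Location<'static>>,
}

impl<E> Error<E> {
    /// Stand-in for `new_error`: wrap a payload and record where it was created.
    #[track_caller]
    fn new(payload: E) -> Self {
        Error { payload, trace: vec![Location::caller()] }
    }

    /// Stand-in for `propagate_error`: same payload, one more trace entry.
    #[track_caller]
    fn propagate(mut self) -> Self {
        self.trace.push(Location::caller());
        self
    }

    /// Stand-in for `map_error_operator`: map the payload and extend the trace.
    #[track_caller]
    fn map<F>(mut self, f: impl FnOnce(E) -> F) -> Error<F> {
        self.trace.push(Location::caller());
        Error { payload: f(self.payload), trace: self.trace }
    }
}

#[derive(Debug, PartialEq)]
struct FileReadError;

#[derive(Debug, PartialEq)]
struct CouldNotReadConfigError(FileReadError);

fn read_file() -> Result<String, Error<FileReadError>> {
    Err(Error::new(FileReadError))
}

fn load_config() -> Result<String, Error<CouldNotReadConfigError>> {
    // Adds context and a trace entry in one step.
    read_file().map_err(|e| e.map(CouldNotReadConfigError))
}

fn setup() -> Result<String, Error<CouldNotReadConfigError>> {
    // Nothing useful to add here, so only the trace grows.
    load_config().map_err(|e| e.propagate())
}

fn main() {
    let err = setup().unwrap_err();
    println!("{:?}", err.payload);
    for loc in &err.trace {
        println!("{}:{}", loc.file(), loc.line());
    }
}
```

In this sketch the trace is a heap-allocated `Vec`, which is the simplest thing to write down and not necessarily how a language-level version would store it.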
Imagine an error like this, which is nested according to current best practices:
ConfigError
CouldNotReadConfigError
FileReadError
FileReadError is the original error, which has been wrapped up a few times.
This structure is a "jack of all trades, master of none" error structure. That is:
FileReadError is just the original "symptom" of what went wrong, as many errors are only symptoms of the real problem somewhere else anyway.
CouldNotReadConfigError actually tells you what went wrong at the context level that, in this example, has the most information about the error.
ConfigError exists mainly as a form of stack trace, to tell you where the error occurred, but it doesn't add any useful information about what went wrong.
And in the real world you'd have potentially many more levels mixing both context and tracing together. I think it would be preferable for this error to be something like
{ payload= CouldNotReadConfigError (FileReadError)
, trace = [
"MyRocProject/Config.roc:234"
, "MyRocProject/Setup.roc:123"
, "MyRocProject/Main.roc:23"
]
}
where the contents of payload is what Error is generic over (so Error payload essentially).
The benefits here are twofold:
There's a few reasons the stack trace is superior to the old trace:
ConfigError may be applied in multiple places in the code, and it might not be clear from the trace which one was hit.
Config.get!().map_err(ConfigError). This code is harder to read without any real benefit. Roc affords us easy propagation of errors, but we're not really getting the most out of it when the best practice is to map the errors everywhere, even when it doesn't add value to do so.
All the issues above can also be fixed with developers writing disciplined code, but in that case, it still feels like it needs a lot of effort compared to just getting a stack trace for free, and only mapping your errors when you actually have something useful to add to them.
I'm not sure about the specifics with the proposed operators but I would like to have errors with traces as well :)
One of the reasons this isn't done is because the performance is generally awful.
The issue is that errors in results are generally not exceptional. They are actually pretty common in many cases. Adding locations is both bloat to the binary and extra allocation with data movement.
That said, I completely understand the goal here. Location traces can be great. Debugging can be a pain without them.
This is the one advantage of exceptions in my opinion. They essentially have free error traces. That said, exceptions are also slower in the error cases where errors are often handled by default. They really are meant for the exceptional case where errors are very uncommon.
I generally find that either the error is expected to be handled, at which point any sort of error trace or even wrapping is pretty wasteful. (Might still be worthwhile to wrap a little, but not great to do it a ton).
Or the error is truly exceptional and is not expected to be handled. At which point crashing can give you a backtrace and has low cost.
Yet, as shown by error context wrapping in go and rust, clearly some form of nested error that kinda half holds a stack trace will exist in pretty much any language that defaults to using errors instead of exceptions. (My gut feeling is that this is a bad design pattern)
One main advantage of simply wrapping is that it is way cheaper than strings and exceptions. You get a string in the form of the tag name, but it is nearly free to add.
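To make the "wrapping is nearly free" point concrete, here is a small sketch in Rust (not Roc) of the nested-tag style. The enum and function names are made up for illustration. Each layer adds only a tag, and the nested tag names are what you get when the error is printed.

```rust
#[derive(Debug, PartialEq)]
enum FileError {
    NotFound,
}

#[derive(Debug, PartialEq)]
enum ConfigError {
    CouldNotRead(FileError),
}

#[derive(Debug, PartialEq)]
enum AppError {
    Config(ConfigError),
}

fn read_file(path: &str) -> Result<String, FileError> {
    if path.is_empty() {
        Err(FileError::NotFound)
    } else {
        Ok(String::from("contents"))
    }
}

fn load_config(path: &str) -> Result<String, ConfigError> {
    // Wrapping is just applying another tag; no strings are built here.
    read_file(path).map_err(ConfigError::CouldNotRead)
}

fn start(path: &str) -> Result<String, AppError> {
    load_config(path).map_err(AppError::Config)
}

fn main() {
    // prints "Err(Config(CouldNotRead(NotFound)))"
    println!("{:?}", start(""));
}
```

The printed chain tells you which wrappers were applied, but not which of several call sites applied them, which is the gap the trace proposal is aiming at.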
Could we go with something like RUST_BACKTRACE=1?
General question. What is your plan for the trace? Is it ever actionable in code?
What is your plan for the trace?
I would want it just to understand the path the code has taken so I can understand and solve the bug more quickly
Yeah, so not actionable in code.
Also, looking at the state of anyhow in rust and go error wrapping. They basically are akin to nested tags, but with slightly more free form error strings. Also, anyhow will apparently store a backtrace in the root error if RUST_BACKTRACE=1.
That definitely is an interesting idea. Not sure what it would take to orchestrate, but just grabbing a backtrace (not converting it to string yet) and holding onto that for printing when the error is printed.
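For reference, a minimal sketch of that "grab it now, print it later" idea using Rust's standard library. `TracedError` and `parse_port` are hypothetical names; `Backtrace::capture()` only records frames when `RUST_BACKTRACE` or `RUST_LIB_BACKTRACE` is set, and the backtrace is turned into text only when the error is displayed.

```rust
use std::backtrace::{Backtrace, BacktraceStatus};
use std::fmt;

/// Hypothetical root error that holds on to a backtrace without formatting it.
#[derive(Debug)]
struct TracedError {
    message: String,
    backtrace: Backtrace,
}

impl TracedError {
    fn new(message: &str) -> Self {
        TracedError {
            message: message.to_string(),
            // Does nothing expensive unless backtraces are enabled via the environment.
            backtrace: Backtrace::capture(),
        }
    }
}

impl fmt::Display for TracedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)?;
        // Only print the trace if one was actually captured.
        if matches!(self.backtrace.status(), BacktraceStatus::Captured) {
            write!(f, "\n{}", self.backtrace)?;
        }
        Ok(())
    }
}

fn parse_port(s: &str) -> Result<u16, TracedError> {
    s.parse::<u16>().map_err(|_| TracedError::new("invalid port"))
}

fn main() {
    if let Err(e) = parse_port("abc") {
        eprintln!("{e}");
    }
}
```

Running this with `RUST_BACKTRACE=1` should append the captured frames to the message; without it, only the message is printed.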
The only way to get an equivalent trace in roc today would be to crash at the error generation site. Which is actually really easy to do with ??.
Which, if you don't plan to handle the error in your app at all sounds totally reasonable
now that we have purity inference, platforms can offer a backtrace! function
which applications could use to log backtraces immediately when desired, separately from error handling
in other words, do something like log_error!(backtrace!(), "Something really unexpected happened") and then return Err
so decoupling the logging of the backtrace from the handling of the error
If backtrace is a => Str function, you _could_ even store the backtrace in your Err if you wanted
Switching to logging does work, but it also loses some value due to requiring an effectful function chain.
That's a strong point
Can't we just have a dbg-like command add a stack-trace for development?
Or something of the sort?
yeah there's an ongoing open question as to whether there will be demand in practice for logging in the middle of pure functions, or if it's fine to have logging (which is obviously an effect) only allowed in effectful functions
that makes the most sense by default, of course
Yeah, I guess backtraces are still kinda tangential to this. If we exposed logging as a special non-effectful builtin like dbg, a platform could choose to log a backtrace on every error log.
the argument for "allow logging in pure functions as a special case exception" is that it's an effect that isn't supposed to affect the rest of the program, and is also theoretically only supposed to be recording what's happening, so if the compiler decides to optimize the pure function away (e.g. evaluate it at compile time) then the fact that the logging gets skipped should also be harmless
I think it's fine with something like dbg that's stripped from release builds
I guess the biggest disadvantage of solutions like logging is that they are more verbose...though it could just be my_fn(a, b, c) ? log_err!
certainly I expect webservers to do lots of logging (and/or spans/traces/etc)
Yep
I think a keyword like trace would be nice paired with ??
not sure how much that will vary by use case, and how much other use cases want logging
I think it's fine if the handler type is => for a webserver
As long as your core logic is pure
That keeps effects on the edges
If you don't allow for logging in request handlers, I think your webserver use cases are dead in the water
well request handlers are usually full of I/O, so certainly those are effectful :big_smile:
Exactly
I think it is very important to note that the goal of this idea is to get a backtrace from any function (including a pure function), not just an effectful one. The pure function returns a result and as part of the error, a trace is included. This enables an effectful logger to capture the full contexts that starts at the error root in a pure function.
If it only works for effectful functions it is a lot less useful for debugging and understanding the full stack trace when first developing code.
yeah so it seems to me like:
then there's the separate issue of "I'm just trying to debug the program I'm running right now, I don't care about persistent logging"
for that use case, one obvious question is "if we had a really nice debugger, how much demand would remain for backtraces inside pure functions?"
I'm not sure what the answer would be there
the nice thing about doing effectful logging of backtraces via the platform is that it's already doable today, so there's no blocker to trying it out and seeing what use cases remain in practice when you already have that
Richard Feldman said:
for that use case, one obvious question is "if we had a really nice debugger, how much demand would remain for backtraces inside pure functions?"
I kinda have an answer to this. There are no good debuggers that exist on all platforms and are easy to use. As such, there is almost always a demand to easily add a backtrace at least for people who aren't used to debuggers. As someone used to debuggers (albeit mostly stuck in cli with lldb), I still often would rather just get a good backtrace and never have to open a debugger
Yeah, I definitely think we should add a logging effect to basic CLI and basic webserver that can also log a backtrace. That might alleviate a lot of the pain.
The goal is really to create more useful errors. Both in stopping people from trying to come up with make-shift alternatives to stack traces inside their errors, which dilute the error itself, but also in actually giving people the information they need proper.
Generally I find the types of errors reported with Result break down into three kinds:
Before you figure which kind of error you have, I think every error goes through roughly these stages:
Result error starts as a potentially recoverable error
In the ideal world, I think I would want the following:
CouldNotReadConfigError, as well as any context it's received, along with the full stack trace of where the error originated.
crash, but I think it's important that crash won't give the stack trace that I want. It only has the stack trace of where the investigation concluded, not where the error originated.
Is a stack trace actually useful to an end user though?
I'm alright with logging context along the way, rather than at a single point, but it requires more discipline from developers and more tooling to get right, since you need to correlate more logs, rather than having a single log entry that contains everything you need, and you need to remember to log everywhere that might be relevant.
No, I wouldn't want a stack trace when reporting something end users messed up (e.g. permissions, configuration, etc.). But if the error is for the end user to report to the developer, you definitely want the stack trace available for the user to report. Like "please open a GitHub issue at #link and include file applogs.log"
Yeah, this is one of those surprising annoying problems of results and error returns. You really don't want to pay extra cost on every error, but stack traces are amazing when needed.
And you can't get a full stack trace if you first return through a chain of pure functions.
Kasper Møller Andersen said:
I'm alright with logging context along the way, rather than at a single point, but it requires more discipline from developers and more tooling to get right, since you need to correlate more logs, rather than having a single log entry that contains everything you need, and you need to remember to log everywhere that might be relevant.
Instrumenting code with tracing instead of logging might be a good alternative for this. If an operation fails you'll end up with a single trace of the request/operation in which the error happened.
Because trace frames, like logs, are added manually, they don't capture as many frames as a stack trace, so that's a downside.
But an upside is that a trace of an error can contain frames about code branches that were completed before the error happened, which can provide a ton of useful information when debugging. Plus, you can use that trace for other types of debugging as well, such as looking into performance problems.
Yet tracing without some correlated logs (or events) isn't really helpful either, is it?
I like what e.g. they are doing here:
https://effect.website/blog/releases/effect/311/#effectfn
https://effect.website/blog/releases/effect/312/#effectfn-improvements
https://effect.website/docs/observability/tracing/
I wanna bang this drum again, because I think stack traces are still really important. Roc's error handling is pretty similar to Rust in a lot of ways, and it's very easy to find people asking about how they get stack traces in Rust, like so and so. Not having stack traces by default is essentially a big deficiency and eats a good chunk of weirdness budget.
I also still think it nullifies a good deal of the benefit of Roc's tag unions, because it essentially forces people to wrap their error types a whole bunch to try and recreate stack traces. So even though you can just propagate errors in Roc, you might not really want to, because you lose location information that way. Roc wants to make it easy to map errors, but I think that's partially about fixing this symptom, rather than fixing the root problem, because we would be mapping errors way less if we didn't have to manually build up types to mimic stack traces.
So here's a revised proposal for how this might look!
Result looks like this:
Result ok err :
[
Ok ok,
Err StackTrace err
]
where StackTrace is a nominal type that stores stack trace lines (we'll get back to how it does so later). Those lines would all be the type StackTraceLine a la:
StackTraceLine :
[
MyRocProject__Config U16,
MyRocProject__Setup U16,
MyRocProject__Main U16,
...
]
where there exists a function line_to_string that does this:
when line is
MyRocProject__Config line -> Str.concat("MyRocProject/Config.roc:", Num.to_str line)
...
StackTraceLine and line_to_string would be generated by the compiler and not something the developer would deal with. It does have a few implications:
Err themselves, because they would need to insert the right stack trace line when doing so.
Err should have a function trace_to_string that converts the full stack trace to a human readable string.
In other words, we would need a way to construct an Err with a keyword probably. Like fail or (please don't shoot me) throw :big_smile:
Taking an example from the tutorial might then look like:
|str|
if Str.is_empty(str) then
Ok "it was empty"
else
fail ["it was not empty"]
And any time you use ? and whatever else we have to handle errors these days, the compiler basically desugars that to the same code as today, except it also inserts the corresponding StackTraceLine into the StackTrace.
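Here is how the generated pieces might fit together, sketched in Rust rather than Roc. Everything in it is an assumption for illustration: `Traced` stands in for `Err StackTrace err`, the `fail` and `propagate` functions stand in for what the `fail` keyword and `?` would desugar to, and the line numbers are passed by hand where the compiler would insert them.

```rust
/// Stand-in for the compiler-generated tag union: one tag per module, plus a line number.
#[derive(Debug, Clone, Copy, PartialEq)]
enum StackTraceLine {
    Config(u16),
    Setup(u16),
    Main(u16),
}

/// Stand-in for the generated `line_to_string`.
fn line_to_string(line: StackTraceLine) -> String {
    match line {
        StackTraceLine::Config(n) => format!("MyRocProject/Config.roc:{n}"),
        StackTraceLine::Setup(n) => format!("MyRocProject/Setup.roc:{n}"),
        StackTraceLine::Main(n) => format!("MyRocProject/Main.roc:{n}"),
    }
}

/// Stand-in for `Err StackTrace err`: the user's error plus the accumulated lines.
#[derive(Debug)]
struct Traced<E> {
    trace: Vec<StackTraceLine>,
    err: E,
}

/// Stand-in for `trace_to_string`: one line per recorded location.
fn trace_to_string<E>(t: &Traced<E>) -> String {
    t.trace.iter().map(|l| line_to_string(*l)).collect::<Vec<_>>().join("\n")
}

/// What a `fail` keyword might desugar to: start the trace at the failure site.
fn fail<T, E>(line: StackTraceLine, err: E) -> Result<T, Traced<E>> {
    Err(Traced { trace: vec![line], err })
}

/// What `?` might desugar to on the error path: push one more line.
fn propagate<T, E>(r: Result<T, Traced<E>>, line: StackTraceLine) -> Result<T, Traced<E>> {
    r.map_err(|mut t| {
        t.trace.push(line);
        t
    })
}

fn check(s: &str) -> Result<&'static str, Traced<&'static str>> {
    if s.is_empty() {
        Ok("it was empty")
    } else {
        fail(StackTraceLine::Config(234), "it was not empty")
    }
}

fn main() {
    let via_setup = propagate(check("x"), StackTraceLine::Setup(123));
    let via_main = propagate(via_setup, StackTraceLine::Main(23));
    let e = via_main.unwrap_err();
    println!("{}\n{}", e.err, trace_to_string(&e));
}
```

No strings are built until `trace_to_string` is called; on the error path each step only pushes a small tagged integer.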
What would StackTrace look like though? Ideally it would be an array on the stack, as that would be the simplest and most performant solution I think. We would have to spill the lines onto the heap at some point of course, but given that a single StackTraceLine would only take up something like 32 bits, there's at least room for a chunk of them on the stack.
Without arrays though, how might StackTrace look? It might just be
{ line1, line2, line3, line4 ... }
and keeping track of which line to use with an integer and something like
when previousLine is
1 -> set_line_2(...)
2 -> set_line_3(...)
...
Not as nice as an array, but workable at least. Alternatively, I don't know if it's been considered to have List be able to start off on the stack when there are only a few elements in it, and only spilling to the heap as needed?
And then there's the question of how many errors should be storable before we spill onto the heap. I don't have a good answer to this. On the one hand, a small number might be sufficient, because if you're not handling the error in short order, you're probably going to let it bubble all the way out anyway. On the other hand, a library might have a deep stack of its own before the error reaches the user of the library, and it would be nice not to use the heap before they've had the chance to deal with the error. So I'm not really sure what makes the most sense there.
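The "start inline, spill to the heap later" idea could look roughly like the following Rust sketch. `INLINE_CAP` is an arbitrary choice of 4, and plain `u32` values stand in for the roughly 32-bit `StackTraceLine` entries; neither comes from the proposal itself.

```rust
/// Number of entries kept inline before spilling (arbitrary for this sketch).
const INLINE_CAP: usize = 4;

#[derive(Debug)]
struct StackTrace {
    /// Fixed-size storage that lives wherever the StackTrace itself lives.
    inline: [u32; INLINE_CAP],
    /// Total number of entries recorded so far.
    len: usize,
    /// Overflow storage; an empty Vec does not allocate.
    spilled: Vec<u32>,
}

impl StackTrace {
    fn new() -> Self {
        StackTrace { inline: [0; INLINE_CAP], len: 0, spilled: Vec::new() }
    }

    /// Record one more line, touching the heap only after the inline slots are full.
    fn push(&mut self, line: u32) {
        if self.len < INLINE_CAP {
            self.inline[self.len] = line;
        } else {
            self.spilled.push(line);
        }
        self.len += 1;
    }

    /// True once at least one entry has been written to the heap.
    fn on_heap(&self) -> bool {
        self.len > INLINE_CAP
    }

    /// All recorded lines, oldest first.
    fn lines(&self) -> impl Iterator<Item = u32> + '_ {
        let inline_len = self.len.min(INLINE_CAP);
        self.inline[..inline_len].iter().chain(self.spilled.iter()).copied()
    }
}

fn main() {
    let mut trace = StackTrace::new();
    for line in [234, 123, 23] {
        trace.push(line);
    }
    println!("{:?} on_heap={}", trace.lines().collect::<Vec<_>>(), trace.on_heap());
}
```

The trade-off in the paragraph above shows up directly: a larger `INLINE_CAP` delays the first allocation but makes every `Result` that carries a trace bigger, whether or not it ever fails.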
I think adding this amount of runtime overhead to every error operation is too incompatible with Roc's goals of running fast
for example, this would mean that doing div_checked potentially does a bunch of string copying and dynamic array resizing - that's just wildly out of bounds for an acceptable performance cost compared to today where it's a single branch
of the options we've discussed, this seems like the frontrunner to me:
Richard Feldman said:
now that we have purity inference, platforms can offer a backtrace! function
then error tracing libraries like bugsnag can take an effectful "get backtrace" function during init, so when they log errors they automatically include stack traces just like they do in e.g. JavaScript. "Log an error to an external service, including stack trace" is the most common scenario I've seen for stack traces being useful for debugging after the fact, and we already have full support for that use case today!
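A sketch of that shape in Rust, purely for illustration: `ErrorReporter` is a hypothetical reporting library, not bugsnag's actual API, and the closure passed to `init` plays the role of the platform-provided `backtrace!` function.

```rust
use std::backtrace::Backtrace;

/// Hypothetical error-reporting library that is handed a "get backtrace" function at init.
struct ErrorReporter {
    get_backtrace: Box<dyn Fn() -> String>,
    /// Stands in for the external service: reports are just collected here.
    sent: Vec<String>,
}

impl ErrorReporter {
    fn init(get_backtrace: Box<dyn Fn() -> String>) -> Self {
        ErrorReporter { get_backtrace, sent: Vec::new() }
    }

    /// Capture a trace on demand, at the moment the error is reported.
    fn error(&mut self, message: &str) {
        let trace = (self.get_backtrace)();
        self.sent.push(format!("{message}\n{trace}"));
    }
}

fn main() {
    // The application wires in whatever its platform offers for backtraces.
    let mut reporter =
        ErrorReporter::init(Box::new(|| Backtrace::force_capture().to_string()));

    reporter.error("This should never happen...");
    println!("{}", reporter.sent[0]);
}
```

The error value itself never carries a trace here; the trace exists only in the report, which is the decoupling being described.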
if you're debugging locally, there are other options (e.g. setting a breakpoint and seeing what the trace is at that point)
Note that I specifically addressed those performance concerns with the new proposal. Creating an error shouldn’t allocate strings or lists on the heap, require anything to be resized, etc.
:thinking: how would it prevent resizing?
sorry, I think I should be more direct about this: regardless of performance, I don't think we should do this.
I don't think Result should store stack trace information, period, and I think the Rust error handling libraries that do similar things are the wrong design
I think there are two scenarios where we want stack traces:
I agree that stack traces are valuable information, but I disagree with the premise that we should store them eagerly and accumulate them and pass them around, just in case we want them later. I think we explicitly should do the opposite of that, and only retrieve a trace on demand, right at the point where we've determined we want it.
Note that I specifically addressed those performance concerns with the new proposal. Creating an error shouldn’t allocate strings or lists on the heap, require anything to be resized, etc.
I work on a system that has to store a stack trace on creation of nodes. It just stores the raw reference to the stack trace and is doing pretty minimal work. It is still quite costly. Much more expensive than the old version that just returned errors without nice locations. (like 1.5 to 2x slower and it is not storing that many stack trace references).
I am really curious to see what mojo ends up doing in this space. They currently only have an error type and not an exception type cause they have not found a performant enough way to do exceptions (though I think they had some ideas). Due to wanting to be a superset of python in the long term, they definitely want exceptions eventually. Currently the solution is to run code in the debugger and make it so that any time an error is generated the debugger adds the stack trace and treats it like an exception (that or manually grabbing the stack trace and adding it to an error explicitly). It is currently pretty painful to work with.
I wanna bang this drum again, because I think stack traces are still really important.
Yeah, rust and go often deal with this by repeated wrapping and adding of more and more context. It is definitely not as nice as a stack trace in most cases.
I think it really depends on the application
like in the Roc compiler I want context so I can report them to the user
I wouldn't want to spit out a stack trace even if it were free
in a web server I want my logged error events to have stack traces but I don't think having the stack trace be passed around as a value is in any way useful to me (although it's a security concern if it's inspectable)
I wonder if we can enable getting a stack trace cleanly (even if only for crash messages).
With debug info and the llvm backend, backtraces should work if grabbed from the host. If running via the interpreter, a host backtrace would be useless.
hm yeah that's true
actually the compiled Roc app could expose a function to the host for getting the current roc backtrace
which the host could call, both for its own use and also as a way to provide it to the app
Yep, though I assume that would add a dependency on libunwind to roc. Which might be ok.
and then that function could silently either ask the interpreter or else walk stack frames
yeah that seems fine
Ok
like we want every host to be able to support backtraces
And then we could also expose that functionality to the app (though only as an effect?)
and right now you kind of have to know the tricks
I think it should be up to the platform to provide that functionality to the application (or not), but we should make it trivial for platforms to offer it
Richard Feldman said:
like we want every host to be able to support backtraces
Even in a use case where Roc is fully embedded in a larger host, like a game engine?
I think it's just simpler if all effectful functions come from the platform, no exceptions
fair
@Luke Boswell sure, like if the roc plugin crashes, you want to be able to know what chain of calls led to the crash
Brendan Hansknecht said:
I work on a system that has to store a stack trace on creation of nodes. It just stores the raw reference to the stack trace and is doing pretty minimal work. It is still quite costly. Much more expensive than the old version that just returned errors without nice locations. (like 1.5 to 2x slower and it is not storing that many stack trace references).
What is the reference to in this case? I assume it's a heap allocated collection (whether string or something else), so the cost is for building that initial trace, rather than just holding on to it?
Richard Feldman said:
I don't think Result should store stack trace information, period, and I think the Rust error handling libraries that do similar things are the wrong design
I'm curious what you see as being wrong about that design? Not that I disagree necessarily, I just want to make sure we're talking about the same things :blush:
What is the reference to in this case?
A traceback object which should just be a list of function pointers extracted from the stack. No strings have been created yet. But I assume it has to walk the stack and make a list. I guess you could minorly amortize the cost if you grab it one step at a time on every return, but I think it is fundamentally the same amount of extra cost.
Kasper Møller Andersen said:
Richard Feldman said:
I don't think Result should store stack trace information, period, and I think the Rust error handling libraries that do similar things are the wrong design
I'm curious what you see as being wrong about that design? Not that I disagree necessarily, I just want to make sure we're talking about the same things :blush:
in no particular order:
Result is a simple and flexible type, and including stack trace information seems like massive scope creep for it with really unclear benefits in comparison to alternative ways of getting stack traces that don't involve Result
even if it were free, the idea that a Dict.get saying a key wasn't present in the dictionary triggers an automatic walking of the entire stack frame feels wrong in a visceral way
Richard Feldman said:
I agree that stack traces are valuable information, but I disagree with the premise that we should store them eagerly and accumulate them and pass them around, just in case we want them later. I think we explicitly should do the opposite of that, and only retrieve a trace on demand, right at the point where we've determined we want it.
My problem with this approach is that it relies on discipline to get a lot of things right, and you don't really know ahead of time when you're going to need it. Since capturing a trace is not the default, you end up having to decide between paying the performance cost or the debugging cost without knowing what the debugging cost is (because you have to understand every way a piece of code can fail in order to know that cost).
I would personally much rather lug around a stack trace, and be able to opt out of collecting it in the few places where I know this performance matters.
Richard Feldman said:
even if it were free, the idea that a Dict.get saying a key wasn't present in the dictionary triggers an automatic walking of the entire stack frame feels wrong in a visceral way
This seems like you're thinking the Err should walk the entire stack upon creation, which isn't what I'm proposing. Instead I'm proposing that the stack trace is built up as the error is propagated through the stack anyway. In this sense, creating an Err still has no logic attached (no branches, no heap allocations). You only have to deal with this as you start propagating the error.
Just so I'm sure you're disagreeing with the right thing :smiley:
fair, but I don't see how that would work without the possibility of reallocation if that gets too big
I guess the underlying problem is that stack traces attached to Result are an imperfect approximation anyway. What we really want is a way to retrace the exact steps the code took to get to a certain point, and it just happens that Result is usually the place where the breakage becomes visible.
Having said that, I do worry that Roc's strength of allowing you to do whatever you want with errors is also a great weakness. Because it means you are free to do nothing at all with the error until you are far away from its origin. It's kind of like exceptions in that regard, except you don't get a stack trace either, so you're truly in trouble when you have to debug where it came from. And it's not like making this error easily debuggable is a one-off effort. It requires continuous discipline at every level the error gets passed around.
I think the history of errors in programming is that they are mostly ignored way more often than they should be
I've never seen any system that really fixes this
Richard Feldman said:
fair, but I don't see how that would work without the possibility of reallocation if that gets too big
You would need to reallocate at some point. I'm just distinguishing between:
exception-throwing systems and null-based systems seem to result in more unintentionally unhandled errors than Result/Option/Maybe
maybe a better way to frame my thinking on this is:
"Okay, so this will break purity, but hear me out..."
"Yikes, this had better be the most incredible upside I've ever heard of to compensate for that downside"
"Well it has a bunch of other tradeoffs"
"Okay then absolutely not"
like I don't really think it's worth spending more time talking about it, sorry :sweat_smile:
I have, if nothing else, won myself the right to feel smug the day people start complaining about not having stack traces :stuck_out_tongue:
hahaha :joy:
I went to bed feeling weird about this, because I think most arguments against this proposal are not based on the proposal itself, but rather just perceptions of what it is. Maybe that's on me for communicating it poorly, so I want to try again!
Just to get it out of the way: my proposal does not mess with purity in any way @Richard Feldman
As I laid out in the original post, today it is up to users to construct their error types such that they can actually be traced back to their origin. You do this by wrapping layers upon layers of error types, with the associated risks that you forget wrapping some places and/or you reuse names of these error wrapper types. This makes it very easy to have an error that is only partially traceable, because you weren't 100% disciplined about the tracing.
My proposal takes that work that users need to be doing today themselves, and automates it. It's the same fundamental mechanism, just handled by the language as opposed to the user. And because it doesn't rely on effects, it works just as well for libraries as for applications (where backtrace! needs to be hooked up in a library for example).
Regarding security, the only way a library would be able to read from a stack trace is if you pass it a Result as input (but it's still pure!). This is actually less invasive than calling backtrace!, because you can only see where the code has been since it became an Err, whereas backtrace! will give the code the full trace.
One thing that's not really clear to me here though, is whether we would want to encourage people to use backtrace! instead of creating these adhoc tracing structures in their error.
I think that becomes important for the argument of the type complexity at least. Because the argument that the stack trace complicates Err is kind of fair, but also ignores a bunch of other complexity.
The type signature Err err is obviously simpler than Err StackTrace err of course, but that's also glossing over the complexity of err. Without StackTrace in there, users are asked to build that structure themselves and contain it in err (and most likely do a somewhat poor job with it). Introducing the StackTrace type is not about introducing new complexity. It's about taking complexity away from err, and by extension the user, and automating it.
Kasper Møller Andersen said:
Just to get it out of the way: my proposal does not mess with purity in any way Richard Feldman
Kasper Møller Andersen said:
So here's a revised proposal for how this might look!
Result looks like this:
Result ok err : [ Ok ok, Err StackTrace err ]
where StackTrace is a nominal type that stores stack trace lines (we'll get back to how it does so later).
[...]
where there exists a function line_to_string that does this:
when line is MyRocProject__Config line -> Str.concat("MyRocProject/Config.roc:", Num.to_str line) ...
[...]
Err should have a function trace_to_string that converts the full stack trace to a human readable string.
[...]
And any time you use ? and whatever else we have to handle errors these days, the compiler basically desugars that to the same code as today, except it also inserts the corresponding StackTraceLine into the StackTrace.
the parts I just quoted mean:
Result and call it from two different functions, passing the same arguments
trace_to_string on the returned Result
Result has this property
what am I missing? :sweat_smile:
oh I guess it's that the trace only starts when you call fail for the first time?
in that case, you get way less info than with backtrace!() because you don't get to see which calls led to the error in the first place.
assuming that's correct, this still has the problem that now reorganizing pure functions can break them
like I can take working code, add a comment somewhere, and now the code breaks
because pure functions that return Result now incorporate their own source path and line numbers into their own return values
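The "add a comment and the result changes" concern can be reproduced in any language that lets a function read its own line number. Here is a rough Python illustration. The names parse and load_parse and the file name Config.roc are made up for the demo, and inspect.currentframe() is only a stand-in for a compiler-inserted source location.

```python
# The same function body, loaded twice: once as-is, once with a comment
# added above it. Only the comment differs between the two sources.
SOURCE = (
    "import inspect\n"
    "def parse(text):\n"
    "    # Stand-in for a compiler-inserted source location.\n"
    "    return ('Err', 'BadInput', inspect.currentframe().f_lineno)\n"
)


def load_parse(source):
    namespace = {}
    exec(compile(source, "Config.roc", "exec"), namespace)
    return namespace["parse"]


original = load_parse(SOURCE)
shifted = load_parse("# an unrelated comment\n" + SOURCE)

# Same argument, same logic, different return value.
print(original("x") == shifted("x"))
```

Any caller or test that compares the whole return value would now disagree between the two versions, even though nothing about the logic changed.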
Kasper Møller Andersen said:
The type signature Err err is obviously simpler than Err StackTrace err of course, but that's also glossing over the complexity of err. Without StackTrace in there, users are asked to build that structure themselves and contain it in err (and most likely do a somewhat poor job with it). Introducing the StackTrace type is not about introducing new complexity. It's about taking complexity away from err, and by extension the user, and automating it.
I'm just not convinced by the fundamental premise that accumulating and passing around stack traces is actually the right way to organize error handling code
like if an error happens in my webserver, and I want to log a stack trace to a reporting service, literally what I want is to call bugsnag.error!("This should never happen...") and have it capture a stack trace for me. This is what error logging services do in languages that support getting a stack trace anywhere.
I actively do not want to pass the stack trace around anywhere in that scenario
I just want bugsnag to put it in my logs and then I want to move on
I'm never going to do anything with it again
moreover, I'm often not going to return a Result
I'm going to try to gracefully handle the error for the end user, but I still want to have captured the trace of how I got to the point where gracefully recovering was necessary, so I can fix it next time and not have to recover
in contrast, other times I'm translating from one error type to another because I want to report it to the end user without a stack trace, since a stack trace would not help them
again, in that scenario I want to log that the problem happened, right where it happened (possibly just locally via log levels rather than to an external service, depending on what the program does) and then after that point I'm not going to use the trace ever again
so I think if Roc changed Result in this way, I would be inconvenienced by the type and performance costs and then never use the thing it's encouraging me to do
Richard Feldman said:
like if an error happens in my webserver, and I want to log a stack trace to a reporting service, literally what I want is to call bugsnag.error!("This should never happen...") and have it capture a stack trace for me. This is what error logging services do in languages that support getting a stack trace anywhere.
How would it get the stack trace in this case?
by calling backtrace!() - as the application author, I'd pass it in both that function as well as the function to do an http request, and it would store both of them
(on initialization, when I'm providing the API key - not every time)
one possible design:
Bugsnag.add_backtrace : Request, List TracedCall -> Request
Bugsnag.init : (Request => {}) -> Bugsnag
example:
Bugsnag.init(|req|
    _ =
        req
        .(Bugsnag.add_backtrace(backtrace!()))
        .(Bugsnag.add_api_key(key))
        .(Http.send!)
    {} # if logging fails, do nothing
)
nice properties of this design:
Note, even if this doesn't break purity cause it accumulates one function at a time as a result is returned, it still breaks correctness in roc.
A pure roc function should return the same thing no matter the compilation mode or target. These back traces would depend on inlining, file location, and potentially target (due to different debug info on windows and wasm)
Richard Feldman said:
because pure functions that return Result now incorporate their own source path and line numbers into their own return values
It feels weird because Roc doesn't have other meta programming built in I suppose. If it was clearer that this actually takes the entire code base as input, then it would still be considered pure, because any code relying on the output would only "break" if the input was changed. Not that I think Roc should have more meta programming, it just still makes sense to call this a pure function to me :big_smile:
Brendan Hansknecht said:
Note, even if this doesn't break purity cause it accumulates one function at a time as a result is returned, it still breaks correctness in roc.
A pure roc function should return the same thing no matter the compilation mode or target. These back traces would depend on inlining, file location, and potentially target (due to different debug info on windows and wasm)
I don't get how this would break across targets? Wouldn't it work exactly the same as the current error mapping and propagation does with respect to inlining and so on? This wouldn't rely on debuginfo at all in my mind.
Richard Feldman said:
like if an error happens in my webserver, and I want to log a stack trace to a reporting service, literally what I want is to call bugsnag.error!("This should never happen...") and have it capture a stack trace for me. This is what error logging services do in languages that support getting a stack trace anywhere.
I think about it this way:
In this view, backtrace! and my proposal are complementary actually. backtrace! tells you how you reached function x, whereas the Result stack trace states what went on inside of x.
I think it would help me to break this down by error type here too.
The two overall categories of errors are predictable (Result) and unpredictable (crash essentially).
When talking about unpredictable errors, I think we concluded that those aren't recoverable in Roc unless the platform specifically gives the tools needed for that (spawn a new process that is allowed to crash for example).
Predictable errors on the other hand can be either recoverable or unrecoverable. That depends entirely on the error. And when I say "recoverable", I mean the error is entirely within the normal functioning of the system, and all ends up being good.
I assume the different approaches we're talking about here pertain to predictable unrecoverable errors. But I think I'm missing parts of the larger picture still.
In the thread about unpredictable errors with crashes and recoveries, I also believe we discussed that the platform would be the stability boundary, and if you had an unrecoverable error (even from a Result), it was better to crash and let the platform handle the error.
The reason I made this proposal is really to help capture what goes on before you decide to crash. That is, when you get a Result, you will presumably do some analysis and propagate it around a bit before you conclude that it is indeed unrecoverable and you decide to crash. But capturing the stack trace inside the Result from the first Err instance through to calling crash, helps provide a fuller stack trace. Because the stack trace from crash is only the trace from when you finished your analysis, and not from where the original error actually occurred.
But maybe I got the wrong impression from that thread, and crash is no longer the preferred option for dealing with unrecoverable errors?
I know there's also a lot of moving parts here and that it's uncharted territory for Roc, so I'm not blaming anyone if it's all down to things having yet to settle :big_smile:
Note, even if this doesn't break purity cause it accumulates one function at a time as a result is returned, it still breaks correctness in roc.
A pure roc function should return the same thing no matter the compilation mode or target. These back traces would depend on inlining, file location, and potentially target (due to different debug info on windows and wasm)
I don't get how this would break across targets? Wouldn't it work exactly the same as the current error mapping and propagation does with respect to inlining and so on? This wouldn't rely on debuginfo at all in my mind.
Let's not even think across targets. Let's just think solely across optimization levels. An optimized build can inline functions. This changes the stack trace. An optimized build can also remove debug info. That also changes the names in a stack trace.
Unless an effect occurs, roc will return the exact same result across targets and optimization levels. Adding stack traces to results without some sort of explicit backtrace! effect would break that.
What if there was a syntax that generated a unique tag that the roc toolchain/debugger/etc could easily map back to a specific source location? Something like $MyTagName, with some tooling/library functions that could extract the name and source location it was instantiated from.
You'd use it like so:
File.readUtf8!(my_path) ? $ErrorReadingConfig
If that read fails you get back something that'd print as Err($ErrorReadingConfig:1234(FileNotFound)). The actual representation would be a guaranteed-unique tag value, even unique across callsites where the same $tag name is used.
And you could match on that (relatively) normally, like so:
match err {
    $ErrorReadingConfig(FileNotFound) -> Stdout.line!("Config file ${config_path} was not found")
    # etc
}
That would use a global table to grab the "canonical" tag id and match on that (mapping potentially several source locations where $ErrorReadingConfig is constructed, down to a single value suitable for matching).
You'd have something like Tag.source_location(err), which would use a similar global table to map back to the filename and line/col number and return that.
You'd need some utilities that could take a Roc binary and a string "backtrace" like the above, and map that back to source locations. Or perhaps that just gets compiled into the binary.
That gets you 70% of the way to full backtraces and encourages wrapping errors with useful, match-able context.
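For what it's worth, the two global tables could be modeled roughly like this. This is a Python sketch, not Roc, and every name in it (declare_site, Tagged, canonical_name, the file locations) is hypothetical. The only things taken from the idea above are that each construction site gets its own unique value, that matching goes through a canonical tag, and that a source_location lookup maps back to where the tag was created.

```python
from dataclasses import dataclass
from itertools import count

_next_site_id = count(1)
_SITES = {}  # site id -> (canonical tag name, "file:line")


def declare_site(name, location):
    # In the proposal the compiler would do this once per `$Tag` occurrence.
    site_id = next(_next_site_id)
    _SITES[site_id] = (name, location)
    return site_id


@dataclass(frozen=True)
class Tagged:
    site_id: int  # unique per construction site
    payload: object


def canonical_name(tagged):
    # What a `match` would compare against.
    return _SITES[tagged.site_id][0]


def source_location(tagged):
    # Rough equivalent of Tag.source_location(err).
    return _SITES[tagged.site_id][1]


# Two different places in the code base construct $ErrorReadingConfig.
SITE_A = declare_site("ErrorReadingConfig", "Config.roc:12")
SITE_B = declare_site("ErrorReadingConfig", "Main.roc:88")

err_a = Tagged(SITE_A, "FileNotFound")
err_b = Tagged(SITE_B, "FileNotFound")

print(canonical_name(err_a) == canonical_name(err_b))  # both match the same tag
print(source_location(err_a), source_location(err_b))  # but come from different places
```

The error value itself only carries a small id, so the cost of the location info is paid when somebody actually looks it up in the table.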
What enables this mapping from source location to backtraces? Is it simply recursive with tons of tag wrapping?
With this, the back trace -is- the ‘inspect’ output. You’d have another tool that maps the generated ids parsed from that string back into source locations that are more human readable.
Generally stack traces represent recursion and mutual recursion, but tags don't.
If you make your stacktrace represent recursion or mutual recursion, you are back to allocating a ton for an error that likely will just get thrown away and handled.
On top of that, you still lose information if some chunk of code doesn't opt into this form of error. I know one important call out above is that you may want a stack trace from a library, but this would be opt-in at a perf cost, so libraries probably wouldn't opt in.
I'm not sure this would turn out well in practice, but maybe.
Yeah, you’d have to be careful with recursive cases. I’d probably just not add tags there.
The perf cost should be very low, if the backtrace isn’t inspected.
Even for flat tags, you likely would devour stack space. Assuming no recursion, but a large branching tree of functions, the top level main has to allocate stack space for a nested tag that can represent any possible call chain
And the next function down the same but with one less wrapping and same again
So you would get gigantic error result payloads
Hence why this would be judicious and manually added
Don’t want that to happen automatically for all calls, just semantically important ones
This solves the problem of getting “just enough” source location info back to understand errors from production.
And it’s less annoying than manually choosing unique tag names
Possibly. Still blocks any library introspection (without author opt in), but is definitely something.
Last updated: Jun 16 2026 at 16:19 UTC