Stream: ideas

Topic: error line numbers


view this post on Zulip Anton (Jan 09 2024 at 12:44):

After the new proposal for chaining syntax I thought about a missing component in our Task/Result error handling approach; the error origin location. If an error happens in your Roc program, the printing/logging of the error may happen far away from the code that caused it. You obviously want to be able to find that code easily.

Could we make this work with a black box function like Compiler.origin MyErrTag?

This is not a high priority issue but I wanted to talk about this in case it required modifications to the chaining syntax proposal.

view this post on Zulip Agus Zubiaga (Jan 09 2024 at 13:03):

Interesting. It’d be cool to be able to track errors back to their source. I’m not sure what you want is where the tag was created, though. I think that’d mostly give you the location of helper functions or package internals, when I guess what you want is the place you called those from in the app.

view this post on Zulip Agus Zubiaga (Jan 09 2024 at 13:13):

Unless we gave you a full backtrace until it was created?

view this post on Zulip Anton (Jan 09 2024 at 13:45):

A full backtrace is something I want as well :) but for origin it could be limited to non-dependency code.

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:47):

I've thought about this in the past. A general concern I have about it is that runtime logic can now depend on line numbers, which in turn means things like if you get a PR to a Roc code base that adds a comment, you need to run all the tests because they might not pass anymore

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:48):

also rearranging where functions are defined, even if their implementations don't change, can now cause regressions

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:49):

I think the only way to avoid that would be to have getting source code info be a Task even though it's just getting a static constant

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:50):

either that or having some builtin SourceInfo opaque type that you could only turn into List U8 or Str using a Task

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:51):

and which didn't have any abilities

view this post on Zulip Anton (Jan 09 2024 at 14:52):

You'll most likely already be dealing with tasks to print/log the error so chaining them together seems alright.

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:54):

backtrace would definitely need to be a task

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:54):

but maybe that one should be a platform thing

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:54):

:thinking: in fact maybe all of these should be platform things?

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:55):

like just have file errors automatically include a backtrace

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:57):

the platform could also do a "task-aware backtrace" where it tells you what the tasks were that led up to it, by writing them down as it encountered them in the state machine

view this post on Zulip Richard Feldman (Jan 09 2024 at 14:58):

all of these things have an unavoidable runtime cost, and some of them have that cost even if the error doesn't happen

view this post on Zulip Anton (Jan 09 2024 at 15:02):

We could only enable it when an env var is set

view this post on Zulip Anton (Jan 09 2024 at 15:12):

The used Tasks could then have EnvVarNotSet errors.

view this post on Zulip Richard Feldman (Jan 09 2024 at 15:24):

hm, I definitely want to resist changing program behavior based on env vars - except maybe for compiler debugging things :sweat_smile:

view this post on Zulip Richard Feldman (Jan 09 2024 at 15:25):

I heard a good point somewhere that the reason it's so hard to build a lot of modern software is that a bunch of the build configuration needed to get a program that works as expected comes from the environment (including compiler args) rather than the source code

view this post on Zulip Richard Feldman (Jan 09 2024 at 15:26):

so basic --release and --debug flags are pretty straightforward, and --target seems unavoidable (unless cosmopolitan hosts started becoming popular or something) but reproducibility can deteriorate pretty quickly after that

view this post on Zulip Richard Feldman (Jan 09 2024 at 15:27):

basically I want to minimize the number of scenarios where GitHub issues on Roc projects are answered with "oh you just need to roc build with ______, then it'll work"

view this post on Zulip Richard Feldman (Jan 09 2024 at 15:28):

so optimize vs don't optimize flags seem necessary, and --target seems necessary, but anything beyond that I'd really like to avoid (e.g. --no-link is something I'd really like to move into platform module configuration at some point)

view this post on Zulip Richard Feldman (Jan 09 2024 at 15:29):

because really, either the platform always expects --no-link or it never does, so that's part of the build configuration and not something you want to change back and forth on the fly on a regular basis :big_smile:

view this post on Zulip Anton (Jan 09 2024 at 15:43):

the reason it's so hard to build a lot of modern software is that a bunch of the build configuration needed to get a program that works as expected comes from the environment (including compiler args) rather than the source code

I definitely agree with that but I think this would be a very minor infraction in this regard. I never had issues because of how RUST_BACKTRACE is set.

view this post on Zulip Agus Zubiaga (Jan 09 2024 at 15:46):

I think the difference is that RUST_BACKTRACE doesn't affect what code runs, unless you're literally reading the env var from the code.

view this post on Zulip Agus Zubiaga (Jan 09 2024 at 15:49):

Maybe instead of getting a backtrace you can act on in Roc code, the platform should just expose a way to print it if the env var was set

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 15:52):

This feels like a debate of: do we want exceptions or raw error return types.

view this post on Zulip Anton (Jan 09 2024 at 15:52):

I want the benefits of both :p

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 15:53):

Anything using results and errors should really be touched locally as it goes up the call stack (look at go for the clearest examples, rust for the second clearest).

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 15:53):

Again, this is why automatically passing all the errors up the stack unmodified isn't a wanted default behaviour.

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 15:54):

Really should be wrapping errors with more context as they go up the stack if you want richer understanding and info. That or more likely transforming them completely cause inner context may not matter to outer functions

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 15:58):

Basically, a default of automatic propagation is a default of not caring about errors and considering them exceptional. If that is the case, we should just use exceptions. If that isn't the case, we should not default to propagating, because that makes users think about errors in a way that isn't wanted.

view this post on Zulip Richard Feldman (Jan 09 2024 at 16:54):

I think a relevant distinction here is who the target audience is for the way the error gets reported (assuming it isn't handled somewhere)

view this post on Zulip Richard Feldman (Jan 09 2024 at 16:54):

for example, in most consumer-facing applications, reporting line numbers is actively bad UX

view this post on Zulip Richard Feldman (Jan 09 2024 at 16:54):

so there's no benefit to recording them for those use cases

view this post on Zulip Richard Feldman (Jan 09 2024 at 16:56):

if you want the line number to be reported to someone other than the end user (e.g. the programmers who wrote the application) then there's the question of when it gets reported

view this post on Zulip Richard Feldman (Jan 09 2024 at 16:57):

e.g. are we talking "log the error somewhere immediately when the error is created" or "pass the error around to other functions, and then maybe one of them decides to handle it and maybe none of them do, but either way the error gets logged later"

view this post on Zulip Anton (Jan 09 2024 at 16:59):

It seems realistic for a production app to do both of these things (in different places).

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:00):

For a production app, do you think they would just add a helper that is essentially onErr log Err but still return the Err?

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:17):

I think there is an important point of context that should be added here. This problem has been hit and dealt with in both Rust and Go. It isn't a single standard solution, but the dominant solution that I have seen in production apps is a special error type that is made for wrapping (and later unwrapping). The default use at every function call is to add extra context and just pass up the stack.

This is done specifically to deal with the lack of context that makes people want line numbers. (it also is often done poorly, but is still much better than no context).

Only at specific api points are those special errors collapsed and simplified to what should be exposed to users. The rest of the time they are full context and great for use by the developers.

To do the same in roc would mean that every single call to Task.await would be followed by Task.mapErr MyWrappingTag.

In production apps, I wouldn't be surprised if the code standard arose to never use the default ! that maps to Task.await, but instead a new version would be used that maps to a function that is essentially Task.awaitMappingErrTo. It would be equivalent to writing:

x <- someTask a b |> Task.mapErr WrappingTag |> Task.await

That said, I don't think you can use ! with a function that takes 2 args.

In rust and go this tends to look like:

let x = someTask(a, b).context("more info")?;

and

x, err := someTask(a, b)
if err != nil {
    return fmt.Errorf("more info: %w", err)
}

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:17):

From the production code I have seen, wrapping every single error is definitely the default.
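
A rough sketch of the wrap-then-collapse pattern described above, written in Python rather than Roc so it can be run as-is. It is an illustration only: the `Err` type, the helper names, and the `read_file` / `load_config` / `start_app` call stack are all hypothetical and not taken from any codebase mentioned in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Err:
    """An error value: a tag plus an optional wrapped inner error."""
    tag: str
    inner: Optional["Err"] = None


def wrap(tag, result):
    """Add one layer of context if result is an error; pass successes through."""
    if isinstance(result, Err):
        return Err(tag, result)
    return result


def chain(err):
    """Full developer-facing context, outermost tag first."""
    tags = []
    while err is not None:
        tags.append(err.tag)
        err = err.inner
    return tags


def user_message(err):
    """Collapse at the API edge: expose only the outermost tag."""
    return f"Something went wrong ({err.tag})"


# Hypothetical call stack: start_app -> load_config -> read_file
def read_file(path):
    return Err("FileNotFound")


def load_config(path):
    return wrap("ConfigReadErr", read_file(path))


def start_app():
    return wrap("StartupErr", load_config("app.toml"))


err = start_app()
print(chain(err))         # what a developer would dump while debugging
print(user_message(err))  # what an end user would see
```

Because each layer is a tag rather than a string, the caller at the edge can still match on it semantically before deciding how much to show.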

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:19):

|> Task.mapErr! WrappingTag does exactly this! :smiley:

(if the task being piped in does not use !)

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:19):

e.g.

File.writeUtf8 path
|> Task.mapErr! ConfigFileWriteErr

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:25):

I like |> Task.mapErr SomeTagForContext better than .context("...") in Rust, because it's a tag rather than a string so it can be handled later semantically in case I want to defer the error handling to somewhere else (or just centralize it), and also it means I'm not writing tests against strings if I want to verify that the context was added properly

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:25):

Also, depending on the application, some places always wrap with the function name. In the craziest I have seen they also always include all arguments. In roc, I guess that would be:

errorHelper = \wrapFn, args ->
     \task, cont ->
        task |> Task.mapErr! \x -> wrapFn args x
        cont

myFunc = \a, b, c -> with (errorHelper MyFunc (a, b, c))
    # write the rest of the function with automatic error wrapping and arg inclusion.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:26):

I don't have personal experience handling errors in Go, but the main thing I've heard about it is that it's widely considered a design flaw of the language and not something that should be emulated in other languages. That might or might not be a reasonable view, but it doesn't make me think (in the absence of personal experience one way or the other) it's likely to be a good source of inspiration here :sweat_smile:

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:28):

These comments about wrapping and what I have seen have nothing to do with the specifics of any one language.

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:29):

Go errs are only weird in how explicit handling them is.

view this post on Zulip Agus Zubiaga (Jan 09 2024 at 17:29):

Also in that they use a product type for something that should really be a sum type

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:30):

Otherwise, I think they tend to be quite well thought out. A lot of people just hate the explicit if err != nil checks. I really haven't heard any major complaints from people that have seriously used the language.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:30):

well I've specifically heard the complaint that propagating them isn't nice

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:30):

for example: https://stackoverflow.com/questions/18771569/avoid-checking-if-error-is-nil-repetition

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:31):

top answer says "this is a common complaint" :shrug:

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:31):

Nonetheless, I think the wrapping is an important part of this discussion and it can be seen both in go and rust (which has the easy return with just ?)

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:31):

oh totally!

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:31):

I just think Roc already has the best wrapping story of any language I'm aware of :big_smile:

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:32):

maybe with line numbers it could be improved even more, but I don't know that the use case for wrapping context and the use cases for line numbers overlap so much that coupling them is necessarily the way to go

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:33):

Just curious, how would you feel if you saw the code I wrote above used everywhere in a production app (like that is a requirement for all error handling). one version for task and another for results.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:34):

my first impression would be that a cost/benefit analysis has been made incorrectly somewhere :big_smile:

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:34):

like the benefit of that context is not worth the cost of what it does to the code

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:35):

and in general, whether you want to automate that or not, there is a steep and unavoidable runtime cost to doing that

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:35):

either you need to clone every argument in every one of those function calls, or at a minimum make them shared so they're no longer eligible for in-place mutation

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:35):

and I mean maybe it's worth it, but by default I would be very skeptical

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:36):

yeah. I think much more commonly would be the next level down from that:

errorHelper = \wrapFn ->
     \task, cont ->
        task |> Task.mapErr! \x -> wrapFn x
        cont

myFunc = \a, b, c -> with (errorHelper MyFunc)
    # write the rest of the function with automatic error wrapping.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:36):

well that would be silly haha

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:36):

why?

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:36):

at that point it would be less work to just build up the task as normal and then do |> Task.mapErr on the whole thing :smiley:

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:37):

if MyFunc is going to be the wrapper on all errors anyway, why do that wrapping in N places instead of 1 place?

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:38):

that's actually a great example of what I like about the current design

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:38):

myFunc = \a, b, c ->
    task =
        # write rest of function
    task |> Task.mapErr MyFunc

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:38):

yep!
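
To make the "N places vs 1 place" point concrete, here is a small Python sketch (not Roc). It assumes results are modeled as `("Ok", value)` or `("Err", tag)` tuples; `map_err`, `and_then`, and the two steps are hypothetical stand-ins for `Task.mapErr`, `Task.await`, and real tasks.

```python
def map_err(result, tag):
    """Wrap the error payload with a tag; leave successes untouched."""
    kind, payload = result
    return ("Err", (tag, payload)) if kind == "Err" else result


def and_then(result, next_step):
    """Run next_step on a success; short-circuit on an error."""
    kind, payload = result
    return next_step(payload) if kind == "Ok" else result


# Two hypothetical fallible steps.
def parse(text):
    return ("Ok", int(text)) if text.isdigit() else ("Err", "NotANumber")


def check_positive(n):
    return ("Ok", n) if n > 0 else ("Err", "NotPositive")


def my_func_wrap_each(text):
    # Wrapping after every step: the same tag is applied in N places.
    first = map_err(parse(text), "MyFunc")
    return and_then(first, lambda n: map_err(check_positive(n), "MyFunc"))


def my_func_wrap_once(text):
    # Build the whole chain as normal, then wrap once at the end.
    task = and_then(parse(text), check_positive)
    return map_err(task, "MyFunc")


for sample in ["abc", "0", "7"]:
    print(sample, my_func_wrap_each(sample), my_func_wrap_once(sample))
```

When every step gets the same wrapper, both versions produce identical results, so the single wrap at the end is the cheaper way to write it.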

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:39):

I actually think that would be less common just cause it reads less nice if you only care about the function

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:39):

in Elm you could also do \a, b, c -> Task.mapErr MyFunc <| but I've been hesitant to add a <| operator

view this post on Zulip Agus Zubiaga (Jan 09 2024 at 17:39):

It depends on the case, a lot of times you could skip the task def and pipe directly

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:39):

sometimes yeah

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:40):

Anyway, I feel like I have spiraled this conversation a bit. Just wanted to give examples of things I have seen in production code.

view this post on Zulip Agus Zubiaga (Jan 09 2024 at 17:40):

Or you’d actually have multiple mapErr for different tasks inside the function

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:41):

that's also possible yeah

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:41):

@Brendan Hansknecht oh I think this is very on-topic!

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:41):

Not recommending any of them. Just noting what people may want to do at some point.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:41):

I think use cases are super important here

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:42):

like in Ruby it's very common to see production exception logs with stack traces, which on the one hand is helpful context for debugging, but on the other hand it's also not great that the errors were unhandled

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:42):

Oh, also, for the first example with capturing args, they would probably avoid the copy by actually capturing Inspect.toStr (a, b). Still a lot of overhead though.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:43):

so I think it's an important question to ask: if you're trying to gracefully handle errors instead of crashing by default whenever they happen, what kind of logging context do you want? And how does that trade off against the runtime cost of storing info, the design considerations of what can possibly break if you add comments or move code around, etc.

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:54):

It is twofold though (context for user and context for developer when debugging). The errors may be recorded just to be thrown away in the common end user case. So the context will all be dropped. That said, when someone files a bug, the developer would change to dumping the full context. They don't want to have to annotate all functions whenever a bug is filed, so they would just always add the context in their code and specifically clear it in the end user case.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:55):

well but then it matters when and how the error is being recorded

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:57):

for example, if this is a server and you want to be able to go back into log files and see all the context there, then (in the absence of a time machine) either you need to have logged the context and paid all the associated costs at the time the error was logged (and do that every time) or else you need to have saved enough information to replay the program in a debug build where you can step through etc.

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:57):

Recorded was probably the wrong word here. I mean wrapped.

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:58):

I think testing is also a relevant consideration here

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:58):

like sometimes I want to verify "hey if I call this function and this bad thing happens, I want it to return something indicating 'I noticed the bad thing happened, and so my return value is different'"

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 17:58):

So default is wrapping to build up a context and then probably simplifying to a user friendly error at the edge. For debugging, you would just skip the simplification step and dump the full error

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:58):

and then I can write a separate function (and test it separately) for "given that this particular bad thing happened, report it"

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:59):

oh, and also that's valuable for libraries

view this post on Zulip Richard Feldman (Jan 09 2024 at 17:59):

since reporting the same error might be different in a GUI vs a CLI vs a server

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:00):

Brendan Hansknecht said:

So default is wrapping to build up a context and then probably simplifying to a user friendly error at the edge. For debugging, you would just skip the simplification step and dump the full error

yeah so in this scenario, we're potentially talking about two different compiled artifacts (one built for debugging and one built for release), and then somehow the programmer is recreating locally whatever state led to the bug for the end user

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:01):

at which point I wonder if line numbers are the answer vs something like getting a dump of all the inputs/outputs sent to/from the host, so you can recreate exactly what happened and do step debugging to debug the error (once we have step debugging)

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:02):

because to me as the person debugging, if I have some replay feature, that's essentially automating "steps to reproduce" and letting me skip right to where the exact problem is

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:02):

and then from there I can get not only line numbers and backtrace, but also I can step both forwards and backwards in the debugger

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:03):

yeah, a good debugging experience could solve a lot of this local case.

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:03):

In the non-local case, you would probably log the full context and then report a simplified form of the error

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:04):

yeah I definitely wonder how feasible it would be to log enough context on a server to download it and do replay locally

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:05):

and how much storage space (and performance) that would cost compared to trying to store backtraces (especially across async I/O boundaries)

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:05):

I think platforms can already do both today if they want to

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:06):

for replay?

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:06):

That is way more info than just trying to replace backtracing

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:06):

yeah, although they'd have to expose some way to actually do the replay

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:06):

Though in roc, I guess it would be logging every input from host to roc

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:06):

sorry, I guess I should say "they already will be able to in the effect interpreters world"

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:06):

yeah exactly

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:07):

well, until complex multithreading is involved at least.

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:07):

the Roc functions are all pure, so if you just write down all the arguments you pass to them, then that's all you need for full replay
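
A minimal Python sketch of that idea, under the assumption that the host calls a single pure function and logs its arguments before each call. The `step` function, the event format, and the JSON-lines log are all hypothetical; this is not how any Roc platform actually records things.

```python
import json


def step(total, event):
    """Pure function: the result depends only on the arguments."""
    op, n = event
    if op == "add":
        return total + n
    if op == "div":
        return total // n  # raises ZeroDivisionError when n == 0
    raise ValueError(f"unknown op: {op}")


def run_recorded(events, log):
    """Host loop: write down the arguments, then call the pure function."""
    total = 0
    for event in events:
        log.append(json.dumps([total, event]))
        total = step(total, event)
    return total


def replay(log):
    """Re-run every recorded call; report the first one that fails."""
    for index, line in enumerate(log):
        total, event = json.loads(line)
        try:
            step(total, tuple(event))
        except Exception as exc:
            return index, total, tuple(event), type(exc).__name__
    return None


log = []
try:
    run_recorded([("add", 10), ("div", 2), ("div", 0)], log)
except ZeroDivisionError:
    pass  # the "production" failure; the log is all we keep

print(replay(log))
```

The replay needs nothing from the original run except the log, which is what makes the failure reproducible on a different machine.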

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:07):

I actually like the way replay would interact with concurrency

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:08):

because it's like "hey, in practice this particular weird set of timings happened to come up, and so the roc functions ended up getting called in exactly this highly unusual order...but good news, we wrote it down and now it's 100% reproducible!"

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:08):

usually those are heisenbugs :big_smile:

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:09):

and in some cases you can change the roc functions and try the replay again

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:09):

how do you write things down across multiple threads in order without changing the order or adding delay?

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:09):

as long as they always perform the same tasks in the same order with the same arguments going into the host

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:09):

oh there would definitely be a performance cost to doing that (as with any of this)

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:10):

Often adding logging to detect a race condition or deadlock leads to greatly reducing the chance of hitting the issue.

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:10):

heh, fair

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:10):

Especially if you enforce linear logging with a mutex

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:11):

I would guess most servers would have logging per thread instead and those don't always interleave truthfully.

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:17):

:thinking: maybe you could do something like using the RDTSC instruction to get a timestamp right before running the actual task, and then another one immediately after the task had completed, and then at that point log the start and end times so they could be merged into one combined replay log

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 18:20):

Hmm. Yeah probably. So assuming you can always have that level of logging on you probably can have separate threads with merge-able logs using that.
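
A Python sketch of the merge step, assuming each thread keeps its own log already sorted by start time. `time.perf_counter_ns` is used here only as a stand-in for reading a hardware counter such as RDTSC, and the entry layout is hypothetical.

```python
import heapq
import time


def record(log, thread, task, run):
    """Stamp a task with start and end times and append it to a per-thread log."""
    start = time.perf_counter_ns()  # stand-in for an RDTSC read
    result = run()
    end = time.perf_counter_ns()
    log.append({"thread": thread, "task": task, "start": start, "end": end})
    return result


def merge_logs(*thread_logs):
    """Merge per-thread logs (each sorted by start time) into one timeline."""
    return list(heapq.merge(*thread_logs, key=lambda entry: entry["start"]))


def overlaps(a, b):
    """True when two tasks ran concurrently, so their relative order is ambiguous."""
    return a["start"] < b["end"] and b["start"] < a["end"]


# Fixed example data so the outcome does not depend on real timings.
thread_a = [
    {"thread": "A", "task": "read", "start": 10, "end": 20},
    {"thread": "A", "task": "write", "start": 40, "end": 50},
]
thread_b = [
    {"thread": "B", "task": "fetch", "start": 15, "end": 35},
]

timeline = merge_logs(thread_a, thread_b)
print([entry["task"] for entry in timeline])
print(overlaps(timeline[0], timeline[1]))
```

Keeping both timestamps matters: the start time gives an ordering, while the end time shows which tasks overlapped and therefore cannot be strictly ordered from the log alone.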

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:34):

yeah in general I'm really interested in the potential of replay for debugging

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:36):

the state machine makes it theoretically straightforward to track, so I wonder what apps would want to pay the costs in production (and also of course there are the usual potential security and data privacy concerns around recording/logging things!)

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:36):

but I think regardless, in debug builds we could probably make something really nice

view this post on Zulip Richard Feldman (Jan 09 2024 at 18:37):

also I forget where we discussed it previously, but being able to take serialized recordings and turn them into test cases is something else I'm interested in

view this post on Zulip Anton (Jan 09 2024 at 18:45):

so I wonder what apps would want to pay the costs in production

Finance, healthcare, government... apps

view this post on Zulip Luke Boswell (Jan 09 2024 at 19:21):

Now I want a time travelling debugger for roc-wasm4 :sweat_smile:

view this post on Zulip Brendan Hansknecht (Jan 09 2024 at 19:31):

One cool thing with roc-wasm4 is that the hot swapping at least mostly works.

view this post on Zulip Kevin Gillette (Jan 11 2024 at 15:02):

Richard Feldman said:

A general concern I have about it is that runtime logic can now depend on line numbers, which in turn means things like if you get a PR to a Roc code base that adds a comment, you need to run all the tests because they might not pass anymore

A solution to that could be to prohibit checking the value of Compiler.origin inside of tests (or probably inside of non-test code as well). I don't feel great about having a one-off exclusion rule, though.

view this post on Zulip Kevin Gillette (Jan 11 2024 at 15:14):

Richard Feldman said:

the platform could also do a "task-aware backtrace" where it tells you what the tasks were that led up to it, by writing them down as it encountered them in the state machine

What counts as one of those precursor tasks, and how do we avoid unbounded growth of stored stack traces?

I imagine any classic backpassing (or chaining) of tasks, where you're essentially just performing external actions in a sequence, could look like precursor tasks, even if they're not particularly related.

It seems like it'd be better as an opt-in "configurable trace mode" that can be requested from Roc via task, which would apply for the current context until it exits or is turned off, and during which time, stack traces and potentially other data are accumulated. A roc http application could choose to turn this on 1% of the time (or in the presence of trace headers), or could turn it on if it sees something suspicious. Or it could have it on by default (to capture even earlier request lifecycle trace info), and turn it off most of the time, which would then discard the accumulated trace data.

view this post on Zulip Kevin Gillette (Jan 11 2024 at 15:21):

Richard Feldman said:

:thinking: in fact maybe all of these should be platform things?

That might be tedious for platforms to consistently implement. Could we have Roc implement its own to cover Roc code, and perhaps have an opt-in platform function which returns the platform part of the stack trace? I realize that both roc and the platform would likely each need to observe and discard portions of the same stack trace, but iiuc, it is not always the case that one source language can do a great job formatting a stack trace for another, and not every language necessarily adheres to the same ABI internally (even if they can share an ABI at an FFI boundary).

view this post on Zulip Kevin Gillette (Jan 11 2024 at 16:03):

Recording inputs would probably need some configurable params. Consider reading a 5GB file or something: that can't plausibly be stored unless the user requests that capability.

It may be nice to have a configurable ring buffer for recorded state, like "I'm willing to store 1GB of rolling state," possibly with configurable compression (like zstd --fast equivalent), and automatic collapsing of certain threshold-exceeding values into checksums.

The result might be a packed replay artifact that might have a lot of actual task values, but which, for that 5GB file, the replay may attempt to do a real open of it from the developer's machine, and verify prior to continuing whether it matches the stored checksum.
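
A small Python sketch of the rolling buffer with checksum collapsing described above. Everything here is an assumption made for illustration: the `ReplayBuffer` class, the byte budget, the collapse threshold, and the choice of SHA-256. Compression is not modeled.

```python
import hashlib
from collections import deque


class ReplayBuffer:
    """Rolling store of recorded inputs with a fixed byte budget."""

    def __init__(self, max_bytes, collapse_over):
        self.max_bytes = max_bytes          # total budget for stored data
        self.collapse_over = collapse_over  # larger payloads become checksums
        self.entries = deque()
        self.used = 0

    def record(self, name, payload):
        if len(payload) > self.collapse_over:
            stored = hashlib.sha256(payload).hexdigest().encode()
            entry = (name, "sha256", stored)
        else:
            entry = (name, "raw", payload)
        self.entries.append(entry)
        self.used += len(entry[2])
        # Drop the oldest entries until we are back under budget.
        while self.used > self.max_bytes and len(self.entries) > 1:
            _, _, dropped = self.entries.popleft()
            self.used -= len(dropped)


def verify(entry, local_payload):
    """At replay time, check a locally opened input against what was recorded."""
    _, kind, stored = entry
    if kind == "sha256":
        return hashlib.sha256(local_payload).hexdigest().encode() == stored
    return local_payload == stored


buf = ReplayBuffer(max_bytes=100, collapse_over=32)
buf.record("small", b"a" * 10)     # stored raw (10 bytes)
buf.record("big", b"x" * 1000)     # collapsed to a 64-byte hex digest
buf.record("small2", b"b" * 30)    # pushes "small" out of the buffer

print([name for name, _, _ in buf.entries], buf.used)
```

At replay time, `verify` plays the role of checking that the developer's local copy of a large input matches the recorded checksum before continuing.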


Last updated: Jun 16 2026 at 16:19 UTC