Default error handling in basic-webserver · API design

Stream: API design

Topic: Default error handling in basic-webserver

Richard Feldman (May 01 2024 at 00:41):

I like where our design ended up for basic-cli error handling (Task {} [Exit I32 Str]) and I was thinking we could do something similar for basic-webserver - accept a main with any error type you like, and by default return HTTP status 500 (Internal Server Error) if any error occurs

Richard Feldman (May 01 2024 at 00:42):

so that way, just like in basic-cli, you can get up and running with no error handling and just add a main : Request -> Task Response [] annotation to enable full mandatory handling of all errors whenever you're ready

Richard Feldman (May 01 2024 at 00:43):

thoughts?

Agus Zubiaga (May 01 2024 at 00:49):

Hm, I don’t think it works as well for servers. In a CLI application that you’re running in a given machine, you can’t expose anything that the user doesn’t already have access to.
However, in a server/client context, you need to carefully control what you expose.

Agus Zubiaga (May 01 2024 at 00:59):

It could work if you didn’t return the error to the client or if you only did so in dev mode. That said, I think this is a platform where it’s more important to encourage good error handling.

Richard Feldman (May 01 2024 at 01:23):

I appreciate that point, but it's super easy to harden it for release - literally adding one type annotation :big_smile:

Luke Boswell (May 01 2024 at 01:39):

I wonder if we can have a different handler implementation and swap it out for cargo build verse cargo build --release

Luke Boswell (May 01 2024 at 01:49):

I've been thinking about/planning to do some cleanup and improvements for basic-webserver soon. I'm just wanting to finish cleaning up the basic-cli build process I've started on. So if we come up with a nice API, happy to make those changes too

Richard Feldman (May 01 2024 at 01:50):

I'd rather not make things work differently for dev vs release if we can avoid it

Luke Boswell (May 01 2024 at 01:53):

What about some kind of logging. Like the Task {} [Exit I32 Str] could append to a logfile or something

Agus Zubiaga (May 01 2024 at 01:54):

Richard Feldman said:

I appreciate that point, but it's super easy to harden it for release - literally adding one type annotation :big_smile:

I guess my point is that there are good CLI apps were this default crash behavior would be what you want forever. Whereas in a server, you really don’t want this beyond the initial experimentation stage.

Richard Feldman (May 01 2024 at 01:54):

oh that's interesting - like it returns error 500 but doesn't say what the problem was, but then just logs the Inspect.toStr to stderr?

Richard Feldman (May 01 2024 at 01:55):

that sounds interesting - would basically only affect initial experimentation then, yeah?

Richard Feldman (May 01 2024 at 01:55):

yeah @Agus Zubiaga I agree, but I'm mainly thinking about a way to improve the experience of the initial experimentation stage in a way where it's really easy to transition to production once you're ready

Luke Boswell (May 01 2024 at 01:55):

What I like about that is while I'm developing with localhost I can see any errors in the terminal.

Luke Boswell (May 01 2024 at 01:56):

It's been a surprisingly nice workflow with the latest basic-cli API that I did not expect

Richard Feldman (May 01 2024 at 01:56):

we could actually make that a default thing that basic-webserver does automatically

Luke Boswell (May 01 2024 at 01:57):

Like I've started tagging errors everywhere and it makes it so easy to see where things are broken, and experiment/iterate super quickly

Agus Zubiaga (May 01 2024 at 01:57):

Yeah, that seems like a good compromise

Richard Feldman (May 01 2024 at 01:59):

so basically whenever basic-webserver receives a Response where the status code is in the 500 range, it just automatically logs to stderr whatever the response was?

Richard Feldman (May 01 2024 at 01:59):

and then it does the same if it gets back a Task.err

Agus Zubiaga (May 01 2024 at 01:59):

Oh, I was thinking it would only do this for Task.err

Richard Feldman (May 01 2024 at 02:01):

I was too at first! But this comment:

Luke Boswell said:

Like I've started tagging errors everywhere and it makes it so easy to see where things are broken, and experiment/iterate super quickly

...wouldn't you still want that even after you start handling them gracefully? :thinking:

Agus Zubiaga (May 01 2024 at 02:03):

I don’t know if that would work. Handling them gracefully might mean you won’t return a 5xx.

Luke Boswell (May 01 2024 at 02:03):

I'm not sure, but I think its nice to have a catchall for unhandled cases. Is it ever valid to send a 500 deliberately??

Luke Boswell (May 01 2024 at 02:04):

Like maybe a downstream service failed or something??

Agus Zubiaga (May 01 2024 at 02:04):

Yeah, it can be when you know what went wrong but don’t want to expose it.

Richard Feldman (May 01 2024 at 02:04):

right

Richard Feldman (May 01 2024 at 02:04):

for example, a "this should never happen" situation :big_smile:

Richard Feldman (May 01 2024 at 02:05):

or yeah, you called out to another service and it gave an error, so you can't proceed

Agus Zubiaga (May 01 2024 at 02:06):

You could have a platform feature to log requests and responses and some logic to filter them

Richard Feldman (May 01 2024 at 02:06):

true!

Richard Feldman (May 01 2024 at 02:06):

I mentioned 500s specifically because if gracefully handling it results in a non-500 error, I wouldn't want to see that

Richard Feldman (May 01 2024 at 02:06):

e.g. if it's a 403 Forbidden then I don't think I need more info logged haha

Agus Zubiaga (May 01 2024 at 02:06):

Right. I guess it might be a convenient default

Richard Feldman (May 01 2024 at 02:08):

personally whenever I'm working on a server and I see a 500 my next step is to look into "ok what happened?" and so default logging sounds nice to me! :big_smile:

Agus Zubiaga (May 01 2024 at 02:14):

Now the problem with the platform logging to stderr is that people are gonna want to log to other places such as Sentry

Richard Feldman (May 01 2024 at 02:17):

I think that's something we should look into anyway, e.g. for things like this:

Agus Zubiaga said:

You could have a platform feature to log requests and responses and some logic to filter them

I'm not sure exactly how we should expose that

Agus Zubiaga (May 01 2024 at 02:19):

Something like this could work:

app [main] { pf: "https://../basic-webserver" }

import pf.Logging

main = \req ->
    Logging.onWrite! \level, message ->
        # send to Sentry

    # handle request

Richard Feldman (May 01 2024 at 02:19):

one idea is that since you can already do that in userspace (e.g. instead of writing all your logic to return Response, instead return something like MyCustomLoggableThing and then translate that to "first log whatever we want and then return the Response that main wants") just offer a way to log crashes, e.g. Str -> Task Response []

Richard Feldman (May 01 2024 at 02:20):

another possibility is to allow logging hooks for all I/O too, e.g. onCrash, onFileRead, onFileWrite, onHttpRequest, etc.

Richard Feldman (May 01 2024 at 02:20):

since the platform is in charge of all that

Richard Feldman (May 01 2024 at 02:20):

so yeah I think there's some separate design work to be done there

Luke Boswell (May 01 2024 at 02:21):

Also instead of

#[no_mangle]
pub unsafe extern "C" fn roc_panic(msg: &RocStr, _tag_id: u32) {
    panic!("The Roc app crashed with: {}", msg.as_str());
}

We might want to turn that into a 500 also and log that too.

Richard Feldman (May 01 2024 at 02:21):

oh yeah for sure!

Richard Feldman (May 01 2024 at 02:21):

also while we're at it, I'd like to change the entrypoint name from main to something more like handleRequest

Richard Feldman (May 01 2024 at 02:22):

respond maybe?

Agus Zubiaga (May 01 2024 at 02:23):

I like respond. You might not actually explicitly respond, but the platform will always respond for you :smile:

Luke Boswell (May 01 2024 at 02:25):

https://github.com/roc-lang/basic-webserver/issues/49 <-- made an issue just to track this thread. I don't want to lose these ideas

Agus Zubiaga (May 01 2024 at 02:25):

What if in addition to respond, the app provided a config value:

app [respond, config] { pf: "https://../basic-webserver" }

respond = \req ->
    # handle request

config =
    import pf.Config

    Config.default
    |> Config.port 8080
    |> Config.onLog \req, level, message ->
        # Log to Sentry
    |> Task.ok

  ```

Agus Zubiaga (May 01 2024 at 02:26):

I guess it makes the hello world a little more wordy, but I think we are going to need something like that eventually

Luke Boswell (May 01 2024 at 02:26):

I'm wondering if we will need passed in allocators and context though before I could make roc_panic log errors back on the HTTP response? :thinking:

Luke Boswell (May 01 2024 at 02:27):

@Agus Zubiaga what if we supported another roc app as a plugin that could be used to configure the server on startup?

Luke Boswell (May 01 2024 at 02:28):

So just roc server.roc would use default config. Or roc server.roc config.roc runs the configuration plugin.

Agus Zubiaga (May 01 2024 at 02:29):

Interesting. That would require a new language feature, right?

Agus Zubiaga (May 01 2024 at 02:29):

Cause' otherwise the platform would have to compile the config app on the fly

Luke Boswell (May 01 2024 at 02:30):

I don't think any new features are required. -- assuming the config is just a value or data that implements Encoding and Decoding

Luke Boswell (May 01 2024 at 02:31):

We could have the plugin do some setup, maybe it has extra effects or superpowers even, and then it could produce an immutable/ready only config object that is given to each request.

Agus Zubiaga (May 01 2024 at 02:32):

I don't think roc can run two apps like that

Luke Boswell (May 01 2024 at 02:32):

I am almost certain I can have the webserver run one app on startup and then another for handling requests

Luke Boswell (May 01 2024 at 02:33):

Or is the issue the roc build part?

Luke Boswell (May 01 2024 at 02:33):

I'll need to experiment

Agus Zubiaga (May 01 2024 at 02:34):

Well, if you build config.roc first, then I guess server.roc can execute it as a command

Luke Boswell (May 01 2024 at 02:35):

I was thinking the host could look at its args and if it finds a config.roc it roc build --lib that and then dynamically loads it.

Agus Zubiaga (May 01 2024 at 02:35):

ah right, that's what I meant with building it on the fly

Agus Zubiaga (May 01 2024 at 02:36):

So the platform ships with the compiler as a lib or it has to be available in the system

Luke Boswell (May 01 2024 at 02:36):

Roc cli has to be available in the system

Luke Boswell (May 01 2024 at 02:37):

And the plugin would use a URL release I assume

Agus Zubiaga (May 01 2024 at 02:37):

Thats interesting, but it doesn’t seem very convenient for deployment

Luke Boswell (May 01 2024 at 02:37):

Or maybe I'm just overcomplicating this... we just have a second function config alongside respond like you suggested

Luke Boswell (May 01 2024 at 02:38):

Yeah I think I'm getting carried away with the art of the possible... just throwing ideas out there

Agus Zubiaga (May 01 2024 at 02:39):

We could maybe make platform requires work like module params in that they are just a record pattern. That would allow us to have optional fields, so config wouldn’t need to be specified if you want the default.

Luke Boswell (May 01 2024 at 02:41):

That might be nicer than using a record with two functions for main.

Luke Boswell (May 01 2024 at 02:43):

I just noticed, from the platform side the requires currently is a record or at least

platform ""
    requires { Model, respond, configure } { ... }

Wheras the app is a list

provides [Model, respond, configure] to web

Agus Zubiaga (May 01 2024 at 02:45):

Luke Boswell said:

That might be nicer than using a record with two functions for main.

Honestly, that seems totally fine

Luke Boswell (May 01 2024 at 02:45):

Just threw a sneaky Model in there... it would be awesome to have some kind of in-memory cache

Jasper Woudenberg (May 01 2024 at 06:49):

I really like how Elm implements it's main function. It needs to be assigned a value of type Program, and the Browser module exports a couple different functions to create a Program.

https://package.elm-lang.org/packages/elm/browser/latest/Browser

The first function displayed is sandbox, and it takes the bare minimum options to construct a basic program, ideal for getting started with Elm and it's basic architecture. Next is element, which adds the option for the program to listen to outside events. Next is document, which adds the capability to perform side-effects. Finally there's application, which allows the program to also own page navigation.

I really like how this teaches you the Elm 'platform' one concept at a time. I think it'd be great if Roc platforms could do something similar.

I wonder what you think (and maybe this has been discussed before) about implementing it like this:

app [main] { pf: "https://../basic-webserver" }

main = loggingServer { respond, onLog }

respond = \req ->
    # handle request

onLog = \msg ->
    # log something

Luke Boswell (May 01 2024 at 07:08):

Interesting idea... could we achieve this using Abilities? maybe the platform requires a type that implements the WebServerAbility, and there are a few default implementations provided.

The first might be SimpleServer { respond }, second might be WithLogging { respond, handleLogs }, etc.

The platform then calls all of the functions it needs to build the record to provide to the host. Some of these may be default implementations.

Agus Zubiaga (May 01 2024 at 12:09):

I don’t know if you need different kinds of servers. You could have something like this:

app [server] {...}

import pf.Http exposing [Server, Request, Response]

server : Task Server *
server = Http.serve! { respond }

respond : Request -> Task Response *
respond = \req ->
    ...

Http.serve takes a record with optional fields so you can pass other config if you need to:

server : Task Server *
server = Http.serve! {
    respond,
    port: Env.get! "PORT",
    onLog: \x -> Cloudwatch.log x …
}

If a record with optional fields isn’t enough, you can also use the builder pattern.

Richard Feldman (May 01 2024 at 14:36):

yeah I'd like to move to this type of API, but it requires being able to send closures to hosts, which we can't do yet :sweat_smile:

Richard Feldman (May 01 2024 at 14:37):

Folkert and I have been working on that for awhile, but it's tricky to get right

Richard Feldman (May 02 2024 at 03:38):

unfortunately, looks like implementing this idea runs into https://roc.zulipchat.com/#narrow/stream/395097-compiler-development/topic/Alias.20analysis.20error.20across.20modules

thread 'main' panicked at crates/compiler/gen_llvm/src/llvm/build.rs:5759:19:
Error in alias analysis: error in module ModName("UserApp"), function definition FuncName("\x0f\x00\x00\x00\x05\x00\x00\x00\xce\x826j\x04j\xb3\xbf"), definition of value binding ValueId(3): could not find func in module ModName("UserApp") with name FuncName("\x02\x00\x00\x00\x00\x00\x00\x00\x1d\xc8\xcdx>Y\x17\x03")

Luke Boswell (May 30 2024 at 08:06):

So to summarise the key points from above, I am going to make the following changes in a PR:

change the name of the function provided to the platform from main to respond
change the API to accept tag union of errors e.g. respond : Request -> Task Response [ServerErr Str]_.
- When the platform see a ServerErr tag, it will respond with a status code of 500 and log the Str to stderr (not in the response).
- Any other unhandled errors are automatically converted to a 500 and also Inspect.toStr logged to stderr.
attempt to convert roc_panic to respond with a 500 and log the stderr, instead of crashing the server
split the platform and host files into separate crates (like basic-cli)

Luke Boswell (May 30 2024 at 08:24):

For any experienced rust people ... to implement the roc_panic handling I wonder if we can use std::panic::catch_unwind to wrap the call into roc.

My theory is that then when roc_panic gets called and we panic! in there, that is be somewhere down the stack and this will catch that so we can handle it appropriately.

Richard Feldman (May 30 2024 at 11:15):

Luke Boswell said:

For any experienced rust people ... to implement the roc_panic handling I wonder if we can use std::panic::catch_unwind to wrap the call into roc.

My theory is that then when roc_panic gets called and we panic! in there, that is be somewhere down the stack and this will catch that so we can handle it appropriately.

I know @Folkert de Vries has looked into this for Nea!

Folkert de Vries (May 30 2024 at 11:35):

this is dangerous in practice. Rust panics are not normally allowed to cross an FFI boundary either way. There is now a special calling convention "C-unwind" but even then I think the unwinding through roc code won't really work?

Folkert de Vries (May 30 2024 at 11:36):

I mean we could make it work technically I guess (though I have no idea how)

Richard Feldman (May 30 2024 at 12:45):

hm, what should we do instead?

Richard Feldman (May 30 2024 at 12:45):

setjmp/longjmp?

Folkert de Vries (May 30 2024 at 12:46):

maybe. are we sure we can skip running destructors ?

Folkert de Vries (May 30 2024 at 12:46):

that would just leak memory otherwise

Richard Feldman (May 30 2024 at 12:47):

hm, yeah good point

Richard Feldman (May 30 2024 at 12:47):

this is a good argument for doing the unwinding inside the generated Roc code and having the Roc call return a Result :sweat_smile:

Richard Feldman (May 30 2024 at 12:47):

(along with effect interpreters and passing in an Allocator struct)

Brendan Hansknecht (May 30 2024 at 17:40):

If exceptions are truly rare, returning a result generally has worse performance characteristics. Especially if the result has to accumulate some sort of information related to the stack trace on the error case.