Stream: ideas

Topic: Decoder APIs: Streaming = Bad Perf?


view this post on Zulip Brendan Hansknecht (Apr 20 2024 at 02:25):

I pulled this into a second thread so it doesn't derail the first one.

Apparently, according to the serde_json docs, using from_reader is generally a lot slower than just loading the entire list/str beforehand:

Note that counter to intuition, this function is usually slower than reading a file completely into memory and then applying from_str or from_slice on it. See issue #160.

So this may actually be a pattern that isn't worth worrying a ton about. The perf hit of using the streaming decoder can apparently be huge (like 3 to 5x slower). For files, you essentially always just want to mmap the entire file into memory (this is super great for roc where you can have a seamless slice that points into the mmap).

For webservers, to avoid DDoS attacks, it is generally recommended to limit the size of the body. Given the size of the body is limited, it may actually be better to just load the entire body into memory and then decode. Obviously, there will exist some network speed where it is faster to do the streaming decode.
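
A minimal sketch of the "cap the body size, then load it all into memory" approach, in Rust using only the standard library. The name `read_body_limited` and the choice of error kind are hypothetical and not taken from any web framework; a real server would usually rely on its framework's own body limit.

```rust
use std::io::{self, Read};

/// Reads the whole body into memory, rejecting inputs longer than `max_len`.
/// Hypothetical helper for illustration only.
fn read_body_limited<R: Read>(reader: R, max_len: usize) -> io::Result<Vec<u8>> {
    let mut body = Vec::new();
    // Allow one byte past the limit so an oversized body can be detected
    // without ever buffering more than `max_len + 1` bytes.
    reader.take(max_len as u64 + 1).read_to_end(&mut body)?;
    if body.len() > max_len {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "body exceeds size limit",
        ));
    }
    Ok(body)
}
```

The returned buffer can then be handed to a non-streaming decoder in one piece, for example serde_json's from_slice.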

Probably the biggest concern with not having a streaming decoder is if you have some sort of trusted generated input of unknown length. It may not be reasonable to load everything in one go. You might run out of memory. In that specific case, loading in chunks may be required.


Another example is simd-json. From what I can tell, it does not work with a continuation style api. It requires that the entire data be loaded into a single allocation up front. It then does lazy parsing of the json.


So, just more food for thought, maybe we don't actually want a streaming api?

view this post on Zulip Richard Feldman (Apr 20 2024 at 02:27):

huh, wow!

view this post on Zulip Richard Feldman (Apr 20 2024 at 02:27):

I never knew about that

view this post on Zulip Richard Feldman (Apr 20 2024 at 02:28):

if it's a footgun, then at a minimum we probably shouldn't add it eagerly :sweat_smile:

view this post on Zulip Richard Feldman (Apr 20 2024 at 02:28):

and instead wait to really confirm there's a use case where it's actually the right choice, even with the knowledge of this :point_up:

view this post on Zulip Brendan Hansknecht (Apr 20 2024 at 02:34):

Oh wow, reading through that issue more, some cases even with buffered readers on serde_json can be more than 10x slower.

Oh wow, this one is crazy (EDIT: huge, but not that crazy):

view this post on Zulip Eli Dowling (Apr 20 2024 at 03:29):

Benchmarks in dotnet don't agree with that at all sadly:

| Method           | Mean     | Error    | StdDev   |
|----------------- |---------:|---------:|---------:|
| IdOnlyJsonStream | 38.68 ms | 0.548 ms | 0.512 ms |
| IdOnlyReadFIle   | 42.02 ms | 0.840 ms | 1.843 ms |

I'm using 21MB of json. I'll try to make a rust benchmark too, but I'm not as familiar with streaming there.
If you'd like to audit the highly complex code that got us here :sweat_smile: :

    [<Benchmark>]
    member _.IdOnlyJsonStream() =
        let jsonStream = File.OpenRead("/home/eli/Code/roc/lsp/small-json.json")
        let objects = JsonSerializer.Deserialize<MyObject ResizeArray>(jsonStream)
        let item=(objects.Item (objects.Count-1)).id
        item
    [<Benchmark>]
    member _.IdOnlyReadFIle() =
        let json = File.ReadAllBytes("/home/eli/Code/roc/lsp/small-json.json")
        let objects = JsonSerializer.Deserialize<MyObject ResizeArray>(json)
        let item=(objects.Item (objects.Count-1)).id
        item

Edit: Using ReadAllBytes instead

view this post on Zulip Eli Dowling (Apr 20 2024 at 03:33):

In this case it's what I'd expect, a tiny bit faster because the data is decoded as it's being read and reading data takes time

view this post on Zulip Brendan Hansknecht (Apr 20 2024 at 03:50):

Does dotnet let you mmap a file? If so, how does that compare?

view this post on Zulip Brendan Hansknecht (Apr 20 2024 at 05:22):

Ok, so @Eli Dowling and I did some more digging. The original perf numbers of 5x to 10x are definitely wrong. The reader api is still not as fast in rust, but later commits have sped it up tremendously (mostly by adding and reusing buffers).

These are the rough findings. Would have to do more testing to fully confirm all of them:

  1. 5x to 10x perf difference was definitely a bug
  2. Streaming vs non-streaming can be very close in perf
  3. DotNet doesn't have this, but in rust (and roc), we can use string slices to avoid copying string bytes. This is where a 1x to 2x perf difference can come from. Note, you can only use slices if you are in a non-streaming setup.
  4. If a string has an escape sequence, we have to copy it, so no gain from the string slices.
  5. For absolute max performance, you have to do something like simd-json. Many of the techniques can be applied to a streaming setup. That said, lazy decoding and avoiding all data copying is only possible in a non-streaming setup (still has single copy to load a string with escaped characters if the string is used).
  6. For lowest memory use, streaming can make a crazy difference especially when only loading some of the fields.
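
Points 3 and 4 can be sketched with `std::borrow::Cow` in Rust. This is only an illustration under simplifying assumptions: `decode_string_contents` is a hypothetical helper, it takes the text between the quotes of a JSON string, and it handles only a few single-character escapes (no `\uXXXX`). It is not how serde_json or roc implement decoding.

```rust
use std::borrow::Cow;

/// Decodes the contents of a JSON string literal (the text between the quotes).
/// Hypothetical helper: supports only simple escapes, not `\uXXXX`.
fn decode_string_contents(raw: &str) -> Cow<'_, str> {
    if !raw.contains('\\') {
        // No escapes: hand back a slice into the input buffer, no copy.
        return Cow::Borrowed(raw);
    }
    // At least one escape: the decoded text differs from the input bytes,
    // so a new buffer has to be allocated and filled.
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('r') => out.push('\r'),
            // Simplification: covers \" \\ and \/ by copying the escaped
            // character itself; other escapes are not decoded correctly.
            Some(other) => out.push(other),
            // A trailing lone backslash is kept unchanged.
            None => out.push('\\'),
        }
    }
    Cow::Owned(out)
}
```

The borrowed case is only possible while the whole input buffer stays alive, which is why it does not carry over to a streaming decoder that reuses its read buffer.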

So this definitely makes the tradeoff feel much more nuanced.

view this post on Zulip Brendan Hansknecht (Apr 20 2024 at 05:26):

I think my suggestion in #ideas > Decoder APIs likely require streaming still stays the same, but I now think that a streaming api could definitely be useful. It seems in many cases the perf difference may be negligible but the memory usage difference significant.

view this post on Zulip Eli Dowling (Apr 20 2024 at 05:33):

I do think it is worth remembering there are some applications that would be simply impossible without some kind of stream api.
eg: Summing all the values in a 1gb json file on a little cloud instance with only 500mb of memory, or streaming a video file over the network or any number of other things.
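
A rough sketch of the "sum a file larger than memory" case, in Rust with only the standard library. It assumes the input is a flat JSON array of numbers such as `[1, 2.5, -3]`; `sum_flat_array` and `flush_token` are hypothetical names, and this is a chunked scanner rather than a real JSON decoder.

```rust
use std::io::{self, Read};

/// Sums the numbers in a flat JSON array, reading fixed-size chunks so that
/// memory use does not grow with the size of the input.
/// Assumption: no nesting, strings, or objects in the input.
fn sum_flat_array<R: Read>(mut reader: R) -> io::Result<f64> {
    let mut chunk = [0u8; 4096];
    let mut token = String::new(); // characters of the number being read
    let mut total = 0.0;
    loop {
        let n = reader.read(&mut chunk)?;
        if n == 0 {
            break; // end of input
        }
        for &b in &chunk[..n] {
            match b {
                // Number characters are accumulated; a number may span chunks.
                b'0'..=b'9' | b'-' | b'+' | b'.' | b'e' | b'E' => token.push(b as char),
                // Anything else (brackets, commas, whitespace) ends a number.
                _ => flush_token(&mut token, &mut total)?,
            }
        }
    }
    flush_token(&mut token, &mut total)?;
    Ok(total)
}

/// Parses the pending token, if any, and adds it to the running total.
fn flush_token(token: &mut String, total: &mut f64) -> io::Result<()> {
    if !token.is_empty() {
        let value: f64 = token
            .parse()
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        *total += value;
        token.clear();
    }
    Ok(())
}
```

Only the 4 KiB chunk and the current number token are held at any time, so the same code works on a `std::fs::File` of any size.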

I think a streams design in roc is worth keeping in mind.

view this post on Zulip Brendan Hansknecht (Apr 20 2024 at 05:41):

Yeah, theoretically mmap partially helps for the "1gb json file on a little cloud instance with only 500mb of memory", but the streaming decode/encode is definitely needed for video/audio streams.

view this post on Zulip Richard Feldman (Apr 20 2024 at 11:57):

would streaming video/audio streams use Decoding though? :thinking:

view this post on Zulip Eli Dowling (Apr 20 2024 at 11:58):

No, likely not, but this conversation had kind of wandered into the "should we support streams" area as well. So I just wanted to reiterate their importance :sweat_smile:


Last updated: Jun 16 2026 at 16:19 UTC