buffered I/O in a CLI platform · ideas

so I'm working on a CLI platform, and I'm wondering whether buffered I/O is actually a good idea in the context of a pure functional language.

EDIT: to clarify, I don't mean OS buffering, like disk cache - I mean buffering before the syscall, like Java's BufferedReader

Richard Feldman (Jun 24 2022 at 03:57):

Richard Feldman (Jun 24 2022 at 03:58):

in an imperative language, I might do that by opening a buffered file writer, and then periodically sending chunks of the file to the buffered file writer

Richard Feldman (Jun 24 2022 at 03:58):

and if the chunks are small enough, the fact that it's buffered will save a bunch of syscalls and a bunch of small disk I/O, both of which will make the whole thing run faster

Richard Feldman (Jun 24 2022 at 03:59):

but in a functional language, that's not how I'd write it - I'd just build up a big 10MB data structure in memory using pure functions and then write that whole thing to disk

Richard Feldman (Jun 24 2022 at 03:59):

Richard Feldman (Jun 24 2022 at 04:00):

then on the reading side, I'd write (or let it get auto-derived based on the type) a parser (or "decoder") for the format of whatever I'm reading, and then set that loose on a big flat buffer of bytes that all got read in from disk

Richard Feldman (Jun 24 2022 at 04:01):

Richard Feldman (Jun 24 2022 at 04:02):

the only times I can think of where I wouldn't want to do that are cases where buffered I/O would be slower so I still wouldn't want buffered I/O, like if the entire goal is to do a bit of work, then flush to disk, then do some more work - or to parse incrementally as bytes come in over the network

Richard Feldman (Jun 24 2022 at 04:03):

so all of this is leading me to conclude I should go for the simpler design, which is to have the lowest-level primitives be unbuffered I/O, and then of course higher-level primitives where the caller doesn't really think about buffering (e.g. the typical "give me a path and some stuff to write to it, and I'll take care of writing the stuff to the file at that path, don't worry about the low-level details of how I do that")

Richard Feldman (Jun 24 2022 at 04:04):

and nothing in between - no first-class "buffered I/O" primitive because that would actually encourage doing imperative programming in a functional language

Richard Feldman (Jun 24 2022 at 04:05):

(although of course if somehow that really was exactly what you wanted for your particular use case, you could always do your own buffering manually)

Brian Carroll (Jun 24 2022 at 06:10):

Sounds good to me though I'm not super knowledgeable about this.
When you mention big chunks of data in memory it makes me wonder about a streaming API, for the case where you're processing a really large amount of data and you don't want to put it all in memory.
That's still a functional API, but the user provides a "fold" function that works on little chunks at a time. And then maybe if those chunks are too tiny, you'd want them to be buffered.
Even if the data isn't too big to fit in memory, a streaming API would minimise memory usage and time spent in the memory allocator.
I suppose the same platform could provide batched and streaming APIs.
I'm not sure if this is relevant to what you want to do with this specific platform though!

Brian Carroll (Jun 24 2022 at 06:15):

So if you do want some buffering, then I wonder if there's an advantage to calling buffered functions in the OS rather than managing your own buffer in user-space?

Qqwy / Marten (Jun 24 2022 at 11:08):

Qqwy / Marten (Jun 24 2022 at 11:17):

I am aware of some languages (e.g. the Erlang VM (Erlang/Elixir). I believe this design also exists in some other languages but I do not remember them from the top of my head) allowing to build up something known as "IOdata".

IOData = Char | String | LinkedList IOData

(Eschewing Rock-specific types here to make it clear that we are dealing with character codepoints and linked lists. Also note that in the actual Erlang implementation it is OK for these kinds of list not to end in Nil but instead in another Char or String.)

The idea is that you can build up something you want to output by just copying over some references, rather than building an intermediate string.
It turns out to be much more efficient than the alternative in many practical situations, as much copying is prevented.

Qqwy / Marten (Jun 24 2022 at 11:19):

The idea is that you can build up these IODatas extremely efficiently, by just concatenating whatever chunks you have together.

Qqwy / Marten (Jun 24 2022 at 11:20):

:thinking: There might be a cleaner way to translate that super dynamic datastructure in a nice ADT by the way.

Qqwy / Marten (Jun 24 2022 at 11:21):

In any case, if we have something in place that can pass on such a datastructure to the OS directly, then I think that would be enough to satiate anyone's needs for "buffered" I/O.

Qqwy / Marten (Jun 24 2022 at 11:27):

IOData = Nil | Byte | String | Cons IOData IOData

Richard Feldman (Jun 24 2022 at 12:27):

so there are a few layers of buffering - there's OS-level buffering, which is basically "disk cache" where you tell the OS "hey here are some bytes I want to write to disk" and it doesn't actually write them right away; rather, it puts them into an in-memory buffer and then periodically actually writes them to disk (e.g. once per second)...but you don't usually notice, because also if you ask the OS to read bytes from there, it will read them straight out of the cache, so it always feels like everything is written instantly.

However, the very act of telling the OS "write these bytes to disk" is a syscall, which is a lot more expensive than a normal function call because it has to fire an interrupt, then the OS has to save a bunch of registers, switch into kernel space, do the operation you asked for, and then switch back and restore all the registers. So it can make sense to do another layer of buffering outside the OS, to prevent having to call OS syscalls as often. :big_smile: That's what BufferedReader and the like do - libc even has some convenience functions to do this.

Richard Feldman (Jun 24 2022 at 12:33):

yeah, writev does something like this - it's among the lowest-level OS APIs for writing to a file (along with write and pwrite, also documented in that link).

writev is essentially like "give me a List (List U8) and I'll write the contents of each one to disk, one at a time" - although with slightly different data structures than a Roc List.

Qqwy / Marten (Jun 24 2022 at 17:58):

I'm pretty certain that writev (which is part of the sys/uio.h POSIX header by the way.) served as inspiration for (and probably is used internally) the introduction of IOlist/IOdata into Erlang.

Qqwy / Marten (Jun 24 2022 at 17:59):

One approach we could take in Roc is of course to say "If you want it, a Platform can provide it (potentially with few or no syscalls)", but I think that there is real benefit to support some kind of buffering in Roc itself.

Qqwy / Marten (Jun 24 2022 at 18:00):

And one such an API (which can live in pure Roc) might be the mentioned IOData API.

EDIT: (Of course, an extra performance benefit is obtained if the resulting IOData does not need to be converted into a String before being passed to an I/O function, so some support for it could be added as part of a platform which could use it to call functions like writev directly. In this case, a final copy of all the parts of the string is also avoided.)

Richard Feldman (Jun 24 2022 at 18:13):

oh yeah - this is definitely a platform question! It's just that I'm working on such a platform, and am trying to figure out what the platform's file I/O API should be :big_smile:

Qqwy / Marten (Jun 24 2022 at 18:14):

:thinking: Unless you want to blow people away with extreme benchmarks from the start, you could consider adding a simple API now and defer the potential design and implementation of a more performant API for later

Qqwy / Marten (Jun 24 2022 at 18:14):

Brendan Hansknecht (Jun 24 2022 at 18:56):

Richard Feldman (Jun 24 2022 at 19:01):

the way it works today is that to mark something as readonly, there has to be a usize zero right before the beginning of the allocation

Richard Feldman (Jun 24 2022 at 19:02):

and with memmap we can't control that; we can't make sure there's eight bytes worth of zeroes right before the beginning of the allocation

Richard Feldman (Jun 24 2022 at 19:03):

however, with slices it works differently; we get to set the capacity field to a negative number, which means "add this (negative) number to the pointer to obtain a pointer to the refcount (the refcount being the usize we need to set to 0)"

Richard Feldman (Jun 24 2022 at 19:04):

so in that world we can do something sneaky, which is to put a usize zero in memory somewhere in the host, find the (negative) offset between that and the pointer that memmap returns (wrapping around if necessary), and then set that negative number in the capacity field

Richard Feldman (Jun 24 2022 at 19:05):

that said, I'm not sure if we want to just because we can - it's risky in the sense that if I have a Roc program that stores the memmap'd List U8 and reads from it multiple times throughout the course of the program, it could get different answers

Richard Feldman (Jun 24 2022 at 19:06):

because it's immutable with respect to Roc code, but other processes can still modify it on the filesystem, resulting in the bytes changing in-memory for the Roc program too

Richard Feldman (Jun 24 2022 at 19:06):

Richard Feldman (Jun 24 2022 at 19:11):

kinda - in this particular API, there are some cases where the low-level operations need to share certain types with the high-level operations, so there are certain questions I want to figure out now in order to see how everything fits together.

I'm trying to get a basic draft working and see what problems arise, so if there are any blocking compiler bugs (e.g. it turns out https://github.com/rtfeldman/roc/issues/3314 is a blocker) they can hopefully get resolved in parallel to the actual implementation of the CLI platform

Richard Feldman (Jun 24 2022 at 19:24):

Qqwy / Marten (Jun 24 2022 at 19:29):

Looking forward to the draft! If you have more concrete ideas about how the API might look, I'd love to talk more about it

Qqwy / Marten (Jun 24 2022 at 19:35):

:thinking: In a perfect world, everyone would be using io-uring already and no extra buffering would be required. One can dream :sweat_smile:

Brendan Hansknecht (Jun 24 2022 at 20:32):

MAP_PRIVATE
              Create a private copy-on-write mapping.  Updates to the
              mapping are not visible to other processes mapping the
              same file, and are not carried through to the underlying
              file.  It is unspecified whether changes made to the file
              after the mmap() call are visible in the mapped region.

Brendan Hansknecht (Jun 24 2022 at 20:36):

In most functional languages, wouldn't this normally be done lazily. The lazy io handler would then deal with buffering? Run func until I have a chunk, write it, repeat until lazy function returns nothing.

Qqwy / Marten (Jun 24 2022 at 20:37):

Brendan Hansknecht (Jun 24 2022 at 20:40):

As a side question: what about patching writes? Is that at all in your use case here, or are you always thinking in whole files for CLI. Cause patching would want a different api probably.

And I guess another question: what is your stdin/out plan? I feel that files should be able to look the same as stdin/out and vice versa. Which I guess would suggest supporting streaming (byte by byte std*) and either full block or chunked api (file).

Richard Feldman (Jun 24 2022 at 21:27):

patching writes as in writing to part of the file, but not overwriting the whole thing (or just appending)?

Brendan Hansknecht (Jun 24 2022 at 21:35):

Richard Feldman (Jun 24 2022 at 22:20):

yeah I've got a thing to support that too. I'll try to write this whole thing up over the weekend to explain how the pieces fit together, and get feedback :smiley:

Richard Feldman (Jun 24 2022 at 22:44):

related: has anyone had a use for opening a single file (or I guess socket maybe?) for both reading and writing? I've never done it personally, and I'm not sure what the use cases are!

Brendan Hansknecht (Jun 24 2022 at 22:45):

Mostly just in the memmap and update case. Might read things too as part of that.

Brendan Hansknecht (Jun 24 2022 at 22:46):

But generally, not really. Like it could instead be a buffered or random read followed by a patch

Qqwy / Marten (Jun 25 2022 at 08:41):

Anti-caching comes to mind. A niche, but useful technique for certain applications.

Qqwy / Marten (Jun 25 2022 at 08:43):

More common is to e.g. open a file, read all contents, do something with them, and then re-write the file (truncating and re-filling it) with updated contents.

Richard Feldman (Jun 25 2022 at 11:06):

Qqwy / Marten (Jun 25 2022 at 12:01):

Yes, that's another good example. (Although I'm sure that SQLite does all kinds of crazy tricky stuff with low-level APIs and mmapping for extra efficiency and resiliency when the computer is turned off partway through a write)

Stream: ideas

Topic: buffered I/O in a CLI platform

Richard Feldman (Jun 24 2022 at 03:57):

Richard Feldman (Jun 24 2022 at 03:57):

Richard Feldman (Jun 24 2022 at 03:58):

Richard Feldman (Jun 24 2022 at 03:58):

Richard Feldman (Jun 24 2022 at 03:59):

Richard Feldman (Jun 24 2022 at 03:59):

Richard Feldman (Jun 24 2022 at 04:00):

Richard Feldman (Jun 24 2022 at 04:01):

Richard Feldman (Jun 24 2022 at 04:02):

Richard Feldman (Jun 24 2022 at 04:03):

Richard Feldman (Jun 24 2022 at 04:04):

Richard Feldman (Jun 24 2022 at 04:05):

Brian Carroll (Jun 24 2022 at 06:10):

Brian Carroll (Jun 24 2022 at 06:15):

Qqwy / Marten (Jun 24 2022 at 11:08):

Qqwy / Marten (Jun 24 2022 at 11:17):

Qqwy / Marten (Jun 24 2022 at 11:19):

Qqwy / Marten (Jun 24 2022 at 11:20):

Qqwy / Marten (Jun 24 2022 at 11:21):

Qqwy / Marten (Jun 24 2022 at 11:27):

Richard Feldman (Jun 24 2022 at 12:27):

Richard Feldman (Jun 24 2022 at 12:33):

Qqwy / Marten (Jun 24 2022 at 17:58):

Qqwy / Marten (Jun 24 2022 at 17:59):

Qqwy / Marten (Jun 24 2022 at 18:00):

Richard Feldman (Jun 24 2022 at 18:13):

Qqwy / Marten (Jun 24 2022 at 18:14):

Qqwy / Marten (Jun 24 2022 at 18:14):

Brendan Hansknecht (Jun 24 2022 at 18:56):

Brendan Hansknecht (Jun 24 2022 at 18:56):

Brendan Hansknecht (Jun 24 2022 at 18:56):

Richard Feldman (Jun 24 2022 at 19:01):

Richard Feldman (Jun 24 2022 at 19:01):

Richard Feldman (Jun 24 2022 at 19:02):

Richard Feldman (Jun 24 2022 at 19:03):

Richard Feldman (Jun 24 2022 at 19:04):

Richard Feldman (Jun 24 2022 at 19:05):

Richard Feldman (Jun 24 2022 at 19:06):

Richard Feldman (Jun 24 2022 at 19:06):

Richard Feldman (Jun 24 2022 at 19:11):

Richard Feldman (Jun 24 2022 at 19:24):

Qqwy / Marten (Jun 24 2022 at 19:29):

Qqwy / Marten (Jun 24 2022 at 19:35):

Brendan Hansknecht (Jun 24 2022 at 20:32):

Brendan Hansknecht (Jun 24 2022 at 20:36):

Qqwy / Marten (Jun 24 2022 at 20:37):

Brendan Hansknecht (Jun 24 2022 at 20:40):

Richard Feldman (Jun 24 2022 at 21:27):

Richard Feldman (Jun 24 2022 at 21:27):

Brendan Hansknecht (Jun 24 2022 at 21:35):

Brendan Hansknecht (Jun 24 2022 at 21:35):

Richard Feldman (Jun 24 2022 at 22:20):

Richard Feldman (Jun 24 2022 at 22:44):

Brendan Hansknecht (Jun 24 2022 at 22:45):

Brendan Hansknecht (Jun 24 2022 at 22:46):

Qqwy / Marten (Jun 25 2022 at 08:41):

Qqwy / Marten (Jun 25 2022 at 08:43):

Richard Feldman (Jun 25 2022 at 11:06):

Qqwy / Marten (Jun 25 2022 at 12:01):