streaming file read · beginners · Zulip Chat Archive

Musab Nazir (May 08 2024 at 00:42):

Hey all, I was looking to attempt the 1 billion rows challenge with Roc, and since the input for this problem is pretty large (14.8GB in my case), I need to stream in the data. I looked at the File.read and File.readBytes docs, and it seems they attempt to read the whole file in one go. Maybe I'm missing something. Is there an example of parsing large files I could look at?

Hristo (May 08 2024 at 07:27):

I know this isn't answering your question, but just out of curiosity - did you manage to get correct answers in Roc on smaller-scale instances of the problem (my understanding is that the input size is effectively user-controlled)?

I tried it some time ago, but ran into some floating-point rounding/truncation issues that were non-intuitive to me (I hadn't had the time to dive into the weeds), and I couldn't dedicate the time to look into them then.

My plan was to eventually host a competition for the best Roc solution, which we could then showcase in the discussion section of the 1brc repository.

Brendan Hansknecht (May 08 2024 at 09:17):

Yeah, we should probably have a way to open a file and then read a chunk at a time (I did that in the false interpreter). It would also be really nice to have a way to just memory-map a file into a Roc list.
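
For illustration only: the open-then-read-chunks idea above can be sketched in Python (not Roc, and not the basic-cli API, which had no such function at the time of this thread). The helper name `iter_lines_chunked` and the 1 MiB default chunk size are hypothetical choices. The sketch reads fixed-size chunks and carries any incomplete trailing line over to the next chunk, so only one chunk plus a partial line is held in memory at a time.

```python
from typing import Iterator


def iter_lines_chunked(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield each line of the file (without its newline), reading chunk_size bytes at a time.

    Hypothetical helper for illustration; not part of any Roc platform.
    """
    leftover = b""  # bytes of a line that was cut off at the end of the previous chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # end of file
                break
            pieces = (leftover + chunk).split(b"\n")
            # The last piece has no newline after it yet, so it may be incomplete.
            leftover = pieces.pop()
            yield from pieces
    if leftover:
        # The file did not end with a newline; emit the final line.
        yield leftover
```

A caller would loop with `for line in iter_lines_chunked("measurements.txt"):` and parse each line as it arrives; the file name here is also just a placeholder.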

Musab Nazir (May 08 2024 at 11:14):

@Hristo honestly, I haven't attempted a smaller-scale version of the file yet. Maybe I'll try that until we get File.readLine or something similar in the cli platform to help with large files.

Hristo (May 08 2024 at 11:16):

I think that'd be the better approach for the time being, yes :thumbs_up:

Anton (May 08 2024 at 11:35):

Just FYI, we have an issue for this: https://github.com/roc-lang/basic-cli/issues/205

Last updated: Jul 26 2025 at 12:14 UTC
