Hey all, I was looking to attempt the 1 billion rows challenge with Roc, and since the input for this problem is pretty large (14.8GB in my case) I need to stream in the data. I looked at the File.read and File.readBytes docs and it seems they attempt to read it all in one go. Maybe I'm missing something. Is there an example of parsing large files I could look at?
I know this isn't answering your question, but just out of curiosity - did you manage to get correct answers in Roc on smaller-scale instances of the problem (my understanding is that the input size is effectively user-controlled)?
I tried it some time ago, but ran into some floating-point rounding/truncation issues that were non-intuitive to me (I hadn't had the time to dive into the weeds), and I couldn't dedicate the time to look into them back then.
My plan was to eventually host a competition for the best Roc solution, which we could then showcase in the discussion section of the 1brc repository.
Yeah, we should probably have a way to open a file and then read it a chunk at a time (I did that in the false interpreter). It would also be really nice to have a way to just memory-map a file into a Roc list.
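For anyone following along, here is a rough sketch of the open-then-read-a-chunk idea. It is written in Python purely for illustration and is not the basic-cli API. The function name `iter_lines_chunked` and its `chunk_size` parameter are hypothetical. The only subtle part is carrying a partial line over from one chunk to the next, so that a line split across a chunk boundary is still yielded whole.

```python
from typing import Iterator


def iter_lines_chunked(path: str, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield complete lines (without the trailing newline) from a file,
    reading at most `chunk_size` bytes at a time.

    Hypothetical helper for illustration; not part of any Roc platform.
    """
    leftover = b""  # partial line carried over from the previous chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # end of file
                break
            buf = leftover + chunk
            lines = buf.split(b"\n")
            # The last piece may be an incomplete line; keep it for the next round.
            leftover = lines.pop()
            yield from lines
    # A final line with no trailing newline is still a line.
    if leftover:
        yield leftover
```

With something like this, memory use is bounded by the chunk size plus the longest line, rather than by the size of the file, which is the property a 14.8GB input needs.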
@Hristo honestly I haven't attempted a smaller scale version of the file yet. Maybe I'll try that until we get File.readLine in the cli platform or something similar to help with large files
I think that'd be the better approach for the time being, yes :thumbs_up:
Just fyi, we have an issue for this https://github.com/roc-lang/basic-cli/issues/205
Last updated: Jul 06 2025 at 12:14 UTC