Here is my partial solution for Day 7. I've got to the point where it parses the input and passes all my tests. I shouldn't have too much trouble implementing the next bit, but I would like to make my parsing more reliable. At the moment I have to manually append a newline character to the end of the input to make it work properly. I've had a lot of trouble trying to figure out how to skip and ignore something. Basically I want to consume/chomp a newline character if it's there, but if it's not then be happy. I could maybe write something using `buildPrimitiveParser`, but I'm not sure how to get it to play nicely with the applicative API. I would really appreciate any assistance here. I'm definitely missing a trick on how `const` and `apply` work together here, I think.
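One way to get the "chomp it if it's there" behaviour is a small `optional` combinator. This is only a sketch: `Parser`, `parsePartial`, `buildPrimitiveParser`, and `codeunit` here stand in for whatever your parser module actually exposes, so the names and signatures are assumptions to adapt to your code.

```roc
# Sketch only -- assumes parsePartial returns Ok { val, input } on success.
optional : Parser input a -> Parser input (Result a [NotFound])
optional = \parser ->
    buildPrimitiveParser \input ->
        when parsePartial parser input is
            Ok { val, input: rest } -> Ok { val: Ok val, input: rest }
            # If the inner parser fails, succeed anyway without consuming input
            Err _ -> Ok { val: Err NotFound, input }

# Chomp a trailing newline if present, otherwise succeed and consume nothing
maybeNewline = optional (codeunit '\n')
```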
You might consider, instead of saying that a line of input is valid if it ends in a newline, saying that your lines of input are valid if they are separated by newlines. For example: https://gist.github.com/ayazhafiz/f06a714f77a5736545c3b63fa942edf9#file-parser-roc-L152
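The "separated by newlines" idea can be expressed as a `sepBy`-style combinator built from `const` and `apply`. This is a sketch, not your module's actual API: `many`, `codeunit`, and the exact `Parser` type are assumptions.

```roc
# Sketch: items are *separated* by the separator, so no trailing
# separator is required at the end of the input.
sepBy : Parser input a, Parser input sep -> Parser input (List a)
sepBy = \item, separator ->
    # Parse a separator, throw its value away, keep the following item
    separatorThenItem =
        const (\_ -> \x -> x)
        |> apply separator
        |> apply item

    # One leading item, then zero or more (separator, item) pairs
    const (\first -> \rest -> List.prepend rest first)
    |> apply item
    |> apply (many separatorThenItem)

# Usage: parse the puzzle input as lines, with no trailing newline needed
# inputLines = sepBy lineParser (codeunit '\n')
```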
Here's my Day 7. I feel like I should memoize the calculation but computers are fast.
Here is my completed Day 7. That was more of a challenge. I had quite a few crashes and things to work around, and then plenty of minor logic bugs along the way. I also refactored along the way to significantly speed up the calculation so now it runs almost instantly. Pretty happy with the parsers and combinators too :big_smile:
I know it's probably not a fair comparison right now and the implementations are very different, but a bit of fun anyway... I've been comparing the speed of my Day 7 with mates who wrote it in Rust. My Roc implementation is sitting at 33.7ms, versus Rust's 2.2ms, to run the whole puzzle including reading the file, so about 15x slower at the moment. I'm pretty impressed with this! I imagine there are a lot of things that can be optimised yet in basic-cli. :smiley:
I would love to see some flamegraphs of the 2 apps to know where the time is being spent.
My Roc implementation is sitting at 33.7ms, versus Rust's 2.2ms, to run the whole puzzle including reading the file, so about 15x slower at the moment
Is this with both compilers optimizing the code?
Yes, I think so. It's the release version of Rust, and Roc using `roc build --optimize`.
Running it on my linux machine is a little slower
```
$ hyperfine --warmup 3 './app-aoc-2022-day-7'
Benchmark 1: ./app-aoc-2022-day-7
  Time (mean ± σ):     48.3 ms ±  0.2 ms    [User: 47.9 ms, System: 0.4 ms]
```
I'll try and figure out how to get a flame graph
You have a ton of allocations, so that is probably the main cost of your code in Roc.
Is that mostly from the parsing logic do you think?
Not really sure. My gut feeling is that it is related to the dicts of lists of strings, but it could be the parser. Maybe I should compare to a different day that just parses and does simple math
Oh, parser is definitely part of it. `List.sublist` called by the parser is pretty heavy.
Since we don't have seamless slices yet, all of those are allocations
My first attempt... how can I improve this?
Use the legacy linker, and maybe an unoptimized build, if you want the best flamegraph. Though unoptimized is not necessarily representative of the final graph.
I've been thinking the parser could just pass indexes around. Maybe that would be a lot faster?
With seamless slices, it should hopefully be approximately as fast as passing around indices, but yeah, that would be faster than calling `sublist` currently.
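Passing indices around could look something like the state record below. This is a sketch of the general technique, not basic-cli's or any existing parser package's actual API:

```roc
# Parser state that carries the original bytes plus an offset,
# instead of materialising a new sub-list on every step.
State : { bytes : List U8, offset : Nat }

# Peek at the current byte without allocating
peek : State -> Result U8 [OutOfBounds]
peek = \{ bytes, offset } -> List.get bytes offset

# Advance by n bytes -- just arithmetic, no List.sublist, no allocation
advance : State, Nat -> State
advance = \state, n -> { state & offset: state.offset + n }
```

The original `bytes` list is shared by reference counting, so every parser step is just offset arithmetic until you actually need to copy a value out.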
Also, `Dict.insert` is using a lot of the time in subfunctions. Not completely sure why. How big are these dictionaries? I think pretty small, so my guess is that the hashing is actually costing a solid amount, but that is just a guess.
Does the legacy linker work on linux atm?
```
$ roc build day7.roc --linker legacy

I was expecting this file to exist:

    /home/luke/.cache/roc/packages/github.com/roc-lang/basic-cli/releases/download/0.1.2/3bKbbmgtIfOyC6FviJ9o8F8xqKutmXgjCJx3bMfVTSo/linux-x86_64.o

However, it was not there!

If you have the platform's source code locally, you may be able to generate it by re-running this command with --prebuilt-platform=false
```
Oh, it doesn't work with the url published platform
Optimized flamegraph: flamegraph.svg
Unoptimized flamegraph: flamegraph.svg
Allocation count: 44060
haha... That's crazy.
lol, I think I need to figure out how to write Roc without all the allocations. But tbh, I'm not exactly sure where the line is: what makes an allocation and what doesn't? I feel like it is every time you create a new thing in memory. So if I can change my code to re-use the same structure, i.e. a record that gets passed around, then that would reduce the allocations?
Though most of the time malloc is giving us the same pointer over and over again. We allocate it and free it and then allocate it again...
The address 0x5644723eb9b0 got allocated 21633 times for me. Most of the time just with about 10 to 20 bytes of data.
Given the varying small sizes, I would assume most of the allocations are `Str` or `List U8`, but I am not sure... just a guess.
I should double check how I wrote dict to make sure it doesn't allocate unnecessarily when taking a key that is on the heap.
To clarify, is Issue #1361 what you mean by seamless slices? I guess this feature is still in the works?
So your breakdown seems to be about:
- Just parsing: 12ms, 20541 allocations
- Building directory list: 22ms, 18229 allocations
- Rest: 10ms, 5290 allocations
Yeah, that feature. It is decently well thought through at this point; someone just has to actually implement it and wire it through the compiler.
Looks really amazing for parsers! :drooling:
Now just imagine an essentially zero-allocation parser, because the lists and strings are all seamless slices just referencing the original strings/lists, allocating at most once for each final element going into your data structure.
Anyway, I definitely think I need to dig into your build-directory-listing function and what it calls, to figure out whether dicts have an issue with cloning keys and what the cost actually is here.
One immediate thought on general performance: can you reverse your cwd list? Append to it and drop last. Prepend and drop first are expensive.
Probably would be better to just reverse at the very end if needed.
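The suggestion amounts to keeping the path in append order. A sketch, assuming `cwd : List Str` and a `List.dropLast` that removes the final element (hypothetical helper names):

```roc
# Entering a directory: append to the end, which is cheap (amortised O(1))
pushDir : List Str, Str -> List Str
pushDir = \cwd, name -> List.append cwd name

# `cd ..`: drop the last element, also cheap. By contrast, List.dropFirst
# on a reversed-order list has to shift every remaining element.
popDir : List Str -> List Str
popDir = \cwd -> List.dropLast cwd
```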
Just gave it a shot, made no difference.
I definitely need to look into dictionary and see what is going on. I think there may be a referencing issue there.
Past that, it should help performance to change the key from `List Str` to `Str` for now. Just use a helper like: `makeKey = \path -> Str.joinWith path "/"`
Oh yeah, almost certainly a referencing bug with dictionary keys that leads to copies. That, or a referencing bug with hashing lists of reference-counted types that leads to copies. Changing the key type to `Str` just for `buildDirectoryListing` and its sub-functions cuts out ~15000 allocations.
Random aside: I have no idea why, but the program as a whole, even after resetting all of the code, is running about 2x slower for me. My computer must be throttling or doing something weird, but so far I can't figure out what.
Ok, I think I found the root issue, at least for the dictionary part. Small string hashing has terrible performance (it always allocates). All of your strings are small strings, so you have `List SmallStr`, which is probably the worst case for hashing in the compiler currently. That is why joining the strings is faster: it costs one allocation, but leads to way better hashing.
I think we may need a special hack for small string hashing if we want good performance here.
Anyway, new summary: with `Str` keys to the `Dict` for `buildDirectoryListing`, the time of that function goes from 22ms to about 3ms.
It's currently a `List Str`. What do you mean by `Str` keys?
newKey = Str.joinWith oldKey "/"
Luke Boswell said:
lol, I think I need to figure out how to write Roc without all the allocations. But tbh, I'm not exactly sure where the line is: what makes an allocation and what doesn't? I feel like it is every time you create a new thing in memory. So if I can change my code to re-use the same structure, i.e. a record that gets passed around, then that would reduce the allocations?
It's whenever you create a new thing whose type has variable size
For example, every value of `U64` is always the same size, but different values of `List U64` could be different sizes. That means `U64` does not allocate but `List U64` does.
The only 3 things in Roc that allocate are `List`, `Str`, and recursive tag unions (for example tree structures or linked lists).
Numbers, records, and non-recursive tags don't allocate.
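A few concrete examples of that rule (a sketch; the annotations are just for illustration):

```roc
x : U64
x = 42                        # fixed size: no heap allocation

point : { x : F64, y : F64 }
point = { x: 1.0, y: 2.0 }    # record of fixed-size fields: no allocation

nums : List U64
nums = [1, 2, 3]              # variable size: heap-allocated

# Recursive tag union: each Node is heap-allocated, like a tree/linked list
Tree : [Leaf, Node Tree Tree]
```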
The one exception to the rule is that we have a performance trick where we are able to skip allocation for "small" `Str` (smaller than 3 `Nat`s). In general this makes things faster, but it means that low-level code in the standard library needs special cases for small strings. And Brendan found out that the hashing code for this case isn't as fast as it could be.
I think I'll take a break for today, since I ran into 3 probable compiler bugs in a row :sweat_smile:
Man that was hard!
Catching up still :grinning: day7
Had to borrow some of your code @Luke Boswell (gave you a shout out in the comments) :grinning:
Thanks, your solution looks nice! :ok:
Last updated: Jul 06 2025 at 12:14 UTC