Stream: ideas

Topic: monomorphizing sets of numbers


view this post on Zulip Brian Hicks (Sep 15 2022 at 10:42):

In the same vein as Richard's topic about monomorphizing dictionaries to densely-packed arrays, it looks like there's a fast+small way to store sets of integers called roaring bitmaps. https://roaringbitmap.org/about/

basically the problem is that storing a bitmap for all (say) u64s requires 2^64 bits, or 2 exabytes buuuuuut you're unlikely to actually have 2^64 values, so you can store them sparsely. IIUC, roaring bitmaps does this by taking some bits (I think it's 16) off the front of the integer and using it as a lookup key.

view this post on Zulip Anton (Sep 16 2022 at 06:08):

Such a smart and simple idea :)

view this post on Zulip Brian Carroll (Sep 16 2022 at 06:26):

Yeah pretty cool! Reminds me of the bitvec crate that we use in a few places in the compiler. Basically a compact version of Vec<bool> that squeezes all the bits into bytes and does all the bit-masking logic for you.

view this post on Zulip Arya Elfren (Sep 16 2022 at 08:36):

I've been using this as a zig practice project!

This is a nice explanation https://vikramoberoi.com/a-primer-on-roaring-bitmaps-what-they-are-and-how-they-work/

Basically you keep the first 16 bits in an array next to a pointer to a "chunk". If the chunk has less than 4056 16-bit numbers it's also an array, if it's more it's a full bitset. Then there's an optional extension for another type of chunk that does run length encoding in case you have many ranges.

Also all arrays are sorted so lookup is a binary search.


Last updated: Jun 16 2026 at 16:19 UTC