Stream: ideas

Topic: Enum as List Indices


view this post on Zulip Brendan Hansknecht (May 22 2023 at 17:50):

What is the general thought of allowing using raw enums as list indices? This is to avoid the branching that would be required by awhen statement when implementing a state machine or similar.

current:

States : [
    Start,
    Proccessing,
    End,
]

Action : [
    A,
    B,
]

transitionTable : List (List State)
transitionTable = ...

currentState : State
currentState = ...

chosenAction : Action
chosenAction = ...

# These are the unwanted branches/indices functions
stateIndex = \state ->
    when state is
        Start -> 0
        Processing -> 1
        End -> 2

actionIndex = \action ->
    when action is
        A -> 0
        B -> 1

# to get the next state
when List.get transitionTable (stateIndex currentState) is
     Ok actionTable ->
        when List.get actionTable (actionIndex chosenAction) is
             Ok nextState ->

view this post on Zulip Brendan Hansknecht (May 22 2023 at 17:56):

The hope would be to enable using the State and Action enum directly as indices to avoid branches.
So the stateIndex and actionIndex function would go away.

view this post on Zulip Brendan Hansknecht (May 22 2023 at 17:57):

Assuming, the when in stateIndex and actionIndex was correctly in alphabetical order to match Roc internal choices, I would hope that llvm would optimize it correctly, but either way, I think it would be preferable to avoid the indexing function if possible.

view this post on Zulip Anton (May 22 2023 at 18:02):

It does seem like altering the enum could lead to completely different output and it then may not be clear that changing the enum is what caused the change.

view this post on Zulip Anton (May 22 2023 at 18:08):

It also feels a bit dirty for a pure fp language

view this post on Zulip Brendan Hansknecht (May 22 2023 at 18:09):

I think the best of both worlds may not be possible without larger changes. You would want a list or array type that could only be accessed by the enum type. That would enforce type safety in a very nice way. Then you couldn't access it incorrectly with just an arbitrary nat.

view this post on Zulip Brendan Hansknecht (May 22 2023 at 18:18):

As a note, I am looking at some godbolt output, and it looks like there are a few possibilities for generation with current roc:

  1. If actionIndex is defined in alphabetical order, it does look like llvm should figure it out an make it a null opt (kinda random detail for users to track if they care about max perf)
  2. If actionIndex is out of order and small, it will become an array lookup. So adds 1 extra array lookup, which is at least branchless.
  3. If actionIndex is out of order and large, it will be a true switch statement with branches.

view this post on Zulip Brendan Hansknecht (May 22 2023 at 18:18):

I am not sure where the exact line between small and large is.

view this post on Zulip Brendan Hansknecht (May 22 2023 at 18:19):

That said, it look like generally this will be branchless, so nevermind about the request I guess. Didn't realize that out of order and small would just be an array lookup instead of branching.

view this post on Zulip Brendan Hansknecht (May 22 2023 at 18:19):

So probably fine for many uses.

view this post on Zulip Richard Feldman (May 22 2023 at 18:24):

that's awesome, I did not realize it would optimize to that!

view this post on Zulip Kiryl Dziamura (May 22 2023 at 19:36):

I recently thought exactly about that! At least about enums as keys in a dictionary - the result is kind of structure with generic values, which is actually an array.
Anyway, in case of rust, it can be done either via hasher implementation (0 cost) for the enum, or using this crate (which, I guess, does the same)

https://crates.io/crates/enum-map

I hope it’s relevant

view this post on Zulip Brendan Hansknecht (May 22 2023 at 21:39):

Yeah, something like enum-map could probably be done in roc. Would need to make sure things are in alphabetical order to make it truly zero cost. Though, compared to hashing, any order would be fast.

view this post on Zulip Richard Feldman (May 22 2023 at 22:11):

I was actually thinking that we could have Dict optimize to that someday if the key is an enumeration

view this post on Zulip Richard Feldman (May 22 2023 at 22:11):

although then I guess you don't get the "insertion order" guarantee

view this post on Zulip Richard Feldman (May 22 2023 at 22:12):

but maybe we could redefine Dict's guarantee to be "we don't guarantee traversal in insertion order, but rather that order is independent of hashing function" :thinking:

view this post on Zulip Richard Feldman (May 22 2023 at 22:12):

although that could have unintended consequences I suppose

view this post on Zulip Brendan Hansknecht (May 22 2023 at 22:59):

I think there are a few options here. For Dict specifically, I would just modify the hashing function to be enum aware. In the case of an enum, don't hash at all, just identity map. We still would keep around all the extra info to make it track insertion order. That would honestly be pretty small, though not perfect overhead.

Note: the disadvantage of changing the hash function is that there may be more collisions. It depends on the use, but alphabetically close enum variants would collide. That said, assuming enums aren't too large, collisions shouldn't really be an issue

view this post on Zulip Brendan Hansknecht (May 22 2023 at 23:01):

If someone want perfect overhead, they could make an EnumMap type that wraps a list. On init, you have to pass in the count of the enum, and a function to go from enum variant to index. EnumMap would just be a list with the exact capacity for the enum variants. The function you define and pass into init needs to be alphabetical for max perf.

view this post on Zulip Brendan Hansknecht (May 22 2023 at 23:04):

Hmm, though if we have to store the function as a closure, the call probably won't get optimized away. So would be a call that does nothing and returns. So maybe minor overhead of an extra call instructions.

view this post on Zulip Brendan Hansknecht (May 22 2023 at 23:10):

I guess you could always avoid storing the closure and instead always pass it into EnumMap.get/set, that would almost certainly inline. Just is a worse API to use.


Last updated: Jun 16 2026 at 16:19 UTC