Stream: compiler development

Topic: Sort ability for Unions


view this post on Zulip Ben Plotke (Jul 21 2024 at 21:29):

I am working with @Jasper Woudenberg on implementing the Sort ability. We were wondering, when sorting Unions, what should the behavior be when the tags are different? Jasper suggested sorting by tag index, which corresponds to alphabetical order. I was thinking we should put empty tags first, but then sorting the rest alphabetically feels inconsistent to me. Does anyone have an opinion on this?

On a separate note, I was under the impression that we have a different back-end for dev builds, so I was surprised to see I could run roc test someCode.roc and the LLVM code gen I had written appeared to work. Does the dev build plug into LLVM? I am correct that we will only need to implement code gen for LLVM and WASM, but not the dev build?

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:36):

Tags maybe shouldn't autoderive sort?

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:36):

That or alphabetical is our general opinion of how to expose tags

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:37):

We have 2 dev backends. One for wasm and one for native

view this post on Zulip Ben Plotke (Jul 21 2024 at 21:37):

Not auto deriving sort would definitely be easier from an implementation perspective

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:37):

native dev is used in the repl by default

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:38):

wasm dev in the web repl

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:38):

otherwise, everything is llvm currently

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:38):

Also, for the ability, I think it should autoderive at the mono level. So no need to implement in any of the backends

view this post on Zulip Ben Plotke (Jul 21 2024 at 21:41):

@Brendan Hansknecht Thank you for taking the time to respond. Unfortunately I am now more confused. We only need to implement for LLVM, and all the other backends get it automatically?

view this post on Zulip Ben Plotke (Jul 21 2024 at 21:42):

Brendan Hansknecht said:

native dev is used in the repl by default

Does roc dev ... not run the dev build?

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:45):

nope. just enables expects and dbg statements. roc == roc dev

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:45):

Eventually it should, but we haven't wired everything up yet

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:47):

We only need to implement for LLVM, and all the other backends get it automatically?

We need to implement for mono and all backends get it. That is the level before all backends. Same way that Hash or Encode is implemented.

https://github.com/roc-lang/roc/tree/main/crates/compiler/derive/src

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 21:47):

I think that should work, but maybe I am missing something. I guess Eq doesn't use derive, so maybe it is special...not sure

view this post on Zulip Brian Carroll (Jul 21 2024 at 21:49):

There's a diagram of the compiler stages at the bottom of this readme
https://github.com/roc-lang/roc/blob/main/crates%2FREADME.md

view this post on Zulip Brian Carroll (Jul 21 2024 at 21:50):

Everything goes through mono before it splits into the dev / LLVM backends

view this post on Zulip Richard Feldman (Jul 21 2024 at 22:28):

Ben Plotke said:

I am working with Jasper Woudenberg on implementing the Sort ability. We were wondering, when sorting Unions, what should the behavior be when the tags are different? Jasper suggested sorting by tag index, which corresponds to alphabetical order. I was thinking we should put empty tags first, but then sorting the rest alphabetically feels inconsistent to me. Does anyone have an opinion on this?

:thinking: what would be the advantage of sorting empty tags first?

view this post on Zulip Richard Feldman (Jul 21 2024 at 22:29):

I expect the main use for giving Sort to a tag union would be that it would let you put it in a data structure that relies on knowing how things are ordered (e.g. a tree)

view this post on Zulip Richard Feldman (Jul 21 2024 at 22:29):

so the fastest thing would be beneficial there, and the fastest thing would be to go by index because it's an integer that is right there already! :big_smile:

view this post on Zulip Ayaz Hafiz (Jul 21 2024 at 23:32):

sorting by index would be really confusing

view this post on Zulip Ayaz Hafiz (Jul 21 2024 at 23:32):

that would result in two different sort derivations for [A U8, B U8] vs [B U8, A U8] even though those types are equivalent

view this post on Zulip Ayaz Hafiz (Jul 21 2024 at 23:33):

it gets worse because in the implementation the order is not guaranteed to be consistent across module boundaries

view this post on Zulip Ayaz Hafiz (Jul 21 2024 at 23:33):

the benefit of sorting alphabetically is that your program semantics also do not change if you refactor the order of your tag union

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 23:43):

Don't we sort tags to generate the index? (Otherwise, how would platforms use tags)

view this post on Zulip Brendan Hansknecht (Jul 21 2024 at 23:44):

Either the exception of null cases of executive types that get the null ptr directly

view this post on Zulip Richard Feldman (Jul 22 2024 at 01:31):

Ayaz Hafiz said:

the benefit of sorting alphabetically is that your program semantics also do not change if you refactor the order of your tag union

that's a great point!

view this post on Zulip Richard Feldman (Jul 22 2024 at 01:33):

Brendan Hansknecht said:

Don't we sort tags to generate the index? (Otherwise, how would platforms use tags)
Either the exception of null cases of executive types that get the null ptr directly

that's the default, but yeah like you noted - there are some optimizations we do where we no longer store a discriminant for one tag, which means the others get different indices because there's one missing compared to before

view this post on Zulip Richard Feldman (Jul 22 2024 at 01:34):

but the "refactoring tag payloads shouldn't change tag sort order in a distant part of the program" is a strong argument to me; that's enough to convince me we should do it strictly alphabetically

view this post on Zulip Brendan Hansknecht (Jul 22 2024 at 01:42):

Sounds good

view this post on Zulip Brendan Hansknecht (Jul 22 2024 at 01:43):

Also, alphabetically will only be minorly slower. It will be a switch (probably optimize to a lookup table or no-op) on the tag returning an alphabetic index

view this post on Zulip Ben Plotke (Jul 23 2024 at 04:44):

Thank you everyone for you responses! Alphabetical it is.


Last updated: Jul 06 2025 at 12:14 UTC