More approachable unit type convention · ideas

Kevin Gillette (Dec 22 2023 at 22:41):

Brendan Hansknecht said:

Yeah, it is convention. Could technically use Nothing or Unit as a tag with a single variant; there is also no data in those types.

A shameless re-gifting of Brendan's idea: why don't we use Nothing as the conventional unit type? Empty record, empty tuple, etc. have a certain amount of historical zero-size/one-possible-value cleverness about them, but they also require a certain amount of explanation.

Nothing is self-describing, and could be introduced in a tutorial as an Aside after covering tags. All that you'd need to know as a reader/learner is that there are no special tags whatsoever (they're all just tags), and optionally that tags are stored efficiently based on the number and nature of the variants. From there, "why Nothing?" is just a "shrug, why not?" answer.
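To make the "zero-size, one possible value" point concrete, here is a minimal sketch in Rust, used only as a stand-in since it has comparable unit-like types. The names Nothing, EmptyRecord, and answer are hypothetical and chosen for illustration; nothing here is Roc syntax or a claim about how Roc lays out tags.

```rust
use std::mem::size_of;

// Hypothetical single-variant enum standing in for a `Nothing` tag.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Nothing {
    Nothing,
}

// Hypothetical empty struct standing in for the empty record `{}`.
#[derive(Debug, PartialEq, Clone, Copy)]
struct EmptyRecord {}

// A thunk-style function: it takes a unit-like argument and ignores it.
fn answer(_unit: Nothing) -> u32 {
    42
}

fn main() {
    // Each of these types has exactly one possible value,
    // so no bytes are needed to tell values apart.
    println!("Nothing:     {} bytes", size_of::<Nothing>());
    println!("EmptyRecord: {} bytes", size_of::<EmptyRecord>());
    println!("():          {} bytes", size_of::<()>());

    let record = EmptyRecord {};
    println!("record = {:?}, answer = {}", record, answer(Nothing::Nothing));
}
```

All three spellings carry the same information (none), so the choice between them is about readability rather than representation.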

Kevin Gillette (Dec 22 2023 at 22:44):

The main counter I could see is a potential concern about Nothing showing up in tag unions, though since it is usually used as input for thunks or as the throwaway result for tasks like Stdout.line, it doesn't seem like it'd propagate too far via open tag unions.

Brendan Hansknecht (Dec 22 2023 at 22:47):

For me, I like {} cause it is super short and looks kinda like a function call or struct initialization in context. Dict.empty {} vs Dict.empty Nothing

Brendan Hansknecht (Dec 22 2023 at 22:48):

I guess I would potentially prefer empty tuple better, but idk Dict.empty ()

Kevin Gillette (Dec 22 2023 at 22:57):

Does empty tuple work? In the repl on the main site page, (1, 2) works but () does not. The tutorial also doesn't mention them at this time. Tuples are syntactically awkward, because (1) can't be a tuple without introducing ambiguity or magic.

Python has (1,) for single-element tuples, which is also awkward.
Roc could disallow single element tuples, but then should also disallow zero-element tuples for consistency.

Kevin Gillette (Dec 22 2023 at 23:01):

Side note: I wonder why Dict.empty is a function rather than just a value. As an immutable language, there shouldn't theoretically be any need to make a thunk-like wrapping just to initialize an empty data structure.

Brendan Hansknecht (Dec 22 2023 at 23:02):

Yeah, empty tuple doesn't work, but I think it could be nice to add.

Brendan Hansknecht (Dec 22 2023 at 23:04):

Also, Dict.empty being required to be a function is related to specialization and monomorphization. I would need to dig to find the doc.

@Ayaz Hafiz do you have an easy link to your "let specialization, let's not" doc?

Brendan Hansknecht (Dec 22 2023 at 23:05):

The base is that it makes the compiler type checking a lot more complicated because you now have a Dict.empty value that is trying to be used as many different concrete types.

Brendan Hansknecht (Dec 22 2023 at 23:06):

It is a Dict Str I32 and Dict Something Str and etc.
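As a rough analogy for the specialization concern, the Rust sketch below writes the empty constructor as a generic function, so each call site picks its own key and value types and gets its own specialized copy. The helper name empty is hypothetical; this illustrates monomorphization in general and is not a description of the Roc compiler.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical helper: an "empty" constructor written as a generic function.
// Each call site chooses K and V, and the compiler generates a separate
// specialized copy for every combination that is actually used.
fn empty<K: Hash + Eq, V>() -> HashMap<K, V> {
    HashMap::new()
}

fn main() {
    // The same `empty` is used at two unrelated concrete types.
    let mut ages: HashMap<String, i32> = empty();
    let mut labels: HashMap<u8, String> = empty();

    ages.insert("roc".to_string(), 5);
    labels.insert(1, "one".to_string());

    println!("ages: {}, labels: {}", ages.len(), labels.len());
}
```

A single shared value would have to be all of these types at once, which is the case the type checker has to handle specially; a function call gives every use its own instantiation.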

Kevin Gillette (Dec 22 2023 at 23:07):

Seems like a possible place for special casing, i.e. make some built-in thunky things look like values, but underneath rewrite them into functions if that's the path of least resistance for the implementation. The benefit is that an unnecessary implementation detail is kept out of the core modules.

Richard Feldman (Dec 22 2023 at 23:15):

Brendan Hansknecht said:

Yeah, empty tuple doesn't work, but I think it could be nice to add.

so that's how it's done in Rust, and also in Elm, but in Elm it's always bothered me a bit that both {} and () exist and there's a convention to always use () so I wanted to have only one in the language so there could be more of an obvious one way to do it

Richard Feldman (Dec 22 2023 at 23:16):

it was also obvious at the time to go with {} because back then we had records but not tuples

Brendan Hansknecht (Dec 22 2023 at 23:30):

Seems like a possible place for special casing, i.e. make some built-in thunky things look like values, but underneath rewrite them into functions if that's the path of least resistance for the implementation.

Yeah, that might be a way to go, but it only essentially affects creating empty data structures. So very minor gains. Also, if done accidentally, it could lead to allowing something to be two different types when it really should be a type mismatch. I think this is a case where the cost is so minor that we mostly aren't concerned, but some special complexity could be added if there ends up being enough demand.

Kevin Gillette (Dec 22 2023 at 23:39):

My thought is that the implementation may get sophisticated enough later to handle gradual type inference from a * state, or it might shift to a fundamentally different internal paradigm with different tradeoffs.

At some point we'll probably want to lock down the language to start ensuring compatibility. At that point, we'll have locked-in oddities that are solely there due to perhaps arbitrary implementation constraints rather than due to language design reasons.

Richard Feldman (Dec 22 2023 at 23:40):

also the creating data structures case is a temporary state of affairs

Richard Feldman (Dec 22 2023 at 23:41):

Dict.empty shouldn't need to be a function in the future

Brendan Hansknecht (Dec 22 2023 at 23:42):

Oh, we have a solution for that?

Brendan Hansknecht (Dec 22 2023 at 23:42):

I thought ayaz's doc was about wanting to keep that as a function

Richard Feldman (Dec 22 2023 at 23:48):

so there are some cases where we can be like "ok this is a variation on a builtin that we know" (e.g. a wrapper around List) so we can allow it

Brendan Hansknecht (Dec 22 2023 at 23:56):

Cool

Kevin Gillette (Dec 22 2023 at 23:59):

Would non-builtin data types (like opaque types that internally use the builtins) need to still jump through thunk hoops to provide an empty/zero value once the builtins are de-thunked, or will the de-thunking transitively benefit custom types as well?

Richard Feldman (Dec 23 2023 at 00:40):

Ayaz knows more than I do, but I think it's only builtins. Otherwise we end up with the "Let's Not" problem again.

Ayaz Hafiz (Dec 23 2023 at 00:48):

https://rwx.notion.site/Let-generalization-Let-s-not-742a3ab23ff742619129dcc848a271cf

Ayaz Hafiz (Dec 23 2023 at 00:50):

The problem is more than just implementation complexities, though restricting polymorphism does ease a lot of things. It's also that making

value : Result (Num *) Err
value = someComplicatedPolymorphicDecode "1" # I'm any number!

is a bad idea in the absence of at least constant evaluation because now you actually have N copies of these values, and they all must be re-evaluated each time they're called
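The re-evaluation cost can be sketched in Rust, where a "polymorphic value" has to be written as a generic function. The names complicated_decode, value, and DECODE_RUNS are hypothetical stand-ins, and the counter exists only to make the repeated work visible.

```rust
use std::str::FromStr;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times the (pretend) expensive decode actually runs.
static DECODE_RUNS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for `someComplicatedPolymorphicDecode`.
fn complicated_decode<T: FromStr>(input: &str) -> Result<T, T::Err> {
    DECODE_RUNS.fetch_add(1, Ordering::SeqCst);
    input.parse::<T>()
}

// The "value" is really a generic function: without constant evaluation,
// every use at every type runs the decode again.
fn value<T: FromStr>() -> Result<T, T::Err> {
    complicated_decode("1")
}

fn main() {
    let as_int = value::<i32>().unwrap();
    let as_float = value::<f64>().unwrap();
    let as_int_again = value::<i32>().unwrap();

    println!("{} {} {}", as_int, as_float, as_int_again);
    println!("decode ran {} times", DECODE_RUNS.load(Ordering::SeqCst));
}
```

Something that reads like one definition ends up doing the work once per use, which is easy to miss when the definition looks like a plain value.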

Ayaz Hafiz (Dec 23 2023 at 00:52):

Supporting Dict.empty = @Dict {} (or any other function defined in such a way that it consists only of literals) to be eligible for polymorphism would be pretty simple to add today - it's a purely syntactic check. It's just that right now, the syntactic check for whether something can be polymorphic only admits values that look like a number, or a lambda (\... -> ...)
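A purely syntactic check of this kind might look roughly like the Rust sketch below. The Expr type and both function names are hypothetical; this is a toy model over a made-up expression tree, not the Roc compiler's AST or its actual rule.

```rust
// A toy expression tree; hypothetical, not the Roc compiler's real AST.
enum Expr {
    Num(i64),
    Str(String),
    // A lambda: parameter names plus a body.
    Lambda(Vec<String>, Box<Expr>),
    // A record literal such as `{}` or `{ a: 1 }`.
    Record(Vec<(String, Expr)>),
    // An opaque wrapper such as `@Dict {}`.
    Opaque(String, Box<Expr>),
    // A function call, e.g. `decode "1"`.
    Call(Box<Expr>, Vec<Expr>),
    // A reference to some other definition.
    Var(String),
}

// The rule as described in the thread: only numbers and lambdas qualify.
fn admits_today(expr: &Expr) -> bool {
    matches!(expr, Expr::Num(_) | Expr::Lambda(_, _))
}

// One possible extension: also admit expressions built purely from literals.
fn admits_literals(expr: &Expr) -> bool {
    match expr {
        Expr::Num(_) | Expr::Str(_) | Expr::Lambda(_, _) => true,
        Expr::Record(fields) => fields.iter().all(|(_, value)| admits_literals(value)),
        Expr::Opaque(_, inner) => admits_literals(inner),
        // Calls and variable references may do arbitrary work, so reject them.
        Expr::Call(_, _) | Expr::Var(_) => false,
    }
}

fn main() {
    // Models `@Dict {}`.
    let dict_empty = Expr::Opaque("Dict".to_string(), Box::new(Expr::Record(vec![])));
    // Models `decode "1"`.
    let decode = Expr::Call(
        Box::new(Expr::Var("decode".to_string())),
        vec![Expr::Str("1".to_string())],
    );

    println!("@Dict {{}}: today={}, extended={}", admits_today(&dict_empty), admits_literals(&dict_empty));
    println!("decode \"1\": today={}, extended={}", admits_today(&decode), admits_literals(&decode));
}
```

Because the check only looks at the shape of the expression, it never needs to run or type-check anything, which is why it stays cheap and predictable.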

Kevin Gillette (Dec 23 2023 at 00:58):

That makes sense. Polymorphic decodes, even with explicit hinting (like Rust's into stuff) are a bit magical. I could see a language like Idris doing that, but it does seem like that kind of thing is ruled out as a non-goal for Roc.

Kevin Gillette (Dec 23 2023 at 01:05):

Though anything like [] |> List.append 5 or Dict.empty |> Dict.insert 1 2, or MyTree.empty |> MyTree.insert "a" "b" are all something I'd eventually hope would work without thunks for the empty initializers.

It's gradual type inference/specialization, but something that theoretically should be deterministic at compile time.

Richard Feldman (Dec 23 2023 at 02:22):

Ayaz Hafiz said:

Supporting Dict.empty = @Dict {} (or any other function defined in such a way that it consists only of literals) to be eligible for polymorphism would be pretty simple to add today - it's a purely syntactic check.

interesting! could this be a good first issue for someone new to the compiler, given a write-up of how to do it?

Ayaz Hafiz (Dec 23 2023 at 03:33):

Yeah

Eli Dowling (Dec 23 2023 at 13:37):

I would be a big fan of using () over {} and either over Nothing. Anything that significantly increases line length is worth avoiding IMO, more length means more breaks, more breaks means less code on screen.

Also I think () is better just because every other language I'm aware of uses it as the "unit" type and I think for something where it doesn't matter it's worth following convention.

Kevin Gillette (Dec 23 2023 at 15:11):

What would be the one-tuple syntax though?

Sky Rose (Dec 23 2023 at 15:28):

There doesn't have to be one. () isn't being used as a 0-length tuple, it's being used as an arbitrary value, and we don't have any need for 0- or 1-length tuples.

Kevin Gillette (Dec 23 2023 at 17:31):

I suppose the concern I've got is the question of internal consistency.

(1, 2) is the syntax for introducing tuples, thus () would most likely be interpreted as a zero-element tuple.
Tuples in Roc and similar FP languages would tend to be a bit rigid, and so we'd probably never use 1-element tuples even if there was a supported literal form for them. i.e. Roc tuples aren't a kind of list, as they would be in a language like Python.
If 2+ element tuples are meaningful, but 1 element tuples are not, then it seems like we shouldn't have zero-element tuples. Specifically, it's conceptually awkward to have a large range of supported tuple lengths with an exclusion in the middle of that range.
It's absolutely meaningful to have single-member records, because of open records. I'm not aware of anything like an equivalent open tuple concept, but if it did exist, I imagine it would lead to bad practice.
If the main desire behind () is alignment with other languages, there are many things in Roc that intentionally deviate. It could've been syntax-identical to Elm, for example (as NodeJS is to browser JS). Or we could have curly braces to delimit scope, and parens to wrap function params and calls: that would be the most approachable to the largest group of programmers. To them, both () and {} will be equally obscure. My point is that in terms of approachability, the difference between () and {} is probably negligible.
We could say () is _not_ a tuple, but instead just a new token, value, and type (perhaps called unit). That would remove the awkwardness of having 0- and 2-tuples but not 1-tuples. However, we'd still have {} either way, and would've introduced something new to the language that was already fulfilled by something else.

As such, I believe {} is the better choice (it introduces fewer special cases to the language).

Kevin Gillette (Dec 23 2023 at 17:43):

As an exception, if tuples just desugared to records (e.g. if (A, B) were equivalent to {f1: A, f2: B}), then {} and () would be identical, and the choice would distill down to just a stylistic convention. It also wouldn't matter if 1-tuples could be constructed, because they'd just be records anyway.

Brendan Hansknecht (Dec 23 2023 at 19:21):

Tuples do desugar to records

Brendan Hansknecht (Dec 23 2023 at 19:21):

Just with special number fields

Brendan Hansknecht (Dec 23 2023 at 19:22):

That is why tuple.0 works

Kevin Gillette (Dec 23 2023 at 19:23):

I missed that discussion. Does List.map .0 work to get the first element of each tuple, or does .0 parse as a Frac literal?

Kevin Gillette (Dec 23 2023 at 19:25):

And can number-fielded records be constructed using record syntax, and have a mix of number and non-number fields?

Brendan Hansknecht (Dec 23 2023 at 19:29):

Will get the first element of a tuple

Brendan Hansknecht (Dec 23 2023 at 19:29):

They cannot be constructed with record syntax. They are special to tuples.

Kevin Gillette (Dec 23 2023 at 19:30):

Thanks

Kevin Gillette (Dec 23 2023 at 19:31):

The "desugaring to records" means I don't really have much of an opinion on () vs {}

Richard Feldman (Dec 23 2023 at 19:53):

oh actually we ended up going with a separate (but still extensible) type for tuples, which is what is currently implemented

Richard Feldman (Dec 23 2023 at 19:53):

the plan for a while was to do the "records but with numbers for fields" design but it ended up changing before the implementation

Kevin Gillette (Dec 23 2023 at 23:58):

@Richard Feldman is this the thread corresponding to what ended up being selected for the language?

Brendan Hansknecht (Dec 23 2023 at 23:59):

Oh, I guess I missed that change...oops

Richard Feldman (Dec 24 2023 at 00:01):

yep!

Last updated: Jul 23 2026 at 13:15 UTC