Wanted to say hello as I am totally new here :)
Amazing work on both the design and the implementation of the language! I love the output of the compiler, and when it knows what is going on, it is as helpful as documentation (and sometimes more!)
Can't wait to contribute!
I have one improvement idea when it comes to abilities I wanted to share, but first I should explain what improvement area I have in mind.
Suppose we have a trait Summary with method summarize() in Rust.
If I now have a struct Person that implements Summary, and an instance of that struct named person, we can do both:
person.first_name
person.summarize()
In other words, we are able to use both dot notation and trait methods on the SAME type. This works the same way in Haskell (with the OverloadedRecordDot extension), Java and many other languages. In my opinion, this is very good for general language ergonomics.
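For readers who want the Rust side of this comparison spelled out, here is a minimal sketch. The field names and the body of summarize() are assumptions made for illustration; only the names Summary, Person, summarize and first_name come from the message above.

```rust
// Trait with a single method, as described in the message.
trait Summary {
    fn summarize(&self) -> String;
}

// Plain struct: its fields are reachable with dot notation.
struct Person {
    first_name: String,
    last_name: String,
}

impl Person {
    // Hypothetical convenience constructor, not part of the original example.
    fn new(first_name: &str, last_name: &str) -> Self {
        Person {
            first_name: first_name.to_string(),
            last_name: last_name.to_string(),
        }
    }
}

// The same type also implements the trait, so both
// `person.first_name` and `person.summarize()` are available.
impl Summary for Person {
    fn summarize(&self) -> String {
        format!("{} {}", self.first_name, self.last_name)
    }
}
```

With a value built by Person::new("John", "Smith"), both person.first_name and person.summarize() work on the same type, which is the ergonomic point being made.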
Currently, in Roc, one would need to choose: you can either have dot notation (if you use the records directly), or be able to define abilities on your records (if you wrap the record with an opaque type). I think that having to choose between these two is a bit unfortunate. What is more important for your project: easier use of records or possibility to define abilities for them?
Like, one can either start with a simple type alias Person : { firstName : Str, lastName : Str }, and then when Json serialization or something else comes in, one can add a wrapper PersonW := Person implements [...].
But should one then use Person in most functions and do jsonEncode (@PersonW person), summarize (@PersonW person) or rather pass PersonW and then do something like person = unwrap personW, followed by person.firstName?
Also, accessing wrapped and unwrapped records is syntactically different, so whatever decision one makes, the existing code would need to be re-worked if one changes their mind about this.
Some time ago, I was considering Purescript for part of my PhD project, but decided to go with a different language because I really didn't like this choice. Not sure how the situation is in Purescript now, but back then it was a very similar choice one had to face there.
I know that Roc is super new, and abilities have just recently been introduced, but at the same time I think that it could be good to discuss these things early rather than later.
I have a concrete improvement idea in this area, and if it worked, it would also have another added benefit, and Roc would get yet another feature that no other language I know of has :)
But before jumping into the specific idea, I would like to hear whether my current understanding of the situation is more or less correct, and if yes, whether it would be interesting to discuss ideas improving this specific area, obviously, as long as they align with the language design goals, etc.?
But before jumping into the specific idea, I would like to hear whether my current understanding of the situation is more or less correct, and if yes, whether it would be interesting to discuss ideas improving this specific area, obviously, as long as they align with the language design goals, etc.?
I think your understanding is correct, and also it's always fair game to discuss ideas in #ideas - that's what the channel is for! :big_smile:
Paul Kapustin said:
Like, one can either start with a simple type alias Person : { firstName : Str, lastName : Str }, and then when Json serialization or something else comes in, one can add a wrapper PersonW := Person implements [...].
But should one then use Person in most functions and do jsonEncode (@PersonW person), summarize (@PersonW person) or rather pass PersonW and then do something like person = unwrap personW, followed by person.firstName?
Also, accessing wrapped and unwrapped records is syntactically different, so whatever decision one makes, the existing code would need to be re-worked if one changes their mind about this.
This is correct, but I think we have to think about larger projects and best practices whenever we dive into a discussion like this. This ties into a mix of issues (none of which I am claiming to have answers on), but many of which I have some perspective on from working in the Roc community for quite a while. I am going to try and organize this with a nice flow, but there is potentially a lot to comment on.
Firstly, namespacing. Roc chooses to anchor namespacing to the module level. If people are important to your project, you will probably have a Person.roc file. For any use of people outside of that file, you will be importing that file. In that file you will have a Person record. There is a chance that Person will be just a type alias, but it may also be an opaque type. For example, maybe Person loads certain private information that you don't want to be accessible from any part of the program. Then it would be an opaque type to begin with. For this example, let's assume nothing is private. Instead, you just have some mundane fields that are fine to access anywhere in the program (name, date of birth, employer, etc). As such, when importing and using a person, you would import the Person.Person type. When you have a person, you can just grab any of the fields due to Person being a record. For common functions on a Person, you would use supporting functions in the Person module. Instead of writing myPerson.calculateAge, you would write Person.calculateAge myPerson. Though written differently, this essentially reads the same and has a namespace that should enable any sort of autocomplete in editors.
Secondly, serialization. I am specifically just talking about encoding/decoding before looking at other abilities. When it comes to serialization, though the first thought is to just directly serialize the Person type, it is almost always a bad idea. Directly serializing the Person type is an easy way to introduce bugs. At some point, the type will get changed. Someone who doesn't realize that you are serializing people and sending them to the frontend will easily break an entire frontend with a minor change. As such, serialization should be done with a separate opaque type (potentially multiple of them if you need it for versioning reasons). Then, when someone modifies the Person type, they will get a type mismatch when trying to serialize. Instead of breaking the frontend, it will be caught at compile time. On top of that, it can protect from accidentally leaking private information and many issues of that nature. With all of that, how do you serialize a Person? Probably should just write a wrapper function. Then just call Person.encode myPerson. It isn't quite that clean cut all of the time, but it is quite simple to build the needed wrappers for whatever type you need to serialize. Same with deserialization. To all of the code outside of those serialization functions, Person is still the easy to use record type alias.
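The "separate type for serialization" idea can be sketched in Rust, the language used for comparison earlier in this thread. This is an analogy under stated assumptions, not Roc code: PersonV1, to_wire, encode and the date_of_birth field are hypothetical names, and the hand-written JSON string does no escaping.

```rust
// Domain type used throughout the program (fields are illustrative).
struct Person {
    first_name: String,
    last_name: String,
    date_of_birth: String, // kept off the wire on purpose
}

// Separate wire type: the only shape the encoder knows about.
struct PersonV1 {
    first_name: String,
    last_name: String,
}

// The single place where Person is mapped to its wire format.
// Destructuring without `..` names every field, so adding, removing,
// or renaming a field of Person is a compile error here.
fn to_wire(person: &Person) -> PersonV1 {
    let Person {
        first_name,
        last_name,
        date_of_birth: _,
    } = person;
    PersonV1 {
        first_name: first_name.clone(),
        last_name: last_name.clone(),
    }
}

// Toy encoder: no escaping, just enough to show the shape of the output.
fn encode(wire: &PersonV1) -> String {
    format!(
        "{{\"firstName\":\"{}\",\"lastName\":\"{}\"}}",
        wire.first_name, wire.last_name
    )
}
```

Code outside these two functions keeps using Person as a plain struct; only to_wire has to be revisited when the domain type changes.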
Lastly, abilities in general. I agree that abilities make types more complex to work with. I think this is partially intentional. Generally speaking, abilities are a special case that should not be used. There is a very limited subset of times when they should be depended on (like serialization). Roc is a language that pushes away from tons of abilities, complex types, and higher kinded types (which are actually totally impossible in Roc).
That said, if you need to work with abilities, it is best to do it in the module where the opaque type is defined. Abilities are no big deal to work with. It is just a minor change to the start and end of functions.
Without ability:
myPersonModifierFunc = \person, data ->
...
...
modifiedPerson
With ability (and thus as an opaque type):
myPersonModifierFunc = \@Person person, data ->
...
...
@Person modifiedPerson
As such, opaque types and abilities try to push for good encapsulation at the module level. As long as a good module api is built, they are generally hidden from end users.
Anyway, hopefully this gives at least some context on roughly how I think Roc views abilities (and, to a lesser extent, how it views opaque types). Not to discourage any of your ideas. Please post them. We would love to make the language nicer and reduce overall friction. Just trying to give background and context on the current landscape.
Oh, also, if you want a case study of a reasonably complex opaque type with a decent bit of use, look at Dict in the standard library. Its impl should be able to show the many potential pitfalls and potentially give some concrete examples to talk about.
Thanks a lot for a detailed intro. I think I agree with a lot of this, especially trying to encourage encapsulation on the module level.
And yes, speaking of specific code, if I have a person record in scope, I can easily do Person.encode person, rather than encode person. However, the true power of interfaces, type classes, and abilities is in helping us to write generic code, right?
Suppose I am working on some game where I need to render a scene, and currently I need to write a generic function that renders a list of items that can be part of the scene. I don't know the type of items upfront, as it should work with all kinds of renderable items. As Roc supports abilities, this seems a straightforward choice: I can use a Renderable ability to constrain the types that can be rendered. But if I can't define abilities for records / tags, how can I explain to Roc how to render those?
Also, suppose a user of this library has Scene structure composed of records and tags. One day, they want to change the way all the cars are rendered (and Car is now a type alias for a record used as part of the Scene). So, now the user needs to use an opaque type instead of a type alias, so that Renderable ability can be implemented for Car. But this means that a lot of the code working with the Scene types will now need to be changed, and using dot syntax with these records is no longer possible.
Another example: suppose I have some structures of records that I need to compare (possibly for search or writing tests). Let's take lists for simplicity. Now, everything works until I figure that default comparison isn't good enough for me for some reason. So, in Roc we already have the Eq ability that I should use to define how things are compared.
But what if my item is a record or a tag? They don't allow implementing abilities. Then, instead of doing [item1A, item1B] == [item2A, item2B] I might define an opaque type Item and try to do [@Item item1A, @Item item1B] == [@Item item2A, @Item item2B].
However, this is a bit cumbersome and also error-prone: for example, if I forget to use the Item wrapper, my comparison is going to behave differently at runtime, using the default comparison.
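The same pitfall can be shown with a Rust newtype, as a rough analogy to a Roc opaque wrapper. Everything here is hypothetical: the Item wrapper, the case-insensitive comparison rule, and the helper names are chosen only to make the difference visible.

```rust
// Wrapper whose only job is to carry a custom notion of equality.
struct Item(String);

impl PartialEq for Item {
    // Assumed custom rule for the sketch: compare ignoring ASCII case.
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

// Wrap every element so the custom comparison is used.
fn wrap(items: &[&str]) -> Vec<Item> {
    items.iter().map(|s| Item(s.to_string())).collect()
}

// Comparison through the wrapper: uses the custom rule.
fn wrapped_equal(a: &[&str], b: &[&str]) -> bool {
    wrap(a) == wrap(b)
}

// Comparison on the raw values: silently falls back to the default rule.
fn unwrapped_equal(a: &[&str], b: &[&str]) -> bool {
    a == b
}
```

Both functions type check, which is exactly the problem described above: forgetting the wrapper does not produce an error, it just changes the answer.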
Very draft proposal idea: basically introduce a different kind of wrappers that would be "transparent" rather than "opaque". These wrappers would essentially bring more of the nominal typing to Roc, while still being extensible (just as opaque wrappers already are, but allowing for transparent access of the underlying data).
To keep terms simple for the programmer, we could maybe call these named records and named tags.
Possible syntax to define the types (note " " contrasted to ":" or ":=")
Person {firstName : Str, lastName: Str}
Person a {firstName : Str, lastName : Str}a
Color [Red, Green, Blue]
Color a [Red, Green, Blue]a
Possible syntax for constructing the values:
Person {firstName: "John", lastName: "Smith"}
Color.Blue
Advantages:
- {id: "some-person-id"} and {id: "some-session-id"} are not mixed up (as now these would be Person {id: "some-person-id"} and Session {id: "some-session-id"}). Or, even more importantly, so that tag Valid for sessions does not get mixed up with tag Valid for credit cards (as now these would be SessionStatus.Valid and CreditCardStatus.Valid).
- Person is now not a type alias but a "first-class" type that you can define abilities for.
Person a {firstName : Str, lastName : Str}a
fullName : Person * -> Str
fullName = \user ->
"\(user.firstName) \(user.lastName)"
Disadvantages
There is a possible alternative here: remove anonymous types, keeping only the named types (as these are still extensible). In this case the user doesn't need to choose, but then we would be removing a very interesting and distinctive feature from the language.
So, I wanted to share these draft ideas even if it turns out they don't make any sense. I have no experience in either programming language theory, type theory, or language implementation, so please don't be too harsh in your feedback / judgements :)
If this looks interesting, I am of course happy to work on a draft proposal for this together with others who are interested. And in any case I would really like to hear some other ideas that could help improve the situation that I tried to describe.
Thanks!
So there is a very important missing piece of information. I am going to just address that for now. So all of my comments essentially relate to:
Suppose I am working on some game where I need to render a scene, and currently I need to write a generic function that renders a list of items that can be part of the scene. I don't know the type of items upfront, as it should work with all kinds of renderable items. As Roc supports abilities, this seems a straightforward choice: I can use a Renderable ability to constrain the types that can be rendered. But if I can't define abilities for records / tags, how can I explain to Roc how to render those?
Abilities are only a compile time concept. Let's say we have a Renderable ability. At runtime, there is no such thing as a Renderable. As such, it is not possible to have a list of various Renderables. It is possible to have a list of Person and for Person to be required to have Renderable.
Roc will not enable dynamic dispatch such that we could have a list of items where you don't know the type upfront.
Instead, Roc would use tag enums in a situation like this.
We would have a list of:
Renderable : [ Person PersonData, Animal AnimalData, Object ObjectData ]
This is a tagged union that is able to distinguish between the various types of Renderable at runtime.
In this option, we would define a single render function that matches on the tag to decide how to render each type.
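As a rough illustration of "one render function that matches on the tag", here is the same shape written as a Rust enum. It is only an analogy to the Roc tag union above: the payload fields and the output strings are assumptions made for the sketch.

```rust
// Hypothetical payloads for each kind of renderable item.
struct PersonData {
    name: String,
}
struct AnimalData {
    species: String,
}
struct ObjectData {
    label: String,
}

// Tagged union: the variant identifies the kind of item at runtime.
enum Renderable {
    Person(PersonData),
    Animal(AnimalData),
    Object(ObjectData),
}

// A single render function that dispatches on the tag.
fn render(item: &Renderable) -> String {
    match item {
        Renderable::Person(p) => format!("person:{}", p.name),
        Renderable::Animal(a) => format!("animal:{}", a.species),
        Renderable::Object(o) => format!("object:{}", o.label),
    }
}

// A mixed list is fine, because every element has the same type.
fn render_all(items: &[Renderable]) -> Vec<String> {
    items.iter().map(render).collect()
}
```

The list is heterogeneous in content but homogeneous in type, so no dynamic dispatch is needed.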
Now abilities could still be used in a similar situation to this, in a more struct of arrays form though. We could have 3 separate arrays: persons, animals, objects. These three arrays might hold the opaque types that implement the Renderable. Then we could do List.map list render to each of persons, animals, objects. If they did not include the opaque type but instead they had the raw type, we have a few options:
- List.map persons Person.render
- List.map persons \person -> person |> @Person |> render. That wrapping is just specifying the opaque type, and if there is no data layout change, it is free.
- render persons (\person -> @Person person). This render function would convert each Person to the opaque type as needed, then use it as a Renderable from that point forward.
Thanks, this is very useful info!
However, I think my example was confusing. I was actually not thinking about dynamic dispatch at all (even though that was probably suggested by the idea of rendering scenes that often are pretty dynamic).
I thought about a completely static scenario, for example something like this.
Location : {x : U32, y : U32}
Color : [Red, Green, Blue]
Size : [Small, Medium, Large]
Car : {size : Size, color : Color, location : Location}
RoomData : {cars : List Car}
Scene : [Room RoomData]
Here one might want to use abilities to define or customize how certain types should be rendered. Another similar example would be encoding a type like this to JSON and wanting to customize the serialization of some of the types in the structure.
Sorry for the confusion, I was only thinking about this: defining the right behavior for different types using abilities. The type will be known at compile time, but will not be known to the author of the generic function that is rendering a list of renderable items or encoding a list of serializable items to JSON.
I think my suggested use of abilities for these cases is very similar to the use of type classes in Haskell (like ToJson), and, AFAIK, traits in Rust.
Yeah, I guess the rest of the proposal is more general.
Really it is about having transparent types, which could make a lot of sense as an option.
I guess in Elm land, the standard would be that you define separate render functions for each type. I think that would also be the suggestion in current Roc.
So renderLocation, renderColor, etc. Then you could just pass that function into any generic rendering related function.
Given the type isn't opaque, the renderLocation etc functions can be defined anywhere (maybe with scene and the rest of the rendering code).
Obviously not the same level of automatic genericness, but still trivial to make generic.
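The "pass the render function in" approach looks roughly like this when written in Rust, used here only as an analogy to the Roc suggestion. render_list and the output formats are hypothetical; Location and Color mirror the types from the example above.

```rust
struct Location {
    x: u32,
    y: u32,
}

enum Color {
    Red,
    Green,
    Blue,
}

// One plain function per type, defined wherever is convenient.
fn render_location(loc: &Location) -> String {
    format!("({}, {})", loc.x, loc.y)
}

fn render_color(color: &Color) -> String {
    let name = match color {
        Color::Red => "red",
        Color::Green => "green",
        Color::Blue => "blue",
    };
    name.to_string()
}

// Generic over the item type: the caller supplies how one item is rendered,
// so no trait or ability is required on T.
fn render_list<T>(items: &[T], render_item: impl Fn(&T) -> String) -> String {
    items
        .iter()
        .map(|item| render_item(item))
        .collect::<Vec<String>>()
        .join(", ")
}
```

The generic function never needs to know the item type; the trade-off is that the caller names the renderer at every call site.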
That said, there are likely multiple less convenient cases: Eq as you mention, Hash as well. Inspect or anything for logging likely as well. Really you often want those to take any type. And preferably be easy to modify on any type even if it isn't opaque.
My current sentiment is:
We probably don't want to implement this idea currently, but we should save it as a possibility if it turns out to be needed.
Reasons for this sentiment:
- renderPerson rather than just render, but it is no extra code. Either way, we have to write the renderPerson function.
- person |> Person.toRenderable |> render
Note: I am not the BDFL, so please feel free to ignore my sentiment and keep discussing (or even better, take it and explain why we really do need this).
@Brendan Hansknecht You are not BDFL, but a mighty important lever that is lifting Roc up!
Abilities are a last resort that we don't want to be used regularly. Having transparent types may make abilities easy to sprawl across a codebase. In general, we don't want them to be common.
Why does it have to be like this? Why do we not want them to be used more generally in places where they are a good fit?
I mean, I do understand that we don't want to go in the direction of the higher kinded things like Functor and Monad, as that raises abstraction level and complexity a lot. But there are plenty of very simple and very useful things, as you mentioned, like Eq, Hash, Inspect, Log, etc. These are really simple to understand, use and are very expressive, so in my opinion I don't see why it is bad if a user defines Eq for their type, allowing them to write code in an expressive and more direct way. I think people are very used to these interfaces as they are present in so many modern languages.
Elm doesn't even have abilities and manages this problem fine
I have two thoughts regarding this:
Eq! I can define it for my type! Oh no, actually I can't as then I lose dot-syntax".
Somewhat unrelated, I think that having named, or transparent types for tags would be very useful for the reasons of type safety. Suddenly you have a bug at runtime where you treat an expired credit card as an expired session due to the same tag Expired being used for both. This is actually a new phenomenon in Roc, as neither Elm, Purescript, Haskell, nor Rust has structurally typed sum types.
I think these anonymous sum types are really cool, but in my opinion this is often too little type safety, and it would be nice to give people some choice. Otherwise, with anonymous tags this becomes a bit similar to atoms in Erlang / Elixir, where there are things like :ok and :error everywhere.
Like, if any function returns Ok in Roc, this Ok would be type-compatible with any other Ok in Roc, right?
Unless two tag unions have all the same variants and fields within each variant, they won't type check.
So yes, could potentially hit issues, but essentially unlikely for most types
Why does it have to be like this? Why do we not want them to be used more generally in places where they are a good fit?
I don't think it has to be like this. I think it is more that abilities were added to solve a small handful of specific problems. If those specific problems didn't exist, we wouldn't have abilities. So in general, we want to try and keep them small in scope.
But there are plenty of very simple and very useful things, as you mentioned, like Eq, Hash, Inspect, Log, etc.
This is where I think the biggest open potential debate is. Roc has decided to bless a handful of abilities that are put directly in the standard library. If you mess with Eq or Hash, you will notice that they just work. It doesn't matter that something is a type alias. Everything by default implements Eq and Hash (maybe not floats?).
This gives us something that works in essentially all cases. For the ultra important abilities, they are built into the standard library, and magically on all types. This means that there would be no need for transparent types except in the rare case that the auto derived behavior is not what is wanted.
I'm not sure this is the right decision, but I think it helps to clarify how abilities are carved into the language just enough to fill a need and not necessarily set up to be general.
Unless two tag unions have all the same variants and fields within each variant, they won't type check.
You are right, I missed that normally if one uses type aliases everywhere, this would be rare.
But, without named types, there isn't really such a thing as the tag union type, right? In the sense that there are really only lists of variants.
I mean that this code compiles, and I am able to use the output of expireSession in checkCardStatus even though CardStatus and SessionStatus do not have all the same variants. And I think that examples like this don't have to be that rare (btw sorry for random use of strings and return values in this example).
Date := {day : Str}
CardStatus : [Invalid, Valid Date, Expired, Empty]
SessionStatus : [Current, Expired]
expireSession : Str -> [Expired]
expireSession = \_ -> Expired
checkCardStatus : CardStatus -> Str
checkCardStatus = \_ -> "Valid"
main =
session = expireSession "Session"
status = checkCardStatus session
Stdout.line ("Card status: \(status)")
This gives us something that works in essentially all cases
Is there any vision regarding what kind of things should be "blessed" and auto-derived in the standard library? And what about JSON? How do you customize the way your records and tags are serialized?
This gives us something that works in essentially all cases. For the ultra important abilities, they are built into the standard library, and magically on all types. This means that there would be no need for transparent types except in the rare case that the auto derived behavior is not what is wanted.
I'm not sure this is the right decision, but I think it helps to clarify how abilities are carved into the language just enough to fill a need and not necessarily set up to be general.
Got it, appreciate the explanations a lot.
Sure, this is one way to go about it, but I am kind of hoping that we can do even better than that, hence my proposal :smile:
Like, if we managed the named types, Roc would truly Roc in my opinion (in the sense that I don't know other languages that give you extensible but still sort of nominal types, especially in addition to anonymous types, while still being super-practical and simple to use).
Do you agree that named types / transparent wrappers would add some fairly nice benefits?
Do you see any significant downsides to introducing them?
I mean, obviously this would need proper proposal work first, if it looks interesting at all.
I mean that this code compiles, and I am able to use the output of expireSession in checkCardStatus even though CardStatus and SessionStatus do not have all the same variants.
The code compiles for 2 reasons:
1. expireSession is typed wrong. Should be expireSession : Str -> SessionStatus
2. [Expired] can seamlessly expand into either CardStatus or SessionStatus
Related to 2, note that [Expired] can't expand into both CardStatus and SessionStatus. So if the value is ever constrained (like if you put the session into a typed Record or pass it to a function), it will not be allowed to also be used as a card status. For example, this does not type check:
Date := {day : Str}
CardStatus : [Invalid, Valid Date, Expired, Empty]
SessionStatus : [Current, Expired]
expireSession : Str -> [Expired]
expireSession = \_ -> Expired
checkCardStatus : CardStatus -> Str
checkCardStatus = \_ -> "Valid"
checkSessionStatus : SessionStatus -> Str
checkSessionStatus = \_ -> "Valid"
main =
session = expireSession "Session"
status = checkCardStatus session
status2 = checkSessionStatus session
Stdout.line ("Card status: \(status)\n Session status: \(status2)")
Is there any vision regarding what kind of things should be "blessed" and auto-derived in the standard library?
I think we have actually named almost everything. I don't think there are any plans to expand further. Hash, Eq, Encode, Decode, Inspect (maybe a separate Log).
And what about JSON?
Done through Encode and Decode. They are basically serde from rust and support any format.
How do you customize the way your records and tags are serialized?
put it in an opaque type for serialization.
Do you agree that named types / transparent wrappers would add some fairly nice benefits?
For sure. Basically it is unique types without the hassle of wrapping. Use like the underlying type. Abilities without opaque wrapping. Being able to distinguish two otherwise identical types without requiring wrapping.
Do you see any significant downsides to introducing them?
Here are the concerns off of the top of my head:
Can they be used in a when ... is? Would they work with open record functions? Record update syntax? How does type inference work? To have decidable type inference, I think they would always need to be specified. This means that they would still require all of the wrapping and unwrapping (same as opaque types). They would just also allow record dot syntax. Is that the only gain? We could make them implicitly unwrap (this is likely required to be used like a type alias), but that sounds like it could easily lead to bugs. Same with implicitly wrapping, which sounds even worse.
expireSession is typed wrong. Should be expireSession : Str -> SessionStatus
Is this necessarily wrong? Like, an author of this code might want to express the fact that an expired session will always be in the expired state, using Roc's type system.
Specifically, what would be clear cut rules for when an opaque or transparent type should be used? What would be the why?
To clarify, I don't refer to these "transparent types" as wrappers at all (they could possibly be implemented as wrappers as one possible option, but maybe let's leave the implementation details for now). Sorry for the confusion. I would rather propose to call them named types, contrasted with anonymous types. For the user, regardless of implementation, there will be no such thing as wrapping or unwrapping a named type, it's an atomic thing you can't divide, just like you can't divide a custom type in Elm, a struct in Rust or a record in Haskell.
So the choice for the programmer is really between
{ firstName : Str, lastName : Str } (anonymous) and Person {firstName : Str, lastName : Str} (named)
You go with anonymous if you
You go with named if you
Also, if named types are introduced, we could remove anonymous types (as named types are still extensible), then this would go away.
When it comes to opaque types, I would use them exactly to hide the internals of the underlying type, in the specific cases where it is needed. I would advise using them a bit sparingly. And again, when used, an opaque type could wrap an anonymous type, or a named type. In either case, there would be only one layer of wrapping that the user is aware of.
Can they be used in a when ... is
Yes, and no extra wrapping / unwrapping (as there is nothing to unwrap, at least for the user)
Would they work with open record functions?
Yes
Person a {firstName : Str, lastName : Str}a
fullName : Person * -> Str
fullName = \user ->
"\(user.firstName) \(user.lastName)"
Record update syntax?
Yes
How does type inference work? To have decidable type inference, I think they would always need to be specified
Correct, when constructing a value you would have to do this
Person {firstName: "John", lastName: "Smith"}
Color.Blue
This means that they would still require all of the wrapping and unwrapping (same as opaque types)
Speaking only about the user experience here, there would be no wrapping / unwrapping as a named type would be an atomic thing used just as an anonymous type with the only difference that you have to specify the name / nominal part when constructing a value explicitly.
They would just also allow record dot syntax. Is that the only gain?
No, they would allow everything anonymous types allow + more type safety + abilities, making them truly "first class"
We could make them implicitly unwrap (this is likely required to be used like a type alias), but that sounds like it could easily lead to bugs. Same with implicitly wrapping, which sounds even worse.
I am thinking that unwrapping, as mentioned above, won't be needed (at least not visible to the user regardless of implementation). The user will just be accessing the data directly just like this is done in Elm with custom types, or in Rust with both enums and structs, and in Haskell with both sum types and records.
Example with a closed type:
Person {firstName : Str, lastName : Str}
createPerson : Str -> Str -> Person
createPerson = \firstName, lastName -> Person {firstName : firstName, lastName : lastName }
greetPerson : Person -> Str
greetPerson = \person -> "Hello, \(person.firstName) \(person.lastName)"
Using named tags would be pretty much identical to using custom types in Elm.
Is this necessarily wrong? Like, an author of this code might want to express the fact that an expired session will always be in the expired state, using Roc's type system.
I would argue yes, but it technically is not strictly wrong due to returning an open tag. This function claims to return a Session in the name, but in the type, it clearly does not return a Session, it returns an [Expired]. The [Expired] type has no tie to Session at all. It is not clear that Session has an Expired variant with no data in it. Roc is special due to having open tags, but in most languages, that would never type check at all. Fundamentally [Expired] is unconstrained. It only specifies part of the type.
Aside: I don't think named types have any effect on this function. If you put a name there, you would fully constrain the type, but also if you put an alias there, you will probably fully constrain the type as well.
Related to wrapping, I am going to try and get more concrete. Which of the following would you expect to work? Also, if you think they need different syntax to not really be wrapping/unwrapping, please specify.
First just the named type definitions and variables
Color [ Red, Green, Blue]
Person {firstName : Str, lastName : Str}
c = Color.Red
p = Person { firstName: "Joe", lastName: "Williams" }
1) wrapped when
when c is
Color.Red -> ...
_ -> ...
2) unwrapped when
when c is
Red -> ...
_ -> ...
3) implicit wrapping when
x = Red
when x is
Color.Red -> ...
_ -> ...
4) implicit unwrap for function call
fullName : {firstName: Str, lastName: Str} -> Str
fullName = \user ->
"\(user.firstName) \(user.lastName)"
fullName p
5) open record function call
getFirstName : { firstName: Str }* -> Str
getFirstName = \{firstName} -> firstName
getFirstName p
6) implicit wrap for function call
fullName : Person -> Str
fullName = \user ->
"\(user.firstName) \(user.lastName)"
fullName { firstName: "Diana", lastName: "Ray" }
7) Wrapping an existing value
val = { firstName: "Bob", lastName: "Fraser" }
person = val |> Person
8) unwrapping a value
Person val = p
9) record update expanding type
{p & age: 43}
10) record update that unwraps the type
out : {firstName: Str, lastName: Str}
out = {p & firstName: "Lia"}
Aside: I don't think named types have any effect on this function. If you put a name there, you would fully constrain the type, but also if you put an alias there, you will probably fully constrain the type as well.
You are right, my original code won't compile if you use a type alias in the expireSession function. However, if CardStatus is declared as a named type, it won't compile even if you don't.
Also, with named tags one would be able to do this:
Date := {day : Str}
CardStatus [Invalid, Valid Date, Expired, Empty]
SessionStatus [Current, Expired]
expireSession : Str -> [SessionStatus.Expired]
expireSession = \_ -> SessionStatus.Expired
checkCardStatus : CardStatus -> Str
checkCardStatus = \_ -> "Valid"
Here one is able to further refine the type: rather than allowing expireSession to return any SessionStatus, we can limit it to SessionStatus.Expired only, which I think is quite nice.
At the same time, you would not be able to use it erroneously in the checkCardStatus function, as that is taking a CardStatus, not a SessionStatus.
Note that in the above examples both CardStatus and SessionStatus are closed named tag unions, meaning I cannot write CardStatus.Other or SessionStatus.Other.
CardStatus a [Invalid, Valid Date, Expired, Empty]a
checkCardStatus : CardStatus * -> Str
checkCardStatus = \cardStatus ->
when cardStatus is
CardStatus.Valid date -> "Valid"
_ -> "Invalid"
# type checks
checkCardStatus CardStatus.Expired
# type checks, as this is an open named tag union
checkCardStatus CardStatus.Other
# does not type check, expecting named tag union
# CardStatus (and not an anonymous one)
checkCardStatus Expired
# does not type check, expecting named tag union
# CardStatus (and not SessionStatus)
checkCardStatus SessionStatus.Expired
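For comparison, the closed named tag unions above behave much like Rust enums, which are already nominal. This is only a rough analogy under stated assumptions: the Rust names (`expire_session`, `check_card_status`) are hypothetical translations, and Rust cannot express the refined return type `[SessionStatus.Expired]`, so the function returns the whole enum.

```rust
// Hypothetical Rust counterparts of the named tag unions in this thread.
#[derive(Debug, Clone, PartialEq)]
struct Date {
    day: String,
}

#[derive(Debug, Clone, PartialEq)]
enum CardStatus {
    Invalid,
    Valid(Date),
    Expired,
    Empty,
}

#[derive(Debug, Clone, PartialEq)]
enum SessionStatus {
    Current,
    Expired,
}

// Rust has no way to say "only the Expired variant" in the return type.
fn expire_session(_id: &str) -> SessionStatus {
    SessionStatus::Expired
}

fn check_card_status(status: &CardStatus) -> &'static str {
    match status {
        CardStatus::Valid(_date) => "Valid",
        _ => "Invalid",
    }
}

// `check_card_status(&SessionStatus::Expired)` is rejected by the compiler:
// both enums have an `Expired` variant, but the type names differ.
```

The point of the sketch is the last comment: two variants with the same spelling never unify once the enclosing type has a name.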
Related to wrapping, I am going to try and get more concrete. Which of the following would you expect to work. Also, if you think they need different syntax to not really be wrapping/unwrapping, please specify
Sorry for the confusion, really no wrapping or unwrapping here. So, either you are working with an anonymous type or named type, and the syntax is pretty much the same. There is no implicit conversion between them.
The only difference is that with named types you have to explicitly mention the name both in the type and when constructing the value.
1) No wrapping, you are just referring to the named tag using Color.Red syntax
2) No implicit unwrapping, so that wouldn't type check as Red would be referring to an anonymous tag union (rather than a named one). So you would instead write
when c is
Color.Red -> ...
_ -> ...
3) No implicit wrapping, so that wouldn't type check as Red and Color.Red have different types (anonymous and named). So you would instead write
x = Color.Red
when x is
Color.Red -> ...
_ -> ...
4) No implicit unwrapping, so that wouldn't type check. You would instead write:
fullName : Person -> Str
fullName = \user ->
"\(user.firstName) \(user.lastName)"
fullName p
5) No implicit unwrapping, so that wouldn't type check as the function is receiving an anonymous record, and a named record is passed. You would instead write:
getFirstName : Person { firstName: Str }* -> Str
getFirstName = \Person {firstName} -> firstName
# or
getFirstName = \person -> person.firstName
getFirstName p
6) No implicit wrapping, so that wouldn't type check. You would instead write:
fullName : Person -> Str
fullName = \user ->
"\(user.firstName) \(user.lastName)"
fullName Person { firstName: "Diana", lastName: "Ray" }
7) No implicit wrapping, so that wouldn't type check. If you would like to convert an anonymous record to a named record, you would need to do it manually:
val = { firstName: "Bob", lastName: "Fraser" }
person = Person {firstName: val.firstName, lastName: val.lastName}
8) No implicit unwrapping, so that wouldn't type check. If you would like to convert a named record to an anonymous one, you would need to do it manually:
val = {firstName: p.firstName, lastName: p.lastName}
9) This syntax would work for both anonymous and named records, and it wouldn't change the type: if you start with an anonymous record, you get an anonymous record back. If you start with a named record, you get a named record back.
10) No implicit unwrap, so that wouldn't type check. If you need to update a named record, and then convert it to an anonymous one, you would do:
out : {firstName: Str, lastName: Str}
updatedPerson = {p & firstName: "Lia"}
out = {firstName: updatedPerson.firstName, lastName: updatedPerson.lastName}
Sorry, my original mention of "transparent wrappers" was very confusing. Named types are very different from opaque types: there is nothing to wrap or unwrap here. Named types work just like the anonymous ones, only you need to be explicit about which type you mean.
Speaking of implementation, suppose that currently records and tags are represented using some Record and Tag data structures that hold the internals. Conceptually, we would just need to add an optional name field to those structures.
So, if the name is present, it is a named type, otherwise it is an anonymous type. There is no change in the semantics at all, only the parser needs to be adapted to support the explicit mentions of these names, adding it to the Record / Tag data structures.
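A minimal sketch of the "optional name field" idea, written in Rust purely for illustration. `Type`, `Record`, `record`, and `compatible` are hypothetical names and say nothing about how the Roc compiler actually represents types; open records and tags are left out.

```rust
use std::collections::BTreeMap;

// Hypothetical, heavily simplified type representation.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Str,
    Record(Record),
}

#[derive(Debug, Clone, PartialEq)]
struct Record {
    // `None` means an anonymous (structural) record,
    // `Some("Person")` a named (nominal) one.
    name: Option<String>,
    fields: BTreeMap<String, Type>,
}

// Small helper to build a record type from a name and a field list.
fn record(name: Option<&str>, fields: &[(&str, Type)]) -> Type {
    Type::Record(Record {
        name: name.map(str::to_string),
        fields: fields
            .iter()
            .map(|(label, ty)| (label.to_string(), ty.clone()))
            .collect(),
    })
}

// Two types are compatible only if both the name and the fields agree,
// so there is no implicit conversion between named and anonymous records.
fn compatible(a: &Type, b: &Type) -> bool {
    a == b
}
```

With this representation, a `Person` record and an anonymous record with identical fields compare as different types, which matches the "no implicit wrapping or unwrapping" answers to 1) through 10).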
Also, if named types are really covering all of the use cases, one could maybe even consider completely replacing anonymous types with the named ones, as this would encourage a bit more type safety and simplify the language a bit (though I don't think it is too hard for the users to learn that you can name your types if you wish, or not).
For 7. It is explicit, not implicit.
Would it still not be allowed?
Well, what I am suggesting does not require any extra semantics. One uses named records exactly as the anonymous ones, with the only difference that you need to specify the type explicitly where it is not otherwise known.
For 7, if you would like to support this, of course I wouldn't mind :smiley:
val = { firstName: "Bob", lastName: "Fraser" }
person = val |> Person
However, to support it, we need at least the following features:
What happens if the anonymous record is missing fields compared to the named one (and named record can also be open or closed)?
So, it raises a lot of interesting questions that need answers (and possibly other questions).
What I am suggesting is merely a mechanical change. We allow adding a name to the types, getting better type safety and abilities. That's it.
Yeah, the reason I keep talking about wrapping is because syntactically what you have described works the same way that opaque type wrapping works. Just without the convenience functions to wrap and unwrap them. It also fits Roc's current design. Saying x = Person { ... } maybe doesn't fit as clearly otherwise.
@Paul Kapustin Thanks for taking the time to answer all of my questions.
So I want to flip the script and look at a more minimal alternate change.
For this framing, please ignore syntax etc. I really just want to know if this idea would (for the most part) alleviate the need for named types as described above.
The change is twofold:
1) Allow opaque types to be used in matching.
Color := [Red, Green, Blue]
c = @Color Red
when c is
@Color Red -> "yay"
_ -> ""
Note, I think this should normally be done with exposed constants. Something like:
# Color.roc
Color := [Red, Green, Blue]
red = @Color Red
green = @Color Green
blue = @Color Blue
# SomeCode.roc
c = Color.red
when c is
Color.red -> "yay"
_ -> ""
2) Allow opaque types that wrap records to specify public fields. The public fields would get record dot syntax.
Person := {
pub firstName : Str,
pub lastName : Str
}
x = @Person {firstName: "Allen", lastName: "Ross" }
x.firstName
Yeah, the reason I keep talking about wrapping is because syntactically what you have described works the same way that opaque type wrapping works
Yeah, syntactically, as @Person can be used for constructing and deconstructing the opaque type so you are able to access the underlying data, however semantically it is different from a named type.
Just without the convenience functions to wrap and unwrap them.
Just to make sure we are fully on the same page, with named types the wrapping / unwrapping isn't needed even conceptually right? Because you can just access the data directly
Saying x = Person { ... } maybe doesn't fit as clearly otherwise
Could you elaborate a bit more whether you mean style or something else? Also, the syntax could be different of course.
Regarding your alternative proposal, I think this looks very interesting. Now we are sort of moving into the land where FP meets OOP (in a good sense). Now the underlying records and tags resemble private fields in your C#/Java classes, and opaque types are more like a public interface, public fields are your public C#/Java properties.
By doing this we would be sort of encouraging to wrap things in opaque types a bit more than we are doing now, at least for some cases. Which is probably okay.
And yes, I think this would help a lot, as you can have your first-class types with dot syntax / pattern matching that you can define abilities for.
Some questions though:
But other than that, I think it is a great suggestion!
Not entirely sure that it is more minimal than the named types though, but I guess that can be fairly subjective unless you only mean implementation complexity.
Just to make sure we are fully on the same page, with named types the wrapping / unwrapping isn't needed even conceptually right? Because you can just access the data directly
Yeah, but you hit a wall when passing to functions. Without unwrap, you can't pass a Person to a {firstName: Str}* -> Str function. So if there is an existing set of functions that could theoretically interact with your type, you lose access to them when you name it. You may still want to be able to use those without needing to explicitly copy every single field in a record.
Like by default you don't want to be able to pass a Person to a {firstName: Str}* -> Str, but you want to be able to opt into it.
Is there any runtime overhead to opaque types or are they fully erased during the compilation?
erased
How about updating fields in a record wrapped in an opaque type?
Yeah...forgot about that one. I would say that it should just work for the public fields. So {x & firstName : "Dave"}. Though, I am less convinced of that for opaque types. Exposing something as readable should probably be different than exposing something as writable on an opaque type.
Saying x = Person { ... } maybe doesn't fit as clearly otherwise
I mean that Person { ... } looks like you are passing { ... } to some sort of special person function just like @Person { ... } . In the case for opaque types, @Person is a function. So with named types, it would really feel like Person is also a wrapping function.
Not entirely sure that it is more minimal than the named types though, but I guess that can be fairly subjective unless you only mean implementation complexity.
Definitely should be less implementation work, but I more meant in terms of changes to the language and ramifications that need to be thought through.
My suggestion is a few small changes to opaque types. Yes, they clearly could have some big impacts, but they are just modifications to opaque types.
Named types add a whole new concept to the language, lead to questions around whether we would still want type aliases (which is a large shift to the language), and lead to us having multiple similar options which need to be cleanly justified.
So if there is an existing set of functions that could theoretically interact with your type, you lose access to them when you name it.
Could you give an example, why would there be such a set of functions? If the idea is to use them with the named Person type, why would those functions be written in the anonymous form to begin with?
In the case for opaque types, @Person is a function. So with named types, it would really feel like Person is also a wrapping function.
Yeah, there is of course @ that makes it a bit different, but I agree. If one only had named types, though, this visual ambiguity would be totally gone, as {firstName: "John", lastName: "Smith"} would no longer be a thing on its own.
My suggestion is a few small changes to opaque types. Yes, they clearly could have some big impacts, but they are just modifications to opaque types.
Sounds awesome!
I think I agree with most of your points.
Also, I think that if one wanted to really introduce named types, probably a cleaner way to do it would be to get rid of the anonymous types altogether.
So basically I see two possible approaches to this.
1) A more revolutionary one: getting rid of anonymous types and introducing named types. With this approach I think you get a fairly clean design, as you always create a named type, with no other choices really; type aliases and opaque types are going to cover very specific needs like PersonDictionary : Dict Str Person and UserId := {userId : Str}. The language is going to get a little bit more of a Haskell / Rust style.
2) A more evolutionary one: doing what you propose with extending opaque wrappers. With this approach there are no significant changes to the language design, but in my opinion there is a little bit less clarity on how to model things, as you can either go with a type alias or wrap your anonymous type with an opaque type. Data access is also slightly more difficult as there are more layers. However, it is a very interesting approach to encouraging more encapsulation with opaque types, which I think is mostly a good thing. The language is going to remain more in the Elm / Purescript style.
I think that while Roc is still at a quite early stage, how the final language design turns out matters a lot; that's why I am suggesting that we consider both.
But, as I mentioned, I think that both of these alternatives are good, and I think both are much better than not introducing any change, as they allow for more first-class types with better type safety (nominal aspect to typing) plus abilities, without sacrificing the ease of use.
this is a really interesting discussion, thanks for starting it @Paul Kapustin!
just as a quick note, Brendan's comments here sum up my sentiments too:
Brendan Hansknecht said:
My current sentiment is:
We probably don't want to implement this idea currently, but we should save it as a possibility if it turns out to be needed.
Reasons for this sentiment:
- Abilities are a last resort that we don't want to be used regularly. Having transparent types may make abilities easy to sprawl across a codebase. In general, we don't want them to be common.
- Elm doesn't even have abilities and manages this problem fine. Yes, it leads to more names like renderPerson rather than just render, but it is no extra code. Either way, we have to write the renderPerson function.
- It is trivial to wrap a type in an opaque wrapper. If an ability is truly needed, the type can be made opaque in general or converted at the site where the ability is needed. Could even make a well named wrapper: person |> Person.toRenderable |> render
- We don't have any large enough code bases in Roc to really analyze how much of a pain this might become in practice. I think there is a good chance that it is easy enough to work around that it won't matter in practice (again, look at Elm where there aren't even abilities (or C or Go)). But this is ultimately a tradeoff of simplicity, defaults, and friction.
I agree with all 4 of these, and I think the bare minimum for seriously considering actually making a concrete design change along these lines would be "we're hearing a lot of complaints about the status quo being painful in a way this might resolve."
I'm not saying we should rule out something like this indefinitely, I just wanted to be clear that I'm not aligned on this part specifically:
Paul Kapustin said:
I think that both of these alternatives are good, and I think both are much better than not introducing any change
I think this is interesting, but I definitely think that right now not introducing any change is better than introducing any change to address a problem that is currently hypothetical (we've literally never heard anyone mention this as a pain point they encountered while actually building a Roc program!) and which also has drawbacks that have yet to be robustly explored :big_smile:
that said, I do genuinely think this is an interesting design space, so I'd be curious to explore the implications!
one thing that immediately comes to mind is how this interacts with type inference. For example, today the inferred type of \rec -> rec.x == rec.y should be:
{
x : a,
y : a,
}* -> Bool
where a implements Eq
I can also write this function:
\rec1, rec2 ->
if rec1.x == rec2.y then
rec1 == rec2
else
Bool.false
the inferred type of this function should be:
{
x : a,
y : a,
}*,
{
x : a,
y : a,
}* -> Bool where a implements Eq
note that although I wrote rec1 == rec2, there was no need for the type to specify implements Eq on both records, because records automatically have Eq if all of their fields do
which is nice here because implements only works on type variables, so we'd have to do { ... }* as b or something in order to add an implements constraint in the where clause on the records
that's relevant to this idea because the original goal here (as I understood it) is to be able to use non-inferred abilities (e.g. ones defined userspace) on a type that also supports record field access
so let's take the example above and let's say instead of rec1 == rec2 it's MyAbility.foo rec1 rec2 and MyAbility.foo has the type foo : a, a -> Bool where a implements MyAbility
so now I have this implementation:
\rec1, rec2 ->
if rec1.x == rec2.y then
MyAbility.foo rec1 rec2
else
Bool.false
...so what's the inferred type here? Since there's no type annotation (and I have extreme difficulty imagining a world where I would be okay with giving up principal decidable type inference for any form of this idea, so let's assume we have to preserve principal type inference in the design), the only information type inference has to work with here is the implementation, which tells us:
- rec1.x, so rec1 needs to be capable of having .x performed on it (today, that means rec1 is a record, but in this idea, it might also be something other than a plain record)
- rec2.y, so rec2 needs to be capable of having .y performed on it
- x and y both need implements Eq, and need to be type-compatible, because they're both given to ==
- rec1 and rec2 both need implements MyAbility, and need to be type-compatible, because the foo function requires that of both of its arguments
putting that together, one possible inferred type could be:
{ x : a }* as b,
{ y : a }* as c
-> Bool where
a implements Eq,
b implements MyAbility,
c implements MyAbility
so in essence, what this type is now saying is "I have a record-shaped thing, which might be a structurally typed record, or maybe a nominally typed record-like thing, and it has to have this particular ability"
which then raises the question of whether all inferred record types need to start listing implements Eq because that would no longer be safe to assume like it is today; I could (for example) define a nominal record-like type which declines to implement Eq even though all of its fields do
I bring all of this up because I haven't seen the implications of type inference discussed so far, and I think that's an important thing to take into account
in general, the questions of "what happens if I use this feature without type annotations? What is the inferred type, and how would that inferred type have to work differently from how it works today?" seem under-explored, and worth exploring! :big_smile:
we've literally never heard anyone mention this as a pain point they encountered while actually building a Roc program!
Well, I think this is partly because Roc is still pretty new. Abilities are not even described in the tutorial yet!
But this very issue has been raised several times in the Purescript community, and not only by me ;) But yeah, I complained about it too and actually had to pick a different language for my PhD project because of this very reason.
Also, Purescript is a very niche language (frontend-only, high learning curve), while Roc is general-purpose and beginner-friendly.
So, of course I may be wrong, but I think that people coming from other languages like Rust, Haskell, Java, C#, Kotlin, Swift, etc. would be expecting some way to implement behavior on their types to support ad-hoc polymorphism, which is available in all those languages in the form of traits, type classes, interfaces, or protocols. Ideally without making those types harder to use. And, unlike in Purescript, I don't think we can expect Roc users to implement various lens-based workarounds to support easy access to the data wrapped by the newtypes, etc.
Of course, we could wait and see whether this becomes a real problem confirmed by real users, but making significant changes then will be much harder because of all the existing code, so I think it is good to discuss it early.
@Richard Feldman
Thanks for bringing type inference up, I totally forgot about it :)
...so what's the inferred type here?
First, a little digression to see what some of the other languages do.
Rust refuses to compile this and asks to specify the struct type. Gleam does the same. OCaml picks the record type for you, and if there are multiple types with the same field name that fit, it picks "the most recent definition". Haskell with OverloadedRecordDot gives you a HasField constraint.
I am assuming that none of these would be suitable for Roc. And also we want full type inference.
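To make the constraint-based option a bit more concrete, here is a hand-written Rust sketch of the requirements listed for the `MyAbility.foo rec1 rec2` example. Rust does not infer any of this, so every bound is spelled out manually. `HasX`, `HasY`, `MyAbility`, `check`, and `Point` are all hypothetical names, and because `foo` takes two arguments of the same type, the sketch uses a single type parameter `R` for both records.

```rust
// Hypothetical traits standing in for "has a field x / y" constraints.
trait HasX<A> {
    fn x(&self) -> &A;
}

trait HasY<A> {
    fn y(&self) -> &A;
}

// Hypothetical user-defined ability with `foo : a, a -> Bool`.
trait MyAbility {
    fn foo(&self, other: &Self) -> bool;
}

// The field type must support equality; the record type must expose
// both fields and implement the user-defined ability.
fn check<A, R>(rec1: &R, rec2: &R) -> bool
where
    A: PartialEq,
    R: HasX<A> + HasY<A> + MyAbility,
{
    if rec1.x() == rec2.y() {
        rec1.foo(rec2)
    } else {
        false
    }
}

// A nominal record-like type that opts into all three constraints.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

impl HasX<i32> for Point {
    fn x(&self) -> &i32 {
        &self.x
    }
}

impl HasY<i32> for Point {
    fn y(&self) -> &i32 {
        &self.y
    }
}

impl MyAbility for Point {
    // Arbitrary illustrative behaviour: the two points share an x value.
    fn foo(&self, other: &Self) -> bool {
        self.x == other.x
    }
}
```

The signature of `check` is roughly what an inferred type would have to carry if `.x` could apply to something other than a plain record.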
So, I think what you suggested in terms of the type signature looks good. The details could be discussed further in a more detailed proposal, but in general I think that something like this is straightforward enough and easily extends if one record type is expected to implement multiple abilities, for example:
{ x : a }* as rec1,
{ y : a }*
-> Bool where
a implements Eq,
rec1 implements MyAbility1,
rec1 implements MyAbility2
And I would say that I am not discouraged by this type being somewhat complex. I think it is actually a good thing: the user should ideally provide a more specific type signature in this case as it would increase both readability and type safety, and compiler could even emit a suggestion about that. After rewriting, the signature would be something like
RecType1, RecType2 -> Bool (closed record) or
RecType1 a, RecType2 a -> Bool (open record)
We do a rec1.x, so rec1 needs to be capable of having .x performed on it (today, that means rec1 is a record, but in this idea, it might also be something other than a plain record)
so in essence, what this type is now saying is "I have a record-shaped thing, which might be a structurally typed record, or maybe a nominally typed record-like thing, and it has to have this particular ability"
In the named records world, it is still just a record (might have a name, but it's not really important here). So no new semantics really.
which then raises the question of whether all inferred record types need to start listing implements Eq because that would no longer be safe to assume like it is today; I could (for example) define a nominal record-like type which declines to implement Eq even though all of its fields do
I think this would be a good idea in general. I would go even further and consider whether we could keep only the named types in the language, removing the anonymous ones. I see the following benefits to this:
• Better type safety as CardStatus is never compatible with SessionStatus
• Simpler design as all types would be named, you don't need to choose between them and change your code back and forth, you can implement abilities on all types
• You don't get any of the abilities automatically derived for you without you knowing. I think this behavior should ideally be opt-in (you can neither opt out nor change the behavior when using anonymous types).
I have recently seen one bug twice in Elixir code, related to comparing dates that works at first, and stops working when the month changes, as then field-by-field structural record comparison is no longer correct for dates, and it reminds me of this.
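The Elixir code is not shown here, so the following is only a guess at the kind of bug meant, sketched in Rust. A derived ordering compares fields in declaration order, which works while the month stays the same and breaks once it changes. `NaiveDate` and `Date` are hypothetical types made up for this illustration.

```rust
use std::cmp::Ordering;

// Derived `PartialOrd` compares fields in declaration order,
// so `day` is compared before `month` and `year`.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct NaiveDate {
    day: u32,
    month: u32,
    year: u32,
}

// Same fields, but the ordering is written by hand.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Date {
    day: u32,
    month: u32,
    year: u32,
}

impl Ord for Date {
    // Compare the most significant component first.
    fn cmp(&self, other: &Self) -> Ordering {
        (self.year, self.month, self.day).cmp(&(other.year, other.month, other.day))
    }
}

impl PartialOrd for Date {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```

With the derived version, 5 January sorts before 20 January as expected, but 31 January sorts after 1 February. The hand-written version gets both right, which is the argument for making such behavior opt-in.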
I think one of the selling points of Roc is correctness and reducing the number of bugs and runtime errors, so I don't think asking the user to explicitly add a derive annotation would be too much to ask.
I know that talking about named types, and especially about considering removing anonymous records, sounds like a large design change. But I think it would be wrong not to discuss or propose something only because it feels (or is) large. Well, actually I hope that it isn't that large. Conceptually we would only be adding a name / nominal part to the types; the rest of the semantics would stay more or less the same.
having to name all my records when I use Rust annoys me enough compared to what I'm used to with Elm and Roc that I don't think taking anonymous records away is worthwhile to explore :big_smile:
let's assume that feature is here to stay forever
This is fairly subjective though, some other users may feel the opposite :smile:
But sure, we could have rather clear guidance when to prefer anonymous and when named (in addition to taste differences), so people can adapt to their needs and preferences.
For example:
Anonymous is probably your default if
Named is probably your default if
And then, of course, one can then use another option rather than the default when needed.
Sorry, I know this is discussed earlier in the thread, but I hope you'll understand that it is difficult for me to summarize them - what are the specific issues with opaque types that named types (instead of the current opaque types feature) would resolve?
Ayaz Hafiz said:
Sorry, I know this is discussed earlier in the thread, but I hope you'll understand that it is difficult for me to summarize them - what are the specific issues with opaque types that named types (instead of the current opaque types feature) would resolve?
Mainly that today, if you want your type to implement an ability, you need to wrap it with an opaque type. But then you really make it opaque and lose easy access to the data inside it (for example, you lose dot syntax for records).
In many cases this is somewhat undesirable, as you don't really want an opaque type; what you would like instead is to be able to define abilities and still have easy access to data in one type (like you have in Rust, Haskell, Java).
So, the suggestion of named types is aiming to support exactly this: being able to define abilities directly on your record or tag types.
This also results in somewhat higher type safety when using named types instead of standard records / tags (as for the type to be the same not only the fields or data constructors should match, but also the name / nominal part of the type).
Opaque types obviously provide that too, but at the cost of wrapping the underlying type - and you wouldn't need the wrapping with named types.
Do you think it would be a fair characterization to say
Hmm...I am not sure...I guess this depends a lot on the design.
I would say that in my experience with purely functional languages and immutable data, and somewhat contrasted to OO approaches, I have usually seen less "interaction with a type using a public API" (as long as types are just data in FP), but rather "a public type being part of an API".
I have often seen and worked with designs where a module exposes certain functionality using public functions and public types (keeping some other functions and types internal). Now, the module may not necessarily expose the way to instantiate these public types, but definitely the users would be able to access the underlying data, usually with dot syntax for records. Often the users would also need to use ability functions on these types.
For example, when working on a "Core" part of the system that allows to search for store locations in a city, I would likely define a public type Location and a function search exposed in the module Stores:
search : CityName -> List Location
The "Core" part would also likely define a sensible way of comparing locations and serializing them to JSON using abilities.
Having this, in an "Application" part of the system, I could do something like this:
import Stores
storeLocations = Stores.search "London"
And then for each location I could also do:
location.name
location.gps.x
location.gps.y
As a consumer of "Core", I would also be able to compare locations using ==, as well as serialize them to JSON, trusting that this behavior provided by the "Core" subsystem would be reasonable for the Location type.
However, as @Brendan Hansknecht suggested, one could try to solve this by separating the type in two, introducing even more encapsulation and adding public fields to the opaque type. However, I am wondering if this would be an unnecessarily complex solution for many of the cases, due to having two types for each level of nesting.
Also, as @Richard Feldman pointed out, this would likely complicate the situation with type inference, as then .x may mean "record" or "nominal type with record behavior" (unlike with named records, where .x still means "record").
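For reference, this is roughly what the Location example looks like in a language with nominal records, sketched in Rust. The equality rule (coordinates within a tolerance, name ignored) is an assumption made up for the sketch, as are the sample data and the `Gps` type; the thread does not say what "sensible" comparison means for a location.

```rust
#[derive(Debug, Clone)]
pub struct Gps {
    pub x: f64,
    pub y: f64,
}

#[derive(Debug, Clone)]
pub struct Location {
    pub name: String,
    pub gps: Gps,
}

// Assumed "sensible" equality: two locations are the same place when
// their coordinates agree within a small tolerance; the name is ignored.
impl PartialEq for Location {
    fn eq(&self, other: &Self) -> bool {
        const TOLERANCE: f64 = 1e-6;
        (self.gps.x - other.gps.x).abs() < TOLERANCE
            && (self.gps.y - other.gps.y).abs() < TOLERANCE
    }
}

// Stand-in for the "Core" search function, backed by made-up sample data.
pub fn search(city: &str) -> Vec<Location> {
    match city {
        "London" => vec![Location {
            name: "Baker Street".to_string(),
            gps: Gps { x: 51.5226, y: -0.1571 },
        }],
        _ => Vec::new(),
    }
}
```

A consumer can write `location.name` and `location.gps.x` directly and still get the custom `==`, with only one type involved.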
Personally, I don't really see the issue with this example. This example really shouldn't need opaque types except for encode and decode. That should be as simple as either:
Store.serializeLocation myLoc or Store.toSerializable myLoc |> Encode.encode.
Just a single API call directly before or after serialization to convert to/from an opaque type to/from the plain record that has access to .x and .gps.
However, if you do myLoc1 == myLoc2, or Encode.encode myLoc, this will still compile (as all records get these abilities automatically derived), but will likely lead to incorrect behavior at runtime, right?
Like, wouldn't it be better if we could use myLoc1 == myLoc2 and Encode.encode myLoc directly, and it would still give you the right behavior as implemented by the "Core" part of the system?
I think encode is not automatically derived (if it is, yeah, that is a really sharp edge)
As for Eq, why would location need custom equality?
I think almost always naive Eq is correct except in data structures that should be opaque anyway
I don't think it's safe to assume people will run into bugs in this area just because it's possible to :big_smile:
we don't have any data on it being a problem in practice, and this isn't the sort of thing I think it's so safe to assume will be a problem that we should make a big effort to prioritize solving it preemptively
(which is just a comment about how much we should weight this particular aspect of the idea; I agree it's a benefit in this situation!)
If you really need Eq, simply make location opaque and expose just the data that needs to be accessed directly either in separate functions or as a record
locData = Location.data loc
locData.gps.x
This enables location to hide some details, get abilities like encode and custom equals, expose data with record syntax (that could even be massaged to only be rough location instead of exact if wanted)
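A rough Rust sketch of this two-type approach, for comparison. Private fields play the role of the opaque wrapper, and `data` hands out a plain record. All names (`location`, `LocationData`, `Gps`, `new`, `data`), the id-based equality, and the rounding to two decimals are assumptions made up for the illustration.

```rust
mod location {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Gps {
        pub x: f64,
        pub y: f64,
    }

    // Plain data view handed out to callers.
    #[derive(Debug, Clone, PartialEq)]
    pub struct LocationData {
        pub name: String,
        pub gps: Gps,
    }

    // Fields are private, so outside this module the type is opaque.
    #[derive(Debug, Clone)]
    pub struct Location {
        id: u32,
        name: String,
        gps: Gps,
    }

    impl Location {
        pub fn new(id: u32, name: &str, x: f64, y: f64) -> Self {
            Location { id, name: name.to_string(), gps: Gps { x, y } }
        }

        // Exposes only a rough position, rounded to two decimals.
        pub fn data(&self) -> LocationData {
            LocationData {
                name: self.name.clone(),
                gps: Gps {
                    x: (self.gps.x * 100.0).round() / 100.0,
                    y: (self.gps.y * 100.0).round() / 100.0,
                },
            }
        }
    }

    // Custom equality (assumption): identity is decided by `id` alone.
    impl PartialEq for Location {
        fn eq(&self, other: &Self) -> bool {
            self.id == other.id
        }
    }
}
```

Callers then write `loc.data().gps.x` for field access and `loc == other` for the custom comparison, at the cost of carrying both `Location` and `LocationData` around.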
Sure, you could, but then you would have to deal with two types: Location and LocationData, and remember that you should never do locationData == locationData or Encode.encode locationData, etc.
Also, when you need to update your records you'd have to first update LocationData, and then update Location with the updated LocationData. And purely functional record updates are already fairly painful when you have multiple levels.
Though, I agree with you that sometimes it is actually good to have an extra layer for more encapsulation, but I believe that in many cases (I would maybe even say most), it could be easier to deal with just one type that gives you both easy access and correct behavior of the abilities.
But it seems to me that we more or less understand each other. I am thinking that maybe for further discussion to be more constructive, we could consider
Or what do you think?
Like, if there is no clear agreement on the first point, then we should probably discuss that more, but if there is, we could move to the other points.
we don't have any data on it being a problem in practice, and this isn't the sort of thing I think it's so safe to assume will be a problem that we should make a big effort to prioritize solving it preemptively
@Richard Feldman I agree with this, but at the same time we could look at it differently: rather than seeing it as a problem that has to be solved, we could see it as an opportunity to make the language significantly better
And yeah, we don't know for sure that it will be significantly better. Maybe it would be just somewhat better
But I think that it is a large change that affects how you design your programs.
I'm curious about concrete examples of 1, and how they'd compare to status quo. I think the benefits so far are too abstract for me to describe them as "substantial" yet. :big_smile:
Like for example, sometimes I'm interacting with a third-party http API that uses JSON which includes things that don't map directly to roc records, such as heterogenous arrays or non-dictionary objects with field names that have emojis in them or something.
in that scenario, I have some Roc data type and some way to translate the JSON into it. So in both of these design worlds (status quo and this idea) what Roc code would I write to address this use case?
I think the answer in either design is the same: I need a custom decoder which decodes into my data structure, handling the unsupported stuff appropriately.
so in that example, I can't really point to a substantial advantage of this design
so for it to be a serialization benefit, I think it would have to be something like "I have a serialization format like JSON which can be derived entirely from the type (unlike, say, protobuf), but I don't want it to line up 1 to 1 with the Roc type I want to use, but I also don't want to write a custom decoder or use a translator function (e.g. "translate snake_case fields to camelCase")"
in that scenario, today I would need to maintain 2 different types (one opaque with custom Decoding, and one record) and make sure to use the right one in the right scenario. But to be honest, if I were personally in that scenario I think I'd just use a decoder instead of a separate opaque type.
but in the other world, I could use a nominal-but-not-opaque type plus a custom Decode implementation (which would be basically the same amount of work as a decoder) instead of either 2 different types, or 1 structural type and a decoder.
so again that really seems like in either design I'm going to have 1 type and a custom decoder implementation, so the benefit comes down to how hard it is to remember to use the custom one instead of the default inferred one
which is something! I'd call that substantial, although it's unclear how big of a benefit it would be in practice - especially in a world where we could potentially support a custom linter rule saying "always give an error whenever a record of this shape is given to Encode.encode" or something like that.
(Maybe that is or isn't a good idea, but my point is that there are other potential ways to explore if that's the problem we're trying to solve.)
overall, I think a point I should make is that all else being equal, this would require making the language bigger (since I'm not okay with removing anonymous structural record types, and of course we still need some notion of opaque types) and therefore it's starting out as a net negative.
The burden of proof is on the new feature to justify that its benefits outweigh - at a minimum - the unavoidable drawback of adding something new to the language. So maybe it's better, and I'm open to that possibility! But I don't think it makes sense to think of it as "hey maybe if we do things differently in this way, it'll turn out to be better" because it'll at least have to be bigger, which - all else being equal - is a downside for a language that values being small and simple.
so in that example, I can't really point to a substantial advantage of this design
I would suggest that the difference would come in after you have decoded the unsupported stuff into your data structure using any approach.
With this idea, after going through that adapter layer, you would have one first-class data structure that you can safely use throughout the rest of your system, easily accessing the data, without having to think about whether the behavior given to you by ==, Encode.encode or any other ability is correct, or whether you need to look up some function somewhere and remember to call it before you can use ==, Encode.encode, etc.
Like, going in the direction of "if it compiles, it works". Improving correctness of the programs, catching more bugs at compile time. In my view at least, this is a very substantial benefit. Like, I love Elm, and I know that Elm doesn't give you that, but Elm doesn't have abilities.
And yeah...we could make it a linter warning, but in Roc we do have abilities, so we could also do one more step and make it a real compile error. That's why I feel it is substantial.
But yes, I agree that if we can't give away anonymous types, it makes the language bigger for sure.
I guess this also depends on how general-purpose Roc is aiming to be and how large the systems are that we would like to support building with it.
right, but I think Elm is a good example of why this isn't a problem in practice :big_smile:
like if that experiment has already been done at scale, and the conclusion is that structural record types aren't error-prone, what's the problem we're solving?
if the answer is "what if people want to make custom Abilities and put them on lots of things and then that becomes error prone" - that doesn't resonate with me. We had a big discussion before adding abilities to the language, and one of the biggest concerns was that custom abilities would be overused.
We may be starting with different premises here, but to me, if there are lots of popular custom abilities in the Roc ecosystem, I think that's a downside. To me, the main reason abilities should be customizable at all (and we also discussed whether they should be creatable in userspace or if we should just have a predefined set of builtin ones and that's it) is that they have a performance benefit compared to passing around records of functions.
For example, Brendan used Rust traits to implement code sharing in our development backend (across x64 and arm64 targets, which do some things the same way but also do various things differently) without runtime overhead. That to me is a good use of custom Abilities in userspace.
I don't think size of system or applicability of Roc has anything to do with abilities haha
well, except maybe to the extent that if people think that making custom abilities will make their system scale better (instead of worse, due to overcomplicating things unnecessarily) and therefore focus on the wrong things, in a similar way to how large Enterprise Java code bases suffer from overcomplicating things unnecessarily - I guess that could arguably be considered a downside of abilities when it comes to scaling, but personally I wouldn't go quite that far :big_smile:
like if that experiment has already been done at scale, and the conclusion is that structural record types aren't error-prone, what's the problem we're solving?
I think the fact that Elm is good doesn't mean Roc can't do something better. Even if there is no "problem", by adding named records we would:
• Give users a simple and consistent way to support ad-hoc polymorphism (define how a type implements certain behavior) without sacrificing on ergonomics
• Give users "first-class" record and tag types (in the sense that they support other language features like abilities fully)
• Improve type safety (SessionStatus not being compatible with CardStatus), and this is an area where Roc would otherwise have less type safety than Elm (as custom types in Elm are nominal)
• Help eliminate a whole class of bugs originating from the fact that standard abilities like Eq, Encode and others may provide incorrect default behavior for user types, and the user forgets to use the right function or an opaque type
• Allow the users to opt-out of the behavior provided by the standard abilities, or override that behavior
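As a small illustration of the type-safety bullet, here is a sketch in Rust (used as an analogy, not Roc syntax) of what nominal types give you. The two enums and the helper functions are hypothetical names chosen to match the SessionStatus/CardStatus example; the point is only that two types with identical variants are still distinct to the compiler.

```rust
/// Two nominal types that happen to have the same variants.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SessionStatus {
    Active,
    Expired,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CardStatus {
    Active,
    Expired,
}

/// Accepts only a card status (hypothetical helper).
pub fn can_charge(status: CardStatus) -> bool {
    status == CardStatus::Active
}

/// Accepts only a session status (hypothetical helper).
pub fn session_is_live(status: SessionStatus) -> bool {
    status == SessionStatus::Active
}

// Mixing them up does not compile:
//     can_charge(SessionStatus::Active)   // error: mismatched types
// With structural tag unions of the same shape, the equivalent call
// would type-check, which is the gap the bullet is pointing at.
```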
if there are lots of popular custom abilities in the Roc ecosystem, I think that's a downside.
I agree there shouldn't be many "popular custom" abilities in the ecosystem. But I think that my arguments apply just as well to standard abilities. And also I believe it is okay for a project to define one or two abilities for their internal use where they need ad-hoc polymorphism.
I'll try and add another argument: language design consistency.
I am imagining the following dialog:
I am sorry for some exaggeration here, I am solely using this to (hopefully) make the argument clearer!
So this in my head is also about language design consistency. Fair enough, Elm doesn't have abilities, and neither does OCaml (yet, however modular implicits seem to be a thing). So the users have to come up with other ways to support their ad-hoc polymorphism.
But Roc does have abilities, and also user space abilities, these decisions have been made. However, as a user, you can't really use them on your types (at least not easily). That's why to me it feels like somewhat inconsistent (or incomplete) design, sort of like stopping a hundred meters before the finish line.
So, I am still on the first point of whether there are clear benefits (not whether this is something Roc should implement today). I agree that these named records come at a cost of making the language a bit bigger (because anonymous types stay anyway), and maybe it is not a problem that needs to be solved today, but I think this is still a substantial gain for the language in terms of design consistency, as well as making it significantly more powerful (better support for ad-hoc polymorphism), and leading to more correct programs with fewer possibilities for bugs.
But Roc does have abilities, and also user space abilities, these decisions have been made. However, as a user, you can't really use them on your types (at least not easily). That's why to me it feels like somewhat inconsistent (or incomplete) design, sort of like stopping a hundred meters before the finish line.
I totally get all your sentiment and probably want the features that named types would enable (maybe in slightly different forms though). That said, I think the point here is the root of most of the contention and differing views. This restriction is intentional.
For the most part, I don't want the average user to ever implement an ability. They should depend on abilities like Hash and Eq which get used by data structures, but they should essentially never have to implement those. Some library authors will of course use them for creating new data structures and algorithms and such. Otherwise, I think we should promote the use of plain data types where the auto-derived implementations just work. Most of the time if you have a custom equal, you either 1) are implementing a data structure, 2) are generating exactly what would be auto-derived, but on an opaque type, or 3) probably shouldn't have a custom equal and should use a type specific method instead (in this case, the custom equal is probably more error prone due to end users not realizing you aren't truly comparing all fields and other implications).
There is one exception to this: Serialization. At the boundary of an app, app internal types should get wrapped in custom opaque types that understand versioning and are tested via communication integration tests. These types are just for Encode and Decode and avoid changes to internal data structures affecting communication with other systems. They get wrapped immediately before encode and unwrapped immediately after decode. These types add safety in two major forms: 1) they guarantee you won't accidentally break your communication api by renaming a field on an internal type, and 2) they stop you from accidentally sending extra fields over the wire (no leaking secrets added to internal data structures).
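A rough sketch of that boundary pattern, again in Rust as an analogy rather than Roc. Everything here is assumed for illustration: the `User` and `UserWireV1` types, their fields, and the deliberately naive hand-written JSON encoder (no escaping, no real serialization library). It only shows the wire type picking exactly which fields cross the boundary.

```rust
/// Internal application type; free to change as the app evolves.
#[derive(Debug, Clone)]
pub struct User {
    pub first_name: String,
    pub last_name: String,
    /// Must never leave the process.
    pub password_hash: String,
}

/// Versioned wire type used only at the boundary of the app.
#[derive(Debug, Clone, PartialEq)]
pub struct UserWireV1 {
    name: String,
}

impl UserWireV1 {
    /// Wrap immediately before encoding; lists exactly the fields to send.
    pub fn from_internal(user: &User) -> Self {
        UserWireV1 {
            name: format!("{} {}", user.first_name, user.last_name),
        }
    }

    /// Naive hand-written encoder (no string escaping), for illustration only.
    pub fn encode(&self) -> String {
        format!("{{\"version\":1,\"name\":\"{}\"}}", self.name)
    }
}
```

Renaming `first_name` on the internal type, or adding another secret field to it, leaves the encoded output unchanged, because only `from_internal` decides what goes over the wire.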
I agree with you that it is nice when the language guides the user and removes poor (and too many) choices. In particular, I like it when for each problem there is exactly one straightforward solution.
However, in this case we have a couple of problems that don’t have any satisfactory solution:
I think one example of a language with intentional restrictions is Elm. It clearly communicates that there are no type classes. If you need your own ad-hoc polymorphism, you need to do it manually. One may disagree with that situation, but at least it is a clearly documented restriction with some reasoning.
With Roc’s abilities, to me it feels different. Roc supports both abilities and user space abilities. So, this WOULD BE the way to get those earlier mentioned features 1-3; however, it is not a straightforward solution, as the user has to choose between getting those features and good ergonomics (like field access).
That’s why to me personally it doesn’t feel like an intentional restriction, but rather an ad-hoc situation and an issue related to language design, and this is why I am suggesting to improve it.
But of course I see that others may perceive it differently.
Guidance by complete removal vs guidance by friction.
I agree that guidance by complete removal with documentation is better when possible, especially with larger features.
That’s why to me personally it doesn’t feel like an intentional restriction, but rather an ad-hoc situation and an issue related to language design, and this is why I am suggesting to improve it.
This may end up being the case for many people. If so, we may have to find a way to either make the design lower friction (like explored above), or high enough friction that it is very clearly an intentional restriction.
Exactly, so I generally prefer less friction as long as it is possible to do this with a consistent design.
Like, sure, someone can always misuse a feature and make a mess. Then one can also fix that mess. But in my opinion it is much more annoying when there is simply no way to achieve what you want. Then there is nothing to fix either.
Like, if I tried to convince someone to use Roc at work to build a larger system, and someone presented an argument "But we have a complex system, we need ad-hoc polymorphism, so Java with interfaces would be a better choice", I would have nothing to object to that argument.
Also, I think that it is good to consider changes like this earlier rather than later, both because it is just easier to implement as there is less backwards compatibility to think about, and also because it is easier to end up with a consistent design in the end.
Paul Kapustin said:
Like, if I tried to convince someone to use Roc at work to build a larger system, and someone presented an argument "But we have a complex system, we need ad-hoc polymorphism, so Java with interfaces would be a better choice", I would have nothing to object to that argument.
I think the best counterargument to "we have a complex system and therefore we need ad-hoc polymorphism" is "actually complex systems don't demand ad-hoc polymorphism; if you'll give me a specific example of complexity that seems to demand ad-hoc polymorphism, I can talk about how it can be addressed in another way."
Paul Kapustin said:
Like, sure, someone can always misuse a feature and make a mess. Then one can also fix that mess. But in my opinion it is much more annoying when there is simply no way to achieve what you want. Then there is nothing to fix either.
Importantly, "achieve what you want" in this case is almost entirely syntactic. You can already get full ad-hoc polymorphism in Roc using abilities and opaque types, and you can expose functions which allow arbitrary access to the internals of any opaque type.
So "achieve what you want" in this case is specifically trying to remove an "unwrap" function call. Another way of looking at it is that the feature idea is to make a function call implicit.
I'd put that on a different level of importance than things like "it's impossible to customize how hashing works for a custom data structure," which was true before we added abilities to the language
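To show what "custom hashing plus an explicit unwrap" looks like in practice, here is a minimal sketch in Rust (an analogy for a Roc opaque type with custom `Hash` and `Eq`, not Roc code). The `Email` wrapper, its `wrap`/`unwrap` functions, and `distinct_emails` are hypothetical names introduced for this example.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Wrapper whose equality and hashing ignore ASCII case.
#[derive(Debug, Clone)]
pub struct Email(String);

impl Email {
    pub fn wrap(raw: &str) -> Self {
        Email(raw.to_string())
    }

    /// The explicit "unwrap" step needed before using plain string operations.
    pub fn unwrap(&self) -> &str {
        &self.0
    }
}

impl PartialEq for Email {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for Email {}

impl Hash for Email {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the ASCII-lowercased bytes so values that compare equal hash equally.
        for b in self.0.bytes() {
            state.write_u8(b.to_ascii_lowercase());
        }
    }
}

/// Counts distinct addresses, treating differently-cased ones as the same.
pub fn distinct_emails(raw: &[&str]) -> usize {
    raw.iter().map(|&s| Email::wrap(s)).collect::<HashSet<_>>().len()
}
```

The custom behavior is fully available through the wrapper; what the thread is debating is only whether the `unwrap()` call before something like `.len()` should have to be written out.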
@Richard Feldman Sorry for the late reply.
if you'll give me a specific example of complexity that seems to demand ad-hoc polymorphism, I can talk about how it can be addressed in another way.
Well, "customizing how hashing works for a data structure" is an example of when ad-hoc polymorphism is helpful, and this is exactly what abilities were implemented for?
So "achieve what you want" in this case is specifically trying to remove an "unwrap" function call. Another way of looking at it is that the feature idea is to make a function call implicit.
Yes, but this need to wrap/unwrap everywhere in the codebase makes the two styles (one with anonymous types, the other with opaque wrappers) syntactically incompatible, so when working on a real system you can't change back and forth every day, so you have to make a choice between ease-of-use records and abilities. I think this choice is somewhat unfortunate and propose to give the users the best of both worlds.
Another way of looking at it is that the feature idea is to make a function call implicit.
Sounds good, of course, I am more than happy to discuss other alternatives that could help!
I'd put that on a different level of importance than things like "it's impossible to customize how hashing works for a custom data structure," which was true before we added abilities to the language
Totally agree, I think that addition of abilities to the language is a major improvement!
I just think that it would be even better to go "all the way" in the sense that currently it is still technically impossible to customize how hashing works for a custom data structure (if this data structure is not an opaque type).
When is a custom data structure not an opaque type? If a custom data structure isn't opaque, any code across the app can change its internal details which sounds really bad for a custom data structure. Generally custom data structure means custom API with hidden details, which requires opaques to begin with.
Well, this of course depends on how one defines "custom data structure". In my understanding, a custom (as opposed to one provided by, for example, standard library) data structure is pretty much any data type created by the user for their own needs.
For example, a custom type or a record in Elm, any data (ADT or record) in Haskell, a struct or an enum in Rust.
I would say that the degree of encapsulation is kind of a separate dimension to this.
I agree that in some (but not all) cases we would like to prevent the user from directly accessing (or modifying) the internals of the data structure. In OOP languages one would make fields private and expose the needed ones using properties (however, the ergonomics is still good!).
In Haskell one could, for example, choose not to export the constructors or even the data structure at all (however, in the internal module, where the data structure is used and manipulated a lot, one would still have both type classes and good ergonomics like dot syntax).
So, I feel "data structure" is an important term here. Very very few custom types should need a custom hash or eq. On the other hand, custom data structures, like a set, often need both an opaque API and abilities.
Wikipedia definition for data structure (essentially collection types):
More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data,
That said, maybe my premise of few custom types outside of data structures needing things like custom hash and EQ is incorrect. That or maybe there is a much more useful ability that will arise and be wanted on more arbitrary types (in a way that couldn't be auto-derived).
Last updated: Jun 16 2026 at 16:19 UTC