I think we should just remove wildcards and make this whole class of edge cases disappear
Yeah, it only has very limited correct use and most people do not understand those use cases.
Really the main correct use case is function inputs that are not cared about.
Brendan Hansknecht said:
Really the main correct use case is function inputs that are not cared about.
That's what `_` is for though...
no
`_` constrains to a specific type
`*` represents "any type"
With `_` you can still use the underlying type, because the underscore means the compiler will fill it in for you:
`List _ -> U64` is a valid signature for `List.get` on a list of `U64`.
`List * -> U64` can only return the length or capacity and can never access an element.
Oh ... that's interesting. so `*` is polymorphic and `_` isn't?
Which is why a function that accepts `[Red, Blue]_` called with `[Red, Blue, Green]` and `[Red, Blue, Yellow]` will only resolve correctly for one of them
`_` is more flexible. Technically `_` could resolve to `*`
`*` isn't quite polymorphic, I think
It means "this variable must work for any type replaced", so it can only work for, say, empty lists
Since no matter what type [] is, you can't get a value out of it, so the elem type doesn't matter
so it can only work for, say, empty lists
If talking about a concrete value. Means something different for a function input.
Yeah...
^ Another reason
Anyway, `_` is just a request for the compiler to fill in the blank; it is super flexible.
» x : List _ -> _
… x = \l -> List.get l 0 ?? 24
<function> : List (Num a) -> Num a
» y : List _ -> _
… y = \l -> List.map l \i -> i - 1
<function> : List (Num a) -> List (Num a)
» z : List _ -> _
… z = \l -> List.get l 1
<function> : List a -> Result a [OutOfBounds]
None of those functions are valid if `*` was used for the first blank instead of `_`
Anyway, I personally would prefer to restrict `*` to only be allowed in function inputs. I think that is the main valid use case for it. Ban it in all other locations.
But I understand just using an arbitrary letter instead of *
Seems simpler to just get rid of it
And then beginners don't say "oh, what's this" followed by "why can't I use it elsewhere"
That definitely wasn't what I came here for... but it has come up before
I'm ON a MISSION
You are a convenient vehicle
I just really hate the arbitrary letters gunking up my type signatures.
Me too
Okay, separate thread incoming
(unless I can find an old one to necro)
i'm pretty sure `*` and `a`-like variables have already been banned from all non-generalizable positions and only functions are generalizable
You can still do stuff like this:
» x : List *
… x = []
And this:
» y : {} -> List *
… y = \{} -> []
Yep, this typechecks
module [num, func, func2]
num : Num *
num = 123
func : {} -> List *
func = \{} -> []
func2 : List * -> U64
func2 = \_list -> 123
num : Num *
num = 123
This surprises me. I think that is just a bug
Oh, that's probably part of the one I got assigned
Nope, it's different
https://github.com/roc-lang/roc/issues/7357 shows that types implemented in Zig like `List` don't have their type args checked properly, since we only check aliases for type args
And `List` isn't registered as a normal type alias
But this is different, since Num is defined as Num range := range
It might be that numbers are special cased
well, even though I can't define a concrete value, I can still type def one without a problem
module [my_abc]
Abc a : { field : a }
my_abc : Abc *
51 messages were moved here from #compiler development > Wildcard in opaque types by Sam Mohr.
52 messages were moved here from #compiler development > Remove wildcards from Roc by Sam Mohr.
Try again
43 messages were moved here from #compiler development > Wildcard in opaque types by Sam Mohr.
That's a better range
x : List *
This is definitely a bug
y : {} -> List *
This seems fine and should be allowed, `\{} -> []` is a generalizable function
num : Num *
This is a special case today but i think it's worth removing
func2 : List * -> U64
This also seems fine because the function can take in a list of any type
`\{} -> []` is a generalizable function
Is this a valid `List.empty`? I wasn't sure if `*` in an output location generalized.
yeah it's generalized
the intuition is that because the entry point of the program is concrete, by the transitive property, all used values must eventually be concrete as well
Ah yeah, just realized we do this for `Dict.empty`, so that makes sense.
if a value's type is not concrete after you propagate all the types through, then the value must never be used. in that case it's sufficient to generate the void type
I would be quite happy if `*` was restricted to only function inputs and outputs.
agreed
and any named type variables too
So it seems like we have two positions that move us in the direction of minimizing wildcard `*` type variables in Roc:
1. Remove `*` entirely, and replace all usages with named variables. There are quite a few usages in our builtins, but nothing seems to use more than one or two at a time, so we aren't introducing a whole soup of variables. This avoids the repeated confusion beginners have had with wildcards, and should improve comprehensibility for newcomers. The cost is that beginners need to learn that any variable they see on the left-hand side of a function but not the right, or vice versa, is ignorable. This is more mental overhead when reading Roc, but not much IMO.
2. Only allow `*` in function input types. That allows us to communicate that a value is not important (in that it isn't used), but doesn't lead to confusion about its usage in variable types. We'd need to figure out how we should write `empty_list : List *`, but it shouldn't require a massive conversion. We would still need to think "if a variable exists in the return type but not the args, ignore it", but we still get a simpler Roc.
It seems like this comes down to:
- Do we want to remove `*` entirely?
- How much do we care about consistency: does having `*` only in function args and nowhere else bother us?
Oh, we'd want function outputs as well
empty list should be `\{} -> []` if you want a generic list. Otherwise write `empty_list : List _` and the compiler will figure out what type to use for you.
How much do we care about consistency: does having `*` only in function args and nowhere else bother us?
No. Cause it doesn't really make sense anywhere else.
2. Only allow `*` in function input and output types. That allows us to communicate that a value is not important (in that it isn't used), but doesn't lead to confusion about its usage in variable types. We'd need to figure out how we should write `empty_list : List *`, but it shouldn't require a massive conversion. This would probably be the smallest change. Beginners still need to learn what `*` is, but can think of it as a "function cleanup tool" and not need to understand the implications in values.
i would prefer removing `*` only very slightly, due to the specific case of `* -> *`, which I think is confusing because the input and output types are not related whereas they are in `a -> a`, but I would strongly suggest not having a world where `*` can be used on one side of a function and not the other. I think that will be more confusing than having both wildcards and named variables or only having named variables, because it would make the type system seem more complicated than it is.
Yes, I agree that 2 seems like the worst option because it feels like an arbitrary and confusing restriction
I don't quite follow. `* -> *` is just `a -> b`
I think we should remove wildcards
I think it was an experiment that didn't work out
the upside is that it's "kinda nice" when you understand it, but the downside is that it's actively confusing, and the downside comes up a lot in practice
just doesn't seem like it has been worth it overall
I think Roc is totally fine without them, and fewer features is always better unless said features add a significant benefit
I will miss this:
List.len: List * -> U64
But yeah, seems fine to remove
I'll wait for more voices, and then make a GH issue later
yeah, there are things I'll miss about it, but overall I think the language is better off without it
Yeah, I definitely believe that. I mean @Luke Boswell is definitely not a beginner and still hits issues with `*`. That is a pretty clear sign that it should probably be removed.
Thanks @Brendan Hansknecht
I would say I'm comfortable working around the `*`s... but I will admit I still don't fully understand them. I usually just mentally replace them with `a`, `b`, `c`... etc
I also do the mental replacement, meaning I'd rather have the vars anyway
https://github.com/roc-lang/roc/issues/7451
I think `*` was an interesting idea to not have to get deep into type vars early in the learning journey but I think type vars are so omnipresent in languages with generics for good reason.
yeah that was part of the hope, and the other part was just to have a better way to talk about it
like if you say "suppose it just accepts num a" out loud
the listener has to understand that "a" means lowercase A
so sometimes I'd say like "suppose it just accepts num lowercase a"
which always felt awkward, and I was hoping to improve on that by being able to say out loud "suppose it just accepts num star" and then everyone immediately understands the implications of what you're saying
but of course if star is confusing, then that makes it harder to talk about everything with everyone understanding what's being said :sweat_smile:
I think another downside is that even for some functional programmers it ate up some strangeness budget
The way that OCaml does with "backwards types" like int option
bugfix for the issue of wildcards being allowed where they don't make sense
https://github.com/roc-lang/roc/pull/7454
ugh, implementing the replacement for `*` in error messages and there's a really annoying edge case:
my_fn : List(a), Str -> List(a)
my_fn = |list, str| {
inner_fn = |other_list| other_list.len()
# ...
}
let's say there's a type mismatch involving `inner_fn`
and the error message needs to print the inferred type of `inner_fn`
if the compiler looks at that `inner_fn` type in isolation, with wildcards we'd infer the type being `List(*) -> U64`
but if we don't have that, we have to generate a type variable name to use instead of `*`
however, in this case if we generate `List(a) -> U64`, that would not actually be an unbound type, because the type variable `a` is in scope (and has a different meaning) because of the outer annotation
so we need to instead generate `List(b) -> U64`
this is very annoying, because it means you can no longer print types using only the type as the input
you also need to know what other type variables are in scope
however, that's not something that we persist all the way through to error message generation because it's not ordinarily something that matters
in fact there's not ordinarily any way to go from a type mismatch error to figure out what the parent nodes are to re-walk the canonical node tree to figure out what type variables are in scope
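To make the collision concrete, here is a minimal Python sketch (not the Roc compiler's code). The helper names `naive_name`, `scope_aware_name`, and `render_inner_fn` are hypothetical, and the "scope" is modelled as a plain set of type variable names bound by the outer annotation.

```python
import string

def naive_name(in_scope):
    # Always starts from "a", ignoring names the enclosing annotation already binds.
    return "a"

def scope_aware_name(in_scope):
    # Pick the first single-letter name that is not already bound in scope.
    for letter in string.ascii_lowercase:
        if letter not in in_scope:
            return letter
    raise ValueError("ran out of single-letter names")

def render_inner_fn(namer, in_scope):
    # Render the inferred type of an inner function with one unbound type variable.
    return f"List({namer(in_scope)}) -> U64"

# The outer annotation `my_fn : List(a), Str -> List(a)` binds `a`.
outer_scope = {"a"}
print(render_inner_fn(naive_name, outer_scope))        # List(a) -> U64 (wrongly ties it to the outer `a`)
print(render_inner_fn(scope_aware_name, outer_scope))  # List(b) -> U64
```

The second renderer only works because it is handed `outer_scope`, which is exactly the information that is no longer available at error-reporting time.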
Richard Feldman said:
with wildcards we'd infer the type being `List(*) -> U64`
In isolation we don't know anything about the types other than a static dispatch call?
inner_fn : a -> b where a.len() -> b
sure, whatever - maybe not the perfect example
the point is just that if you have an inner function with an unbound type var
and there is an outer type in scope which has a type var named like `a` or something that the auto-generator would collide with
it causes this problem, and this problem seems to require a ton of complexity to fix :disappointed:
I think I understand the problem now... :thinking:
shouldn't the type var for inner fn need to be calculated at some point later on? And shouldn't it either be concretized or bound to a var from the outer fn?
the best solution I've come up with so far is that we keep the scope around long enough so that we still have it when we generate the `Problem`
and then right when we're taking a snapshot of the type for the `Problem`, that's when we auto-generate the names for the variables (instead of doing it later in reporting like we would otherwise)
I'm not quite sure how easy it will be to know what's in scope for the mismatched thing, because that might happen during unification, and I'm not sure if the scope will still make sense then, but it's the best idea I have so far :sweat_smile:
Anthony Bullard said:
shouldn't the type var for inner fn need to be calculated at some point later on? And shouldn't it either be concretized or bound to a var from the outer fn?
the point is that it doesn't have a name in memory
it's just like "type var number 2439" or whatever
it's easy to render unnamed type vars as `*`
but if we want to give them a name, we need some algorithm to generate a name that's not already taken
unless we want to generate `_`, which definitely seems worse because then it's like "I'm not even telling you what's here; maybe it's an unbound type variable, but who knows? It could be anything!" :laughing:
Hm i'm trying to think what you'd get in other functional languages
i'd also need to see the actual full code snippet
because i can't see a way where there's a type mismatch in inner_fn here that you wouldn't have a way to fill that type var
it's the same problem in Elm and Haskell (if you have the language extension turned on for variables in outer scopes being accessible from inner scopes)
Because this seems like something solved by every HM type system
I mean I don't know how they solve it in particular, but either they have a solution or they have a bug :laughing:
makes me want to open up ellie again
does that still exist? i'm on my phone
yeah you should be able to repro it in Elm :thumbs_up:
yep!
nothing like running elm compiler in haskell compiled to wasm on my iphone
it's a bug in Elm
Screenshot 2025-07-02 at 9.48.01 PM.png
it should say `List b` or something
i don't know if that's a bug
sure it is haha
if the outer fn had a regular argument a
and the inner one did too (assuming shadowing is allowed), there is no conflict
so why is there here with type vars?
the type of `innerFn` is not connected in any way to the type of the outer function
but if its argument type contains `a`, that is saying - in Elm's type system - that they are connected
the bug is that it's reporting that there is a more restrictive type on `innerFn`'s first argument than reality
in other words, this error message is saying "you can only give `innerFn` a list that's of the same type that you gave the outer function"
which is not true; you can give it any list you like!
if it said `List b` then it would be accurate
or really any other name besides exactly `a`
but of course that's what it auto-generated, presumably because the auto-generator code was written before Elm added the feature of inner types being able to reference type variables in outer scopes, which if memory serves was around Elm 0.14 or so
because I never knew that type system feature existed until I heard about Evan talking about adding it :laughing:
example of this distinction being relevant:
Screenshot 2025-07-02 at 10.02.19 PM.png
but if I change it to `innerFn listA` it compiles just fine
because `innerFn : List a -> ...` is saying that `innerFn` only accepts lists with the same type of element as the `listA` argument
just to be super clear, this is the edge case to end all edge cases and approximately nobody will notice if it's broken in Roc's compiler, but I still want to do it correctly if we're going to do it :stuck_out_tongue:
image_067A4BBE-A99A-40FD-ABE9-C5DC74A8E5EB_1751508252.jpeg
I don't know if OCaml has the feature where inner type annotations can be connected to outer type annotations by using the same variable names though
you'd have to try to reproduce that last Ellie screenshot I posted above
see if it gives an error; if not, OCaml's type annotations are disconnected (which I think they are?) and that error is not a bug in OCaml
i get it now. we are saying `List a` because it would be `List a` if that function, unannotated, was on the top level
but since it's inside of a scope List(a) MUST mean the a of the outer function if it appears there
So the best you could do is capture the next possible type var in that scope and put it in the problem when we create it
I now want to see what this scenario looks like in every type system with generics
Silly idea ... maybe when we produce error reports and have generated type vars, we use the reverse alphabet: `z`, `y`, `x`, ...
Or, prefix it with the name of the function
Or use a sigil maybe?
it could be inner_a
Seems reasonable to me
we could even note the conflict in the prob and have a note that says "this is NOT the a from <outer function name>"
I like this solution.
Only potential issue I can think of is maybe the name is super long and that's kind of annoying. But probably not an issue in practice.
I don't think a sigil is a good idea. It would mean you can't copy a generated type and paste it into your program without errors. A prefix is better but introduces an implicit naming convention, which is not bad; it would hardly be a problem for anyone.
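I thought of a very simple solution:
- when we're canonicalizing a module, we write down every unique type var name used anywhere in that module in any scope
- when generating unique names, we just make sure to avoid all of those
this might lead to the auto-generated variable name being like `List(d)` when it could have been `List(b)`
but it'll be accurate, and I don't think anyone cares :stuck_out_tongue: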
had we come up with generated names before?
I assume generated names could be confusing because one couldn't find them in their code. prefix makes it more explicit
we never needed to come up with generated names before, because we had `*`
removing `*` requires generating names
then we never had this kind of confusion
correct
a downside of removing `*` is that it introduces this problem, and this is one of the problems I was hoping to avoid by having `*` in the language :smile:
however, my conclusion is that overall the downsides of `*` outweigh the upsides and we should generate names instead (like every other language)
prefix does not solve the problem btw
if you choose `my_fn_a` as the name, then that still collides if someone happens to choose `my_fn_a` as their type variable name
and if you're ok with collisions being unlikely, but still possible, then it's definitely best to just have the bug like Elm does, because in practice nobody is going to notice either way :stuck_out_tongue:
I agree. I just mean we need to communicate generated names somehow; otherwise I anticipate questions like "what does this mean? I don't have this name in my code"
Richard Feldman said:
I thought of a very simple solution:
- when we're canonicalizing a module, we write down every unique type var name used anywhere in that module in any scope
- when generating unique names, we just make sure to avoid all of those
I'm fine with this very simple solution though :point_up:
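The two bullets quoted above can be sketched in a few lines of Python. This is an illustration under assumptions, not the compiler's implementation: `collect_used_names`, `candidate_names`, and `fresh_names` are hypothetical names, and each scope is modelled as a set of the type variable names its annotations mention.

```python
import itertools
import string

def candidate_names():
    # Yields a, b, ..., z, then a1, b1, ..., z1, then a2, ... (never runs out).
    for round_number in itertools.count():
        suffix = "" if round_number == 0 else str(round_number)
        for letter in string.ascii_lowercase:
            yield letter + suffix

def collect_used_names(scopes):
    # Step 1: while walking the module, record every type var name from every scope.
    return set().union(*scopes)

def fresh_names(used, count):
    # Step 2: generate `count` names that avoid everything recorded in step 1.
    unused = (name for name in candidate_names() if name not in used)
    return list(itertools.islice(unused, count))

# One annotation uses `a`; an unrelated function elsewhere in the module uses `b` and `c`.
module_scopes = [{"a"}, {"b", "c"}]
used = collect_used_names(module_scopes)
print(fresh_names(used, 2))  # ['d', 'e']
```

Because the set is module-wide rather than per-scope, the generator skips `b` and `c` even where they would have been safe, which is the `List(d)` instead of `List(b)` trade-off mentioned in the thread.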
Kiryl Dziamura said:
I agree. I just mean we need to communicate generated names somehow; otherwise I anticipate questions like "what does this mean? I don't have this name in my code"
I haven't seen this in other languages which do this
lifetimes in rust?
yeah I haven't seen people be confused about that particular aspect of lifetimes :laughing:
or like in Elm I haven't seen people say "hey why is it called `a` in `List a` when I don't have an `a` anywhere in my code?"
or rather, people generally seem to wonder about the semantics
yeah I haven't seen people be confused about that particular aspect of lifetimes
you're talking with one of them right now :D
I think the SML style of using a, b is an unfortunate thing to propagate
what would be a better style?
names that mean something :rolling_on_the_floor_laughing:
like List(item)
Map(key,value)
we could do that in some cases
but that would make this problem harder :smile:
like for example, if I define `List` as `List(elem) := ...` then we could choose `elem` as the default unbound var name
it doesn't solve this problem, but maybe makes the appearance of random names from unannotated code easier to deal with
actually, we could do like `List(elem2)` or something
might be confusing though? not sure
I have a crazy idea
actually, we could do like `List(elem2)` or something
damn, that was my crazy idea
so we extend what type already has in its default name
yeah I'm not sure how it would look in practice, might be weird? I'm not sure
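For a feel of how the `elem2` idea might look, here is a small Python sketch. It assumes the type definition supplies a default variable name (such as `elem` for `List(elem)`) and that we have a set of names to avoid; `suffixed_name` is a hypothetical helper, not an existing compiler function.

```python
def suffixed_name(default_name, in_use):
    # Use the type's own default name when it is free.
    if default_name not in in_use:
        return default_name
    # Otherwise append the smallest numeric suffix (starting at 2) that is free.
    n = 2
    while f"{default_name}{n}" in in_use:
        n += 1
    return f"{default_name}{n}"

print(suffixed_name("elem", set()))              # elem
print(suffixed_name("elem", {"elem"}))           # elem2
print(suffixed_name("elem", {"elem", "elem2"}))  # elem3
```

This still needs the set of names to avoid, so it changes how generated names look rather than removing the scope problem.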
I wonder how common this case is
is there a world where * could only exist in problems, with appropriate context when it appears?
"something_else" lol
or as a sigil after the default type var?
like:
TYPE MISMATCH
in foo.roc 5:14
5: inner_fn(str)
^--
This is a
`Str`
but I was expecting a
`List(item*)`
Where `item*` is a type variable that has not
been given a name and should not be confused
with a type variable `item` in scope
My point is if it expects this type, and I copypaste it in my code - parser won't like it
but I was expecting a
`List(item*)`
Do people do that? And if so do they expect it to work without the understanding that it's not valid syntax?
can't say for people, but for me, it's confusing to see invalid syntax even in such reports
I think Richard's approach of just taking the first open type var in the alphabetic sequence is fine
Though i still maintain that in the actual definition of types, we should promote the use of meaningful type vars
And then the appearance of these single letter type vars (probably starting at a, or close to it) are at least a sign that we just don't know the type for it
With a similar note to that above
"Here `a` is not a named type var, but a valid one to introduce in this scope."
And `a` could be replaced with any letter
We could also suggest ways to improve the report
do you mind starting a thread for unbound var naming in #ideas ? these single letters are really confusing for beginners (I remember how they confused me previously in other languages)
Start at the beginning of the sequence and look up if it's a type var in this scope, if not, use it.
Anthony Bullard said:
Start at the beginning of the sequence and look up if it's a type var in this scope, if not, use it.
this is super complicated
it's very easy to say and adds an insane amount of complexity to the compiler :joy:
because the scope is gone at the point where we discover the type mismatch
and also the type knows which CIR node it came from, but nodes only know their children, not their parents, so it's also hard to walk back up the tree to recreate the scope
the straightforward way to do "look up if it's a type var in this scope" is "literally redo all of canonicalization on the file every time we want to generate a type variable"
which would be...suboptimal for compile times :laughing:
that's why it's appealing to just build up a list of "here are all the type variable names we use anywhere in any scope in this module" as we're doing canonicalization
and when we're generating type var names, just make sure they aren't in that list and we're all set
no conflicts, guaranteed, minimal complexity, and minimal performance cost
as an aside, regarding meaningful type var names - for years I did this in Elm and I honestly have mixed feelings about it in retrospect
Richard Feldman said:
that's why it's appealing to just build up a list of "here are all the type variable names we use anywhere in any scope in this module" as we're doing canonicalization
sorry this is exactly what i meant
in Elm I would write things like this all the time:
cancelButton : Html msg
cancelButton = button [] [ text "Ok" ]
send the list with the problem, and do the above at time of rendering the report
instead of this:
cancelButton : Html a
cancelButton = button [] [ text "Ok" ]
Here's my topic https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/Unbound.20type.20variable.20naming.20conventions/near/527008506
oh ok I'll re-post there! :thumbs_up:
Richard Feldman said:
it's a bug in Elm
Richard Feldman said:
the bug is that it's reporting that there is a more restrictive type on `innerFn`'s first argument than reality
Sorry to come late to the party, but this is not what's happening in Elm. The "a" in the error message is not connected to the "a" from the type signature. If you change the type signature to use the type variable "z", the error message still says that the first argument of `innerFn` needs to be `List a`.
So the "bug" is that the Elm compiler doesn't check whether the general type variable "a" used in the error message is already defined in the outer scope.
yep, that's the bug! :smile:
it's very niche
the reason it's a bug is that what it's saying is not true. It is not true that that value's type is `List a`, because if that were true, then adding a type annotation of `List a` to that value would be a no-op because that's already its type
but adding that annotation would not be a no-op! It would change the value's type to a more restrictive type.
that's why it's inaccurate to claim that `List a` is that value's type. In the context of that particular value, the type variable `a` is in scope and has meaning.
If the problem comes from not having `*`, and the problem with that is just that seeing `*` is confusing, we could have a keyword "unbound" that has the same meaning as `*`. That would be its only use.
Last updated: Jul 06 2025 at 12:14 UTC