Remove wildcard type vars from Roc · compiler development

List _ -> U64 is a valid signature for List.get on a list of U64.
List * -> U64 can only return the length or capacity and can never access an element.

Luke Boswell (Jan 01 2025 at 22:29):

Oh ... that's interesting. so * is polymorphic and _ isn't?

Sam Mohr (Jan 01 2025 at 22:29):

Which is why a function that accepts [Red, Blue]_ called with [Red, Blue, Green] and [Red, Blue, Yellow] will only resolve correctly for one of them

Brendan Hansknecht (Jan 01 2025 at 22:29):

_ is more flexible. Technically _ could resolve to *

Sam Mohr (Jan 01 2025 at 22:29):

* isn't quite polymorphic, I think

Sam Mohr (Jan 01 2025 at 22:31):

It means "this variable must work for any type replaced", so it can only work for, say, empty lists

Sam Mohr (Jan 01 2025 at 22:32):

Since no matter what type [] is, you can't get a value out of it, so the elem type doesn't matter

Brendan Hansknecht (Jan 01 2025 at 22:32):

so it can only work for, say, empty lists

If talking about a concrete value. Means something different for a function input.

Sam Mohr (Jan 01 2025 at 22:32):

Yeah...

Sam Mohr (Jan 01 2025 at 22:32):

^ Another reason

Brendan Hansknecht (Jan 01 2025 at 22:34):

Anyway, _ is just a request for the compiler to fill in the blank, it is super flexible.

» x : List _ -> _
… x = \l -> List.get l 0 ?? 24

<function> : List (Num a) -> Num a
» y : List _ -> _
… y = \l -> List.map l \i -> i - 1

<function> : List (Num a) -> List (Num a)
» z : List _ -> _
… z = \l -> List.get l 1

<function> : List a -> Result a [OutOfBounds]

Brendan Hansknecht (Jan 01 2025 at 22:34):

None of those functions are valid if * was used for the first blank instead of _

Brendan Hansknecht (Jan 01 2025 at 22:36):

Anyway, I personally would prefer to restrict * to only be allowed in function inputs. I think that is the main valid use case for it. Ban it in all other locations.

But I understand just using an arbitrary letter instead of *

Sam Mohr (Jan 01 2025 at 22:37):

Seems simpler to just get rid of it

Sam Mohr (Jan 01 2025 at 22:38):

And then beginners don't say "oh, what's this" followed by "why can't I use it elsewhere"

Luke Boswell (Jan 01 2025 at 22:38):

That definitely wasn't what I came here for... but it has come up before

Sam Mohr (Jan 01 2025 at 22:38):

I'm ON a MISSION

Sam Mohr (Jan 01 2025 at 22:38):

You are a convenient vehicle

Brendan Hansknecht (Jan 01 2025 at 22:38):

I just really hate the arbitrary letters gunking up my type signatures.

Sam Mohr (Jan 01 2025 at 22:39):

Me too

Sam Mohr (Jan 01 2025 at 22:39):

Okay, separate thread incoming

Sam Mohr (Jan 01 2025 at 22:39):

(unless I can find an old one to necro)

Ayaz Hafiz (Jan 01 2025 at 22:40):

i'm pretty sure * and a like variables have already been banned from all non-generalizable positions and only functions are generalizable

Brendan Hansknecht (Jan 01 2025 at 22:41):

You can still do stuff like this:

» x : List *
… x = []

Brendan Hansknecht (Jan 01 2025 at 22:42):

And this:

» y : {} -> List *
… y = \{} -> []

Sam Mohr (Jan 01 2025 at 22:42):

Yep, this typechecks

module [num, func, func2]

num : Num *
num = 123

func : {} -> List *
func = \{} -> []

func2 : List * -> U64
func2 = \_list -> 123

Brendan Hansknecht (Jan 01 2025 at 22:43):

num : Num *
num = 123

This surprises me. I think that is just a bug

Sam Mohr (Jan 01 2025 at 22:43):

Oh, that's probably part of the one I got assigned

Sam Mohr (Jan 01 2025 at 22:43):

Nope, it's different

Sam Mohr (Jan 01 2025 at 22:44):

https://github.com/roc-lang/roc/issues/7357 shows that types implemented in Zig like List don't have their type args checked properly, since we only check aliases for type args

Sam Mohr (Jan 01 2025 at 22:44):

And List isn't registered as a normal type alias

Sam Mohr (Jan 01 2025 at 22:45):

But this is different, since Num is defined as Num range := range

Sam Mohr (Jan 01 2025 at 22:45):

It might be that numbers are special cased

Sam Mohr (Jan 01 2025 at 22:46):

well, even though I can't define a concrete value, I can still type def one without a problem

module [my_abc]

Abc a : { field : a }

my_abc : Abc *

Notification Bot (Jan 01 2025 at 22:47):

51 messages were moved here from #compiler development > Wildcard in opaque types by Sam Mohr.

Notification Bot (Jan 01 2025 at 22:47):

52 messages were moved here from #compiler development > Remove wildcards from Roc by Sam Mohr.

Sam Mohr (Jan 01 2025 at 22:48):

Try again

Notification Bot (Jan 01 2025 at 22:48):

43 messages were moved here from #compiler development > Wildcard in opaque types by Sam Mohr.

Sam Mohr (Jan 01 2025 at 22:48):

That's a better range

Ayaz Hafiz (Jan 01 2025 at 23:02):

x : List *
This is definitely a bug

y : {} -> List *
This seems fine and should be allowed, \{} -> [] is a generalizable function

num : Num *
This is a special case today but i think it's worth removing

func2 : List * -> U64
This also seems fine because the function can take in a list of any type

Brendan Hansknecht (Jan 01 2025 at 23:05):

\{} -> [] is a generalizable function

Is this a valid List.empty? I wasn't sure if * in an output location generalized.

Ayaz Hafiz (Jan 01 2025 at 23:06):

yeah it's generalized

Ayaz Hafiz (Jan 01 2025 at 23:06):

the intuition is that because the entry point of the program is concrete, by the transitive property, all used values must eventually be concrete as well

Brendan Hansknecht (Jan 01 2025 at 23:07):

Ah yeah, just realized we do this for Dict.empty so that makes sense.

Ayaz Hafiz (Jan 01 2025 at 23:07):

if a value's type is not concrete after you propagate all the types through, then the value must never be used. in that case it's sufficient to generate the void type

Brendan Hansknecht (Jan 01 2025 at 23:07):

I would be quite happy if * was restricted to only function inputs and outputs.

Ayaz Hafiz (Jan 01 2025 at 23:07):

agreed

Ayaz Hafiz (Jan 01 2025 at 23:08):

and any named type variables too

Sam Mohr (Jan 01 2025 at 23:08):

So it seems like we have two positions that move us in the direction of minimizing wildcard * type variables in Roc:

1. Remove * entirely, and replace all usages with named variables. There are quite a few usages in our builtins, but there doesn't seem to be anything that uses more than one or two at a time, meaning we aren't introducing a whole soup of variables. This avoids the multiple times beginners have been confused by wildcards, and should improve comprehensibility for newcomers. The cost here is that beginners need to consider that any variables they see on the left-hand side of a function but not the right, or vice versa, are ignorable. This is more mental overhead when reading Roc, but not much IMO.
2. Only allow * in function input types. That allows us to communicate that a value is not important (in that it isn't used), but doesn't lead to confusion about its usage in variable types. We'd need to figure out how we should write empty_list : List *, but it shouldn't require a massive conversion. We still need to now think "if a variable exists in the return type but not the args, ignore it", but we still get a simpler Roc.

It seems like this comes down to:

Would beginners get a significant learning improvement from removing * entirely?
How much do we care about consistency: does having * only in function args and nowhere else bother us?

Sam Mohr (Jan 01 2025 at 23:08):

Oh, we'd want function outputs as well

Ayaz Hafiz (Jan 01 2025 at 23:09):

empty list should be \{} -> [] if you want a generic list. Otherwise write empty_list : List _ and the compiler will figure out what type to use for you.

Brendan Hansknecht (Jan 01 2025 at 23:10):

How much do we care about consistency: does having * only in function args and nowhere else bother us?

No. Cause it doesn't really make sense anywhere else.

Sam Mohr (Jan 01 2025 at 23:11):

Only allow * in function input and output types. That allows us to communicate that a value is not important (in that it isn't used), but doesn't lead to confusion about its usage in variable types. We'd need to figure out how we should write empty_list : List *, but it shouldn't require a massive conversion. This would probably be the smallest change. Beginners still need to learn what * is, but can think of it as a "function cleanup tool" and not need to understand the implications in values.

Ayaz Hafiz (Jan 01 2025 at 23:12):

i would prefer removing * only very slightly due to the specific case of * -> * I think being confusing because the input and output types are not related whereas they are in a -> a, but I would strongly suggest not having a world where * can be used on one side of a function and not the other. I think that will be more confusing than having both wildcards and named variables or only having named variables, because it would make the type system seem more complicated than it is.

Sam Mohr (Jan 01 2025 at 23:13):

Yes, I agree that 2 seems like the worst option because it feels like an arbitrary and confusing restriction

Brendan Hansknecht (Jan 01 2025 at 23:13):

I don't quite follow. * -> * is just a -> b
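
The equivalence in that message can be sketched mechanically: each * stands for its own independent variable, so replacing wildcards means giving every occurrence a distinct unused name. Below is a minimal Python sketch, not Roc compiler code. `fresh_names` and `name_wildcards` are hypothetical helpers, and treating a signature as plain text is an assumption made only for illustration.

```python
import re
from itertools import count
from string import ascii_lowercase


def fresh_names(taken):
    """Yield a, b, ..., z, a2, b2, ... skipping anything in `taken`."""
    for round_number in count(1):
        suffix = "" if round_number == 1 else str(round_number)
        for letter in ascii_lowercase:
            name = letter + suffix
            if name not in taken:
                yield name


def name_wildcards(signature):
    """Replace each `*` with its own fresh type variable name (hypothetical helper)."""
    # Lowercase identifiers already present in the signature count as taken,
    # so a new name never accidentally ties a wildcard to an existing variable.
    taken = set(re.findall(r"\b[a-z][A-Za-z0-9_]*\b", signature))
    names = fresh_names(taken)
    # Every wildcard gets a different name because wildcards are unrelated to each other.
    return re.sub(r"\*", lambda _match: next(names), signature)
```

Under these assumptions, `name_wildcards("* -> *")` gives `a -> b` and `name_wildcards("List * -> U64")` gives `List a -> U64`, while a signature that already uses `a` gets `b` for its wildcard.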

Richard Feldman (Jan 01 2025 at 23:13):

I think we should remove wildcards

Richard Feldman (Jan 01 2025 at 23:14):

I think it was an experiment that didn't work out

Richard Feldman (Jan 01 2025 at 23:14):

the upside is that it's "kinda nice" when you understand it, but the downside is that it's actively confusing, and the downside comes up a lot in practice

Richard Feldman (Jan 01 2025 at 23:14):

just doesn't seem like it has been worth it overall

Sam Mohr (Jan 01 2025 at 23:14):

I think Roc is totally fine without them, and fewer features is always better unless said features add a significant benefit

Brendan Hansknecht (Jan 01 2025 at 23:15):

I will miss this:
List.len: List * -> U64

Brendan Hansknecht (Jan 01 2025 at 23:15):

But yeah, seems fine to remove

Sam Mohr (Jan 01 2025 at 23:15):

I'll wait for more voices, and then make a GH issue later

Richard Feldman (Jan 01 2025 at 23:15):

yeah, there are things I'll miss about it, but overall I think the language is better off without it

Brendan Hansknecht (Jan 01 2025 at 23:16):

Yeah, I definitely believe that. I mean @Luke Boswell is definitely not a beginner and still hits issues with *. That is a pretty clear sign that it should probably be removed.

Luke Boswell (Jan 01 2025 at 23:18):

Thanks @Brendan Hansknecht

Luke Boswell (Jan 01 2025 at 23:19):

I would say I'm comfortable working around the *'s... but I will admit I still don't fully understand them. I usually just mentally replace them with a, b, c... etc

Sam Mohr (Jan 01 2025 at 23:20):

I also do the mental replacement, meaning I'd rather have the vars anyway

Sam Mohr (Jan 02 2025 at 00:11):

https://github.com/roc-lang/roc/issues/7451

Anthony Bullard (Jan 02 2025 at 03:14):

I think * was an interesting idea to not have to get deep into type vars early in the learning journey but I think type vars are so omnipresent in languages with generics for good reason.

Richard Feldman (Jan 02 2025 at 03:20):

yeah that was part of the hope, and the other part was just to have a better way to talk about it

Richard Feldman (Jan 02 2025 at 03:20):

like if you say "suppose it just accepts num a"

Richard Feldman (Jan 02 2025 at 03:20):

out loud

Richard Feldman (Jan 02 2025 at 03:20):

the listener has to understand that "a" means lowercase A

Richard Feldman (Jan 02 2025 at 03:20):

so sometimes I'd say like "suppose it just accepts num lowercase a"

Richard Feldman (Jan 02 2025 at 03:21):

which always felt awkward, and I was hoping to improve on that by being able to say out loud "suppose it just accepts num star" and then everyone immediately understands the implications of what you're saying

Richard Feldman (Jan 02 2025 at 03:22):

but of course if star is confusing, then that makes it harder to talk about everything with everyone understanding what's being said :sweat_smile:

Anthony Bullard (Jan 02 2025 at 03:24):

I think another downside is that even for some functional programmers it ate up some strangeness budget

Anthony Bullard (Jan 02 2025 at 03:25):

The way that OCaml does with "backwards types" like int option

Ayaz Hafiz (Jan 02 2025 at 21:48):

bugfix for the issue of wildcards where they don't make sense

https://github.com/roc-lang/roc/pull/7454

Richard Feldman (Jul 03 2025 at 01:26):

ugh, implementing the replacement for * in error messages and there's a really annoying edge case:

my_fn : List(a), Str -> List(a)
my_fn = |list, str| {
    inner_fn = |other_list| other_list.len()
    # ...
}

let's say there's a type mismatch involving inner_fn and the error message needs to print the inferred type of inner_fn

Richard Feldman (Jul 03 2025 at 01:27):

if the compiler looks at that inner_fn type in isolation, with wildcards we'd infer the type being List(*) -> U64

Richard Feldman (Jul 03 2025 at 01:27):

but if we don't have that, we have to generate a type variable name to use instead of *

Richard Feldman (Jul 03 2025 at 01:28):

however, in this case if we generate List(a) -> U64, that would not actually be an unbound type, because the type variable a is in scope (and has a different meaning) because of the outer annotation

Richard Feldman (Jul 03 2025 at 01:28):

so we need to instead generate List(b) -> U64

Richard Feldman (Jul 03 2025 at 01:28):

this is very annoying, because it means you can no longer print types using only the type as the input

Richard Feldman (Jul 03 2025 at 01:28):

you also need to know what other type variables are in scope

Richard Feldman (Jul 03 2025 at 01:29):

however, that's not something that we persist all the way through to error message generation because it's not ordinarily something that matters

Richard Feldman (Jul 03 2025 at 01:29):

in fact there's not ordinarily any way to go from a type mismatch error to figure out what the parent nodes are to re-walk the canonical node tree to figure out what type variables are in scope

Luke Boswell (Jul 03 2025 at 01:30):

Richard Feldman said:

-> U64

In isolation we don't know anything about the types other than a static dispatch call?

inner_fn : a -> b where a.len() -> b

Richard Feldman (Jul 03 2025 at 01:31):

sure, whatever - maybe not the perfect example

Richard Feldman (Jul 03 2025 at 01:31):

the point is just that if you have an inner function with an unbound type var

Richard Feldman (Jul 03 2025 at 01:31):

and there is an outer type in scope which has a type var named like a or something that the auto-generator would collide with

Richard Feldman (Jul 03 2025 at 01:32):

it causes this problem, and this problem seems to require a ton of complexity to fix :disappointed:

Luke Boswell (Jul 03 2025 at 01:33):

I think I understand the problem now... :thinking:

Anthony Bullard (Jul 03 2025 at 01:35):

shouldn't the type var for inner fn need to be calculated at some point later on? And shouldn't it either be concretized or bound to a var from the outer fn?

Richard Feldman (Jul 03 2025 at 01:35):

the best solution I've come up with so far is that we keep the scope around long enough so that we still have it when we generate the Problem

Richard Feldman (Jul 03 2025 at 01:35):

and then right when we're taking a snapshot of the type for the Problem, that's when we auto-generate the names for the variables (instead of doing it later in reporting like we would otherwise)

Richard Feldman (Jul 03 2025 at 01:36):

I'm not quite sure how easy it will be to know what's in scope for the mismatched thing, because that might happen during unification, and I'm not sure if the scope will still make sense then, but it's the best idea I have so far :sweat_smile:

Richard Feldman (Jul 03 2025 at 01:36):

Anthony Bullard said:

shouldn't the type var for inner fn need to be calculated at some point later on? And shouldn't it either be concretized or bound to a var from the outer fn?

the point is that it doesn't have a name in memory

Richard Feldman (Jul 03 2025 at 01:37):

it's just like "type var number 2439" or whatever

Richard Feldman (Jul 03 2025 at 01:37):

it's easy to render unnamed type vars as * but if we want to give them a name, we need some algorithm to generate a name that's not already taken

Richard Feldman (Jul 03 2025 at 01:38):

unless we want to generate _ which definitely seems worse because then it's like "I'm not even telling you what's here; maybe it's an unbound type variable, but who knows? It could be anything!" :laughing:

Anthony Bullard (Jul 03 2025 at 01:41):

Hm i'm trying to think what you'd get in other functional languages

Anthony Bullard (Jul 03 2025 at 01:42):

i'd also need to see the actual full code snippet

Anthony Bullard (Jul 03 2025 at 01:43):

because i can't see a way where there's a type mismatch in inner_fn here that you wouldn't have a way to fill that type var

Richard Feldman (Jul 03 2025 at 01:44):

it's the same problem in Elm and Haskell (if you have the language extension turned on for variables in outer scopes being accessible from inner scopes)

Anthony Bullard (Jul 03 2025 at 01:44):

Because this seems like something solved by every HM type system

Richard Feldman (Jul 03 2025 at 01:44):

I mean I don't know how they solve it in particular, but either they have a solution or they have a bug :laughing:

Anthony Bullard (Jul 03 2025 at 01:45):

makes me want to open up ellie again

Anthony Bullard (Jul 03 2025 at 01:45):

does that still exist? i'm on my phone

Richard Feldman (Jul 03 2025 at 01:45):

yeah you should be able to repro it in Elm :thumbs_up:

Richard Feldman (Jul 03 2025 at 01:45):

yep!

Anthony Bullard (Jul 03 2025 at 01:45):

nothing like running elm compiler in haskell compiled to wasm on my iphone

Richard Feldman (Jul 03 2025 at 01:48):

it's a bug in Elm

[screenshot: Screenshot 2025-07-02 at 9.48.01 PM.png]

Richard Feldman (Jul 03 2025 at 01:48):

it should say List b or something

Anthony Bullard (Jul 03 2025 at 01:54):

i don't know if that's a bug

Richard Feldman (Jul 03 2025 at 01:54):

sure it is haha

Anthony Bullard (Jul 03 2025 at 01:55):

if the outer fn had a regular argument a and the inner one did too (assuming shadowing is allowed), there is no conflict

Anthony Bullard (Jul 03 2025 at 01:55):

so why is there here with type vars?

Richard Feldman (Jul 03 2025 at 01:55):

the type of innerFn is not connected in any way to the type of the outer function

Richard Feldman (Jul 03 2025 at 01:56):

but if its argument type contains a, that is saying - in Elm's type system - that they are connected

Richard Feldman (Jul 03 2025 at 01:56):

the bug is that it's reporting that there is a more restrictive type on innerFn's first argument than reality

Richard Feldman (Jul 03 2025 at 01:56):

in other words, this error message is saying "you can only give innerFn a list that's of the same type that you gave the outer function"

Richard Feldman (Jul 03 2025 at 01:56):

which is not true; you can give it any list you like!

Richard Feldman (Jul 03 2025 at 01:57):

if it said List b then it would be accurate

Richard Feldman (Jul 03 2025 at 01:57):

or really any other name besides exactly a

Richard Feldman (Jul 03 2025 at 01:58):

but of course that's what it auto-generated, presumably because the auto-generator code was written before Elm added the feature of inner types being able to reference type variables in outer scopes, which if memory serves was around Elm 0.14 or so

Richard Feldman (Jul 03 2025 at 01:58):

because I never knew that type system feature existed until I heard about Evan talking about adding it :laughing:

Richard Feldman (Jul 03 2025 at 02:02):

example of this distinction being relevant:

[screenshot: Screenshot 2025-07-02 at 10.02.19 PM.png]

Richard Feldman (Jul 03 2025 at 02:02):

but if I change it to innerFn listA it compiles just fine

Richard Feldman (Jul 03 2025 at 02:03):

because innerFn : List a -> ... is saying that innerFn only accepts lists with the same type of element as the listA argument

Richard Feldman (Jul 03 2025 at 02:04):

just to be super clear, this is the edge case to end all edge cases and approximately nobody will notice if it's broken in Roc's compiler, but I still want to do it correctly if we're going to do it :stuck_out_tongue:

Anthony Bullard (Jul 03 2025 at 02:04):

[image: image_067A4BBE-A99A-40FD-ABE9-C5DC74A8E5EB_1751508252.jpeg]

Richard Feldman (Jul 03 2025 at 02:05):

I don't know if OCaml has the feature where inner type annotations can be connected to outer type annotations by using the same variable names though

Richard Feldman (Jul 03 2025 at 02:05):

you'd have to try to reproduce that last Ellie screenshot I posted above

Richard Feldman (Jul 03 2025 at 02:05):

see if it gives an error; if not, OCaml's type annotations are disconnected (which I think they are?) and that error is not a bug in OCaml

Anthony Bullard (Jul 03 2025 at 02:07):

i get it now. we are saying List a because it would be List a if that function, unannotated, was on the top level

Anthony Bullard (Jul 03 2025 at 02:08):

but since it's inside of a scope List(a) MUST mean the a of the outer function if it appears there

Anthony Bullard (Jul 03 2025 at 02:09):

So the best you could do is capture the next possible type var in that scope and put it in the problem when we create it

Anthony Bullard (Jul 03 2025 at 02:10):

I now want to see what this scenario looks like in every type system with generics

Luke Boswell (Jul 03 2025 at 02:11):

Silly idea ... maybe when we produce error reports and have generated type vars, we use the reverse alphabet, z y x ...

Anthony Bullard (Jul 03 2025 at 02:11):

Or, prefix it with the name of the function

Luke Boswell (Jul 03 2025 at 02:11):

Or use a sigil maybe?

Anthony Bullard (Jul 03 2025 at 02:11):

it could be inner_a

Anthony Bullard (Jul 03 2025 at 02:12):

Seems reasonable to me

Anthony Bullard (Jul 03 2025 at 02:13):

we could even note the conflict in the prob and have a note that says "this is NOT the a from <outer function name>"

Luke Boswell (Jul 03 2025 at 02:15):

I like this solution.

Only potential issue I can think of is maybe the name is super long and that's kind of annoying. But probably not an issue in practice.

Kiryl Dziamura (Jul 03 2025 at 08:28):

I don't think a sigil is a good idea. it would mean you can't copy a generated type and paste it into your program without errors. a prefix is better but introduces an implicit naming convention, which is not bad; it would hardly be a problem for anyone

Richard Feldman (Jul 03 2025 at 13:42):

I thought of a very simple solution:

when we're canonicalizing a module, we write down every unique type var name used anywhere in that module in any scope
when generating unique names, we just make sure to avoid all of those

Richard Feldman (Jul 03 2025 at 13:43):

this might lead to the auto-generated variable name being like List(d) when it could have been List(b) but it'll be accurate, and I don't think anyone cares :stuck_out_tongue:
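
A minimal Python sketch of that module-wide approach, under stated assumptions: `ModuleTypeVarNames`, `record`, and `generate` are hypothetical names, not the Roc compiler's API, and the a, b, ..., z, a2, ... naming order is just one possible choice.

```python
from itertools import count
from string import ascii_lowercase


class ModuleTypeVarNames:
    """Hypothetical record of every type variable name written in one module."""

    def __init__(self):
        self.used = set()

    def record(self, name):
        # Assumed to be called during canonicalization for each type variable
        # that appears in any annotation, regardless of which scope it is in.
        self.used.add(name)

    def _candidates(self):
        # a, b, ..., z, a2, b2, ... with every recorded name skipped.
        for round_number in count(1):
            suffix = "" if round_number == 1 else str(round_number)
            for letter in ascii_lowercase:
                candidate = letter + suffix
                if candidate not in self.used:
                    yield candidate

    def generate(self, how_many):
        """Return `how_many` distinct names that collide with nothing recorded."""
        candidates = self._candidates()
        generated = []
        while len(generated) < how_many:
            generated.append(next(candidates))
        return generated
```

With only `a` recorded (as in the my_fn example), `generate(1)` returns `["b"]`. If `b` and `c` are also recorded from unrelated scopes elsewhere in the module, it returns `["d"]`: accurate, though not the smallest name that would have been safe in that particular scope.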

Kiryl Dziamura (Jul 03 2025 at 13:44):

have we come up with generated names before?
I assume generated names could be confusing because one couldn't find them in their code. prefix makes it more explicit

Richard Feldman (Jul 03 2025 at 13:44):

we never needed to come up with generated names before, because we had *

Richard Feldman (Jul 03 2025 at 13:44):

removing * requires generating names

Kiryl Dziamura (Jul 03 2025 at 13:45):

then we never had this kind of confusion

Richard Feldman (Jul 03 2025 at 13:45):

correct

Richard Feldman (Jul 03 2025 at 13:45):

a downside of removing * is that it introduces this problem, and this is one of the problems I was hoping to avoid by having * in the language :smile:

Richard Feldman (Jul 03 2025 at 13:45):

however, my conclusion is that overall the downsides of * outweigh the upsides and we should generate names instead (like every other language)

Richard Feldman (Jul 03 2025 at 13:45):

prefix does not solve the problem btw

Richard Feldman (Jul 03 2025 at 13:46):

if you choose my_fn_a as the name, then that still collides if someone happens to choose my_fn_a as their type variable name

Richard Feldman (Jul 03 2025 at 13:46):

and if you're ok with collisions being unlikely, but still possible, then it's definitely best to just have the bug like Elm does, because in practice nobody is going to notice either way :stuck_out_tongue:

Kiryl Dziamura (Jul 03 2025 at 13:47):

I agree. I just mean we need to communicate generated names somehow. otherwise I anticipate questions like "what does this mean? I don't have this name in my code"

Richard Feldman (Jul 03 2025 at 13:47):

Richard Feldman said:

I thought of a very simple solution:

when we're canonicalizing a module, we write down every unique type var name used anywhere in that module in any scope

when generating unique names, we just make sure to avoid all of those

I'm fine with this very simple solution though :point_up:

Richard Feldman (Jul 03 2025 at 13:47):

Kiryl Dziamura said:

I agree. I just mean we need to communicate generated names somehow. otherwise I anticipate questions about "what does this mean? I don't have this name in my code"

I haven't seen this in other languages which do this

Kiryl Dziamura (Jul 03 2025 at 13:47):

lifetimes in rust?

Richard Feldman (Jul 03 2025 at 13:48):

yeah I haven't seen people be confused about that particular aspect of lifetimes :laughing:

Richard Feldman (Jul 03 2025 at 13:48):

or like in Elm I haven't seen people say "hey why is it called a in List a when I don't have an a anywhere in my code?"

Richard Feldman (Jul 03 2025 at 13:48):

or rather, people generally seem to wonder about the semantics

Kiryl Dziamura (Jul 03 2025 at 13:49):

yeah I haven't seen people be confused about that particular aspect of lifetimes

you're talking with one of them right now :D

Anthony Bullard (Jul 03 2025 at 13:49):

I think the SML style of using a, b is an unfortunate thing to propagate

Richard Feldman (Jul 03 2025 at 13:49):

what would be a better style?

Anthony Bullard (Jul 03 2025 at 13:49):

names that mean something :rolling_on_the_floor_laughing:

Anthony Bullard (Jul 03 2025 at 13:49):

like List(item)

Anthony Bullard (Jul 03 2025 at 13:50):

Map(key,value)

Richard Feldman (Jul 03 2025 at 13:50):

we could do that in some cases

Richard Feldman (Jul 03 2025 at 13:50):

but that would make this problem harder :smile:

Richard Feldman (Jul 03 2025 at 13:51):

like for example, if I define List as List(elem) := ... then we could choose elem as the default unbound var name

Anthony Bullard (Jul 03 2025 at 13:51):

it doesn't solve this problem, but maybe makes the appearance of random names from unannotated code easier to deal with

Richard Feldman (Jul 03 2025 at 13:51):

actually, we could do like List(elem2) or something

Richard Feldman (Jul 03 2025 at 13:51):

might be confusing though? not sure

Kiryl Dziamura (Jul 03 2025 at 13:51):

I have a crazy idea

Kiryl Dziamura (Jul 03 2025 at 13:51):

actually, we could do like List(elem2) or something

damn, that was my crazy idea

Kiryl Dziamura (Jul 03 2025 at 13:52):

so we extend what type already has in its default name

Richard Feldman (Jul 03 2025 at 13:52):

yeah I'm not sure how it would look in practice, might be weird? I'm not sure

Kiryl Dziamura (Jul 03 2025 at 13:53):

I wonder how common this case is

Anthony Bullard (Jul 03 2025 at 13:55):

is there a world where * could only exist in problems, with appropriate context when it appears?

Kiryl Dziamura (Jul 03 2025 at 13:56):

"something_else" lol

Anthony Bullard (Jul 03 2025 at 14:00):

or as a sigil after the default type var?

like:

TYPE MISMATCH
in foo.roc 5:14
5:   inner_fn(str)
              ^--
This is a
    `Str`
but I was expecting a
    `List(item*)`
Where `item*` is a type variable that has not
been given a name and should not be confused
with a type variable `item` in scope

Kiryl Dziamura (Jul 03 2025 at 14:02):

My point is if it expects this type, and I copypaste it in my code - parser won't like it

but I was expecting a
    `List(item*)`

Anthony Bullard (Jul 03 2025 at 14:02):

Do people do that? And if so do they expect it to work without the understanding that it's not valid syntax?

Kiryl Dziamura (Jul 03 2025 at 14:03):

can't say for people, but for me, it's confusing to see invalid syntax even in such reports

Anthony Bullard (Jul 03 2025 at 14:04):

I think Richard's approach of just taking the first open type var in the alphabetic sequence is fine

Anthony Bullard (Jul 03 2025 at 14:05):

Though i still maintain that in the actual definition of types, we should promote the use of meaningful type vars

Anthony Bullard (Jul 03 2025 at 14:06):

And then the appearance of these single letter type vars (probably starting at a, or close to it) are at least a sign that we just don't know the type for it

Anthony Bullard (Jul 03 2025 at 14:06):

With a similar note to that above

Anthony Bullard (Jul 03 2025 at 14:08):

"Here a is not a named type var, but a valid one to introduce in this scope."

And a could be replaced with any letter

We could also suggest ways to improve the report

Kiryl Dziamura (Jul 03 2025 at 14:10):

do you mind starting a thread for unbound var naming in #ideas ? these single letters are really confusing for beginners (I remember how they confused me previously in other languages)

Anthony Bullard (Jul 03 2025 at 14:11):

Start at the beginning of the sequence and look up if it's a type var in this scope, if not, use it.

Richard Feldman (Jul 03 2025 at 14:13):

Anthony Bullard said:

Start at the beginning of the sequence and look up if it's a type var in this scope, if not, use it.

this is super complicated

Richard Feldman (Jul 03 2025 at 14:13):

it's very easy to say and adds an insane amount of complexity to the compiler :joy:

Richard Feldman (Jul 03 2025 at 14:13):

because the scope is gone at the point where we discover the type mismatch

Richard Feldman (Jul 03 2025 at 14:14):

and also the type knows which CIR node it came from, but nodes only know their children, not their parents, so it's also hard to walk back up the tree to recreate the scope

Richard Feldman (Jul 03 2025 at 14:14):

the straightforward way to do "look up if it's a type var in this scope" is "literally redo all of canonicalization on the file every time we want to generate a type variable"

Richard Feldman (Jul 03 2025 at 14:14):

which would be...suboptimal for compile times :laughing:

Richard Feldman (Jul 03 2025 at 14:15):

that's why it's appealing to just build up a list of "here are all the type variable names we use anywhere in any scope in this module" as we're doing canonicalization

Richard Feldman (Jul 03 2025 at 14:16):

and when we're generating type var names, just make sure they aren't in that list and we're all set

Richard Feldman (Jul 03 2025 at 14:16):

no conflicts, guaranteed, minimal complexity, and minimal performance cost

Richard Feldman (Jul 03 2025 at 14:17):

as an aside, regarding meaningful type var names - for years I did this in Elm and I honestly have mixed feelings about it in retrospect

Anthony Bullard (Jul 03 2025 at 14:18):

Richard Feldman said:

that's why it's appealing to just build up a list of "here are all the type variable names we use anywhere in any scope in this module" as we're doing canonicalization

sorry this is exactly what i meant

Richard Feldman (Jul 03 2025 at 14:18):

in Elm I would write things like this all the time:

cancelButton : Html msg
cancelButton = button [] [ text "Ok"  ]

Anthony Bullard (Jul 03 2025 at 14:18):

send the list with the problem, and do the above at time of rendering the report

Richard Feldman (Jul 03 2025 at 14:18):

instead of this:

cancelButton : Html a
cancelButton = button [] [ text "Ok"  ]

Anthony Bullard (Jul 03 2025 at 14:18):

Here's my topic https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/Unbound.20type.20variable.20naming.20conventions/near/527008506

Richard Feldman (Jul 03 2025 at 14:18):

oh ok I'll re-post there! :thumbs_up:

Pit Capitain (Jul 04 2025 at 16:14):

Richard Feldman said:

it's a bug in Elm

Richard Feldman said:

the bug is that it's reporting that there is a more restrictive type on innerFn's first argument than reality

Sorry to come late to the party, but this is not what's happening in Elm. The "a" in the error message is not connected to the "a" from the type signature. If you change the type signature to use the type variable "z", the error message still says that the first argument of innerFn needs to be List a.

So the "bug" is that the Elm compiler doesn't check whether the general type variable "a" used in the error message is already defined in the outer scope.

Richard Feldman (Jul 04 2025 at 16:45):

yep, that's the bug! :smile:

Richard Feldman (Jul 04 2025 at 16:45):

it's very niche

Richard Feldman (Jul 04 2025 at 16:46):

the reason it's a bug is that what it's saying is not true. It is not true that that value's type is List a, because if that were true, then adding a type annotation of List a to that value would be a no-op because that's already its type

Richard Feldman (Jul 04 2025 at 16:47):

but adding that annotation would not be a no-op! It would change the value's type to a more restrictive type.

Richard Feldman (Jul 04 2025 at 16:48):

that's why it's inaccurate to claim that List a is that value's type. In the context of that particular value, the type variable a is in scope and has meaning.

Norbert Hajagos (Jul 06 2025 at 07:43):

If the problem comes from not having * and the problem with that is just seeing * is confusing, we could have a keyword "unbound" that has the same meaning as *. That would be its only use.

Richard Feldman (Jul 06 2025 at 13:38):

I don't think the character * is the problem

Richard Feldman (Jul 06 2025 at 13:39):

I think the problem is the concept

Richard Feldman (Jul 06 2025 at 13:39):

like I don't think this will help:

>> 1 + 1
2 : Num(unbound)

Joshua Warner (Jul 06 2025 at 15:12):

Instead of avoiding (just) type variables, what about avoiding _any lowercase_ ident in the module? That's something that's very easy+fast to compute by looking at the tokens (or ast), without any extra work during can.

Joshua Warner (Jul 06 2025 at 15:15):

That also has the advantage of avoiding any confusions a user might possibly have between what's a type variable vs a normal ident.

Richard Feldman (Jul 06 2025 at 15:18):

ah so we just look up whether it's been interned?

Richard Feldman (Jul 06 2025 at 15:18):

yeah we could do that!
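
A rough Python sketch of the token-based variant, with loud assumptions: a regular expression stands in for the real tokenizer (so it does not skip comments or string literals), a plain set stands in for the interner, and `lowercase_idents` and `first_unused_letter` are hypothetical names.

```python
import re

# Assumption: a lowercase identifier starts with a-z or underscore.
LOWER_IDENT = re.compile(r"\b[a-z_][A-Za-z0-9_]*\b")


def lowercase_idents(source):
    """Collect every lowercase identifier in the module text, type variable or not."""
    return set(LOWER_IDENT.findall(source))


def first_unused_letter(source):
    """Return the first single letter that appears nowhere as an identifier.

    Returns None if all 26 letters are taken; a fuller version would fall
    back to longer names.
    """
    taken = lowercase_idents(source)
    for letter in "abcdefghijklmnopqrstuvwxyz":
        if letter not in taken:
            return letter
    return None
```

For the my_fn snippet from earlier in the thread, the collected identifiers include `a`, `list`, `str`, and `other_list`, so the first unused letter is `b`. Because ordinary value names are avoided too, a generated name can never be mistaken for a variable the user wrote.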

Norbert Hajagos (Jul 07 2025 at 09:51):

Richard Feldman said:

like I don't think this will help:
>> 1 + 1
2 : Num(unbound)

You're right, it's an even worse first-time experience, because my impression is "cool, 2 is a bigint"

Last updated: Jul 26 2025 at 12:14 UTC