✔ Is type introspection something we might see in Roc? · beginners

Stream: beginners

Topic: ✔ Is type introspection something we might see in Roc?

Jasper Woudenberg (Jul 19 2024 at 19:23):

I'm asking because I think it could support a much nicer API for a database design I'm working on.

Specifically: I was thinking of describing the database schema in Roc, with each table storing one particular type of Roc value. To see whether the schema has changed relative to an earlier version, I'd like to calculate a hash of a type. The hash is enough to get something working, but more details on the type might come in useful for detailed error messages.

As a backup plan, I might define a little value-level DSL for defining types:

list (tuple2 u64 string) : TypeDesc (List (U64, Str))

But it'd be a shame to have to teach users of the library a language for describing types if they already know one.
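
A minimal sketch of what such a value-level DSL could look like, written in Python rather than Roc purely for illustration. The names `TypeDesc`, `list_`, `tuple2`, `render`, and `schema_hash` are assumptions made for this sketch and are not part of any Roc library. Hashing a canonical string with SHA-256 is just one way to get a hash that stays stable between runs.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeDesc:
    """A plain value that describes a type (hypothetical, for illustration)."""
    name: str
    args: tuple = ()

    def render(self) -> str:
        # Canonical text form of the description; this is what gets hashed.
        if not self.args:
            return self.name
        return f"{self.name}({', '.join(a.render() for a in self.args)})"

    def schema_hash(self) -> str:
        # SHA-256 of the canonical form, so the result is stable across runs
        # (unlike Python's built-in hash() for strings).
        return hashlib.sha256(self.render().encode("utf-8")).hexdigest()


# Primitive descriptions.
u64 = TypeDesc("U64")
string = TypeDesc("Str")


def list_(elem: TypeDesc) -> TypeDesc:
    # Description of a list whose elements are described by `elem`.
    return TypeDesc("List", (elem,))


def tuple2(first: TypeDesc, second: TypeDesc) -> TypeDesc:
    # Description of a two-element tuple.
    return TypeDesc("Tuple2", (first, second))


schema_v1 = list_(tuple2(u64, string))
schema_v2 = list_(tuple2(u64, u64))
```

Comparing `schema_v1.schema_hash()` with a hash stored from an earlier version would detect a schema change, and `render()` gives the extra detail an error message could use. The cost is the one noted above: users have to learn a second way of writing types.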

Richard Feldman (Jul 19 2024 at 20:50):

I think it's worth discussing in #ideas what it might look like

Richard Feldman (Jul 19 2024 at 20:51):

I think the design @Brendan Hansknecht came up with for making encoding and decoding able to customize their error types might be an interesting starting point

Brendan Hansknecht (Jul 19 2024 at 20:52):

I think Inspect probably can do what you want

Richard Feldman (Jul 19 2024 at 20:52):

like could that be used to make something more general than Encoding or Hash, where you specify the thing you want to turn the type into, which Encoding and Hash could then be implemented in terms of?

Richard Feldman (Jul 19 2024 at 20:52):

oh hm interesting

Brendan Hansknecht (Jul 19 2024 at 20:53):

That said, future encode probably would be better suited for this

Jasper Woudenberg (Jul 19 2024 at 21:16):

That sounds really great, and I'm not in a rush, so I'm happy to wait a bit and return to this later.

I'm curious what that would look like though, some questions I have:

Would I need a value of the type I'm looking to inspect to be able to pass to inspect or the decoder? I'd like to hash the type before I start a database connection, and so before I'll have any values of the type going to/from the database.
Even if I had a value of the type, suppose it's [Foo, Bar]. Would I be able to get a hash that takes into account all the possible tags the value has, and not just the specific tag that particular value uses?
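
The second question can be illustrated with a small Python analogy (not Roc; `Tag`, `describe_value`, and `describe_type` are hypothetical names for this sketch). A walk driven by a value only sees the tag that value carries, while a walk driven by the type can list every tag.

```python
from enum import Enum


class Tag(Enum):
    # Stand-in for the Roc tag union [Foo, Bar].
    FOO = "Foo"
    BAR = "Bar"


def describe_value(value: Tag) -> list[str]:
    # Value-driven: only the tag of this particular value is visible.
    return [value.value]


def describe_type(tag_type: type[Enum]) -> list[str]:
    # Type-driven: every tag the type allows is visible, no value needed.
    return [member.value for member in tag_type]
```

Here `describe_value(Tag.FOO)` yields only `Foo`, so a hash built from it would not change if `Bar` were removed from the type, whereas `describe_type(Tag)` covers both tags.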

Brendan Hansknecht (Jul 19 2024 at 21:17):

Ah, nvm. Yeah neither would help here.

Brendan Hansknecht (Jul 19 2024 at 21:17):

Cause these all require data examples to extract the type

Brendan Hansknecht (Jul 19 2024 at 21:18):

Yeah, in current roc, your best bet is a dsl then

Richard Feldman (Jul 19 2024 at 21:50):

@Brendan Hansknecht what do you think of the idea of trying to make a more general ability where you can build up an arbitrary state?

Brendan Hansknecht (Jul 19 2024 at 21:51):

How is it different from the new encode?

Brendan Hansknecht (Jul 19 2024 at 21:52):

Encode and inspect both allow for arbitrary state

Brendan Hansknecht (Jul 19 2024 at 21:53):

The issue here is really needing to run on a type alone without any data.

Brendan Hansknecht (Jul 19 2024 at 21:54):

Like you would need decode, but a version that builds a state while returning a phantom type. Or the ability to operate on types directly in general.

Richard Feldman (Jul 19 2024 at 22:03):

the new encode still only outputs List U8 though right?

Brendan Hansknecht (Jul 19 2024 at 22:04):

Nope

Brendan Hansknecht (Jul 19 2024 at 22:04):

It's up to the specific encoder

Richard Feldman (Jul 19 2024 at 22:05):

huh, I guess I missed that! do we have a link to the implementation somewhere?

Brendan Hansknecht (Jul 19 2024 at 22:06):

I realized from serde (I think it was for TOML) that some formats need to go through an intermediate representation and can't encode straight to bytes

Brendan Hansknecht (Jul 19 2024 at 22:07):

The most up to date (though blocked by type errors for a full implementation) is here: https://github.com/bhansconnect/roc-msgpack/blob/main/package/FutureEncode.roc

Brendan Hansknecht (Jul 19 2024 at 22:07):

With example partial impl: https://github.com/bhansconnect/roc-msgpack/blob/main/package/MsgPack.roc

Richard Feldman (Jul 20 2024 at 12:21):

ahh ok! For some reason I misremembered state being for accumulating the error type, not the entire output type :big_smile:

Richard Feldman (Jul 20 2024 at 12:23):

so unless I'm missing something, this would be flexible enough that Eq, Hash, and Ord could all be implemented in terms of it, right? :thinking:

(not Inspect, since it needs to work on functions)

Richard Feldman (Jul 20 2024 at 12:25):

like if we infer this version of Encoding automatically the same way we do today, then essentially you have type introspection for anything that doesn't involve functions

Richard Feldman (Jul 20 2024 at 12:27):

although I suppose at that point we could consider going one step further and having a different Ability for introspection which also gave you access to functions (unlike encoding, which can't do anything with functions)

Jasper Woudenberg (Jul 20 2024 at 12:41):

I still think you'd need to be able to give it a value of the type though, right?

Suppose that I have a type with a phantom parameter:

TypeDesc a := {}

Then I wouldn't be able to use encode or decode to do something with the type parameter a.

Richard Feldman (Jul 20 2024 at 12:49):

I think this is the same as having constraints on type parameters, e.g. being able to say "List a has Hash as long as a does"

Richard Feldman (Jul 20 2024 at 12:50):

which we don't have syntax for yet, but already plan to

Brendan Hansknecht (Jul 20 2024 at 15:23):

Richard Feldman said:

so unless I'm missing something, this would be flexible enough that Eq, Hash, and Ord could all be implemented in terms of it, right? :thinking:

(not Inspect, since it needs to work on functions)

Could be used for hash. Eq and ord take two inputs, so I don't think it would work.

Brendan Hansknecht (Jul 20 2024 at 15:24):

Also, I think even if it could be used for eq or ord it would have worse perf due to how it splits up types and would have types that it can't distinguish. For example, list and set both encode to a sequence.

Jasper Woudenberg (Jul 20 2024 at 15:27):

I think that might not be enough.

Suppose I'd be able to write the following in Roc (making up some syntax):

TypeDesc a := {}
  implements [
    Decoding { .. } where a implements Decoding
  ]

I think the only thing that would give me is that in the implementation of Decoding for TypeDesc a I can decode values of a, but we previously established that's not useful in this particular case.

Brendan Hansknecht (Jul 20 2024 at 15:27):

Of course, if hash uses encode... I'm not sure how you would make something generic over hash.

Brendan Hansknecht (Jul 20 2024 at 15:29):

Richard Feldman said:

I think this is the same as having constraints on type parameters, e.g. being able to say "List a has Hash as long as a does"

We already have this.

Brendan Hansknecht (Jul 20 2024 at 15:33):

Oh wait, I think you could actually use the future decode.

Decode any type a. When it requests a value, record the type and return a garbage value. Finally, throw away the value and keep the type description.
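
A rough sketch of that idea, in Python rather than Roc and not based on the actual future decode API: `RecordingDecoder`, `User`, and `describe` are all hypothetical. The decoder reads no input. It logs what each decode request asks for, hands back a placeholder value, and the caller keeps only the log.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingDecoder:
    """Hypothetical decoder: logs each request and returns a garbage value."""
    seen: list = field(default_factory=list)

    def u64(self) -> int:
        self.seen.append("U64")
        return 0  # placeholder, never inspected

    def string(self) -> str:
        self.seen.append("Str")
        return ""  # placeholder, never inspected

    def record(self, fields: dict) -> dict:
        # `fields` maps each field name to a function that decodes that field,
        # so the decoder learns every field name up front.
        self.seen.append("{")
        out = {}
        for name, decode_field in fields.items():
            self.seen.append(name + ":")
            out[name] = decode_field(self)
        self.seen.append("}")
        return out


@dataclass
class User:
    id: int
    name: str

    @classmethod
    def decode(cls, decoder):
        # Written against the decoder interface only; it does not know
        # whether a real decoder or the recording one is running it.
        raw = decoder.record({"id": lambda d: d.u64(), "name": lambda d: d.string()})
        return cls(**raw)


def describe(decodable) -> str:
    decoder = RecordingDecoder()
    decodable.decode(decoder)  # the decoded value is thrown away
    return " ".join(decoder.seen)
```

With these definitions, `describe(User)` returns `{ id: U64 name: Str }` without any real `User` value existing beforehand. Tag unions are the harder case, since returning a garbage value means picking a single tag, so this only gives a full description if the decoder is told about every tag.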

Jasper Woudenberg (Jul 20 2024 at 16:21):

Ah, neat, yeah, I can see how that would work.

Also, I see that the new decoder for records will tell you (the decoder author) all the fields the record supports, which is great. I imagine the one for tags will as well?

Thanks, I'm excited to play around with this!

Notification Bot (Jul 20 2024 at 16:21):

Jasper Woudenberg has marked this topic as resolved.

Brendan Hansknecht (Jul 20 2024 at 16:55):

I imagine the one for tags will as well?

Yeah, it should.

Iuri Brindeiro (Oct 16 2024 at 18:00):

Brendan Hansknecht said:

I imagine the one for tags will as well?

Yeah, it should.

So, I'm trying to achieve something similar to what @Jasper Woudenberg mentioned here and I'm kind of stuck...

Suppose I have a function with signature a -> b where b will always be a tag union of [A, B, C] or a variation of it (like [A] or [B, C]...). How would I decode this function's return type into a List of its respective tags?
Example: a -> b where b is [A, B] I would like to have a function like (a -> b) -> List [A, B]. Or even (a -> b) -> List Str returning ["A", "B"].

Andy Ferris (Nov 18 2024 at 12:55):

Jasper Woudenberg said:

I'm asking because I think it could support a much nicer API for a database design I'm working on.

I actually came here to ask the exact same question for the exact same reason. I see talks where Richard suggests Roc platforms will be good for database plugins - I really think it goes far beyond that, and Roc can just be the query language. Things like the web REPL prove the query can be compiled dynamically and run safely on demand. The idea of type introspection would be to implement the data definition language in Roc, with Roc's usual structural data types, too, but I'm having trouble seeing how to use and manipulate types as runtime values.

Dan G Knutson (Nov 20 2024 at 00:28):

Would this future TypeDesc/Encode/Decode API be able to tell the platform the size of a type? Like, at compile time we get a Decode ability or ability replacement, and then at runtime we pass some opaque bytes to the platform with some metadata? Kind of like a limited pseudo-reflection from the platform's perspective.
Edit: or is this another "you should do that in roc_alloc" kind of situation?

Brendan Hansknecht (Nov 20 2024 at 01:34):

This will not give type size or platform details

Brendan Hansknecht (Nov 20 2024 at 01:35):

For a platform, anything with unknown size at compile time should be boxed or behind a pointer somehow.

Brendan Hansknecht (Nov 20 2024 at 01:35):

So a platform will take a Box a for any a with unknown size.

Dan G Knutson (Nov 20 2024 at 02:16):

Is there a stable rule or heuristic for when the compiler puts a record behind a pointer? My real goal is to get contiguous allocations for an ECS. I think I can see a way to do it all in Roc if there's a rule like "if your record is smaller than xyz it won't be behind a pointer".