Plan to support dependent types and affine types? · ideas

Hello, just landed in Roc. Wondering if there are any plans (long term plans?) on implementing support for dependent types (aka const generics exprs in rust nightly), and support for affine types (aka a moved value in rust) for Roc?

These two features help catch a wide range of issues at compile time, and very few languages actually offer them.

Brendan Hansknecht (Sep 24 2025 at 13:50):

There has been solid discussion in the past. As it stands today, I don't think there are any plans.

Anton (Sep 24 2025 at 13:55):

For dependent types we did not want to add that kind of complexity to the compiler at this time. I don't recall any discussion about affine types.

Richard Feldman (Sep 24 2025 at 14:54):

souf (Sep 24 2025 at 15:21):

@Richard Feldman thanks for your answer. To me the first and foremost reason I'm looking for dependent types is to perform safe operations on matrices. At the moment I'm working with Python for machine learning tasks, but it can be a great source of pain when passing matrices around. Using wrong dimensions occurs a lot.

-- Without dependent types - compiles but crashes at runtime
transpose : Matrix -> Matrix
multiply : Matrix, Matrix -> Matrix
model : Matrix -> Matrix

a = createMatrix 3 4  -- 3x4 matrix
transposed = transpose a  -- Now 4x3, but type system doesn't know
transformed = multiply transposed someMatrix  -- Runtime error if dimensions don't match!
result = model transformed  -- Another potential runtime error if model expects different dimensions

-- With dependent types - catches errors at compile time
transpose : Matrix m n -> Matrix n m
multiply : Matrix m n, Matrix n p -> Matrix m p
model : Matrix 128 64 -> Matrix 10 1  -- Model expects specific input/output dimensions

a = createMatrix 3 4  -- Matrix 3 4
transposed = transpose a  -- Matrix 4 3 (dimensions tracked in types)
transformed = multiply transposed someMatrix  -- Compile error if dimensions incompatible
result = model transformed  -- Compile error if transformed isn't Matrix 128 64
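
For comparison, the dimension-tracking signatures above can be approximated today with const generics in stable Rust. This is only a minimal sketch: `Matrix`, `zeros`, `transpose`, and `multiply` are hypothetical names chosen to mirror the pseudocode, and the stack-allocated array storage is an assumption that only suits small matrices.

```rust
// Hypothetical sketch: names mirror the pseudocode above, not a real library.
#[derive(Debug, Clone, PartialEq)]
struct Matrix<const M: usize, const N: usize> {
    data: [[f64; N]; M], // M rows, N columns
}

impl<const M: usize, const N: usize> Matrix<M, N> {
    fn zeros() -> Self {
        Matrix { data: [[0.0; N]; M] }
    }

    // Matrix m n -> Matrix n m
    fn transpose(&self) -> Matrix<N, M> {
        let mut out = Matrix::<N, M>::zeros();
        for i in 0..M {
            for j in 0..N {
                out.data[j][i] = self.data[i][j];
            }
        }
        out
    }

    // Matrix m n, Matrix n p -> Matrix m p
    fn multiply<const P: usize>(&self, rhs: &Matrix<N, P>) -> Matrix<M, P> {
        let mut out = Matrix::<M, P>::zeros();
        for i in 0..M {
            for j in 0..P {
                for k in 0..N {
                    out.data[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        out
    }
}

fn main() {
    let a = Matrix::<3, 4> { data: [[1.0; 4]; 3] };
    let transposed: Matrix<4, 3> = a.transpose();
    let b = Matrix::<3, 2> { data: [[2.0; 2]; 3] };
    let product: Matrix<4, 2> = transposed.multiply(&b);
    println!("{:?}", product.data[0]);
    // Uncommenting the next line fails to compile: `a` is 3x4, so `multiply`
    // expects a right-hand side with 4 rows, but `b` has 3.
    // let bad = a.multiply(&b);
}
```

The dimensions here are plain integer parameters, which is weaker than full dependent types: they must be known at compile time and cannot depend on runtime data.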

I can see great potential in Roc for ML pipelines. Iterating quickly with strong compile-time safety on tensor dimensions, while interfacing with numerical backends (like BLAS). The combination of Roc's expressiveness and simplicity would be absolutely killer for this use case.

As for affine types, to be honest it's more out of curiosity. The main reasons I can think of right now are dealing with streams (making sure we're not trying to read a stream that has already been consumed) and resource management like file handles - ensuring files are closed exactly once and preventing use after close. But that's personally not something I struggle with so much. It would just add good extra safety.
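
The stream case can be illustrated with Rust's move semantics, which behave like affine types (a value can be used at most once). This is a minimal sketch: `Stream` and `read_all` are hypothetical names, not an existing API.

```rust
// Hypothetical sketch: `Stream` is an illustrative type, not a real library API.
struct Stream {
    chunks: Vec<String>,
}

impl Stream {
    fn new(chunks: &[&str]) -> Self {
        Stream { chunks: chunks.iter().map(|c| c.to_string()).collect() }
    }

    // Taking `self` by value consumes the stream, so the caller cannot read it again.
    fn read_all(self) -> String {
        self.chunks.concat()
    }
}

fn main() {
    let stream = Stream::new(&["he", "llo"]);
    let text = stream.read_all();
    println!("{}", text);
    // Uncommenting the next line fails to compile because `stream` was moved
    // by the first call to `read_all`.
    // let again = stream.read_all();
}
```

Note that this only guarantees "at most once". Guaranteeing that a resource is closed exactly once is a stronger (linear) property, which Rust approximates by running cleanup when the value is dropped.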

Kiryl Dziamura (Sep 24 2025 at 15:25):

From what I remember, the solution in such a case is codegen of multiple static matrix types. I believe there's even an implementation in stable roc, let me look for it

Kiryl Dziamura (Sep 24 2025 at 15:30):

Anton (Sep 24 2025 at 15:37):

Richard Feldman (Sep 24 2025 at 16:09):

gotcha, how big are these matrices? e.g. in game dev I know they're almost never bigger than 4x4 but I've also heard in other domains they can be absolutely huge

Brendan Hansknecht (Sep 24 2025 at 16:35):

souf (Sep 24 2025 at 16:42):

@Richard Feldman It can get huge, yes! Using an image as an easy-to-understand example: if the image has dimensions of 2 pixels by 3 pixels, it would look like this:

[
  [
    [0, 10, 200],
    [0, 10, 200],
    [20, 10, 30]
  ],
  [
    [0, 10, 200],
    [0, 10, 200],
    [20, 10, 30]
  ]
]

Each number is a color channel (RGB), so in this case the dimensions are 2x3x3. For larger images you might have dimensions like 500x500x3. If you process a batch of 100 images, that becomes 100x500x500x3.

In a language model, instead of pixels you have tokens. Each token is represented by a vector of numbers (an embedding).

For example, a sentence with 2 words and an embedding size of 3 would look like this:

[
  [0.1, 0.5, 0.3],  # word 1
  [0.0, 0.2, 0.7],  # word 2
]

The embedding size can range from about 50 to over 1000, depending on the size of the model. Also, the dimensions will vary within the model itself when moving from one layer to another.

So, as Brendan mentioned, it can really be any size, it depends on what we’re dealing with!
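
To give a feel for the sizes involved, the number of elements in a tensor is the product of its dimensions. A small sketch, where `element_count` is a hypothetical helper:

```rust
// Hypothetical helper: the element count of a tensor is the product of its dimensions.
fn element_count(shape: &[usize]) -> usize {
    shape.iter().product()
}

fn main() {
    // The 2x3x3 image from the example above.
    println!("{}", element_count(&[2, 3, 3]));
    // A batch of 100 images of 500x500 pixels with 3 color channels.
    println!("{}", element_count(&[100, 500, 500, 3]));
}
```

The batch alone holds 75,000,000 values, far beyond the 4x4 matrices that are common in game development.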

Richard Feldman (Sep 24 2025 at 16:55):

I see, so it sounds like at least from those examples, having numbers that can appear in types would be sufficient?

Richard Feldman (Sep 24 2025 at 16:55):

Richard Feldman (Sep 24 2025 at 16:56):

you can already create arbitrary anonymous tags in types, so I wonder if (for example) doing something like Matrix([Rows128], [Cols64]) would work

souf (Sep 24 2025 at 17:10):

@Richard Feldman I think that might actually cover most common scenarios! With some codegen as suggested by Kiryl. I’d have to think more about it to see if there are cases where it doesn’t work well.

souf (Sep 24 2025 at 17:14):

I'll implement something small as a demo and see how it fits. Might take a few days before I actually do that.

souf (Sep 24 2025 at 17:50):

The tutorial file and std lib you made for the AI are genuinely awesome. I was able to get Claude to draft a project for me in no time, and thanks to the clear error messages it was fixed in just one pass.

souf (Sep 24 2025 at 17:51):

So that approach, while not ideal, might actually work in many cases. Having dependent types/constant generics built into the language would make things a lot friendlier, but I understand the extra complexity of implementing it. I think a well-polished codegen could already help a lot. It wouldn’t be bulletproof and might still need some manual steps, but it would already make the dev experience pretty good (given the codegen itself is good). So I’m definitely keeping my hand raised for that feature, but I can already see real value in the existing tools solving, say, 75% of the problem.

souf (Sep 24 2025 at 17:54):

This transposed value is a:

    Matrix Dim2 Dim3

But print_matrix_2x3! needs its 1st argument to be:

    Matrix Dim3 Dim3

Richard Feldman (Sep 24 2025 at 18:32):

yeah it's not really about the implementation complexity so much as language design complexity and potential compiler performance impact for everyone

Brendan Hansknecht (Sep 24 2025 at 18:35):

Of note, these large matrices would be references. As in, they are just pointers to GPU data or large CPU data. So performance-wise none of this matters (generally speaking, for AI). That said, for user ergonomics and compile-time errors there is a lot of pain.

Brendan Hansknecht (Sep 24 2025 at 18:35):

In frameworks in python, you don't get errors until you have wrong shapes at runtime and it crashes

souf (Sep 25 2025 at 13:59):

Just throwing out an idea that crossed my mind: does Roc have, or plan to have, some kind of macros or annotations that can be used at compile time to perform custom checks outside of the compiler itself? I’m not sure whether this is a good idea, but it might provide additional compile-time safety without penalizing people who don’t use the feature.

Such a feature should also work with the LSP to be ergonomic (like having real time warnings).

Kiryl Dziamura (Sep 25 2025 at 14:38):

Kiryl Dziamura (Sep 25 2025 at 15:10):

There is a discussion about a roc tooling command that would allow processing the outputs of compiler steps. That would make it possible to build sophisticated analyzers, and it could potentially be used for codegen.

Matthieu Pizenberg (Sep 25 2025 at 18:30):

As someone who used to do a lot of image processing and computer vision, I find that fixed-size named types don’t do a good job. They are limited to small sizes, and code-generation macros both create terrible error messages and really increase compilation times. The prime example of this is the nalgebra library in Rust. https://docs.rs/nalgebra/latest/nalgebra/
(Or at least it used to be a couple of years back; I haven’t done CV for a couple of years.)

souf (Sep 25 2025 at 18:54):

@Matthieu Pizenberg that's an interesting observation. I haven't done any advanced experiment with that approach yet, but I can also see the case when you need to do some type unions, to say that a function can accept any dimension. Might work with OOP with inheritance, but I'm not sure how well it plays with Roc, yet

Richard Feldman (Sep 25 2025 at 21:25):

the thing I was talking about was using type parameters and no generated anything

Richard Feldman (Sep 25 2025 at 21:26):

I might be missing something, but I assume if the matrices are big enough that they'd be heap-allocated anyway, that approach would work fine?

Matthieu Pizenberg (Sep 25 2025 at 22:56):

Yes big matrices are typically heap allocated, with the sizes recorded to make checks at runtime for the various operators and functions operating on them. Which is what we want to avoid. Because getting a crash after a few hours of work isn’t the most pleasant surprise.

I’m not sure what you mean exactly by type params, but what nalgebra was doing with its type params (U1, U2, U3, U4, ...) was to generate all the appropriate operators and functions specifically tailored for the matrix sizes thanks to macros.

Matthieu Pizenberg (Sep 25 2025 at 22:59):

Matthieu Pizenberg (Sep 25 2025 at 23:11):

I don’t know if dependent types would help a lot because I’ve never really used dependent types for math in practice. But in theory, having types that can be manipulated and computed at compile time seems useful. For example, flattening a matrix of size (n, m) would give a vector of size nxm. Having that information is useful to guarantee, for example, that you don’t sum an nxm vector with an nxn vector, or to rule out similar issues that may arise depending on the math algorithms involved.
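
The flattening case is where plain integer parameters start to run out. In stable Rust the return type cannot be written as a vector of length `N * M` directly (that needs the unstable `generic_const_exprs` feature), so the sketch below assumes a workaround: a third length parameter `L`, checked against `N * M` by a constant assertion. All names (`Matrix`, `Vector`, `ProductCheck`, `flatten`, `add`) are hypothetical.

```rust
// Hypothetical sketch: all names are illustrative.
struct Matrix<const N: usize, const M: usize> {
    data: [[f64; M]; N],
}

struct Vector<const L: usize> {
    data: [f64; L],
}

// Carries a constant assertion that L == N * M for a concrete choice of sizes.
struct ProductCheck<const N: usize, const M: usize, const L: usize>;

impl<const N: usize, const M: usize, const L: usize> ProductCheck<N, M, L> {
    const OK: () = assert!(N * M == L);
}

fn flatten<const N: usize, const M: usize, const L: usize>(m: &Matrix<N, M>) -> Vector<L> {
    // Referencing the constant forces the assertion to be evaluated
    // when this function is instantiated with concrete sizes.
    let () = ProductCheck::<N, M, L>::OK;
    let mut data = [0.0; L];
    for i in 0..N {
        for j in 0..M {
            data[i * M + j] = m.data[i][j];
        }
    }
    Vector { data }
}

// Only vectors of the same length can be summed.
fn add<const L: usize>(a: &Vector<L>, b: &Vector<L>) -> Vector<L> {
    let mut data = [0.0; L];
    for i in 0..L {
        data[i] = a.data[i] + b.data[i];
    }
    Vector { data }
}

fn main() {
    let m = Matrix::<2, 3> { data: [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]] };
    let v: Vector<6> = flatten(&m);
    let doubled = add(&v, &v);
    println!("{:?}", doubled.data);
    // `let bad: Vector<5> = flatten(&m);` would be rejected when the program
    // is built, because the constant assertion 2 * 3 == 5 fails.
}
```

The caller has to state the flattened length and the compiler only verifies it, whereas a type system that can compute with dimensions would infer it.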

Matthieu Pizenberg (Sep 25 2025 at 23:18):

I’m wondering whether, with limited compile-time evaluation of Roc code and clever use of data that does nothing in the functions other than being able to crash at compile time, such data might get erased by the compiler at runtime. That would give benefits somewhat similar to dependent types, without actually being dependent types.

Richard Feldman (Sep 25 2025 at 23:18):

I assume I'm missing something because this isn't a domain I know much about, but let's say in Rust terms I have a Matrix<T, U> and T and U are phantom types to make the operations type-safe

Richard Feldman (Sep 25 2025 at 23:19):

why isn't that sufficient? I'm assuming there's some reason it isn't, or else presumably nalgebra would have done that
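
A minimal sketch of what the phantom-type approach could look like in Rust, to make the question concrete. Everything here is an assumption for illustration: the marker types `D128`, `D64`, `D10`, the heap-backed `Matrix`, and its methods are hypothetical, and this is not how nalgebra is implemented.

```rust
use std::marker::PhantomData;

// Marker types: they carry no data and exist only at the type level.
struct D128;
struct D64;
struct D10;

// Hypothetical heap-backed matrix; R and C are phantom row and column labels.
struct Matrix<R, C> {
    rows: usize,
    cols: usize,
    data: Vec<f64>, // row-major storage
    _shape: PhantomData<(R, C)>,
}

impl<R, C> Matrix<R, C> {
    // Note: nothing ties the labels to the runtime sizes passed here.
    fn filled(rows: usize, cols: usize, value: f64) -> Self {
        Matrix { rows, cols, data: vec![value; rows * cols], _shape: PhantomData }
    }

    // The shared label `C` forces the inner dimensions to agree at compile time.
    fn multiply<P>(&self, rhs: &Matrix<C, P>) -> Matrix<R, P> {
        let mut out = Matrix::<R, P>::filled(self.rows, rhs.cols, 0.0);
        for i in 0..self.rows {
            for j in 0..rhs.cols {
                for k in 0..self.cols {
                    out.data[i * rhs.cols + j] +=
                        self.data[i * self.cols + k] * rhs.data[k * rhs.cols + j];
                }
            }
        }
        out
    }
}

fn main() {
    let input: Matrix<D128, D64> = Matrix::filled(128, 64, 1.0);
    let weights: Matrix<D64, D10> = Matrix::filled(64, 10, 0.5);
    let output: Matrix<D128, D10> = input.multiply(&weights);
    println!("{} x {}", output.rows, output.cols);
    // `weights.multiply(&input)` fails to compile: D10 and D128 are different types.
}
```

In this sketch the labels are only as trustworthy as the code that constructs the matrices, and the type system cannot compute with them (for example, it cannot derive a label for a product of two dimensions).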

Matthieu Pizenberg (Sep 25 2025 at 23:22):

I think it's because the types need logic attached. For example, vectorisation, which takes a 2D matrix and turns it into a 1D vector, can be a no-op at runtime, depending on how matrices are represented in memory. It is then just an operation on the types capturing the size of the vector, which is the product of the sizes of the 2D matrix.

Maybe there are also performance specific constraints for nalgebra, and not only type safety.

Matthieu Pizenberg (Sep 25 2025 at 23:28):

Maybe also because of the accessor patterns. How do you access coordinate [3, 4] in a <T,U> sized matrix if you can’t interpret T and U directly as integers? You cannot implement that accessor efficiently for static sizes, and instead would need a runtime lookup of the number of elements in a row, followed by a multiplication and an addition, to get the index of the element in the matrix.

Matthieu Pizenberg (Sep 25 2025 at 23:43):

I think Mojo is a language that enables using sizes of matrices in a somewhat dependent fashion. Basically, as I recall, as long as the type bounds for the sizes are known at compile time, it’s possible. It’s not a fully dependent type system, though.

Richard Feldman (Sep 26 2025 at 03:05):

I think compile-time evaluation of constants would take care of any runtime cost of that

Richard Feldman (Sep 26 2025 at 03:05):

like the actual integer dimensions are still there at runtime even if the phantom types don't hold them

Richard Feldman (Sep 26 2025 at 03:05):

Richard Feldman (Sep 26 2025 at 03:06):

Richard Feldman (Sep 26 2025 at 03:07):

Richard Feldman (Sep 26 2025 at 03:08):

not dependent types, just like you can say Matrix(3, 4) and then you can use static dispatch on the type to turn it into a runtime integer
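
A rough Rust analogue of that idea, as a sketch only: a hypothetical `Dim` trait gives each type-level dimension an associated constant, so generic code can recover the integer through static dispatch. The names `Dim`, `D3`, `D4`, and the `Matrix` methods are assumptions for illustration, and this is not Roc syntax.

```rust
use std::marker::PhantomData;

// Hypothetical trait mapping a type-level dimension to an integer.
trait Dim {
    const SIZE: usize;
}

struct D3;
struct D4;

impl Dim for D3 {
    const SIZE: usize = 3;
}

impl Dim for D4 {
    const SIZE: usize = 4;
}

struct Matrix<R: Dim, C: Dim> {
    data: Vec<f64>, // row-major storage
    _shape: PhantomData<(R, C)>,
}

impl<R: Dim, C: Dim> Matrix<R, C> {
    // The allocation size comes from the types, so it cannot disagree with them.
    fn zeros() -> Self {
        Matrix { data: vec![0.0; R::SIZE * C::SIZE], _shape: PhantomData }
    }

    // Row-major offset. C::SIZE is resolved statically for each concrete
    // Matrix type, so no row length needs to be stored or looked up at runtime.
    fn index(row: usize, col: usize) -> usize {
        assert!(row < R::SIZE && col < C::SIZE, "index out of bounds");
        row * C::SIZE + col
    }

    fn get(&self, row: usize, col: usize) -> f64 {
        self.data[Self::index(row, col)]
    }

    fn set(&mut self, row: usize, col: usize, value: f64) {
        self.data[Self::index(row, col)] = value;
    }
}

fn main() {
    let mut m = Matrix::<D3, D4>::zeros();
    m.set(2, 3, 7.5);
    println!("{}", m.get(2, 3));
    println!("{}", m.data.len());
}
```

This addresses the accessor concern for sizes that have a declared type, but each size still needs its own type and `Dim` implementation.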

Richard Feldman (Sep 26 2025 at 03:08):

Brendan Hansknecht (Sep 26 2025 at 15:11):

I'm gonna be really honest here. I do not think roc will ever have a complex enough type system to have great matrix types for AI. Not saying it can't be decent with some of the ideas here, but I think really solving this problem requires much more complexity.

I have worked on a graph compiler with maximal shape inference. Having maximal shape inference requires having full support for algebraic dimensions in the type system (e.g. min(x * y - 3, 6)). You need to have shape simplification and constant folding at compile time. On top of that you really should support data dependent shape and have a way to deal with that.

Brendan Hansknecht (Sep 26 2025 at 15:23):

Also, even just with matmul, if you have broadcasting, you hit cases where instead of Tensor(x, y), Tensor(y, z) -> Tensor(x, z)... you have Tensor(x, y), Tensor(1, z) -> Tensor(x, z).

Lastly, you also have to deal with dtype and promotion (which also is preferably done in the type system or be forced on the user explicitly). bfloat16 matmul int32, what should that return?

Brendan Hansknecht (Sep 26 2025 at 15:25):

Again, some minor things above can still make things nicer, but putting the full set into the compiler is much much more complex.

Really I think doing something like this in roc would make most sense in user space as building up a graph, validating the graph (now there is a guarantee of no errors), and finally executing it.
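
A minimal sketch of that user-space approach, written in Rust for illustration. The `Node` type and `validate` function are hypothetical, only matrix multiplication is modelled, and the execution step is left out; the point is that shape errors surface when the graph is validated, before any computation runs.

```rust
// Hypothetical sketch of "build a graph, validate it, then execute it".
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Input { rows: usize, cols: usize },
    MatMul { lhs: usize, rhs: usize }, // indices of earlier nodes in the graph
}

// Infers the (rows, cols) shape of every node, or reports the first mismatch.
fn validate(graph: &[Node]) -> Result<Vec<(usize, usize)>, String> {
    let mut shapes: Vec<(usize, usize)> = Vec::with_capacity(graph.len());
    for (id, node) in graph.iter().enumerate() {
        let shape = match *node {
            Node::Input { rows, cols } => (rows, cols),
            Node::MatMul { lhs, rhs } => {
                if lhs >= id || rhs >= id {
                    return Err(format!("node {}: operands must be earlier nodes", id));
                }
                let (m, n) = shapes[lhs];
                let (n2, p) = shapes[rhs];
                if n != n2 {
                    return Err(format!(
                        "node {}: cannot multiply {}x{} by {}x{}",
                        id, m, n, n2, p
                    ));
                }
                (m, p)
            }
        };
        shapes.push(shape);
    }
    Ok(shapes)
}

fn main() {
    let graph = vec![
        Node::Input { rows: 128, cols: 64 },
        Node::Input { rows: 64, cols: 10 },
        Node::MatMul { lhs: 0, rhs: 1 },
    ];
    println!("{:?}", validate(&graph));

    let bad = vec![
        Node::Input { rows: 128, cols: 64 },
        Node::Input { rows: 32, cols: 10 },
        Node::MatMul { lhs: 0, rhs: 1 },
    ];
    println!("{:?}", validate(&bad));
}
```

An executor would run only on graphs for which `validate` returned `Ok`. Broadcasting, dtype promotion, and data-dependent shapes would each need additional rules in the validation step.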

Stream: ideas

Topic: Plan to support dependent types and affine types?

souf (Sep 24 2025 at 13:41):
