Stream: ideas

Topic: builtin matrix support


view this post on Zulip Richard Feldman (Oct 16 2023 at 14:44):

this idea came up in #ideas > customizing infix operators. That discussion is very long, so I want to scope this thread with the following assumptions:

  1. Let's assume we are not going to add the ability to customize how infix operators work in Roc. (That idea is already being discussed in #ideas > customizing infix operators; the point of this thread is to discuss an alternative possible direction.)
  2. Let's also assume we want Roc to be nicer than it is today for scientific computing, particularly when it comes to making operations with tensors (vectors, matrices, etc.) more syntactically readable. I put "matrix" in the title of the thread for discoverability, because I think way more people in this Zulip know what a matrix is than a tensor. (Certainly that included me when I first created this Zulip!)
  3. Let's assume we're open to making changes to builtins (e.g. potentially making changes to the Num module, adding new modules, etc.), including auto-derived builtin abilities, to facilitate this.
  4. Finally, let's assume that modifications to Roc's type system are on the table, but that we'd would prefer to keep the changes as minimal as possible (ideally no changes at all).

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:00):

even with full builtin support, there are some tricky design questions here. Here's a pretty fundamental one: the type of Num.mul (which * desugars to) is:

mul : Num a, Num a -> Num a

This is fundamentally not the type we want for matrix multiplication, because the rule for matrix multiplication is:

These rules are way more specific than a, a -> a no matter what implements (or Num wrapper) we put around them; normal type unification alone won't implement them.

We could get into type-level numbers, but since we want to keep type system changes to a minimum, let's say for the sake of argument we used Peano numbers (which are implementable in Roc just like they are in Elm) to encode the rows and columns of a matrix in its type. (Peano numbers like this have atrocious ergonomics, so if we actually ended up doing this we'd presumably want type-level numbers, but let's just go with it for now.)

In that case, we could have:

mulMat : Matrix rows a, Matrix a cols -> Matrix rows cols

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:01):

so if we want * to work for matrix multiplication, and we want to track the rows and columns in the type of the matrix, then we have to figure out how to get that function :point_up: to accept both vectors and scalars and do the right thing

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:02):

another thing I'm not sure of: matrices are 2-dimensional, but what if you want more dimensions than that?

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:03):

personally I genuinely don't know what people use higher-dimensional math than matrices for (e.g. I know matrices are used in games, other 3D programming, physics simulations, etc., but I don't know what the use cases are for higher-rank tensors than matrices) - e.g. are people doing 74-dimensional stuff? If so, what are they using it for?

(The difference between "we really need to support N-dimensional, where N could be anything" and "we can pick a small, hardcoded number of dimensions to support, and it'll be okay if you don't get good support in Roc for more dimensions than that" is absolutely massive in terms of what designs are possible, so this is a very important question to answer.)

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:03):

I also don't know what the types would need to be for multiplication of tensors that have more dimensions than matrices do

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:04):

of note, it's possible that where this line of exploration ends up is the conclusion that we just can't possibly have the dimensions of the matrices encoded in the type, and instead their dimensions need to be stored for runtime use only - but I'd like to fully explore this idea of encoding the dimensions in the type first

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:05):

oh, another thing worth noting: if someone wanted to define that mulMat (perhaps with Peano numbers), that's already doable today; it just wouldn't work with *

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:06):

so the real question I'm trying to get at is: can we have * desugar to a function whose type works seamlessly with matrices, vectors, and also scalars

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:07):

(because having it "desugar to something else depending on types" would be equivalent from a language perspective to having function overloading based on type, which is not something I think Roc should ever have - so multiplication has to be one function!)

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:10):

another possibility here: maybe we can't figure out a single function that * could desugar to which meets all of these criteria, but encoding the matrix dimensions in its type is so important that this means matrix multiplication shouldn't be done with *.

another thing I think is a potentially interesting possibility is that we have * for scalar multiplication (maybe vectors too? That might work better from a type perspective), and then another operator (e.g. **) for matrix multiplication, which desugars to a different function. However, that wouldn't scale to higher dimensions; as soon as you wanted more dimensions than a matrix, you'd be back to prefix function calling.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:27):

mulMat : Matrix rows a, Matrix a cols -> Matrix rows cols

This technically is better if we care about performance.

mulMat : Matrix a rows, Matrix a cols -> Matrix rows cols

Just pointing this out because I think it is important to realize that matrices are complex and it isn't just *. Also, if you make * mean matrix multiplication, how do you do pervasive multiplication of a matrix (which is also a very common operations. 2 * myMat is not requesting for us to turn 2 into a matrix and then do matrix multiplication. It is requesting pervasive multiplication where each element of myMat is multiplied by 2.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:29):

hm, I don't follow why the perf would be better - can you elaborate?

also, is that type correct for what matrix multiplication is supposed to do? :face_with_raised_eyebrow:

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:31):

It's about memory access. In the original signiture, you are traversing the columns of one matrix and the rows of anther. This means that in one matrix you are jumping around in memory with strided accesses instead of just linearly reading a dense cacheline.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:31):

by transposing one matrix, you can make it so that all of the data is in dense linear chunks of memory.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:33):

It is not the correct type based on how it would be written out in match, but it is pretty common in high performance code to try to have one matrix transposed where possible. That said, there are lots of techniques to try and make the first version without transposing faster (generally by transposing in chunks and then doing dense math, but it is never as efficient)

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:40):

personally I genuinely don't know what people use higher-dimensional math than matrices for

Probably the most common is for a list of matrices. It then puts all of the matrices in one contiguous chunk of memory (instead of having a list of pointers to matricies). On top of that, you may want do operations across all elements in a way that can be defined as a single math op instead of writing a loop. (higher dimensions are also super super common in ML)

As a concrete example. In my ray tracer, I have multiple rays for ever pixel. Each ray has 3 data points (x, y, z) So that is a tensor of shape samples x rows x cols x 3. By simply transposing it, I can get 3 x samples x rows x cols. Now that the outer most dimension is 3, I can do any sort of vector math that expects 3 elements on the entire tensor at once. For example, shift them over by 1. rays - [0 0 1]

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:40):

In the real ray tracer, I solve collision equations for all rays at once and don't need to explicitly write out any looping. This can all be done with fast simd over the dense tensor.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:41):

how many dimensions are common in ML?

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:41):

like what's the highest number you're aware of people using in practice

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:42):

I would say almost always within like 10. Most common 2 to 5.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:42):

Though if you have to partition something over multiple execution untis or over time, those often become dimensions in the tensor, so that can add a few extra.

view this post on Zulip Lakin Wecker (Oct 16 2023 at 15:43):

Yeah, and 3 dimensions are common in multidimensional data fitting for the reasons that Brendan mentioned

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:44):

ok so let's say we have two 5-dimensional values - what would the type look like to multiply them?

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:47):

can we have * desugar to a function whose type works seamlessly with matrices, vectors, and also scalars

That would be pervasive multiplication. It works with anything with pretty minimal agreement. Matrix multiplication is special.

Matrix multiplication is the sum reduction of the pervasive multiplication of all rows with all columns. (or all rows with all rows if you transpose one matrix)
This also applies to higher dimensional tensor. It is just that the higher dimensions are ignored/perserved.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:49):

interesting! so the exact signature of mulMat I wrote above would work for all use cases in ML?

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:49):

assuming pervasive multiplication was used for all higher dimensions

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:51):

I think it is just this

mulNd : Tensor a b c ... n row x, Tensor a b c ... n x cols -> Tensor A B C ... N rows cols

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:52):

Note, this is still touching all elements in both tensors and modifying them, so it isn't really doing pervasive mulitplication for all higher dimensions, it just isn't changing the size of higher dimensions.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:53):

But yeah, it could be seen as looping over all higher dimensions until you get to the inner 2 and then doing matrix multiplication.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:53):

that's not a type though :wink:

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:53):

the ... is not a thing

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:53):

that's kinda the root of the challenge here :big_smile:

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:54):

Just trying to point out the repeating pattern. Obviously would have to be unique to each tensor rank to fit into peano numbers.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:54):

right, so we'd need N different mul functions to support N dimensions

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:55):

Maybe more if you consider automatic broadcasting that is expected.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:56):

which, if we wanted all of those to support infix operations, means we'd need something like ** for 2d, *** for 3d, **** for 4d, etc.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:56):

haha...yeah.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:56):

so something's gotta give here

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:57):

That said, I think in practice, I think that it is fine for higher dimension matrices to just use dynamic shape generally.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:57):

So maybe after 2 or 3 dim, it is always dynamically shaped.

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:57):

that sounds reasonable to me, but then again I have no experience in the domains in question :laughing:

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:57):

Cause generally, by dim 3 or 4, the tensors are big enough that the dynamic nature has litle cost

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:57):

I'm curious what others think of that idea!

view this post on Zulip Richard Feldman (Oct 16 2023 at 15:58):

sure, although there's also the question of how valuable it is to represent the rows/cols/etc in the type for reasons other than performance (I have no idea how valuable that is or isn't in practice!)

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 15:58):

If we want to start simple, I would argue for just going 100% dynamic shape and then later debate adding in some static versions for smaller ranks.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:02):

the question of how valuable it is to represent the rows/cols/etc in the type for reasons other than performance

Working in the land of ML compilers, I have main thoughts:

  1. it is super useful for me and under the hood most ML frameworks do this (or give a way to attempt to extract this information). They even have explicitly dynamic dimensions in static tensors.
  2. Almost zero user land tools expose this. Everything is dynamic all the time and users can do whatever they want with it.

From what I have seen from my many engineer friends. Everything they do is always dynamic (and often totally untyped)

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:03):

Don't know about games, scientific computing, or other applications. Have only messed with rust/c++ libraries that tend to give both as options by using super complex templates and such.

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:03):

Yeah, aside from specific pipelines (3D rendering being one obvious one), most matrices are dynamically sized and operations checked at runtime in my experience in other languages, which is limited.

view this post on Zulip Richard Feldman (Oct 16 2023 at 16:03):

and that's ok for scientific computing?

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:03):

Yes.

view this post on Zulip Richard Feldman (Oct 16 2023 at 16:04):

cool, great to know!

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:04):

Eigen3 is the library I'm most familiar with and it supports both styles (typed matrix sizes and dynamic sizes). And we only used the typed sizes for rendering pipelines and some specific physical simulations. Everything else ends up being dynamic

view this post on Zulip Richard Feldman (Oct 16 2023 at 16:05):

when you use the typed sizes for rendering, how many dimensions are you using?

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:05):

2.

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:06):

But I dunno if I have wide enough experience to be certain this is sufficient for all purposes. So consider it more of a confirmation that it's satisfactory for my own purposes.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:15):

Also, one advantage of making matrices a builtin is that they can access the refcount.

view this post on Zulip Richard Feldman (Oct 16 2023 at 16:15):

good point, I hadn't thought of that!

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:18):

The main issue being when you add 2 matrices for example, you want to do it in place, but you can't tell which of the two matrices is unique (if I either). If you happen to pick a unique matrix, our list unique work will kick in, but it is nicer to be able to check both matrices rather than guessing. (of course could be track in userland with an extra integer but would need to be manually updated which doesn't sound great).

view this post on Zulip Richard Feldman (Oct 16 2023 at 16:18):

Brendan Hansknecht said:

This technically is better if we care about performance.

mulMat : Matrix a rows, Matrix a cols -> Matrix rows cols

[...]

It's about memory access. In the original signiture, you are traversing the columns of one matrix and the rows of anther. This means that in one matrix you are jumping around in memory with strided accesses instead of just linearly reading a dense cacheline.

by transposing one matrix, you can make it so that all of the data is in dense linear chunks of memory.

It is not the correct type based on how it would be written out in math, but it is pretty common in high performance code to try to have one matrix transposed where possible. That said, there are lots of techniques to try and make the first version without transposing faster (generally by transposing in chunks and then doing dense math, but it is never as efficient)

@Lakin Wecker and @Declan Joseph Maguire I'm curious what your perspective on this - how would you feel about matrix multiplication requiring that you transpose the first argument before calling the function, if it meant better performance?

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:21):

Great video with some context (though dense): https://youtu.be/QGYvbsHDPxo?si=bqsm-yV3BqZKrZ9-

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:31):

Also, just to clarify, if we make a matmul infix operator, we probably want it to match standard matmul (not transposed). But in most libraries, they expose both. Cause in many cases, you have a constant or otherwise that you can just created transposed or transpose once and then repeatedly use over and over again. So might as well just take the free perf.

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:35):

I'd prefer developer ergonomics over raw performance. If I need raw performance I may end up back in a different language anyways. My understanding of Roc is that we want to be close enough with raw performance, but with better ergonomics. If we're considering starting with dynamic sizes at a small runtime overhead expense, then we could also consider having both column major and row major representations of matrices, which might solve the same purpose without the confusing API. Maybe this complicates things way too much though, so I'll defer to others who are more familiar with these details. I will point out that there is a whole rabbit hole here, such as things like z-order curve element ordering for efficient memory traversal.

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:37):

I'm also often dealing with banded sparse matrices with known repeating patterns as well which use a completely different in memory representation, and so it's unclear to me if ensuring that the first matrix is always transposed will help with performance in this case anyways

view this post on Zulip Anton (Oct 16 2023 at 16:38):

Could the transpose be used automatically behind the scenes by the compiler as an optimization?

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:39):

The compiler doesn't know which matrix you're going to multiply with another one, and for large enough matrices transposing is a pretty expensive operation if you change the memory layout.

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:39):

We're not trying to go down the road of building a full linear algebra library here with matrix decompositions and the like as a built in, so I think something that loosely translates to a memory layout that is compatible with existing libraries (such as rendering libraries) and has a nice ergonomic API to write simple arithmetic like code is what I'd aim for

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:42):

Could the transpose be used automatically behind the scenes by the compiler as an optimization?

For sure, but it would probably be a pretty limited pattern. Anyway, yeah, probably not an optimization to worry about for now.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 16:45):

loosely translates to a memory layout that is compatible with existing libraries (such as rendering libraries)

Most existing libraries that I know of tend to have some sort of tensor wrapper that is just a pointer and a shape (maybe also a flag about traverasl). So should be easy to expose a view in general

view this post on Zulip Lakin Wecker (Oct 16 2023 at 16:46):

Agreed

view this post on Zulip Johan Lövgren (Oct 16 2023 at 19:14):

(Here follows some math which gives a little bit of extra context, sorry if it is too much.) Here is a useful way to think about matrix and tensor multiplication. We can think of a matrix AA in terms of its components AijA_{ij}, where ii enumerates the rows and jj enumerates the columns (so both go from 1 to 3 for 3D square matrices). Then when we multiply with some other matrix BB. The result can be found summing over "the contracted index", and multiplying the components: C=ABC=A * B if the components of C are Cik=jAijBjkC_{ik} = \sum_{j} A_{ij}B_{jk}. Or with a convention that repeated indices are always summed over, Cik=AijBjkC_{ik} =A_{ij}B_{jk}.

Also, given our matrices above, we can perform other contractions, e.g. by contracting the first index of A with the second of B, which is the same as first transposing BB and then doing a matrix multiplication with BTB^T to the left: AijBki=BkiTAij=(BTA)kjA_{ij}B_{ki} =B^{T}_{ki}A_{ij}=(B^T A)_{kj}.

And for higher-ranked tensors we have many possible different products. All that is required is that the contracted indices (i.e. the dimensions summed over) both enumerate the same size. So for example if we have rank-3 tensor DijkD_{ijk} and a rank-4 tensor ElmnoE_{lmno} then there are many different ways to multiply them! Contracting one index removes one index from each tensor, so we end up with a tensor of rank 3 + 4 - 2 =5. So e.g. Fijklm=DijaEaklmF_{ijklm}=D_{ija}E_{aklm}, where aa is summed over. (i.e. this case requires the third dimension of D and the first dimension of E to be of the same size).

view this post on Zulip Johan Lövgren (Oct 16 2023 at 19:17):

The reason I bring all of this up is because it kind of explains the type in your expression of matrix multiplicaiton @Richard Feldman:

mulMat : Matrix rows a, Matrix a cols -> Matrix rows cols

Here a is a type variable , which enumerates the index we contract. And then a is not present in the final expression in the end. Same would be true for generic tensor multiplication

view this post on Zulip Richard Feldman (Oct 16 2023 at 19:19):

Johan Lövgren said:

So for example if we have rank-3 tensor DijkD_{ijk} and a rank-4 tensor ElmnoE_{lmno} then there are many different ways to multiply them!

right, but given that each infix operator needs to correspond to exactly one type signature function, it's very relevant which one(s) people expect * to do in practice, and which of the others (if any) require infix operators in order for the ergonomics to be nice! :big_smile:

view this post on Zulip Johan Lövgren (Oct 16 2023 at 19:23):

To be honest I don't know what people would expect. To me it seems like it could mean anything honestly. This might actually be a good argument to restrict it to matrices. Because for matrices you always get a matrix back, i.e. by contracting you go 2 + 2 - 2 = 2.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 20:27):

There is definitely a reason I think that all of our default infix operators should just be the pervasive ops with broadcasting. They are simple, clear, and consistent with what scalars do.

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 20:27):

We can then choose to add a special operator for the special case of matrix multiplication, but that would be a side question

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 20:28):

I mean lots of people use numpy to great success and it doesn't have an infix matrix multiplication op (I know, this isn't an argument for not improving, but I think it points to what a solid baseline is)

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 20:28):

it is np.matmul(a, b)

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 20:29):

I think pervasive ops as infix gives us something that will always just feel like scalars and will still be helpful in many cases. Then named ops (or maybe a few special infix ops) can be used for other cases.

view this post on Zulip Richard Feldman (Oct 16 2023 at 20:52):

that design makes sense to me! I'm curious what others think though :big_smile:

view this post on Zulip Richard Feldman (Oct 16 2023 at 20:52):

also just to clarify since this is a different thread: what does broadcasting mean in this context?

view this post on Zulip Brendan Hansknecht (Oct 16 2023 at 21:00):

If you have a vector of shape [3] and a matrix of shape [10, 3], you can do regular math with them. So vector + matrix is equivalent to "broadcasting" the vector to [10, 3] and then adding it to the matrix. So it is as if you had duplicated the elements multiple times.

Note: this can be done on either the leading or trailing axis and we would have to pick which direction we want. With trailing axis (like numpy), [3] can be added to [10 3]. With leading axis (like most array languages), [3] can be added to [3 10]. There are reasons to pick either, we probably should follow scientific computing by looking at numpy, matlab, and julia (which I think are all trailing axis for this, but should double check).

In numpy code if that is clearer:

>>> import numpy as np
>>> v = np.array([1, 2, 3])
>>> m = np.array ([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
>>> v.shape
(3,)
>>> m.shape
(5, 3)
>>> v + m
array([[2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7],
       [6, 7, 8]])

view this post on Zulip Declan Joseph Maguire (Oct 17 2023 at 08:31):

Richard Feldman said:

Lakin Wecker and Declan Joseph Maguire I'm curious what your perspective on this - how would you feel about matrix multiplication requiring that you transpose the first argument before calling the function, if it meant better performance?

The transposition is very common. However, this touches on a deep nerve, which is that the underlying representations of matrices are often different from the explicit form for performance reasons, often in a variety of ways. For example, there's the distinction between row-major and column-major encodings, which is what @Brendan Hansknecht is getting at. These are respectively those cases where rows or columns are memory contiguous. As such, most implementations of matrices/tensors have a lot of hidden implementation details, describing how the dimensions are laid out in memory, as well as stuff like "has this whole thing been negated/conjugated?" This makes many operations have negligible footprint, like transposition, because you just need to edit the metadata to change how the memory gets interpreted.

Further, you often have cases like diagonal, tridiagonal, and upper/lower triangular matrices, which only store part of their memory with the rest implicitly 0, which are super common cases in many systems (tridiagonals often come up when you have systems of equations, each variable only directly coupled to its 1D neighbours). The most exotic one (the only one whose memory footprint isn't knowable from shape) is a sparse array, which is basically a hashtable, but a matrix (many matrices are huge, but almost everywhere 0). Similarly you have symmetric/anti-symmetric/self-ajoint matrices for whom aji=aijaijaija_{ji} = a_{ij}|-a_{ij}|a_{ij}^*, which both allows memory saving and lets you always use efficient layouts. BTW almost all of these classes are closed under addition and multiplication.

I have no idea how much of this, if any, is compatible with the range of type systems under consideration. In practice, it's common for people to have to keep track of row vs. column major for performance, so I don't think it's a terrible ask for people to manually transpose, though I'd _heavily_ suggest making transposition not fiddle with the underlying memory anyway as it's ubiquitous and otherwise very costly, which would make manual transposition ineffective.

The dream would be a system that handles these considerations for you, choosing the most efficient underlying representation for a given task, choosing the right order to multiply matrices in to minimise memory footprint (collapse the largest dimensions first, tracking multiply referenced matrices), etc.

But yeah, manual transposition wouldn't be too hard, but it's probably a bad idea because it wrecks a basic optimisation, instead the row/column major pattern should ideally be inferred and be hidden or exposed only by a dedicated function. I've got more to say about tensors, btw.

view this post on Zulip Declan Joseph Maguire (Oct 17 2023 at 08:37):

Brendan Hansknecht said:

it is np.matmul(a, b)

Nowadays you can use @ as an infix for matrix multiplication. It was explicitly added for this usecase to python, though without being bound to that usecase. Obviously python has good reasons to listen to the linalg library makers, but it's still notable that python thought it necessary enough to add a whole infix to the base language just to enable it.

view this post on Zulip Anton (Oct 17 2023 at 09:12):

instead the row/column major pattern should ideally be inferred and be hidden or exposed only by a dedicated function.

Could we have a type Matrix rows cols that is an alias for RawMatrix rows cols [RowMajor, ColMajor, Diagonal, UpperTriangular, LowerTringular, Identity, ...]? In the beginning we could limit it to RawMatrix rows cols [RowMajor, ColMajor].

This could allow us to provide a simple API to most users, give power users nice options and allow us to optimize matrix multiplications behind the scenes some time in the future.

view this post on Zulip Declan Joseph Maguire (Oct 17 2023 at 09:12):

@Johan Lövgren Thank you for sparing everyone my rambly explanation of the topic.

As for tensors, I believe in current numpy, the first two dimensions of a tensor get treated like the two dimensions of a matrix. You then interpret matrix multiplication as tensor contraction (as Johan explained), resulting in a final matrix. To make a sort of pseudo-type signature with variables, it would be *: Ten a b y, Ten b c z -> Ten a c y z. Most other things generalise along the line @Brendan Hansknecht mentioned when talking about broadcasting.

As for actually __using__ tensors, the big time I've used them is quantum simulations. In those cases, I used tensors with 4 dimensions, alongside a bunch of others. Here's a small exerpt of the code:

M = self.get_pair_tensor(j)
M_new = np.einsum('astb,stuv -> auvb',
        M, # astb
        operator, # stuv
        optimize = True)
self._set_pair_from_tensor(j, M_new)

If you see that weird string in the second line, that's actually the instructions to einsum telling it what contraction to perform on M and operator. This is actually an example of einstein sum notation (hence einsum), and I think it shows a pretty good upper limit, the point at which trying to continue supporting * for tensors becomes redundant. The letters represent and label the dimensions/indices, and letters may be repeated (no more than twice) to implicitly indicate summation/contraction over that index. You can have any non-zero number of tensors in an expression, and any pair of indices can be chosen to be contracted/summed over/given the same name.

This einstein notation actually generalises dot products, cross products, outer products, matrix multiplication, transposition, and a bunch of other things. If we were talking in OOP land, it'd be a dope class to inherit from. Point being, if you can get the type system to be robust enough for einsum, and general broadcasting, you'd probably be able to solve 90% of your type problems. The last 10% is probably banal stuff like glueing two matrices edge-to-edge to make a bigger one.

view this post on Zulip Declan Joseph Maguire (Oct 17 2023 at 09:18):

Richard Feldman said:

We could get into type-level numbers, but since we want to keep type system changes to a minimum, let's say for the sake of argument we used Peano numbers (which are implementable in Roc just like they are in Elm) to encode the rows and columns of a matrix in its type. (Peano numbers like this have atrocious ergonomics, so if we actually ended up doing this we'd presumably want type-level numbers, but let's just go with it for now.)

Regarding peano numbers, are you asking for practical implementations in Peano arithmetic, or just "in principle" ones? Because as Godel taught us, peano arithmetic can represent anything if you really really really really try hard enough.

view this post on Zulip Brendan Hansknecht (Oct 17 2023 at 14:54):

As for tensors, I believe in current numpy, the first two dimensions of a tensor get treated like the two dimensions of a matrix... *: Ten a b y, Ten b c z -> Ten a c y z

Other way around. Numpy does trailing access agreement. It pretends you have a list of matrices. Then does matrix multiplication on the inner dimensions with broadcasting. So Ten a b y, Ten a y z -> Ten a b z. It will do broadcasting if the shapes don't align so Ten a b y, Ten x y z -> Ten max(a, x) b z

view this post on Zulip Brendan Hansknecht (Oct 17 2023 at 15:01):

Also, I don't think Peano numbers can represent broadcasting (though I don't fully understand peano numbers). Can Peano numbers represent picking the max of 2 different numbers in the type system? Can they also add a check that either two numbers are the same, or one of them is one? Also, they would need to deal with understand shapes and which dimensions align when dealing with mixing different rank tensors.

view this post on Zulip Brendan Hansknecht (Oct 17 2023 at 15:04):

I'm pretty sure to support static tensors we would either need, a way more powerful type system, special logic for static tensors, or the really bad form of static tensors that can't do any automatic shape stuff.

As such, I think we are stuck with either huge changes or dynamic tensors.

Could we have a type Matrix rows cols

I am pretty sure we would just have Tensor or NdMatrix or whatever the name with no type variables. I think it has to be dynamic and there is no need to put any of the rest of that information in a type exposed way. May be part of the underlying type, but it would be hidden from the user (mode exposed on explicit request)

view this post on Zulip Brendan Hansknecht (Oct 17 2023 at 15:17):

Further, you often have cases like diagonal, tridiagonal, and upper/lower triangular matrices, which only store part of their memory with the rest implicitly 0, which are super common cases in many systems (tridiagonals often come up when you have systems of equations, each variable only directly coupled to its 1D neighbours). The most exotic one (the only one whose memory footprint isn't knowable from shape) is a sparse array, which is basically a hashtable, but a matrix (many matrices are huge, but almost everywhere 0). Similarly you have symmetric/anti-symmetric/self-ajoint matrices for whom , which both allows memory saving and lets you always use efficient layouts.

If you see that weird string in the second line, that's actually the instructions to einsum telling it what contraction to perform on M and operator. This is actually an example of einstein sum notation (hence einsum), and I think it shows a pretty good upper limit, the point at which trying to continue supporting * for tensors becomes redundant. The letters represent and label the dimensions/indices, and letters may be repeated (no more than twice) to implicitly indicate summation/contraction over that index. You can have any non-zero number of tensors in an expression, and any pair of indices can be chosen to be contracted/summed over/given the same name.

All of this is the type of complicated and ever expanding list of stuff is what makes me think none of this should be builtin. I really do think this is a complex enough space that there is extremely huge merit in figuring out how to enable it in userland. This is especially true given the amount of domain specifics are possible in these fields. I don't think we want to pull the entire complexity of numpy or some other high performance matrix library into the builtins. That said, if we don't pull in that level of complexity and support, we would probably be leaving a lot of people dissatisfied.

Also, I think that discussing type system is really the wrong conversation here. I think we should be discussing overall complexity, ease of use of the end library, and long term support. Fundamentally, almost everyone who write matrix code works in a dynamic language (matlab, octave, python, julia, mathematica, R, array languages, even most c/fortan code tends to use dynamically shape tensor libraries). The amount of people that care about typing for matrix code is very small. The code tends to be dynamic in a way that is very easy to debug when shapes don't align and otherwise not really use types.

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:06):

As such, I think we are stuck with either huge changes or dynamic tensors.

that seems to be emerging from this discussion, I agree

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:13):

don't think we want to pull the entire complexity of numpy or some other high performance matrix library into the builtins. That said, if we don't pull in that level of complexity and support, we would probably be leaving a lot of people dissatisfied.

I agree that the builtins shouldn't get a numpy-ish level of complexity, but I don't think that's necessary. The point of this discussion is to explore the idea of making tensors builtins for the primary purpose of making infix math operators work with them (without requiring introducing operator overloading to the language), with the possible additional benefit of perhaps making it possible to have things like myMatrix * 2 work

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:17):

today, people can build large and complex math libraries on top of Num, e.g. for statistics. We don't incorporate that complexity into the builtins, and we shouldn't.

I think of the tensors-as-builtins ideas in the same way: making tensors builtins gives people a baseline to build on (e.g. basic arithmetic operations, which work with the normal math infix operators) but not trying to be a comprehensive solution

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:18):

another way to say it: I think the most important job of builtins is to provide things that can't be provided any other way - e.g. access to operations that take a single CPU instruction, operations whose implementation requires doing unsafe UTF-8 things, etc.

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:18):

and today that includes infix operators.

view this post on Zulip Brendan Hansknecht (Oct 17 2023 at 16:34):

For sure, I think the big question is: Given all the possible ways to represent a matrix, can we give a base that would satisfy the needs of essentially all users?

As pointed out by @Declan Joseph Maguire, there are a ton of representation optimizations that are possible. In some fields (specifically those with huge sparse matrices), it is a non-starter if you don't have the correct representation.

Even in other simpler physics and math based fields, there are still tons of potential storage and compute based optimizations that would be wanted.

All of these pieces would need to be need be represented in our matrix type as a baseline to enable users. Of course, we could choose to just a have a simple matrix (and that is probably fine for most users), but it kinda is opening the door for one subset of people to get nice infix math while another group has no access to it due to using sparse (or otherwise optimized) tensors that no longer match the builtin type.

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:42):

yeah that's a very interesting point

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:42):

of note, I think builtin tensors is the only way to get myMatrix * 2 to work

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:44):

unless I'm missing something

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:45):

although I suppose we could theoretically have separate infix operators for both matrix multiplication and also multiplication by a scalar

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:48):

also worth noting:

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:49):

accessing individual elements for dynamic matrices would always have to be a function call, e.g. Mat.get myMatrix a b

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:50):

not myMatrix[a][b]

view this post on Zulip Richard Feldman (Oct 17 2023 at 16:51):

how big of a deal is that?

view this post on Zulip Brendan Hansknecht (Oct 17 2023 at 16:54):

Probably not a big deal at all.

view this post on Zulip Jacob (Oct 17 2023 at 17:12):

Doesn't futhark have something for this?

view this post on Zulip Brendan Hansknecht (Oct 17 2023 at 17:13):

which "this"?

view this post on Zulip Elias Mulhall (Oct 17 2023 at 22:26):

(deleted)

view this post on Zulip Elias Mulhall (Oct 17 2023 at 22:28):

(deleted)

view this post on Zulip Declan Joseph Maguire (Oct 18 2023 at 07:03):

Richard Feldman said:

accessing individual elements for dynamic matrices would always have to be a function call, e.g. Mat.get myMatrix a b

That sucks big big time but it'd be a price I'd pay if I can have my arithmetic. Plus, I reckon that heavily using indices to access specific values is a sort of antipattern when overused by otherwise higher level scripts. It's so easy even with good notation to get utterly lost.

Also, regarding static vs dynamic, might we have tensors be dynamic by default, but allow the user to override that when certain conditions are met? Then games and graphics people can make their matrices and associated libraries nice and neat and efficient without too much pain.

view this post on Zulip Brendan Hansknecht (Oct 18 2023 at 14:42):

A user definitely can make a static matrix, but with current proposals, they just won't be as ergonomic.

view this post on Zulip Brendan Hansknecht (Oct 18 2023 at 14:43):

For example, they won't be able to get infix matmul except with square matrices due to type system limitations


Last updated: Jun 16 2026 at 16:19 UTC