builtin matrix support · ideas · Zulip Chat Archive

this idea came up in #ideas > customizing infix operators. That discussion is very long, so I want to scope this thread with the following assumptions:

Richard Feldman (Oct 16 2023 at 15:00):

even with full builtin support, there are some tricky design questions here. Here's a pretty fundamental one: the type of Num.mul (which * desugars to) is:

mul : Num a, Num a -> Num a

This is fundamentally not the type we want for matrix multiplication, because the rule for matrix multiplication is:

These rules are way more specific than a, a -> a no matter what implements (or Num wrapper) we put around them; normal type unification alone won't implement them.

We could get into type-level numbers, but since we want to keep type system changes to a minimum, let's say for the sake of argument we used Peano numbers (which are implementable in Roc just like they are in Elm) to encode the rows and columns of a matrix in its type. (Peano numbers like this have atrocious ergonomics, so if we actually ended up doing this we'd presumably want type-level numbers, but let's just go with it for now.)

mulMat : Matrix rows a, Matrix a cols -> Matrix rows cols

Richard Feldman (Oct 16 2023 at 15:01):

so if we want * to work for matrix multiplication, and we want to track the rows and columns in the type of the matrix, then we have to figure out how to get that function :point_up: to accept both vectors and scalars and do the right thing

Richard Feldman (Oct 16 2023 at 15:02):

another thing I'm not sure of: matrices are 2-dimensional, but what if you want more dimensions than that?

Richard Feldman (Oct 16 2023 at 15:03):

personally I genuinely don't know what people use higher-dimensional math than matrices for (e.g. I know matrices are used in games, other 3D programming, physics simulations, etc., but I don't know what the use cases are for higher-rank tensors than matrices) - e.g. are people doing 74-dimensional stuff? If so, what are they using it for?

(The difference between "we really need to support N-dimensional, where N could be anything" and "we can pick a small, hardcoded number of dimensions to support, and it'll be okay if you don't get good support in Roc for more dimensions than that" is absolutely massive in terms of what designs are possible, so this is a very important question to answer.)

Richard Feldman (Oct 16 2023 at 15:03):

I also don't know what the types would need to be for multiplication of tensors that have more dimensions than matrices do

Richard Feldman (Oct 16 2023 at 15:04):

of note, it's possible that where this line of exploration ends up is the conclusion that we just can't possibly have the dimensions of the matrices encoded in the type, and instead their dimensions need to be stored for runtime use only - but I'd like to fully explore this idea of encoding the dimensions in the type first

Richard Feldman (Oct 16 2023 at 15:05):

oh, another thing worth noting: if someone wanted to define that mulMat (perhaps with Peano numbers), that's already doable today; it just wouldn't work with *

Richard Feldman (Oct 16 2023 at 15:06):

so the real question I'm trying to get at is: can we have * desugar to a function whose type works seamlessly with matrices, vectors, and also scalars

Richard Feldman (Oct 16 2023 at 15:07):

(because having it "desugar to something else depending on types" would be equivalent from a language perspective to having function overloading based on type, which is not something I think Roc should ever have - so multiplication has to be one function!)

Richard Feldman (Oct 16 2023 at 15:10):

another possibility here: maybe we can't figure out a single function that * could desugar to which meets all of these criteria, but encoding the matrix dimensions in its type is so important that this means matrix multiplication shouldn't be done with *.

another thing I think is a potentially interesting possibility is that we have * for scalar multiplication (maybe vectors too? That might work better from a type perspective), and then another operator (e.g. **) for matrix multiplication, which desugars to a different function. However, that wouldn't scale to higher dimensions; as soon as you wanted more dimensions than a matrix, you'd be back to prefix function calling.

Brendan Hansknecht (Oct 16 2023 at 15:27):

mulMat : Matrix a rows, Matrix a cols -> Matrix rows cols

Just pointing this out because I think it is important to realize that matrices are complex and it isn't just *. Also, if you make * mean matrix multiplication, how do you do pervasive multiplication of a matrix (which is also a very common operations. 2 * myMat is not requesting for us to turn 2 into a matrix and then do matrix multiplication. It is requesting pervasive multiplication where each element of myMat is multiplied by 2.

Richard Feldman (Oct 16 2023 at 15:29):

also, is that type correct for what matrix multiplication is supposed to do? :face_with_raised_eyebrow:

Brendan Hansknecht (Oct 16 2023 at 15:31):

It's about memory access. In the original signiture, you are traversing the columns of one matrix and the rows of anther. This means that in one matrix you are jumping around in memory with strided accesses instead of just linearly reading a dense cacheline.

Brendan Hansknecht (Oct 16 2023 at 15:31):

by transposing one matrix, you can make it so that all of the data is in dense linear chunks of memory.

Brendan Hansknecht (Oct 16 2023 at 15:33):

It is not the correct type based on how it would be written out in match, but it is pretty common in high performance code to try to have one matrix transposed where possible. That said, there are lots of techniques to try and make the first version without transposing faster (generally by transposing in chunks and then doing dense math, but it is never as efficient)

Brendan Hansknecht (Oct 16 2023 at 15:40):

Probably the most common is for a list of matrices. It then puts all of the matrices in one contiguous chunk of memory (instead of having a list of pointers to matricies). On top of that, you may want do operations across all elements in a way that can be defined as a single math op instead of writing a loop. (higher dimensions are also super super common in ML)

As a concrete example. In my ray tracer, I have multiple rays for ever pixel. Each ray has 3 data points (x, y, z) So that is a tensor of shape samples x rows x cols x 3. By simply transposing it, I can get 3 x samples x rows x cols. Now that the outer most dimension is 3, I can do any sort of vector math that expects 3 elements on the entire tensor at once. For example, shift them over by 1. rays - [0 0 1]

Brendan Hansknecht (Oct 16 2023 at 15:40):

In the real ray tracer, I solve collision equations for all rays at once and don't need to explicitly write out any looping. This can all be done with fast simd over the dense tensor.

Richard Feldman (Oct 16 2023 at 15:41):

Brendan Hansknecht (Oct 16 2023 at 15:42):

Though if you have to partition something over multiple execution untis or over time, those often become dimensions in the tensor, so that can add a few extra.

Lakin Wecker (Oct 16 2023 at 15:43):

Yeah, and 3 dimensions are common in multidimensional data fitting for the reasons that Brendan mentioned

Richard Feldman (Oct 16 2023 at 15:44):

ok so let's say we have two 5-dimensional values - what would the type look like to multiply them?

Brendan Hansknecht (Oct 16 2023 at 15:47):

That would be pervasive multiplication. It works with anything with pretty minimal agreement. Matrix multiplication is special.

Matrix multiplication is the sum reduction of the pervasive multiplication of all rows with all columns. (or all rows with all rows if you transpose one matrix)
This also applies to higher dimensional tensor. It is just that the higher dimensions are ignored/perserved.

Richard Feldman (Oct 16 2023 at 15:49):

interesting! so the exact signature of mulMat I wrote above would work for all use cases in ML?

Richard Feldman (Oct 16 2023 at 15:49):

Brendan Hansknecht (Oct 16 2023 at 15:51):

mulNd : Tensor a b c ... n row x, Tensor a b c ... n x cols -> Tensor A B C ... N rows cols

Brendan Hansknecht (Oct 16 2023 at 15:52):

Note, this is still touching all elements in both tensors and modifying them, so it isn't really doing pervasive mulitplication for all higher dimensions, it just isn't changing the size of higher dimensions.

Brendan Hansknecht (Oct 16 2023 at 15:53):

But yeah, it could be seen as looping over all higher dimensions until you get to the inner 2 and then doing matrix multiplication.

Richard Feldman (Oct 16 2023 at 15:53):

Brendan Hansknecht (Oct 16 2023 at 15:54):

Just trying to point out the repeating pattern. Obviously would have to be unique to each tensor rank to fit into peano numbers.

Richard Feldman (Oct 16 2023 at 15:54):

Brendan Hansknecht (Oct 16 2023 at 15:55):

Richard Feldman (Oct 16 2023 at 15:56):

which, if we wanted all of those to support infix operations, means we'd need something like ** for 2d, *** for 3d, **** for 4d, etc.

Brendan Hansknecht (Oct 16 2023 at 15:56):

Richard Feldman (Oct 16 2023 at 15:56):

Brendan Hansknecht (Oct 16 2023 at 15:57):

That said, I think in practice, I think that it is fine for higher dimension matrices to just use dynamic shape generally.

Brendan Hansknecht (Oct 16 2023 at 15:57):

Richard Feldman (Oct 16 2023 at 15:57):

that sounds reasonable to me, but then again I have no experience in the domains in question :laughing:

Brendan Hansknecht (Oct 16 2023 at 15:57):

Cause generally, by dim 3 or 4, the tensors are big enough that the dynamic nature has litle cost

Richard Feldman (Oct 16 2023 at 15:57):

Richard Feldman (Oct 16 2023 at 15:58):

sure, although there's also the question of how valuable it is to represent the rows/cols/etc in the type for reasons other than performance (I have no idea how valuable that is or isn't in practice!)

Brendan Hansknecht (Oct 16 2023 at 15:58):

If we want to start simple, I would argue for just going 100% dynamic shape and then later debate adding in some static versions for smaller ranks.

Brendan Hansknecht (Oct 16 2023 at 16:02):

From what I have seen from my many engineer friends. Everything they do is always dynamic (and often totally untyped)

Brendan Hansknecht (Oct 16 2023 at 16:03):

Don't know about games, scientific computing, or other applications. Have only messed with rust/c++ libraries that tend to give both as options by using super complex templates and such.

Lakin Wecker (Oct 16 2023 at 16:03):

Yeah, aside from specific pipelines (3D rendering being one obvious one), most matrices are dynamically sized and operations checked at runtime in my experience in other languages, which is limited.

Richard Feldman (Oct 16 2023 at 16:03):

Lakin Wecker (Oct 16 2023 at 16:03):

Richard Feldman (Oct 16 2023 at 16:04):

Lakin Wecker (Oct 16 2023 at 16:04):

Eigen3 is the library I'm most familiar with and it supports both styles (typed matrix sizes and dynamic sizes). And we only used the typed sizes for rendering pipelines and some specific physical simulations. Everything else ends up being dynamic

Richard Feldman (Oct 16 2023 at 16:05):

Lakin Wecker (Oct 16 2023 at 16:05):

Lakin Wecker (Oct 16 2023 at 16:06):

But I dunno if I have wide enough experience to be certain this is sufficient for all purposes. So consider it more of a confirmation that it's satisfactory for my own purposes.

Brendan Hansknecht (Oct 16 2023 at 16:15):

Also, one advantage of making matrices a builtin is that they can access the refcount.

Richard Feldman (Oct 16 2023 at 16:15):

Brendan Hansknecht (Oct 16 2023 at 16:18):

The main issue being when you add 2 matrices for example, you want to do it in place, but you can't tell which of the two matrices is unique (if I either). If you happen to pick a unique matrix, our list unique work will kick in, but it is nicer to be able to check both matrices rather than guessing. (of course could be track in userland with an extra integer but would need to be manually updated which doesn't sound great).

Richard Feldman (Oct 16 2023 at 16:18):

@Lakin Wecker and @Declan Joseph Maguire I'm curious what your perspective on this - how would you feel about matrix multiplication requiring that you transpose the first argument before calling the function, if it meant better performance?

Brendan Hansknecht (Oct 16 2023 at 16:21):

Brendan Hansknecht (Oct 16 2023 at 16:31):

Also, just to clarify, if we make a matmul infix operator, we probably want it to match standard matmul (not transposed). But in most libraries, they expose both. Cause in many cases, you have a constant or otherwise that you can just created transposed or transpose once and then repeatedly use over and over again. So might as well just take the free perf.

Lakin Wecker (Oct 16 2023 at 16:35):

I'd prefer developer ergonomics over raw performance. If I need raw performance I may end up back in a different language anyways. My understanding of Roc is that we want to be close enough with raw performance, but with better ergonomics. If we're considering starting with dynamic sizes at a small runtime overhead expense, then we could also consider having both column major and row major representations of matrices, which might solve the same purpose without the confusing API. Maybe this complicates things way too much though, so I'll defer to others who are more familiar with these details. I will point out that there is a whole rabbit hole here, such as things like z-order curve element ordering for efficient memory traversal.

Lakin Wecker (Oct 16 2023 at 16:37):

I'm also often dealing with banded sparse matrices with known repeating patterns as well which use a completely different in memory representation, and so it's unclear to me if ensuring that the first matrix is always transposed will help with performance in this case anyways

Anton (Oct 16 2023 at 16:38):

Could the transpose be used automatically behind the scenes by the compiler as an optimization?

Lakin Wecker (Oct 16 2023 at 16:39):

The compiler doesn't know which matrix you're going to multiply with another one, and for large enough matrices transposing is a pretty expensive operation if you change the memory layout.

Lakin Wecker (Oct 16 2023 at 16:39):

We're not trying to go down the road of building a full linear algebra library here with matrix decompositions and the like as a built in, so I think something that loosely translates to a memory layout that is compatible with existing libraries (such as rendering libraries) and has a nice ergonomic API to write simple arithmetic like code is what I'd aim for

Brendan Hansknecht (Oct 16 2023 at 16:42):

For sure, but it would probably be a pretty limited pattern. Anyway, yeah, probably not an optimization to worry about for now.

Brendan Hansknecht (Oct 16 2023 at 16:45):

Most existing libraries that I know of tend to have some sort of tensor wrapper that is just a pointer and a shape (maybe also a flag about traverasl). So should be easy to expose a view in general

Lakin Wecker (Oct 16 2023 at 16:46):

Johan Lövgren (Oct 16 2023 at 19:14):

(Here follows some math which gives a little bit of extra context, sorry if it is too much.) Here is a useful way to think about matrix and tensor multiplication. We can think of a matrix

A

in terms of its components

A_{ij}

, where

i

enumerates the rows and

j

enumerates the columns (so both go from 1 to 3 for 3D square matrices). Then when we multiply with some other matrix

B

. The result can be found summing over "the contracted index", and multiplying the components:

C=A * B

if the components of C are

C_{ik} = \sum_{j} A_{ij}B_{jk}

. Or with a convention that repeated indices are always summed over,

C_{ik} =A_{ij}B_{jk}

Also, given our matrices above, we can perform other contractions, e.g. by contracting the first index of A with the second of B, which is the same as first transposing

B

and then doing a matrix multiplication with

B^T

to the left:

A_{ij}B_{ki} =B^{T}_{ki}A_{ij}=(B^T A)_{kj}

And for higher-ranked tensors we have many possible different products. All that is required is that the contracted indices (i.e. the dimensions summed over) both enumerate the same size. So for example if we have rank-3 tensor

D_{ijk}

and a rank-4 tensor

E_{lmno}

then there are many different ways to multiply them! Contracting one index removes one index from each tensor, so we end up with a tensor of rank 3 + 4 - 2 =5. So e.g.

F_{ijklm}=D_{ija}E_{aklm}

, where

a

is summed over. (i.e. this case requires the third dimension of D and the first dimension of E to be of the same size).

Johan Lövgren (Oct 16 2023 at 19:17):

The reason I bring all of this up is because it kind of explains the type in your expression of matrix multiplicaiton @Richard Feldman:

mulMat : Matrix rows a, Matrix a cols -> Matrix rows cols

Here a is a type variable , which enumerates the index we contract. And then a is not present in the final expression in the end. Same would be true for generic tensor multiplication

Richard Feldman (Oct 16 2023 at 19:19):

right, but given that each infix operator needs to correspond to exactly one type signature function, it's very relevant which one(s) people expect * to do in practice, and which of the others (if any) require infix operators in order for the ergonomics to be nice! :big_smile:

Johan Lövgren (Oct 16 2023 at 19:23):

To be honest I don't know what people would expect. To me it seems like it could mean anything honestly. This might actually be a good argument to restrict it to matrices. Because for matrices you always get a matrix back, i.e. by contracting you go 2 + 2 - 2 = 2.

Brendan Hansknecht (Oct 16 2023 at 20:27):

There is definitely a reason I think that all of our default infix operators should just be the pervasive ops with broadcasting. They are simple, clear, and consistent with what scalars do.

Brendan Hansknecht (Oct 16 2023 at 20:27):

We can then choose to add a special operator for the special case of matrix multiplication, but that would be a side question

Brendan Hansknecht (Oct 16 2023 at 20:28):

I mean lots of people use numpy to great success and it doesn't have an infix matrix multiplication op (I know, this isn't an argument for not improving, but I think it points to what a solid baseline is)

Brendan Hansknecht (Oct 16 2023 at 20:28):

Brendan Hansknecht (Oct 16 2023 at 20:29):

I think pervasive ops as infix gives us something that will always just feel like scalars and will still be helpful in many cases. Then named ops (or maybe a few special infix ops) can be used for other cases.

Richard Feldman (Oct 16 2023 at 20:52):

also just to clarify since this is a different thread: what does broadcasting mean in this context?

Brendan Hansknecht (Oct 16 2023 at 21:00):

If you have a vector of shape [3] and a matrix of shape [10, 3], you can do regular math with them. So vector + matrix is equivalent to "broadcasting" the vector to [10, 3] and then adding it to the matrix. So it is as if you had duplicated the elements multiple times.

Note: this can be done on either the leading or trailing axis and we would have to pick which direction we want. With trailing axis (like numpy), [3] can be added to [10 3]. With leading axis (like most array languages), [3] can be added to [3 10]. There are reasons to pick either, we probably should follow scientific computing by looking at numpy, matlab, and julia (which I think are all trailing axis for this, but should double check).

>>> import numpy as np
>>> v = np.array([1, 2, 3])
>>> m = np.array ([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
>>> v.shape
(3,)
>>> m.shape
(5, 3)
>>> v + m
array([[2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7],
       [6, 7, 8]])

Declan Joseph Maguire (Oct 17 2023 at 08:31):

The transposition is very common. However, this touches on a deep nerve, which is that the underlying representations of matrices are often different from the explicit form for performance reasons, often in a variety of ways. For example, there's the distinction between row-major and column-major encodings, which is what @Brendan Hansknecht is getting at. These are respectively those cases where rows or columns are memory contiguous. As such, most implementations of matrices/tensors have a lot of hidden implementation details, describing how the dimensions are laid out in memory, as well as stuff like "has this whole thing been negated/conjugated?" This makes many operations have negligible footprint, like transposition, because you just need to edit the metadata to change how the memory gets interpreted.

Further, you often have cases like diagonal, tridiagonal, and upper/lower triangular matrices, which only store part of their memory with the rest implicitly 0, which are super common cases in many systems (tridiagonals often come up when you have systems of equations, each variable only directly coupled to its 1D neighbours). The most exotic one (the only one whose memory footprint isn't knowable from shape) is a sparse array, which is basically a hashtable, but a matrix (many matrices are huge, but almost everywhere 0). Similarly you have symmetric/anti-symmetric/self-ajoint matrices for whom

a_{ji} = a_{ij}|-a_{ij}|a_{ij}^*

, which both allows memory saving and lets you always use efficient layouts. BTW almost all of these classes are closed under addition and multiplication.

I have no idea how much of this, if any, is compatible with the range of type systems under consideration. In practice, it's common for people to have to keep track of row vs. column major for performance, so I don't think it's a terrible ask for people to manually transpose, though I'd _heavily_ suggest making transposition not fiddle with the underlying memory anyway as it's ubiquitous and otherwise very costly, which would make manual transposition ineffective.

The dream would be a system that handles these considerations for you, choosing the most efficient underlying representation for a given task, choosing the right order to multiply matrices in to minimise memory footprint (collapse the largest dimensions first, tracking multiply referenced matrices), etc.

But yeah, manual transposition wouldn't be too hard, but it's probably a bad idea because it wrecks a basic optimisation, instead the row/column major pattern should ideally be inferred and be hidden or exposed only by a dedicated function. I've got more to say about tensors, btw.

Declan Joseph Maguire (Oct 17 2023 at 08:37):

Nowadays you can use @ as an infix for matrix multiplication. It was explicitly added for this usecase to python, though without being bound to that usecase. Obviously python has good reasons to listen to the linalg library makers, but it's still notable that python thought it necessary enough to add a whole infix to the base language just to enable it.

Anton (Oct 17 2023 at 09:12):

Could we have a type Matrix rows cols that is an alias for RawMatrix rows cols [RowMajor, ColMajor, Diagonal, UpperTriangular, LowerTringular, Identity, ...]? In the beginning we could limit it to RawMatrix rows cols [RowMajor, ColMajor].

This could allow us to provide a simple API to most users, give power users nice options and allow us to optimize matrix multiplications behind the scenes some time in the future.

Declan Joseph Maguire (Oct 17 2023 at 09:12):

@Johan Lövgren Thank you for sparing everyone my rambly explanation of the topic.

As for tensors, I believe in current numpy, the first two dimensions of a tensor get treated like the two dimensions of a matrix. You then interpret matrix multiplication as tensor contraction (as Johan explained), resulting in a final matrix. To make a sort of pseudo-type signature with variables, it would be *: Ten a b y, Ten b c z -> Ten a c y z. Most other things generalise along the line @Brendan Hansknecht mentioned when talking about broadcasting.

As for actually __using__ tensors, the big time I've used them is quantum simulations. In those cases, I used tensors with 4 dimensions, alongside a bunch of others. Here's a small exerpt of the code:

M = self.get_pair_tensor(j)
M_new = np.einsum('astb,stuv -> auvb',
        M, # astb
        operator, # stuv
        optimize = True)
self._set_pair_from_tensor(j, M_new)

If you see that weird string in the second line, that's actually the instructions to einsum telling it what contraction to perform on M and operator. This is actually an example of einstein sum notation (hence einsum), and I think it shows a pretty good upper limit, the point at which trying to continue supporting * for tensors becomes redundant. The letters represent and label the dimensions/indices, and letters may be repeated (no more than twice) to implicitly indicate summation/contraction over that index. You can have any non-zero number of tensors in an expression, and any pair of indices can be chosen to be contracted/summed over/given the same name.

This einstein notation actually generalises dot products, cross products, outer products, matrix multiplication, transposition, and a bunch of other things. If we were talking in OOP land, it'd be a dope class to inherit from. Point being, if you can get the type system to be robust enough for einsum, and general broadcasting, you'd probably be able to solve 90% of your type problems. The last 10% is probably banal stuff like glueing two matrices edge-to-edge to make a bigger one.

Declan Joseph Maguire (Oct 17 2023 at 09:18):

Regarding peano numbers, are you asking for practical implementations in Peano arithmetic, or just "in principle" ones? Because as Godel taught us, peano arithmetic can represent anything if you really really really really try hard enough.

Brendan Hansknecht (Oct 17 2023 at 14:54):

Other way around. Numpy does trailing access agreement. It pretends you have a list of matrices. Then does matrix multiplication on the inner dimensions with broadcasting. So Ten a b y, Ten a y z -> Ten a b z. It will do broadcasting if the shapes don't align so Ten a b y, Ten x y z -> Ten max(a, x) b z

Brendan Hansknecht (Oct 17 2023 at 15:01):

Also, I don't think Peano numbers can represent broadcasting (though I don't fully understand peano numbers). Can Peano numbers represent picking the max of 2 different numbers in the type system? Can they also add a check that either two numbers are the same, or one of them is one? Also, they would need to deal with understand shapes and which dimensions align when dealing with mixing different rank tensors.

Brendan Hansknecht (Oct 17 2023 at 15:04):

I'm pretty sure to support static tensors we would either need, a way more powerful type system, special logic for static tensors, or the really bad form of static tensors that can't do any automatic shape stuff.

I am pretty sure we would just have Tensor or NdMatrix or whatever the name with no type variables. I think it has to be dynamic and there is no need to put any of the rest of that information in a type exposed way. May be part of the underlying type, but it would be hidden from the user (mode exposed on explicit request)

Brendan Hansknecht (Oct 17 2023 at 15:17):

All of this is the type of complicated and ever expanding list of stuff is what makes me think none of this should be builtin. I really do think this is a complex enough space that there is extremely huge merit in figuring out how to enable it in userland. This is especially true given the amount of domain specifics are possible in these fields. I don't think we want to pull the entire complexity of numpy or some other high performance matrix library into the builtins. That said, if we don't pull in that level of complexity and support, we would probably be leaving a lot of people dissatisfied.

Also, I think that discussing type system is really the wrong conversation here. I think we should be discussing overall complexity, ease of use of the end library, and long term support. Fundamentally, almost everyone who write matrix code works in a dynamic language (matlab, octave, python, julia, mathematica, R, array languages, even most c/fortan code tends to use dynamically shape tensor libraries). The amount of people that care about typing for matrix code is very small. The code tends to be dynamic in a way that is very easy to debug when shapes don't align and otherwise not really use types.

Richard Feldman (Oct 17 2023 at 16:06):

Richard Feldman (Oct 17 2023 at 16:13):

I agree that the builtins shouldn't get a numpy-ish level of complexity, but I don't think that's necessary. The point of this discussion is to explore the idea of making tensors builtins for the primary purpose of making infix math operators work with them (without requiring introducing operator overloading to the language), with the possible additional benefit of perhaps making it possible to have things like myMatrix * 2 work

Richard Feldman (Oct 17 2023 at 16:17):

today, people can build large and complex math libraries on top of Num, e.g. for statistics. We don't incorporate that complexity into the builtins, and we shouldn't.

I think of the tensors-as-builtins ideas in the same way: making tensors builtins gives people a baseline to build on (e.g. basic arithmetic operations, which work with the normal math infix operators) but not trying to be a comprehensive solution

Richard Feldman (Oct 17 2023 at 16:18):

another way to say it: I think the most important job of builtins is to provide things that can't be provided any other way - e.g. access to operations that take a single CPU instruction, operations whose implementation requires doing unsafe UTF-8 things, etc.

Richard Feldman (Oct 17 2023 at 16:18):

Brendan Hansknecht (Oct 17 2023 at 16:34):

For sure, I think the big question is: Given all the possible ways to represent a matrix, can we give a base that would satisfy the needs of essentially all users?

As pointed out by @Declan Joseph Maguire, there are a ton of representation optimizations that are possible. In some fields (specifically those with huge sparse matrices), it is a non-starter if you don't have the correct representation.

Even in other simpler physics and math based fields, there are still tons of potential storage and compute based optimizations that would be wanted.

All of these pieces would need to be need be represented in our matrix type as a baseline to enable users. Of course, we could choose to just a have a simple matrix (and that is probably fine for most users), but it kinda is opening the door for one subset of people to get nice infix math while another group has no access to it due to using sparse (or otherwise optimized) tensors that no longer match the builtin type.

Richard Feldman (Oct 17 2023 at 16:42):

Richard Feldman (Oct 17 2023 at 16:44):

Richard Feldman (Oct 17 2023 at 16:45):

although I suppose we could theoretically have separate infix operators for both matrix multiplication and also multiplication by a scalar

Richard Feldman (Oct 17 2023 at 16:48):

Richard Feldman (Oct 17 2023 at 16:49):

accessing individual elements for dynamic matrices would always have to be a function call, e.g. Mat.get myMatrix a b

Richard Feldman (Oct 17 2023 at 16:50):

Richard Feldman (Oct 17 2023 at 16:51):

Brendan Hansknecht (Oct 17 2023 at 16:54):

Jacob (Oct 17 2023 at 17:12):

Brendan Hansknecht (Oct 17 2023 at 17:13):

Elias Mulhall (Oct 17 2023 at 22:26):

Elias Mulhall (Oct 17 2023 at 22:28):

Declan Joseph Maguire (Oct 18 2023 at 07:03):

That sucks big big time but it'd be a price I'd pay if I can have my arithmetic. Plus, I reckon that heavily using indices to access specific values is a sort of antipattern when overused by otherwise higher level scripts. It's so easy even with good notation to get utterly lost.

Also, regarding static vs dynamic, might we have tensors be dynamic by default, but allow the user to override that when certain conditions are met? Then games and graphics people can make their matrices and associated libraries nice and neat and efficient without too much pain.

Brendan Hansknecht (Oct 18 2023 at 14:42):

A user definitely can make a static matrix, but with current proposals, they just won't be as ergonomic.

Brendan Hansknecht (Oct 18 2023 at 14:43):

For example, they won't be able to get infix matmul except with square matrices due to type system limitations

Stream: ideas

Topic: builtin matrix support

Richard Feldman (Oct 16 2023 at 14:44):

Richard Feldman (Oct 16 2023 at 15:00):

Richard Feldman (Oct 16 2023 at 15:01):

Richard Feldman (Oct 16 2023 at 15:02):

Richard Feldman (Oct 16 2023 at 15:03):

Richard Feldman (Oct 16 2023 at 15:03):

Richard Feldman (Oct 16 2023 at 15:04):

Richard Feldman (Oct 16 2023 at 15:05):

Richard Feldman (Oct 16 2023 at 15:06):

Richard Feldman (Oct 16 2023 at 15:07):

Richard Feldman (Oct 16 2023 at 15:10):

Brendan Hansknecht (Oct 16 2023 at 15:27):

Richard Feldman (Oct 16 2023 at 15:29):

Brendan Hansknecht (Oct 16 2023 at 15:31):

Brendan Hansknecht (Oct 16 2023 at 15:31):

Brendan Hansknecht (Oct 16 2023 at 15:33):

Brendan Hansknecht (Oct 16 2023 at 15:40):

Brendan Hansknecht (Oct 16 2023 at 15:40):

Richard Feldman (Oct 16 2023 at 15:41):

Richard Feldman (Oct 16 2023 at 15:41):

Brendan Hansknecht (Oct 16 2023 at 15:42):

Brendan Hansknecht (Oct 16 2023 at 15:42):

Lakin Wecker (Oct 16 2023 at 15:43):

Richard Feldman (Oct 16 2023 at 15:44):

Brendan Hansknecht (Oct 16 2023 at 15:47):

Richard Feldman (Oct 16 2023 at 15:49):

Richard Feldman (Oct 16 2023 at 15:49):

Brendan Hansknecht (Oct 16 2023 at 15:51):

Brendan Hansknecht (Oct 16 2023 at 15:52):

Brendan Hansknecht (Oct 16 2023 at 15:53):

Richard Feldman (Oct 16 2023 at 15:53):

Richard Feldman (Oct 16 2023 at 15:53):

Richard Feldman (Oct 16 2023 at 15:53):

Brendan Hansknecht (Oct 16 2023 at 15:54):

Richard Feldman (Oct 16 2023 at 15:54):

Brendan Hansknecht (Oct 16 2023 at 15:55):

Richard Feldman (Oct 16 2023 at 15:56):

Brendan Hansknecht (Oct 16 2023 at 15:56):

Richard Feldman (Oct 16 2023 at 15:56):

Brendan Hansknecht (Oct 16 2023 at 15:57):

Brendan Hansknecht (Oct 16 2023 at 15:57):

Richard Feldman (Oct 16 2023 at 15:57):

Brendan Hansknecht (Oct 16 2023 at 15:57):

Richard Feldman (Oct 16 2023 at 15:57):

Richard Feldman (Oct 16 2023 at 15:58):

Brendan Hansknecht (Oct 16 2023 at 15:58):

Brendan Hansknecht (Oct 16 2023 at 16:02):

Brendan Hansknecht (Oct 16 2023 at 16:03):

Lakin Wecker (Oct 16 2023 at 16:03):

Richard Feldman (Oct 16 2023 at 16:03):

Lakin Wecker (Oct 16 2023 at 16:03):

Richard Feldman (Oct 16 2023 at 16:04):

Lakin Wecker (Oct 16 2023 at 16:04):

Richard Feldman (Oct 16 2023 at 16:05):

Lakin Wecker (Oct 16 2023 at 16:05):

Lakin Wecker (Oct 16 2023 at 16:06):

Brendan Hansknecht (Oct 16 2023 at 16:15):

Richard Feldman (Oct 16 2023 at 16:15):

Brendan Hansknecht (Oct 16 2023 at 16:18):

Richard Feldman (Oct 16 2023 at 16:18):

Brendan Hansknecht (Oct 16 2023 at 16:21):

Brendan Hansknecht (Oct 16 2023 at 16:31):

Lakin Wecker (Oct 16 2023 at 16:35):

Lakin Wecker (Oct 16 2023 at 16:37):

Anton (Oct 16 2023 at 16:38):

Lakin Wecker (Oct 16 2023 at 16:39):

Lakin Wecker (Oct 16 2023 at 16:39):

Brendan Hansknecht (Oct 16 2023 at 16:42):

Brendan Hansknecht (Oct 16 2023 at 16:45):

Lakin Wecker (Oct 16 2023 at 16:46):

Johan Lövgren (Oct 16 2023 at 19:14):

Johan Lövgren (Oct 16 2023 at 19:17):

Richard Feldman (Oct 16 2023 at 19:19):

Johan Lövgren (Oct 16 2023 at 19:23):

Brendan Hansknecht (Oct 16 2023 at 20:27):

Brendan Hansknecht (Oct 16 2023 at 20:27):

Brendan Hansknecht (Oct 16 2023 at 20:28):

Brendan Hansknecht (Oct 16 2023 at 20:28):