Array2D - advice for index type · API design

Stream: API design

Topic: Array2D - advice for index type

Elias Mulhall (Dec 20 2023 at 15:53):

I've been chatting with @Johan Lindskogen and @Ryan Bates on this github issue about how best to represent matrix indices https://github.com/mulias/roc-array2d/issues/5

Most of the necessary context is in the thread, but in case it's helpful the current implementation uses Index : { x: Nat, y: Nat }, and all of the array functions treat x as the row component and y as the column component of the index. As Johan points out that's not always going to match the user's expectations.

I think we've reached a point where I need to experiment with some options and see which ones make the most readable roc code, but I thought I'd bring the discussion over here to give people with different perspectives/backgrounds a chance to chime in. In particular I'm interested in advice from people with data science or array programming experience, since I have done very little of either!

Brendan Hansknecht (Dec 20 2023 at 16:02):

Generally, I think row and column are more intuitively clear to people

Anton (Dec 20 2023 at 16:13):

I've got data science experience :)
I like Index : { row: Nat, col: Nat } (4) for its clarity. Algorithms that use multi-dimensional arrays can very easily become hard to follow, so I think (4) mitigates this best.

LoipesMas (Dec 20 2023 at 16:15):

I've done a little bit of data science and computer graphics and I was never sure which one is which. I think generally y is the rows and x is the columns, so you index like this: array[y][x]. But that can differ between use-cases (table with records vs an image) and libraries. I think row and column would clear that up, but then you still need to remember which one is the "outer" one (this is often important for performance)
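As a rough illustration of the point above, here is a minimal Python sketch (not the roc-array2d API) of a named row/col index over nested lists. The names `Index` and `get` are hypothetical, and the sketch assumes the outer list holds the rows, so `grid[index.row][index.col]` plays the role of `array[y][x]`.

```python
from typing import NamedTuple


class Index(NamedTuple):
    """Hypothetical named index; field names make the axis order explicit."""
    row: int  # selects the outer list (one horizontal line of the grid)
    col: int  # selects the position inside that line


def get(grid, index):
    # Assumption: the outer list is the row axis, the inner list the column axis.
    return grid[index.row][index.col]


grid = [
    [1, 2, 3],
    [4, 5, 6],
]

# Second row, third column.
value = get(grid, Index(row=1, col=2))
```

The named fields remove the x/y ambiguity at the call site, but the reader still has to know that `row` is the outer axis in this layout.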

Elias Mulhall (Dec 20 2023 at 16:17):

Yes that's a good point, I'm currently using row-major ordering https://en.wikipedia.org/wiki/Row-_and_column-major_order
My understanding is that some APIs will let you choose your data order. I've thought about that a little bit but it's fiddly.
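For readers unfamiliar with the two orderings, a small Python sketch of how a (row, col) pair maps to an offset in a flat buffer under each one. The function names are made up for illustration and are not part of any library.

```python
def row_major_offset(row, col, num_rows, num_cols):
    # Rows are stored one after another, so a full row spans num_cols slots.
    return row * num_cols + col


def col_major_offset(row, col, num_rows, num_cols):
    # Columns are stored one after another, so a full column spans num_rows slots.
    return col * num_rows + row


# The same logical 2x3 matrix [[1, 2, 3], [4, 5, 6]] in both layouts.
row_major_data = [1, 2, 3, 4, 5, 6]
col_major_data = [1, 4, 2, 5, 3, 6]

# Looking up row 1, column 2 gives the same element from either buffer.
a = row_major_data[row_major_offset(1, 2, 2, 3)]
b = col_major_data[col_major_offset(1, 2, 2, 3)]
```

The logical index is the same in both cases; only the mapping to memory changes, which is why letting users choose the order mostly affects iteration performance rather than the index type.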

LoipesMas (Dec 20 2023 at 16:22):

Maybe Index : { outer: Nat, inner: Nat } could work? That's more explicit and makes fewer assumptions, but it's not a common convention, so it can be confusing. Although, IMO, it's better to have to think about it than to be confused about what x or row means in a given context

Elias Mulhall (Dec 20 2023 at 16:37):

Interesting. I think the issue with that is outer could be either the row or the column, depending on whether the data is stored in row-major or column-major order. I'll keep thinking about it though.

LoipesMas (Dec 20 2023 at 17:53):

I feel like what row and column means depends on the context and maybe how you get your data. If you load a table from a csv file it's pretty unambiguous, but in other cases not so much. If you transpose an image, rows are now columns and columns are now rows, but they're still arrays of pixels. When you render an image to screen, you have a "contract" with the library/API/whatever that inner arrays will go horizontally. But it's not a "property" of the data.
But maybe that's too philosophical/abstract and not productive :sweat_smile:

Elias Mulhall (Dec 20 2023 at 18:47):

Sure. We need to start with something that people can build their mental model off of, and I think it's pretty well understood that columns are the uppy-downies and rows are the side-to-sidies.

Ayaz Hafiz (Dec 21 2023 at 03:37):

If you're only interested in sticking with 2D arrays, I'd suggest {row, col}. If you're planning to go to higher dimensions later on, I'd honestly go with the tuple option, and expose helpers like dim0, dim1, ... to extract the position in a dimension.
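A minimal Python sketch of the tuple option, assuming an index is a plain tuple with one position per dimension. The `dim0` and `dim1` names come from the suggestion above; the generic `dim` helper is a hypothetical addition.

```python
def dim(index, n):
    # Position of the index along dimension n (0-based).
    return index[n]


def dim0(index):
    return dim(index, 0)


def dim1(index):
    return dim(index, 1)


# The same helpers work for a 2D index and for a higher-dimensional one.
index_2d = (4, 7)
index_3d = (2, 5, 1)

first = dim0(index_3d)
second = dim1(index_3d)
third = dim(index_3d, 2)
```

The trade-off is that a tuple scales to any number of dimensions but drops the self-describing field names that {row, col} provides.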

Elias Mulhall (Dec 21 2023 at 04:43):

No solid plans, but I'm considering 1d with an integer index, 2d with {row, col}, 3d with {row, col, depth}, and Nd with a tuple. I don't think that the index types being different would be too big of a deal

Jacob (Dec 30 2023 at 13:07):

Wouldn't you prefer to use column-major for "scientific" purposes? I think every scientific-purpose language or library I've used (R, Matlab, Julia, eigen/blas/lapack etc.) other than numpy (although it has order='F') uses it by default or allows you to change it. For math/physics/science books in general it's standard, but computer graphics textbooks said "nah"... Although in the end it doesn't really matter that much, it's more about preference/consistency.

Last updated: Jul 26 2025 at 12:14 UTC