Stream: ideas

Topic: Fusion without Iter


view this post on Zulip Ayaz Hafiz (Sep 19 2024 at 01:14):

Cool proposal. Glad this is being discussed.

One thought I have is I think this will actually increase compile times much more than it may seem, and the optimization is not guaranteed. This optimization problem is exactly the kind that typically requires a lot of compute work. The reason I say this is because at its core, the optimization relies on the user writing a lot of IR with intermediate calls, and having some batch optimization later on aggregate and cancel out the intermediates.

I wonder if there's a way to achieve the goals here without generating a ton of intermediate code that then needs to be thrown away. One idea would be to treat List as the common denominator, and have an analysis that tracks direct calls to List.* functions in a sequence. Then, if there is a region of code that only contains sequenced calls to List.* functions, you know you can inline those directly, and specialize sequenced List.map/etc. without having to traverse big chains of code that you need to eliminate or consider inlining - the directive for what to do is defined explicitly by the analysis.

view this post on Zulip Ayaz Hafiz (Sep 19 2024 at 01:20):

(the reason this is so significant is this is exactly why LLVM or other optimizers can be pathologically slow - if you feed them a ton of code to be optimized away, they will do so but take a very long time. if it's possible to feed optimal code earlier on, you win compile time)

view this post on Zulip Richard Feldman (Sep 19 2024 at 01:28):

Ayaz Hafiz said:

One thought I have is I think this will actually increase compile times much more than it may seem, and the optimization is not guaranteed. This optimization problem is exactly the kind that typically requires a lot of compute work. The reason I say this is because at its core, the optimization relies on the user writing a lot of IR with intermediate calls, and having some batch optimization later on aggregate and cancel out the intermediates.

just to check my understanding - is the extra IR from intermediate calls from people writing |> List.map fn1 |> List.map fn2 where otherwise they might have just combined them into one map call?

view this post on Zulip Richard Feldman (Sep 19 2024 at 01:29):

or the fact that each of those map calls would expand to 3 calls to Iter.fromList, Iter.map, and then Iter.toList? (whereas today they'd just be one call)

view this post on Zulip Richard Feldman (Sep 19 2024 at 01:30):

or I guess both, since either compounds the other

view this post on Zulip Richard Feldman (Sep 19 2024 at 01:31):

Ayaz Hafiz said:

I wonder if there's a way to achieve the goals here without generating a ton of intermediate code that then needs to be thrown away. One idea would be to treat List as the common denominator, and have an analysis that tracks direct calls to List.* functions in a sequence. Then, if there is a region of code that only contains sequenced calls to List.* functions, you know you can inline those directly, and specialize sequenced List.map/etc. without having to traverse big chains of code that you need to eliminate or consider inlining - the directive for what to do is defined explicitly by the analysis.

yeah I definitely think this is a situation where there would be an extreme power law distribution in play

view this post on Zulip Richard Feldman (Sep 19 2024 at 01:31):

like I would not be at all surprised if in practice it turned out to be 95% of the time it's specifically a List higher-order function in a pipeline with 1+ other List higher order functions

view this post on Zulip Richard Feldman (Sep 19 2024 at 01:32):

and "everything else" accounts for 5% of the scenarios where extra work is done compared to today

view this post on Zulip Richard Feldman (Sep 19 2024 at 01:32):

and yeah, those would be straightforward to detect prior to inlining even

view this post on Zulip Ayaz Hafiz (Sep 19 2024 at 02:00):

or the fact that each of those map calls would expand to 3 calls to Iter.fromListIter.map, and then Iter.toList? (whereas today they'd just be one call)

yes, this. and the fact that is can be deeply nested, hence making the inline analysis hard

view this post on Zulip Ayaz Hafiz (Sep 19 2024 at 02:02):

yeah I definitely think this is a situation where there would be an extreme power law distribution in play

yes, this is why it might make sense to do a simpler heuristic based just on list. I think in practice you're going to want to convert to a list-like structure, or represent your data structure as a list under the hood anyway.

view this post on Zulip Richard Feldman (Sep 19 2024 at 02:23):

as in don't do optimization on Iter at all?

view this post on Zulip Richard Feldman (Sep 19 2024 at 02:24):

or as in like don't implement data structure operations in terms of Iter

view this post on Zulip Ayaz Hafiz (Sep 19 2024 at 03:51):

use List as the basic unit for supporting fusion rather than Iter

view this post on Zulip Brendan Hansknecht (Sep 19 2024 at 04:17):

We've discussed fusion a lot in the past, personally, I think it is likely to be much more finicky than iterators. I think this is something that likely needs to be explicit. An iterator over a range of numbers needs to never allocate a full list.

view this post on Zulip Brendan Hansknecht (Sep 19 2024 at 04:17):

Using explicit iterators (even if nothing inlines) avoids the intermediate allocations

view this post on Zulip Brendan Hansknecht (Sep 19 2024 at 04:18):

I do also agree that it will make a crazy chain of lambdas that llvm will have to deal with unwinding and inlining.

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:24):

Maybe we could make iterators easier to start and stop, that might make it just as desirable to use, which seems to be a solution here

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:24):

(more keywords, basically)

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:24):

I'm AFK at the moment, I can brainstorm a bit at home

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:25):

If we were able to make iterators just as easy to use in 90% or more of situations, we'd be happy to use those, right?

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:26):

Basically, if we could come up with some syntax to start and collect iterators, that seems to be a solution here

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:26):

At which point, we could radically remove builtins like List.map and List.filter

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:38):

It feels like a stupid question, but could someone tell me why the reason we want fusion is anything beyond "iterators are more effort to use"?

view this post on Zulip Sam Mohr (Sep 19 2024 at 05:39):

Considering we have size hints meaning we get the same allocations and all

view this post on Zulip Sam Mohr (Sep 19 2024 at 10:15):

I'm messing with the best I can do with some magic iter <list> keyword. It will take some brainpower, but again, if the limit really is ergonomics, that seems like a much more reasonable battle to fight than manual inlining and fusion.

view this post on Zulip Sam Mohr (Sep 19 2024 at 10:52):

Current thought is something like this, using a new iter block:

iter items as List
    Iter.map \x -> x + 2
    |> Iter.keepIf \x -> x % 2 == 0

# desugars to

items
    |> List.toIter
    |> Iter.map \x -> x + 2
    |> Iter.keepIf \x -> x % 2 == 0
    |> List.fromIter

for collecting to a different type

iter items as List to Personal.Dict
    Iter.map \x -> x + 2

# desugars to

items
    |> List.toIter
    |> Iter.map \x -> x + 2
    |> Personal.Dict.fromIter

It's a line longer for the standard case, but it's plainly readable and seems to sidestep the need for alternatives because we would not support List.map, just List.toIter and List.fromIter

view this post on Zulip Notification Bot (Sep 19 2024 at 11:22):

28 messages were moved here from #ideas > Iterators and fusion proposal by Richard Feldman.

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:23):

Ayaz Hafiz said:

use List as the basic unit for supporting fusion rather than Iter

yeah, I was really into that idea for awhile, but it turned out to have a number of problems.

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:24):

one category of problems was around custom data structures

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:50):

for example, if I make a Tree data structure and I give it a toList, I'm not sure how fusion could work on tree |> Tree.toList |> List.map fn1 |> List.map fn2 etc

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:51):

because converting the tree to a List isn't the same as converting it to the Iter data structure (which is set up to be able to do these transformations) - and I think it would be really hard to detect that and make it work

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:52):

one way to address that would be to offer Iter.custom but have it be List.custom intead

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:52):

but then we have the problem that if you have a List in general, you're assuming it has certain properties (e.g. seamless slices) but actually sometimes it might be a "secret iterator" created with List.custom which doesn't have the perf characters of an actual List

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:54):

another design direction is to say "okay custom data structures just don't benefit from fusion"

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:55):

but the problem there is that now it becomes a reasonable style to do things like |> List.map fn1 |> List.map fn2 until you make a custom data structure, at which point it's like "oh yeah don't do the normal style at all or else your performance will be terrible - switch to loops/folds/etc"

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:55):

which doesn't seem like a good outcome :sweat_smile:

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:56):

so I wonder about the design direction of having Iter but sort of inverting the default :thinking:

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:57):

in other words, don't have the builtins implemented in terms of Iter, but give them toIter and fromIter

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:57):

and then do these optimizations:

view this post on Zulip Richard Feldman (Sep 19 2024 at 11:58):

so basically if we see |> List.map fn1 |> List.map fn2 we have a separate compiler optimization that ends up being semantically equivalent to the optimization we'd do on |> List.toIter |> Iter.map fn1 |> Iter.map fn2 |> Iter.toList

view this post on Zulip Richard Feldman (Sep 19 2024 at 12:17):

Sam Mohr said:

It feels like a stupid question, but could someone tell me why the reason we want fusion is anything beyond "iterators are more effort to use"?

the way I think about this is that there are a few reasonable designs possible:

view this post on Zulip Richard Feldman (Sep 19 2024 at 12:19):

there's another possible direction, which is essentially "introduce the concept of ability parameters, have an Iter elem abilitiy, give it to List", and then you can write:

foo
|> Iter.map fn1
|> Iter.keepIf fn2

where foo is a List.

view this post on Zulip Richard Feldman (Sep 19 2024 at 12:22):

some downsides of that design:

view this post on Zulip Ayaz Hafiz (Sep 19 2024 at 14:03):

I think you can make the List approach work if you do the fusion all at once; there is no need for intermediate “list but secretly an iterator”. And the case of custom data structures can be solved by specialized the toList constructor around fusion.

There are other ways here to avoid potentially blowing up the amount of code in user space, which is the larger thing I’m trying to get at, for example by using external iterators. Happy to chat more Friday

view this post on Zulip Ayaz Hafiz (Sep 19 2024 at 14:09):

This would very likely be better! The reason being the typechecking happens before any specialization so you have less code to check up front, the typechecking can direct you to exactly what code to inline and simplify, and you have much less code to delete because there are less trivially discardable fromIter/toIter in sequence.

+1 on it probably not being the right approach for ergonomic reasons

view this post on Zulip Ayaz Hafiz (Oct 07 2024 at 02:27):

https://arxiv.org/pdf/2410.02232

view this post on Zulip Richard Feldman (Oct 07 2024 at 15:48):

so if I understand this correctly, it seems like:

does that all sound right? Did I miss anything?

view this post on Zulip Ayaz Hafiz (Oct 07 2024 at 20:12):

yes i think so

view this post on Zulip Richard Feldman (Oct 07 2024 at 22:08):

very cool! sounds more promising than the previous state of the art

view this post on Zulip Richard Feldman (Jan 09 2025 at 18:53):

I think based on :point_up: it probably makes sense to hold off on doing an Iter API for right now. We can have for loops work on List for now, and if you want to iterate over something other than a List, you have to convert it to a List first

view this post on Zulip Richard Feldman (Jan 09 2025 at 18:53):

obviously that will be bad for performance in the immediate term, but it does mean you have a way to use for loops on anything if you really want to, and if the perf is too bad you can fall back on the pre-for strategies

view this post on Zulip Richard Feldman (Jan 09 2025 at 18:53):

(for list perf it'll be fine, of course)

view this post on Zulip Richard Feldman (Jan 09 2025 at 18:54):

because it seems like - especially in light of that paper - it might actually be possible to get the benefit of Iter without actually introducing a separate Iter type, and instead just using List for everything

view this post on Zulip Richard Feldman (Jan 09 2025 at 18:54):

this would be really nice from an API perspective, but it feels like something that needs a lot more investigation and experimentation, and I don't want to block for on that

view this post on Zulip Richard Feldman (Jan 09 2025 at 18:55):

I also don't want to build and introduce an Iter API that we later end up removing as a big breaking change

view this post on Zulip Sam Mohr (Jan 09 2025 at 18:57):

And if for item in list relies on some static dispatch a.iter() -> Iter elem then there won't be much of a migration step if we move to Iter later

view this post on Zulip Richard Feldman (Jan 09 2025 at 19:01):

hm, I'm not sure that would work in the current type system

view this post on Zulip Sam Mohr (Jan 09 2025 at 19:01):

Because no HKT?

view this post on Zulip Richard Feldman (Jan 09 2025 at 19:01):

yeah

view this post on Zulip Sam Mohr (Jan 09 2025 at 19:02):

I'll have to bug @Ayaz Hafiz about that special set of HKT that we'd be able to support with principle type inference

view this post on Zulip Richard Feldman (Jan 09 2025 at 19:02):

although if we did end up having List be the intermediary, then they'd all be doing the right thing already anyway

view this post on Zulip Richard Feldman (Jan 09 2025 at 19:02):

Sam Mohr said:

I'll have to bug Ayaz Hafiz about that special set of HKT that we'd be able to support with principle type inference

we definitely can do that, I just don't think we should

view this post on Zulip Sam Mohr (Jan 09 2025 at 19:03):

If we move to Iter instead of List, then we'd need something like that. Otherwise we'd be good with what you suggest.

view this post on Zulip Richard Feldman (Jan 09 2025 at 19:04):

well or just call list.to_iter() explicitly, which I think would be totally fine :big_smile:

view this post on Zulip Sam Mohr (Jan 09 2025 at 19:07):

Yep


Last updated: Jun 16 2026 at 16:19 UTC