Cool proposal. Glad this is being discussed.
One thought I have is I think this will actually increase compile times much more than it may seem, and the optimization is not guaranteed. This optimization problem is exactly the kind that typically requires a lot of compute work. The reason I say this is because at its core, the optimization relies on the user writing a lot of IR with intermediate calls, and having some batch optimization later on aggregate and cancel out the intermediates.
I wonder if there's a way to achieve the goals here without generating a ton of intermediate code that then needs to be thrown away. One idea would be to treat List as the common denominator, and have an analysis that tracks direct calls to List.* functions in a sequence. Then, if there is a region of code that only contains sequenced calls to List.* functions, you know you can inline those directly, and specialize sequenced List.map/etc. without having to traverse big chains of code that you need to eliminate or consider inlining - the directive for what to do is defined explicitly by the analysis.
(the reason this is so significant is this is exactly why LLVM or other optimizers can be pathologically slow - if you feed them a ton of code to be optimized away, they will do so but take a very long time. if it's possible to feed optimal code earlier on, you win compile time)
Ayaz Hafiz said:
One thought I have is I think this will actually increase compile times much more than it may seem, and the optimization is not guaranteed. This optimization problem is exactly the kind that typically requires a lot of compute work. The reason I say this is because at its core, the optimization relies on the user writing a lot of IR with intermediate calls, and having some batch optimization later on aggregate and cancel out the intermediates.
just to check my understanding - is the extra IR from intermediate calls from people writing |> List.map fn1 |> List.map fn2 where otherwise they might have just combined them into one map call?
or the fact that each of those map calls would expand to 3 calls to Iter.fromList, Iter.map, and then Iter.toList? (whereas today they'd just be one call)
or I guess both, since either compounds the other
Ayaz Hafiz said:
I wonder if there's a way to achieve the goals here without generating a ton of intermediate code that then needs to be thrown away. One idea would be to treat
Listas the common denominator, and have an analysis that tracks direct calls toList.*functions in a sequence. Then, if there is a region of code that only contains sequenced calls toList.*functions, you know you can inline those directly, and specialize sequencedList.map/etc. without having to traverse big chains of code that you need to eliminate or consider inlining - the directive for what to do is defined explicitly by the analysis.
yeah I definitely think this is a situation where there would be an extreme power law distribution in play
like I would not be at all surprised if in practice it turned out to be 95% of the time it's specifically a List higher-order function in a pipeline with 1+ other List higher order functions
and "everything else" accounts for 5% of the scenarios where extra work is done compared to today
and yeah, those would be straightforward to detect prior to inlining even
or the fact that each of those
mapcalls would expand to 3 calls toIter.fromList,Iter.map, and thenIter.toList? (whereas today they'd just be one call)
yes, this. and the fact that is can be deeply nested, hence making the inline analysis hard
yeah I definitely think this is a situation where there would be an extreme power law distribution in play
yes, this is why it might make sense to do a simpler heuristic based just on list. I think in practice you're going to want to convert to a list-like structure, or represent your data structure as a list under the hood anyway.
as in don't do optimization on Iter at all?
or as in like don't implement data structure operations in terms of Iter
use List as the basic unit for supporting fusion rather than Iter
We've discussed fusion a lot in the past, personally, I think it is likely to be much more finicky than iterators. I think this is something that likely needs to be explicit. An iterator over a range of numbers needs to never allocate a full list.
Using explicit iterators (even if nothing inlines) avoids the intermediate allocations
I do also agree that it will make a crazy chain of lambdas that llvm will have to deal with unwinding and inlining.
Maybe we could make iterators easier to start and stop, that might make it just as desirable to use, which seems to be a solution here
(more keywords, basically)
I'm AFK at the moment, I can brainstorm a bit at home
If we were able to make iterators just as easy to use in 90% or more of situations, we'd be happy to use those, right?
Basically, if we could come up with some syntax to start and collect iterators, that seems to be a solution here
At which point, we could radically remove builtins like List.map and List.filter
It feels like a stupid question, but could someone tell me why the reason we want fusion is anything beyond "iterators are more effort to use"?
Considering we have size hints meaning we get the same allocations and all
I'm messing with the best I can do with some magic iter <list> keyword. It will take some brainpower, but again, if the limit really is ergonomics, that seems like a much more reasonable battle to fight than manual inlining and fusion.
Current thought is something like this, using a new iter block:
iter items as List
Iter.map \x -> x + 2
|> Iter.keepIf \x -> x % 2 == 0
# desugars to
items
|> List.toIter
|> Iter.map \x -> x + 2
|> Iter.keepIf \x -> x % 2 == 0
|> List.fromIter
for collecting to a different type
iter items as List to Personal.Dict
Iter.map \x -> x + 2
# desugars to
items
|> List.toIter
|> Iter.map \x -> x + 2
|> Personal.Dict.fromIter
It's a line longer for the standard case, but it's plainly readable and seems to sidestep the need for alternatives because we would not support List.map, just List.toIter and List.fromIter
28 messages were moved here from #ideas > Iterators and fusion proposal by Richard Feldman.
Ayaz Hafiz said:
use List as the basic unit for supporting fusion rather than Iter
yeah, I was really into that idea for awhile, but it turned out to have a number of problems.
one category of problems was around custom data structures
for example, if I make a Tree data structure and I give it a toList, I'm not sure how fusion could work on tree |> Tree.toList |> List.map fn1 |> List.map fn2 etc
because converting the tree to a List isn't the same as converting it to the Iter data structure (which is set up to be able to do these transformations) - and I think it would be really hard to detect that and make it work
one way to address that would be to offer Iter.custom but have it be List.custom intead
but then we have the problem that if you have a List in general, you're assuming it has certain properties (e.g. seamless slices) but actually sometimes it might be a "secret iterator" created with List.custom which doesn't have the perf characters of an actual List
another design direction is to say "okay custom data structures just don't benefit from fusion"
but the problem there is that now it becomes a reasonable style to do things like |> List.map fn1 |> List.map fn2 until you make a custom data structure, at which point it's like "oh yeah don't do the normal style at all or else your performance will be terrible - switch to loops/folds/etc"
which doesn't seem like a good outcome :sweat_smile:
so I wonder about the design direction of having Iter but sort of inverting the default :thinking:
in other words, don't have the builtins implemented in terms of Iter, but give them toIter and fromIter
and then do these optimizations:
Iter fusionList, Dict, Set)so basically if we see |> List.map fn1 |> List.map fn2 we have a separate compiler optimization that ends up being semantically equivalent to the optimization we'd do on |> List.toIter |> Iter.map fn1 |> Iter.map fn2 |> Iter.toList
Sam Mohr said:
It feels like a stupid question, but could someone tell me why the reason we want fusion is anything beyond "iterators are more effort to use"?
the way I think about this is that there are a few reasonable designs possible:
|> List.map fn1 |> List.keepIf fn2 because it will allocate an intermediate list") - downside is that you sometimes have to choose between code styles that feel natural in a functional language, are easy to read and understand, but don't have good perf bc of intermediate allocationsIterexplicitly - make super common operations like List.map ~3x as verbose, to increase code consistency while avoiding intermediate allocations. This design makes less sense to me than status quo. The pitch sounds to me like "let's facilitate writing code in a nice concise functional style without sacrificing perf...by making it so that it's always really verbose to do basically anything in a functional style." :sweat_smile: there's another possible direction, which is essentially "introduce the concept of ability parameters, have an Iter elem abilitiy, give it to List", and then you can write:
foo
|> Iter.map fn1
|> Iter.keepIf fn2
where foo is a List.
some downsides of that design:
|> Iter.map is no more concise than |> List.map but it's less self-documenting. I had to tell you that foo was a list because you didn't have the extra context that |> List.map provides.Iter value implementation, but we also have a lot of extra ability resolution all over the place|> Result.map if you have a Result, |> Parser.map if you have a Parser, but |> Iter.map if you have a ListI think you can make the List approach work if you do the fusion all at once; there is no need for intermediate “list but secretly an iterator”. And the case of custom data structures can be solved by specialized the toList constructor around fusion.
There are other ways here to avoid potentially blowing up the amount of code in user space, which is the larger thing I’m trying to get at, for example by using external iterators. Happy to chat more Friday
- it might be even worse for compiler perf because I think we still have about the same amount of IR as a naive
Itervalue implementation, but we also have a lot of extra ability resolution all over the place
This would very likely be better! The reason being the typechecking happens before any specialization so you have less code to check up front, the typechecking can direct you to exactly what code to inline and simplify, and you have much less code to delete because there are less trivially discardable fromIter/toIter in sequence.
+1 on it probably not being the right approach for ergonomic reasons
https://arxiv.org/pdf/2410.02232
so if I understand this correctly, it seems like:
List.mapfromIter/toIter pairs" pass, so you also don't need inlining to happen first - which could be beneficial for dev build performance if we wanted to try to get fusion without having to do inlining (although it seems like inlining is still desired after this fusion pass for optimized builds, in order to get rid of some of the thunks this strategy creates)does that all sound right? Did I miss anything?
yes i think so
very cool! sounds more promising than the previous state of the art
I think based on :point_up: it probably makes sense to hold off on doing an Iter API for right now. We can have for loops work on List for now, and if you want to iterate over something other than a List, you have to convert it to a List first
obviously that will be bad for performance in the immediate term, but it does mean you have a way to use for loops on anything if you really want to, and if the perf is too bad you can fall back on the pre-for strategies
(for list perf it'll be fine, of course)
because it seems like - especially in light of that paper - it might actually be possible to get the benefit of Iter without actually introducing a separate Iter type, and instead just using List for everything
this would be really nice from an API perspective, but it feels like something that needs a lot more investigation and experimentation, and I don't want to block for on that
I also don't want to build and introduce an Iter API that we later end up removing as a big breaking change
And if for item in list relies on some static dispatch a.iter() -> Iter elem then there won't be much of a migration step if we move to Iter later
hm, I'm not sure that would work in the current type system
Because no HKT?
yeah
I'll have to bug @Ayaz Hafiz about that special set of HKT that we'd be able to support with principle type inference
although if we did end up having List be the intermediary, then they'd all be doing the right thing already anyway
Sam Mohr said:
I'll have to bug Ayaz Hafiz about that special set of HKT that we'd be able to support with principle type inference
we definitely can do that, I just don't think we should
If we move to Iter instead of List, then we'd need something like that. Otherwise we'd be good with what you suggest.
well or just call list.to_iter() explicitly, which I think would be totally fine :big_smile:
Yep
Last updated: Jun 16 2026 at 16:19 UTC