Early return and correct refcounting · compiler development

Stream: compiler development

Topic: Early return and correct refcounting

Sam Mohr (Oct 24 2024 at 09:25):

So I'm working on the return keyword that allows for early returns from functions, and unblocks the try keyword implementation. Besides adding tests, the PR is ready... but for one issue: I'm not yet handling refcounting correctly on early returns (I'm just returning naively).

Though I'd normally try to fully investigate myself before pulling in help, I'm not finding things obvious enough to figure out how to make the right change without delaying the PR a good bit, so I'd love some help on at least the strategy to do this. AFAICT there are two ways to go about this:

Convert functions with early returns to SSA, and use the existing code. In a GC language, this would be annoying (see nested ifs with deeply nested returns) but not terrible. Refcounting makes that approach less desirable since each instance can't just "joinpoint" to the final return, as each return has incremented the refcounts of different sets of vars. However, this doesn't need much more understanding of the Roc compiler than I already have.
Track which refcounted variables have been created/had their refcount incremented up to each early return, and add dec/decref instructions for each of those values with each return statement. Presumably we do this somewhere already for end-of-function-body returns, but I've not found it yet in the ~30k lines of crates/compiler/mono.

Does anyone with backend experience have thoughts on which of those options would be better/easier, and potentially could help me with the high-level plan? Or maybe there's another option? Any help is welcome.

@Folkert de Vries I've heard that you may have some expertise here? If not, no worries.

Sam Mohr (Oct 24 2024 at 09:55):

Logging off for the day, but my current concept is to implement the equivalent of roc_can::scope::Scope in insert_inc_dec_operations_proc to track which refcounted vars need to be decremented before early returning. I'll try it tomorrow night if I get time

Ayaz Hafiz (Oct 24 2024 at 13:49):

First off, I would check whether you need to handle anything for refcounting. I'm not sure that anything breaks with early returns but maybe it does.

Ayaz Hafiz (Oct 24 2024 at 13:50):

If it does the simplest approach which i would suggest is converting

foo = \x ->
  n = if x then
    return 1
  else
    2

  n + 3

foo = \x ->
  if x then
    1
  else
    n = 2
    n + 3

Richard Feldman (Oct 24 2024 at 13:51):

I'm not sure that scales, e.g. what if you do it in the middle of a when branch? or, even worse, a when guard? (we might want to disallow that :sweat_smile:)

Sam Mohr (Oct 24 2024 at 17:43):

Currently, return is only parsed as a statement, meaning anywhere we parse blocks that allow_defs it'll work, and anywhere else is a parse error. Aka it doesn't work for if conditions, when predicates, or when guards

Richard Feldman (Oct 24 2024 at 18:08):

oh I meant in one of the when branches, not the conditional

Sam Mohr (Oct 24 2024 at 18:09):

If we don't allow early return in when branches, then try won't work in when branches

Richard Feldman (Oct 24 2024 at 18:09):

especially if it's a when inside another when

Richard Feldman (Oct 24 2024 at 18:09):

yeah my point is that I think the "translate return to sugar" idea runs into common situations where it doesn't work pretty quickly

Sam Mohr (Oct 24 2024 at 18:10):

Oh, I agree!

Brendan Hansknecht (Oct 24 2024 at 18:10):

I guess the core issue is that we alway deal with refcounting for an entire block. With return, you have to be able to deal with refcounting for a partial block

Sam Mohr (Oct 24 2024 at 18:11):

Yep

Brendan Hansknecht (Oct 24 2024 at 18:11):

This means that order of variable definition matters

Richard Feldman (Oct 24 2024 at 18:11):

that's already true for other reasons though :big_smile:

Brendan Hansknecht (Oct 24 2024 at 18:11):

I think we still toposort our variables at some point

Brendan Hansknecht (Oct 24 2024 at 18:12):

So I don't think order matters today even though it probably should

Brendan Hansknecht (Oct 24 2024 at 18:13):

I guess purity inference may change this?

With ! desugaring to await, order kinda mattered due to all of the secret lambdas enforcing ordering.

Brendan Hansknecht (Oct 24 2024 at 18:14):

Anyway, we probably just need a statement by statement build up of refcounted lifetimes that can be used anytime a return is created. If no return is hit, we just do full refcount handling at the end of the block. If a return is hit, we do a partial refcount handling before actually running the return

Sam Mohr (Oct 24 2024 at 18:15):

We form an SCC graph for defs and sort here: https://github.com/roc-lang/roc/blob/06996d88f29589e189843171cf74187a45cc777e/crates/compiler/can/src/def.rs#L2003

Sam Mohr (Oct 24 2024 at 18:15):

That's my thought as well

Richard Feldman (Oct 24 2024 at 18:17):

yeah we need to stop reordering them

Richard Feldman (Oct 24 2024 at 18:18):

that's already true in isolation today :big_smile:

Sam Mohr (Oct 24 2024 at 18:18):

That'll happen with the canonicalization rewrite, right? unless we wanna do it earlier

Richard Feldman (Oct 24 2024 at 18:18):

yeah, it could happen then, or earlier - either way

Sam Mohr (Oct 24 2024 at 18:18):

Wouldn't be bad to do that first to keep the changes smaller

Richard Feldman (Oct 24 2024 at 18:18):

I think we still need to reorder function defs actually

Richard Feldman (Oct 24 2024 at 18:18):

but not non-functions

Richard Feldman (Oct 24 2024 at 18:19):

although that gets a bit tricky in the sense that they shouldn't be reordered earlier than when things they close over are defined

Brendan Hansknecht (Oct 24 2024 at 18:28):

I thought we only planned to reorder top level functions.

Brendan Hansknecht (Oct 24 2024 at 18:28):

But maybe mutually recursive lambdas with closures shouldn't work? I'm not actually sure they are really important to make work

Richard Feldman (Oct 24 2024 at 18:39):

Brendan Hansknecht said:

I thought we only planned to reorder top level functions.

oh yeah I think this is right...I forget where we talked about reordering, but I think there was a thread somewhere

Brendan Hansknecht (Oct 24 2024 at 18:42):

This?

Richard Feldman said:

I guess that's the simplest design because the rule is basically "you have to define a value before you use it, except top-level values can be defined in any order and used in any order"

Sam Mohr (Oct 25 2024 at 06:03):

Okay, after much fiddling with test_mono, it seems that @Ayaz Hafiz was correct and the refcounting is working even with nested returns. I added two test cases, one for nested if and the other for nested when, and they both seem to be working correctly.

It might be good to get an extra review, but yeah, this PR is ready for review!

Sam Mohr (Oct 25 2024 at 06:06):

I'll double check that I have all the tests I want, and then move on to implementing try

Sam Mohr (Oct 25 2024 at 06:44):

Yeah, tests look good to me. Moving on to try

Last updated: Aug 17 2025 at 12:14 UTC