Stream: contributing

Topic: Help with `Defs.split_defs_around` impl


view this post on Zulip Luke Boswell (Apr 07 2024 at 11:30):

I've been working on implementing the chaining design proposal and have been having a lot of trouble with an implementation for split_defs_around. I've tried a bunch of different things, but I've been unable to get it right. The combination of Vecs, indexes, phantom data, and memory has been tricky.

The implementation below works... the problem is that there are unecessary defs that remain because I'm just cloning the original and then replace the tags to keep for before and after.

The current implementation behaves correctly (as in programs compile and run fine), but there is a lot of duplication which makes debugging the AST when reading it manually, and it would also grow quite large on real programs.

I would appreciate any assistance.

// crates/compiler/parse/src/ast.rs

#[derive(Debug, Clone, PartialEq, Default)]
pub struct Defs<'a> {
    pub tags: std::vec::Vec<EitherIndex<TypeDef<'a>, ValueDef<'a>>>,
    pub regions: std::vec::Vec<Region>,
    pub space_before: std::vec::Vec<Slice<CommentOrNewline<'a>>>,
    pub space_after: std::vec::Vec<Slice<CommentOrNewline<'a>>>,
    pub spaces: std::vec::Vec<CommentOrNewline<'a>>,
    pub type_defs: std::vec::Vec<TypeDef<'a>>,
    pub value_defs: std::vec::Vec<ValueDef<'a>>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SplitDefsAround<'a> {
    pub before: Defs<'a>,
    pub after: Defs<'a>,
}

impl<'a> Defs<'a> {

    // For desugaring suffixed Defs we need to split around the current value
    pub fn split_defs_around(&self, target: usize) -> SplitDefsAround {
        let mut before = self.clone();
        let mut after = self.clone();

        before.tags = self.tags[0..target].to_vec();

        if target >= self.tags.len() {
            after.tags = self.tags.clone();
            after.tags.clear();
        } else {
            after.tags = self.tags[(target + 1)..].to_vec();
        }

        SplitDefsAround { before, after }
    }
}

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 17:12):

So this only needs to split tags and the rest stays identical?

view this post on Zulip Luke Boswell (Apr 07 2024 at 22:36):

No, that is the problem with my current solution. The tags are indexes into the other Vecs. So I'm keeping type_defs and value_defs that dont have a tag pointing to them.

view this post on Zulip Brendan Hansknecht (Apr 07 2024 at 23:36):

So just redundant data and the inconvenient copy?

view this post on Zulip Luke Boswell (Apr 08 2024 at 00:33):

If I use Vec::remove on the value_defs it shifts the indexes. Which means the tags now point to the wrong place, and I haven't found a way to update them all correctly

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:00):

Honestly, the best way would be to reference the original defs and only update the tags. But that might require larger complier changes

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:35):

Another option. Update Defs to instead of containing std::vec::Vec make it have a &'a bumpalo::Vec or something like that.

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:35):

Then cause it is a refernce, no need to copy it. just keep referencing the original data.

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:38):

But yeah, you may be stuck properly splitting the defs. If so, may be worth just walking the defs, copying them over and assigning to new indices.

view this post on Zulip Brendan Hansknecht (Apr 08 2024 at 01:38):

hard to say what is best, would need to look at specifics and mess around more.

view this post on Zulip Luke Boswell (Apr 09 2024 at 08:03):

Ok, I've made some progress I think.

I've pushed my WIP to this branch.

There are two functions which I have implemented which are almost identical and where I think the issue is, both are in crates/compiler/parse/src/ast.rs, replace_with_value_def and split_defs_around.

Unfortunately I've managed to break things somehow. I know it's an issue with the way I am handling spaces.

I'm hoping someone can help me out with this. @Folkert de Vries

view this post on Zulip Luke Boswell (Apr 09 2024 at 08:05):

Specifically I think this might be wrong somehow...

let space_before: &[CommentOrNewline<'a>] = {
    let start: usize = self.space_before[tag_index].start();
    let len: usize = self.space_before[tag_index].len();

    &self.spaces[start..(start + len)]
};

view this post on Zulip Folkert de Vries (Apr 09 2024 at 08:05):

I'd generally write that as [start..][..len], but that's just cosmetic

view this post on Zulip Luke Boswell (Apr 09 2024 at 08:07):

I've had trouble understanding that pub space_before: Vec<Slice<CommentOrNewline<'a>>, Global> is and how to work with it I think.

view this post on Zulip Folkert de Vries (Apr 09 2024 at 08:08):

right, actually you can do &self.spaces[self.space_before[tag_index].indices()]

view this post on Zulip Luke Boswell (Apr 09 2024 at 08:10):

Thank you, that looks much nicer.

view this post on Zulip Folkert de Vries (Apr 09 2024 at 08:11):

but in general I'd try to create test cases here and see what breaks. hard to spot invalid assumptions just by reading the code

view this post on Zulip Luke Boswell (Apr 09 2024 at 08:12):

Thank you, thats a good idea.


Last updated: Jul 06 2025 at 12:14 UTC