Stream: contributing

Topic: LLM CI check


view this post on Zulip Anton (Feb 28 2025 at 17:36):

One thing an LLM could be well suited for is checking for too few comments and nondescriptive variable names in CI. Thoughts?

view this post on Zulip Richard Feldman (Feb 28 2025 at 18:11):

as in, have the compiler itself invoke an LLM?

view this post on Zulip Anton (Feb 28 2025 at 18:31):

No, just a github CI check

view this post on Zulip Anton (Feb 28 2025 at 18:34):

I've edited the original message for clarity

view this post on Zulip Richard Feldman (Feb 28 2025 at 18:34):

ah gotcha

view this post on Zulip Richard Feldman (Feb 28 2025 at 18:34):

I dunno, I tend to think those sorts of reviews are more annoying than helpful when a human does them :sweat_smile:

view this post on Zulip Richard Feldman (Feb 28 2025 at 18:35):

"more test coverage" might be reasonable though, especially for outside contributors

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:35):

checking for too few comments

I think it would be better to enforce every function has a doc comment then something like this.

view this post on Zulip Richard Feldman (Feb 28 2025 at 18:35):

and especially if it was just a comment a bot left instead of a check

view this post on Zulip Anton (Feb 28 2025 at 18:35):

A human can do a better job for sure, but with a bunch of contributors we may not be reviewing consistently

view this post on Zulip Richard Feldman (Feb 28 2025 at 18:36):

I definitely don't love the idea of a check that can block the PR which is nondeterministic :sweat_smile:

view this post on Zulip Anton (Feb 28 2025 at 18:38):

Yeah that could be a problem but the variance between identical calls could turn out extremely small

view this post on Zulip Anton (Feb 28 2025 at 18:39):

I think it would be better to enforce every function has a doc comment then something like this.

Can you elaborate @Brendan Hansknecht?

view this post on Zulip Isaac Van Doren (Feb 28 2025 at 18:39):

I think it would be better to enforce every function has a doc comment then something like this.

I suspect that would get kind of annoying and would result in a lot of half-hearted comments left to satisfy the requirement

view this post on Zulip Anton (Feb 28 2025 at 18:40):

Yeah, I generally prefer good descriptive names instead of comments when possible

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:41):

I don't think that requiring comments within a function is the right requirement. While they can be good, they tend to lead to lots of terrible comments that just lead to telling what the code is doing (which you can tell from reading the code). The most useful comments tends to explain the why. I don't think llm too few comment messages will help with this.

view this post on Zulip Anton (Feb 28 2025 at 18:42):

How do you feel about just a nondescriptive variable names check?

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:43):

I suspect that would get kind of annoying and would result in a lot of half-hearted comments left to satisfy the requirement

Doc comments on functions (especially those exposed and imported) are rather fundamental in my opinion. From watching many new people interact with and work with the old rust compiler, almost always a fundamental problem is our lack of doc comments. Good naming helps a lot but is not enough.

Clearly high level why and how docs are also huge (but those tend to get stale very quickly and are much more of an investment to write). While I don't think every function needs a doc comment (especially when thinking about internal functions), I do think it is extra valuable in an open source project (even if many are half hearted)

view this post on Zulip Anton (Feb 28 2025 at 18:47):

Yeah, agreed on exposed functions

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:48):

I think a really valuable base check would be doc comments on everything with pub. So types, functions, etc.

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:49):

Also, mixed value (depends on how half hearted or descriptive people are), but a top of file doc comment would also be good (this would be where we write down the rough goal of a module and any important design pieces).

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:50):

I think it would make roc code a lot more explorable. Even if to start it is just the one sentence description of what a pass is meant to do

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:50):

I think those both could be hard requirements and add a lot of value

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:51):

How do you feel about just a nondescriptive variable names check?

Sounds fine overall. I think there is some nuance depending on variable lifetime, but better names are very helpful.

view this post on Zulip Brendan Hansknecht (Feb 28 2025 at 18:54):

That said, I think this would fit best as code review comments that a user can choose to ignore rather than a blocking check

view this post on Zulip Richard Feldman (Feb 28 2025 at 18:54):

agreed!

view this post on Zulip Anton (Mar 06 2025 at 12:20):

I think a really valuable base check would be doc comments on everything with pub. So types, functions, etc.

I'm going to start on this, forking zlint seems like the way to go. I'll contribute the rule if they want it.

view this post on Zulip Anton (Mar 06 2025 at 18:11):

I'm just going with regex, that's way simpler and one less dependency

view this post on Zulip Brendan Hansknecht (Mar 06 2025 at 18:31):

Scan file. Find word pub. Assert line before starts with ///?

view this post on Zulip Brendan Hansknecht (Mar 06 2025 at 18:31):

Separate check to assert each file starts with a doc comment

view this post on Zulip Anton (Mar 06 2025 at 18:42):

Scan file. Find word pub. Assert line before starts with ///?

Yeah, I got way of course with nodes, symbols, the ast... with zlint :p

view this post on Zulip Anton (Mar 06 2025 at 18:44):

Separate check to assert each file starts with a doc comment

Going to leave this for a second PR, so we can quickly address all the missing comments for the first check.

view this post on Zulip Anton (Mar 06 2025 at 18:58):

There are 45 zig files in the src folder (overview) where pub functions and constants do not have a doc comment ///.
Please add doc comments to the files you are familiar with @Sam Mohr @Joshua Warner @Luke Boswell @Brendan Hansknecht @Anthony Bullard @Jared Ramirez @Isaac Van Doren

view this post on Zulip Luke Boswell (Mar 06 2025 at 22:39):

I'll add doc comments in my latest PR

view this post on Zulip Luke Boswell (Mar 06 2025 at 23:38):

I decided to add the doc comments in Anton's PR

view this post on Zulip Luke Boswell (Mar 06 2025 at 23:39):

I've been bulk adding these... I figure it's just not going to happen any other way. Some of my comments are probably not great. But I figure we can get them there and maintain a culture of improving comments as we go

view this post on Zulip Brendan Hansknecht (Mar 07 2025 at 00:03):

That's awesome. Thanks

view this post on Zulip Luke Boswell (Mar 07 2025 at 00:03):

Maybe thank me after you see the damage :sweat_smile:

view this post on Zulip Brendan Hansknecht (Mar 07 2025 at 00:04):

I think folks are more likely to update wrong comments than add non-existent ones

view this post on Zulip Luke Boswell (Mar 07 2025 at 00:05):

So the more wrong the comment... the more inclined someone is to fix it right? :smiley:

view this post on Zulip Brendan Hansknecht (Mar 07 2025 at 00:05):

:grug: :tombstone:

view this post on Zulip Sam Mohr (Mar 07 2025 at 00:05):

Cunningham's Law in full force

view this post on Zulip Luke Boswell (Mar 07 2025 at 00:36):

Ok... everything has a doc comment now.

view this post on Zulip Luke Boswell (Mar 07 2025 at 00:48):

I propose we land this PR -- even though I've added lots of TODO's in doc comments.

My logic is that we aren't going to resolve all the TODO's in this PR, there are just far too many. The scope of this change was to introduce the CI check, which is working well.

So we can follow this up with other PR's and gradually clean up things as we go.

view this post on Zulip Sam Mohr (Mar 07 2025 at 00:48):

I'll try to work on that tonight

view this post on Zulip Sam Mohr (Mar 07 2025 at 03:11):

Something that would be really nice to figure out is how to avoid this requirement on re-exports. Maybe the regex can check if the line contains an @import and ignore those lines?

view this post on Zulip Sam Mohr (Mar 07 2025 at 03:11):

The current behavior requires docs on those lines, which puts what I'd call superfluous docs on top of the real docs in the LSP hover box

view this post on Zulip Luke Boswell (Mar 07 2025 at 03:11):

Also init and deinit feels a bit silly also

view this post on Zulip Sam Mohr (Mar 07 2025 at 03:11):

Yep

view this post on Zulip Sam Mohr (Mar 07 2025 at 03:12):

The regex also doesn't check nested pub items, which seems like something we'll want

view this post on Zulip Luke Boswell (Mar 07 2025 at 03:12):

It's a good start, we can collect ideas to adjust as we go

view this post on Zulip Sam Mohr (Mar 07 2025 at 04:00):

Another round of doc comments (besides the build/ and check/ folders) https://github.com/roc-lang/roc/pull/7671/files

view this post on Zulip Luke Boswell (Mar 07 2025 at 04:26):

Love to see it

view this post on Zulip Anthony Bullard (Mar 07 2025 at 16:00):

I've been pretty good about this, but I will look to get better as I work on my current PR

view this post on Zulip Anton (Mar 10 2025 at 16:21):

I've added a check for a top level comment for every zig file //!. That comment should describe the purpose of the file. It's currently limited to zig files introduced in the current PR.

view this post on Zulip Anton (Mar 10 2025 at 17:40):

Doc comments are now no longer required for init, deinit and re-exports.

view this post on Zulip Sam Mohr (Mar 10 2025 at 18:12):

Thank you Anton!

view this post on Zulip Brendan Hansknecht (Mar 13 2025 at 05:46):

We should expand this check for public functions in structs. Like all of the tokenizer struct functions and such. Basically any line with pub on it that isn't a re-export.


Last updated: Jul 06 2025 at 12:14 UTC