How should we handle using AI for reviewing commits?
From: Theodore Tso
Date: Thu Mar 12 2026 - 12:51:58 EST
We recently used an AI review bot while applying an LTS backport to an
internal kernel tree at $WORK. While doing the review, it flagged a
set of concerns which resulted in my creating a patch[1] to address
the issues that it found in the kernel commit.
[1] https://lore.kernel.org/all/20260310122806.1277631-1-tytso@xxxxxxx/
In this commit there is no LLM generated output in the code, but there
*is* LLM generated output in the commit description, since I quoted
the concerns raised by the LLM. Per the our new coding-assistants
process document[2], "When AI tools contribute to kernel development,
proper attribution helps track the evolving role of AI in the
development process. Contributions should include an Assisted-by tag..."
[2] https://www.kernel.org/doc/html/latest/process/coding-assistants.html
When I was considering whether I should add something like:
Assisted-by: Gemini:Gemini-3.1-Pro [TOOL]
There was a couple of things that came to mind. First, should we make
some kind of distintion between exactly how the AI tool assisted in
the development of the commit? There's a big difference between using
an AI assistant to find a potential issue, to an AI assistant which
created new code out of whole cloth, with a spectrum of changes in
between. Given that the stated code was to "track the evolving role
of AI", it occured to me that perhaps we should add some indication
about exactly what was the nature of assistance that was provided.
The second observation that I had was that example set of tools for
[TOOL] was "specialized analysis tools": coccinelle, sparse, smatch,
clang-tidy. I assume the intent was if an AI bot started using tools
like sparse, coccinelle, as an agent?w
If there is a set of LLM prompts which has a name, would that also be
appropriate for TOOL? Chris Mason's repo has a fairly non-descriptive
name, "review-prompts", but in the future when companies start making
their internal review prompts public, some of them may have more
evocative names that might be more unique and more marketing friendly. :-)
- Ted
--- Begin Message --- Commit 4865c768b563 ("ext4: always allocate blocks only from groups
inode can use") restricts what blocks will be allocated for indirect
block based files to block numbers that fit within 32-bit block
numbers.
However, when using a review bot running on the latest Gemini LLM to
check this commit when backporting into an LTS based kernel, it raised
the following concerns:
If ac->ac_g_ex.fe_group is >= ngroups (for instance, if the goal
group was populated via stream allocation from s_mb_last_groups),
then start will be >= ngroups.
Does this allow allocating blocks beyond the 32-bit limit for
indirect block mapped files? The commit message mentions that
ext4_mb_scan_groups_linear() takes care to not select unsupported
groups. However, its loop uses group = *start, and the very first
iteration will call ext4_mb_scan_group() with this unsupported
group because next_linear_group() is only called at the end of the
iteration.
Also, in the optimized scanning paths like
ext4_mb_scan_groups_p2_aligned(), start is passed into
ext4_mb_scan_groups_xa_range(). If start >= ngroups, will this
trigger the WARN_ON_ONCE(end > ngroups || start >= end) because end
is set to ngroups? Furthermore, after wrapping around, end will be
set to the original start which is > ngroups, triggering the
warning a second time. Should this code clamp start to < ngroups
before scanning?
After reviewing the code paths involved and considering the LLM
review, I believe that while it is unlikely, it is possible that how
commit 4865c768b563 added ext4_get_allocation_groups_count() doesn't
completely address all of the possible situtions that might show up
when using indirect-mapped files and allocating blocks on a file
system with more than 2**32 blocks.
So this commit adds some safety checks and wrap around logic to some
functions if the multiblock allocator: ext4_mb_scan_groups_xa_range(),
ext4_mb_scan_groups_best_avail(), ext4_mb_scan_groups_p2_aligned(),
and ext4_mb_scan_groups().
Fixes: 4865c768b563 ("ext4: always allocate blocks only from groups inode can use")
Signed-off-by: Theodore Ts'o <tytso@xxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
---
fs/ext4/mballoc.c | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 20e9fdaf4301..454d5a1b4803 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -915,13 +915,17 @@ static int ext4_mb_scan_groups_xa_range(struct ext4_allocation_context *ac,
struct ext4_sb_info *sbi = EXT4_SB(sb);
enum criteria cr = ac->ac_criteria;
ext4_group_t ngroups = ext4_get_allocation_groups_count(ac);
- unsigned long group = start;
+ unsigned long group, end_range;
struct ext4_group_info *grp;
- if (WARN_ON_ONCE(end > ngroups || start >= end))
- return 0;
-
- xa_for_each_range(xa, group, grp, start, end - 1) {
+ if (start >= ngroups)
+ start = 0;
+ group = start;
+wrap_around:
+ end_range = end;
+ if (end_range > ngroups)
+ end_range = ngroups;
+ xa_for_each_range(xa, group, grp, start, end_range - 1) {
int err;
if (sbi->s_mb_stats)
@@ -933,7 +937,11 @@ static int ext4_mb_scan_groups_xa_range(struct ext4_allocation_context *ac,
cond_resched();
}
-
+ if (start) {
+ end = start;
+ start = 0;
+ goto wrap_around;
+ }
return 0;
}
@@ -967,6 +975,8 @@ static int ext4_mb_scan_groups_p2_aligned(struct ext4_allocation_context *ac,
start = group;
end = ext4_get_allocation_groups_count(ac);
+ if (start >= end)
+ return 0;
wrap_around:
for (i = ac->ac_2order; i < MB_NUM_ORDERS(ac->ac_sb); i++) {
ret = ext4_mb_scan_groups_largest_free_order_range(ac, i,
@@ -1099,6 +1109,8 @@ static int ext4_mb_scan_groups_best_avail(struct ext4_allocation_context *ac,
start = group;
end = ext4_get_allocation_groups_count(ac);
+ if (start >= end)
+ start = 0;
wrap_around:
for (i = order; i >= min_order; i--) {
int frag_order;
@@ -1199,6 +1211,8 @@ static int ext4_mb_scan_groups(struct ext4_allocation_context *ac)
/* searching for the right group start from the goal value specified */
start = ac->ac_g_ex.fe_group;
+ if (start >= ngroups)
+ start = 0;
ac->ac_prefetch_grp = start;
ac->ac_prefetch_nr = 0;
--
2.51.0
--- End Message ---