On Thu, Mar 14, 2024 at 10:09:01PM +0800, Baokun Li wrote:Hi Ojaswin,
--- a/fs/ext4/mballoc.cHey Baokun,
+++ b/fs/ext4/mballoc.c
@@ -831,6 +831,8 @@ static int mb_avg_fragment_size_order(struct super_block *sb, ext4_grpblk_t len)
return 0;
if (order == MB_NUM_ORDERS(sb))
order--;
+ if (WARN_ON_ONCE(order > MB_NUM_ORDERS(sb)))
+ order = MB_NUM_ORDERS(sb) - 1;
Thanks for the review!
Thanks for fixing this. This patch looks good to me, feel free to add:
Reviewed-by: Ojaswin Mujoo <ojaswin@xxxxxxxxxxxxx>
my comments after this are less about the patch and more about someThat's not quite right, in ext4_mb_choose_next_group_goal_fast() even
thoughts on the working of average fragment lists.
So going through the v2 and this patch got me thinking about what really
is going to happen when a user tries to allocate 32768 blocks which is also
the maximum value we could have in say ac->ac_g_ex.fe_len.
When this happens, ext4_mb_regular_allocator() will directly set the
criteria as CR_GOAL_LEN_FAST. Now, we'll follow:
ext4_mb_choose_next_group_goal_fast()
for (i=mb_avg_fragment_size_order(); i < MB_NUM_ORDERS; i++) { .. }
Here, mb_avg_fragment_siz_order() will do something like:
order = fls(32768) - 2 = 14
...
if (order == MB_NUM_ORDERS(sb))
order--;
return order;
And we'll look in the fragment list[13] and since none of the groups
there would have 32768 blocks free (since we dont track it here) we'll
unnecessarily traverse the full list before falling to CR_BEST_AVAIL_LEN
(this will become a noop due to the way order and min_order
are calculated) and eventually to CR_GOAL_LEN_SLOW where we might get
something or end up splitting.
I think something more optimal would be to:
1. Add another entry to average fragment lists for completely empty
groups. (As a sidenote i think we should use something like MB_NUM_FRAG_ORDER
instead of MB_NUM_ORDERS in calculating limits related to average
fragment lists since the NUM_ORDERS seems to be the buddy max order ie
8192 blocks only valid for CR_POWER2 and shouldn't really limit the
fragment size lists)
2. If we don't want to go with 1 (maybe there's some history for that),
then probably should exit early from CR_GOAL_LEN_FAST so that we don't
iterate there.
Would like to hear your thoughts on it Baokun, Jan.
Regards,
ojaswin