Re: [PATCH 3/3] debugobjects: Use hlist_cut_number() to optimize performance and improve readability

From: Leizhen (ThunderTown)
Date: Wed Sep 11 2024 - 05:39:17 EST

Next message: Masahiro Yamada: "Re: linux-next: build failure after merge of the kbuild tree"
Previous message: Jerome Brunet: "Re: [PATCH 3/3] hwmon: (pmbus/tps25990): add initial support"
In reply to: Thomas Gleixner: "Re: [PATCH 3/3] debugobjects: Use hlist_cut_number() to optimize performance and improve readability"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2024/9/11 16:54, Thomas Gleixner wrote:
> On Wed, Sep 11 2024 at 15:44, Leizhen wrote:
>> On 2024/9/10 19:44, Thomas Gleixner wrote:
>>> That minimizes the pool lock contention and the cache footprint. The
>>> global to free pool must have an extra twist to accommodate non-batch
>>> sized drops and to handle the case where all slots are full, but
>>> that's just a trivial detail.
>>
>> That's great. I really admire you for completing the refactor in such a
>> short time.
>
> The trick is to look at it from the data model and not from the
> code. You need to sit down and think about which data model is required
> to achieve what you want. So the goal was batching, right?

Yes. When I find a hole in the road, I think about how to fill it. You
think more deeply: why is there a hole, and is there a problem with the
foundation? I've benefited a lot from communicating with you these days.

>
> That made it clear that the global pools need to be stacks of batches
> and never handle single objects because that makes it complex. As a
> consequence the per cpu pool is the one which does single object
> alloc/free and then either gets a full batch from the global pool or
> drops one into it. The rest is just mechanical.
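
A minimal userspace sketch of the data model described above, not the
actual patch. The struct layouts, BATCH_SIZE, MAX_SLOTS, pcpu_alloc() and
the "drop when two batches are held" threshold are assumptions made for
illustration; locking and the "all slots are full" twist are left out.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 4	/* illustrative; stands in for ODEBUG_BATCH_SIZE */
#define MAX_SLOTS  8	/* hypothetical capacity of the global stack */

struct obj {
	struct obj *next;
};

/* Global pool: a stack of full batches, never single objects. */
struct global_pool {
	struct obj *slots[MAX_SLOTS];	/* each slot heads BATCH_SIZE objects */
	unsigned int slot_idx;		/* number of occupied slots */
};

/* Per-CPU pool: the only place where single objects are handled. */
struct pcpu_pool {
	struct obj *head;
	unsigned int cnt;
};

static struct obj *pcpu_alloc(struct pcpu_pool *p, struct global_pool *g)
{
	struct obj *o;

	if (!p->head) {
		assert(p->cnt == 0);
		if (!g->slot_idx)
			return NULL;	/* caller would fall back to the slab */
		/* Refill: pop one full batch from the global stack. */
		p->head = g->slots[--g->slot_idx];
		p->cnt = BATCH_SIZE;
	}
	o = p->head;
	p->head = o->next;
	p->cnt--;
	return o;
}

static void pcpu_free(struct pcpu_pool *p, struct global_pool *g,
		      struct obj *o)
{
	struct obj *first, *last;
	unsigned int i;

	o->next = p->head;
	p->head = o;
	p->cnt++;

	/* Assumed policy: keep up to two batches locally, then drop one. */
	if (p->cnt < 2 * BATCH_SIZE || g->slot_idx == MAX_SLOTS)
		return;

	/* Cut the first BATCH_SIZE objects off the local list. */
	first = last = p->head;
	for (i = 1; i < BATCH_SIZE; i++)
		last = last->next;
	p->head = last->next;
	last->next = NULL;
	p->cnt -= BATCH_SIZE;

	/* Drop: push the full batch onto the global stack. */
	g->slots[g->slot_idx++] = first;
}
```

In this sketch the global pool is only touched once per BATCH_SIZE
operations, which is where the reduced lock contention would come from.
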
>
>> But I have a few minor comments.
>> 1. When kmem_cache_zalloc() is called to allocate objs for filling,
>> if fewer than one batch of objs is allocated, all of them can be
>> pushed to the local CPU pool. That is, call pcpu_free() on them one
>> by one.
>
> If that's the case then we should actually immediately give them back
> because that's a sign of memory pressure.

Yes, that makes sense, and that's a solution too.
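
A minimal userspace sketch of the "give them back immediately" variant,
under stated assumptions: calloc()/free() stand in for
kmem_cache_zalloc()/kmem_cache_free(), and alloc_full_batch(),
budget_alloc() and alloc_budget are hypothetical names used only to
simulate allocation failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_SIZE 16	/* illustrative; stands in for ODEBUG_BATCH_SIZE */

struct obj {
	struct obj *next;
};

/* Hypothetical allocator hook so that failure can be simulated. */
typedef void *(*alloc_fn_t)(size_t size);

/*
 * Return a list of exactly BATCH_SIZE objects, or NULL. A partial batch
 * is treated as a sign of memory pressure and is freed right away
 * instead of being pushed into any pool.
 */
static struct obj *alloc_full_batch(alloc_fn_t alloc)
{
	struct obj *head = NULL;
	int i;

	assert(alloc != NULL);
	for (i = 0; i < BATCH_SIZE; i++) {
		struct obj *o = alloc(sizeof(*o));

		if (!o) {
			/* Give the partial batch back immediately. */
			while (head) {
				struct obj *next = head->next;

				free(head);
				head = next;
			}
			return NULL;
		}
		o->next = head;
		head = o;
	}
	return head;
}

/* Demonstration-only allocator: fails once the budget is used up. */
static int alloc_budget = -1;	/* -1 means unlimited */

static void *budget_alloc(size_t size)
{
	if (alloc_budget == 0)
		return NULL;
	if (alloc_budget > 0)
		alloc_budget--;
	return calloc(1, size);	/* zeroed, like kmem_cache_zalloc() */
}
```

With this shape the pools only ever see full batches from the refill
path, so no special case for short batches is needed there.
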

>
>> 2. Member tot_cnt of struct global_pool can be deleted. It can be
>> computed simply and quickly as (slot_idx * ODEBUG_BATCH_SIZE), which
>> avoids redundant maintenance.
>
> Agreed.
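
A minimal sketch of deriving the count instead of storing it. The value
of ODEBUG_BATCH_SIZE, MAX_SLOTS, the reduced struct layout and the helper
name global_pool_count() are assumptions for illustration. The formula
only holds while every occupied slot contains a full batch; non-batch
sized drops would need separate accounting.

```c
#include <assert.h>

#define ODEBUG_BATCH_SIZE 16	/* value assumed for illustration */
#define MAX_SLOTS 64		/* hypothetical number of batch slots */

struct global_pool {
	unsigned int slot_idx;	/* number of full batches on the stack */
	/* batch slots omitted; note there is no tot_cnt member */
};

/* Total objects in the pool, derived from the number of full batches. */
static unsigned int global_pool_count(const struct global_pool *pool)
{
	assert(pool->slot_idx <= MAX_SLOTS);
	return pool->slot_idx * ODEBUG_BATCH_SIZE;
}
```
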
>
>> 3. debug_objects_pool_min_level also needs to be adjusted accordingly,
>> so that the min level is expressed as a number of batches.
>
> Sure. There are certainly more problems with that code. As I said, it's
> untested and way too big to be reviewed. I'll split it up into more
> manageable bits and pieces.

Looking forward to it.

>
> Thanks,
>
> tglx
> .
>

--
Regards,
Zhen Lei
