Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.

From: JoonSoo Kim
Date: Tue Jun 04 2013 - 18:21:28 EST

2013/6/4 Christoph Lameter <cl@xxxxxxxxx>:
> On Tue, 4 Jun 2013, JoonSoo Kim wrote:
>> And I re-read Steven's initial problem report for the RT kernel and found
>> that unfreeze_partials() takes and drops the list_lock several times. This
>> means that the pages on a cpu partial list don't all come from the same
>> node. Why do we add another node's slab to this cpu partial list? This is
>> also not good for the general case. How about considering node affinity
>> in __slab_free()? IMHO, if we implement this, Steven's problem can be solved.
> We may need the other nodes' pages if we consistently allocate from there.
> __slab_alloc() ensures that only pages from the correct node are used. It
> will drop pages that do not come from the proper nodes.
> Filtering in __slab_free would mean that we cannot improve performance on
> remote frees.

Sorry for the poor explanation.
I know that we need the other nodes' pages if we consistently allocate
from there; in the allocation path, that is reasonable.
But in the free path, it is not reasonable to put another node's page
on the cpu partial list. In that case we don't need the other node's
page, because there is no allocation request for that node. Putting
another node's page on the cpu partial list, and then using it for a
normal allocation request (one that doesn't specify node affinity),
degrades system performance. Below is a quick implementation of this
idea.

diff --git a/mm/slub.c b/mm/slub.c
index 57707f0..7f614f4 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2495,7 +2495,8 @@ static void __slab_free(struct kmem_cache *s, struct page *page,

 	if ((!new.inuse || !prior) && !was_frozen) {

-		if (!kmem_cache_debug(s) && !prior)
+		if (!kmem_cache_debug(s) && !prior &&
+				node_match(page, numa_node_id()))

 			/*
 			 * Slab was on no list before and will
 			 * be partially empty
@@ -2550,8 +2551,9 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	 * Objects left in the slab. If it was not on the partial list before
 	 * then add it.
 	 */
-	if (kmem_cache_debug(s) && unlikely(!prior)) {
-		remove_full(s, page);
+	if (!prior) {
+		if (kmem_cache_debug(s))
+			remove_full(s, page);
 		add_partial(n, page, DEACTIVATE_TO_TAIL);