zone->nr_inactive race?

From: Nikita Danilov (Nikita@Namesys.COM)
Date: Mon Apr 21 2003 - 12:58:43 EST


Hello,

I am observing on 2.5.67 that sometimes shrink_slab() is called with
insanely huge total_scanned. In kgdb I see

(gdb) f
#1 0xc013734c in shrink_caches (classzone=0xc04ad680, priority=12,
    total_scanned=0xddf15b60, gfp_mask=210, nr_pages=32, ps=0xddf15b64)
    at mm/vmscan.c:794
(gdb) p/x max_scan
$20 = 0xfffff # which looks suspiciously similar to ((unsigned)-1) >> 12
(gdb) p priority
$21 = 12
(gdb) p zone->nr_inactive
$22 = 0
(gdb) info threads
....
  16 Thread 17 0xc012e37c in unlock_page (page=0xc14d12c8)
    at include/asm/bitops.h:175
...
(gdb) thread 16 # this is the only other thread doing something page cache related at the moment
(gdb) bt
#0 0xc012e37c in unlock_page (page=0xc14d12c8) at include/asm/bitops.h:175
#1 0xc01369d9 in shrink_list (page_list=0xdfd27e54, gfp_mask=208,
    max_scan=0xdfd27eb8, nr_mapped=0xdfd27f0c, priority=7) at mm/vmscan.c:434
#2 0xc0136c56 in shrink_cache (nr_pages=513, zone=0xc04ad680, gfp_mask=208,
    max_scan=637, nr_mapped=0xdfd27f0c, priority=7) at mm/vmscan.c:517
#3 0xc01372bd in shrink_zone (zone=0xc04ad680, max_scan=1026, gfp_mask=208,
    nr_pages=513, nr_mapped=0xdfd27f0c, ps=0xdfd27f44, priority=7)
    at mm/vmscan.c:746
#4 0xc0137527 in balance_pgdat (pgdat=0xc04ac280, nr_pages=0, ps=0xdfd27f44)
    at mm/vmscan.c:909
#5 0xc01376a7 in kswapd (p=0xc04ac280) at mm/vmscan.c:969

Looks like zone->nr_inactive was negative.

After some more debugging I managed to break into kgbd when
zone->nr_inactive is 0xffffffc0 (again in shrink_caches()). On the
another processor a thread is running inside refill_inactive_zone():

        while (!list_empty(&l_inactive)) {
                page = list_entry(l_inactive.prev, struct page, lru);
                prefetchw_prev_lru_page(page, &l_inactive, flags);
                if (TestSetPageLRU(page))
                        BUG();

(somewhere near to TestSetPageLRU(page)).

(gdb) bt
#0 0xc0137008 in refill_inactive_zone (zone=0xc04ad680, nr_pages_in=128,
    ps=0xdd921a24, priority=9) at include/asm/bitops.h:136
#1 0xc01372a6 in shrink_zone (zone=0xc04ad680, max_scan=64, gfp_mask=210,
    nr_pages=32, nr_mapped=0xdd9219e4, ps=0xdd921a24, priority=9)
    at mm/vmscan.c:744
#2 0xc0137349 in shrink_caches (classzone=0xc04ad680, priority=9,
    total_scanned=0xdd921a20, gfp_mask=210, nr_pages=32, ps=0xdd921a24)
    at mm/vmscan.c:794
#3 0xc0137417 in try_to_free_pages (classzone=0xc04ad680, gfp_mask=210,
    order=0) at mm/vmscan.c:836
#4 0xc0131848 in __alloc_pages (gfp_mask=210, order=0, zonelist=0xc04afea0)
    at mm/page_alloc.c:610
...

This fragment of refill_inactive_zone() looks strange:

                list_move(&page->lru, &zone->inactive_list);
                if (!pagevec_add(&pvec, page)) {
                        spin_unlock_irq(&zone->lru_lock);
                        if (buffer_heads_over_limit)
                                pagevec_strip(&pvec);
                        __pagevec_release(&pvec);
                        spin_lock_irq(&zone->lru_lock);
                }

page is already on the inactive list (but not accounted for in
zone->nr_inactive). Zone spin lock is released. If at this moment other
thread would call del_page_from_lru(page), zone->nr_inactive can go
negative.

Let me know if more info is necessary, I can reproduce this.

Nikita.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Apr 23 2003 - 22:00:30 EST