Re: [2.4] heavy-load under swap space shortage

From: Hugh Dickins
Date: Thu Feb 05 2004 - 13:45:18 EST


On Wed, 4 Feb 2004 j-nomura@xxxxxxxxxxxxx wrote:
>
> With slight modification (please see the patch below), it's really helpful.
> I hope you push it again to the mainline.

Okay, glad to hear it, I'll try pushing to Marcelo in 2.4.26-pre.
Can you describe the benefit you see?

> I had to remove raciness check in swap_out_mm, otherwise swap_out_mm
> returns immediately and contend on mmlist_lock in mmput().

Ah yes, thanks a lot, leaving that swap_mm != mm check behind
made an utter nonsense of my patch.

> I think removal is ok because we now avoid the 'rush to same mm' by trylock.

I agree, swap_out_mm's TASK_SIZE check was for the previously all too
common case where one by one they spin on and visit the same old mm
too late, no need for it now. But if we remove that block, the
mmcounter arg becomes pointless, so removed in version below.

> I added the check for 'mm == swap_mm'. It might be necessary to avoid
> the corner case where mmlist_lock being held too long.

Oh, good point. But I'm uneasy about treating a trip round the mmlist
failing to get a lock as the same thing as finding no pages to free,
your "goto empty": drop lock and come around again instead, as below?

Hugh

--- 2.4.25-rc1/mm/vmscan.c 2004-02-05 17:37:28.755210944 +0000
+++ linux/mm/vmscan.c 2004-02-05 17:48:03.764674832 +0000
@@ -283,7 +283,7 @@
/*
* Returns remaining count of pages to be swapped out by followup call.
*/
-static inline int swap_out_mm(struct mm_struct * mm, int count, int * mmcounter, zone_t * classzone)
+static inline int swap_out_mm(struct mm_struct * mm, int count, zone_t * classzone)
{
unsigned long address;
struct vm_area_struct* vma;
@@ -292,13 +292,7 @@
* Find the proper vm-area after freezing the vma chain
* and ptes.
*/
- spin_lock(&mm->page_table_lock);
address = mm->swap_address;
- if (address == TASK_SIZE || swap_mm != mm) {
- /* We raced: don't count this mm but try again */
- ++*mmcounter;
- goto out_unlock;
- }
vma = find_vma(mm, address);
if (vma) {
if (address < vma->vm_start)
@@ -310,15 +304,13 @@
if (!vma)
break;
if (!count)
- goto out_unlock;
+ goto out;
address = vma->vm_start;
}
}
/* Indicate that we reached the end of address space */
mm->swap_address = TASK_SIZE;
-
-out_unlock:
- spin_unlock(&mm->page_table_lock);
+out:
return count;
}

@@ -330,6 +322,7 @@

counter = mmlist_nr << 1;
do {
+top:
if (unlikely(current->need_resched)) {
__set_current_state(TASK_RUNNING);
schedule();
@@ -345,12 +338,21 @@
swap_mm = mm;
}

+ /* Scan mmlist and lock the first available mm */
+ while (mm == &init_mm || !spin_trylock(&mm->page_table_lock)) {
+ mm = list_entry(mm->mmlist.next, struct mm_struct, mmlist);
+ if (mm == swap_mm) {
+ spin_unlock(&mmlist_lock);
+ goto top;
+ }
+ }
+
/* Make sure the mm doesn't disappear when we drop the lock.. */
atomic_inc(&mm->mm_users);
spin_unlock(&mmlist_lock);

- nr_pages = swap_out_mm(mm, nr_pages, &counter, classzone);
-
+ nr_pages = swap_out_mm(mm, nr_pages, classzone);
+ spin_unlock(&mm->page_table_lock);
mmput(mm);

if (!nr_pages)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/