Bug with shared memory.

From: Martin Schwidefsky (schwidefsky@de.ibm.com)
Date: Tue May 14 2002 - 10:13:06 EST


Hi,
we managed to hang the kernel with a db/2 stress test on s/390. The test
was done on 2.4.7 but the problem is present on all recent 2.4.x and 2.5.x
kernels (all architectures). In short a schedule is done while holding
the shm_lock of a shared memory segment. The system call that caused
this has been sys_ipc with IPC_RMID and from there the call chain is
as follows: sys_shmctl, shm_destroy, fput, dput, iput, truncate_inode_pages,
truncate_list_pages, schedule. The scheduler picked a process that called
sys_shmat. It tries to get the lock and hangs.

One way to fix this is to remove the schedule call from truncate_list_pages:

--- linux-2.5/mm/filemap.c~ Tue May 14 17:04:14 2002
+++ linux-2.5/mm/filemap.c Tue May 14 17:04:33 2002
@@ -237,11 +237,6 @@

                  page_cache_release(page);

- if (need_resched()) {
- __set_current_state(TASK_RUNNING);
- schedule();
- }
-
                  write_lock(&mapping->page_lock);
                  goto restart;
            }

Another way is to free the lock before calling fput in shm_destroy but the
comment says that this functions has to be called with shp and shm_ids.sem
locked. Comments?

blue skies,
   Martin

Linux/390 Design & Development, IBM Deutschland Entwicklung GmbH
Schönaicherstr. 220, D-71032 Böblingen, Telefon: 49 - (0)7031 - 16-2247
E-Mail: schwidefsky@de.ibm.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue May 14 2002 - 12:00:23 EST