Re: SHM stuff - Reason for Oopsen discovered

From: Christoph Rohland (hans-christoph.rohland@sap.com)
Date: Thu May 25 2000 - 12:38:15 EST


Russell King <rmk@arm.linux.org.uk> writes:

> I've found the following "oddities" in the shm code:
>
> 1. shm_alloc seems to assume that "sizeof(*pte) * PTRS_PER_PTE < PAGE_SIZE"
> Is this true of all architectures? It's a little wasteful on ARM because
> sizeof(*pte) * PTRS_PER_PTE == 1024, PAGE_SIZE == 4096.
>
> A cleaner way would be to define and use PTRS_PER_PAGE, especially since
> it doesn't actually reflect a page table, or even better and preferred
> use vmalloc() to allocate a virtually-contiguous area.
>
> I'm assuming that shm_alloc is doing all this crud so that it can allocate
> enough memory.

Yes, we need this crud so that the shm page tables do not exhaust the
vmalloc space on high-end servers. We ran into trouble in stress tests
because of this even without highmem support. With highmem support and
the shared anonymous support via shm this gets much worse. Please do
_not_ revert that.
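
A minimal user-space sketch of the chunked idea, assuming hypothetical
names (seg_table, CHUNK_ENTRIES, seg_slot) and calloc standing in for
the kernel allocator. It is not the code in ipc/shm.c; it only shows
that a directory of page-sized chunks needs no large contiguous area:

```c
#include <stdlib.h>

/* Hypothetical sizes: one chunk is sized to fit a single 4096-byte page. */
#define SKETCH_PAGE_SIZE 4096UL
#define CHUNK_ENTRIES    (SKETCH_PAGE_SIZE / sizeof(unsigned long))

/* A directory of page-sized chunks instead of one big contiguous array. */
struct seg_table {
    unsigned long **dir;   /* dir[i] points to one chunk of entries */
    size_t nchunks;
};

/* Number of chunks needed to hold one entry per page of the segment. */
static size_t chunks_for(size_t npages)
{
    return (npages + CHUNK_ENTRIES - 1) / CHUNK_ENTRIES;
}

/* Allocate the directory and every chunk; returns 0 on success, -1 on failure. */
static int seg_table_init(struct seg_table *t, size_t npages)
{
    size_t i;

    t->nchunks = chunks_for(npages);
    t->dir = calloc(t->nchunks ? t->nchunks : 1, sizeof(*t->dir));
    if (!t->dir)
        return -1;
    for (i = 0; i < t->nchunks; i++) {
        t->dir[i] = calloc(CHUNK_ENTRIES, sizeof(unsigned long));
        if (!t->dir[i]) {
            /* Undo the chunks allocated so far. */
            while (i--)
                free(t->dir[i]);
            free(t->dir);
            t->dir = NULL;
            return -1;
        }
    }
    return 0;
}

/* Locate the entry for a page index: directory slot, then offset in chunk. */
static unsigned long *seg_slot(struct seg_table *t, size_t idx)
{
    return &t->dir[idx / CHUNK_ENTRIES][idx % CHUNK_ENTRIES];
}

static void seg_table_free(struct seg_table *t)
{
    size_t i;

    for (i = 0; i < t->nchunks; i++)
        free(t->dir[i]);
    free(t->dir);
}
```

Each allocation here is at most one page, so the cost of a large
segment is many small allocations rather than one big virtually
contiguous mapping.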

> 2. It appears that zshm (and therefore sysvipc) is now required for anonymous
> mmap()s. If you configure a kernel without sysvipc, you can no longer
> do anonymous mmap()s - is this expected?

Sorry, this was an oversight on my side. We should stick a
map_zero_setup into the dummy code of SYSVIPC.

> 3. Is shmctl(..., SHM_LOCK, ...) honoured? It looks like there is code
> present to set a flag (PRV_LOCKED), and return the status of this bit to
> usermode via SHM_STAT/IPC_STAT, but nothing to actually prevent the
> shared memory segment being swapped.

This is honoured in shm_swap.
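
As a rough illustration of what honouring the flag in the swap path
could look like (a sketch with hypothetical names such as sketch_seg and
SKETCH_LOCKED, not the actual shm_swap code), the scan simply skips any
segment whose lock bit is set:

```c
#include <stddef.h>

#define SKETCH_LOCKED 0x1u   /* hypothetical stand-in for the lock bit */

struct sketch_seg {
    unsigned int flags;
    size_t resident_pages;
};

/*
 * Return the index of the first segment at or after 'start' that may be
 * swapped, or -1 if there is none.
 */
static int next_swap_candidate(const struct sketch_seg *segs, size_t n,
                               size_t start)
{
    size_t i;

    for (i = start; i < n; i++) {
        if (segs[i].flags & SKETCH_LOCKED)
            continue;   /* locked segments are never considered */
        if (segs[i].resident_pages == 0)
            continue;   /* nothing resident, nothing to swap */
        return (int)i;
    }
    return -1;
}
```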

> Also, I've found my problem, and it's ARM specific! It's related to
> point (1) above.
>
> On the ARM, we don't have enough bits in the page tables to store
> the pte bits, so what we do is the following:
>
>            +-----+
>            |     |
>            |     | cpu ptes
>            |     |
> pteptr ->  |     |
>            +-----+
>            |     |
>            |     | kernel-visible ptes
>            |     |
>            |     |
>            +-----+
>
> Unfortunately, because ipc/shm.c assumes that it can make
> assumptions about pte pointers and the actions that pte_clear does
> and so forth, it doesn't work well. On the ARM, the memory before
> the pteptr will get corrupted.
>
> Really, the code in shm.c needs to be re-written so it's not making
> these braindead assumptions, and if it's necessary to use the pte
> stuff, it MUST use the architecture functions to allocate pte tables
> and the like.
>
> Currently, I view the shm code as broken to the extreme.

Or the ARM one? (The basic principle of shm was always the same and
this could have been noticed when the ARM pte handling was designed.)

> I'm not sure what the correct way to proceed is - I've been trying
> to steer away from the Linux memory management stuff recently, so
> I'm reluctant to get more involved with this, since it's going to
> require a re-write in the shm allocation/fault/swapping area.

I think we should simply allocate the indirect blocks with pte_alloc
in chunks, as we do now with kmalloc. In 2.5 the shm code should
definitely be integrated into the page cache.
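
A user-space sketch of why the allocator matters, under loudly stated
assumptions: every name here (sketch_pte_alloc, sketch_set_pte,
SKETCH_ENTRIES) is hypothetical, and the layout (a companion slot at a
fixed distance before the pointer, mirroring the same value) is a
simplification of the ARM scheme described above, not the real one:

```c
#include <stdlib.h>

#define SKETCH_ENTRIES 256   /* hypothetical number of entries per table */

typedef unsigned long sketch_pte;

/*
 * Allocate both halves in one block and return a pointer to the second
 * half, so the SKETCH_ENTRIES slots before the pointer are valid memory.
 */
static sketch_pte *sketch_pte_alloc(void)
{
    sketch_pte *block = calloc(2 * SKETCH_ENTRIES, sizeof(sketch_pte));

    return block ? block + SKETCH_ENTRIES : NULL;
}

/* Free a table obtained from sketch_pte_alloc(). */
static void sketch_pte_free(sketch_pte *table)
{
    if (table)
        free(table - SKETCH_ENTRIES);
}

/* Set an entry and its companion slot, which lives before the pointer. */
static void sketch_set_pte(sketch_pte *ptep, sketch_pte val)
{
    ptep[0] = val;
    ptep[-SKETCH_ENTRIES] = val;
}

/* Clearing goes through the same path, so it also writes before ptep. */
static void sketch_pte_clear(sketch_pte *ptep)
{
    sketch_set_pte(ptep, 0);
}
```

If the table came from a plain allocation of SKETCH_ENTRIES slots
instead of sketch_pte_alloc(), sketch_pte_clear() would write outside
the block, which is the kind of corruption before the pteptr that
Russell describes.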

I will work out a patch at the beginning of next week. Right now I
have no access to the current kernel sources.

Greetings
                Christoph
