Re: [URGENT ASSISTANCE REQUESTED] production machines dying

Mike Jagdis (mike@roan.co.uk)
Wed, 26 Nov 1997 11:56:51 +0000 (GMT/BST)


On Tue, 25 Nov 1997, Rik van Riel wrote:

> > You can do it by looking at the vm mappings which hang off the
> > inode which hangs off the page.
>
> Do all pages have the ->inode filled in?
> What about stack pages and the like?

What you need to do there is to add a dummy inode to the struct mm_struct
objects that define memory spaces and have anonymous mappings in
that space reference it. Then you have a nice uniform structure to
work with. Paging in from an anonymous inode obviously just supplies
zero pages. Paging out to it should never happen because shared,
anonymous mappings don't make much sense (IMHO). However, you might
want to add a flag to inodes to indicate that pages sourced from
the inode must be bounced off swap and not back to the inode. That
would give a fairly elegant way of allowing files from slow devices
to be backed by swap rather than the original file (which is something
that has been requested in the past).
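
To make the idea concrete, here is a minimal sketch of the two
decisions described above. Everything in it is hypothetical: the
cut-down inode_sketch structure, the anonymous and swap_backed flags
and both function names are illustrative only and are not existing
kernel interfaces.

```c
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE_SKETCH 4096   /* assumed page size for the sketch */

enum pageout_target { PAGEOUT_TO_INODE, PAGEOUT_TO_SWAP };

/* Hypothetical, cut-down inode: only the two flags discussed above. */
struct inode_sketch {
    bool anonymous;    /* dummy inode hanging off an mm_struct */
    bool swap_backed;  /* pages must be bounced off swap, not the file */
};

/* Decide where a page sourced from this inode goes on page-out.
 * Nothing is ever written back to an anonymous inode. */
enum pageout_target choose_pageout_target(const struct inode_sketch *ino)
{
    if (ino->anonymous || ino->swap_backed)
        return PAGEOUT_TO_SWAP;
    return PAGEOUT_TO_INODE;
}

/* Page-in from an anonymous inode just supplies a zero page.
 * Returns false if the caller has to read from the backing file. */
bool page_in_if_anonymous(const struct inode_sketch *ino, unsigned char *page)
{
    if (!ino->anonymous)
        return false;
    memset(page, 0, PAGE_SIZE_SKETCH);
    return true;
}
```

With this shape the paging code never needs a special case for
"no inode"; it only asks the inode where its pages come from and
where they may go.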

> [snip nice piece of code]

I should probably have pointed out that you need to check the
address against the vm_start and vm_end to see if the page
actually falls in that region, otherwise you could end up with
a pointer to god knows what rather than a pte in that region...
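
A minimal sketch of that range check, assuming a cut-down stand-in
for the kernel's vm_area_struct. The vma_sketch type, its next link
and both function names are made up for illustration; only vm_start
and vm_end come from the discussion above, with vm_end taken to be
the first address past the region.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for vm_area_struct. */
struct vma_sketch {
    unsigned long vm_start;   /* first address in the region */
    unsigned long vm_end;     /* first address past the region */
    struct vma_sketch *next;  /* next mapping hanging off the same inode */
};

/* True if addr lies inside [vm_start, vm_end). */
bool addr_in_vma(const struct vma_sketch *vma, unsigned long addr)
{
    return addr >= vma->vm_start && addr < vma->vm_end;
}

/* Walk the mappings and return the first one that really covers addr,
 * or NULL. Only then is it safe to go looking for a pte. */
const struct vma_sketch *find_covering_vma(const struct vma_sketch *head,
                                           unsigned long addr)
{
    const struct vma_sketch *v;

    for (v = head; v != NULL; v = v->next)
        if (addr_in_vma(v, addr))
            return v;
    return NULL;
}
```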

> Not really, we just use an ext2-balloc like mechanism to look
> for the xx-sized area with the most free/stale pages. When it
> only does this below 16Megs, we only need to scan 4096 pages.
> Some kind of bitmap for DMA-able memory is easily manageable.

Bear in mind that you might want to be looking for a set of pages
that have a low average age rather than a set of pages which
are all stale. Also, different architectures might have
different ideas of what is DMA-able - it might be scattered
all over the place rather than simply being the bottom 16MB. And
with ISA DMA you need to remember that an 8 bit transfer must not
span a 64KB boundary and a 16 bit transfer must not span a 128KB
boundary. This is implicit in the buddy allocator (its blocks are
power-of-two sized and naturally aligned) but may not be in a
different method.
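
For an allocator that does not get the alignment for free, the ISA
boundary rule could be checked explicitly. This is only a sketch:
the function and macro names are invented here, and it checks the
boundary rule alone, not whether the memory is DMA-able on a given
architecture.

```c
#include <stdbool.h>
#include <stddef.h>

#define DMA_8BIT_BOUNDARY  0x10000UL  /* 64KB */
#define DMA_16BIT_BOUNDARY 0x20000UL  /* 128KB */

/* True if the physical range [phys, phys + len) spans a multiple
 * of boundary. */
bool dma_crosses_boundary(unsigned long phys, size_t len,
                          unsigned long boundary)
{
    unsigned long last;

    if (len == 0)
        return false;
    last = phys + len - 1;  /* last byte actually transferred */
    return (phys / boundary) != (last / boundary);
}

/* True if the buffer is acceptable for an ISA DMA transfer of the
 * given width. */
bool isa_dma_ok(unsigned long phys, size_t len, bool sixteen_bit)
{
    unsigned long boundary =
        sixteen_bit ? DMA_16BIT_BOUNDARY : DMA_8BIT_BOUNDARY;

    return !dma_crosses_boundary(phys, len, boundary);
}
```

For example, an 8KB buffer at physical 0xF000 would be rejected for
8 bit DMA (it runs over the 64KB mark) but accepted for 16 bit DMA.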

Incidentally, since you are asking for memory related tasks,
have you thought about large memory support?

Linux uses the 0-3GB range for user space and maps physical
memory at 3-4GB. Physical memory must be <1GB because we need
some space left for vmalloc to play in.
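
The arithmetic behind that limit, as a rough sketch for a 32 bit
address space. The names are illustrative, and the size of the
vmalloc reserve is left as a parameter because no figure is given
here.

```c
#include <stdint.h>

/* Start of the kernel's window: user space gets 0-3GB. */
#define PAGE_OFFSET_SKETCH 0xC0000000u

/* Kernel virtual address of a physical address under the current
 * "map everything at 3GB" scheme. */
uint32_t phys_to_virt_sketch(uint32_t phys)
{
    return phys + PAGE_OFFSET_SKETCH;
}

/* How much physical memory can be mapped at 3-4GB once
 * vmalloc_reserve bytes are set aside for vmalloc. */
uint32_t max_direct_mapped_phys(uint32_t vmalloc_reserve)
{
    uint64_t window = 0x100000000ULL - PAGE_OFFSET_SKETCH;  /* 1GB */

    return (uint32_t)(window - vmalloc_reserve);
}
```

Any non-zero reserve pushes the result below 1GB, which is where
the <1GB limit comes from.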

Now, consider if we only map the kernel text, data, bss at 3GB.
Pages mapped to a user space are accessible from the kernel anyway
so they don't need a kernel mapping as well. Pages not mapped to
a user space do need a kernel mapping if the kernel wants to use
them. This includes such things as page tables, task structures,
buffer heads etc. Pages which have a user mapping but are going
to be accessed at interrupt time need a kernel mapping (because
no user space is available at interrupt time - or, at least,
you can't predict what user space you are in).

What does all this mean? Well, it comes back to managing page
"flavours". The page management needs to track what pages have
kernel mappings and what don't so we can allocate appropriately,
plus we need routines to test if a page is kernel mapped and
to add/remove kernel mappings for a page. For memory sizes <1GB
the extra cruft collapses to nothing since all pages can, and
should, be kernel mapped. For larger systems kernel mappings are
added and removed as necessary.
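
Roughly, the bookkeeping could look like the sketch below. It is
only an illustration of the test/add/remove routines and of
allocating by flavour: the page_sketch structure, the flag and all
the function names are hypothetical, and real code would have to
touch the kernel page tables where the comments say so.

```c
#include <stdbool.h>
#include <stddef.h>

#define PG_KMAPPED_SKETCH 0x1u  /* page currently has a kernel mapping */

/* Hypothetical per-page bookkeeping. */
struct page_sketch {
    unsigned int flags;
    bool free;
    unsigned long kaddr;  /* kernel virtual address, 0 if unmapped */
};

/* Test if a page is kernel mapped. */
bool page_kernel_mapped(const struct page_sketch *p)
{
    return (p->flags & PG_KMAPPED_SKETCH) != 0;
}

/* Add a kernel mapping (real code would set up a kernel pte here). */
void page_add_kernel_mapping(struct page_sketch *p, unsigned long kaddr)
{
    p->kaddr = kaddr;
    p->flags |= PG_KMAPPED_SKETCH;
}

/* Remove a kernel mapping (real code would clear the pte and flush). */
void page_remove_kernel_mapping(struct page_sketch *p)
{
    p->kaddr = 0;
    p->flags &= ~PG_KMAPPED_SKETCH;
}

/* Allocate a free page of the requested flavour, or NULL if none. */
struct page_sketch *alloc_page_flavour(struct page_sketch *pages, size_t n,
                                       bool want_kmapped)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pages[i].free && page_kernel_mapped(&pages[i]) == want_kmapped) {
            pages[i].free = false;
            return &pages[i];
        }
    }
    return NULL;
}
```

On a machine with <1GB every page would simply start out with the
flag set and the add/remove routines would never be called.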

I hope someone is archiving this list :-). (Actually it would
probably make a good final year project for some student, wouldn't
it?)

Mike

P.S. You know the saying, "Them as can, do. Them as can't, talk
about it"? Well, my excuse is that I haven't got *time*. Ok? :-)

-- 
.----------------------------------------------------------------------.
|  Mike Jagdis                  |  Internet:  mailto:mike@roan.co.uk   |
|  Roan Technology Ltd.         |                                      |
|  54A Peach Street, Wokingham  |  Telephone:  +44 118 989 0403        |
|  RG40 1XG, ENGLAND            |  Fax:        +44 118 989 1195        |
`----------------------------------------------------------------------'