Re: memory handling in pre5/pre6

From: Rik van Riel (riel@conectiva.com.br)
Date: Tue Apr 18 2000 - 17:28:44 EST


On Wed, 19 Apr 2000, Andrea Arcangeli wrote:
> On Tue, 18 Apr 2000, Rik van Riel wrote:
>
> >Think about eg. multiple truncates on the same inode...
>
> Oh don't worry about that, that can't happen by design. That's what the
> VFS should be good for.

Please read Stephen Tweedie's mail to you. There is a clear bug
in truncate_inode_pages() and this patch should solve it. Maybe
this is the wrong fix; I'm open to suggestions on that. But
simply asserting that there is no bug (when there clearly is)
isn't going to do any good.

> >> find / -type f -exec cp {} /dev/null \; >/dev/null
> >
> >I'm running it now (with my latest version of the patch,
> >which is attached below) and performance of the rest of the
> >machine is not impacted by the continuous copy...
>
> Fun, you do another completely different patch and then you say
> that it wasn't true that the old patch was going to harm :). Did
> you ever try your previous patch in the first place?

I tried the previous patch mainly with processes that have big
working sets and some other things. And indeed, my OLD patch
broke on heavy I/O loads ... the new patch, however, works fine.
Oh, and don't be fooled by appearances: the new and the old
patch are only very subtly different.

> >I'm interested in real-world cases where anybody manages to
>
> Try your previous patch (not the new one) and then you'll see
> what happens in the real world.

I've seen it. I've fixed it, and now I'm interested in performance
metrics with the NEW patch. Dismissing my new patch because of
a mistake in the old patch isn't going to get us a better 2.4
kernel...

> >@@ -233,7 +243,18 @@
> > page = list_entry(page_lru, struct page, lru);
> > list_del(page_lru);
> >
> >+ /* What?! A page is in the LRU queue of another zone?! */
> >+ if (!memclass(page->zone, zone))
> >+ BUG();
> >+
> >+ count--;
> >+
>
> The lru should be global. It makes no sense to keep very obsolete
> stuff in 16 mbytes of cache. The fix is to put the lru into the NUMA
> structure instead of in the zone. That's another thing I want to fix.

You must be the only one who wants this. Linus has already said
that he wants to make memory management a per-zone thing...
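
For what it's worth, the memclass() check above only asserts the
invariant that, with per-zone LRU lists, a page may only ever sit on
the list of the zone it was allocated from. Roughly (a simplified
sketch with almost every field left out, not the literal pre5/pre6
zone_struct):

	typedef struct zone_struct {
		struct list_head	lru_cache;	/* LRU of this zone's pagecache pages */
		unsigned long		free_pages;
		/* ... watermarks, free areas, locks, etc ... */
	} zone_t;

Keeping the list per zone means shrink_mmap() only scans pages that
can actually help the zone that is short on memory, instead of wading
through pages from zones that are doing fine.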

> > dispose = &zone->lru_cache;
> >+ /* the page is in use, we can't free it now */
> >+ if (!page->buffers && page_count(page) > 1)
> >+ goto dispose_continue;
>
> you put busy pages into the lru and you'll lock up as soon as
> only busy pages remain in the lru.

You seem to forget that we only see stuff that's already in the
LRU cache. We don't "put anything in"; we simply put these pages
back where they came from. This is a very surprising comment
from you, since you're the guy who wrote the original LRU code
in the first place...
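
For context, the loop that hunk lives in has roughly this shape (a
sketch from memory, simplified; the real pre5/pre6 code differs in
detail):

	while (count > 0 && (page_lru = zone->lru_cache.prev) != &zone->lru_cache) {
		page = list_entry(page_lru, struct page, lru);
		list_del(page_lru);

		dispose = &zone->lru_cache;
		/* Busy page: we can't free it now, so it goes straight back
		 * onto the list it was just taken off.  Nothing that wasn't
		 * already on the LRU is added to it here. */
		if (!page->buffers && page_count(page) > 1)
			goto dispose_continue;

		/* ... try to drop the buffers and free the page ... */

dispose_continue:
		list_add(page_lru, dispose);
	}

The new test only changes where an already-LRU page gets put back, not
what goes onto the list in the first place.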

> >+
> >+ dispose = &old;
> > if (test_and_clear_bit(PG_referenced, &page->flags))
> > /* Roll the page at the top of the lru list,
> > * we could also be more aggressive putting
>
> You have now entirely broken the lru cache aging.

Nope. The "old" list is inserted at the back of the LRU list,
so it'll be scanned *last* on the next call to shrink_mmap().
This piece of code means that referenced pages end up at the
back of the LRU list, while unreferenced pages end up being
freed or at the front of the queue.
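
In code terms the idea is simply this (a condensed sketch of the
approach, not the literal patch):

	LIST_HEAD(old);		/* referenced pages seen during this scan */

	/* inside the scan loop: */
	dispose = &old;
	if (test_and_clear_bit(PG_referenced, &page->flags))
		/* recently referenced: keep it, parked on the "old" list */
		goto dispose_continue;
	/* unreferenced pages get freed, or stay near the front of the queue */

	/* after the loop: splice the referenced pages onto the end of the
	 * LRU that the scan reaches last, so they make a full trip through
	 * the queue before they are looked at again */
	list_splice(&old, &zone->lru_cache);

So a referenced page has to age all the way through the queue again
before it becomes a candidate for freeing.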

This is exactly how LRU is supposed to work, and a number of
people have already seen quite a performance gain on their
systems after applying my new patch.

Please have a look at the code before blindly dismissing it.

regards,

Rik

--
The Internet is not a network of computers. It is a network
of people. That is its real strength.

Wanna talk about the kernel? irc.openprojects.net / #kernelnewbies
http://www.conectiva.com/
http://www.surriel.com/
