invalidate_inode_pages in 2.3.35

From: Ananth Ananthanarayanan (ananth@sgi.com)
Date: Wed Jan 12 2000 - 17:03:00 EST


I'm trying to measure the performance of ext2 read
when the read has to actually hit the device.
I've set up a little system call, which takes
the fd and invalidates the pages associated with
that fd from the buffer/page cache. Basically,
the system call does:

        filp = fget(fd);
        invalidate_inode_pages(filp->f_dentry->d_inode);

The above seems to do the right thing in that
a read on that fd goes to the disk. But, the
amount of free memory goes down drastically,
as reported by vmstat ... in fact after a few
runs of the test, the system starts swapping.

All this happens in linux 2.3.35. My question is:
is the above use of invalidate_inode_pages() correct?
If so, is this problem known?

On further investigation, it looks like get_page() is
called one too many times, or inversely, put_page() is
not being called enough times. My analysis is as follows:
ext2 read is implemented with generic_file_read(),
in turn using do_generic_file_read(). Assuming
the page cache does not contain the requested page,
a new page is allocated via page_cache_alloc(), which
returns a page with page->count = 1 [as part of rmqueue()
of the buddy allocator]. Subsequently, __add_to_page_cache()
does a get_page() and page->count is 2. Now, the actual
read is initiated with the ext2 readpage() operation,
which is implemented with block_read_full_page(); this
routine allocates buffer headers for the page with
create_empty_buffers(), which again does a get_page().
Now the page->count is 3!

Back in do_generic_file_read(), after the readpage is complete
following the wait_on_page(), the page is released through
page_cache_release(), which drops the count to 2. Now,
if invalidate_inode_pages is called, the count can atmost
goto 1, and rightly, __free_pages won't really free the
page. Hence the page will now not be on any queues
(it's been removed from the page cache & the inode hash
as part of invalidate_inode_pages()), but will be left
"hanging", explaining the memory "leak".

Is this analysis correct?

ananth.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jan 15 2000 - 21:00:21 EST