Re: [RFC] A couple of questions about the paged I/O sub system

From: Hugh Dickins
Date: Wed Oct 21 2015 - 15:56:29 EST


On Wed, 21 Oct 2015, Ian Kent wrote:

> Hi all,
>
> I've been looking through some of the page reclaim code and at
> truncate_inode_pages().
>
> I'm not familiar with the code and I'm struggling to understand it.
>
> One thing that is puzzling me right now is, if a file has pages that
> have been modified and are swapped out when pagevec_lookup_entries() is
> called will they be found?

truncate_inode_pages() is a library function which a filesystem calls
at some stage in its inode truncation processing, to take all the incore
pages out of pagecache (out of its radix_tree), and free them up
(usually: some might be otherwise pinned in memory at the time).

A filesystem will have other work to do, very particular to that
filesystem, to free up the actual disk blocks: that's definitely
not part of truncate_inode_pages()'s job.

It's also called when evicting an inode no longer needed in memory,
to free the associated pagecache, when not deleting the blocks on disk.
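
As a rough illustration of that division of labour, here is a userspace
toy model, not kernel code. Every name in it (mini_page, mini_mapping,
mini_truncate) is made up for this sketch, and a fixed array stands in
for the radix_tree. It shows only the one point above: truncation takes
pages out of the cache, and a page is freed at that moment only if nobody
else has it pinned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MINI_NPAGES 8

/* Hypothetical stand-in for a pagecache page. */
struct mini_page {
    bool cached;  /* still reachable through the cache index */
    int  pins;    /* extra references held by other users */
    bool freed;   /* memory has been handed back */
};

/* Hypothetical stand-in for an inode's pagecache: a fixed array, not a tree. */
struct mini_mapping {
    struct mini_page pages[MINI_NPAGES];
};

/*
 * Remove every cached page at index >= start from the cache.
 * Unpinned pages are freed immediately; pinned pages leave the cache
 * but stay allocated until their last user lets go.
 * Returns the number of pages removed from the cache.
 */
static size_t mini_truncate(struct mini_mapping *m, size_t start)
{
    size_t removed = 0;

    assert(m != NULL);
    for (size_t i = start; i < MINI_NPAGES; i++) {
        struct mini_page *p = &m->pages[i];

        if (!p->cached)
            continue;
        p->cached = false;
        removed++;
        if (p->pins == 0)
            p->freed = true;
    }
    return removed;
}
```

Note what the sketch leaves out: nothing here touches disk blocks, which
stays the filesystem's own job.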

I don't think I understand your "swapped out": modifications occur to
a page while it is in pagecache, and those modifications need to be
written back to disk before that page can be reclaimed for other use.

>
> If not then how does truncate_inode_pages(_range)() handle waiting for
> these pages to be swapped back in to perform the writeback and
> truncation?

Pages are never "swapped back in to perform the writeback":
if writeback is needed, it's done before the page can be freed from
pagecache; and if that data is needed again after the page was freed,
it's read back in from disk to a fresh page.
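
The ordering described here can be sketched in a few lines of userspace C.
This is a toy under stated assumptions, not how the kernel implements it:
toy_file, toy_page, toy_read(), toy_write() and toy_reclaim() are all
hypothetical names, one int stands in for a page of data, and an int array
stands in for the disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_NPAGES 4

/* Hypothetical in-memory copy of one "page" of file data. */
struct toy_page {
    int  data;
    bool dirty;  /* modified in cache, not yet written to disk */
};

/* Hypothetical file: backing store plus a cache slot per page. */
struct toy_file {
    int disk[TOY_NPAGES];                /* what is on "disk" */
    struct toy_page *cache[TOY_NPAGES];  /* NULL when not in cache */
};

/* Return the cached page, reading it from disk into a fresh page if absent. */
static struct toy_page *toy_read(struct toy_file *f, int idx)
{
    assert(idx >= 0 && idx < TOY_NPAGES);
    if (!f->cache[idx]) {
        struct toy_page *p = malloc(sizeof(*p));

        if (!p)
            return NULL;
        p->data = f->disk[idx];
        p->dirty = false;
        f->cache[idx] = p;
    }
    return f->cache[idx];
}

/* Modify the page in cache only; the disk copy is now stale. */
static bool toy_write(struct toy_file *f, int idx, int value)
{
    struct toy_page *p = toy_read(f, idx);

    if (!p)
        return false;
    p->data = value;
    p->dirty = true;
    return true;
}

/* Reclaim a page: write it back first if dirty, only then free it. */
static void toy_reclaim(struct toy_file *f, int idx)
{
    struct toy_page *p;

    assert(idx >= 0 && idx < TOY_NPAGES);
    p = f->cache[idx];
    if (!p)
        return;
    if (p->dirty) {
        f->disk[idx] = p->data;  /* writeback happens before the free */
        p->dirty = false;
    }
    free(p);
    f->cache[idx] = NULL;
}
```

After toy_write() followed by toy_reclaim(), the data lives only on the
toy disk; a later toy_read() allocates a new page and fills it from there.
At no point is an old page brought back in order to be written.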

You may be worrying about what happens when a page is modified or
under writeback when it is truncated: I think that's something each
filesystem has to be careful of, and may deal with in different ways.

I'm not sure how much to read into your use of the word "swap".
It's true that shmem/tmpfs uses swap (of the swapon/swapoff variety)
as backing for its pages when under pressure (and uses its own variant
shmem_undo_range() to manage that, instead of truncate_inode_pages()),
but most filesystems don't use "swap" at all.

I just noticed your subject "paged I/O sub system": I hope you realize
that mm/page_io.c is solely concerned with swap (of the swapon/swapoff
variety), and has next to nothing to do with filesystems. (Just as,
conversely, mm/swap.c has next to nothing to do with swap.)

>
> Anyone, please?

I hope something I've said there has helped, but warn you that
I'm a terrible person to engage in an extended conversation with!
Expect long silences, pray for someone else to jump in.

Hugh

> Ian