Re: [PATCH v8 10/22] Replace the XIP page fault handler with the DAX page fault handler

From: Matthew Wilcox
Date: Wed Jul 23 2014 - 15:57:54 EST


On Wed, Jul 23, 2014 at 07:57:09PM +0300, Boaz Harrosh wrote:
> > +/*
> > + * The user has performed a load from a hole in the file. Allocating
> > + * a new page in the file would cause excessive storage usage for
> > + * workloads with sparse files. We allocate a page cache page instead.
> > + * We'll kick it out of the page cache if it's ever written to,
> > + * otherwise it will simply fall out of the page cache under memory
> > + * pressure without ever having been dirtied.
> > + */
>
> Do you like this? I understand that you cannot use the ZERO page or
> such a global page in the page cache, since each instance needs its own
> list_head/index/mapping and so on. But why use any page at all?
>
> Use a global ZERO page, either the system global or static local to
> this system. Map it into the current application VMA in question, using its
> pfn (page_to_pfn), just like you do with real DAX blocks from prd.

I must admit to not understanding the MM particularly well. There would
seem to be problems with rmap when doing this kind of trick. Also, this
is how reading from holes on regular filesystems works (except for the
part about kicking it out of page cache on a write). A third reason is
that there are some forms of PMem which are terribly slow to write to.
I have a longer-term plan to support these memories by transparently
caching them in DRAM and only writing back to the media on flush/sync.
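
For anyone who wants to see the "load from a hole" behaviour from
userspace, here is a minimal sketch. It is not DAX code; it only assumes
a Linux filesystem that supports sparse files under /tmp. The function
name hole_reads_zero() and the temporary file template are made up for
this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create a file that consists of one big hole, map it read-only and
 * load from every byte. Returns 1 if all bytes read as zero, 0 if a
 * non-zero byte was seen, -1 if the setup failed.
 */
int hole_reads_zero(void)
{
	char path[] = "/tmp/hole_demo_XXXXXX";	/* hypothetical scratch file */
	long page = sysconf(_SC_PAGESIZE);
	size_t len, i;
	volatile unsigned char *p;
	int fd, ok = 1;

	assert(page > 0);
	len = (size_t)page * 16;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);			/* file goes away when fd is closed */

	/* Extending with ftruncate() creates a hole: no data is written. */
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* Each load faults in a page for the hole; it must read as zero. */
	for (i = 0; i < len; i++) {
		if (p[i] != 0) {
			ok = 0;
			break;
		}
	}

	munmap((void *)p, len);
	close(fd);
	return ok;
}
```

On a regular filesystem those loads are satisfied from zero-filled page
cache pages, which is the behaviour the comment in the patch describes.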

> Say app A reads a hole, then app B reads a hole. Both now point to the same
> zero page pfn. Now say app B writes to that hole: mkwrite will convert it to
> a real dax-block pfn and will map the new pfn in the faulting VMA. But what about
> app A, will it read the old pfn? Who loops over all the VMAs that have a mapping
> and invalidates those mappings?

That's the call to unmap_mapping_range().
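
To make the idea concrete, here is a userspace toy model of that walk.
It is only a sketch: every name in it (toy_vma, toy_mapping, toy_attach,
toy_unmap_range) is invented for illustration, and the real kernel
tracks the VMAs in the address_space's i_mmap structure and has to deal
with locking and page tables, none of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGES 8

/* Hypothetical stand-in for a VMA: one "PTE" per file page, 0 = not mapped. */
struct toy_vma {
	unsigned long pte[TOY_PAGES];
	struct toy_vma *next;		/* next VMA mapping the same file */
};

/* Hypothetical stand-in for struct address_space. */
struct toy_mapping {
	struct toy_vma *vmas;		/* every VMA that maps this file */
};

/* Record that a VMA maps the file. */
void toy_attach(struct toy_mapping *m, struct toy_vma *v)
{
	assert(m && v);
	v->next = m->vmas;
	m->vmas = v;
}

/*
 * Clear the PTEs for pages [first, first + nr) in every VMA that maps
 * the file, so that the next access in any of them faults again and
 * picks up the new pfn. Returns the number of PTEs cleared.
 */
int toy_unmap_range(struct toy_mapping *m, size_t first, size_t nr)
{
	struct toy_vma *v;
	size_t i;
	int zapped = 0;

	for (v = m->vmas; v; v = v->next) {
		for (i = first; i < first + nr && i < TOY_PAGES; i++) {
			if (v->pte[i]) {
				v->pte[i] = 0;
				zapped++;
			}
		}
	}
	return zapped;
}
```

In the scenario above, app A and app B would each own one toy_vma with
the zero page's pfn in the same slot; the write fault from B calls
toy_unmap_range() on that slot, and A's stale entry is gone as well.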

> Same with truncate. App A mmap-reads a block, app B does a read-mmap then a truncate.
> Who loops over all the VMA mappings of these blocks to invalidate them? With page cache and
> pages we have a list of all the VMAs that currently have mappings on a page, but with
> dax-pfns (dax-blocks) we do *not* have a struct page, so who keeps the list of currently
> active VMA mappings?

Same solution ... there's a list in the address_space of all the VMAs that
have it mapped. See truncate_pagecache() in mm/truncate.c (filesystems
usually call truncate_setsize()).
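
The effect is visible from userspace on an ordinary filesystem. The
sketch below is not DAX-specific and assumes Linux with a writable /tmp;
the function name truncate_invalidates_mappings() and the file template
are made up for this example. Two shared mappings of one file stand in
for app A and app B.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if both mappings lose the old data after a truncate,
 * 0 if stale data is still visible, -1 if the setup failed.
 */
int truncate_invalidates_mappings(void)
{
	char path[] = "/tmp/trunc_demo_XXXXXX";	/* hypothetical scratch file */
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *a, *b;
	int fd, ret = -1;

	assert(page > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	if (ftruncate(fd, (off_t)page) < 0)
		goto out_close;

	/* "App A": read-only mapping. "App B": writable mapping. */
	a = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	if (a == MAP_FAILED)
		goto out_close;
	b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (b == MAP_FAILED)
		goto out_unmap_a;

	b[0] = 0xAB;			/* B dirties the page ... */
	if (a[0] != 0xAB)		/* ... and A sees it through its mapping */
		goto out_unmap_b;

	/*
	 * Truncate to zero, then extend again so the page is a hole. The
	 * truncate has to tear the page out of both mappings.
	 */
	if (ftruncate(fd, 0) < 0 || ftruncate(fd, (off_t)page) < 0)
		goto out_unmap_b;

	/* Both mappings fault again and must now see the hole. */
	ret = (a[0] == 0 && b[0] == 0);

out_unmap_b:
	munmap((void *)b, (size_t)page);
out_unmap_a:
	munmap((void *)a, (size_t)page);
out_close:
	close(fd);
	return ret;
}
```

Neither mapping asked for the invalidation; the truncate found them both
through the file.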