Re: [PATCH v8 4/9] dax: support dirty DAX entries in radix tree

From: Jan Kara
Date: Wed Jan 13 2016 - 04:44:13 EST


On Thu 07-01-16 22:27:54, Ross Zwisler wrote:
> Add support for tracking dirty DAX entries in the struct address_space
> radix tree. This tree is already used for dirty page writeback, and it
> already supports the use of exceptional (non struct page*) entries.
>
> In order to properly track dirty DAX pages we will insert new exceptional
> entries into the radix tree that represent dirty DAX PTE or PMD pages.
> These exceptional entries will also contain the writeback sectors for the
> PTE or PMD faults that we can use at fsync/msync time.
>
> There are currently two types of exceptional entries (shmem and shadow)
> that can be placed into the radix tree, and this adds a third. We rely on
> the fact that only one type of exceptional entry can be found in a given
> radix tree based on its usage. This happens for free with DAX vs shmem but
> we explicitly prevent shadow entries from being added to radix trees for
> DAX mappings.
>
> The only shadow entries that would be generated for DAX radix trees would
> be to track zero page mappings that were created for holes. These pages
> would receive minimal benefit from having shadow entries, and the choice
> to have only one type of exceptional entry in a given radix tree makes the
> logic simpler both in clear_exceptional_entry() and in the rest of DAX.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Reviewed-by: Jan Kara <jack@xxxxxxx>

I have realized there's one issue with this code. See below:

> @@ -34,31 +35,39 @@ static void clear_exceptional_entry(struct address_space *mapping,
>  		return;
>
>  	spin_lock_irq(&mapping->tree_lock);
> -	/*
> -	 * Regular page slots are stabilized by the page lock even
> -	 * without the tree itself locked. These unlocked entries
> -	 * need verification under the tree lock.
> -	 */
> -	if (!__radix_tree_lookup(&mapping->page_tree, index, &node, &slot))
> -		goto unlock;
> -	if (*slot != entry)
> -		goto unlock;
> -	radix_tree_replace_slot(slot, NULL);
> -	mapping->nrshadows--;
> -	if (!node)
> -		goto unlock;
> -	workingset_node_shadows_dec(node);
> -	/*
> -	 * Don't track node without shadow entries.
> -	 *
> -	 * Avoid acquiring the list_lru lock if already untracked.
> -	 * The list_empty() test is safe as node->private_list is
> -	 * protected by mapping->tree_lock.
> -	 */
> -	if (!workingset_node_shadows(node) &&
> -	    !list_empty(&node->private_list))
> -		list_lru_del(&workingset_shadow_nodes, &node->private_list);
> -	__radix_tree_delete_node(&mapping->page_tree, node);
> +
> +	if (dax_mapping(mapping)) {
> +		if (radix_tree_delete_item(&mapping->page_tree, index, entry))
> +			mapping->nrexceptional--;

So when you punch a hole in a file, you can delete a PMD entry from the
radix tree even though that entry also covers a part of the file which
stays. In that case you have to split the PMD entry into PTE entries
(that probably needs to happen up in truncate_inode_pages_range()) or do
something similar...
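
To make the problem concrete, here is a rough userspace sketch (not kernel
code, and not a proposed fix). It only models the index arithmetic: given a
PMD-sized entry and a punched range, how many pages covered by the entry
stay in the file. All names (pmd_entry_model, pmd_pages_surviving,
pmd_needs_split) are hypothetical, and it assumes 4 KiB pages with 2 MiB
PMD entries, i.e. 512 pages per entry:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages and 2 MiB PMD entries, i.e. 512 pages per PMD. */
#define PAGES_PER_PMD 512UL

/* Hypothetical model of one PMD-sized radix tree entry. */
struct pmd_entry_model {
	unsigned long first_index;	/* PMD-aligned page index of the entry */
};

/*
 * Count the pages covered by the entry that lie outside the punched
 * range [hole_start, hole_end), both given as page indices.
 */
static unsigned long pmd_pages_surviving(const struct pmd_entry_model *e,
					 unsigned long hole_start,
					 unsigned long hole_end)
{
	unsigned long first = e->first_index;
	unsigned long last = first + PAGES_PER_PMD;	/* exclusive */
	unsigned long lo, hi;

	assert(first % PAGES_PER_PMD == 0);
	assert(hole_start <= hole_end);

	/* Clamp the hole to the range covered by the entry. */
	lo = hole_start > first ? hole_start : first;
	hi = hole_end < last ? hole_end : last;
	if (lo >= hi)
		return PAGES_PER_PMD;	/* the hole misses the entry */

	return PAGES_PER_PMD - (hi - lo);
}

/*
 * True when the hole overlaps the entry only partially, so deleting the
 * whole entry would drop tracking for pages that stay in the file.
 */
static bool pmd_needs_split(const struct pmd_entry_model *e,
			    unsigned long hole_start,
			    unsigned long hole_end)
{
	unsigned long surviving = pmd_pages_surviving(e, hole_start, hole_end);

	return surviving != 0 && surviving != PAGES_PER_PMD;
}
```

For example, with an entry at index 512 and a hole over indices [600, 700),
412 pages of the entry survive, so simply deleting the entry would lose
the dirty tracking for them.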

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR