Re: [virtio-dev] [PATCH v6 4/6] mm: Introduce Reported pages

From: Nitesh Narayan Lal
Date: Thu Aug 22 2019 - 12:19:15 EST



On 8/21/19 10:59 AM, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
>
> In order to pave the way for free page reporting in virtualized
> environments we will need a way to get pages out of the free lists and
> identify those pages after they have been returned. To accomplish this,
> this patch adds the concept of a Reported Buddy, which is essentially
> meant to just be the Uptodate flag used in conjunction with the Buddy
> page type.
>
> It adds a set of pointers we shall call "boundary" which represents the
> upper boundary between the unreported and reported pages. The general idea
> is that in order for a page to cross from one side of the boundary to the
> other it will need to go through the reporting process. Ultimately a
> free_list has been fully processed when the boundary has been moved from
> the tail all they way up to occupying the first entry in the list.
>
> Doing this we should be able to make certain that we keep the reported
> pages as one contiguous block in each free list. This will allow us to
> efficiently manipulate the free lists whenever we need to go in and start
> sending reports to the hypervisor that there are new pages that have been
> freed and are no longer in use.
>
> An added advantage to this approach is that we should be reducing the
> overall memory footprint of the guest as it will be more likely to recycle
> warm pages versus trying to allocate the reported pages that were likely
> evicted from the guest memory.
>
> Since we will only be reporting one zone at a time we keep the boundary
> limited to being defined for just the zone we are currently reporting pages
> from. Doing this we can keep the number of additional pointers needed quite
> small. To flag that the boundaries are in place we use a single bit
> in the zone to indicate that reporting and the boundaries are active.
>
> The determination of when to start reporting is based on the tracking of
> the number of free pages in a given area versus the number of reported
> pages in that area. We keep track of the number of reported pages per
> free_area in a separate zone specific area. We do this to avoid modifying
> the free_area structure as this can lead to false sharing for the highest
> order with the zone lock which leads to a noticeable performance
> degradation.
[...]
> +
> +/* request page reporting on this zone */
> +void __page_reporting_request(struct zone *zone)
> +{
> + struct page_reporting_dev_info *phdev;
> +
> + rcu_read_lock();
> +
> + /*
> + * We use RCU to protect the ph_dev_info pointer. In almost all
> + * cases this should be present, however in the unlikely case of
> + * a shutdown this will be NULL and we should exit.
> + */
> + phdev = rcu_dereference(ph_dev_info);
> + if (unlikely(!phdev))
> + return;
> +

Just a minor comment here.
Although this is unlikely to trigger still I think you should release the
rcu_read_lock before returning.

> + /*
> + * We can use separate test and set operations here as there
> + * is nothing else that can set or clear this bit while we are
> + * holding the zone lock. The advantage to doing it this way is
> + * that we don't have to dirty the cacheline unless we are
> + * changing the value.
> + */
> + __set_bit(ZONE_PAGE_REPORTING_REQUESTED, &zone->flags);
> +
> + /*
> + * Delay the start of work to allow a sizable queue to
> + * build. For now we are limiting this to running no more
> + * than 10 times per second.
> + */
> + if (!atomic_fetch_inc(&phdev->refcnt))
> + schedule_delayed_work(&phdev->work, HZ / 10);
> +
> + rcu_read_unlock();
> +}
> +
[...]
> + }
> +
> + /* enable page reporting notification */
> + static_branch_enable(&page_reporting_notify_enabled);
> +err_out:
> + mutex_unlock(&page_reporting_mutex);
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(page_reporting_startup);
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
> For additional commands, e-mail: virtio-dev-help@xxxxxxxxxxxxxxxxxxxx
>
--
Thanks
Nitesh