On Thu, Dec 12, 2019 at 05:06:58PM +0100, SeongJae Park wrote:
On Thu, 12 Dec 2019 16:27:57 +0100 "Roger Pau MonnÃ" <roger.pau@xxxxxxxxxx> wrote:
diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index fd1e19f1a49f..98823d150905 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -142,6 +142,21 @@ static inline bool persistent_gnt_timeout(struct persistent_gnt *persistent_gnt)
HZ * xen_blkif_pgrant_timeout);
}
+/* Once a memory pressure is detected, squeeze free page pools for a while. */
+static unsigned int buffer_squeeze_duration_ms = 10;
+module_param_named(buffer_squeeze_duration_ms,
+ buffer_squeeze_duration_ms, int, 0644);
+MODULE_PARM_DESC(buffer_squeeze_duration_ms,
+"Duration in ms to squeeze pages buffer when a memory pressure is detected");
+
+static unsigned long buffer_squeeze_end;
+
+void xen_blkbk_reclaim_memory(struct xenbus_device *dev)
+{
+ buffer_squeeze_end = jiffies +
+ msecs_to_jiffies(buffer_squeeze_duration_ms);
I'm not sure this is fully correct. This function will be called for
each blkback instance, but the timeout is stored in a global variable
that's shared between all blkback instances. Shouldn't this timeout be
stored in xen_blkif so each instance has it's own local variable?
Or else in the case you have 1k blkback instances the timeout is
certainly going to be longer than expected, because each call to
xen_blkbk_reclaim_memory will move it forward.
Agreed that. I think the extended timeout would not make a visible
performance, though, because the time that 1k-loop take would be short enough
to be ignored compared to the millisecond-scope duration.
I took this way because I wanted to minimize such structural changes as far as
I can, as this is just a point-fix rather than ultimate solution. That said,
it is not fully correct and very confusing. My another colleague also pointed
out it in internal review. Correct solution would be to adding a variable in
the struct as you suggested or avoiding duplicated update of the variable by
initializing the variable once the squeezing duration passes. I would prefer
the later way, as it is more straightforward and still not introducing
structural change. For example, it might be like below:
diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index f41c698dd854..6856c8ef88de 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -152,8 +152,9 @@ static unsigned long buffer_squeeze_end;
void xen_blkbk_reclaim_memory(struct xenbus_device *dev)
{
- buffer_squeeze_end = jiffies +
- msecs_to_jiffies(buffer_squeeze_duration_ms);
+ if (!buffer_squeeze_end)
+ buffer_squeeze_end = jiffies +
+ msecs_to_jiffies(buffer_squeeze_duration_ms);
}
static inline int get_free_page(struct xen_blkif_ring *ring, struct page **page)
@@ -669,10 +670,13 @@ int xen_blkif_schedule(void *arg)
}
/* Shrink the free pages pool if it is too large. */
- if (time_before(jiffies, buffer_squeeze_end))
+ if (time_before(jiffies, buffer_squeeze_end)) {
shrink_free_pagepool(ring, 0);
- else
+ } else {
+ if (unlikely(buffer_squeeze_end))
+ buffer_squeeze_end = 0;
shrink_free_pagepool(ring, max_buffer_pages);
+ }
if (log_stats && time_after(jiffies, ring->st_print))
print_stats(ring);
May I ask you what way would you prefer?
I'm not particularly found of this approach, as I think it's racy. Ie:
you would have to add some kind of lock to make sure the contents of
buffer_squeeze_end stay unmodified during the read and set cycle, or
else xen_blkif_schedule will race with xen_blkbk_reclaim_memory.
This is likely not a big deal ATM since the code will work as
expected in most cases AFAICT, but I would still prefer to have a
per-instance buffer_squeeze_end added to xen_blkif, given that the
callback is per-instance. I wouldn't call it a structural change, it's
just adding a variable to a struct instead of having a shared one, but
the code is almost the same as the current version.