Re: [PATCH 23/26] netfs: Cut over to using new writeback code

From: David Howells
Date: Mon Apr 08 2024 - 11:55:53 EST


David Howells <dhowells@xxxxxxxxxx> wrote:

> + /* Wait for writeback to complete. The writeback engine owns
> + * the info in folio->private and may change it until it
> + * removes the WB mark.
> + */
> + if (folio_wait_writeback_killable(folio)) {
> + ret = written ? -EINTR : -ERESTARTSYS;
> + goto error_folio_unlock;
> + }
> +

It turns out that this really kills performance with fio with as many jobs as
cpus. It's taking up to around 8x longer to complete a pwrite() on average
and perf shows a 30% of the CPU cycles are being spent in contention on the
i_rwsem.

The reason this was added here is that writeback cannot take the folio lock in
order to clean up folio->private without risking deadlock vs the truncation
routines (IIRC).

I can mitigate this by skipping the wait if folio->private is not set and if
we're not going to attach anything there (see attached). Note that if
writeout is ongoing and there is nothing attached to ->private, then we should
not be engaging write-streaming mode and attaching a new netfs_folio (and if
we did, we'd flush the page and wait for it anyway).

The other possibility is if we have a writeback group to set. This only
applies to ceph for the moment and is something that will need dealing with
if/when ceph is made to use this code.

David
---

diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c
index 1eff9413eb1b..279b296f8014 100644
--- a/fs/netfs/buffered_write.c
+++ b/fs/netfs/buffered_write.c
@@ -255,7 +255,8 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter,
* the info in folio->private and may change it until it
* removes the WB mark.
*/
- if (folio_wait_writeback_killable(folio)) {
+ if (folio_get_private(folio) &&
+ folio_wait_writeback_killable(folio)) {
ret = written ? -EINTR : -ERESTARTSYS;
goto error_folio_unlock;
}