Re: [PATCH 2/2] nfs/blocklayout: Limit repeat device registration on failure

From: Chuck Lever
Date: Thu Nov 21 2024 - 10:32:57 EST


On Thu, Nov 21, 2024 at 10:11:33AM -0500, Benjamin Coddington wrote:
> On 20 Nov 2024, at 10:22, Chuck Lever wrote:
>
> > On Wed, Nov 20, 2024 at 09:09:35AM -0500, Benjamin Coddington wrote:
> >> If we're unable to register a SCSI device, ensure we mark the device as
> >> unavailable so that it will timeout and be re-added via GETDEVINFO. This
> >> avoids repeated doomed attempts to register a device in the IO path.
> >>
> >> Add some clarifying comments as well.
> >>
> >> Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration")
> >> Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx>
> >> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> >> ---
> >> fs/nfs/blocklayout/blocklayout.c | 12 +++++++++++-
> >> 1 file changed, 11 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
> >> index 0becdec12970..b36bc2f4f7e2 100644
> >> --- a/fs/nfs/blocklayout/blocklayout.c
> >> +++ b/fs/nfs/blocklayout/blocklayout.c
> >> @@ -571,19 +571,29 @@ bl_find_get_deviceid(struct nfs_server *server,
> >> if (!node)
> >> return ERR_PTR(-ENODEV);
> >>
> >> + /*
> >> + * Devices that are marked unavailable are left in the cache with a
> >> + * timeout to avoid sending GETDEVINFO after every LAYOUTGET, or
> >> + * constantly attempting to register the device. Once marked as
> >> + * unavailable they must be deleted and never reused.
> >> + */
> >> if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) {
> >> unsigned long end = jiffies;
> >> unsigned long start = end - PNFS_DEVICE_RETRY_TIMEOUT;
> >>
> >> if (!time_in_range(node->timestamp_unavailable, start, end)) {
> >> + /* Force a new GETDEVINFO for this LAYOUT */
> >
> > Or perhaps: "Uncork subsequent GETDEVINFO operations for this device"
> > <shrug>
>
> Sure, ok!
>
> >> nfs4_delete_deviceid(node->ld, node->nfs_client, id);
> >> goto retry;
> >> }
> >> goto out_put;
> >> }
> >>
> >> - if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node)))
> >> + /* If we cannot register, treat this device as transient */
> >
> > How about "Make a negative cache entry for this device"
>
> Hmm - that's closer to the dentry language rather than how we refer to
> temporary error cases in device land. For me the "transient" has some
> hopeful meaning as in we expect this might work in the future - but I'm ok
> changing this comment. There will be some NFS clients that might try to do
> pNFS SCSI but will never actually have the devices locally, and so that's
> not a "transient" situation. This can only fixed today with export policy.

No big deal!


> >> + if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) {
> >> + nfs4_mark_deviceid_unavailable(node);
> >> goto out_put;
> >> + }
> >>
> >> return node;
> >>
> >> --
> >> 2.47.0
> >>
> >
> > It took me a bit to understand what this patch does. It is like
> > setting up a negative dentry so the local device cache absorbs
> > bursts of checks for the device. OK.
>
> Yes, its like the layout error handling, but for devices.
>
> Its not obvious at this layer, but every IO wants to do LAYOUTGET, then
> figure out which device GETDEVINFO, then here we need to prep the device
> with a reservation. Its a lot of slow work that makes a mess of IO
> latencies if one of the later steps is going to fail for awhile.

Thanks! It would be nice to add this bit of context to the patch
description.


> > Just an observation: Negative caching has some consequences too.
> > For instance, there will now be a period where, if the device
> > happens to become available, the layout is still unusable. I wonder
> > if that's going to have some undesirable operational effects.
>
> It sure does, but I don't think there's a simple way to get notified that a
> SCSI device has re-appeared or has started supporting persistent
> reservations.

Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx>


--
Chuck Lever