RE: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach and loop open

From: Gulam Mohamed
Date: Tue Jul 09 2024 - 05:10:01 EST


Hi Christoph,

> -----Original Message-----
> From: Gulam Mohamed
> Sent: Saturday, July 6, 2024 1:21 AM
> To: hch@xxxxxx
> Cc: linux-block@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> yukuai1@xxxxxxxxxxxxxxx; axboe@xxxxxxxxx
> Subject: RE: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach
> and loop open
>
> Hi Christoph,
>
> > -----Original Message-----
> > From: hch@xxxxxx <hch@xxxxxx>
> > Sent: Tuesday, July 2, 2024 9:20 PM
> > To: Gulam Mohamed <gulam.mohamed@xxxxxxxxxx>
> > Cc: hch@xxxxxx; linux-block@xxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; yukuai1@xxxxxxxxxxxxxxx;
> > axboe@xxxxxxxxx
> > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop
> > detach and loop open
> >
> > Hi Gulam,
> >
> > On Sun, Jun 30, 2024 at 10:11:14PM +0000, Gulam Mohamed wrote:
> > > With our latest version of the patch V6, the "kernel robot test"
> > > failed in the ioctl_loop06 test (LTP tests) as in below mail.
> > > the reason for the failure is, the deferring of the "detach" loop
> > > device to release function. The test opens the loop device, sends
> > > LOOP_SET_BLOCK_SIZE and LOOP_CONFIGURE commands and in between that,
> > > it will also detach the loop device. At the end of the test, while
> > > cleanup, it will close the loop device. As we deferred the detach to
> > > last close, the detach will be at the end only but before that we
> > > are setting the lo_state to Lo_rundown. This setting of Lo_rundown
> > > we are doing in the beginning because, there was another LTP test
> > > case failed earlier due to the same reason.
> > >
> > > So, when the LOOP_CONFIGURE was sent, the loop device was still in
> > > Lo_rundown state (Lo_unbound will be set after detach in
> > > __loop_clr_fd()) due to which kernel returned the EBUSY error
> > > causing the test to fail.
> >
> > Before we'd end up in Lo_unbound toward the end of __loop_clr_fd if
> > there was a single opener.
> >
> > > I have noticed that a good number of test cases are having a
> > > behaviour that it will send different loop commands and in between
> > > the detach command also, with only a single open. And close happens at
> > > the end.
> > > Due to this, I think a couple of test cases needs to be modified.
> > >
> > > Now, as per my understanding, we have two options here:
> > >
> > > 1. Continue with this kernel patch and modify few test cases to
> > > accommodate this new kernel behaviour
> >
> > That would be my preference. Any code that is doing a clear_fd and
> > then tries to configure it again is prone to races vs other openers.
> > It also does not seem very useful outside of test code.
> > But if we end up breaking real code and not test cases we might have
> > to go and bring it back.
>
> Requested the maintainers of the LTP test cases for the modification to
> accommodate the new kernel behavior.

The LTP maintainers have agreed to modify the impacted test cases to accommodate the new kernel behavior. They are asking which kernel version/commit will include this new behavior.
Could you please help get the patch integrated into mainline?
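
For reference, the behavior change discussed in this thread can be sketched as a small user-space model. This is only an illustration and not kernel code: the state names (Lo_unbound, Lo_bound, Lo_rundown) are the ones used above, while every toy_* name, the defer_detach flag, and the simplified transition rules are assumptions made up for the sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy user-space model of the loop device states named in this thread.
 * NOT kernel code; all toy_* names are hypothetical. */
enum lo_state { Lo_unbound, Lo_bound, Lo_rundown };

struct toy_loop {
	enum lo_state state;
	int openers;
	bool defer_detach;	/* true models the V6 behavior (detach on last close) */
};

static void toy_open(struct toy_loop *lo)
{
	lo->openers++;
}

/* Stand-in for LOOP_CONFIGURE: only an unbound device can be configured. */
static int toy_configure(struct toy_loop *lo)
{
	if (lo->state != Lo_unbound)
		return -EBUSY;
	lo->state = Lo_bound;
	return 0;
}

/* Stand-in for the detach request (clear_fd). */
static int toy_clr_fd(struct toy_loop *lo)
{
	if (lo->state != Lo_bound)
		return -ENXIO;	/* assumption: error code chosen for the model */
	lo->state = Lo_rundown;
	/* Old behavior: with a single opener the detach completes right away. */
	if (!lo->defer_detach && lo->openers == 1)
		lo->state = Lo_unbound;
	return 0;
}

/* Stand-in for close: a pending detach completes on the last close. */
static void toy_release(struct toy_loop *lo)
{
	if (--lo->openers == 0 && lo->state == Lo_rundown)
		lo->state = Lo_unbound;
}

/* Sequence used by the affected tests: open, configure, detach, configure
 * again, close. Returns the result of the second configure. */
static int toy_run_sequence(bool defer_detach)
{
	struct toy_loop lo = { Lo_unbound, 0, defer_detach };
	int ret;

	toy_open(&lo);
	if (toy_configure(&lo) != 0 || toy_clr_fd(&lo) != 0)
		return -EINVAL;	/* setup failed; not expected in this model */
	ret = toy_configure(&lo);	/* the call that now fails in the test */
	toy_release(&lo);
	return ret;
}
```

In this model, toy_run_sequence(false) returns 0 (the device is back in Lo_unbound before the second configure), while toy_run_sequence(true) returns -EBUSY because the device stays in Lo_rundown until the last close. That is the difference the test cases need to accommodate.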

Regards,
Gulam Mohamed.