Re: usb storage cleanup

From: Matthew Dharm (mdharm-kernel@one-eyed-alien.net)
Date: Wed Jul 03 2002 - 19:05:21 EST


On Thu, Jul 04, 2002 at 12:19:28AM +0200, Manfred Spraul wrote:
> Matthew Dharm wrote:
> > I don't understand what this patch is trying to do...
> >
> > You're reverting our new state machine changes... why?
> >
>
> Because the state machine doesn't work. I've degraded it into a
> debugging state.
> I've described it in a mail I sent to you and linux-usb-devel a few
> weeks ago, without any reply.

I've got that mail, and it's on my todo list.

> E.g. queue_command stored new commands in ->queue_srb. The worker thread
> then moved it from queue_srb to srb and set sm_state to RUNNING.
>
> But what if command_abort() is called before the worker thread is scheduled?

Then we have a serious problem, because an abort only arrives after a
timeout on the order of several seconds. If the thread hasn't gotten
scheduled by then it _should_ cause a BUG_ON.
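
For illustration only (this is not code from the driver or from either
poster): a toy, single-threaded model of the hand-off described above. The
field names queue_srb, srb and sm_state come from the thread; the struct,
the enum values and the model_* functions are hypothetical, and real code
would hold a lock around each step. It only shows the window in which an
abort can find the command still sitting in queue_srb.

```c
#include <stddef.h>

/* Hypothetical states and results; not the driver's actual values. */
enum model_state  { SM_IDLE, SM_RUNNING };
enum abort_result { ABORT_NOTHING, ABORT_DEQUEUED, ABORT_RUNNING };

struct model_host {
    void *queue_srb;            /* submitted, not yet picked up by the worker */
    void *srb;                  /* currently being processed */
    enum model_state sm_state;
};

/* Submission path: only parks the command in queue_srb. */
static void model_queue_command(struct model_host *h, void *cmd)
{
    h->queue_srb = cmd;
}

/* Worker thread step: move the command over and mark it running. */
static void model_worker_pickup(struct model_host *h)
{
    if (h->queue_srb != NULL) {
        h->srb = h->queue_srb;
        h->queue_srb = NULL;
        h->sm_state = SM_RUNNING;
    }
}

/* Abort path: must cope with both "still queued" and "already running". */
static enum abort_result model_command_abort(struct model_host *h, void *cmd)
{
    if (h->queue_srb == cmd) {
        h->queue_srb = NULL;    /* worker never saw it: just drop it */
        return ABORT_DEQUEUED;
    }
    if (h->sm_state == SM_RUNNING && h->srb == cmd)
        return ABORT_RUNNING;   /* transport-level abort would go here */
    return ABORT_NOTHING;
}
```

If the abort runs before model_worker_pickup(), it returns ABORT_DEQUEUED
and sm_state is still SM_IDLE; whether a real driver should handle that case
or treat it as a bug is exactly the point under discussion.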

> State machines and asynchronous command aborts are incompatible; that's
> why I've moved command abortion out of sm_state.

I disagree here. I think the clear state machine is the -only- way to get
this right. We tried it without the state machine, and all we did was find
more and more corner cases which are not handled.

> > You're reverting the new mechanism to determine device state... why?
>
> Unnecessary duplication. Device disconnected is equivalent to
> ->pusb_dev==NULL. Why do you need a special variable?

Because relying on a pointer has caused problems in the past, especially
when there are concerns that the pointer might be invalid.

> > You're removing the entire bus_reset() logic... why?
> >
> You are right, that change is not correct.
> Do you remember the reasons that led to the current implementation?
>
> Hmm. Are you sure that the code can't cause data losses with unrelated
> devices?
> Suppose I have a usb hub installed, and behind that hub 2 usb disks. If
> bus_reset is called for the scsi controller that represents one disk,
> won't that affect the data transfers that go to the other disk?

The hub isn't reset, only the target device is.

> > This patch undoes most of the work done in the last few months. I
> > _strongly_ oppose the patch without some better explanations.
>
> I've sent you a mail on 06/02 with details about all changes.
>
> http://www.geocrawler.com/archives/3/2571/2002/6/600/8821396/
>
> You did not reply, thus I assumed that you were too busy and I fixed
> everything myself.

I see... thus skipping the 4 patches in my queue which address most of
these issues.

Look, I might not be that speedy on this, but did it at least occur to you
to contact _any_ of the other usb-storage people? Bjorn? Stern?

> The only new change is removing the call to usb_stor_CBI_irq() and
> replacing it with "up(&us->ip_waitq);" from usb_stor_abort_transport.
> Setting sm_state and then calling usb_stor_CBI_irq() is a
> synchronization nightmare.
> Situation: command is completed by the hardware and aborted by the scsi
> midlayer at the same time. usb_stor_abort_transport() could run on cpu1,
> _CBI_irq() on cpu2. Now imagine you run on Alpha, where both reads and
> writes are reordered. Initially I tried to fix it with memory barriers,
> but the new version is much simpler.

The only requirement in this condition is that the command state be
consistent at the end -- either completed or aborted. I don't see how the
current code fails this requirement...
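
For illustration only (not the usb-storage code): one way to make the
"either completed or aborted" requirement explicit is a single atomic
transition out of the running state, so that whichever of completion and
abort gets there first wins and the other sees that it lost. This is a
minimal sketch using C11 atomics; the enum values, struct cmd and the cmd_*
helpers are hypothetical names, and kernel code would use the kernel's own
atomic primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical command states; not the driver's sm_state values. */
enum cmd_state { CMD_RUNNING, CMD_COMPLETED, CMD_ABORTED };

struct cmd {
    _Atomic int state;
};

/* Mark a freshly submitted command as running. */
static void cmd_init(struct cmd *c)
{
    atomic_init(&c->state, CMD_RUNNING);
}

/*
 * Try to move RUNNING -> final_state in one atomic step.
 * Returns true only for the caller that performed the transition.
 */
static bool cmd_finish(struct cmd *c, int final_state)
{
    int expected = CMD_RUNNING;
    return atomic_compare_exchange_strong(&c->state, &expected, final_state);
}

/* Completion path (e.g. an interrupt handler on one CPU). */
static bool cmd_complete(struct cmd *c)
{
    return cmd_finish(c, CMD_COMPLETED);
}

/* Abort path (e.g. the SCSI midlayer on another CPU). */
static bool cmd_abort(struct cmd *c)
{
    return cmd_finish(c, CMD_ABORTED);
}

/* Read the current state. */
static int cmd_get_state(struct cmd *c)
{
    return atomic_load(&c->state);
}
```

Exactly one of cmd_complete() and cmd_abort() can return true for a given
command, so the final state is consistent no matter how the two paths
interleave; the default sequentially consistent ordering of the C11
operations also covers the reordering concern raised for Alpha.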

Matt

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net 
Maintainer, Linux USB Mass Storage Driver

A: The most ironic oxymoron wins ...
DP: "Microsoft Works"
A: Uh, okay, you win.
					-- A.J. & Dust Puppy
User Friendly, 1/18/1998

