On Thu, Jul 04, 2002 at 12:19:28AM +0200, Manfred Spraul wrote:
> Matthew Dharm wrote:
> > I don't understand what this patch is trying to do...
> > 
> > You're reverting our new state machine changes... why?
> > 
> 
> Because the state machine doesn't work. I've degraded it into a 
> debugging state.
> I've described it in a mail I send to you and linux-usb-devel a few 
> weeks ago, without any reply.
I've got that mail, and it's on my todo list.
> E.g. queue_command stored new commands in ->queue_srb. The worker thread 
> then moved it from queue_srb to srb and set sm_state to RUNNING.
> 
> But what if command_abort() is called before the worker thread is scheduled?
Then we have a serious problem, because the aborts are on the order of
several seconds.  If the thread hasn't gotten scheduled by then it _should_
cause a BUG_ON.
> State machines and asynchroneous command aborts are incompatible, that 
> why I've moved command abortion out of sm_state.
I disagree here.  I think the clear state machine is the -only- way to get
this right.  We tried it without the state machine, and all we did was find
more and more corner cases which are not handled.
> > You're reverting the new mechanism to determine device state... why?
> 
> Unnesessary duplication. Device disconnected is equivalent to 
> ->pusb_dev==NULL. Why do you need a special variable?
Because relying on a pointer has caused problems in the past, especially
when there are concerns that the pointer might be invalid.
> > You're removing the entire bus_reset() logic... why?
> >
> You are right, that change is not correct.
> Do you remember the reasons that lead to the current implementation?
> 
> Hmm. Are you sure that the code can't cause data losses with unrelated 
> devices?
> Suppose I have an usb hub installed, and behind that hub 2 usb disks. If 
> bus_reset is called for the scsi controller that represents one disk, 
> won't that affect the data transfer that go to the other disk?
The hub isn't reset, only the target device is.
> > This patch undoes most of the work done in the last few months.  I
> > _strongly_ oppose the patch without some better explanations.
> 
> I've sent you a mail on 06/02 with details about all changes.
> 
> http://www.geocrawler.com/archives/3/2571/2002/6/600/8821396/
> 
> You did not reply, thus I assumed that you were too busy and I fixed 
> everything myself.
I see.. thus skipping the 4 patches which address most of these issues
which are in my queue.
Look, I might not be that speedy on this, but did it at least occur to you
to contact _any_ of the other usb-storage people?  Bjorn?  Stern?
> The only new change is removing the call to usb_stor_CBI_irq() and 
> replacing it with "up(&us->ip_waitq);" from usb_stor_abort_transport. 
> Setting sm_state and then calling usb_stor_CBI_irq() is a 
> synchronization nightmare.
> Situation: command is completed by the hardware and aborted by the scsi 
> midlayer at the same time. usb_stor_abort_transport() could run on cpu1, 
> _CBI_irq() on cpu2. Now imagine you run on Alpha, where both reads and 
> writes are reordered. Initially I tried to fix it with memory barriers, 
> but the new version is much simpler.
The only requirement in this condition is that the command state be
consistent at the end -- either completed or aborted.  I don't see how the
current code fails this requirement...
Matt
-- Matthew Dharm Home: mdharm-usb@one-eyed-alien.net Maintainer, Linux USB Mass Storage DriverA: The most ironic oxymoron wins ... DP: "Microsoft Works" A: Uh, okay, you win. -- A.J. & Dust Puppy User Friendly, 1/18/1998
This archive was generated by hypermail 2b29 : Sun Jul 07 2002 - 22:00:11 EST