Re: [GIT PULL] SCSI fixes for 4.7-rc2
From: Hannes Reinecke
Date: Mon Jun 13 2016 - 03:04:59 EST
On 06/11/2016 11:03 PM, James Bottomley wrote:
> On Sat, 2016-06-11 at 13:25 -0700, Linus Torvalds wrote:
>> On Sat, Jun 11, 2016 at 12:41 PM, James Bottomley
>> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> The QEMU people have accepted it as their bug and are fixing it.
>>
>> Of course they are. Somebody found a bug in their device model, I'd
>> expect nothing else.
>>
>> But I'm not worried about qemu. I'm worried about all the other
>> random devices that have never been tested.
>
> Most of the other devices that are likely to misbehave don't advertise
> high levels of SCSI conformance, so we seem to be mostly covered.
>
And we have been running this very patch in SLES for over a year now,
without a single issue being reported.
>>> There's no other course of action, really because we can't stop people
>>> sending this command using the BLOCK_PC interface from user space, so
>>> it's now a known and easy to use way of stopping the device from
>>> responding.
>>
>> Bah. That's not an argument from kernel space. We've had that
>> forever. Broken device that hangs up when you try to read past the
>> end? If you can open the raw device for reading, you can still do a
>> SCSI_IOCTL_SEND_COMMAND to send that read command past the end.
>>
>> The fact that you can craft special commands that can cause problems
>> for specific devices (if you have access to the raw device) does
>> *not* at all argue that the kernel should then do those accesses of
>> its own volition.
>>
>> My worry basically comes down to: we're clearly now doing something
>> that has never ever been tested by anybody before.
>>
Not quite. See above.
The reported issue came from someone running the very latest Linux
kernel in a VM hosted on an ancient version of QEMU. Hardly a common
scenario.
>> And I think that the assumption that the bug would magically be
>> limited to qemu is a *big* assumption.
>
> How do we ever find out if we don't test it, though? I'm sure some
> obscure minor celebrity trying to get on the chat show circuit once
> said "what is userspace except a test case for the kernel?"
>
> If this is the only problem that turns up, I think we're done. If we
> get any more we can consider either blacklisting all CD type devices or
> raising the conformance bar to SPC-3.
>
I'm fully with James here.
The alternative would be to whitelist _every_ conformant device,
resulting in lots of unhappy customers until we've got the whitelist
settled.
Having to discuss with customers why Linux doesn't follow the specs is
infinitely harder than discussing with customers whose _hardware_
doesn't follow the specs.
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@xxxxxxx +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)