Re: data corruption: revalidating a (removable) hdd/flash on re-insert

From: Michael Tokarev
Date: Fri Oct 31 2008 - 13:39:39 EST


Kay Sievers wrote:
On Fri, Oct 31, 2008 at 16:38, Michael Tokarev <mjt@xxxxxxxxxx> wrote:
To make a long story short: is there a way to force kernel
to re-validate a replaced usb-connected hard drive (or a
flash) *automatically*?
[]
Insert the media, and force a validation:
$ touch /dev/sdb

With a newly inserted flash (removed some irrelevant stuff):

DEVTYPE=disk SUBSYSTEM=block MINOR=16 ACTION=change MAJOR=8
DEVTYPE=partition SUBSYSTEM=block MINOR=17 ACTION=add MAJOR=8
DEVTYPE=scsi_device SUBSYSTEM=scsi DRIVER=sd SDEV_MEDIA_CHANGE=1 ACTION=change
DEVTYPE=disk SUBSYSTEM=block MINOR=16 ACTION=change MAJOR=8

Access the device:
$ touch /dev/sdb

Nothing should happen, as the reader/kernel knows it is still valid.

Yes nothing happens.

Now remove the media and insert it immediately again.

Access the device:
$ touch /dev/sdb
UEVENT[1225468868.803950] change /devices/pci0000:00/0000:00:1d.7/usb5/5-2/5-2:1.0/host8/target8:0:0/8:0:0:0 (scsi)

and you see the reader told the kernel (SCSI unit attention) to
revalidate the device.

Ok. So in my case, nothing happens here just like
if it were not removed/inserted.

I replaced the card with another one, and nothing
happened as well.

Only when touch'ing after REMOVING the flash, I see:

DEVTYPE=scsi_device SUBSYSTEM=scsi DRIVER=sd SDEV_MEDIA_CHANGE=1 ACTION=change
DEVTYPE=partition SUBSYSTEM=block MINOR=17 ACTION=remove MAJOR=8
DEVTYPE=disk SUBSYSTEM=block MINOR=16 ACTION=change MAJOR=8

Every access to removable media is guarded by this revalidation check.
If you don't see these events, you should not trust this reader, and
at least never change the media while it is connected.
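
A minimal sketch of that check, in Python: it parses one line of a
uevent environment dump (the space-separated KEY=VALUE form shown above)
and reports whether it is the SCSI media-change notification. The
function names parse_uevent and is_media_change are hypothetical,
invented for illustration; only the keys and values come from the dumps
in this thread.

```python
def parse_uevent(line):
    """Split a space-separated KEY=VALUE uevent dump into a dict."""
    env = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not KEY=VALUE
            env[key] = value
    return env


def is_media_change(env):
    """True if the event is a SCSI 'change' event flagged as a media change."""
    return (
        env.get("SUBSYSTEM") == "scsi"
        and env.get("ACTION") == "change"
        and env.get("SDEV_MEDIA_CHANGE") == "1"
    )
```

If no such event ever shows up after swapping the media, the reader is
not reporting the change.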

Ok. So.. 3 questions.

1) How did it work before (I have yet to find which kernel worked)?
I can only guess that some older kernel never cached the
"validity".

2) 'doze notices the insertions/removals just fine. Again, I
can only guess that it constantly polls for changes.

3) And the most important one. I think there should be a
way to stop "caching" of the media information, i.e. to force
revalidation events on EVERY access, for certain hardware at
least. Because corruption in such cases is much worse than
any positive effects of caching etc... Maybe some unusual_devs.h
way or somesuch?..

Now I see the device is somewhat(?) broken. But as I said before
in another email, it's a great device (as in, two epochs connected
to each other), and it'd be sad to lose it. Nostalgia, sort of.. ;)

Ok, maybe actually polling for devices sometimes makes sense... ;)
And there can be a workaround, using a tiny daemon that constantly
accesses the device, in order to catch removals... 'hwell.
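
Such a daemon could look roughly like the sketch below. It is untested
against the reader discussed here and rests on two assumptions: that
opening the device node is enough to trigger the kernel's media check
(as touch does above), and that the polling interval is shorter than
the time it takes to swap a card, so the removal is seen before the new
card goes in. The names probe and poll_forever and the 2-second
interval are made up for illustration.

```python
import os
import time


def probe(path):
    """Open the device and read one sector; return True on success.

    The open() is the important part: it is the access that gives the
    kernel a chance to notice that the medium has gone away.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False  # e.g. no medium present, or the node is gone
    try:
        os.read(fd, 512)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


def poll_forever(path, interval=2.0):
    """Probe the device every `interval` seconds and log state changes."""
    last = None
    while True:
        present = probe(path)
        if present != last:
            state = "readable" if present else "not readable"
            print(f"{path}: {state}")
            last = present
        time.sleep(interval)
```

It would be started as poll_forever("/dev/sdb"), as a user with read
access to the device node.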

/mjt