Re: [PATCH] Use of getblk differs between locations

From: Anton Altaparmakov
Date: Wed Oct 12 2005 - 15:09:26 EST


On Wed, 12 Oct 2005, Jeff Mahoney wrote:
> Anton Altaparmakov wrote:
> > On Tue, 2005-10-11 at 00:49 +0200, Mikulas Patocka wrote:
> >> On Mon, 10 Oct 2005, Anton Altaparmakov wrote:
> >>> On Mon, 10 Oct 2005, Mikulas Patocka wrote:
> >>>> On Mon, 10 Oct 2005, Glauber de Oliveira Costa wrote:
> >>>> As comment in buffer.c says, getblk will deadlock if the machine is out of
> >>>> memory. It is questionable whether to deadlock or return NULL and corrupt
> >>>> filesystem in this case --- deadlock is probably better.
> >>> What do you mean corrupt filesystem? If a filesystem is written so badly
> >>> that it will cause corruption when a NULL is returned somewhere, I
> >>> certainly don't want to have anything to do with it.
> >> What should a filesystem driver do if it can't suddenly read or write any
> >> blocks on media?
> >
> > Two clear choices:
> >
> > 1) Switch to read-only and use the cached data to fulfil requests and
> > fail all others.
> >
> > 2) Ask the user to insert the media/plug the device back in (this is by
> > far the most likely cause of all requests suddenly failing) and then
> > continue where they left off.
> >
> > It is unfortunate that Linux does not allow for 2) so you need to do 1).
>
> I recently looked into 2) a bit and the dm multipath code is almost
> enough to do exactly this.
>
> If you configure your block device as an mpath device that queues on
> path failure, and change the table to the new device location on device
> re-attach, the queued up i/o will be flushed out. Almost. Right now,
> when you change the table and resume the dm mapping, it does a suspend
> which attempts to write out the data to a device which is no longer
> there, causing it to just be dropped on the floor. If this were changed
> not to do that, and perhaps set a timer so that the dirty data wouldn't
> be left around forever if the device wasn't reattached, 2) would
> definitely be possible.
>
> I realize that the userspace intervention required may involve a bit of
> dark magic, but my point is most of the code required on the kernel side
> is already implemented.

Cool. I didn't know that. On Linux as you say the userspace intervention
is the real problem due to non-X vs X and gnome vs KDE vs whatnot... But
I would imagine a simple userspace helper a-la /sbin/modprobe and things
like that would be enough. And that could then be system specific to
display a message to the local user...

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/