Re: [PATCH] Use of getblk differs between locations

From: Anton Altaparmakov
Date: Mon Oct 10 2005 - 17:26:20 EST

On Mon, 10 Oct 2005, Mikulas Patocka wrote:
> On Mon, 10 Oct 2005, Glauber de Oliveira Costa wrote:
> > On Mon, Oct 10, 2005 at 10:20:07PM +0100, Anton Altaparmakov wrote:
> > > On Mon, 10 Oct 2005, Glauber de Oliveira Costa wrote:
> > > > I've just noticed that the use of sb_getblk differs between locations
> > > > inside the kernel. To be precise, in some locations there are tests
> > > > against its return value, and in some places there are not.
> > > >
> > > > According to the comments in __getblk definition, the tests are not
> > > > necessary, as the function always return a buffer_head (maybe a wrong
> > > > one),
> > >
> > > If you had read the source code rather than just the comments you would
> > > have seen that this is not true. It can return NULL (see
> > > fs/buffer.c::__getblk_slow()). Certainly I would prefer to keep the
> > > checks in NTFS, please. They may only be good for catching bugs but I
> > > like catching bugs rather than segfaulting due to a NULL dereference.
> The check should be rather a BUG() than dump_stack() and return NULL --- I
> think it's not right to write code to recover from programming errors.

Why programming errors? It could be faulty memory or other corruption,
perhaps even caused by a different driver altogether (e.g. I found a bug
in ntfs last week which caused it to memset() to zero a random location in
memory of a random size and it caused a lot of strange effects like my
shell suddenly exiting and me being left on the login prompt...). Also it
could be that the function one day changes and it can return NULL. It is
far safer to do checking than to make assumptions about not being able to
return NULL.

> Filesystem drivers are supposed to pass correct blocksize to getblk(). ---
> even for users it's better to crash, because user whose machine has locked up
> on BUG() will report bug more likely than user whose machine has written stack
> dump into log and corrupted filesystem --- by the time he discovers the
> corruption and mesage he might not even remember what triggered it.
> As comment in buffer.c says, getblk will deadlock if the machine is out of
> memory. It is questionable whether to deadlock or return NULL and corrupt
> filesystem in this case --- deadlock is probably better.

What do you mean corrupt filesystem? If a filesystem is written so badly
that it will cause corruption when a NULL is returned somewhere, I
certainly don't want to have anything to do with it.

Going BUG() is generally a bad thing if the error can be recovered from.
Certainly all my code attempts to recover from all error conditions it can
possibly encounter.

I would much rather see NULL and then handle the error gracefully with an
error message than go BUG(). You can then still umount and remove the fs
module and everything works fine (you may need an fsck you may not depends
on how good your error handling is). If you do a BUG() you are guaranteed
to cause corruption... I only use BUG() when something really cannot
happen unless there is a bug in which case I want to know it...

Best regards,

Anton Altaparmakov <aia21 at> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on
WWW: &
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at