I was working on my filesystem project (announcment coming soon, hopefully:-)
when I noticed in ext2_new_inode that the descriptor buffer that ends up being
marked dirty is quite often not the same as the one into which gdp points.
This means that the wrong descriptor buffer is sometimes marked dirty, which
surprisingly hasn't seemed to cause a lot of problems, at least for me. But
the bug is really there and probably manifests itself occasionally in
high load, limited memory situations, especially with large partitions.
Here is a patch against kernel 2.3.47 for ialloc.c that fixes the problem:
----------------------
279c279
< struct buffer_head * bh2;
---
> struct buffer_head * bh2, * tmpbh2;
314c314
< tmp = ext2_get_group_desc (sb, i, &bh2);
---
> tmp = ext2_get_group_desc (sb, i, &tmpbh2);
318a319
> bh2 = tmpbh2;
327c328
< tmp = ext2_get_group_desc (sb, j, &bh2);
---
> tmp = ext2_get_group_desc (sb, j, &tmpbh2);
335a337
> bh2 = tmpbh2;
347,348c349,350
< tmp = ext2_get_group_desc (sb, i, &bh2);
< if (tmp && le16_to_cpu(tmp->bg_free_inodes_count))
---
> tmp = ext2_get_group_desc (sb, i, &tmpbh2);
> if (tmp && le16_to_cpu(tmp->bg_free_inodes_count)) {
350,351c352,353
< else
< {
---
> bh2 = tmpbh2;
> } else {
360c362
< tmp = ext2_get_group_desc (sb, i, &bh2);
---
> tmp = ext2_get_group_desc (sb, i, &tmpbh2);
363a366
> bh2 = tmpbh2;
376c379
< tmp = ext2_get_group_desc (sb, i, &bh2);
---
> tmp = ext2_get_group_desc (sb, i, &tmpbh2);
379a383
> bh2 = tmpbh2;
----------------------
I can't resist the temptation to comment on the main factor that caused this
this bug in the first place. The real source of the problem is the design of
the ext2_get_group_desc (sb, group, &bh) function: it has one too many
parameters. This function returns a pointer to a group descriptor given the
filesystem superblock and the group number. It also returns another result
"out the side", namely buffer into which the pointer points returned as a side
effect.
There are circumstances where you have to do this kind of thing, but this isn't
one of them. It's easy and efficient to find the buffer given a group number:
it's just a shift, and the correct value to shift by (s_desc_per_block_bits) is
sitting right there in the info struct. The design of ext2_get_group_desc is
an accident waiting to happen: whatever *can* get out of synch, *will* get out
of synch, and in a way that results in the most elusive possible bug (to
misquote Murphy).
What should really be done here is to write a new version of
ext2_get_group_desc that does things in a better-factored way, and phase out
the current version over the course of the next few releases. The patch
given here is really just a hack by comparison with fixing the real source of
the problem.
--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Tue Feb 29 2000 - 21:00:16 EST