Re: [PATCH] sector_t overflow in block layer

From: Andreas Dilger
Date: Fri May 19 2006 - 16:55:14 EST

On May 19, 2006 13:11 -0700, Andrew Morton wrote:
> btw, it seems odd to me that we're trying to handle the
> device-too-large-for-sector_t problem at the submit_bh() level. What
> happens if someone uses submit_bio()? Isn't it something we can check at
> mount time, or partition parsing, or...?

Doing so at mount time will be added to ext3 shortly. That said, I'd still
like a check in the block layer to avoid the silent corruption possible if
the filesystem makes a mistake, or for those filesystems that haven't added
such a check yet, or new filesystems that don't in the future.

I saw that xfs and ocfs2 have some checks at fill_super time related
to sector_t and CONFIG_LBD, which I found AFTER I saw that ext3 already
had a problem and was searching for where CONFIG_LBD was used. Looking
at that code, however, it isn't even clear there that this is checking
for sector_t overflow, only what the maximum file size is, so they may
also be susceptible to the same corruption.

As for submit_bio() I don't think it is even POSSIBLE to check for overflow
at that stage because the bi_sector has already been truncated to sector_t.
It would have to be done in the caller of submit_bio(), as in submit_bh().

As for the kernel completely disallowing access to a block device larger
than 2TB when CONFIG_LBD isn't set is a separate question entirely.
Stephen suggested (and I can believe this happening, just did it myself
last week during testing on an 8TB+delta device) that a filesystem may be
formatted to only use the accessible part of the device. Having the block
layer fail IO beyond the limit is good, not being able to use the device
because it is just a bit bigger than the safe limit is not.

One extra suggestion that might be safe and acceptible all around is if
a device is larger than 2TB w/o a 64-bit sector_t that the block device
size itself be truncated in the kernel to 2TB-512. This at least prevents
userspace tools from trying to e.g. format a 3TB filesystem on a device
that will just corrupt the filesystem.

I don't think partitions on a larger device make any difference, since
the kernel will remap them to the final sector offset internally in the
end, and still overflow sector_t later (which could itself be checked
also). I don't think this is sufficient to avoid the submit_bh() check
though, since it doesn't address wrapping when calculating the sector.

Cheers, Andreas
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at