Re: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flagset
From: Andrew Morton
Date: Tue Dec 30 2008 - 14:36:18 EST
On Tue, 25 Nov 2008 10:19:18 +0100
Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> On Mon, Nov 24 2008, Keith Mannthey wrote:
> > Allow the scsi request REQ_QUIET flag to be propagated to the buffer
> > file system layer. The basic idea is to pass the flag from the scsi
> > request to the bio (block IO) and then to the buffer layer. The buffer
> > layer can then suppress needless printks.
> >
> > This patch declutters the kernel log by removing the 40-50 (per lun)
> > buffer io error messages seen during a boot in my multipath setup. There
> > is a good chance any real errors will be missed in the "noise" in the
> > logs without this patch.
> >
> > During boot I see blocks of messages like
> > "
> > __ratelimit: 211 callbacks suppressed
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242847
> > Buffer I/O error on device sdm, logical block 1
> > Buffer I/O error on device sdm, logical block 5242878
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242872
> > "
> > in my logs.
> >
> > My disk environment is multipath fiber channel using the SCSI_DH_RDAC
> > code and multipathd. This topology includes an "active" and "ghost"
> > path for each lun. IOs to the "ghost" path will never complete, and the
> > SCSI layer, via the scsi device handler rdac code, quickly returns the IOs
> > to these paths and sets the REQ_QUIET scsi flag to suppress the scsi
> > layer messages.
> >
> > I want to extend the QUIET behavior to include the buffer file
> > system layer to deal with these errors as well. I have been running this
> > patch for a while now on several boxes without issue. A few runs of
> > bonnie++ show no noticeable difference in performance in my setup.
> >
> > Thanks to John Stultz for the quiet_error finalization.
>
> Looks good to me. I'll merge it up for 2.6.29.

So a month later this turns up in linux-next. During the merge
window, giving a nice pile of rejects to keep me amused.

Can we do better than this, please? A lot?

> +static int quiet_error(struct buffer_head *bh)
> +{
> +	if (!test_bit(BH_Quiet, &bh->b_state) && printk_ratelimit())
> +		return 0;
> +	return 1;
> +}
>

For better or for worse, we have a convention of using cpp-generated
helper functions for the buffer_head flags. There's no reason why this
new code needs to diverge from that. The above should use buffer_quiet().

The functions in fs/buffer.c have been nicely commented.

This function is poorly named. What does "quiet_error" *mean*?

<tries to work it out>

Every caller of this function does `if (!quiet_error(bh))'. Would it
not make more sense to invert the sense of its return value?

static int permit_bh_errors(struct buffer_head *bh)
{
	if (buffer_quiet(bh))
		return 0;	/* IO layer suppressed error messages */
	return printk_ratelimit();
}

Did I translate that right? If so, then the addition of the
printk_ratelimit() to the non-buffer_quiet() buffers is an
unchangelogged and unrelated alteration.
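
The translation can be checked exhaustively. Below is a minimal
userspace sketch, assuming the buffer's quiet bit and the result of
printk_ratelimit() can be modelled as plain ints. The *_model names
are hypothetical and exist only for this illustration; they are not
kernel functions.

```c
#include <assert.h>

/* Model of the patch's quiet_error(): 0 means "go ahead and print". */
static int quiet_error_model(int quiet, int ratelimit_ok)
{
	assert(quiet == 0 || quiet == 1);
	if (!quiet && ratelimit_ok)
		return 0;
	return 1;
}

/* Model of the proposed permit_bh_errors(): nonzero means "print". */
static int permit_bh_errors_model(int quiet, int ratelimit_ok)
{
	if (quiet)
		return 0;	/* IO layer suppressed error messages */
	return ratelimit_ok;
}

/* Returns 1 if the two models are exact inverses for all four inputs. */
static int models_are_inverse(void)
{
	int quiet, ok;

	for (quiet = 0; quiet <= 1; quiet++)
		for (ok = 0; ok <= 1; ok++)
			if (!quiet_error_model(quiet, ok) !=
			    !!permit_bh_errors_model(quiet, ok))
				return 0;
	return 1;
}
```

In this model the two agree on all four input combinations, and in
both the ratelimit result only matters for buffers that are not quiet.
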
The use of printk_ratelimit() needs some thought. It shares
ratelimiting state with all other printk_ratelimit() callsites. Was
that desirable? Would it have been better to create a private
ratelimit_state for buffer_heads? Per physical device? Per
something-else?
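
To make the "private state" question concrete, here is a rough
userspace sketch of a rate limiter whose state is owned by the caller
rather than shared globally. Everything in it (struct bh_ratelimit,
bh_ratelimit_ok(), the explicit `now' argument) is a hypothetical
stand-in chosen for illustration; it is not the kernel's
ratelimit_state implementation.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for a private rate-limit state. */
struct bh_ratelimit {
	long interval;	/* window length, in arbitrary time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages allowed so far in this window */
	int missed;	/* messages suppressed in this window */
};

/* Returns 1 if a message may be printed at time `now', else 0. */
static int bh_ratelimit_ok(struct bh_ratelimit *rs, long now)
{
	assert(rs->interval > 0 && rs->burst > 0);

	if (now - rs->begin >= rs->interval) {
		/* New window: reset the counters. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return 1;
	}
	rs->missed++;
	return 0;
}
```

With one such state per device (or per whatever granularity is
chosen), a flood of errors from one source would not use up the
message budget of another, which is the effect a shared
printk_ratelimit() state cannot give.
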
> + if (unlikely (test_bit(BIO_QUIET,&bio->bi_flags)))
> + set_bit(BH_Quiet, &bh->b_state);
And the above (which has coding-style errors and has apparently not been
checkpatched) should use set_buffer_quiet().
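
For readers following along, the overall flow the patch is after
(request -> bio -> buffer_head) can be sketched in userspace as
below. The model_* structs, functions and flag values are hypothetical
stand-ins invented for this illustration; the real code would use the
kernel's own types and the generated set_buffer_quiet() helper.

```c
#include <assert.h>

/* Hypothetical stand-ins; the flag values are illustrative only. */
#define MODEL_REQ_QUIET	(1u << 0)
#define MODEL_BIO_QUIET	(1u << 0)
#define MODEL_BH_QUIET	(1u << 0)

struct model_request { unsigned int flags; };
struct model_bio     { unsigned int flags; };
struct model_bh      { unsigned int state; };

/* Request completion: hand the quiet hint from the request to its bio. */
static void model_complete_request(const struct model_request *rq,
				   struct model_bio *bio)
{
	assert(rq && bio);
	if (rq->flags & MODEL_REQ_QUIET)
		bio->flags |= MODEL_BIO_QUIET;
}

/* Bio completion: hand the quiet hint from the bio to the buffer. */
static void model_complete_bio(const struct model_bio *bio,
			       struct model_bh *bh)
{
	assert(bio && bh);
	if (bio->flags & MODEL_BIO_QUIET)
		bh->state |= MODEL_BH_QUIET;
}

/* The buffer layer consults only its own bit when deciding to print. */
static int model_buffer_quiet(const struct model_bh *bh)
{
	return (bh->state & MODEL_BH_QUIET) != 0;
}
```
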
> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -35,6 +35,7 @@ enum bh_state_bits {
> BH_Ordered, /* ordered write */
> BH_Eopnotsupp, /* operation not supported (barrier) */
> BH_Unwritten, /* Buffer is allocated on disk but not written */
> + BH_Quiet, /* Buffer Error Prinks to be quiet */
>
> BH_PrivateStart,/* not a state bit, but the first bit available
> * for private allocation by other entities
Add
+ BUFFER_FNS(Quiet, quiet)
around line 123 to generate the helper functions.
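
As an illustration of what that one line buys, here is a cut-down
userspace sketch of the cpp-generated helper pattern. It assumes plain
non-atomic bit operations and a hypothetical struct demo_bh with
DBH_* bit names; the real BUFFER_FNS() in buffer_head.h operates on
struct buffer_head and uses the kernel's bitops, so treat this only as
a picture of the naming scheme.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for struct buffer_head. */
struct demo_bh {
	unsigned long b_state;
};

enum demo_bh_state_bits {
	DBH_Unwritten,
	DBH_Quiet,
};

/*
 * One macro invocation generates the set/clear/test trio.
 * Plain (non-atomic) bit operations are used here for brevity.
 */
#define DEMO_BUFFER_FNS(bit, name)					\
static inline void set_buffer_##name(struct demo_bh *bh)		\
{									\
	bh->b_state |= 1UL << DBH_##bit;				\
}									\
static inline void clear_buffer_##name(struct demo_bh *bh)		\
{									\
	bh->b_state &= ~(1UL << DBH_##bit);				\
}									\
static inline int buffer_##name(const struct demo_bh *bh)		\
{									\
	return (int)((bh->b_state >> DBH_##bit) & 1UL);		\
}

DEMO_BUFFER_FNS(Unwritten, unwritten)
DEMO_BUFFER_FNS(Quiet, quiet)

/* Example caller: mark a buffer quiet and confirm the helper sees it. */
static int demo_mark_quiet(struct demo_bh *bh)
{
	set_buffer_quiet(bh);
	assert(buffer_quiet(bh));
	return buffer_quiet(bh);
}
```

Callers then write buffer_quiet(bh) and set_buffer_quiet(bh) instead
of open-coding test_bit()/set_bit() on bh->b_state.
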