Re: [PATCH] Give kjournald a IOPRIO_CLASS_RT io priority

From: Andrew Morton
Date: Thu Oct 02 2008 - 17:38:58 EST


On Thu, 2 Oct 2008 21:22:23 +0200
Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:

> > @@ -131,6 +132,17 @@ static int kjournald(void *arg)
> > journal->j_commit_interval / HZ);
> >
> > /*
> > + * kjournald is the process on which most other processes depend
> > + * for doing the filesystem portion of their IO. As such, there exists
> > + * the equivalent of a priority inversion situation, where kjournald
> > + * would get less priority than it should.
> > + *
> > + * For this reason we set to "medium real time priority", which is higher
> > + * than regular tasks, but not infinitely powerful.
> > + */
> > + set_task_ioprio(current, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 0));
> > +
> > + /*
> > * And now, wait forever for commit wakeup events.
> > */
> > spin_lock(&journal->j_state_lock);
> > diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h
> > index f98a656..76dad48 100644
> > --- a/include/linux/ioprio.h
> > +++ b/include/linux/ioprio.h
> > @@ -86,4 +86,6 @@ static inline int task_nice_ioclass(struct task_struct *task)
> > */
> > extern int ioprio_best(unsigned short aprio, unsigned short bprio);
> >
> > +extern int set_task_ioprio(struct task_struct *task, int ioprio);
> > +
> > #endif
> > --
> > 1.5.5.1
>
> Can we agree on this patch?

This change will cause _all_ kjournald writeout to have elevated
priority. The majority of that writeout (in data=ordered mode) is file
data, which we didn't intend to change.

The risk here is that this will *worsen* latency for plain old read(),
because now kjournald writeout will be favoured.

There is in fact a good argument for _reducing_ kjournald's IO
priority, not increasing it!

A better approach might be to mark the relevant buffers/bios as needing
higher priority at submit_bh() time (if that's possible). At least
that way we don't accidentally elevate the priority of the bulk data.
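
To illustrate the shape of that idea only: below is a small userspace
model, not kernel code. The IOPRIO_* macros are meant to mirror the
encoding in include/linux/ioprio.h (class in the bits above
IOPRIO_CLASS_SHIFT, level below). Everything prefixed model_ is
hypothetical and exists only for this sketch, including the
is_journal_metadata flag. Whether submit_bh() can actually carry a
per-buffer priority is exactly the open question above.

```c
/* Userspace model only: sketches "priority per buffer, not per task". */
#include <stdbool.h>

/* Intended to mirror the encoding in include/linux/ioprio.h. */
#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_PRIO_VALUE(class, data) \
	(((class) << IOPRIO_CLASS_SHIFT) | (data))
#define IOPRIO_PRIO_CLASS(mask) ((mask) >> IOPRIO_CLASS_SHIFT)
#define IOPRIO_PRIO_DATA(mask) \
	((mask) & ((1 << IOPRIO_CLASS_SHIFT) - 1))

enum {
	IOPRIO_CLASS_NONE,
	IOPRIO_CLASS_RT,
	IOPRIO_CLASS_BE,
	IOPRIO_CLASS_IDLE,
};

/* Hypothetical stand-in for a buffer queued by the journal thread. */
struct model_buffer {
	bool is_journal_metadata;	/* assumption: caller can tell */
	int ioprio;			/* priority the IO is tagged with */
};

/*
 * Hypothetical helper: journal metadata gets the elevated priority,
 * bulk file data keeps whatever priority the task already had.
 */
static int model_pick_ioprio(const struct model_buffer *b, int task_ioprio)
{
	if (b->is_journal_metadata)
		return IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 0);
	return task_ioprio;
}

/* Hypothetical submit path: tag the buffer at submission time. */
static void model_submit(struct model_buffer *b, int task_ioprio)
{
	b->ioprio = model_pick_ioprio(b, task_ioprio);
}
```

With a task priority of IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 4), a
metadata buffer passed to model_submit() ends up at best-effort level
0, while an ordered-mode data buffer stays at level 4. The patch above
instead raises both.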


It's a bit of a hack, sorry :(