Re: CONFIG_PREEMPT and server workloads

From: Takashi Iwai
Date: Fri Mar 19 2004 - 06:39:14 EST


At Thu, 18 Mar 2004 14:48:07 -0500,
Chris Mason wrote:
>
> On Thu, 2004-03-18 at 14:29, Andrew Morton wrote:
>
> > > yep, i see a similar problem also in reiserfs's do_journal_end().
> > > it's in lock_kernel().
> >
> > I have a scheduling point in journal_end() in 2.4. But I added bugs to
> > reiserfs a couple of times doing this - it's pretty delicate. Beat up on
> > Chris ;)
>
> ;-) Not sure if Takashi is talking about -suse or -mm, the data=ordered
> patches change things around. He sent me suggestions for the
> data=ordered latencies already, but it shouldn't be against the BKL
> there, since I drop it before calling write_ordered_buffers().

i have tested only suse kernels recently. i'll try the mm kernel again later.

ok, let me explain some nasty points i found through the disk i/o load
tests:

- in the loop in do_journal_end(). this happens periodically in
pdflush.

  /* first data block is j_start + 1, so add one to cur_write_start wherever you use it */
  cur_write_start = SB_JOURNAL(p_s_sb)->j_start ;
  cn = SB_JOURNAL(p_s_sb)->j_first ;
  jindex = 1 ; /* start at one so we don't get the desc again */
  while(cn) {
    clear_bit(BH_JNew, &(cn->bh->b_state)) ;
    ....
    next = cn->next ;
    free_cnode(p_s_sb, cn) ;
    cn = next ;
  }
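
for illustration, the usual remedy for a long list walk like the one above
is a periodic scheduling point. the following is only a user-space sketch of
that pattern, not the reiserfs code: struct cnode, build_list() and
free_list_with_resched() are hypothetical names, the batch size is arbitrary,
and sched_yield() merely stands in for the kernel's cond_resched(). whether
it is safe to reschedule at that point in do_journal_end() depends on the
locks held there, which this sketch does not model.

```c
#include <sched.h>   /* sched_yield(): user-space stand-in for cond_resched() */
#include <stddef.h>
#include <stdlib.h>

/* hypothetical stand-in for the journal's cnode; only the link is modeled */
struct cnode {
    struct cnode *next;
};

/* build a singly linked list of n nodes; returns NULL if n == 0 or on
 * allocation failure (any partially built list is freed first) */
static struct cnode *build_list(size_t n)
{
    struct cnode *head = NULL;

    while (n--) {
        struct cnode *cn = malloc(sizeof(*cn));
        if (!cn) {
            while (head) {
                struct cnode *next = head->next;
                free(head);
                head = next;
            }
            return NULL;
        }
        cn->next = head;
        head = cn;
    }
    return head;
}

/* free every node, offering the cpu to other tasks after each `batch`
 * nodes (batch == 0 disables yielding). stores the number of freed nodes
 * in *freed if it is non-NULL and returns the number of yield points hit. */
static size_t free_list_with_resched(struct cnode *cn, size_t batch,
                                     size_t *freed)
{
    size_t count = 0;
    size_t yields = 0;

    while (cn) {
        struct cnode *next = cn->next;   /* save the link before freeing */

        free(cn);
        cn = next;
        count++;

        if (batch && count % batch == 0) {
            sched_yield();               /* the scheduling point */
            yields++;
        }
    }
    if (freed)
        *freed = count;
    return yields;
}
```

e.g. freeing a 1000-node list with batch = 64 passes through 15 yield
points, so no single uninterrupted stretch covers more than 64 nodes.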


- in write_ordered_buffers().

i still haven't figured out where. we already have cond_resched()
checks in the loops. this one is triggered when i write bulk data
in parallel (1GB write with 20 threads at the same time), resulting
in latencies of up to 2ms.

a typical stack trace looks like this:

T=36.569 diff=3.64275
comm=reiserfs/0
rtc_interrupt (+cd/e0)
handle_IRQ_event (+2f/60)
do_IRQ (+76/170)
common_interrupt (+18/20)
kfree (+36/50)
reiserfs_free_jh (+34/60)
write_ordered_buffers (+11f/1d0)
flush_commit_list (+3e6/480)
flush_async_commits (+5d/70)
worker_thread (+164/1d0)
flush_async_commits (+0/70)
default_wake_function (+0/10)
default_wake_function (+0/10)
worker_thread (+0/1d0)
kthread (+77/9f)
kthread (+0/9f)
kernel_thread_helper (+5/10)


Takashi