Re: [PATCH] x86: enable preemption in delay

From: Thomas Gleixner
Date: Sun May 25 2008 - 08:45:53 EST


On Sat, 24 May 2008, Steven Rostedt wrote:
> The RT team has been searching for a nasty latency. This latency shows
> up out of the blue and has been seen to be as big as 5ms!
>
> @@ -44,12 +44,33 @@ static void delay_loop(unsigned long loo
> static void delay_tsc(unsigned long loops)
> {
> unsigned long bclock, now;
> + int cpu;
>
> - preempt_disable(); /* TSC's are per-cpu */
> + preempt_disable();
> + cpu = smp_processor_id();
> rdtscl(bclock);
> do {
> rep_nop();
> rdtscl(now);
> + /* Allow RT tasks to run */
> + preempt_enable();
> + preempt_disable();
> + /*
> + * It is possible that we moved to another CPU,
> + * and since TSC's are per-cpu we need to
> + * calculate that. The delay must guarantee that
> + * we wait "at least" the amount of time. Being
> + * moved to another CPU could make the wait longer
> + * but we just need to make sure we waited long
> + * enough. Rebalance the counter for this CPU.
> + */
> + if (unlikely(cpu != smp_processor_id())) {

Eeek, once you migrated you do this all the time. you need to update
cpu here.

> + if ((now-bclock) >= loops)
> + break;

Also this is really dangerous with unsynchronized TSCs. You might get
migrated and return immediately because the TSC on the other CPU is
far ahead.

What you really want is something like the patch below, but we should
reuse the sched_clock_cpu() thingy to make that simpler. Looking into
that right now.

Thanks,
tglx

diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
index 4535e6d..66a3c32 100644
--- a/arch/x86/lib/delay_32.c
+++ b/arch/x86/lib/delay_32.c
@@ -40,17 +40,51 @@ static void delay_loop(unsigned long loops)
:"0" (loops));
}

+/*
+ * 5 usec on a 1GHZ machine. Not necessarily correct, but not too long
+ * either.
+ */
+#define TSC_MIGRATE_COUNT 5000
+
/* TSC based delay: */
static void delay_tsc(unsigned long loops)
{
unsigned long bclock, now;
+ int cpu;

- preempt_disable(); /* TSC's are per-cpu */
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(bclock);
do {
rep_nop();
- rdtscl(now);
- } while ((now-bclock) < loops);
+
+ /* Allow RT tasks to run */
+ preempt_enable();
+ preempt_disable();
+
+ /*
+ * It is possible that we moved to another CPU, and
+ * since TSC's are per-cpu we need to calculate
+ * that. The delay must guarantee that we wait "at
+ * least" the amount of time. Being moved to another
+ * CPU could make the wait longer but we just need to
+ * make sure we waited long enough. Rebalance the
+ * counter for this CPU.
+ */
+ if (unlikely(cpu != smp_processor_id())) {
+ if (loops <= TSC_MIGRATE_COUNT)
+ break;
+ cpu = smp_processor_id();
+ rdtscl(bclock);
+ loops -= TSC_MIGRATE_COUNT;
+ } else {
+ rdtscl(now);
+ if ((now - bclock) >= loops)
+ break;
+ loops -= (now - bclock);
+ bclock = now;
+ }
+ } while (loops > 0);
preempt_enable();
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/