Re: CONFIG_X86_PM_TIMER is slow

From: Willy Tarreau
Date: Fri Nov 12 2004 - 02:07:32 EST


On Thu, Nov 11, 2004 at 10:16:27PM -0800, dean gaudet wrote:
> On Fri, 12 Nov 2004, Willy Tarreau wrote:
>
> > On Thu, Nov 11, 2004 at 09:52:27PM -0800, dean gaudet wrote:
> > > when using CONFIG_X86_PM_TIMER i'm finding that gettimeofday() calls take
> > > 2.8us on a p-m 1.4GHz box... which is an order of magnitude slower than
> > > TSC-based solutions.
> > >
> > > on one workload i'm seeing a 7% perf improvement by booting "acpi=off" to
> > > force it to use tsc instead of the PM timer... (the workload calls
> > > gettimeofday too frequently, but i can't change that).
> >
> > I did not test, this might be interesting.
> > In fact, what would be very good would be sort of a new select/poll/epoll
> > syscalls with an additional argument, which would point to a structure
> > that the syscall would fill in return with the time of day. This would
> > greatly reduce the number of calls to gettimeofday() in some programs,
> > and make use of the time that was used by the syscall itself.
> >
> > For example, if I could call it like this, it would be really cool :
> >
> > ret = select_absdate(&in, &out, &excp, &date_timeout, &return_date);
>
> but the overhead isn't the syscall :) it's the pm timer itself... the
> program below reads the pm timer the same way the kernel does and you can
> see it takes 2.5us to read it...

Sorry, what I meant is that if select() filled in the structure itself,
it would avoid an additional call (syscall + hardware access). And I'm
certain that select() accesses the same information at some point.

> # cc -O2 -g -Wall readport.c -o readport
> # grep PM-Timer /var/log/dmesg
> ACPI: PM-Timer IO Port: 0xd808
> # time ./readport 0xd808 1000000
> ./readport 0xd808 1000000 2.54s user 0.00s system 100% cpu 2.526 total
>
> the gettimeofday overhead is dominated by this i/o...

Indeed, that is a lot!

>     /* It has been reported that because of various broken
>      * chipsets (ICH4, PIIX4 and PIIX4E) where the ACPI PM time
>      * source is not latched, so you must read it multiple
>      * times to insure a safe value is read.
>      */
>     do {
>         v1 = inl(pmtmr_ioport);
>         v2 = inl(pmtmr_ioport);
>         v3 = inl(pmtmr_ioport);
>     } while ((v1 > v2 && v1 < v3) || (v2 > v3 && v2 < v1)
>              || (v3 > v1 && v3 < v2));
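
For anyone who wants to play with this check outside the kernel, the loop
can be exercised against a fake counter. This is a minimal sketch, not the
kernel code: sample_fn, read_counter_checked() and the fake_timer helper
are names made up for the illustration, and the final 24-bit mask is an
assumption about the counter width.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed counter width: 24 bits. */
#define PMTMR_MASK 0xffffffu

/* Hypothetical reader type: returns one raw sample of the counter. */
typedef uint32_t (*sample_fn)(void *ctx);

/* Same retry condition as the quoted loop: read three samples and start
 * over whenever their ordering is one that neither normal counting nor
 * a single wraparound can explain. */
static uint32_t read_counter_checked(sample_fn read, void *ctx)
{
    uint32_t v1, v2, v3;

    do {
        v1 = read(ctx);
        v2 = read(ctx);
        v3 = read(ctx);
    } while ((v1 > v2 && v1 < v3) || (v2 > v3 && v2 < v1)
             || (v3 > v1 && v3 < v2));

    return v2 & PMTMR_MASK;
}

/* Stand-in for the hardware: replays a fixed list of samples. */
struct fake_timer {
    const uint32_t *samples;
    size_t n;
    size_t pos;
};

static uint32_t fake_read(void *ctx)
{
    struct fake_timer *t = ctx;
    uint32_t v = t->samples[t->pos % t->n];

    t->pos++;
    return v;
}
```

Feeding it the samples 100, 50, 102, 103, 104, 105 makes the first triple
fail the check (the middle read is bogus), so the function retries and
returns the middle value of the second triple, at the cost of six reads.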

Just a thought: have you checked whether it is the recovery time after a
read, or the read itself, that takes the time? I mean, perhaps a single
read would take, say, 50 ns, while two back-to-back reads take 2 us. If
that is the case, a separate function doing only one read for non-broken
chipsets would be better, because there might be no particular reason to
check the counter that often.
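
Something along these lines could separate the two cases. It is an
untested sketch: ns_per_read() and counting_read() are hypothetical names,
and counting_read() is only a stand-in so the harness runs without the
hardware. On a real box you would presumably pass a small wrapper around
inl() on the PM timer port, after ioperm(), like your readport does.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical reader type: returns one raw sample of the counter. */
typedef uint32_t (*sample_fn)(void *ctx);

static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Average cost in ns of one read when reads are issued in bursts of
 * `burst` back-to-back reads, with `gap_spins` idle loop iterations
 * between bursts. Only the bursts are timed, so the result includes
 * the clock_gettime() overhead but not the idle gap. */
static double ns_per_read(sample_fn read, void *ctx, unsigned long bursts,
                          unsigned burst, unsigned long gap_spins)
{
    volatile uint32_t sink = 0;
    volatile unsigned long spin;
    int64_t total = 0;

    for (unsigned long i = 0; i < bursts; i++) {
        int64_t t0 = now_ns();

        for (unsigned j = 0; j < burst; j++)
            sink = read(ctx);
        total += now_ns() - t0;

        /* Idle gap, giving the device time to recover if it needs any. */
        for (spin = 0; spin < gap_spins; spin++)
            ;
    }
    (void)sink;
    return (double)total / ((double)bursts * (double)burst);
}

/* Stand-in reader: counts calls instead of touching any I/O port. */
static uint32_t counting_read(void *ctx)
{
    unsigned long *calls = ctx;

    return (uint32_t)++*calls;
}
```

Comparing burst=1 with a large gap against burst=3 with no gap would show
whether the per-read cost grows when the reads come back to back.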

Another thought: is it possible to memory-map this timer, to avoid the
slow inl() on x86?

Regards,
Willy
