Re: [PATCH 0/6][RFC] Rework vsyscall to avoid truncation/rounding issue in timekeeping core

From: John Stultz
Date: Tue Sep 18 2012 - 14:31:39 EST


On 09/18/2012 11:02 AM, Richard Cochran wrote:
> On Mon, Sep 17, 2012 at 05:20:41PM -0700, John Stultz wrote:
>> On 09/17/2012 04:49 PM, Andy Lutomirski wrote:
>>> 2. There's nothing vsyscall-specific about the code in
>>> vclock_gettime.c. In fact, the VVAR macro should work just fine in
>>> kernel code. If you moved all this code into a header, then in-kernel
>>> uses could use it, and maybe even other arches could use it. Last
>>> time I checked, it seemed like vclock_gettime was considerably faster
>>> than whatever the in-kernel equivalent did.
>> I like the idea of unifying the implementations, but I'd want to
>> know more about why vclock_gettime was faster than the in-kernel
>> getnstimeofday(), since it might be due to the more limited locking
>> (we only update vsyscall data under the vsyscall lock, whereas the
>> timekeeper lock is held for the entire execution of
>> update_wall_time()), or because some of the optimizations in the
>> vsyscall code are focused on providing timespecs to userland, whereas
>> in-kernel we also have to provide ktime_ts.
> Is there a valid technical reason why each arch has its own vdso
> implementation?
I believe it's mostly historical, but on some architectures that history has become an established ABI, which makes the reason technical as well.

powerpc, for example, exports timekeeping data at a specific address, and the code logic to use that data is in userland libraries, outside of kernel control. ia64 uses a fsyscall method, which is (to my understanding) a mode that allows limited access to kernel data from userland, but restricts what instructions can be used, requiring it to be hand-written in asm.

Now, x86_64 too had its own magic vsyscall address that was hard coded, but Andy did some very cool work allowing that to bounce to the normal syscall for compatibility, allowing the nicer vdso method to be used. It may be that such a vdso method could be introduced and migrated to on these other arches, but we'd still have to preserve the existing ABI as well (and in cases like ppc, that preservation would be just as complicated as it is now).

> If not, I would suggest that the first step would be to refactor these
> into one C-language header. If this can be shared with kernel code,
> then all the better.
>
> It would make it a lot easier to fix the leap second thing, too.
Indeed, it would be nice. Tweaking the ia64 fsyscall isn't anything I look forward to. :)

But such heavy lifting will likely need to be done by arch maintainers. That's why with this patchset I preserve the existing method, but make it clear it's deprecated and allow arches that don't need the old method to avoid the extra overhead caused by the additional rounding fix. Then those arches can migrate when they can, rather than having to block change on everyone conforming to a new standard.
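
As a rough illustration of the kind of truncation this series is concerned with, here is a minimal sketch. It assumes the usual (cycles * mult) >> shift conversion from clocksource cycles to nanoseconds. The function names, the SHIFT value and the inputs are hypothetical and chosen only for the example; this is not the kernel's actual code.

```c
#include <stdint.h>

/* Hypothetical shift value, picked only to keep the arithmetic readable. */
#define SHIFT 8u

/*
 * Full-precision path: the base time is kept in shifted nanoseconds
 * (ns << SHIFT), the cycle delta is added in the same units, and the
 * result is shifted down exactly once.
 */
static uint64_t ns_full_precision(uint64_t base_shifted_ns,
                                  uint64_t cycles, uint32_t mult)
{
    return (base_shifted_ns + cycles * mult) >> SHIFT;
}

/*
 * Truncating path: the base is cut down to whole nanoseconds before it
 * is handed out, so its sub-nanosecond fraction is lost before the
 * cycle delta is added.
 */
static uint64_t ns_truncated_base(uint64_t base_shifted_ns,
                                  uint64_t cycles, uint32_t mult)
{
    uint64_t base_ns = base_shifted_ns >> SHIFT;

    return base_ns + ((cycles * mult) >> SHIFT);
}
```

With a base of 100 ns plus 200/256 ns and a delta of 100/256 ns, the first function returns 101 while the second returns 100, so two readers using the two paths can disagree by a nanosecond. When the base has no fractional part, both paths agree.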

thanks
-john
