Re: [PATCH 3/3] X86: Add a thread cpu time implementation to vDSO
From: Andy Lutomirski
Date: Wed Dec 10 2014 - 18:06:57 EST
On Wed, Dec 10, 2014 at 2:56 PM, Shaohua Li <shli@xxxxxx> wrote:
> On Wed, Dec 10, 2014 at 02:13:23PM -0800, Andy Lutomirski wrote:
>> On Wed, Dec 10, 2014 at 1:57 PM, Shaohua Li <shli@xxxxxx> wrote:
>> > On Wed, Dec 10, 2014 at 11:10:52AM -0800, Andy Lutomirski wrote:
>> >> On Sun, Dec 7, 2014 at 7:03 PM, Shaohua Li <shli@xxxxxx> wrote:
>> >> > This primarily speeds up clock_gettime(CLOCK_THREAD_CPUTIME_ID, ..). We
>> >> > use the following method to compute the thread cpu time:
>> >>
>> >> I like the idea, and I like making this type of profiling fast. I
>> >> don't love the implementation because it's an information leak (maybe
>> >> we don't care) and it's ugly.
>> >>
>> >> The info leak could be fixed completely by having a per-process array
>> >> instead of a global array. That's currently tricky without wasting
>> >> memory, but it could be created on demand if we wanted to do that,
>> >> once my vvar .fault patches go in (assuming they do -- I need to ping
>> >> the linux-mm people).
>> >
>> > That info leak really doesn't matter.
>>
>> Why not?
>
> Of course I can't be completely sure, but how could this info be used
> in an attack?
It may leak interesting timing info, even from cpus that are outside
your affinity mask / cpuset. I don't know how much anyone actually
cares.
>
>> > But we need the global array
>> > anyway. The context switch detection should be per-cpu data and should
>> > be accessible from remote cpus.
>>
>> Right, but the whole array could be per process instead of global.
>>
>> I'm not saying I'm sure that would be better, but I think it's worth
>> considering.
>
> Right, it's possible to make it per process. As you said, this will waste a
> lot of memory, and you can't even do it on demand, as the context switch
> path will write the count to the per-process/per-thread vvar. Or you can
> maintain the count in the kernel and let the .fault handler copy the count
> to the vvar page (if the vvar page is absent). But this still wastes memory
> if applications use the vdso. I'm also wondering how you handle a page fault
> in the context switch path if you don't pin the vdso pages.
>
You need to pin them, but at least you don't need to create them at
all until they're needed the first time.
The totally per-thread approach has all kinds of nice properties,
including allowing the whole thing to work without a loop, at least on
64-bit machines (if you detect that you had a context switch, just
return the most recent sum_exec_runtime).
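
To make the loop-free idea concrete, here is a minimal userspace sketch. It
is not kernel or vDSO code and not the patch under discussion. The struct
layout, the field names other than sum_exec_runtime, and the fake clock are
all assumptions made for illustration. The reader snapshots a per-thread
switch counter, computes the time, and if the counter moved it simply
returns the freshly updated sum_exec_runtime instead of retrying.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-thread data page; layout and names are illustrative. */
struct thread_vvar {
    _Atomic uint64_t switch_count;     /* bumped on every switch-in */
    _Atomic uint64_t sum_exec_runtime; /* CPU time up to last switch-in, ns */
    _Atomic uint64_t exec_start;       /* clock value at last switch-in, ns */
};

/* The clock is passed in so the sketch can be driven by a fake clock. */
typedef uint64_t (*clock_fn)(void *ctx);

static uint64_t thread_cputime_ns(struct thread_vvar *v, clock_fn now, void *ctx)
{
    uint64_t before = atomic_load(&v->switch_count);
    uint64_t sum    = atomic_load(&v->sum_exec_runtime);
    uint64_t start  = atomic_load(&v->exec_start);
    uint64_t t      = now(ctx);

    /* No switch while we were reading: the snapshot is consistent. */
    if (atomic_load(&v->switch_count) == before)
        return sum + (t - start);

    /*
     * We were switched out and back in. The data is per-thread, so the
     * writer just refreshed sum_exec_runtime for us; one aligned 64-bit
     * load is enough and no retry loop is needed.
     */
    return atomic_load(&v->sum_exec_runtime);
}

/* Fake clock that can simulate a context switch in the middle of a read. */
struct fake_ctx {
    struct thread_vvar *v;
    uint64_t now_ns;
    int switch_during_read;
};

static uint64_t fake_clock(void *p)
{
    struct fake_ctx *c = p;

    if (c->switch_during_read) {
        /* Thread ran until now_ns, was descheduled, resumes 500 ns later. */
        uint64_t ran = c->now_ns - atomic_load(&c->v->exec_start);
        atomic_fetch_add(&c->v->sum_exec_runtime, ran);
        c->now_ns += 500;
        atomic_store(&c->v->exec_start, c->now_ns);
        atomic_fetch_add(&c->v->switch_count, 1);
    }
    return c->now_ns;
}

/* Exercises both paths: the undisturbed read and the interrupted read. */
static void self_check(void)
{
    struct thread_vvar v;
    atomic_init(&v.switch_count, 1);
    atomic_init(&v.sum_exec_runtime, 1000);
    atomic_init(&v.exec_start, 200);

    struct fake_ctx quiet = { &v, 250, 0 };
    assert(thread_cputime_ns(&v, fake_clock, &quiet) == 1050);

    struct fake_ctx noisy = { &v, 250, 1 };
    assert(thread_cputime_ns(&v, fake_clock, &noisy) == 1050);
}
```

In the interrupted case the result excludes only the few instructions run
since the thread was switched back in, which is why a single pass suffices.
On a 32-bit machine the 64-bit load could tear, so this shortcut would not
carry over as written.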
Anyway, there's no need to achieve perfection here -- we can always
reimplement this if whatever implementation happens first turns out to
be problematic.
--Andy
> Thanks,
> Shaohua
--
Andy Lutomirski
AMA Capital Management, LLC