RE: [patch 1/5] x86, bts: detect size of DS fields
From: Metzger, Markus T
Date:  Wed Mar 18 2009 - 14:09:25 EST
>-----Original Message-----
>From: Metzger, Markus T
>Sent: Tuesday, March 17, 2009 1:58 PM
>To: Ingo Molnar
Ingo, Stephane,
>>That's not really a good solution - GFP_ATOMIC is not a reliable
>>form of allocation.
>
>In that case, I would need to double the interface. So far, I used
>task == NULL to indicate tracing on the current cpu. I could turn
>this into:
>  ds_request_bts_task(struct task_struct *task, ....) and
>  ds_request_bts_cpu(int cpu, ....)
>
>This way, I could do the allocation using GFP_KERNEL and then
>do the wrmsrl() to enable tracing using smp_call_function().
I added the above _task and _cpu variants to ds_request.
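To illustrate the split, here is a rough userspace sketch of the two-phase pattern: allocate first in a context that may sleep, then do the non-sleeping enable step on the target cpu. This is not the ds.c code. All names (request_bts_cpu, request_bts_task, run_on_cpu, bts_enabled, the int task id) are hypothetical stand-ins, calloc() stands in for a GFP_KERNEL allocation, and run_on_cpu() merely calls the function directly where the kernel would use smp_call_function_single().

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-cpu "tracing enabled" MSR state. */
static int bts_enabled[NR_CPUS];

struct bts_tracer {
	int cpu;       /* traced cpu, or -1 for a task tracer */
	int task_id;   /* traced task, or -1 for a cpu tracer */
	void *buffer;  /* trace buffer, allocated before any cross-call */
	size_t size;
};

/* Phase 1: may sleep. calloc() stands in for a GFP_KERNEL allocation. */
static struct bts_tracer *tracer_alloc(size_t size)
{
	struct bts_tracer *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->buffer = calloc(1, size);
	if (!t->buffer) {
		free(t);
		return NULL;
	}
	t->size = size;
	t->cpu = -1;
	t->task_id = -1;
	return t;
}

/* Phase 2: must not allocate or sleep. Stands in for the wrmsrl(). */
static void bts_enable_on_cpu(void *data)
{
	struct bts_tracer *t = data;

	bts_enabled[t->cpu] = 1;
}

/* Stand-in for a cross-call; here it simply runs func directly. */
static int run_on_cpu(int cpu, void (*func)(void *), void *data)
{
	if (cpu < 0 || cpu >= NR_CPUS)
		return -1;
	func(data);
	return 0;
}

void release_bts(struct bts_tracer *t)
{
	if (!t)
		return;
	if (t->cpu >= 0)
		bts_enabled[t->cpu] = 0;
	free(t->buffer);
	free(t);
}

/* cpu variant: allocate up front, then enable on the target cpu. */
struct bts_tracer *request_bts_cpu(int cpu, size_t size)
{
	struct bts_tracer *t;

	if (cpu < 0 || cpu >= NR_CPUS)
		return NULL;
	t = tracer_alloc(size);
	if (!t)
		return NULL;
	t->cpu = cpu;
	if (run_on_cpu(cpu, bts_enable_on_cpu, t)) {
		t->cpu = -1;
		release_bts(t);
		return NULL;
	}
	return t;
}

/* task variant: only allocates; arming the task is not modelled here. */
struct bts_tracer *request_bts_task(int task_id, size_t size)
{
	struct bts_tracer *t = tracer_alloc(size);

	if (t)
		t->task_id = task_id;
	return t;
}
```

With two entry points, the caller no longer encodes "this cpu" as task == NULL, and the allocation never runs in the atomic part.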
The existing ds_*() functions require interrupts to be enabled. I added
_noirq variants for most of them; these may be called with interrupts
disabled as long as they are called on the cpu they affect.
I added a few WARN_ON(irqs_disabled()) checks, and I have not seen one
trigger yet.
The ds_request_*() functions must be called with interrupts enabled, since
they need to allocate memory.
Stephane, is that OK for perfmon?
I will add a few more selftests to cover the _noirq path, as well.
I should be able to send the patches in a few days.
Meanwhile, I will send the patch to turn GFP_KERNEL into GFP_ATOMIC.
The follow-up patches will undo that change.
>>the other callsites are buggy too:
>>
>>                smp_call_function_single(cpu, bts_trace_start_cpu, NULL, 1);
>>
>>done under the bts_tracer_lock in addition to an atomic IPI
>>context.
>
>That lock synchronizes the on_each_cpu initialization calls with the
>hotplug handler.
>The for_each_online_cpu iteration in smp_call_function_many() may
>race with the smp_call_function_single() when a new cpu arrives or
>departs.
I'm not sure I understood you correctly.
I interpreted your comment to mean that the lock is unnecessary, since we
run with interrupts disabled anyway.
Please let me know if I misunderstood.
>>for_each_online_cpu() done under the mutex would be better i
>>guess, plus you can allocate any memory before you do the SMP
>>cross-call, and pass it to the IPI handler via the data
>>parameter. (NULL in the sequence above)
>
>The memory is allocated by ds_request_bts(). It holds the tracer
>struct returned to the caller. The struct is private to ds.c.
That's almost how it is now.
The ds_*() calls can now be made from any cpu; ds.c will take care
of running on the correct cpu when writing MSRs (I use wrmsr_on_cpu()).
ds_request() still allocates memory, though.
The hw-branch-tracer will now use for_each_online_cpu() loops for
almost everything.
The ftrace framework unfortunately adds the cpu id automatically, so
I still need to do on_each_cpu() to collect the trace.
regards,
markus.