Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM

From: Luis R. Rodriguez
Date: Mon Sep 22 2014 - 12:37:00 EST


On Mon, Sep 8, 2014 at 7:57 PM, Luis R. Rodriguez
<mcgrof@xxxxxxxxxxxxxxxx> wrote:
>> Why do we care about the priority of probing tasks? Does that
>> actually make any meaningful difference? If so, how?
>
> As I noted before -- I have yet to provide clear metrics but at least
> changing both init paths + probe from finit_module() to kthread
> certainly had a measurable time increase, I suspect using
> queue_work(system_unbound_wq, async_probe_work) will make probe
> slower. I'll get to these metrics this week.

The results are in and I'm glad to report my suspicions were incorrect
about kthread() being slower than queue_work(system_unbound_wq), it
actually works faster. Results will likely vary depending on
subsystems but in this particular case the cxgb4 driver was tested
requiring firmware loading and then without requiring firmware loading
and for these two types of driver loading all mechanisms make probe
take just about the same out of time. What was surprising was that
when firmware loading is required the amount of time it takes to run
probe does vary and quite considerably in terms of microseconds. The
discrepancies are by no means terrible... but should be considered if
one is thinking of large systems and if we do wish to optimize things
further and offer equivalent behavior, specially when probing multiple
devices with the same driver. The method used to collect the amount of
time for probe was to use:

ktime_t calltime, delta, rettime;
calltime = ktime_get();
driver_attach();
rettime = ktime_get();
delta = ktime_sub(rettime, calltime);
duration = (unsigned long long) ktime_to_ns(delta) >> 10;

And then print that time of microsecond out right after it finishes,
whether that be through the default kernel synchronous run or the
async runs.

The collection and testing was then done by Santosh. Details of the
collections are at:

https://bugzilla.novell.com/show_bug.cgi?id=877622

The summary:

The driver actually probed 2 cards in the tests so we don't have
results for 1 card, the kernel serially calls probe for each device so
to get the amount of time for one run lets just divide the results by
2. For each strategy there is the requirement of using firmware and a
run where no firmware loading is required. The results for both cards
are:

=====================================================================|
strategy fw (usec) no-fw (usec) |
---------------------------------------------------------------------|
synchronous 48945138 2615126 |
kthread 50132831 2619737 |
queue_work(system_unbound_wq) 49827323 2615262 |
---------------------------------------------------------------------|

For one device then that comes out to:

=====================================================================|
strategy fw (usec) no-fw (usec) |
---------------------------------------------------------------------|
synchronous 24472569 1307563 |
kthread 25066415.5 1309868.5 |
queue_work(system_unbound_wq) 24913661.5 1307631 |
---------------------------------------------------------------------|

Converting that to seconds:

=====================================================================|
strategy fw (s) no-fw (s) |
---------------------------------------------------------------------|
synchronous 24.47 1.31 |
kthread 25.07 1.31 |
queue_work(system_unbound_wq) 24.91 1.31 |
---------------------------------------------------------------------|

Graph friendly versions of the results for probe of 1 device:

Probe with firmware:

http://drvbp1.linux-foundation.org/~mcgrof/images/probe-measurements/probe-cgxb4-firmware.png

Probe without firmware:

http://drvbp1.linux-foundation.org/~mcgrof/images/probe-measurements/probe-cgxb4-no-firmware.png

Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/