Re: SCHED_DEADLINE cpudeadline.{h,c} fixup

From: luca abeni
Date: Tue May 17 2016 - 07:47:09 EST


Hi all,

On Mon, 16 May 2016 18:00:04 +0200
Tommaso Cucinotta <tommaso.cucinotta@xxxxxxxx> wrote:

> Hi,
>
> looking at the SCHED_DEADLINE code, I spotted an opportunity to
> make cpudeadline.c faster: we can skip actual swaps during the
> re-heapify()ication of items after addition/removal. Since such
> ops are done under a domain spinlock, it sounded like an
> interesting thing to try.
[...]
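
If I understand the idea correctly, this is the classic trick of
sifting with a "hole": instead of fully swapping the moving element
with a child (or parent) at every level, keep it aside in a temporary,
shift the winning child (or the parent) one slot, and write the saved
element exactly once at its final position. Here is a minimal sketch
of the technique on a plain max-heap of u64 keys; this is my
reconstruction of the general idea, not the actual patch (which also
has to maintain the cpu/idx back-pointers):

typedef unsigned long long u64;	/* as in the kernel */

static void heapify_down(u64 *h, int size, int idx)
{
	u64 orig = h[idx];	/* the element being pushed down */

	for (;;) {
		int l = 2 * idx + 1, r = 2 * idx + 2, largest = idx;
		u64 largest_key = orig;

		if (l < size && h[l] > largest_key) {
			largest = l;
			largest_key = h[l];
		}
		if (r < size && h[r] > largest_key) {
			largest = r;
			largest_key = h[r];
		}
		if (largest == idx)
			break;
		/* shift the child up instead of a full 3-write swap */
		h[idx] = h[largest];
		idx = largest;
	}
	h[idx] = orig;		/* single final write */
}

Each level then costs one write instead of the three a swap needs
(and, in cpudeadline.c, each avoided swap also spares updates to the
idx back-pointers), which is presumably where the speed-up comes from.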

I do not know the cpudeadline code very well, but every "dl = 0"
looks like a bug to me... So, I think this hunk actually fixes a real
bug:
[...]
- cp->elements[cp->size - 1].dl = 0;
- cp->elements[cp->size - 1].cpu = cpu;
- cp->elements[cpu].idx = cp->size - 1;
- cpudl_change_key(cp, cp->size - 1, dl);
- cpumask_clear_cpu(cpu, cp->free_cpus);
+ cpumask_set_cpu(cpu, cp->free_cpus);
} else {
- cpudl_change_key(cp, old_idx, dl);
+ if (old_idx == IDX_INVALID) {
+ int sz1 = cp->size++;
+ cp->elements[sz1].dl = dl;
[...]
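
To see why, note how the old code inserts a new CPU: it first writes a
placeholder deadline of 0 and then calls cpudl_change_key(), which, as
far as I can see, compares the real deadline against that placeholder
with dl_time_before() to decide in which direction to sift. With a
deadline that has wrapped around (i.e., one that is negative as an
s64), that comparison goes the wrong way. A small userspace
demonstration (dl_time_before() is copied from the kernel; the rest is
mine):

#include <stdio.h>
#include <stdbool.h>

typedef unsigned long long u64;
typedef long long s64;

/* same definition as the kernel's dl_time_before() */
static inline bool dl_time_before(u64 a, u64 b)
{
	return (s64)(a - b) < 0;
}

int main(void)
{
	u64 placeholder = 0;	/* the "dl = 0" the old code writes */
	u64 dl = 1ULL << 63;	/* wrapped deadline: negative as s64 */

	/* the new key looks *earlier* than the 0 placeholder... */
	printf("%d\n", dl_time_before(dl, placeholder));	/* 1 */

	/* ...even though it is *later* than any small deadline: */
	printf("%d\n", dl_time_before(1000ULL, dl));		/* 1 */
	return 0;
}

So the insertion is treated as a key decrease and the element is
sifted down, which is a no-op for a leaf, when it should instead
bubble up towards the root. That matches the misbehaviour Tommaso
describes below; writing the real "dl" directly avoids the bogus
comparison altogether.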

Maybe the "cp->elements[cp->size - 1].dl = 0" ->
"cp->elements[cp->size - 1].dl = 0" change can be split in a separate
patch, which is a bugfix (and IMHO uncontroversial)?


Thanks,
Luca

>
> Indeed, I've got a speed-up of up to ~6% for the cpudl_set() calls
> on randomly generated workloads of 1K, 10K and 100K random insertions
> and deletions (75% cpudl_set() calls with is_valid=1 and 25% with
> is_valid=0), with randomly generated cpu IDs, for 2, 4, ..., 256 CPUs.
> Details in the attached plot.
>
> The attached patch does this, along with a minimum of rework of
> cpudeadline.c internals, and a final clean-up of the cpudeadline.h
> interface (second patch).
>
> The measurements have been made on an Intel Core2 Duo with the CPU
> frequency fixed at max, by letting cpudeadline.c be initialized with
> various numbers of CPUs, then making many calls sequentially, taking
> the rdtsc between calls, then dumping all the numbers through
> printk(); I'm plotting the average number of clock ticks between
> consecutive calls. [ I can share the benchmarking code as well if needed ]
>
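
A typical rdtsc-based harness for this kind of measurement looks
roughly like the sketch below; this is just my guess at its shape, not
Tommaso's actual code, and op_under_test() is a hypothetical stand-in
for one cpudl_set() call:

#include <stdio.h>

/* read the x86 time-stamp counter */
static inline unsigned long long rdtsc(void)
{
	unsigned int lo, hi;

	asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
	return ((unsigned long long)hi << 32) | lo;
}

int main(void)
{
	enum { CALLS = 100000 };
	static unsigned long long ts[CALLS + 1];
	unsigned long long sum = 0;
	int i;

	for (i = 0; i < CALLS; i++) {
		ts[i] = rdtsc();
		/* op_under_test();  one insertion/deletion here */
	}
	ts[CALLS] = rdtsc();

	for (i = 0; i < CALLS; i++)
		sum += ts[i + 1] - ts[i];
	printf("avg ticks/call: %llu\n", sum / CALLS);
	return 0;
}
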
> Also, this fixes what seems to me to be a bug, which I noticed by
> comparing the whole heap contents as handled by the modified code
> vs the original one, insertion by insertion. The problem is in this
> code:
>
> cp->elements[cp->size - 1].dl = 0;
> cp->elements[cp->size - 1].cpu = cpu;
> cp->elements[cpu].idx = cp->size - 1;
> mycpudl_change_key(cp, cp->size - 1, dl);
>
> when fed an absolute deadline that is so large as to have a negative
> value as an s64. In such a case, as per dl_time_before(), the kernel
> should correctly handle the abs deadline wrap-around; however, the
> current code in cpudeadline.c goes mad and doesn't correctly
> re-heapify the just-inserted element... that said, if these are ns,
> such a bug should only be hit after ~292 years of uptime :-D...
>
> I'd be happy to hear comments from others. I can provide additional
> info / make additional experiments as needed.
>
> Please reply-all to this e-mail, as I'm not subscribed to
> linux-kernel@.
>
> Thanks,
>
> Tommaso