Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth

From: Juri Lelli
Date: Wed Feb 10 2016 - 11:27:09 EST


On 10/02/16 09:37, Steven Rostedt wrote:
> On Wed, 10 Feb 2016 11:32:58 +0000
> Juri Lelli <juri.lelli@xxxxxxx> wrote:
>
> > Hi,
> >
> > I've updated this patch since, with a bit more testing and talking with
> > Luca in private, I realized that the previous version didn't manage
> > switching back and forth from SCHED_DEADLINE correctly. Thanks a lot
> > Luca for your feedback (even if not visible on the list).
> >
> > I updated the testing branch accordingly and added a test to my tests
> > that stresses switch-in/switch-out.
> >
> > I don't particularly like the fact that we break the scheduling classes
> > abstraction in __dl_overflow(), so I think a little bit of refactoring
> > is still needed. But that can also happen afterwards, if we fix the
> > problem with root domains.
> >
> > Best,
> >
> > - Juri
> >
> > --->8---
> >
> > From 62f70ca3051672dce209e8355cf5eddc9d825c2a Mon Sep 17 00:00:00 2001
> > From: Juri Lelli <juri.lelli@xxxxxxx>
> > Date: Sat, 6 Feb 2016 12:41:09 +0000
> > Subject: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth
> >
>
>
> I applied this patch and patch 2 and hit this:
>
> [ 2298.134284] ------------[ cut here ]------------
> [ 2298.138933] WARNING: CPU: 4 PID: 0 at /home/rostedt/work/git/linux-trace.git/kernel/sched/sched.h:735 task_dead_dl+0xc5/0xd0()
> [ 2298.150350] Modules linked in: ip6table_filter ip6_tables ebtable_nat ebtables xt_CHECKSUM iptable_mangle bridge stp llc bluetooth lockd grace snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core x86_pkg_temp_thermal snd_seq vhost_net snd_seq_device tun vhost macvtap macvlan coretemp iTCO_wdt snd_pcm hp_wmi rfkill kvm_intel sparse_keymap iTCO_vendor_support snd_timer snd acpi_cpufreq kvm i2c_i801 mei_me mei soundcore lpc_ich mfd_core irqbypass wmi serio_raw uinput i915 i2c_algo_bit e1000e drm_kms_helper crc32_pclmul ptp crc32c_intel drm pps_core i2c_core video sunrpc
> [ 2298.207495] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.5.0-rc1-test+ #204
> [ 2298.214392] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
> [ 2298.223371] ffffffff81abc680 ffff880119433d68 ffffffff81411c3f 0000000000000000
> [ 2298.230904] ffffffff81abc680 ffff880119433da0 ffffffff810acf66 ffff88011eb16f40
> [ 2298.238435] ffff88001ee16200 ffffffff81fd4a00 00000000000aaaaa 0000000000000001
> [ 2298.245958] Call Trace:
> [ 2298.248431] [<ffffffff81411c3f>] dump_stack+0x50/0xb1
> [ 2298.253597] [<ffffffff810acf66>] warn_slowpath_common+0x86/0xc0
> [ 2298.259627] [<ffffffff810ad05a>] warn_slowpath_null+0x1a/0x20
> [ 2298.265490] [<ffffffff810f92a5>] task_dead_dl+0xc5/0xd0
> [ 2298.270828] [<ffffffff810d833f>] finish_task_switch+0x16f/0x310
> [ 2298.276871] [<ffffffff810fa7f3>] ? pick_next_task_dl+0xb3/0x250
> [ 2298.282906] [<ffffffff817f07a3>] __schedule+0x3d3/0x9e0
> [ 2298.288252] [<ffffffff817f1001>] schedule+0x41/0xc0
> [ 2298.293242] [<ffffffff817f12c8>] schedule_preempt_disabled+0x18/0x30
> [ 2298.299712] [<ffffffff810fc974>] cpu_startup_entry+0x74/0x4e0
> [ 2298.305573] [<ffffffff8105d16f>] start_secondary+0x2bf/0x330
> [ 2298.311347] ---[ end trace 732d16efabe456f1 ]---
>
> It's the warning you added in __dl_sub_ac().
>

OK. There are still holes where we fail to properly update the per-rq bw.
It seems (from running your test) that we fail to move the per-rq bw when
we move the root_domain bw due to css_set_move_task(). So, the final
task_dead_dl() tries to remove bw from an rq where there isn't any.
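
To make the hole concrete, here is a toy user-space model of the two
levels of accounting. It is only a sketch under simplifying assumptions,
not kernel code: every toy_* name, the two-CPU layout (cpu N sits in
root domain N) and the bandwidth value 1000 are made up for
illustration, and the real paths are more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-ins for illustration only; none of these are kernel types. */
struct toy_rq { uint64_t ac_bw; };    /* bandwidth admitted on this runqueue */
struct toy_rd { uint64_t total_bw; }; /* bandwidth admitted in this root domain */
struct toy_task { uint64_t bw; int cpu; };

static struct toy_rq rq[2];           /* assumption: cpu N is in root domain N */
static struct toy_rd rd[2];
static int toy_warnings;

/* Warn (and clamp) when asked to subtract more than was ever added. */
static void toy_sub_ac(struct toy_rq *q, uint64_t bw)
{
	if (q->ac_bw < bw) {
		toy_warnings++;
		fprintf(stderr, "WARNING: per-rq bw underflow\n");
		q->ac_bw = 0;
		return;
	}
	q->ac_bw -= bw;
}

/* Move a task to another root domain; move_rq_bw selects the complete path. */
static void toy_move(struct toy_task *t, int dst, int move_rq_bw)
{
	assert(dst == 0 || dst == 1);
	rd[t->cpu].total_bw -= t->bw;     /* root-domain bw always follows */
	rd[dst].total_bw += t->bw;
	if (move_rq_bw) {                 /* per-rq bw follows only if asked */
		toy_sub_ac(&rq[t->cpu], t->bw);
		rq[dst].ac_bw += t->bw;
	}
	t->cpu = dst;
}

/* Admit on cpu 0, move to cpu 1, let the task die; returns warning count. */
static int toy_scenario(int move_rq_bw)
{
	struct toy_task t = { .bw = 1000, .cpu = 0 };

	rq[0].ac_bw = rq[1].ac_bw = 0;
	rd[0].total_bw = rd[1].total_bw = 0;
	toy_warnings = 0;

	rd[0].total_bw += t.bw;           /* admission */
	rq[0].ac_bw += t.bw;

	toy_move(&t, 1, move_rq_bw);

	rd[t.cpu].total_bw -= t.bw;       /* task death */
	toy_sub_ac(&rq[t.cpu], t.bw);
	return toy_warnings;
}
```

Calling toy_scenario(0) from a main() models the failing case: the
subtraction at task death hits an rq that never had the bw added, so one
warning fires and the bw stays stranded on rq[0]. toy_scenario(1) moves
both levels together and finishes without a warning.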

I'm trying to see how we can close this hole.

Thanks,

- Juri