Re: [RFC PATCH] arm64: defconfig: Disable fine-grained task level IRQ time accounting
From: Vladimir Oltean
Date: Mon Aug 03 2020 - 12:14:31 EST
On Mon, Aug 03, 2020 at 04:47:03PM +0100, Valentin Schneider wrote:
>
> On 03/08/20 16:13, Thomas Gleixner wrote:
> > Vladimir Oltean <olteanv@xxxxxxxxx> writes:
> >>> 1) When irq accounting is disabled, RT throttling kicks in as
> >>> expected.
> >>>
> >>> 2) With irq accounting the RT throttler does not kick in and the RCU
> >>> stall/lockups happen.
> >> What is this telling us?
> >
> > It seems that the fine grained irq time accounting affects the runtime
> > accounting in some way which I haven't figured out yet.
> >
>
> With IRQ_TIME_ACCOUNTING, rq_clock_task() will always be incremented by a
> lesser-or-equal value than when not having the option; you start with the
> same delta_exec but slice some for the IRQ accounting, and leave the rest
> for the rq_clock_task() (+paravirt).
>
> IIUC this means that if you spend e.g. 10% of the time in IRQ and 90% of
> the time running the stress-ng RT tasks, despite having RT tasks hogging
> the entirety of the "available time" it is still only 90% runtime, which is
> below the 95% default and the throttling doesn't happen.
>
> I don't know if considering IRQ time in some way or another in
> sched_rt_runtime_exceeded() really is a way out here.
>
> > Thanks,
> >
> > tglx
>
Not sure it this is relevant at all to the conversation we're having
(sorry if it isn't), but FWIW, on my armv7 board, enabling
CONFIG_IRQ_TIME_ACCOUNTING makes it completely unable to boot (the log
below is just earlyprintk):
Starting kernel ...
[ 0.000000] Booting Linux on physical CPU 0xf00
[ 0.000000] Linux version 5.8.0-rc7-02154-g1a0b51626a14-dirty (tigrisor@skbuf) (arm-linux-gnueabihf-gcc (Linaro GCC 7.3-2018.05) 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701], GNU ld (Linaro_Binutils-2018.05) 2.28.2.20170706) #383 SMP Mon Aug 3 19:03:50 EEST 2020
[ 0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
[ 0.000000] CPU: div instructions available: patching division code
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[ 0.000000] OF: fdt: Machine model: NXP LS1021A-TSN Board
[ 0.000000] earlycon: uart8250 at MMIO 0x021c0500 (options '')
[ 0.000000] printk: bootconsole [uart8250] enabled
[ 0.000000] Memory policy: Data cache writealloc
[ 0.000000] efi: UEFI not found.
[ 0.000000] Ignoring RAM at 0xb0000000-0xc0000000
[ 0.000000] Consider using a HIGHMEM enabled kernel.
[ 0.000000] cma: Reserved 64 MiB at 0xac000000
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000080000000-0x00000000afffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080000000-0x00000000afffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x00000000afffffff]
[ 0.000000] percpu: Embedded 20 pages/cpu s51660 r8192 d22068 u81920
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 194880
[ 0.000000] Kernel command line: console=ttyS0,115200 earlycon=uart8250,mmio,0x21c0500 root=/dev/mmcblk0p2 rootwait rw
[ 0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 687944K/786432K available (12288K kernel code, 1026K rwdata, 6980K rodata, 1024K init, 279K bss, 32952K reserved, 65536K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] rcu: RCU event tracing is enabled.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=16 to nr_cpu_ids=2.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[ 0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[ 0.000000] random: get_random_bytes called from start_kernel+0x36c/0x54c with crng_init=0
[ 0.000000] timer_probe: no matching timers found
[ 0.000000] Console: colour dummy device 80x30
[ 0.000000] sched_clock: 32 bits at 100 Hz, resolution 10000000ns, wraps every 21474836475000000ns
Thanks,
-Vladimir