Re: linux-next: Tree for Aug 12

From: John Hubbard
Date: Thu Aug 12 2021 - 20:02:27 EST


On 8/12/21 1:39 AM, Stephen Rothwell wrote:
Hi all,

Changes since 20210811:

The net-next tree still had its build failure, for which I applied a patch.

The drm tree gained a conflict against Linus' tree.

The drm-intel tree gained a conflict against the drm tree.

The drm-msm tree gained a conflict against Linus' tree.

The kvm-arm tree gained a conflict against the arm64 tree.

Non-merge commits (relative to Linus' tree): 6805
6737 files changed, 392726 insertions(+), 154960 deletions(-)


Hi,

This one is producing hard lockups on my Intel 64-bit system now. If
it's not already known or solved, I'll attempt to bisect it in a few
hours (I have to step away briefly). (20210810 worked fine for me.)

evm: HMAC attrs: 0x1
Freeing unused kernel image (initmem) memory: 1720K
usb 1-8: new full-speed USB device number 3 using xhci_hcd
usb 1-8: New USB device found, idVendor=1532, idProduct=0043, bcdDevice= 2.00
usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-8: Product: Razer DeathAdder Chroma
usb 1-8: Manufacturer: Razer
input: Razer Razer DeathAdder Chroma as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.0/0003:1532:0043.0003/input/input5
hid-generic 0003:1532:0043.0003: input,hidraw2: USB HID v1.11 Mouse [Razer Razer DeathAdder Chroma] on usb-0000:00:14.0-8/input0
input: Razer Razer DeathAdder Chroma Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.1/0003:1532:0043.0004/input/input6
input: Razer Razer DeathAdder Chroma as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.1/0003:1532:0043.0004/input/input7
hid-generic 0003:1532:0043.0004: input,hidraw3: USB HID v1.11 Keyboard [Razer Razer DeathAdder Chroma] on usb-0000:00:14.0-8/input1
input: Razer Razer DeathAdder Chroma as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.2/0003:1532:0043.0005/input/input8
hid-generic 0003:1532:0043.0005: input,hidraw4: USB HID v1.11 Keyboard [Razer Razer DeathAdder Chroma] on usb-0000:00:14.0-8/input2
NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
Modules linked in:
irq event stamp: 2711080
hardirqs last enabled at (2711079): [<ffffffff811a97e1>] rcu_idle_exit+0x21/0x30
hardirqs last disabled at (2711080): [<ffffffff8114e826>] do_idle+0x86/0xe0
softirqs last enabled at (2476): [<ffffffff81e002d3>] __do_softirq+0x2d3/0x404
softirqs last disabled at (2465): [<ffffffff8110ebf8>] __irq_exit_rcu+0xa8/0xd0
CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.14.0-rc5-next-20210812-hubbard-github+ #12
Hardware name: ASUS X299-A/PRIME X299-A, BIOS 3201 09/04/2020
RIP: 0010:rcu_read_lock_sched_held+0xd/0x70
Code: f4 01 41 83 e4 01 44 89 e0 41 5c c3 45 31 e4 44 89 e0 41 5c c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 41 bc 01 00 00 00 <e8> 7e e6 98 00 80
RSP: 0000:ffffc900001fbda0 EFLAGS: 00000083
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8260b1c8
RBP: ffffffff8260b1c8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88887ff80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000005612001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lock_acquire+0x172/0x2b0
? _raw_spin_unlock_irqrestore+0x23/0x40
? hrtimer_get_next_event+0x4f/0x60
tick_nohz_next_event+0x53/0x1f0
? tick_nohz_get_sleep_length+0x6b/0xa0
tick_nohz_get_sleep_length+0x6b/0xa0
menu_select+0x4bd/0x600
cpuidle_idle_call+0xf6/0x1d0
do_idle+0x8d/0xe0
cpu_startup_entry+0x19/0x20
secondary_startup_64_no_verify+0xb0/0xbb
rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
rcu: 6-...!: (0 ticks this GP) idle=bc8/0/0x0 softirq=20/20 fqs=0 (false positive?)
(detected by 0, t=65002 jiffies, g=-1107, q=1569)
Sending NMI from CPU 0 to CPUs 6:
NMI backtrace for cpu 6 skipped: idling at poll_idle+0x93/0xb2
rcu: rcu_sched kthread timer wakeup didn't happen for 64999 jiffies! g-1107 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: Possible timer handling issue on cpu=6 timer-softirq=11
rcu: rcu_sched kthread starved for 65002 jiffies! g-1107 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=6
rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_sched state:I stack:14744 pid: 13 ppid: 2 flags:0x00004000
Call Trace:
__schedule+0x26d/0x790
schedule+0x59/0xc0
schedule_timeout+0xc4/0x1f0
? _raw_spin_unlock_irqrestore+0x2d/0x40
? __bpf_trace_tick_stop+0x10/0x10
rcu_gp_fqs_loop+0xfa/0x700
rcu_gp_kthread+0x1d3/0x300
? rcu_gp_cleanup+0x610/0x610
kthread+0x12b/0x150
? set_kthread_struct+0x40/0x40
ret_from_fork+0x22/0x30
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 0 to CPUs 6:
NMI backtrace for cpu 6 skipped: idling at poll_idle+0x93/0xb2
rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
rcu: 3-...!: (0 ticks this GP) idle=c7c/0/0x0 softirq=43/43 fqs=0 (false positive?)
rcu: 6-...!: (0 ticks this GP) idle=b88/0/0x0 softirq=20/20 fqs=0 (false positive?)
(detected by 0, t=65002 jiffies, g=-1103, q=1560)
Sending NMI from CPU 0 to CPUs 3:
NMI backtrace for cpu 3
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.14.0-rc5-next-20210812-hubbard-github+ #12
Hardware name: ASUS X299-A/PRIME X299-A, BIOS 3201 09/04/2020
RIP: 0010:mwait_idle_with_hints.constprop.0+0x4f/0xa0
Code: 48 89 d1 65 48 8b 04 25 40 70 01 00 0f 01 c8 48 8b 00 a8 08 75 14 eb 07 0f 00 2d 3c 53 a4 00 b9 01 00 00 00 48 89 f8 0f 01 c9 <65> 48 8b 04 25 4b
RSP: 0000:ffffc900001e3e80 EFLAGS: 00000046
RAX: 0000000000000020 RBX: 0000000000000003 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffff828d6040 RDI: 0000000000000020
RBP: ffffe8ffff2f0b80 R08: 0000000000000000 R09: 0000000000000018
R10: 0000000000000374 R11: 000000000002a055 R12: 0000000000000003
R13: ffffffff828d6190 R14: 0000000000000003 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88887fec0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000005612001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
intel_idle+0x1f/0x30
cpuidle_enter_state+0xa5/0x450
cpuidle_enter+0x29/0x40
cpuidle_idle_call+0x12c/0x1d0
do_idle+0x8d/0xe0
cpu_startup_entry+0x19/0x20
secondary_startup_64_no_verify+0xb0/0xbb
Sending NMI from CPU 0 to CPUs 6:
NMI backtrace for cpu 6 skipped: idling at poll_idle+0x93/0xb2
rcu: rcu_sched kthread timer wakeup didn't happen for 65031 jiffies! g-1103 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
rcu: Possible timer handling issue on cpu=6 timer-softirq=11
rcu: rcu_sched kthread starved for 65034 jiffies! g-1103 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=6
rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_sched state:I stack:14744 pid: 13 ppid: 2 flags:0x00004000
Call Trace:
__schedule+0x26d/0x790
schedule+0x59/0xc0
schedule_timeout+0xc4/0x1f0
? _raw_spin_unlock_irqrestore+0x2d/0x40
? __bpf_trace_tick_stop+0x10/0x10
rcu_gp_fqs_loop+0xfa/0x700
rcu_gp_kthread+0x1d3/0x300
? rcu_gp_cleanup+0x610/0x610
kthread+0x12b/0x150
? set_kthread_struct+0x40/0x40
ret_from_fork+0x22/0x30
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 0 to CPUs 6:
NMI backtrace for cpu 6 skipped: idling at poll_idle+0x98/0xb2


thanks,
--
John Hubbard
NVIDIA
----------------------------------------------------------------------------

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one. You should use "git fetch" and checkout or reset to the new
master.
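
As a rough sketch of the fetch-and-reset approach described above: the
helper name update_linux_next and the default remote name "origin" are
assumptions made for illustration, not something this announcement
defines. Note that "git reset --hard" throws away any local changes on
master, so this only suits a checkout used purely for tracking.

```shell
#!/bin/sh
# Hypothetical helper: move the local master to the latest linux-next
# release without merging it into the previous one.
# Assumes the given remote (default "origin") points at linux-next.git.
update_linux_next() {
    remote="${1:-origin}"

    # Download the new release; unlike "git pull", this merges nothing.
    git fetch -q "$remote" || return 1

    # Switch to the local master branch.
    git checkout -q master || return 1

    # Point master at the freshly fetched release, discarding the old
    # release (and any local changes) instead of merging.
    git reset -q --hard "$remote/master"
}
```

Run from inside the clone, this would be invoked as "update_linux_next"
or, for a differently named remote, "update_linux_next <remote>".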

You can see which trees have been included by looking in the Next/Trees
file in the source. There are also quilt-import.log and merge.log
files in the Next directory. Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig and htmldocs. And finally, a simple boot test
of the powerpc pseries_le_defconfig kernel in qemu (with and without
kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 333 trees (counting Linus' and 90 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next . If maintainers want to give
advice about cross compilers/configs that work, we are always open to
adding more builds.

Thanks to Randy Dunlap for doing many randconfig builds. And to Paul
Gortmaker for triage and bug fixes.