Re: RCU explosion on ARM Integrator

From: Paul E. McKenney
Date: Fri Sep 11 2015 - 13:29:21 EST


On Fri, Sep 11, 2015 at 01:24:56PM +0200, Linus Walleij wrote:
> Hi RCU folks,
>
> this happened to me when running the iozone throughput benchmark
> on the ARM Integrator, I wonder if I should take this platform for a ride on
> the RCU torture test or similar? Looks a bit unstable :/

You got a pagefault in rcu_check_callbacks(). Congratulations, that -is-
an accomplishment! ;-)

I haven't seen anything like this recently.

Is this reproducible? If so, and if it was stable on some previous
release, a bisection would be helpful.

Otherwise, it looks like this blew up just after returning from a
function call. If you could map back to the source code, let me know what
version you are running, send me a disassembly of rcu_check_callbacks(),
and supply a .config, I can take a look and see if I can provide any
additional information. Or, for that matter, a fix.
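
The "just after returning from a function call" reading can be checked
against the register dump in the quoted oops. The sketch below is a
minimal illustration, not part of the original report: it parses the
PC/LR lines copied from the oops, derives the start address of
rcu_check_callbacks(), and counts how many instructions separate the
faulting PC from LR. It assumes that LR still holds the return address
of the most recent call and that the code is in ARM state with 4-byte
instructions (the oops reports "ISA ARM"). The helper name parse_oops()
is hypothetical.

```python
import re

# Lines copied verbatim from the quoted oops report.
OOPS = """\
PC is at rcu_check_callbacks+0x318/0x850
LR is at rcu_check_callbacks+0x310/0x850
pc : [<c00546a8>] lr : [<c00546a0>] psr: 60000093
"""

# Assumption: ARM state (not Thumb), so every instruction is 4 bytes.
ARM_INSN_BYTES = 4


def parse_oops(text):
    """Return (symbol offsets, absolute addresses) for PC and LR."""
    # "PC is at sym+0xOFF/0xSIZE" -> {"pc": (sym, offset, size), ...}
    offsets = {
        reg.lower(): (sym, int(off, 16), int(size, 16))
        for reg, sym, off, size in re.findall(
            r"(PC|LR) is at (\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text)
    }
    # "pc : [<c00546a8>]" -> {"pc": 0xc00546a8, ...}
    addrs = {
        reg: int(val, 16)
        for reg, val in re.findall(r"\b(pc|lr) : \[<([0-9a-f]{8})>\]", text)
    }
    return offsets, addrs


offsets, addrs = parse_oops(OOPS)
sym, pc_off, size = offsets["pc"]
_, lr_off, _ = offsets["lr"]

# Start of the function: absolute PC minus the offset into the symbol.
base = addrs["pc"] - pc_off
# Distance from the return address (LR) to the faulting instruction.
gap = (addrs["pc"] - addrs["lr"]) // ARM_INSN_BYTES

print(f"{sym} starts at {base:#x} (size {size:#x})")
print(f"faulting PC is {gap} instruction(s) past the address in LR")
```

The derived start address is the place to begin when reading a
disassembly of the function, and the small PC-to-LR gap is what
suggests the fault happened shortly after a call returned.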

On the other hand, if this is a new port, things to be suspicious of
include correct masking of interrupts, consistent reporting of the number
of CPUs, and of course memory mapping.

Thanx, Paul

> Yours,
> Linus Walleij
>
> root@integrator:/ iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test
> Iozone: Performance Test of File I/O
> Version $Revision: 3.430 $
> Compiled for 32 bit mode.
> Build: linux-arm
>
> Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
> Al Slater, Scott Rhine, Mike Wisner, Ken Goss
> Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
> Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
> Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
> Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
> Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
> Vangel Bojaxhi, Ben England, Vikentsi Lapa.
>
> Run began: Thu Jan 1 01:16:39 1970
>
> Auto Mode
> Cross over of record size disabled.
> File size set to 20480 kB
> O_DIRECT feature enabled
> Command line used: iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test
> Output is in kBytes/sec
> Time Resolution = 0.000016 seconds.
> Processor cache size set to 1024 kBytes.
> Processor cache line size set to 32 bytes.
> File stride size set to 17 * record size.
>                                                          random   random     bkwd   record   stride
>        kB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
>     20480        4       54       56       57       57       57       24
>     20480        8       56       57       58       58       58       34
>     20480       16       56       58       59       59Unable to handle kernel paging request at virtual address 807b7cac
> pgd = c6404000
> [807b7cac] *pgd=00000000
> Internal error: Oops: 5 [#1] PREEMPT ARM
> Modules linked in:
> CPU: 0 PID: 110 Comm: iozone Not tainted 4.2.0-11142-gb0a1ea51bda4-dirty #3
> Hardware name: ARM Integrator/AP (Device Tree)
> task: c6b45540 ti: c6420000 task.ti: c6420000
> PC is at rcu_check_callbacks+0x318/0x850
> LR is at rcu_check_callbacks+0x310/0x850
> pc : [<c00546a8>] lr : [<c00546a0>] psr: 60000093
> sp : c64218c0 ip : c07b8038 fp : c07b8920
> r10: 807b7ca8 r9 : 00000001 r8 : c07b79f8
> r7 : c07b2110 r6 : c07b2118 r5 : c07b7ca8 r4 : c07b8004
> r3 : c07b8038 r2 : c07b8038 r1 : c07b8004 r0 : 00000000
> Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
> Control: 0005317f Table: 06404000 DAC: 00000051
> Process iozone (pid: 110, stack limit = 0xc6420190)
> Stack: (0xc64218c0 to 0xc6422000)
> 18c0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 18e0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1900: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1920: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1940: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1960: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1980: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 19a0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 19c0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 19e0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1a00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1a20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1a40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1a60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1a80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1aa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ac0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ae0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1b00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1b20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1b40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1b60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1b80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ba0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1bc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1be0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1c00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1c20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1c40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1c60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1c80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ca0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1cc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ce0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1d00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1d20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1d40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1d60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1d80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1da0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1dc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1de0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1e00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1e20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1e40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1e60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1e80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ea0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ec0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1ee0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1f00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1f20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1f40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1f60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1f80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1fa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1fe0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> [<c00546a8>] (rcu_check_callbacks) from [<c00574d4>]
> (update_process_times+0x38/0x60)
> [<c00574d4>] (update_process_times) from [<c0065e74>]
> (tick_sched_timer+0x4c/0x98)
> [<c0065e74>] (tick_sched_timer) from [<c0057e60>]
> (__hrtimer_run_queues.constprop.36+0x124/0x1d4)
> [<c0057e60>] (__hrtimer_run_queues.constprop.36) from [<c0058364>]
> (hrtimer_interrupt+0x9c/0x270)
> [<c0058364>] (hrtimer_interrupt) from [<c0286204>]
> (integrator_timer_interrupt+0x20/0x2c)
> [<c0286204>] (integrator_timer_interrupt) from [<c004abc4>]
> (handle_irq_event_percpu+0x78/0x144)
> [<c004abc4>] (handle_irq_event_percpu) from [<c004ace4>]
> (handle_irq_event+0x54/0x8c)
> [<c004ace4>] (handle_irq_event) from [<c004d600>] (handle_level_irq+0xdc/0x168)
> [<c004d600>] (handle_level_irq) from [<c004a348>] (generic_handle_irq+0x2c/0x40)
> [<c004a348>] (generic_handle_irq) from [<c004a580>]
> (__handle_domain_irq+0x5c/0xd4)
> [<c004a580>] (__handle_domain_irq) from [<c0009548>] (fpga_handle_irq+0x84/0xc4)
> [<c0009548>] (fpga_handle_irq) from [<c000de04>] (__irq_svc+0x44/0x78)
> Exception stack(0xc6421a68 to 0xc6421ab0)
> 1a60: ???????? ???????? ???????? ???????? ???????? ????????
> 1a80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
> 1aa0: ???????? ???????? ???????? ????????
> [<c000de04>] (__irq_svc) from [<c00b182c>]
> (__slab_alloc.isra.90.constprop.92+0x124/0x2ec)
> [<c00b182c>] (__slab_alloc.isra.90.constprop.92) from [<c00b1c48>]
> (kmem_cache_alloc+0xf0/0x130)
> [<c00b1c48>] (kmem_cache_alloc) from [<c0077600>] (mempool_alloc+0x44/0x1b8)
> [<c0077600>] (mempool_alloc) from [<c0186290>] (bio_alloc_bioset+0x128/0x214)
> [<c0186290>] (bio_alloc_bioset) from [<c018673c>] (bio_clone_bioset+0xf4/0x2cc)
> [<c018673c>] (bio_clone_bioset) from [<c0190b50>] (blk_queue_split+0x1a4/0x438)
> [<c0190b50>] (blk_queue_split) from [<c018ca10>] (blk_queue_bio+0x28/0x284)
> [<c018ca10>] (blk_queue_bio) from [<c018adf8>] (generic_make_request+0xb8/0xdc)
> [<c018adf8>] (generic_make_request) from [<c018ae9c>] (submit_bio+0x80/0x16c)
> [<c018ae9c>] (submit_bio) from [<c00f04a4>] (__blockdev_direct_IO+0x11e4/0x1a4c)
> [<c00f04a4>] (__blockdev_direct_IO) from [<c011ce0c>] (ext2_direct_IO+0x54/0x94)
> [<c011ce0c>] (ext2_direct_IO) from [<c007642c>]
> (generic_file_direct_write+0x94/0x1cc)
> [<c007642c>] (generic_file_direct_write) from [<c0076618>]
> (__generic_file_write_iter+0xb4/0x21c)
> [<c0076618>] (__generic_file_write_iter) from [<c0076888>]
> (generic_file_write_iter+0x108/0x2a0)
> [<c0076888>] (generic_file_write_iter) from [<c00b7934>] (__vfs_write+0xb0/0xe4)
> [<c00b7934>] (__vfs_write) from [<c00b8160>] (vfs_write+0x90/0x164)
> [<c00b8160>] (vfs_write) from [<c00b89bc>] (SyS_write+0x44/0x9c)
> [<c00b89bc>] (SyS_write) from [<c000a540>] (ret_fast_syscall+0x0/0x38)
> Code: bad PC value
> ---[ end trace 59d7580d1dfe574e ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> ---[ end Kernel panic - not syncing: Fatal exception in interrupt
>
>
> Yours,
> Linus Walleij
>
