[PATCH v1 0/1] fix for spin lock bad magic on SiFive UART

From: Sagar Shrikant Kadam
Date: Sat May 09 2020 - 06:24:48 EST


A bad magic spin lock error was observed on HiFive Unleashed
running 5.7-rc1+ linux kernel. The discussion is available here [1].
Thanks to Atish Patra <Atish.Patra@xxxxxxx> for reporting it.

Although spin locks are used within the driver to guard the critical
sections, we missed to initialise it this resulted in race condition
and raised a spin lock error as shown in log below.

This happens at the time of hand-off between sifive serial and earlycon
driver (one is enabled and another is disabled) on any random CPU.

[1] https://lore.kernel.org/linux-riscv/b9fe49483a903f404e7acc15a6efbef756db28ae.camel@xxxxxxx

Error Log:
=============================================================
OpenSBI v0.7-31-gd626037
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|____/_____|
| |
|_|

Platform Name : SiFive Freedom U540
Platform HART Features : RV64ACDFIMSU
Platform HART Count : 4
Current HART ID : 2
Firmware Base : 0x80000000
Firmware Size : 100 KB
Runtime SBI Version : 0.2

MIDELEG : 0x0000000000000222
MEDELEG : 0x000000000000b109
PMP0 : 0x0000000080000000-0x000000008001ffff (A)
PMP1 : 0x0000000000000000-0x0000007fffffffff (A,R,W,X)


U-Boot 2020.07-rc1-00201-g24e3f96-dirty (May 05 2020 - 05:44:09 -0700)

CPU: rv64imafdc
Model: SiFive HiFive Unleashed A00
DRAM: 8 GiB
MMC: spi@10050000:mmc@0: 0
In: serial@10010000
Out: serial@10010000
Err: serial@10010000
Net: eth0: ethernet@10090000
Hit any key to stop autoboot: 0
=>
=> setenv bootargs "root=/dev/ram rw console=ttySIF0 earlycon=sbi"
=> mmc rescan ; fatload mmc 0:2 0x80200000 uImage
=> fatload mmc 0:2 0x82200000 hifive-unleashed-a00.dtb
=> bootm 0x80200000 - 0x82200000
21413948 bytes read in 11427 ms (1.8 MiB/s)
6987 bytes read in 12 ms (568.4 KiB/s)
## Booting kernel from Legacy Image at 80200000 ...
Image Name: Linux
Image Type: RISC-V Linux Kernel Image (uncompressed)
Data Size: 21413884 Bytes = 20.4 MiB
Load Address: 80200000
Entry Point: 80200000
Verifying Checksum ... OK
## Flattened Device Tree blob at 82200000
Booting using the fdt blob at 0x82200000
Loading Kernel Image
Using Device Tree in place at 0000000082200000, end 0000000082204b4a

Starting kernel ...

[ 0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
[ 0.000000] Linux version 5.7.0-rc4-00001-g3742e19-dirty (sagark2@xxxxxxxxxx) (gcc version 8.2.0 (Buildroot 2018.11-rc2-00003-ga0787e9), GNU ld (GNU Binutils) 2.31.1) #4 SMP Sat May 9 01:40:31 PDT 2020
[ 0.000000] earlycon: sbi0 at I/O port 0x0 (options '')
[ 0.000000] printk: bootconsole [sbi0] enabled
[ 0.000000] initrd not found or empty - disabling initrd
[ 0.000000] Zone ranges:
[ 0.000000] DMA32 [mem 0x0000000080200000-0x00000000ffffffff]
[ 0.000000] Normal [mem 0x0000000100000000-0x000000027fffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080200000-0x000000027fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x000000027fffffff]
[ 0.000000] software IO TLB: mapped [mem 0xfbfff000-0xfffff000] (64MB)
[ 0.000000] SBI specification v0.2 detected
[ 0.000000] SBI implementation ID=0x1 Version=0x7
[ 0.000000] SBI v0.2 TIME extension detected
[ 0.000000] SBI v0.2 IPI extension detected
[ 0.000000] SBI v0.2 RFENCE extension detected
[ 0.000000] SBI v0.2 HSM extension detected
[ 0.000000] CPU with hartid=0 is not available
[ 0.000000] CPU with hartid=0 is not available
[ 0.000000] elf_hwcap is 0x112d
[ 0.000000] percpu: Embedded 17 pages/cpu s31976 r8192 d29464 u69632
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 2067975
[ 0.000000] Kernel command line: root=/dev/ram rw console=ttySIF0 earlycon=sbi
[ 0.000000] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[ 0.000000] Sorting __ex_table...
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 8172080K/8386560K available (6403K kernel code, 4268K rwdata, 4096K rodata, 4599K init, 317K bss, 214480K reserved, 0K cma-reserved)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] fixmap : 0xffffffcefee00000 - 0xffffffceff000000 (2048 kB)
[ 0.000000] pci io : 0xffffffceff000000 - 0xffffffcf00000000 ( 16 MB)
[ 0.000000] vmemmap : 0xffffffcf00000000 - 0xffffffcfffffffff (4095 MB)
[ 0.000000] vmalloc : 0xffffffd000000000 - 0xffffffdfffffffff (65535 MB)
[ 0.000000] lowmem : 0xffffffe000000000 - 0xffffffe1ffe00000 (8190 MB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
[ 0.000000] rcu: RCU debug extended QS entry/exit.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[ 0.000000] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
[ 0.000000] plic: mapped 53 interrupts with 4 handlers for 9 contexts.
[ 0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [2]
[ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns
[ 0.000006] sched_clock: 64 bits at 1000kHz, resolution 1000ns, wraps every 2199023255500ns
[ 0.008556] Console: colour dummy device 80x25
[ 0.012994] Calibrating delay loop (skipped), value calculated using timer frequency.. 2.00 BogoMIPS (lpj=4000)
[ 0.023105] pid_max: default: 32768 minimum: 301
[ 0.028305] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[ 0.035768] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[ 0.045464] rcu: Hierarchical SRCU implementation.
[ 0.050227] smp: Bringing up secondary CPUs ...
[ 0.056194] smp: Brought up 1 node, 4 CPUs
[ 0.060957] devtmpfs: initialized
[ 0.065489] random: get_random_u32 called from bucket_table_alloc.isra.25+0x4e/0x160 with crng_init=0
[ 0.065876] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.084229] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[ 0.091920] NET: Registered protocol family 16
[ 0.112744] vgaarb: loaded
[ 0.115132] SCSI subsystem initialized
[ 0.119044] usbcore: registered new interface driver usbfs
[ 0.124199] usbcore: registered new interface driver hub
[ 0.129612] usbcore: registered new device driver usb
[ 0.135587] clocksource: Switched to clocksource riscv_clocksource
[ 0.149008] NET: Registered protocol family 2
[ 0.153406] tcp_listen_portaddr_hash hash table entries: 4096 (order: 5, 163840 bytes, linear)
[ 0.161944] TCP established hash table entries: 65536 (order: 7, 524288 bytes, linear)
[ 0.171377] TCP bind hash table entries: 65536 (order: 9, 2097152 bytes, linear)
[ 0.185365] TCP: Hash tables configured (established 65536 bind 65536)
[ 0.191736] UDP hash table entries: 4096 (order: 6, 393216 bytes, linear)
[ 0.199410] UDP-Lite hash table entries: 4096 (order: 6, 393216 bytes, linear)
[ 0.207498] NET: Registered protocol family 1
[ 0.211846] RPC: Registered named UNIX socket transport module.
[ 0.217172] RPC: Registered udp transport module.
[ 0.221938] RPC: Registered tcp transport module.
[ 0.226712] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 0.233231] PCI: CLS 0 bytes, default 64
[ 0.585600] workingset: timestamp_bits=62 max_order=21 bucket_order=0
[ 0.600796] NFS: Registering the id_resolver key type
[ 0.605212] Key type id_resolver registered
[ 0.609418] Key type id_legacy registered
[ 0.613510] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[ 0.620434] 9p: Installing v9fs 9p2000 file system support
[ 0.626244] NET: Registered protocol family 38
[ 0.630401] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[ 0.637816] io scheduler mq-deadline registered
[ 0.642413] io scheduler kyber registered
[ 0.701522] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.708407] 10010000.serial: ttySIF0 at MMIO 0x10010000 (irq = 1, base_baud = 0) is a SiFive UART v0
[ 0.716866] BUG: spinlock bad magic on CPU#3, swapper/0/1
[ 0.722317] lock: 0xffffffe1f63ae018, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
[ 0.730564] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4-00001-g3742e19-dirty #4
[ 0.738632] Call Trace:
[ 0.741160] [<ffffffe0006024c6>] walk_stackframe+0x0/0xa4
[ 0.746624] [<ffffffe0006026ac>] show_stack+0x2a/0x34
[ 0.751753] [<ffffffe000869f68>] dump_stack+0x6a/0x84
[ 0.756877] [<ffffffe000643060>] spin_dump+0x68/0x74
[ 0.761906] [<ffffffe000642c80>] do_raw_spin_lock+0xb0/0xcc
[ 0.767555] [<ffffffe000c3f528>] _raw_spin_lock_irqsave+0x20/0x2c
[ 0.773714] [<ffffffe0008d9aec>] uart_add_one_port+0x2f6/0x406
[ 0.779617] [<ffffffe0008e3974>] sifive_serial_probe+0x184/0x270
[ 0.785703] [<ffffffe000a1323c>] platform_drv_probe+0x32/0x5e
[ 0.791512] [<ffffffe000a11a2e>] really_probe+0x9a/0x21e
[ 0.796895] [<ffffffe000a11cb4>] driver_probe_device+0x2e/0x88
[ 0.802799] [<ffffffe000a11e86>] device_driver_attach+0x4c/0x50
[ 0.808789] [<ffffffe000a11ec4>] __driver_attach+0x3a/0xac
[ 0.814346] [<ffffffe000a1002a>] bus_for_each_dev+0x4a/0x72
[ 0.819989] [<ffffffe000a1153a>] driver_attach+0x1a/0x22
[ 0.825372] [<ffffffe000a1106c>] bus_add_driver+0x192/0x198
[ 0.831016] [<ffffffe000a123b2>] driver_register+0x3a/0xd0
[ 0.836573] [<ffffffe000a13202>] __platform_driver_register+0x3a/0x42
[ 0.843087] [<ffffffe00001749c>] sifive_serial_init+0x30/0x50
[ 0.848906] [<ffffffe0006000d0>] do_one_initcall+0x50/0x15a
[ 0.854545] [<ffffffe000001c80>] kernel_init_freeable+0x16a/0x1de
[ 0.860709] [<ffffffe000c3ad3c>] kernel_init+0x12/0x118
[ 0.866007] [<ffffffe00060120a>] ret_from_exception+0x0/0xc
[ 0.871690] printk: console [ttySIF0] enabled
[ 0.871690] printk: console [ttySIF0] enabled
[ 0.880442] printk: bootconsole [sbi0] disabled
[ 0.880442] printk: bootconsole [sbi0] disabled
[ 0.889866] 10011000.serial: ttySIF1 at MMIO 0x10011000 (irq = 10, base_baud = 0) is a SiFive UART v0
[ 0.899328] [drm] radeon kernel modesetting enabled.
[ 0.918039] loop: module loaded
[ 0.921025] sifive_spi 10040000.spi: mapped; irq=12, cs=1

The kernel doesn't hang and boots to prompt.

===============================================================
Adding the missing spin_lock_init fixes the spinlock error. With the patch
the serial port is tested with basic sanity checks of baudrate change,
receive and transmit using:

Change baud rate : stty -F /dev/ttySIF0 9600
Recive mode : cat /dev/ttySIF0
Transmit mode : echo "Hello" > /dev/ttySIF0

Some of the relevant working boot log:
===============================================================
[ 0.697617] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.704441] 10010000.serial: ttySIF0 at MMIO 0x10010000 (irq = 1, base_baud = 0) is a SiFive UART v0
[ 0.712916] printk: console [ttySIF0] enabled
[ 0.712916] printk: console [ttySIF0] enabled
[ 0.721663] printk: bootconsole [sbi0] disabled
[ 0.721663] printk: bootconsole [sbi0] disabled
[ 0.731077] 10011000.serial: ttySIF1 at MMIO 0x10011000 (irq = 10, base_baud = 0) is a SiFive UART v0

Change History:

V1 : Incorporated suggestions
- Initialised spin lock for sifive console as suggested.
- Updated reference link of reported issue from lore.kernel.org both in
cover-letter and in patch.

V0 : Base patch.

Sagar Shrikant Kadam (1):
tty: serial: add missing spin_lock_init for SiFive serial console

drivers/tty/serial/sifive.c | 1 +
1 file changed, 1 insertion(+)

--
2.7.4