Re: [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re:[GIT PULL] core block bits for 2.6.37-rc1)
From: Maxim Levitsky
Date: Sat Oct 23 2010 - 16:33:25 EST
On Sat, 2010-10-23 at 20:43 +0200, Jens Axboe wrote:
> On 2010-10-23 20:21, Ingo Molnar wrote:
> >
> > * Jens Axboe <jaxboe@xxxxxxxxxxxx> wrote:
> >
> >>> Looks like a fairly straightforward case of uninitialized memory and
> >>> blk_sync_queue() -> throtl_shutdown_timer() -> cancel_delayed_work_sync().
> >>>
> >>> Will get that fixed up.
> >>
> >> It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the final put
> >> happens, blk_sync_queue() is called and then ends up doing the
> >> cancel_delayed_work_sync() on freed memory.
> >>
> >> Two possible fixes:
> >>
> >> - Clear ->td when the queue is going dead. May require other ->td == NULL
> >> checks in the code, so I opted for:
> >>
> >> - Move the free to when the queue is really going away, post doing the
> >> blk_sync_queue() call.
> >>
> >> The below should fix it.
> >>
> >> Signed-off-by: Jens Axboe <jaxboe@xxxxxxxxxxxx>
> >
> > This did the trick, thanks Jens!
>
> Great, thanks for testing/reporting! I added your reported/tested-by.
>
> Linus, please pull this single fix, better get this out the door since
> I'll be travelling very shortly.
>
>
> git://git.kernel.dk/linux-2.6-block.git for-2.6.37/core
>
> Jens Axboe (1):
> block: fix use-after-free bug in blk throttle code
>
> block/blk-core.c | 2 --
> block/blk-sysfs.c | 2 ++
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
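For anyone following along, here is a rough user-space sketch of the ordering problem Jens describes above and of the second fix (free only after the sync). Every name in it (td_sketch, queue_sketch, and the *_sketch functions) is a hypothetical stand-in for illustration, not the actual block-layer code or the actual patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the throttle data hanging off the queue. */
struct td_sketch {
    int work_pending;
};

/* Hypothetical stand-in for the request queue. */
struct queue_sketch {
    struct td_sketch *td;
    int dead;
};

/* Models blk_sync_queue(): it dereferences q->td, so td must be alive. */
static void sync_queue_sketch(struct queue_sketch *q)
{
    assert(q->td != NULL);
    q->td->work_pending = 0;    /* models cancel_delayed_work_sync() */
}

/* Models the cleanup step after the fix: mark dead, do NOT free td. */
static void cleanup_queue_sketch(struct queue_sketch *q)
{
    q->dead = 1;
}

/* Models the final release: sync first, free td only afterwards. */
static void release_queue_sketch(struct queue_sketch *q)
{
    sync_queue_sketch(q);
    free(q->td);
    q->td = NULL;
}

/* Runs the fixed ordering end to end; returns 0 on success. */
static int run_fixed_ordering(void)
{
    struct queue_sketch q = { .td = NULL, .dead = 0 };

    q.td = calloc(1, sizeof(*q.td));
    if (q.td == NULL)
        return -1;
    q.td->work_pending = 1;

    cleanup_queue_sketch(&q);   /* td is still valid here */
    release_queue_sketch(&q);   /* sync touches live memory, then frees */

    return (q.dead == 1 && q.td == NULL) ? 0 : -1;
}
```

The buggy ordering would be the same code with the free() moved into cleanup_queue_sketch() and the pointer left dangling, so that the later sync writes into freed memory.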
I have a very similar bug here.
It must have been caused by this patch series.
I pulled that tree, but it made no difference.
The system oopses/panics on removal of any hotpluggable device
(reproduced with xD, MemoryStick, and USB mass storage).
Here is the backtrace for a MemoryStick card:
<6>[ 24.138665] r592: IRQ: card removed
<1>[ 24.228293] BUG: unable to handle kernel NULL pointer dereference at 00000000000001f8
<1>[ 24.228966] IP: [<00000000000001f8>] 0x1f8
<4>[ 24.230739] PGD 0
<0>[ 24.231182] Oops: 0010 [#1] PREEMPT SMP
<0>[ 24.231182] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda3/alignment_offset
<4>[ 24.231182] CPU 1
<4>[ 24.231182] Modules linked in: dm_crypt firewire_net usb_storage usb_libusual cpufreq_powersave cpufreq_conservative cpufreq_userspace uvcvideo videodev v4l2_compat_ioctl32 acpi_cpufreq iwl3945 iwlcore snd_hda_codec_realtek mac80211 mperf r852 iTCO_wdt coretemp uhci_hcd sm_common ir_lirc_codec mspro_block snd_hda_intel ms_block ehci_hcd sdhci_pci lirc_dev joydev sbp2 nand snd_hda_codec cfg80211 firewire_ohci sdhci ir_sony_decoder ieee1394 nand_ids usbcore r592 ir_jvc_decoder snd_hwdep mmc_core nand_ecc ir_rc6_decoder ene_ir snd_pcm tg3 ir_rc5_decoder firewire_core mtd battery memstick ac ir_nec_decoder psmouse snd_page_alloc libphy sunrpc ir_core sg evdev serio_raw dm_mirror dm_region_hash dm_log dm_mod nouveau ttm drm_kms_helper drm i2c_algo_bit thermal video
<4>[ 32.881606]
<4>[ 32.881606] Pid: 543, comm: kworker/u:4 Not tainted 2.6.36+ #191 Nettiling/Aspire 5720
<4>[ 32.881606] RIP: 0010:[<00000000000001f8>] [<00000000000001f8>] 0x1f8
<4>[ 32.881606] RSP: 0018:ffff880037a03ab8 EFLAGS: 00010086
<4>[ 32.881606] RAX: ffff88007c0ebc00 RBX: ffff880037af9470 RCX: 0000000000000000
<4>[ 32.881606] RDX: 0000000000000019 RSI: 0000000000000001 RDI: ffff880037af9470
<4>[ 32.881606] RBP: ffff880037a03ad0 R08: 0000000000000000 R09: 0000000000000001
<4>[ 32.881606] R10: 00000000000002f0 R11: 0000000000000000 R12: ffff880037af9470
<4>[ 32.881606] R13: ffff880075d6a870 R14: ffff880075bfb560 R15: 0000000000000282
<4>[ 32.881606] FS: 0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
<4>[ 32.881606] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[ 32.881606] CR2: 00000000000001f8 CR3: 000000007a046000 CR4: 00000000000006e0
<4>[ 32.881606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[ 32.881606] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[ 32.881606] Process kworker/u:4 (pid: 543, threadinfo ffff880037a02000, task ffff88007c5b0000)
<0>[ 32.881606] Stack:
<4>[ 32.881606] ffffffff811c42a2 ffff880037a03af0 ffff880037af9470 ffff880037a03af0
<4>[ 32.881606] <0> ffffffff811c525a ffff880077250040 ffff880077250040 ffff880037a03b10
<4>[ 32.881606] <0> ffffffff811cebb2 ffff880075d6a800 ffff880075d6a8a8 ffff880037a03b30
<0>[ 32.881606] Call Trace:
<4>[ 32.881606] [<ffffffff811c42a2>] ? elv_drain_elevator+0x22/0x70
<4>[ 32.881606] [<ffffffff811c525a>] elv_quiesce_start+0x3a/0xc0
<4>[ 32.881606] [<ffffffff811cebb2>] disk_replace_part_tbl+0x42/0x70
<4>[ 32.881606] [<ffffffff811cec63>] disk_release+0x23/0x50
<4>[ 32.881606] [<ffffffff81273c42>] device_release+0x22/0x90
<4>[ 32.881606] [<ffffffff811daced>] kobject_release+0x8d/0x1a0
<4>[ 32.881606] [<ffffffff811dac60>] ? kobject_release+0x0/0x1a0
<4>[ 32.881606] [<ffffffff811dc257>] kref_put+0x37/0x70
<4>[ 32.881606] [<ffffffff811dab67>] kobject_put+0x27/0x60
<4>[ 32.881606] [<ffffffff811cef42>] put_disk+0x12/0x20
<4>[ 32.881606] [<ffffffffa0627663>] mspro_block_disk_release+0xa3/0xb0 [mspro_block]
<4>[ 32.881606] [<ffffffffa062773d>] mspro_block_remove+0xcd/0x140 [mspro_block]
<4>[ 32.881606] [<ffffffffa01d42b5>] memstick_device_remove+0x35/0x60 [memstick]
<4>[ 32.881606] [<ffffffff81277630>] __device_release_driver+0x70/0xe0
<4>[ 32.881606] [<ffffffff8127779a>] device_release_driver+0x2a/0x40
<4>[ 32.881606] [<ffffffff812769b5>] bus_remove_device+0xb5/0x120
<4>[ 32.881606] [<ffffffff81274817>] device_del+0x127/0x1d0
<4>[ 32.881606] [<ffffffff812748dd>] device_unregister+0x1d/0x60
<4>[ 32.881606] [<ffffffffa01d5071>] memstick_check+0x241/0x360 [memstick]
<4>[ 32.881606] [<ffffffff8105a740>] process_one_work+0x1c0/0x4d0
<4>[ 32.881606] [<ffffffff8105a6e2>] ? process_one_work+0x162/0x4d0
<4>[ 32.881606] [<ffffffffa01d4e30>] ? memstick_check+0x0/0x360 [memstick]
<4>[ 32.881606] [<ffffffff8105ae36>] worker_thread+0x156/0x410
<4>[ 32.881606] [<ffffffff8105ace0>] ? worker_thread+0x0/0x410
<4>[ 32.881606] [<ffffffff8105ed66>] kthread+0xb6/0xc0
<4>[ 32.881606] [<ffffffff81037fa6>] ? finish_task_switch+0x46/0xe0
<4>[ 32.881606] [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
<4>[ 32.881606] [<ffffffff8105ecb0>] ? kthread+0x0/0xc0
<4>[ 32.881606] [<ffffffff81003c10>] ? kernel_thread_helper+0x0/0x10
<0>[ 32.881606] Code: Bad RIP value.
<1>[ 32.881606] RIP [<00000000000001f8>] 0x1f8
<4>[ 32.881606] RSP <ffff880037a03ab8>
<0>[ 32.881606] CR2: 00000000000001f8
<4>[ 32.881606] ---[ end trace ca0206dec4457aff ]---
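As a reading aid for the "Oops: 0010" line above, the following is a small sketch that decodes the x86 page-fault error code bits. The bit layout is architectural; the function name describe_fault and the output wording are my own and only illustrative.

```c
#include <stdio.h>
#include <string.h>

/* x86 page-fault error code bits. */
#define PF_PROT  (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1)  /* 0: read access, 1: write access */
#define PF_USER  (1u << 2)  /* 0: kernel mode, 1: user mode */
#define PF_INSTR (1u << 4)  /* 1: fault was an instruction fetch */

/* Formats a short description of the error code into buf and returns buf. */
static const char *describe_fault(unsigned int code, char *buf, size_t len)
{
    const char *cause = (code & PF_PROT) ? "protection violation"
                                         : "page not present";
    const char *access = (code & PF_INSTR) ? "instruction fetch"
                       : (code & PF_WRITE) ? "write" : "read";
    const char *mode = (code & PF_USER) ? "user" : "kernel";

    snprintf(buf, len, "%s, %s, %s mode", cause, access, mode);
    return buf;
}
```

Calling describe_fault(0x0010, buf, sizeof(buf)) gives "page not present, instruction fetch, kernel mode". Together with RIP and CR2 both being 0x1f8, that suggests the CPU tried to execute at address 0x1f8, most likely via an indirect call through a bad function pointer, though I have not confirmed which one.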
Best regards,
Maxim Levitsky