Re: [PATCH] mm/backing-dev.c: fix crash when USB/SCSI device is detached

From: Rabin Vincent
Date: Sun Jan 15 2012 - 05:29:53 EST


On Thu, Jan 5, 2012 at 14:19, Chanho Min <chanho0207@xxxxxxxxx> wrote:
>>On Tue, Jan 03, 2012 at 12:23:44PM +0900, Chanho Min wrote:
>>> >On Mon, Jan 02, 2012 at 06:38:21PM +0900,wrote:
>>> >> from Chanho Min <chanho.min@xxxxxxx>
>>> >>
>>> >> The system may crash in backing-dev.c when a removable SCSI device is
>>> >> detached. The bdi task is killed by bdi_unregister()/'khubd', but the
>>> >> task's pointer remains. Shortly afterward, if 'wb->wakeup_timer'
>>> >> expires before del_timer()/bdi_forker_thread, wakeup_timer_fn() may
>>> >> wake up the dead thread, which causes the crash. 'bdi->wb.task'
>>> >> should be set to NULL, as this patch does.
>>
>>I noticed a related fix is merged recently, does your test kernel
>>contain this commit?
>>
> No, I will try to reproduce with this patch.
> But bdi_destroy() is not called during write access, so the same result
> is expected.

I agree. 7a401a972df8e184b3d1a3fc958c0a4ddee8d312 only addressed the
problem of the bdi being destroyed with an active timer, but there are
other races that could happen before that.

>>This patch makes no guarantee that wakeup_timer_fn() will see a NULL
>>bdi->wb.task before the task is stopped, so there are still race
>>conditions. The complete fix would be to prevent wakeup_timer_fn()
>>from being called at all.
>
> If wakeup_timer_fn() sees a NULL bdi->wb.task, it regards the task as
> killed and wakes up the forker thread instead of the defined thread.
> Is this the intended behavior of the bdi?

This appears to be the intended behaviour while the bdi is registered,
but certainly not after it is unregistered, since the forker thread will
not find the bdi on the list anyway. In fact, if tracing is enabled, the
kernel crashes because dev_name() is called on a NULL bdi->dev from the
wake_forker_thread tracepoint.
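
To make the three cases concrete, here is a rough userspace model of the
decision the timer callback has to make. It is only a sketch for
discussion, not kernel code and not the patch itself: the names
model_bdi, model_wakeup_timer_fn and model_unregister are made up, the
"registered" flag stands in for a non-NULL bdi->dev, and locking is left
out entirely (the real code would need the fields to be read and cleared
under the same lock).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the possible outcomes of the wakeup timer. */
enum wake_action {
    WAKE_NONE,     /* bdi already unregistered: do nothing */
    WAKE_WB_TASK,  /* per-bdi thread exists: wake it */
    WAKE_FORKER    /* no per-bdi thread yet: ask the forker thread */
};

/* Stand-in for the relevant parts of struct backing_dev_info. */
struct model_bdi {
    bool registered;      /* models "bdi->dev != NULL" */
    const char *wb_task;  /* models bdi->wb.task; NULL when no thread */
};

/* Decide what the timer callback should do for this bdi. */
static enum wake_action model_wakeup_timer_fn(const struct model_bdi *bdi)
{
    if (!bdi->registered)
        return WAKE_NONE;   /* forker would not find it on the list */
    if (bdi->wb_task)
        return WAKE_WB_TASK;
    return WAKE_FORKER;
}

/* Models unregistration: clear both fields so a late timer is a no-op. */
static void model_unregister(struct model_bdi *bdi)
{
    bdi->registered = false;
    bdi->wb_task = NULL;
}
```

The point of the model is only that a late timer on an unregistered bdi
should fall into the first branch, waking neither the dead thread nor the
forker thread.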

The following patch should address these issues:

8<---------------------------