Re: [PATCHv7 6/8] printk: use printk_safe buffers in printk

From: Jan Kara
Date: Wed Feb 01 2017 - 04:06:37 EST


On Tue 31-01-17 10:27:53, Ross Zwisler wrote:
> On Tue, Dec 27, 2016 at 7:16 AM, Sergey Senozhatsky
> <sergey.senozhatsky@xxxxxxxxx> wrote:
> > Use printk_safe per-CPU buffers in printk recursion-prone blocks:
> > -- around logbuf_lock protected sections in vprintk_emit() and
> > console_unlock()
> > -- around down_trylock_console_sem() and up_console_sem()
> >
> > Note that this solution addresses deadlocks caused by recursive
> > printk() calls only, that is, in vprintk_emit() and console_unlock().
> > The rest will be converted in a followup patch.
> >
> > Another thing to note is that we now keep lockdep enabled in printk,
> > because the printk-safe mechanism protects us against printk recursion
> > caused by lockdep in vprintk_emit() - we first switch to the per-CPU
> > buffers and only then access the deadlock-prone locks.
>
> When booting v4.10-rc5-mmots-2017-01-26-15-49 from the mmots tree, I
> sometimes see the following lockdep splat which I think may be related
> to this commit?

I don't think it is really related. Look at the backtrace:

> [ 13.090684] printk+0x52/0x6e
> [ 13.090685] ? update_load_avg+0x85b/0xb80
> [ 13.090685] __warn+0x39/0xf0
> [ 13.090685] warn_slowpath_fmt+0x5f/0x80
> [ 13.090686] update_load_avg+0x85b/0xb80
> [ 13.090686] ? debug_smp_processor_id+0x17/0x20
> [ 13.090686] detach_task_cfs_rq+0x3f/0x210
> [ 13.090687] task_change_group_fair+0x24/0x100
> [ 13.090687] sched_change_group+0x5f/0x110
> [ 13.090687] sched_move_task+0x53/0x160
> [ 13.090687] cpu_cgroup_attach+0x36/0x70
> [ 13.090688] cgroup_migrate_execute+0x230/0x3f0
> [ 13.090688] cgroup_migrate+0xce/0x140
> [ 13.090688] ? cgroup_migrate+0x5/0x140
> [ 13.090689] cgroup_attach_task+0x27f/0x3e0
> [ 13.090689] ? cgroup_attach_task+0x9b/0x3e0
> [ 13.090689] __cgroup_procs_write+0x30e/0x510
> [ 13.090690] ? __cgroup_procs_write+0x70/0x510
> [ 13.090690] cgroup_procs_write+0x14/0x20
> [ 13.090690] cgroup_file_write+0x44/0x1e0
> [ 13.090690] kernfs_fop_write+0x13c/0x1c0
> [ 13.090691] __vfs_write+0x37/0x160
> [ 13.090691] ? rcu_read_lock_sched_held+0x4a/0x80
> [ 13.090691] ? rcu_sync_lockdep_assert+0x2f/0x60
> [ 13.090692] ? __sb_start_write+0x10d/0x220
> [ 13.090692] ? vfs_write+0x19b/0x1f0
> [ 13.090692] ? security_file_permission+0x3b/0xc0
> [ 13.090693] vfs_write+0xcb/0x1f0
> [ 13.090693] SyS_write+0x58/0xc0

Clearly, scheduler code (update_load_avg) calls WARN_ON from within the
scheduler while holding the rq_lock, which has always been forbidden.
Sergey and Petr were doing some work to prevent similar deadlocks but I'm
not sure how far they went...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR