Re: [PATCH printk 4/5] printk: remove logbuf_lock, add syslog_lock

From: Petr Mladek
Date: Wed Sep 23 2020 - 12:30:22 EST


On Tue 2020-09-22 17:44:15, John Ogness wrote:
> Since the ringbuffer is lockless, there is no need for it to be
> protected by @logbuf_lock. Remove @logbuf_lock.
>
> This means that printk_nmi_direct and printk_safe_flush_on_panic()
> no longer need to acquire any lock to run.
>
> The global variables @syslog_seq, @syslog_partial, @syslog_time,
> @clear_seq were also protected by @logbuf_lock. Introduce
> @syslog_lock to protect these.
>
> @console_seq, @exclusive_console_stop_seq, @console_dropped are
> protected by @console_lock.
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 763494d1d6b3..65e3cdbddeff 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> #ifdef CONFIG_PRINTK
> DECLARE_WAIT_QUEUE_HEAD(log_wait);
> +
> +/* All 3 protected by @syslog_lock. */
> /* the next printk record to read by syslog(READ) or /proc/kmsg */
> static u64 syslog_seq;
> static size_t syslog_partial;
> static bool syslog_time;

I agree that it makes sense to synchronize these three variables
in the three locations where they are used together, see below.

> +/* All 3 protected by @console_lock. */
> /* the next printk record to write to the console */
> static u64 console_seq;
> static u64 exclusive_console_stop_seq;
> static unsigned long console_dropped;
>
> +/* Protected by @syslog_lock. */
> /* the next printk record to read after the last 'clear' command */
> static u64 clear_seq;

The synchronization of this variable is strange. It is not clear
which concurrent changes it is supposed to be protected against.


> @@ -823,7 +793,6 @@ static loff_t devkmsg_llseek(struct file *file, loff_t offset, int whence)
> if (offset)
> return -ESPIPE;
>
> - logbuf_lock_irq();
> switch (whence) {
> case SEEK_SET:
> /* the first record */

SEEK_SET does:

user->seq = prb_first_valid_seq(prb);

I wonder whether we actually need to protect user->seq with the
user->lock mutex, as is done in devkmsg_read(). The logbuf_lock
previously prevented a possible race here. That said, such a race
is not very realistic.

> @@ -858,7 +828,6 @@ static __poll_t devkmsg_poll(struct file *file, poll_table *wait)
>
> poll_wait(file, &log_wait, wait);
>
> - logbuf_lock_irq();

This too should probably be replaced by the user->lock mutex.

> if (prb_read_valid(prb, user->seq, NULL)) {
> /* return error when data has vanished underneath us */
> if (user->seq < prb_first_valid_seq(prb))


> @@ -1593,8 +1576,11 @@ int do_syslog(int type, char __user *buf, int len, int source)
> return 0;
> if (!access_ok(buf, len))
> return -EFAULT;
> + syslog_lock_irq();
> + seq = syslog_seq;
> + syslog_unlock_irq();

It is not clear why a lock is suddenly needed here.

All locks taken around a single variable read or write are
suspicious. They only guard against observing an inconsistent value
(due to compiler optimizations, or non-atomic 64-bit accesses on
a 32-bit system).

That might make sense, but it was clearly not a concern before.

> error = wait_event_interruptible(log_wait,
> - prb_read_valid(prb, syslog_seq, NULL));
> + prb_read_valid(prb, seq, NULL));
> if (error)
> return error;
> error = syslog_print(buf, len);
> @@ -1642,7 +1628,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
> break;
> /* Number of chars in the log buffer */
> case SYSLOG_ACTION_SIZE_UNREAD:
> - logbuf_lock_irq();
> + syslog_lock_irq();

I agree that some locking is needed here to keep @syslog_seq,
@syslog_partial, and @syslog_time consistent.


> if (syslog_seq < prb_first_valid_seq(prb)) {
> /* messages are gone, move to first one */
> syslog_seq = prb_first_valid_seq(prb);
> @@ -1669,7 +1655,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
> }
> error -= syslog_partial;
> }
> - logbuf_unlock_irq();
> + syslog_unlock_irq();
> break;
> /* Size of the log buffer */
> case SYSLOG_ACTION_SIZE_BUFFER:
> @@ -2106,10 +2092,9 @@ asmlinkage int vprintk_emit(int facility, int level,
> boot_delay_msec(level);
> printk_delay();
>
> - /* This stops the holder of console_sem just where we want him */
> - logbuf_lock_irqsave(flags);
> + printk_safe_enter_irqsave(flags);
> printed_len = vprintk_store(facility, level, dev_info, fmt, args);
> - logbuf_unlock_irqrestore(flags);

Why exactly does this need to be called in a printk_safe context,
please?

Infinite recursion could be prevented by a per-CPU counter. A lack
of line buffers could hopefully be prevented by vscnprintf(NULL,
...) or by extending the pool in the 2nd patch.

Is there any other reason, please?

> + printk_safe_exit_irqrestore(flags);
>
> /* If called from the scheduler, we can not call up(). */
> if (!in_sched) {


> @@ -2691,9 +2670,9 @@ void console_flush_on_panic(enum con_flush_mode mode)
> if (mode == CONSOLE_REPLAY_ALL) {
> unsigned long flags;
>
> - logbuf_lock_irqsave(flags);
> + local_irq_save(flags);
> console_seq = prb_first_valid_seq(prb);
> - logbuf_unlock_irqrestore(flags);
> + local_irq_restore(flags);

What is the reason for disabling IRQs here, please?

> }
> console_unlock();
> }
> @@ -3476,17 +3449,14 @@ bool kmsg_dump_get_buffer(struct kmsg_dumper *dumper, bool syslog,
> if (!dumper->active || !buf || !size)
> goto out;
>
> - logbuf_lock_irqsave(flags);

I wonder whether logbuf_lock actually also synchronized some
variables in struct kmsg_dumper (cur_seq and next_seq).

We might need to add a lock to struct kmsg_dumper.


> if (dumper->cur_seq < prb_first_valid_seq(prb)) {
> /* messages are gone, move to first available one */
> dumper->cur_seq = prb_first_valid_seq(prb);
> }
>
> /* last entry */
> - if (dumper->cur_seq >= dumper->next_seq) {
> - logbuf_unlock_irqrestore(flags);
> + if (dumper->cur_seq >= dumper->next_seq)
> goto out;
> - }
>
> /* calculate length of entire buffer */
> seq = dumper->cur_seq;

Sigh, I wish the locking world was easier.

Best Regards,
Petr