Re: [RFC PATCH v5 1/3] printk-rb: new printk ringbuffer implementation (writer)

From: Petr Mladek
Date: Mon Dec 02 2019 - 11:00:01 EST


On Mon 2019-12-02 16:48:41, Petr Mladek wrote:
> > +/* Reserve a new descriptor, invalidating the oldest if necessary. */
> > +static bool desc_reserve(struct printk_ringbuffer *rb, u32 *id_out)
> > +{
> > + struct prb_desc_ring *desc_ring = &rb->desc_ring;
> > + struct prb_desc *desc;
> > + u32 id_prev_wrap;
> > + u32 head_id;
> > + u32 id;
> > +
> > + head_id = atomic_read(&desc_ring->head_id);
> > +
> > + do {
> > + desc = to_desc(desc_ring, head_id);
> > +
> > + id = DESC_ID(head_id + 1);
> > + id_prev_wrap = DESC_ID_PREV_WRAP(desc_ring, id);
> > +
> > + if (id_prev_wrap == atomic_read(&desc_ring->tail_id)) {
> > + if (!desc_push_tail(rb, id_prev_wrap))
> > + return false;
> > + }
> > + } while (!atomic_try_cmpxchg(&desc_ring->head_id, &head_id, id));
>
> Hmm, in theory, ABA problem might cause that we successfully
> move desc_ring->head_id when tail has not been pushed yet.
>
> As a result we would never call desc_push_tail() until
> it overflows again.
>
> I am not sure if we need to take care of it. The code is called with
> interrupts disabled. IMHO, only NMI could cause ABA problem
> in reality. But the game (debugging) is lost anyway when NMI ovewrites
> the buffer several times.

BTW: If I am counting correctly. The ABA problem would happen when
exactly 2^30 (1G) messages is written in the mean time.

> Also it should not be a complete failure. We still could bail out when
> the previous state was not reusable. We are on the safe side
> when it was reusable.
>
> Also we could probably detect when desc_ring->tail_id is not
> updated any longer and do something about it.
>
> > +
> > + desc = to_desc(desc_ring, id);
> > +
> > + /* Set the descriptor's ID and also set its state to reserved. */
> > + atomic_set(&desc->state_var, id | 0);
>
> We should check here that the original state id from the previous
> wrap in reusable state (or 0 for bootstrap). On error, we know that
> there was the ABA problem and would need to do something about it.

I believe that it should be enough to add this check and
bail out in case of error.

Best Regards,
Petr