Re: [PATCH printk v3 04/14] printk: ringbuffer: Do not skip non-finalized records with prb_next_seq()

From: John Ogness
Date: Mon Feb 05 2024 - 06:34:02 EST


On 2024-01-15, Petr Mladek <pmladek@xxxxxxxx> wrote:
>> The acquire is with @last_finalized_seq. So the release must also be
>> with @last_finalized_seq. The important thing is that the CPU that
>> updates @last_finalized_seq has actually read the corresponding
>> record beforehand. That is exactly what desc_update_last_finalized()
>> does.
>
> I probably did not describe it well. The CPU updating
> @last_finalized_seq does the right thing. I was not sure about the CPU
> which reads @last_finalized_seq via prb_next_seq().
>
> To make it more clear:
>
> u64 prb_next_seq(struct printk_ringbuffer *rb)
> {
>         u64 seq;
>
>         seq = desc_last_finalized_seq(rb);
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^
>                 |
>                 `-> This includes atomic_long_read_acquire(last_finalized_seq)
>
>         if (seq != 0)
>                 seq++;
>
>         while (_prb_read_valid(rb, &seq, NULL, NULL))
>                 seq++;
>
>         return seq;
> }
>
> But where is the atomic_long_read_release(last_finalized_seq) in
> this code path?

read_release? The counterpart of this load_acquire is a
store_release. For example:

CPU0                     CPU1
====                     ====
load(varA)
store_release(varB)      load_acquire(varB)
                         load(varA)

If CPU1 reads the value in varB that CPU0 stored, then it is guaranteed
that CPU1 will read the value (or a later value) in varA that CPU0 read.

Translating the above example to this particular patch, we have:

CPU0: desc_update_last_finalized()       CPU1: prb_next_seq()
====                                     ====
_prb_read_valid(seq)
cmpxchg_release(last_finalized_seq,seq)  seq=read_acquire(last_finalized_seq)
                                         _prb_read_valid(seq)

> IMHO, the barrier provided by the acquire() is _important_ to make
> sure that _prb_read_valid() would see the valid descriptor.

Correct.

> Now, I think that the related read_release(seq) is hidden in:
>
> static int prb_read(struct printk_ringbuffer *rb, u64 seq,
>                     struct printk_record *r, unsigned int *line_count)
> {
>         /* Get a local copy of the correct descriptor (if available). */
>         err = desc_read_finalized_seq(desc_ring, id, seq, &desc);
>
>         /* If requested, copy meta data. */
>         if (r->info)
>                 memcpy(r->info, info, sizeof(*(r->info)));
>
>         /* Copy text data. If it fails, this is a data-less record. */
>         if (!copy_data(&rb->text_data_ring, &desc.text_blk_lpos, info->text_len,
>                        r->text_buf, r->text_buf_size, line_count)) {
>                 return -ENOENT;
>         }
>
>         /* Ensure the record is still finalized and has the same @seq. */
>         return desc_read_finalized_seq(desc_ring, id, seq, &desc);
>                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                  |
>                  `-> This includes a memory barrier /* LMM(desc_read:A) */
>                      which makes sure that the data are read before
>                      the desc/data could be reused.
> }
>
> I consider this /* LMM(desc_read:A) */ as a counter part for that
> acquire() in prb_next_seq().

desc_read:A is not a memory barrier. It only marks the load of the
descriptor state. This is a significant load because prb_next_seq() must
see at least the descriptor state that desc_update_last_finalized() saw.

The memory barrier comments in desc_update_last_finalized() state:

* If desc_last_finalized_seq:A reads from
* desc_update_last_finalized:A, then desc_read:A reads from
* _prb_commit:B.

This is referring to a slightly different situation than the example I
used above because it is referencing where the descriptor state was
stored (_prb_commit:B). The same general picture is valid:

CPU0                            CPU1
====                            ====
_prb_commit:B
desc_update_last_finalized:A    desc_last_finalized_seq:A
                                desc_read:A

desc_read:A is loading the descriptor state that _prb_commit:B stored.

The extra note in the comment clarifies that _prb_commit:B could also be
denoted as desc_read:A because desc_update_last_finalized() performs a
read of (i.e. must have seen) the state that _prb_commit:B stored.

* Note: _prb_commit:B and desc_update_last_finalized:A can be
* different CPUs. However, the desc_update_last_finalized:A
* CPU (which performs the release) must have previously seen
* _prb_commit:B.

Normally the CPU committing the record will also update
last_finalized_seq. But it is possible that another CPU updates
last_finalized_seq before the committing CPU because it already sees the
finalized record. In that case the complete (maximally complex) picture
looks like this:

CPU0             CPU1                            CPU2
====             ====                            ====
_prb_commit:B    desc_read:A
                 desc_update_last_finalized:A    desc_last_finalized_seq:A
                                                 desc_read:A

Any memory barriers in _prb_commit() or desc_read() are irrelevant for
guaranteeing that a CPU reading a sequence value from
desc_last_finalized_seq() will always be able to read that record.

> Summary:
>
> I saw atomic_long_read_acquire(last_finalized_seq) called from
> prb_next_seq() code path. The barrier looked important to me.
> But I saw neither the counter-part nor any comment. I wanted
> to understand it because it might be important for reviewing
> following patches which depend on prb_next_seq().

desc_update_last_finalized:A is the counterpart to
desc_last_finalized_seq:A. IMHO there are plenty of comments formally
documenting these memory barriers, including the new entry in the
summary of all memory barriers:

* desc_update_last_finalized:A / desc_last_finalized_seq:A
* store finalized record, then set new highest finalized sequence number

John