Re: [PATCH 02/11] cxl/mem: Implement Get Event Records command
From: Ira Weiny
Date: Wed Nov 30 2022 - 00:10:14 EST
On Tue, Nov 29, 2022 at 12:26:20PM +0000, Jonathan Cameron wrote:
> On Mon, 28 Nov 2022 15:30:12 -0800
> Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
>
[snip]
> > > A valid reading of that temporal order comment is actually the other way around
> > > that the device must not reset it's idea of temporal order until all records
> > > have been read (reading 3 twice is not in temporal order - imagine we had
> > > read 5 each time and it becomes more obvious as the read order becomes
> > > 0,1,2,3,4,3,4,5,6,7 etc which is clearly not in temporal order by any normal
> > > reading of the term.
> >
> > Well I guess. My reading was that it must return the first element temporally
> > within the list at the time of the Get operation.
> >
> > So in this example since 3 is still in the list it must return it first. Each
> > read is considered atomic from the others. Yes as long as 0 is in the queue it
> > will be returned.
> >
> > But I can see it your way too...
>
> That pesky text under More Event Records flag doesn't mention clearing when it
> says "The host should continue to retrieve
> records using this command, until this indicator is no longer set by the
> device."
>
> I wish it did :(
>
As I have reviewed these in my head again I have come to the conclusion that
the More Event Records flags is useless. Let me explain:
The Clear all Records flag is useless because if an event which occurs between the
Get and Clear all operation will be dropped without the host having seen it.
However, while clearing records based on the handles read, additional events
could come in. Because of the way the interrupts are specified the host can't
be sure that those new events will cause a zero to non-zero transition. This
is because there is no way to guarantee all the events were cleared at the
moment the events came in.
I believe this is what you mentioned in another email about needing an 'extra
read' at the end to ensure there was nothing more to be read. But based on
that logic the only thing that matters is the Get Event.Record
Count. If it is not 0 keep on reading because while the host is clearing the
records another event could come in.
In other words, the only way to be sure that all records are seen is to do a
Get and see the number of records equal to 0. Thus any further events will
trigger an interrupt and we can safely exit the loop.
Ira
Basically the loop looks like:
int nr_rec;
do {
... <Get Events> ...
nr_rec = le16_to_cpu(payload->record_count);
... <for each record trace> ...
... <for each record clear> ...
} while (nr_rec);