Re: [PATCH 02/11] cxl/mem: Implement Get Event Records command
From: Jonathan Cameron
Date: Wed Nov 30 2022 - 09:05:42 EST
On Tue, 29 Nov 2022 21:09:58 -0800
Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
> On Tue, Nov 29, 2022 at 12:26:20PM +0000, Jonathan Cameron wrote:
> > On Mon, 28 Nov 2022 15:30:12 -0800
> > Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
> >
>
> [snip]
>
> > > > A valid reading of that temporal order comment is actually the other way around
> > > > that the device must not reset it's idea of temporal order until all records
> > > > have been read (reading 3 twice is not in temporal order - imagine we had
> > > > read 5 each time and it becomes more obvious as the read order becomes
> > > > 0,1,2,3,4,3,4,5,6,7 etc which is clearly not in temporal order by any normal
> > > > reading of the term.
> > >
> > > Well I guess. My reading was that it must return the first element temporally
> > > within the list at the time of the Get operation.
> > >
> > > So in this example since 3 is still in the list it must return it first. Each
> > > read is considered atomic from the others. Yes as long as 0 is in the queue it
> > > will be returned.
> > >
> > > But I can see it your way too...
> >
> > That pesky text under More Event Records flag doesn't mention clearing when it
> > says "The host should continue to retrieve
> > records using this command, until this indicator is no longer set by the
> > device."
> >
> > I wish it did :(
> >
>
> As I have reviewed these in my head again I have come to the conclusion that
> the More Event Records flags is useless. Let me explain:
>
> The Clear all Records flag is useless because if an event which occurs between the
> Get and Clear all operation will be dropped without the host having seen it.
Can still be used to get a known clean sheet if you don't care about a bunch
of records on initial boot because no data in flight yet etc.
Agreed it is no use if you care about content of the records.
Make sure interrupts are enabled before re-checking if there are new records
to close that race.
>
> However, while clearing records based on the handles read, additional events
> could come in. Because of the way the interrupts are specified the host can't
> be sure that those new events will cause a zero to non-zero transition. This
> is because there is no way to guarantee all the events were cleared at the
> moment the events came in.
>
> I believe this is what you mentioned in another email about needing an 'extra
> read' at the end to ensure there was nothing more to be read. But based on
> that logic the only thing that matters is the Get Event.Record
> Count. If it is not 0 keep on reading because while the host is clearing the
> records another event could come in.
>
> In other words, the only way to be sure that all records are seen is to do a
> Get and see the number of records equal to 0. Thus any further events will
> trigger an interrupt and we can safely exit the loop.
Agreed - standard race to close when ever we have a FIFO with edge interrupts
on how full it is.
More records is useful for a different potential pattern of non destructive
read and later clear. Or for a debug non destructive read.
int nr_rec;
<list>
round_we_go:
do {
... <for each record trace and add to list...> ...
...
} while (!MORE);
for_each_list_entry() {
clear records one at a time.
}
nr_rec = le16_to_cpu(payload->record_count);
if (nr_rec)
goto round_we_go;
...
>
> Ira
>
> Basically the loop looks like:
>
> int nr_rec;
>
> do {
> ... <Get Events> ...
>
> nr_rec = le16_to_cpu(payload->record_count);
>
> ... <for each record trace> ...
> ... <for each record clear> ...
>
> } while (nr_rec);
>