Re: [PATCH 2/2] edac: add support for Amazon's Annapurna Labs EDAC

From: Borislav Petkov
Date: Tue Jun 11 2019 - 23:53:19 EST


On Wed, Jun 12, 2019 at 08:25:52AM +1000, Benjamin Herrenschmidt wrote:
> Yes, we would be in a world of pain already if tracepoints couldn't
> handle concurrency :-)

Right, lockless buffer and the whole shebang :)

> Sort-of... I still don't see a race in what we propose but I might be
> missing something subtle. We are talking about two drivers for two
> different IP blocks updating different counters etc...

If you do only *that* you should be fine. That should technically be ok.

I still think, though, that the sensible thing to do is have one
platform driver which concentrates all RAS functionality. It is the
more sensible design and takes care of potential EDAC shortcomings and
the need to communicate between the different logging functionality,
as in, for example, "I had so many errors, lemme go and increase DRAM
scrubber frequency." Plus all the other advantages of having
everything in a single driver.

And x86 already does that - we even have a single driver for all AMD
platforms - amd64_edac. Intel has a couple but there's still a lot of
sharing.

But apparently ARM folks want to have one driver per IP block. And we
have this discussion each time a new vendor decides to upstream its
driver. And there's no shortage of vendors in ARM-land trying to do
that.

James and I have tried to come up with a nice scheme to make that work
on ARM and he has an example prototype here:

http://www.linux-arm.org/git?p=linux-jm.git;a=shortlog;h=refs/heads/edac_dummy/v1

to show what it could look like.

But I'm slowly growing a serious aversion to having this very same
discussion each time an ARM vendor sends a driver. And that happens
pretty often nowadays.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.