Re: [RFC 1/1] edac: Add a counter parameter for edac_device_handle_ue/ce()

From: Robert Richter
Date: Thu Aug 01 2019 - 10:17:10 EST


On 01.08.19 15:29:03, Hawa, Hanna wrote:
> On 8/1/2019 2:35 PM, Robert Richter wrote:
> > On 15.07.19 13:53:07, Hanna Hawa wrote:
> > > Add a counter parameter to avoid losing the error count for an edac
> > > device. The error count reports the number of errors reported by an edac
> > > device, similar to the way MC_EDAC does.
> > >
> > > Signed-off-by: Hanna Hawa <hhhawa@xxxxxxxxxx>
> > > ---
> > > drivers/edac/altera_edac.c | 20 ++++++++++++--------
> > > drivers/edac/amd8111_edac.c | 6 +++---
> > > drivers/edac/cpc925_edac.c | 4 ++--
> > > drivers/edac/edac_device.c | 18 ++++++++++--------
> > > drivers/edac/edac_device.h | 8 ++++++--
> > > drivers/edac/highbank_l2_edac.c | 4 ++--
> > > drivers/edac/mpc85xx_edac.c | 4 ++--
> > > drivers/edac/mv64x60_edac.c | 4 ++--
> > > drivers/edac/octeon_edac-l2c.c | 20 ++++++++++----------
> > > drivers/edac/octeon_edac-pc.c | 6 +++---
> > > drivers/edac/qcom_edac.c | 8 ++++----
> > > drivers/edac/thunderx_edac.c | 10 +++++-----
> > > drivers/edac/xgene_edac.c | 26 +++++++++++++-------------
> > > 13 files changed, 74 insertions(+), 64 deletions(-)
> >
> > > diff --git a/drivers/edac/edac_device.h b/drivers/edac/edac_device.h
> > > index 1aaba74..cf1a1da 100644
> > > --- a/drivers/edac/edac_device.h
> > > +++ b/drivers/edac/edac_device.h
> > > @@ -290,23 +290,27 @@ extern struct edac_device_ctl_info *edac_device_del_device(struct device *dev);
> > > * perform a common output and handling of an 'edac_dev' UE event
> > > *
> > > * @edac_dev: pointer to struct &edac_device_ctl_info
> > > + * @error_count: number of errors of the same type
> > > * @inst_nr: number of the instance where the UE error happened
> > > * @block_nr: number of the block where the UE error happened
> > > * @msg: message to be printed
> > > */
> > > extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
> > > - int inst_nr, int block_nr, const char *msg);
> > > + u16 error_count, int inst_nr, int block_nr,
> > > + const char *msg);
> > > /**
> > > * edac_device_handle_ce():
> > > * perform a common output and handling of an 'edac_dev' CE event
> > > *
> > > * @edac_dev: pointer to struct &edac_device_ctl_info
> > > + * @error_count: number of errors of the same type
> > > * @inst_nr: number of the instance where the CE error happened
> > > * @block_nr: number of the block where the CE error happened
> > > * @msg: message to be printed
> > > */
> > > extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
> >
> > How about renaming this to __edac_device_handle_ce() and then have 2
> > macros for:
> >
> > * edac_device_handle_ce() to keep old i/f.
> >
> > * edac_device_handle_ce_count(), with count parameter added.
> >
> > Same for uncorrectable errors.
> >
> > Code of other driver can be kept as it is then.
>
> Don't you think it'll be confusing to have different APIs between EDAC_MC and
> EDAC_DEVICE?
> (in MC the count is passed as part of edac_mc_handle_error())

I don't think edac_mc_handle_error() with 11 function arguments is a
good reference for something we want to adopt. For the majority of
drivers you just introduce another useless argument with the following
pattern:

edac_device_handle_ce(edac_dev, 1, 0, 0, edac_dev_name);

IMO, the API should be improved when touching it.
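
A minimal sketch of the wrapper approach suggested above, written as a
userspace stand-in so it compiles on its own. The struct fields, the
unsigned int count type (the patch uses u16) and the printed message
format are assumptions for illustration, not the actual edac_device
code. Only the three names __edac_device_handle_ce(),
edac_device_handle_ce() and edac_device_handle_ce_count() come from the
proposal; the UE variants follow "same for uncorrectable errors".

```c
#include <assert.h>
#include <stdio.h>

/*
 * Userspace stand-in for the kernel structure. Only the fields this
 * sketch needs are present, and their names are hypothetical.
 */
struct edac_device_ctl_info {
	const char *name;
	unsigned int ce_count;
	unsigned int ue_count;
};

/* Common worker for correctable errors; takes the count explicitly. */
static void __edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
				    unsigned int count, int inst_nr,
				    int block_nr, const char *msg)
{
	assert(edac_dev != NULL);	/* stand-in for kernel sanity checks */
	edac_dev->ce_count += count;
	printf("EDAC %s: %u CE(s) on instance %d block %d: %s\n",
	       edac_dev->name, count, inst_nr, block_nr, msg);
}

/* Common worker for uncorrectable errors, same shape as the CE one. */
static void __edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
				    unsigned int count, int inst_nr,
				    int block_nr, const char *msg)
{
	assert(edac_dev != NULL);
	edac_dev->ue_count += count;
	printf("EDAC %s: %u UE(s) on instance %d block %d: %s\n",
	       edac_dev->name, count, inst_nr, block_nr, msg);
}

/* Old interface: existing callers stay untouched and report one error. */
#define edac_device_handle_ce(edac_dev, inst_nr, block_nr, msg) \
	__edac_device_handle_ce(edac_dev, 1, inst_nr, block_nr, msg)
#define edac_device_handle_ue(edac_dev, inst_nr, block_nr, msg) \
	__edac_device_handle_ue(edac_dev, 1, inst_nr, block_nr, msg)

/* New interface: only drivers that know the real count use these. */
#define edac_device_handle_ce_count(edac_dev, count, inst_nr, block_nr, msg) \
	__edac_device_handle_ce(edac_dev, count, inst_nr, block_nr, msg)
#define edac_device_handle_ue_count(edac_dev, count, inst_nr, block_nr, msg) \
	__edac_device_handle_ue(edac_dev, count, inst_nr, block_nr, msg)
```

With this split, a driver that cannot tell how many errors occurred
keeps calling edac_device_handle_ce(edac_dev, 0, 0, msg), and only a
driver with a hardware error counter switches to the _count variant.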

-Robert