Re: [PATCH] EDAC/device: Add sysfs notification for UE,CE count change

From: Deepti Jaggi
Date: Tue Aug 01 2023 - 18:37:41 EST


On 7/31/2023 10:48 PM, Trilok Soni wrote:
On 7/31/2023 3:40 PM, Trilok Soni wrote:
On 7/31/2023 3:00 PM, Deepti Jaggi wrote:
A daemon running in user space collects information on correctable
and uncorrectable errors from the EDAC driver by reading the
corresponding sysfs entries and takes appropriate action.

Which daemon are we referring to here? Can you please provide the link to the project?

Are you using this daemon?

https://mcelog.org/ - It is for x86, but is your daemon project different?


No, that daemon is not used. Our daemon is still under development and is more specific to Qualcomm use cases.
Based on my limited understanding of mcelog, it handles errors in an architecture-specific way.
With sysfs notification support added to the EDAC framework, drivers that do not use any custom sysfs attributes can take advantage of this change to notify a user space daemon polling on the ue_count and/or ce_count attributes.

This patch adds support for a user space daemon to wait in poll() until
the sysfs entries for UE count and CE count change and then read the
updated counts, instead of continuously monitoring the sysfs entries
for any changes.

The modifications below are architecture agnostic so I really want to know what exactly we are fixing and if there is a problem.


This change set adds support for user space to poll on the ue_count and/or ce_count sysfs attributes.
When ue_count or ce_count changes, the EDAC driver framework unblocks the user space poll, and user space can then read the updated ce_count and ue_count.

As an example, user space performs the following steps:
1. Open the sysfs attribute files for UE count and CE count.
2. Read the initial CE count and UE count.
3. Poll for changes on the CE count and UE count fds.
4. Once poll unblocks, read the updated count.
5. Take appropriate action on the changed counts.

#####################################################################
Simple example user space code snippet:

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_POLL_FDS 2

static const char ue_count_file[] = "/sys/devices/system/edac/qcom-llcc/qcom-llcc0/ue_count";
static const char ce_count_file[] = "/sys/devices/system/edac/qcom-llcc/qcom-llcc0/ce_count";

int main(void)
{
	struct pollfd poll_fds[MAX_POLL_FDS] = {0};
	char data[100];
	ssize_t len;
	int ret;

	poll_fds[0].fd = open(ue_count_file, O_RDONLY);
	poll_fds[1].fd = open(ce_count_file, O_RDONLY);

	/* Read the initial value before poll and set the poll events */
	for (int i = 0; i < MAX_POLL_FDS; i++) {
		if (poll_fds[i].fd < 0) {
			perror("open");
			return 1;
		}
		len = read(poll_fds[i].fd, data, sizeof(data));
		poll_fds[i].events = POLLPRI;
	}

	while (1) {
		/* Block on poll until ue_count or ce_count changes */
		ret = poll(poll_fds, MAX_POLL_FDS, -1);
		if (ret < 0) {
			perror("poll");
			return 1;
		}

		for (int i = 0; i < MAX_POLL_FDS; i++) {
			if (poll_fds[i].revents & POLLPRI) {
				/*
				 * Read the changed UE/CE count. lseek() back to
				 * the start (or close/re-open the fd) first.
				 */
				lseek(poll_fds[i].fd, 0, SEEK_SET);
				len = read(poll_fds[i].fd, data, sizeof(data));

				/* Take an appropriate action */
			}
		}
	}
}
######################################################################
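
For step 5 the daemon will usually want the count as a number rather than raw bytes. Below is a minimal sketch of one way to do that. It assumes the attribute contains a single unsigned decimal value followed by a newline; read_count() is a hypothetical helper name used only for illustration and is not part of the patch.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: rewind an already-open attribute fd and parse
 * its contents as an unsigned decimal count.
 * Returns 0 on success, -1 on an I/O or parse error.
 */
static int read_count(int fd, unsigned long *count)
{
	char buf[32];
	char *end;
	ssize_t n;

	/* Rewind so the read starts at the beginning of the attribute. */
	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;

	n = read(fd, buf, sizeof(buf) - 1);
	if (n <= 0)
		return -1;
	buf[n] = '\0';

	/* Assumption: the attribute holds one decimal number. */
	errno = 0;
	*count = strtoul(buf, &end, 10);
	if (errno != 0 || end == buf)
		return -1;

	return 0;
}
```

In the poll loop this could replace the lseek()/read() pair for the fd whose revents has POLLPRI set, with the daemon comparing the returned value against the previous count before acting.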

+ CC linux-arm-msm

Please keep linux-arm-msm in CC if there is a next revision.


Noted.


--Deepti