Re: [PATCH 5/9] HWPoison: add memory_failure_queue()
From: Huang Ying
Date: Mon May 23 2011 - 23:08:07 EST
On 05/24/2011 10:48 AM, Ingo Molnar wrote:
> * Huang Ying <ying.huang@xxxxxxxxx> wrote:
>>>> - How to deal with ring-buffer overflow? For example, the ring buffer is
>>>> full of corrected memory errors, and then a recoverable memory error occurs
>>>> but cannot be put into the perf ring buffer because of the overflow. How
>>>> should the recoverable memory error be handled?
>>> The solution is to make it large enough. With *every* queueing solution there
>>> will be some sort of queue size limit.
>> Another solution could be:
>> Create two ring buffers. One is for logging and will be read by the RAS
>> daemon; the other is for recovery, and an event record will be removed
>> from that ring buffer after all 'active filters' have been run on it.
>> Even if the RAS daemon is restarted or hangs, recoverable errors can
>> still be taken care of.
> Well, filters will always be executed since they execute when the event is
> inserted - not when it's extracted.
Filters executed in NMI context can run when the event is inserted, so
no buffering is needed. But filters executed in deferred IRQ context
need to run when the event is extracted.
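
To make the distinction concrete, here is a rough userspace sketch of the
split, not kernel code. The insert path runs the cheap filter immediately,
and only recoverable errors are copied into a small separate recovery ring
whose records are dropped once the deferred filter has run on them. All
names (struct recovery_ring, event_insert(), recovery_drain()) and the ring
size are made up for illustration, and a real NMI-safe ring would need
proper atomics and barriers, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>

#define RECOVERY_RING_SIZE 4	/* hypothetical size, chosen for the demo */

enum err_severity { ERR_CORRECTED, ERR_RECOVERABLE };

struct mem_err_event {
	unsigned long pfn;
	enum err_severity severity;
};

/* Hypothetical second ring: holds only events that need deferred work. */
struct recovery_ring {
	struct mem_err_event slots[RECOVERY_RING_SIZE];
	unsigned int head;	/* count of records written */
	unsigned int tail;	/* count of records consumed */
};

static unsigned int nmi_filter_hits;
static unsigned int deferred_actions;
static unsigned long last_recovered_pfn;

/* Insert-time filter: runs right away, so it needs no buffering. */
static void nmi_filter(const struct mem_err_event *ev)
{
	if (ev->severity == ERR_CORRECTED)
		nmi_filter_hits++;
}

/* Extract-time filter: stands in for work that cannot run in NMI context. */
static void deferred_filter(const struct mem_err_event *ev)
{
	last_recovered_pfn = ev->pfn;
	deferred_actions++;
}

/*
 * Insert path. Corrected errors never touch the recovery ring, so a flood
 * of them cannot crowd out a recoverable error. Returns false only when
 * the recovery ring itself is full.
 */
static bool event_insert(struct recovery_ring *r,
			 const struct mem_err_event *ev)
{
	nmi_filter(ev);
	if (ev->severity != ERR_RECOVERABLE)
		return true;
	if (r->head - r->tail == RECOVERY_RING_SIZE)
		return false;
	r->slots[r->head % RECOVERY_RING_SIZE] = *ev;
	r->head++;
	return true;
}

/* Extract path: run the deferred filter, then drop the record. */
static unsigned int recovery_drain(struct recovery_ring *r)
{
	unsigned int handled = 0;

	assert(r->head - r->tail <= RECOVERY_RING_SIZE);
	while (r->tail != r->head) {
		deferred_filter(&r->slots[r->tail % RECOVERY_RING_SIZE]);
		r->tail++;
		handled++;
	}
	return handled;
}
```

The recovery ring still has a size limit, so it can overflow too, but only
recoverable errors compete for its slots.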
> So if you worry about losing *filter* executions (and dependent policy action)
> - there should be no loss there, ever.
> But yes, the scheme you outline would work as well: a counting-only event with
> a filter specified - this will do no buffering at all.
> So ... to get the ball rolling in this area one of you guys active in RAS
> should really try a first approximation for the active filter approach: add a
> test-TRACE_EVENT() for the errors you are interested in and define a convenient
> way to register policy action with post-filter events. This should work even
> without having the 'active' portion defined at the ABI and filter-string level.
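
As a first approximation of "register policy action with post-filter
events", something like the following userspace sketch could serve as a
starting point for discussion. It is not an existing kernel interface:
register_policy_action(), emit_err_event(), the event layout and the
threshold of 10 are all invented here for illustration. A filter is a
plain predicate, and the action runs only for events that pass it.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_POLICIES 8	/* hypothetical limit */

struct err_event {
	unsigned long pfn;
	int corrected_count;	/* corrected errors seen so far on this page */
};

typedef bool (*event_filter_fn)(const struct err_event *ev);
typedef void (*policy_action_fn)(const struct err_event *ev);

struct policy {
	event_filter_fn filter;
	policy_action_fn action;
};

static struct policy policies[MAX_POLICIES];
static size_t nr_policies;

/* Hypothetical registration call: pair a filter with a policy action. */
static int register_policy_action(event_filter_fn filter,
				  policy_action_fn action)
{
	if (!filter || !action || nr_policies == MAX_POLICIES)
		return -1;
	policies[nr_policies].filter = filter;
	policies[nr_policies].action = action;
	nr_policies++;
	return 0;
}

/* Run every filter at insert time; fire the action for each match. */
static unsigned int emit_err_event(const struct err_event *ev)
{
	unsigned int fired = 0;

	for (size_t i = 0; i < nr_policies; i++) {
		if (policies[i].filter(ev)) {
			policies[i].action(ev);
			fired++;
		}
	}
	return fired;
}

/* Example policy: act once a page has had more than 10 corrected errors. */
static bool too_many_corrected(const struct err_event *ev)
{
	return ev->corrected_count > 10;
}

static unsigned long last_offlined_pfn;

/* Stand-in for a real action such as soft-offlining the page. */
static void offline_page_action(const struct err_event *ev)
{
	last_offlined_pfn = ev->pfn;
}
```

Nothing is buffered in this sketch, which matches the counting-only event
with a filter described above.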