Re: [PATCH] x86/mce: Need to let kill_proc() send signal to doomed process

From: Borislav Petkov
Date: Sat Jul 07 2012 - 00:39:45 EST


On Fri, Jul 06, 2012 at 02:33:15PM -0700, Tony Luck wrote:
> In commit dad1743e5993f19b3d7e7bd0fb35dc45b5326626
> x86/mce: Only restart instruction after machine check recovery if it is safe
> we fixed mce_notify_process() to force a signal to the current process
> if it was not restartable (RIPV bit not set in MCG_STATUS). But doing
> it here means that the process doesn't get told the virtual address of
> the fault via siginfo_t->si_addr. This would prevent application level
> recovery from the fault.

Ok, this makes sense: we want to kill all the processes mapping that
page.
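
For readers following along, here is a rough userspace sketch of why
si_addr matters for application-level recovery. It is not part of the
patch; the names install_sigbus_handler(), sigbus_seen, last_fault_addr
and last_si_code are made up for illustration. It only shows a SIGBUS
handler installed with SA_SIGINFO recording the fields the kernel fills
in:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bookkeeping, written by the handler and read by the app. */
volatile sig_atomic_t sigbus_seen;
void *volatile last_fault_addr;
volatile int last_si_code;

/* SA_SIGINFO-style handler: the siginfo_t carries si_code and si_addr. */
static void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	last_si_code = info->si_code;
	/* Only meaningful when the kernel supplied a fault address. */
	last_fault_addr = info->si_addr;
	sigbus_seen = 1;
}

/* Returns 0 on success, -1 on failure (same convention as sigaction()). */
int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

A real recovery handler would presumably have to do more than return,
for example drop or remap the affected page, or jump out with
siglongjmp(), since simply resuming would likely touch the poisoned
page again. With a plain force_sig(SIGBUS, current) the handler still
runs, but si_addr does not tell it which page to discard.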

> Make a new MF_MUST_KILL flag bit for memory_failure() et. al. to use
> so that we will provide the right information with the signal.
>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/mcheck/mce.c | 4 ++--
> include/linux/mm.h | 1 +
> mm/memory-failure.c | 8 +++++---
> 3 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index da27c5d..43f918d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1200,8 +1200,8 @@ void mce_notify_process(void)
> * doomed. We still need to mark the page as poisoned and alert any
> * other users of the page.
> */
> - if (memory_failure(pfn, MCE_VECTOR, MF_ACTION_REQUIRED) < 0 ||
> - mi->restartable == 0) {
> + if (memory_failure(pfn, MCE_VECTOR,
> + MF_ACTION_REQUIRED|MF_MUST_KILL) < 0) {

Doesn't this leave mi->restartable unused?

More specifically, we're no longer looking at RIPV here. I'm guessing
that by the time we've reached this point, we always MUST_KILL?

> pr_err("Memory error not recovered");
> force_sig(SIGBUS, current);
> }
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index b36d08c..f9f279c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1591,6 +1591,7 @@ void vmemmap_populate_print_last(void);
> enum mf_flags {
> MF_COUNT_INCREASED = 1 << 0,
> MF_ACTION_REQUIRED = 1 << 1,
> + MF_MUST_KILL = 1 << 2,
> };
> extern int memory_failure(unsigned long pfn, int trapno, int flags);
> extern void memory_failure_queue(unsigned long pfn, int trapno, int flags);
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index ab1e714..e3e0045 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -858,7 +858,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
> struct address_space *mapping;
> LIST_HEAD(tokill);
> int ret;
> - int kill = 1;
> + int kill = 1, doit;
> struct page *hpage = compound_head(p);
> struct page *ppage;
>
> @@ -965,12 +965,14 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
> * Now that the dirty bit has been propagated to the
> * struct page and all unmaps done we can decide if
> * killing is needed or not. Only kill when the page
> - * was dirty, otherwise the tokill list is merely
> + * was dirty or the process is not restartable,
> + * otherwise the tokill list is merely
> * freed. When there was a problem unmapping earlier
> * use a more force-full uncatchable kill to prevent
> * any accesses to the poisoned memory.
> */
> - kill_procs(&tokill, !!PageDirty(ppage), trapno,
> + doit = !!PageDirty(ppage) || (flags & MF_MUST_KILL) != 0;

Maybe write the flag test as

!!(flags & MF_MUST_KILL)

instead?
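
To spell out that this is purely a style suggestion, here is a small
standalone sketch. The enum is a userspace copy of the values from the
patch; must_kill_cmp(), must_kill_bang() and should_kill() are
hypothetical helper names, not kernel functions. Both spellings
normalize the masked bit to 0 or 1:

```c
/* Userspace copy of the flag values from the patch, for illustration. */
enum mf_flags {
	MF_COUNT_INCREASED = 1 << 0,
	MF_ACTION_REQUIRED = 1 << 1,
	MF_MUST_KILL       = 1 << 2,
};

/* Spelling used in the patch: explicit comparison against zero. */
int must_kill_cmp(int flags)
{
	return (flags & MF_MUST_KILL) != 0;
}

/* Suggested spelling: double negation, matching !!PageDirty(). */
int must_kill_bang(int flags)
{
	return !!(flags & MF_MUST_KILL);
}

/* Shape of the "doit" expression, with page_dirty standing in for PageDirty(). */
int should_kill(int page_dirty, int flags)
{
	return !!page_dirty || !!(flags & MF_MUST_KILL);
}
```

Since || already yields 0 or 1 in C, neither normalization changes the
value of doit; the !! form just reads consistently with the
!!PageDirty(ppage) next to it.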

> + kill_procs(&tokill, doit, trapno,
> ret != SWAP_SUCCESS, p, pfn, flags);
>
> return ret;
> --

Thanks.

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551