Re: [PATCH v1 0/2] mm/kdump: exclude reserved pages in dumps
From: Michal Hocko
Date: Tue Jul 24 2018 - 09:35:36 EST
On Tue 24-07-18 15:27:51, David Hildenbrand wrote:
> On 24.07.2018 15:13, Michal Hocko wrote:
> > On Tue 24-07-18 14:17:12, David Hildenbrand wrote:
> >> On 24.07.2018 09:25, Michal Hocko wrote:
> >>> On Mon 23-07-18 19:20:43, David Hildenbrand wrote:
> >>>> On 23.07.2018 14:30, Michal Hocko wrote:
> >>>>> On Mon 23-07-18 13:45:18, Vlastimil Babka wrote:
> >>>>>> On 07/20/2018 02:34 PM, David Hildenbrand wrote:
> >>>>>>> Dumping tools (like makedumpfile) right now don't exclude reserved pages.
> >>>>>>> So reserved pages might be accessed by dump tools although nobody except
> >>>>>>> the owner should touch them.
> >>>>>>
> >>>>>> Are you sure about that? Or maybe I misunderstand. Maybe it changed
> >>>>>> recently, but IIRC pages that are backing memmap (struct pages) are also
> >>>>>> PG_reserved. And you definitely do want those in the dump.
> >>>>>
> >>>>> You are right. reserve_bootmem_region will make all early bootmem
> >>>>> allocations (including those backing memmaps) PageReserved. I have asked
> >>>>> several times but I haven't seen a satisfactory answer yet. Why do we
> >>>>> even care about those for kdump? If they are reserved then nobody should
> >>>>> really look at those specific struct pages and manipulate them. Kdump
> >>>>> tools are using a kernel interface to read the content. If the specific
> >>>>> content is backed by non-existent memory then they should simply not
> >>>>> return anything.
> >>>>>
> >>>>
> >>>> "new kernel" provides an interface to read memory from "old kernel".
> >>>>
> >>>> The new kernel has no idea about
> >>>> - which memory was added/online in the old kernel
> >>>> - where struct pages of the old kernel are and what their content is
> >>>> - which memory is safe to touch and which not
> >>>>
> >>>> Dump tools figure all that out by interpreting the VMCORE. They e.g.
> >>>> identify "struct pages" and see if they should be dumped. The "new
> >>>> kernel" only allows reading that memory. It cannot prevent a crash of the
> >>>> system (e.g. if a dump tool tried to read a hwpoison page).
> >>>>
> >>>> So how should the "new kernel" know if a page can be touched or not?
> >>>
> >>> I am sorry, I am not very familiar with kdump. But from what I remember
> >>> it reads from /proc/vmcore, and the implementation of this interface should
> >>> simply return EINVAL or the like when you try to dump an inaccessible
> >>> memory range.
> >>
> >> Oh, and BTW, while something like -EINVAL could work, we usually don't
> >> want to try to read certain pages at all (e.g. ballooned pages -
> >> accessing the page might work but involves quite some overhead in the
> >> hypervisor).
> >>
> >> So we should either handle this in dump tools (reserved + ...?) or while
> >> doing the read similar to XEN (is_ram_page()).
> >
> > Yes, I think this is the proper way. Just test for PageOnline
> > in read_from_oldmem/copy_oldmem_page. Btw. we already have
> > pfn_to_online_page which checks the per-section online/offline
> > status. This should be extendable to consider your new PageOffline
> > state.
>
> That is the important bit:
>
> What the new kernel sees is not what the old kernel saw.
>
> Checking for pfn_to_online_page() from
> read_from_oldmem/copy_oldmem_page() is plain wrong.
>
> E.g. ACPI hotplug memory is not even added in the new kernel - see
> "acpi_no_memhotplug" which is used in kdump environments.
>
> The only thing we can do is
> - query the hypervisor
> - try to access and get an exception
But we do preserve the struct pages (aka memmap) from the crashed kernel,
don't we? So you have the whole state there. Or am I missing something?
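
To make the question concrete, here is a minimal userspace sketch of the
idea, assuming the old kernel's memmap has already been located in the
vmcore and copied out. All names (sketch_page, SKETCH_PG_*,
sketch_page_is_dumpable) are hypothetical and are not kernel or
makedumpfile API. The flag bit positions are made up too; real ones would
have to come from the dump's metadata. Reserved pages are kept, as
Vlastimil pointed out they must be, while offline and hwpoison pages are
skipped without ever touching their content.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. Real PG_* bit
 * positions depend on the kernel build and must not be hardcoded. */
#define SKETCH_PG_RESERVED (1u << 0)
#define SKETCH_PG_OFFLINE  (1u << 1) /* the proposed "offline" state */
#define SKETCH_PG_HWPOISON (1u << 2)

/* Stand-in for one preserved struct page of the old kernel. */
struct sketch_page {
	uint32_t flags;
};

/* Decide from the preserved memmap alone whether the content of a pfn
 * may be read. The page content itself is never accessed here. */
bool sketch_page_is_dumpable(const struct sketch_page *memmap,
			     size_t nr_pages, size_t pfn)
{
	assert(memmap != NULL);

	if (pfn >= nr_pages)
		return false; /* no struct page preserved for this pfn */

	/* Offline (e.g. ballooned) and poisoned pages must not be read. */
	if (memmap[pfn].flags & (SKETCH_PG_OFFLINE | SKETCH_PG_HWPOISON))
		return false;

	/* Reserved pages (e.g. those backing the memmap) stay dumpable. */
	return true;
}

/* Count how many pages of the preserved memmap would be dumped. */
size_t sketch_count_dumpable(const struct sketch_page *memmap,
			     size_t nr_pages)
{
	size_t pfn, count = 0;

	for (pfn = 0; pfn < nr_pages; pfn++)
		if (sketch_page_is_dumpable(memmap, nr_pages, pfn))
			count++;
	return count;
}
```

Whether such a check belongs in the dump tool or behind /proc/vmcore is
exactly the open question in this thread; the sketch only shows that the
decision can be made from the preserved flags.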
--
Michal Hocko
SUSE Labs