Re: vvar, gup && coredump
From: Andy Lutomirski
Date: Thu Mar 12 2015 - 12:29:58 EST
On Thu, Mar 12, 2015 at 7:34 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> Add cc's, change subject.
>
> On 03/11, Oleg Nesterov wrote:
>>
>> On 03/05, Jan Kratochvil wrote:
>> >
>> > On Thu, 05 Mar 2015 21:52:56 +0100, Sergio Durigan Junior wrote:
>> > > On Thursday, March 05 2015, Jan Kratochvil wrote:
>> > > > On Thu, 05 Mar 2015 04:48:09 +0100, Sergio Durigan Junior wrote:
>> > > > Currently it also tries to dump [vvar] (by default rules) but that is
>> > > > unreadable for some reason, causing:
>> > > > warning: Memory read failed for corefile section, 8192 bytes at 0x7ffff6ceb000.
>> > > > ^^^^^^^^^^^^^^
>>
>> > It would be good to get a reply from a kernel aware person what does it mean
>> > before such patch gets accepted. It can be also just a Linux kernel bug.
>>
>> _So far_ this doesn't look like a kernel bug to me.
>>
>> But! I need to recheck. In fact, it seems to me that I should discuss
>> this on lkml. I have some concerns, but most probably this is only my
>> misunderstanding, I need to read this (new to me) code more carefully.
>
> Hi Andy, we need your help ;)
>
> So, the problem is that gdb can't access the "vvar" mapping which looks
> like the "normal" vma from user-space pov.
>
> Technically this is clear. vvar_mapping->pages is the "dummy" no_pages[]
> array, get_user_pages() can't succeed. In fact even follow_page() can't
> work because of VM_PFNMAP/_PAGE_SPECIAL set by remap_pfn_range().
>
> What is not clear: do we really want gup() to fail? Or it is not trivial
> to turn __vvar_page into the "normal" page? (to simplify the discussion,
> let's ignore hpet mapping for now).
We could presumably fiddle with the vma to allow get_user_pages to
work on at least the first vvar page. There are some decently large
caveats, though:

 - We don't want to COW it. If someone pokes at that page with
   ptrace, for example, and it gets COWed, everything will stop working
   because the offending process will no longer see updates. That way
   lies infinite loops.

 - The implementation could be odd. The vma is either VM_MIXEDMAP or
   VM_PFNMAP, and I don't see any practical way to change that.

 - The HPET and perhaps pvclock stuff. The HPET probably doesn't have
   a struct page at all, so you can't possibly get_user_pages it.
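
To make the COW concern concrete, here is a rough userspace analogy
using an ordinary file mapping. This is not the vvar code; the helper
name cow_hides_updates() and the temp-file path are made up for the
illustration, and it assumes Linux page-cache behaviour for
MAP_PRIVATE mappings:

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Analogy only: a MAP_PRIVATE file mapping stands in for the vvar page.
 * Returns 1 if the mapping stops seeing updates once it has been COWed,
 * 0 if it still sees them, -1 on a setup error.
 */
int cow_hides_updates(void)
{
	char path[] = "/tmp/cow-demo-XXXXXX";	/* hypothetical scratch file */
	long psz = sysconf(_SC_PAGESIZE);
	volatile char *p;
	int fd, ret = -1;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	if (ftruncate(fd, psz) < 0 || pwrite(fd, "A", 1, 0) != 1)
		goto out;

	p = mmap(NULL, psz, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		goto out;

	/* "Poke" the page: this write gives the mapping its own private copy. */
	p[1] = 'x';

	/* The "kernel side" updates the underlying data afterwards. */
	if (pwrite(fd, "B", 1, 0) != 1)
		goto unmap;

	/* The private copy still holds the stale 'A'. */
	ret = (p[0] == 'A');
unmap:
	munmap((void *)p, psz);
out:
	close(fd);
	return ret;
}
```

On Linux I'd expect cow_hides_updates() to return 1: after the write,
the mapping reads the stale value forever, which is exactly what a
vdso reader spinning on a sequence count in a COWed vvar page would
hit.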
>
> Because this doesn't look consistent. gdb tries to "coredump" the live
> process like the kernel does, but fails to dump the "r--p ... [vvar]"
> region.
>
>
> OK, gdb can look at VM_DONTDUMP bit in "VmFlags:" field in /proc/pid/smaps
> and skip this vma. But, why (afaics) the kernel dumps this vma then? Let's
> look at vma_dump_size(),
>
>	/* always dump the vdso and vsyscall sections */
>	if (always_dump_vma(vma))
>		goto whole;
>
>	if (vma->vm_flags & VM_DONTDUMP)
>		return 0;
>
> so the kernel ignores VM_DONTDUMP in this case, always_dump_vma() returns
> true because of special_mapping_name(). Perhaps we should check VM_DONTDUMP
> before always_dump_vma() ?
>
That sounds reasonable to me. I'll write the patch later today. gdb
will still need changes, though, right?
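
Roughly what I have in mind, as a userspace model rather than the
actual patch. struct vma_model, the flag value, and the *_model /
dump_size_* names are stand-ins invented for this sketch, and the rest
of vma_dump_size() is collapsed into "dump the whole vma":

```c
#include <stdbool.h>

/* Stand-in value for the sketch, not the kernel's definition. */
#define VM_DONTDUMP 0x1UL

struct vma_model {
	unsigned long vm_start, vm_end;
	unsigned long vm_flags;
	bool special;	/* models always_dump_vma() returning true */
};

bool always_dump_vma_model(const struct vma_model *vma)
{
	return vma->special;
}

/* Current order: a special mapping is dumped even with VM_DONTDUMP set. */
unsigned long dump_size_current(const struct vma_model *vma)
{
	if (always_dump_vma_model(vma))
		return vma->vm_end - vma->vm_start;

	if (vma->vm_flags & VM_DONTDUMP)
		return 0;

	/* The remaining checks are omitted in this model. */
	return vma->vm_end - vma->vm_start;
}

/* Proposed order: VM_DONTDUMP is honoured first. */
unsigned long dump_size_proposed(const struct vma_model *vma)
{
	if (vma->vm_flags & VM_DONTDUMP)
		return 0;

	if (always_dump_vma_model(vma))
		return vma->vm_end - vma->vm_start;

	/* The remaining checks are omitted in this model. */
	return vma->vm_end - vma->vm_start;
}
```

With a vma that is both special and VM_DONTDUMP (the vvar case), the
current order returns the full size and the proposed order returns 0.
A special vma without VM_DONTDUMP is still dumped whole either way.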
--Andy
>
> Or. We can teach gdb to read and dump its own "vvar" mapping to mimic the
> kernel behaviour, this is the same read-only memory. But this hack doesn't
> look nice, gdb should not know "too much" about the kernel internals.
>
> Oleg.
>
--
Andy Lutomirski
AMA Capital Management, LLC