Re: S4 resume broken since 2.6.39 (3.1, too)

From: Rafael J. Wysocki
Date: Tue Sep 27 2011 - 13:01:05 EST


On Tuesday, September 27, 2011, Yinghai Lu wrote:
> On 09/26/2011 03:24 PM, Rafael J. Wysocki wrote:
>
> > On Thursday, September 22, 2011, Yinghai Lu wrote:
> >> On Wed, Sep 21, 2011 at 11:48 AM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> >>> It looks like init_memory_mapping() is sometimes called with "end"
> >>> beyond the last mapped PFN and it explodes when we try to write stuff to
> >>> that address during image restoration.
> >>>
> >>> IOW, Yinghai's assumption that init_memory_mapping() would always be
> >>> called with a "good end" on x86_64 was overoptimistic.
> >>
> >> for 64-bit x86, kernel_physical_mapping_init() will use
> >> map_low_page(), which calls early_memmap(), to access RAM for page tables
> >> that are above the last mapped PFN.
> >>
> >> the point is:
> >> on a system with 64g, usable RAM will be [0, 2048m) and [4g, 64g).
> >> init_memory_mapping() will be called twice, once for each range.
> >> before putting the page tables high,
> >> the page tables will be in two parts: one just below 512m, and one below 2048m.
> >> after putting the page tables high,
> >> the page tables will be in two parts: one just below 2048m, and one below 64g.
> >>
> >> one of the purposes is to find the biggest contiguous range under
> >> 1024m for kdump.
> >
> > This is all fine so long as we can ensure that the "end" value we're
> > passing to init_memory_mapping() will always be a valid address, which
> > evidently is not the case sometimes.
>
>
> I don't understand how end could be invalid.
>
> end should always be a valid address: one is max_low_pfn under 4g, and the other is max_pfn...
>
>
> >
> > So, in my opinion we should simply apply Takashi's patch at this
> > point and revisit the kdump issue later, when we actually know how to do
> > the right thing.
>
>
> Takashi said 2.6.37 with that commit is OK; only 2.6.39 somehow has a 1-in-20 chance of hitting the reset problem.
>
> so that commit should not be the cause. could it be some hidden assumption in
> the restore code?

Quite frankly, I doubt it.  The only remotely related change between 2.6.37
and 2.6.39 seems to be commit d1ee433 (x86, trampoline: Use the unified
trampoline setup for ACPI wakeup).
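
For reference, the "good end" question from earlier in the thread can be
illustrated with a small user-space sketch.  This is not kernel code: the
struct, the table and range_is_usable() are hypothetical names made up for
illustration, and the two ranges are simply the 64g layout quoted above.
It only checks whether a [start, end) pair lies entirely inside one usable
RAM range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a half-open physical range [start, end), in bytes. */
struct phys_range {
	uint64_t start;
	uint64_t end;
};

/* Assumed layout, taken from the 64g example quoted in this thread. */
static const struct phys_range usable_ram[] = {
	{ 0x0ULL,         0x80000000ULL   },	/* [0, 2048m) */
	{ 0x100000000ULL, 0x1000000000ULL },	/* [4g, 64g)  */
};

/*
 * Return true only if [start, end) is non-empty and fits entirely inside
 * a single usable range.  An "end" past the top of RAM, or a range that
 * spans the hole between 2048m and 4g, is rejected.
 */
static bool range_is_usable(uint64_t start, uint64_t end)
{
	size_t i;

	if (start >= end)
		return false;

	for (i = 0; i < sizeof(usable_ram) / sizeof(usable_ram[0]); i++) {
		if (start >= usable_ram[i].start && end <= usable_ram[i].end)
			return true;
	}
	return false;
}
```

With this table, range_is_usable(0, 0x80000000ULL) holds, while a call whose
"end" lies beyond 64g does not, which is the kind of case that would need to
be ruled out before relying on "end" being valid.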

Thanks,
Rafael