Re: [PATCH -V6 00/21] swap: Swapout/swapin THP in one piece

From: Daniel Jordan
Date: Wed Oct 24 2018 - 13:24:34 EST


On Wed, Oct 24, 2018 at 11:31:42AM +0800, Huang, Ying wrote:
> Hi, Daniel,
>
> Daniel Jordan <daniel.m.jordan@xxxxxxxxxx> writes:
>
> > On Wed, Oct 10, 2018 at 03:19:03PM +0800, Huang Ying wrote:
> >> And for all, Any comment is welcome!
> >>
> >> This patchset is based on the 2018-10-3 head of mmotm/master.
> >
> > There seems to be some infrequent memory corruption with THPs that have been
> > swapped out: page contents differ after swapin.
>
> Thanks a lot for testing this! I know there was a big effort behind this,
> and it will definitely improve the quality of the patchset greatly!

You're welcome! Hopefully I'll have more results and tests to share in the
next two weeks.

>
> > Reproducer at the bottom. Part of some tests I'm writing, had to separate it a
> > little hack-ily. Basically it writes the word offset _at_ each word offset in
> > a memory blob, tries to push it to swap, and verifies the offset is the same
> > after swapin.
> >
> > I ran with THP enabled=always. THP swapin_enabled could be always or never, it
> > happened with both. Every time swapping occurred, a single THP-sized chunk in
> > the middle of the blob had different offsets. Example:
> >
> > ** > word corruption gap
> > ** corruption detected 14929920 bytes in (got 15179776, expected 14929920) **
> > ** corruption detected 14929928 bytes in (got 15179784, expected 14929928) **
> > ** corruption detected 14929936 bytes in (got 15179792, expected 14929936) **
> > ...pattern continues...
> > ** corruption detected 17027048 bytes in (got 15179752, expected 17027048) **
> > ** corruption detected 17027056 bytes in (got 15179760, expected 17027056) **
> > ** corruption detected 17027064 bytes in (got 15179768, expected 17027064) **
>
> 15179776 < 15179xxx <= 17027064
>
> 15179776 % 4096 = 0
>
> And 15179776 = 15179768 + 8
>
> So I guess we have some alignment bug. Could you try the patches
> attached? They deal with some alignment issues.

That fixed it. And removed three lines of code. Nice :)
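
For anyone following along, the arithmetic on the log lines quoted above can be
cross-checked with a short script. This is only a sketch, not part of the
reproducer or of the patches: the 8-byte word size is read off the stride in
the log, the 2 MiB THP size is an assumption (the x86-64 default), and the
helper name expected_got is made up for illustration. It checks that the
corrupted span is exactly one THP long and that the six logged values are
consistent with the chunk's contents being rotated by a whole number of 4 KiB
pages. The elided middle of the log is not checked, so this shows consistency
at the logged points only.

```python
# Cross-check of the six corruption lines quoted in this thread.
# Assumptions (not stated in the thread): 8-byte words, 4 KiB base pages,
# 2 MiB THPs. expected_got() is a hypothetical helper for illustration.

WORD = 8                    # bytes per word, from the stride in the log
PAGE = 4096                 # base page size
THP = 2 * 1024 * 1024       # assumed THP size

# (byte offset where corruption was detected, value found there)
logged = [
    (14929920, 15179776),
    (14929928, 15179784),
    (14929936, 15179792),
    (17027048, 15179752),
    (17027056, 15179760),
    (17027064, 15179768),
]

first_bad, first_got = logged[0]
last_bad, _ = logged[-1]

# Length of the corrupted region, counting the last bad word itself.
span = last_bad + WORD - first_bad

# How far the contents appear shifted at the start of the region.
shift = first_got - first_bad


def expected_got(offset):
    """Value a pure rotation of the chunk by `shift` would leave at `offset`."""
    return first_bad + (offset - first_bad + shift) % span


consistent = all(expected_got(off) == got for off, got in logged)

print("span bytes:", span, "equals THP size:", span == THP)
print("shift bytes:", shift, "whole pages:", shift // PAGE,
      "remainder:", shift % PAGE)
print("logged lines fit a rotation:", consistent)
```

Under those assumptions the span comes out to 2097152 bytes (one THP) and the
shift to 249856 bytes, which is 61 whole pages with no remainder. That fits
the subpage-granularity alignment problem guessed at above, though the script
itself says nothing about where in the swap path it occurred.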