Re: [PATCH v10 12/25] mm: cache some VMA fields in the vm_fault structure

From: Minchan Kim
Date: Tue May 08 2018 - 06:56:31 EST


On Fri, May 04, 2018 at 11:10:54AM +0200, Laurent Dufour wrote:
> On 03/05/2018 17:42, Minchan Kim wrote:
> > On Thu, May 03, 2018 at 02:25:18PM +0200, Laurent Dufour wrote:
> >> On 23/04/2018 09:42, Minchan Kim wrote:
> >>> On Tue, Apr 17, 2018 at 04:33:18PM +0200, Laurent Dufour wrote:
> >>>> When handling speculative page fault, the vma->vm_flags and
> >>>> vma->vm_page_prot fields are read once the page table lock is released. So
> >>>> there is no more guarantee that these fields would not change in our back
> >>>> They will be saved in the vm_fault structure before the VMA is checked for
> >>>> changes.
> >>>
> >>> Sorry. I cannot understand.
> >>> If it is changed under us, what happens? If it's critical, why cannot we
> >>> check with seqcounter?
> >>> Clearly, I'm not understanding the logic here. However, it's a global
> >>> change without CONFIG_SPF so I want to be more careful.
> >>> It would be better to describe why we need to sanpshot those values
> >>> into vm_fault rather than preventing the race.
> >>
> >> The idea is to go forward processing the page fault using the VMA's fields
> >> values saved in the vm_fault structure. Then once the pte are locked, the
> >> vma->sequence_counter is checked again and if something has changed in our back
> >> the speculative page fault processing is aborted.
> >
> > Sorry, still I don't understand why we should capture some fields to vm_fault.
> > If we found vma->seq_cnt is changed under pte lock, can't we just bail out and
> > fallback to classic fault handling?
> >
> > Maybe, I'm missing something clear now. It would be really helpful to understand
> > if you give some exmaple.
>
> I'd rather say that I was not clear enough ;)
>
> Here is the point, when we deal with a speculative page fault, the mmap_sem is
> not taken, so parallel VMA's changes can occurred. When a VMA change is done
> which will impact the page fault processing, we assumed that the VMA sequence
> counter will be changed.
>
> In the page fault processing, at the time the PTE is locked, we checked the VMA
> sequence counter to detect changes done in our back. If no change is detected
> we can continue further. But this doesn't prevent the VMA to not be changed in
> our back while the PTE is locked. So VMA's fields which are used while the PTE
> is locked must be saved to ensure that we are using *static* values.
> This is important since the PTE changes will be made on regards to these VMA
> fields and they need to be consistent. This concerns the vma->vm_flags and
> vma->vm_page_prot VMA fields.
>
> I hope I make this clear enough this time.

It's more clear at this point. Please include such nice explanation in description.
Now, I am wondering how you synchronize those static value and vma's seqcount.
It must be in next patchset. I hope to grab a time to read it, asap.

Thanks.