Re: pagecache locking

From: Boaz Harrosh
Date: Sun Jul 07 2019 - 11:05:23 EST


On 06/07/2019 02:31, Dave Chinner wrote:

>
> As long as the IO ranges to the same file *don't overlap*, it should
> be perfectly safe to take separate range locks (in read or write
> mode) on either side of the mmap_sem as non-overlapping range locks
> can be nested and will not self-deadlock.
>
> The "recursive lock problem" still arises with DIO and page faults
> inside gup, but it only occurs when the user buffer range overlaps
> the DIO range to the same file. IOWs, the application is trying to
> do something that has an undefined result and is likely to result in
> data corruption. So, in that case I plan to have the gup page faults
> fail and the DIO return -EDEADLOCK to userspace....
>
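The rule quoted above can be put as a rough sketch. Everything here is hypothetical and for illustration only: `struct io_range`, `ranges_overlap()` and `nested_range_lock_check()` are made-up names, not the actual range lock code, and the sketch uses the POSIX spelling `EDEADLK` for the error mentioned above. It only shows the decision: disjoint ranges on the same file may nest, an overlapping one is refused.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical half-open byte range [start, end); not a kernel type. */
struct io_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool ranges_overlap(const struct io_range *a, const struct io_range *b)
{
	return a->start < b->end && b->start < a->end;
}

/*
 * Hypothetical check for a page fault taken inside gup while a DIO range
 * lock is already held on the same file: a disjoint range can be nested,
 * an overlapping range would self-deadlock and is refused.
 */
static int nested_range_lock_check(const struct io_range *held,
				   const struct io_range *wanted)
{
	if (ranges_overlap(held, wanted))
		return -EDEADLK;
	return 0;
}
```

With a DIO holding [0, 4096), a fault on [4096, 8192) would pass the check, while a fault on [2048, 6144) would be failed back to the DIO.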

This sounds very cool. I now understand. I hope you put all the tools
for this in generic places so it will be easier to salvage.

One thing I will be very curious to see is how you teach lockdep
about the "range locks can be nested" thing. I know it's possible,
other places do it, but it's something I never understood.

> Cheers,
> Dave.

[ Ah, one more question if you have time:

In one of the mails, and you also mentioned it before, you said that
the rw_read_lock does not scale well on mammoth machines with
10s of cores (maybe you said over 20).
I wonder why that happens. Is it because of the atomic operations,
or something in the lock algorithm? In my theoretical understanding,
as long as there are no write-lock-grabbers, why would the readers
interfere with each other?
]
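To make the first guess (the atomic operations) concrete, here is a toy sketch. It assumes the reader count lives in one shared atomic word; that is an assumption for illustration, and `struct toy_rwlock` and its helpers are made-up names, not the real rwsem implementation. Even with no writers at all, every read lock and unlock is a read-modify-write on that same word, so the cache line holding it has to move between all the CPUs taking the lock.

```c
#include <stdatomic.h>

/* Toy lock: only a shared reader count, no writer side at all. */
struct toy_rwlock {
	atomic_long readers;
};

/* Every reader on every CPU modifies the same word on lock... */
static void toy_read_lock(struct toy_rwlock *l)
{
	atomic_fetch_add_explicit(&l->readers, 1, memory_order_acquire);
}

/* ...and again on unlock, so readers still share a written cache line. */
static void toy_read_unlock(struct toy_rwlock *l)
{
	atomic_fetch_sub_explicit(&l->readers, 1, memory_order_release);
}
```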

Thanks
Boaz