On Mon, May 23, 2016 at 04:30:43PM -0500, Larry Finger wrote:
> The mainline kernels past 4.6.0 hang when logging in. There are no
> error messages, and the machine seems to be waiting for some event that
> never happens.
> The problem has been bisected to commit dd254f5a382c ("fold checks into
> iterate_and_advance()"). The bisection has been verified.
> The problem is the call from iov_iter_advance(). When I reinstated the old
> macro with a new name and used it in that routine, the system works.
> Obviously, the call that seems to be incorrect has some benefits. My
> quick-and-dirty patch is attached.
> I will be willing to test any patch you prepare.
Hangs where and how? A reproducer, please... This is really weird - the
only change there is in the cases when
* iov_iter_advance(i, n) is called with n greater than the remaining
  amount. It's a bug, plain and simple - old variant would've been left in
  seriously buggered state and at the very least we want to catch any such
  places for the sake of backports.
* iov_iter_advance(i, 0) - both old and new code leave *i unchanged,
  but the old one dereferences i->iov[0], which may be pointing beyond the
  end of array by that point. The value read from there was not used by the
  old code, at that.
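To make the two cases above concrete, here is a rough userspace model. It is
not the kernel code: struct seg_sketch, struct iter_sketch and
iter_advance_checked() are made-up names, and clamping an oversized n to the
remaining count is just the policy assumed for this sketch. The point is only
that an oversized n must not run the iterator off the end, and that n == 0
must not touch the segment array at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's struct iovec / struct iov_iter. */
struct seg_sketch {
	size_t len;                     /* length of this segment */
};

struct iter_sketch {
	const struct seg_sketch *seg;   /* current segment, may be one past the end */
	size_t nr_segs;                 /* segments left, including the current one */
	size_t offset;                  /* offset into the current segment */
	size_t count;                   /* total bytes remaining */
};

/*
 * Advance by n bytes, assuming i->count equals the bytes really left in seg[].
 * An oversized n is clamped (assumed policy), and seg[] is only dereferenced
 * while there is something left to consume, so n == 0 never reads i->seg[0].
 * Returns the number of bytes actually advanced.
 */
static size_t iter_advance_checked(struct iter_sketch *i, size_t n)
{
	size_t done = 0;

	assert(i != NULL);

	if (n > i->count)
		n = i->count;

	while (n > 0) {
		size_t room = i->seg->len - i->offset;
		size_t step = n < room ? n : room;

		i->offset += step;
		i->count -= step;
		n -= step;
		done += step;

		/* Segment used up: step to the next one (possibly one past the end). */
		if (i->offset == i->seg->len) {
			i->seg++;
			i->nr_segs--;
			i->offset = 0;
		}
	}
	return done;
}
```

Once the model iterator is fully consumed, seg points one past the array, which
is exactly the state in which an unconditional read of seg[0] would be out of
bounds.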
Could you slap WARN_ON(size > i->count) in the very beginning of
iov_iter_advance() (the mainline variant) and see what triggers on your
reproducer?
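In the kernel that is a single line, WARN_ON(size > i->count), at the top of
iov_iter_advance(). For anybody who wants to see the idea outside the kernel,
a minimal userspace sketch follows. WARN_ON_SKETCH, warn_on_sketch(),
warn_hits_sketch, struct count_iter_sketch and advance_with_warn() are all
hypothetical names invented for the illustration, and the clamp after the
warning is an assumption, not a statement about what the mainline code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Counts how many times the check fired; stands in for reading dmesg. */
static unsigned long warn_hits_sketch;

/* Report a violated condition with its location and return whether it fired. */
static int warn_on_sketch(int hit, const char *expr, const char *file,
			  int line, const char *func)
{
	if (hit) {
		warn_hits_sketch++;
		fprintf(stderr, "WARNING: (%s) at %s:%d in %s()\n",
			expr, file, line, func);
	}
	return hit;
}

/* Userspace imitation of a WARN_ON-style check; not the kernel macro. */
#define WARN_ON_SKETCH(cond) \
	warn_on_sketch(!!(cond), #cond, __FILE__, __LINE__, __func__)

/* Hypothetical iterator that tracks only the remaining byte count. */
struct count_iter_sketch {
	size_t count;
};

/* Advance by size bytes, warning first if the caller asks for too much. */
static void advance_with_warn(struct count_iter_sketch *i, size_t size)
{
	assert(i != NULL);

	/* The check goes first, before any state is modified. */
	if (WARN_ON_SKETCH(size > i->count))
		size = i->count;        /* assumed recovery: clamp */

	i->count -= size;
}
```

Each oversized request then leaves one line on stderr naming the function that
contains the check; in the kernel the corresponding warning comes with a
backtrace, which is what should identify the offending caller.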