Re: WARNING in get_pi_state

From: Peter Zijlstra
Date: Tue Oct 31 2017 - 06:09:07 EST


On Tue, Oct 31, 2017 at 12:29:50PM +0300, Dmitry Vyukov wrote:
> I understand your sentiment, but it's definitely not _at all_. The
> system compiled this exact code, ran it and triggered the bug on it.
> Do you have suggestions on how to make this code more portable? How
> would this setup look on your system?

So I don't see the point of that tun stuff; what was it supposed to do?

All it ever did after creation was flush_tun(), which reads until empty.
But given nobody would ever write into it, that's an 'expensive' NO-OP.
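
For illustration, a 'read until empty' helper of this kind usually looks
roughly like the sketch below. This is not the reproducer's actual
flush_tun(); drain_fd() is a hypothetical name, and the sketch assumes the
fd was opened or set non-blocking. With no writer, the very first read()
fails with EAGAIN and the loop exits without discarding anything.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a flush helper: drain a non-blocking fd
 * until read() reports there is nothing left. Returns the number of
 * bytes discarded.
 */
static long drain_fd(int fd)
{
	char buf[1000];
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;	/* got data, throw it away and retry */
			continue;
		}
		if (n < 0 && errno == EINTR)
			continue;	/* interrupted by a signal, retry */

		/* n == 0 (EOF) or an error such as EAGAIN: nothing to read. */
		break;
	}
	return total;
}
```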

> We do try hard to get rid of unnecessary stuff in reproducers. I think
> what happened in this case is the following. This is a hard-to-reproduce
> race. The bot was able to reproduce the crash on the initial
> program that uses tun, then tried to get rid of the tun code and
> re-reproduce it, but it did not reproduce this time, so it concluded
> that the tun code is somehow necessary here. That's an unfortunate
> consequence of testing complex concurrent code. It may become somewhat
> better once we have KTSAN, the race detector.

I ripped out the tun bits and it reproduced in ~100 seconds. I've now
got it running for well over 30m on the fixed kernel while I'm trying to
come up with a comprehensible Changelog ;-)