Re: [PATCH] fix for zeroed user-space tids in multi-threaded core dumps

From: Andreas Schwab
Date: Thu Aug 20 2015 - 10:48:59 EST


Roland McGrath <roland@xxxxxxxxxx> writes:

> [PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit.
>
> The CLONE_CHILD_CLEARTID flag is used by NPTL to have its threads
> communicate via memory/futex when they exit, so pthread_join can
> synchronize using a simple futex wait. The word of user memory where NPTL
> stores a thread's own TID is what it passes; this gets reset to zero at
> thread exit.
>
> It is not desirable to touch this user memory when threads are dying due
> to a fatal signal. A core dump is more usefully representative of the
> dying program state if the threads live at the time of the crash have their
> NPTL data structures unperturbed. The userland expectation of
> CLONE_CHILD_CLEARTID has only ever been that it works for a thread making
> an _exit system call.

This breaks nscd.  It uses CLONE_CHILD_CLEARTID to clear the
nscd_certainly_running flag in the shared databases, so that clients
are notified when nscd goes away and is restarted.  With this patch the
flag is no longer cleared when nscd dies from a fatal signal.  When
nscd uses a non-persistent database, the clients that have the old one
mapped then keep thinking it is still being updated by nscd, when in
fact the restarted nscd has created a new (anonymous) one (for
non-persistent databases it uses an unlinked file as backend).
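
For illustration, here is a minimal sketch of the clear-on-exit
behaviour that such a flag depends on.  It is not nscd's code: the
names demo_tid_word, demo_child and demo_cleartid are made up for this
example, and it uses a thread-like clone() child that exits normally
rather than a daemon with a mapped database.  The kernel zeroes the
word and issues a futex wake when the child exits; as I read the quoted
patch, that store would be skipped if the child died from a fatal
signal instead, so a waiter like this one would never see the zero.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_STACK_SIZE (64 * 1024)

/* Hypothetical stand-in for a flag such as nscd_certainly_running. */
static int demo_tid_word;

/* Child body: returning makes glibc's clone() wrapper do a plain exit. */
static int demo_child(void *arg)
{
    (void)arg;
    return 0;
}

/*
 * Start a child that shares our address space, wait on the futex until
 * the kernel clears the TID word, and return the word's final value
 * (0 when it was cleared, -1 if the setup failed).
 */
int demo_cleartid(void)
{
    char *stack = malloc(DEMO_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /*
     * Like NPTL, point both the "set" and the "clear" address at the
     * same word: it holds the child's TID while the child is alive.
     */
    pid_t pid = clone(demo_child, stack + DEMO_STACK_SIZE,
                      CLONE_VM | CLONE_PARENT_SETTID |
                      CLONE_CHILD_CLEARTID | SIGCHLD,
                      NULL, &demo_tid_word, NULL, &demo_tid_word);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    /* Sleep until the word reads zero; FUTEX_WAIT rechecks the value. */
    int seen;
    while ((seen = __atomic_load_n(&demo_tid_word, __ATOMIC_ACQUIRE)) != 0)
        syscall(SYS_futex, &demo_tid_word, FUTEX_WAIT, seen, NULL, NULL, 0);

    /* Reap the child before releasing its stack. */
    waitpid(pid, NULL, 0);
    free(stack);
    return __atomic_load_n(&demo_tid_word, __ATOMIC_ACQUIRE);
}
```

Calling demo_cleartid() from a test program's main() should return 0
after a normal exit of the child.  The child shares the address space
(CLONE_VM) on purpose, since the kernel only performs the clear while
the mm still has another user.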

Andreas.

--
Andreas Schwab, SUSE Labs, schwab@xxxxxxx
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."