Re: [PATCH] exit: clear TIF_MEMDIE after exit_task_work
From: Michal Hocko
Date: Mon Jun 13 2016 - 07:50:48 EST
On Tue 07-06-16 14:50:14, Michal Hocko wrote:
> On Mon 14-03-16 17:39:43, Michal Hocko wrote:
> > On Tue 01-03-16 19:20:24, Michael S. Tsirkin wrote:
> > > On Tue, Mar 01, 2016 at 06:17:58PM +0100, Michal Hocko wrote:
> > [...]
> > > > Sorry, I could have been more verbose... The code would have to make sure
> > > > that the mm is still alive before calling g-u-p by
> > > > atomic_inc_not_zero(&mm->mm_users) and fail if the user count dropped to
> > > > 0 in the mean time. See how fs/proc/task_mmu.c does that (proc_mem_open
> > > > + m_start + m_stop.
> > > >
> > > > The biggest advanatage would be that the mm address space pin would be
> > > > only for the particular operation. Not sure whether that is possible in
> > > > the driver though. Anyway pinning the mm for a potentially unbounded
> > > > amount of time doesn't sound too nice.
> > >
> > > Hmm that would be another atomic on data path ...
> > > I'd have to explore that.
> >
> > Did you have any chance to look into this?
>
> So this is my take to get rid of mm_users pinning for an unbounded
> amount of time. This is even not compile tested. I am not sure how to
> handle when the mm goes away while there are still work items pending.
> It seems this is not handled current anyway and only shouts with a
> warning so this shouldn't cause a new regression AFAICS. I am not
> familiar with the vnet code at all so I might be missing many things,
> though. Does the below sound even remotely reasonable to you Michael?
I have checked the vnet code and it doesn't seem to rely on
copy_from_user/get_user AFAICS. Other users of use_mm() need to copy to
the userspace only as well. So we should be perfectly safe to OOM reap
address space even when it is shared by the kthread [1] so this is
not really needed for the OOM correctness purpose. It would be much
nicer if the kthread didn't pin the mm for two long outside of the OOM
handling as well of course but that lowers the priority of the change.
[1] http://lkml.kernel.org/r/20160613112348.GC6518@xxxxxxxxxxxxxx
--
Michal Hocko
SUSE Labs