Re: [PATCH]: Clean up of __alloc_pages

From: Paul Jackson
Date: Mon Nov 07 2005 - 04:47:25 EST


Nick wrote:
> Yeah, take a look at rmap.c as well, and some of the comments in
> changelogs if you need a better feel for it.

Ok - thanks.


> So your cpusets may be reused, but only as new cpusets. This should
> be no problem at all for you.

Correct - should be no problem.


> > And is the pair of operators:
> > task_lock(current), task_unlock(current)
> > really that much worse than the pair of operators
> > ...
> > preempt_disable, preempt_enable

That part still surprises me a little. Is there enough difference in
performance between:

1) task_lock, which is a spinlock on current->alloc_lock, and
2) rcu_read_lock, which is preempt_count++; barrier()

to justify a separate slab cache for cpusets and a little more code?

For all I know (not much) the task_lock might actually be cheaper ;).
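
Just to make sure we are comparing the same thing, here is roughly what
I picture for the two variants (a sketch only, not patch material; it
assumes the code sits in kernel/cpuset.c, where struct cpuset and its
mems_generation field are visible):

#include <linux/sched.h>
#include <linux/rcupdate.h>

/* 1) task_lock(): spin_lock on current->alloc_lock pins current->cpuset */
static int cpuset_mems_generation_locked(void)
{
        struct task_struct *tsk = current;
        int gen;

        task_lock(tsk);
        gen = tsk->cpuset->mems_generation;
        task_unlock(tsk);
        return gen;
}

/*
 * 2) rcu_read_lock(): on classic RCU this is just preempt_count++ plus
 *    a compiler barrier, but it only helps if a detached cpuset is
 *    freed via RCU, which is where the separate slab cache comes in.
 */
static int cpuset_mems_generation_rcu(void)
{
        int gen;

        rcu_read_lock();
        gen = rcu_dereference(current->cpuset)->mems_generation;
        rcu_read_unlock();
        return gen;
}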


> You may also have to be careful about memory ordering when setting
> a pointer which may be concurrently dereferenced by another CPU so
> that stale data doesn't get picked up.
>
> The set side needs an rcu_assign_pointer, and the dereference side
> needs rcu_dereference. Unless you either don't care about races,

I don't think I care ... I'm just sampling task->cpuset->mems_generation,
looking for it to change. Sooner or later, after it changes, I will get
an accurate read of it, realize it changed, and immediately down a
cpuset semaphore and reread all values of interest.

The semaphore down means doing an atomic_dec_return(), which imposes
a memory barrier, right?
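
Roughly this, in other words (again just a sketch, in the same
kernel/cpuset.c context as above; cpuset_sem, the function name and the
cached per-task generation field are stand-ins, not necessarily what
the real code ends up using):

/* Placeholder names: cpuset_sem, tsk->cpuset_mems_generation. */
static void cpuset_update_current_mems_allowed(void)
{
        struct task_struct *tsk = current;
        int gen;

        rcu_read_lock();
        gen = rcu_dereference(tsk->cpuset)->mems_generation;
        rcu_read_unlock();

        if (gen == tsk->cpuset_mems_generation)
                return;                 /* fastpath: nothing changed */

        /*
         * Slowpath: the generation moved (or we read a stale value and
         * will simply come back here on a later allocation).  down()
         * should give at least an acquire barrier (the
         * atomic_dec_return question above), so nothing reread under
         * the semaphore is stale.
         */
        down(&cpuset_sem);
        tsk->mems_allowed = tsk->cpuset->mems_allowed;
        tsk->cpuset_mems_generation = tsk->cpuset->mems_generation;
        up(&cpuset_sem);
}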


> My RCU suggestion was mainly an idea to get around your immediate
> problem with a lockless fastpath, rather than advocating it over
> any of the alternatives.

Understood. Thanks for your comments on the alternatives - they
seem reasonable.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.925.600.0401