Re: [PATCH] low-latency zap_page_range

From: Robert Love (rml@tech9.net)
Date: Mon Jul 22 2002 - 13:50:06 EST


On Mon, 2002-07-22 at 11:40, Andrew Morton wrote:

> Disagree, really. It's not a thing of beauty, but it is completely
> obvious what the code is doing and why it is doing it. There are
> no subtle side-effects and the whole lot can be understood from a
> single screenful. Unmaintainable code is code which requires you
> to spend days crawling all over the tree working out what it's doing
> any why it's doing it. It's awkward, but it's simple, and I wouldn't
> get very worked up over it personally.

I agree with your points although I do not find the previous version any
less of this.

> Hard call. In general I suspect it's best to hold onto a lock
> for as long as possible, get a lot of work done rather than
> reacquiring it all the time. But there are some locks which are
> occasionally held for a very long time and are often held for very
> short periods. Such as this one (mm->page_table_lock) and pagemap_lru_lock.
> In those cases, dropping the lock to let someone else get in, out and
> away may help. But not as much as a little bit of locking redesign...

Agreed.

> zap_page_range is sometimes called under another lock, (eg, vmtruncate_list).
> So there's nothing useful to be done there. Perhaps you should test
> ->preempt_count as well - if it's greater than one then don't bother with
> the lock dropping.

Hrm, this means cond_resched_lock() is out of the question here, then.

We could use break_spin_locks() but that would mean we drop the lock w/o
checking for need_resched (or wrap it in need_resched() and then we
check twice).

Finally, we could take your approach, change cond_resched_lock() to be:

        if (need_resched() && preempt_count() == 1) {
                spin_unlock_no_resched(lock);
                __cond_resched();
                spin_lock(lock);
        }

but then we need to break the function up into a preempt and a
non-preempt version as preempt_count() unconditionally returns 0 with
!CONFIG_PREEMPT. Right now the functions I posted do the right thing on
any combination of UP, SMP, and preempt.

Thoughts?

> This, btw, probably means that your code won't be very effective yet: large
> truncates will still exhibit poor latency. However, truncate _is_ something
> which we can improve algorithmically. One possibility is to implement a
> radix tree split function: split the radix tree into two trees along the
> truncation point, clean up the end and then drop the bulk of the pages
> outside locks.

I would _much_ prefer to tackle these issues via better algorithms...
your suggestion for truncate is good.

Note that I make an exception here (part of my argument for a preemptive
kernel was no more need to do "hackish" conditional scheduling and lock
breaking) because there really is not much you can do to this
algorithm. It does a lot of work on potentially a lot of data and the
cleanest solution we have is to just break it up into chunks.

        Robert Love

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Jul 23 2002 - 22:00:40 EST