[PATCH 0/4] struct file RCU optimizations V2

From: Christoph Lameter
Date: Thu Jun 22 2006 - 17:30:24 EST


Optimize RCU handling for struct file by relying on slab RCU handling

(Thanks for all the feedback on the RFC for this....)

Currently we schedule RCU frees for each file we free separately. That has
several drawbacks against the earlier file handling (in 2.6.5 f.e.), which
did not require RCU callbacks:

1. Excessive number of RCU callbacks can be generated causing long RCU
queues that in turn cause long latencies.

2. The cache hot object is not preserved between free and realloc. A close
followed by another open is very fast with the RCUless approach because
the last freed object is returned by the slab allocator that is
still cache hot. RCU free means that the object is not immediately
available again. The new object is cache cold and therefore open/close
performance tests show a significant degradation with the RCU
implementation.

One solution to this problem is to move the RCU freeing into the Slab
allocator by specifying SLAB_DESTROY_BY_RCU as an option at slab creation
time. The slab allocator will do RCU frees only when it is necessary
to dispose of slabs of objects (rare). So with that approach we can cut
out the RCU overhead significantly.

However, the slab allocator may return the object for another use even
before the RCU period has expired under SLAB_DESTROY_BY_RCU. This means
there is the (unlikely) possibility that the object is going to be
switched under us in sections protected by rcu_read_lock() and
rcu_read_unlock(). So we need to verify that we have acquired the correct
object after establishing a stable object reference (incrementing the
refcounter does that).

I have tested this on IA64 NUMA with aim7.

Patch was first discussed at:
http://marc.theaimsgroup.com/?t=115048747200002&r=1&w=2

V1->V2:
- Consolidate common code after finding fget in fget_light.
- Rename fu_list to its prior name f_list.
- Do not clear variables in get_empty_filp() that are cleared
later by other subsystems.
- Move the eventpoll initialization into the constructor.
- Split up the patch into patchset of 4 patches.

The patchset consists of 4 patches against 2.6.17 that are following
this message.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/