Re: [RFC, PATCH] kernel/rcu: add kfree_rcu

From: Lai Jiangshan
Date: Sun Jan 04 2009 - 00:56:22 EST


Paul E. McKenney wrote:
> On Fri, Jan 02, 2009 at 12:25:58PM +0100, Manfred Spraul wrote:
>> Based on the idea from Lai Jiangshan:
>>
>> There are 23 instances where the rcu callback just does
>>
>> kfree(containerof(rcu_member,struct whatever_struct,head));
>>
>> The 23 instances exist because there are 23 'struct whatever_struct' with
>> their individual rcu_member.
>>
>> The attached patch creates a generic kfree_rcu function that removes the need
>> for these 23 helpers: The offset from the container struct to the rcu member
>> is calculated at compile time and stored in head->func instead of the
>> function pointer.
>
> OK, these seem to me to avoid Nick Piggin's vfree() question in response
> to a roughly similar posting from Lai Jiangshan on November 18th
> (http://lkml.org/lkml/2008/11/18/83).

Why?

I think these two are different questions.
vfree() still can not be called from softirq context now.
And I proposed vfree_atomic() for RCU, but it can not be accepted.

Lai

>
>> Effect on .text: (x86_64)
>> +23 bytes in rcutree.c
>> - 9 bytes in ipc/sem.c
>>
>> Depending on the number of call_rcu instances in the .config, adding
>> kfree_rcu should save between 100 and 200 bytes in .text.
>>
>> What do you think?
>
> I am in favor, given changes noted below. This is a much nicer way to
> implement some function in the original RCU patch for Linux.
>
>> Is the interface ok? Incorrect use should generate compile time errors.
>> Is it ok to use IS_ERR(p)?
>
> I would suggest instead using the bottom bit to differentiate between
> these two cases, especially given that your approach makes it impossible
> for callback processing to notice a NULL function pointer. In addition,
> this approach would allow different types of allocators to be specified
> should this later prove to be helpful. You should not have to shift the
> offset because the rcu_head offset should always be a multiple of four
> (or eight on 64-bit architectures).
>
> And we really are running into bugs that are detected by RCU's seeing a
> null function pointer in the rcu_head structure at callback-invocation
> time. So, whatever encoding you choose, please leave a function-pointer
> value of zero as an invalid value!
>
>> And: How many seperate patches should I create?
>> I would propose 24 patches: one with the interface, 23 that
>> replace the individual callbacks.
>
> I defer to the various maintainers on this one. ;-)
>
> Thanx, Paul
>
>> Signed-off-by: Manfred Spraul <manfred@xxxxxxxxxxxxxxxx>
>> ---
>> include/linux/rcupdate.h | 33 +++++++++++++++++++++++++++++++++
>> ipc/sem.c | 10 ++--------
>> kernel/rcuclassic.c | 2 +-
>> kernel/rcupreempt.c | 2 +-
>> kernel/rcutree.c | 2 +-
>> 5 files changed, 38 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
>> index 1168fbc..d120014 100644
>> --- a/include/linux/rcupdate.h
>> +++ b/include/linux/rcupdate.h
>> @@ -269,6 +269,29 @@ extern void call_rcu(struct rcu_head *head,
>> extern void call_rcu_bh(struct rcu_head *head,
>> void (*func)(struct rcu_head *head));
>>
>> +/**
>> + * kfree_rcu - kfree after a grace period
>> + * @ptr: pointer to kfree
>> + * @rcu_member: name of the struct rcu_head member.
>> + *
>> + * Many rcu callbacks just call kfree() on the base structure. This helper
>> + * function calls kfree internally. The rcu_head structure must be embedded
>> + * in the to be freed structure.
>> + */
>> +
>> +static inline void __kfree_rcu(void *prcu, unsigned long offset)
>> +{
>> + void (*pfnc)(struct rcu_head *head) = (void (*)(struct rcu_head *head))offset;
>> +
>> + BUILD_BUG_ON(!__builtin_constant_p(-offset));
>> + BUILD_BUG_ON(!IS_ERR(pfnc));
>
> This allows us to drop the above check in favor of something like the
> following above (though a symbolic value in place of 0x1 would be nice):
>
> void (*pfnc)(struct rcu_head *head) = (void (*)(struct rcu_head *head))(offset | 0x1);
>
>> + call_rcu(prcu, pfnc);
>> +}
>> +
>> +#define kfree_rcu(pobj, entry) \
>> + __kfree_rcu(&((pobj)->entry), -offsetof(typeof(*(pobj)), entry))
>> +
>> /* Exported common interfaces */
>> extern void synchronize_rcu(void);
>> extern void rcu_barrier(void);
>> @@ -279,4 +302,14 @@ extern void rcu_barrier_sched(void);
>> extern void rcu_init(void);
>> extern int rcu_needs_cpu(int cpu);
>>
>> +/* helper functions */
>> +static inline void rcu_docallback(struct rcu_head *head)
>> +{
>
> Experience indicates that we should have some sort of WARN_ON() in
> the case of a NULL rcu_head, either here or where this is called. :-/
>
>> + if (IS_ERR(head->func)) {
>
> And in this "if" statement, just check the bottom bit.
>
>> + kfree(((char *)head) + (unsigned long)head->func);
>> + } else {
>> + head->func(head);
>> + }
>> +}
>> +
>> #endif /* __LINUX_RCUPDATE_H */
>> diff --git a/ipc/sem.c b/ipc/sem.c
>> index 0821224..4a68cf2 100644
>> --- a/ipc/sem.c
>> +++ b/ipc/sem.c
>> @@ -504,12 +504,6 @@ static int count_semzcnt (struct sem_array * sma, ushort semnum)
>> return semzcnt;
>> }
>>
>> -static void free_un(struct rcu_head *head)
>> -{
>> - struct sem_undo *un = container_of(head, struct sem_undo, rcu);
>> - kfree(un);
>> -}
>> -
>> /* Free a semaphore set. freeary() is called with sem_ids.rw_mutex locked
>> * as a writer and the spinlock for this semaphore set hold. sem_ids.rw_mutex
>> * remains locked on exit.
>> @@ -528,7 +522,7 @@ static void freeary(struct ipc_namespace *ns, struct kern_ipc_perm *ipcp)
>> un->semid = -1;
>> list_del_rcu(&un->list_proc);
>> spin_unlock(&un->ulp->lock);
>> - call_rcu(&un->rcu, free_un);
>> + kfree_rcu(un, rcu);
>
> This -is- nice! Short and sweet!!!
>
>> }
>>
>> /* Wake up all pending processes and let them fail with EIDRM. */
>> @@ -1347,7 +1341,7 @@ void exit_sem(struct task_struct *tsk)
>> update_queue(sma);
>> sem_unlock(sma);
>>
>> - call_rcu(&un->rcu, free_un);
>> + kfree_rcu(un, rcu);
>> }
>> kfree(ulp);
>> }
>> diff --git a/kernel/rcuclassic.c b/kernel/rcuclassic.c
>> index e503a00..b2f97d2 100644
>> --- a/kernel/rcuclassic.c
>> +++ b/kernel/rcuclassic.c
>> @@ -336,7 +336,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>> while (list) {
>> next = list->next;
>> prefetch(next);
>> - list->func(list);
>> + rcu_docallback(list);
>> list = next;
>> if (++count >= rdp->blimit)
>> break;
>> diff --git a/kernel/rcupreempt.c b/kernel/rcupreempt.c
>> index 0498265..8440dac 100644
>> --- a/kernel/rcupreempt.c
>> +++ b/kernel/rcupreempt.c
>> @@ -1110,7 +1110,7 @@ static void rcu_process_callbacks(struct softirq_action *unused)
>> spin_unlock_irqrestore(&rdp->lock, flags);
>> while (list) {
>> next = list->next;
>> - list->func(list);
>> + rcu_docallback(list);
>> list = next;
>> RCU_TRACE_ME(rcupreempt_trace_invoke);
>> }
>> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
>> index a342b03..29c88ce 100644
>> --- a/kernel/rcutree.c
>> +++ b/kernel/rcutree.c
>> @@ -901,7 +901,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
>> while (list) {
>> next = list->next;
>> prefetch(next);
>> - list->func(list);
>> + rcu_docallback(list);
>
> Good, you got all three of them! ;-)
>
>> list = next;
>> if (++count >= rdp->blimit)
>> break;
>> --
>> 1.6.0.6
>>
>
>
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/