Re: [RFC][PATCH 04/20] pspace: Allow multiple instances of the process id namespace

From: Eric W. Biederman
Date: Tue Feb 14 2006 - 00:55:17 EST


Kirill Korotaev <dev@xxxxx> writes:

>>>1.
>>>flags are neither atomic nor protected with any lock.
>> flags are atomic as they are a machine word. So they do not
>> require a read/modify write so they will either be written
>> or not written. Plus this allows write-sharing of the appropriate
>> cache line which is very polite (assuming the line is not shared with
>> something else)
> Eric I'm familiar with SMP, thanks :)
> Why do you write all this if you agreed below that have problems with it?

To establish a baseline of understanding and because you made an assertion
that is counter to my understanding.

>>>2. due to 1) you code is buggy. in this respect do_exit() is not serialized
>>>with copy_process().
>> Yes. I may need a memory barrier in there. I need to think
>> about that a little more.
> memory barrier doesn't help. you really need to think about.

Except for instances where you need an atomic read/modify/write the
only primitives you have are reads, writes and barriers.

I have all of the correct reads and writes therefore the only thing
that could help are barriers if the logic is otherwise sane.

A write barrier between the write of flags and the sending of the kill
signal will ensure the write of flags is globally visible first.
Although I am having a hard time convincing myself even that matters.
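
As a rough userspace illustration of the ordering described above (not
the patch's code), the sketch below uses C11 fences in roughly the role
of the kernel's smp_wmb()/smp_rmb(). The names pspace_model,
PSPACE_EXITING, kill_pending, model_exit() and model_observe() are all
hypothetical. It only shows that an observer who sees the kill also sees
the flag; it says nothing about whether the race with copy_process() is
closed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the patch's real fields. */
#define PSPACE_EXITING 0x1u

struct pspace_model {
    atomic_uint flags;        /* word-sized, so a plain store is never torn */
    atomic_bool kill_pending; /* models "SIGKILL has been posted" */
};

/* Exit side: publish the flag first, then post the kill. */
static void model_exit(struct pspace_model *ps)
{
    assert(ps != NULL);
    atomic_store_explicit(&ps->flags, PSPACE_EXITING, memory_order_relaxed);
    /* Write-barrier role: the flag store is ordered before the kill store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ps->kill_pending, true, memory_order_relaxed);
}

/*
 * Observer side: read the kill first, then the flag.
 * Returns true if the exiting flag is visible; *saw_kill reports the kill.
 * With both fences in place, *saw_kill == true implies a true return value.
 */
static bool model_observe(struct pspace_model *ps, bool *saw_kill)
{
    *saw_kill = atomic_load_explicit(&ps->kill_pending, memory_order_relaxed);
    /* Read-barrier role: the kill load is ordered before the flag load. */
    atomic_thread_fence(memory_order_acquire);
    unsigned int f = atomic_load_explicit(&ps->flags, memory_order_relaxed);
    return (f & PSPACE_EXITING) != 0;
}
```

Without the release fence, the two stores could become visible in
either order on a weakly ordered machine, which is the only case the
barrier is meant to rule out.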

>>>3. due to the same 1) reason
>>> > + kill_pspace_info(SIGKILL, (void *)1, tsk->pspace);
>>>can miss a task being forked. Bang!!!
>>
>> Well the only bad thing that can happen is that I get a process that
>> can run and observe pid == 1 has exited. So Bang!! is not too
>> painful.
> And what about references to pspace->child_reaper which was freed already?

That assumes that release_task() is called synchronously with do_exit(),
which is not the case. Looking at the code I do think release_task()
for the pspace leader can be called too soon. But that really has
nothing to do with whether or not all of its children got sent
SIGKILL.

That is a significant issue that needs to be fixed before I submit
this piece of code for inclusion into the kernel.

The issue, depending on the context, is that a process actively running
in kernel space could proceed for a long time before it returns to user
space and receives a signal. In that span of time it could execute just
about any code in the kernel.
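
A deliberately simplified model of that window, with every name
(task_model, post_sigkill(), run_syscall()) hypothetical: the pending
kill is only acted on at the return to user space, so any kernel work
still queued in the current call runs first.

```c
#include <stdbool.h>

/* Toy model of a task; not the kernel's task_struct. */
struct task_model {
    bool sigkill_pending;  /* set when the kill is posted */
    int  kernel_steps_run; /* kernel work done so far */
    bool alive;
};

/* Posting the signal only marks it pending; nothing is interrupted. */
static void post_sigkill(struct task_model *t)
{
    t->sigkill_pending = true;
}

/* One trip through the kernel: do the work, then check signals on exit. */
static void run_syscall(struct task_model *t, int steps)
{
    for (int i = 0; i < steps; i++)
        t->kernel_steps_run++; /* the pending SIGKILL is not examined here */

    /* Return-to-user-space point: only now is the signal acted on. */
    if (t->sigkill_pending)
        t->alive = false;
}
```

In this model a task that has already been sent the kill still completes
all of its remaining kernel steps, which is where a reference to a freed
child_reaper could be followed.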

Kirill thank you for spotting this.

This exchange seems to have a hostile and not a cooperative tone so
I will finish the investigation and bug fixing elsewhere.


I expect that there might be a few more issues like this. My only
expectation was that the code was complete enough to discuss semantics
and kernel mechanisms for creating namespaces, and the code has
successfully served that purpose.

Eric







