Re: Efficient IPC mechanism on Linux

From: Luca Veraldi
Date: Wed Sep 10 2003 - 05:06:31 EST


> For fun do the measurement on a pIV cpu. You'll be surprised.
> The microcode "mark dirty" (which is NOT a btsl, it gets done when you do
a write
> memory access to the page content) result will be in the 2000 to 4000
range I
> predict.

I'm not responsible for microarchitecture designer stupidity.
If a simple STORE assembler instruction will eat up 4000 clock cycles,
as you say here, well, I think all we Computer Scientists can go home and
give it up now.

> There are things like SMP synchronisation to do, but also
> if the cpu marks a page dirty in the page table, that means the page table
> changes which means the pagetable needs to be marked in the
> PMD. Which means the PMD changes, which means the PGD needs the PMD marked
> dirty. Etc Etc. It's microcode. It'll take several 1000 cycles.

Please, return to the fact.
Modifying the page contents is done by that part of benchmark application we
are not misuring.
That is, the code ***BEFORE*** the zc_send() (write() on pipe or whatever
you choose)
and ***AFTER*** a zc_receive() (or read() from pipe ot whatever else).
This is nice, thanks, but out of our interests.

We are only reading the relocation tables or inserting new entries in it.
Not modifying page contents.

> if you change a page table, you need to flush the TLB on all other cpus
> that have that same page table mapped, like a thread app running
> on all cpu's at once with the same pagetables.

Ok. Simply, this is not the case in my experiment.
This does not apply.
We have no threads. But only detached process address spaces.
Threads are a bit different from processes.

> why would you need a global lock for copying memory ?

System call sys_write() calls
locks_verify_area() which calls
locks_mandatory_area() which calls
lock_kernel()

oops...
global spin_lock locking...

SMP is crying...
But not so much, if the lock is not lasting much, is it right?

THIS IS NOT WHAT WE CALL A SHORT LOCK SECTION:

732 repeat:
733 /* Search the lock list for this inode for locks that conflict
with
734 * the proposed read/write.
735 */
736 for (fl = inode->i_flock; fl != NULL; fl = fl->fl_next) {
737 if (!(fl->fl_flags & FL_POSIX))
738 continue;
739 if (fl->fl_start > new_fl->fl_end)
740 break;
741 if (posix_locks_conflict(new_fl, fl)) {
742 error = -EAGAIN;
743 if (filp && (filp->f_flags & O_NONBLOCK))
744 break;
745 error = -EDEADLK;
746 if (posix_locks_deadlock(new_fl, fl))
747 break;
748
749 error = locks_block_on(fl, new_fl);
750 if (error != 0)
751 break;
752
753 /*
754 * If we've been sleeping someone might have
755 * changed the permissions behind our back.
756 */
757 if ((inode->i_mode & (S_ISGID | S_IXGRP)) !=
S_ISGID)
758 break;
759 goto repeat;
760 }
761 }

>From the source code of your Operating System.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/