Re: NUMA API observations

From: Andi Kleen
Date: Mon Jun 14 2004 - 18:51:48 EST


On Tue, Jun 15, 2004 at 07:40:04AM +1000, Anton Blanchard wrote:
>
> > interleave should always fall back to other nodes. Very weird.
> > Needs to be investigated. What were the actual arguments passed
> > to the syscalls?
>
> This one looks like a bug in my code. I wasn't setting numnodes high
> enough, so the node fallback lists weren't being initialised for some
> nodes.

Ok. Good to know.

That's a bad generic bug, right?

Interleaving doesn't really do much differently from an ordinary allocation,
except that the numa_node_id() index into the zone table is replaced with a
different number.
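
To illustrate that point, here is a toy model in C. It is only a sketch,
not the kernel code: toy_zonelist, free_pages, toy_alloc_from and the
other names are hypothetical, and the round-robin counter is just one
possible way to pick the "different number". The only thing it tries to
show is that both paths walk the same per-node fallback list and differ
only in the index used to select it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4

/* Hypothetical toy model: one fallback list per node, ordered by
 * preference. An entry of -1 terminates the list. */
struct toy_zonelist {
    int nodes[MAX_NODES + 1];
};

static struct toy_zonelist zonelists[MAX_NODES] = {
    {{0, 1, 2, 3, -1}},
    {{1, 0, 3, 2, -1}},
    {{2, 3, 0, 1, -1}},
    {{3, 2, 1, 0, -1}},
};

/* Pages still free on each node in the toy model. */
static int free_pages[MAX_NODES] = {2, 2, 2, 2};

/* Walk the fallback list selected by start_node and take a page from the
 * first node that still has one. Returns the node used, or -1 if every
 * node on the list is empty. */
int toy_alloc_from(int start_node)
{
    assert(start_node >= 0 && start_node < MAX_NODES);

    for (const int *n = zonelists[start_node].nodes; *n >= 0; n++) {
        if (free_pages[*n] > 0) {
            free_pages[*n]--;
            return *n;
        }
    }
    return -1;
}

/* Ordinary allocation: the list is selected by the local node. */
int toy_alloc_local(int local_node)
{
    return toy_alloc_from(local_node);
}

/* Interleaved allocation: same path, but the list is selected by a
 * round-robin counter instead of the local node. */
int toy_alloc_interleave(unsigned *next)
{
    int node = (int)(*next % MAX_NODES);

    (*next)++;
    return toy_alloc_from(node);
}
```

In this model an empty node simply hands the request to the next entry
on its list, so interleaving falls back exactly like a local allocation
does. A list that was never filled in (first entry -1) would have
nothing to fall back to, which is the kind of failure described above.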

> > > My kernel is compiled with NR_CPUS=128, and the setaffinity syscall must be
> > > called with a bitmap at least as big as the kernel's cpumask_t. I will
> > > submit a patch for this shortly.
> >
> > Umm, what a misfeature. We size the buffer up to the biggest
> > running CPU. That should be enough.
> >
> > IMHO that's just a kernel bug. How should a user space
> > application sanely discover the cpumask_t size needed by the kernel?
> > Whoever designed that was on crack.
>
> glibc now uses a select style interface. Unfortunately the interface has
> changed about three times by now.

I have no plans to track the glibc interface of the week for this,
and numactl must run with older glibc anyway, which is why I have always
used my own stub for this. I am not sure they even solved the problem
completely. With the upcoming numactl version it should work.
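
One way a user-space stub could cope with an unknown cpumask_t size is
to probe for it. The sketch below is only an illustration and makes no
claim about what numactl actually does: probe_cpumask_bytes,
getaffinity_fn, PROBE_LIMIT and fake_getaffinity_128 are hypothetical
names. It assumes a call with the shape of the raw sched_getaffinity
syscall that fails with EINVAL when the buffer is smaller than the
kernel's mask, and it doubles the buffer until the call succeeds. The
fake callback stands in for a kernel built with NR_CPUS=128.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback with the shape of the raw sched_getaffinity
 * syscall: returns a non-negative value on success, or -1 with errno
 * set on failure. */
typedef long (*getaffinity_fn)(int pid, size_t len, unsigned long *mask);

/* Arbitrary upper bound for the probe, in bytes. */
#define PROBE_LIMIT ((size_t)1 << 16)

/* Double the buffer until the call stops failing with EINVAL.
 * Returns the first buffer size in bytes that the call accepted,
 * or 0 if probing failed for any other reason. */
size_t probe_cpumask_bytes(getaffinity_fn getaff)
{
    assert(getaff != NULL);

    for (size_t len = sizeof(unsigned long); len <= PROBE_LIMIT; len *= 2) {
        unsigned long *buf = calloc(1, len);
        if (buf == NULL)
            return 0;

        long ret = getaff(0, len, buf);
        int err = errno;    /* save errno before free() can touch it */
        free(buf);

        if (ret >= 0)
            return len;     /* buffer was large enough */
        if (err != EINVAL)
            return 0;       /* unrelated failure, give up */
    }
    return 0;
}

/* Stand-in for a kernel compiled with NR_CPUS=128: it rejects any
 * buffer smaller than 128 bits (16 bytes). */
long fake_getaffinity_128(int pid, size_t len, unsigned long *mask)
{
    (void)pid;
    if (len < 128 / 8) {
        errno = EINVAL;
        return -1;
    }
    mask[0] = 1UL;          /* pretend only CPU 0 is allowed */
    return 128 / 8;
}
```

On Linux the callback could be a thin wrapper around
syscall(SYS_sched_getaffinity, pid, len, mask). The size found this way
can then be reused for the buffer passed to the setaffinity call.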

What I wonder is why IA64 worked though. We tested on it previously,
but somehow didn't run into this. The regression test suite
needs to check this better.

-Andi