Re: [PATCH] x86: Run checksumming in parallel across multiple ALUs

From: Eric Dumazet
Date: Mon Oct 21 2013 - 13:31:44 EST


On Sun, 2013-10-20 at 17:29 -0400, Neil Horman wrote:
> On Fri, Oct 18, 2013 at 02:15:52PM -0700, Eric Dumazet wrote:
> > On Fri, 2013-10-18 at 16:11 -0400, Neil Horman wrote:
> >
> > > #define BUFSIZ_ORDER 4
> > > #define BUFSIZ ((2 << BUFSIZ_ORDER) * (1024*1024*2))
> > > static int __init csum_init_module(void)
> > > {
> > > 	int i;
> > > 	__wsum sum = 0;
> > > 	struct timespec start, end;
> > > 	u64 time;
> > > 	struct page *page;
> > > 	u32 offset = 0;
> > >
> > > 	page = alloc_pages((GFP_TRANSHUGE & ~__GFP_MOVABLE), BUFSIZ_ORDER);
> >
> > Not sure what you are doing here, but it's not correct.
> >
> Why not? You asked for a test with 32 hugepages, so I allocated 32 hugepages.

Not really. We cannot allocate 64 Mbytes in a single alloc_pages() call
on x86. (MAX_ORDER = 11)
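
As a rough illustration of that limit, here is a user-space sketch (not
kernel code). PAGE_SIZE_BYTES and MAX_ORDER_X86 are illustrative constants
that assume 4 Kbyte base pages and MAX_ORDER = 11, and the helper names are
hypothetical. Valid orders run from 0 to MAX_ORDER - 1:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: 4 Kbyte base pages and MAX_ORDER = 11, as stated above. */
#define PAGE_SIZE_BYTES 4096ULL
#define MAX_ORDER_X86   11

/* Bytes covered by one allocation of the given order (2^order pages). */
uint64_t order_to_bytes(unsigned int order)
{
	/* 4096 is 2^12, so the shift must stay below 52 to fit in 64 bits. */
	assert(order < 52);
	return PAGE_SIZE_BYTES << order;
}

/* Largest single allocation: the highest valid order is MAX_ORDER - 1. */
uint64_t max_alloc_bytes(void)
{
	return order_to_bytes(MAX_ORDER_X86 - 1);
}
```

Under these assumptions max_alloc_bytes() comes to 4 Mbytes, well short of
the 64 Mbytes the test wanted.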

You noticed nothing because you did not write anything to the
64 Mbyte area (which would have corrupted memory), and you did not
use CONFIG_DEBUG_PAGEALLOC=y.

Your code read data out of bounds and was lucky, that's all...

You in fact allocated an order-4 block of (4096 << 4) bytes, i.e. 64 Kbytes.
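
To make the mismatch concrete, a small user-space sketch of the arithmetic
follows (again not kernel code). TEST_BUFSIZ mirrors the BUFSIZ macro from
the quoted test, renamed here to avoid the stdio.h macro of the same name.
BASE_PAGE_BYTES assumes 4 Kbyte pages, and the function names are
hypothetical:

```c
#include <stdint.h>

#define BUFSIZ_ORDER    4
/* Size the test loop walks: (2 << 4) = 32 chunks of 2 Mbytes each. */
#define TEST_BUFSIZ     ((2ULL << BUFSIZ_ORDER) * (1024 * 1024 * 2))
/* Assumed base page size on x86. */
#define BASE_PAGE_BYTES 4096ULL

/* What alloc_pages(..., BUFSIZ_ORDER) hands back: 2^order base pages. */
uint64_t allocated_bytes(void)
{
	return BASE_PAGE_BYTES << BUFSIZ_ORDER;
}

/* How far the checksum loop runs past the end of the allocation. */
uint64_t out_of_bounds_bytes(void)
{
	return TEST_BUFSIZ - allocated_bytes();
}
```

With these values the allocation covers 65536 bytes while the test walks
67108864 bytes, so all but the first 64 Kbytes are read out of bounds.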


