On Mon 15-05-17 14:12:10, Pasha Tatashin wrote:
> Hi Michal,
>
> After looking at your suggested memblock_virt_alloc_core() change again, I
> decided to keep what I have. I do not want to inline
> memblock_virt_alloc_internal(), because it is not a performance-critical
> path, and by inlining it we will unnecessarily increase the text size on all
> platforms.

I do not insist but I would really _prefer_ if the bool zero argument
didn't proliferate all over the memblock API.

> Also, because it will be very hard to make sure that no platform regresses
> by making memset() default in _memblock_virt_alloc_core() (as I already
> showed last week at least sun4v SPARC64 will require special changes in
> order for this to work), I decided to make it available only for the
> "deferred struct page init" case, as is already done in the patch.

I do not think this is the right approach. Your measurements just show
that sparc could have a more optimized memset for small sizes. If you
keep the same memset only for the parallel initialization then you
just hide this fact. I wouldn't worry about other architectures. All
sane architectures should simply work reasonably well when touching a
single cache line or only a few at the same time. If some arches really
suffer from small memsets then the initialization should be driven by a
specific ARCH_WANT_LARGE_PAGEBLOCK_INIT rather than making this depend
on DEFERRED_INIT. Or if you are too worried then make it opt-in and make
it depend on ARCH_WANT_PER_PAGE_INIT and make it enabled for x86 and
sparc after memset optimization.
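
To make the opt-in idea a bit more concrete, here is a rough userspace
sketch, not kernel code. ARCH_WANT_PER_PAGE_INIT is only the name floated
above and is defined by hand here; struct page_stub, alloc_memmap(),
init_one_page() and init_memmap() are made up for illustration, and
malloc()/calloc() merely stand in for the boot-time allocator. The point is
that the arch symbol, rather than DEFERRED_INIT or a bool argument, decides
who does the zeroing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical opt-in, named after the symbol floated above. An arch
 * with a memset() that is fast for small sizes would define it.
 */
#define ARCH_WANT_PER_PAGE_INIT 1

/* Made-up stand-in for struct page, for illustration only. */
struct page_stub {
	unsigned long flags;
	int refcount;
	void *mapping;
};

/* Stand-in for the boot-time allocation of the memmap. */
static struct page_stub *alloc_memmap(size_t nr)
{
#ifdef ARCH_WANT_PER_PAGE_INIT
	/* Opted in: skip the bulk zeroing, each page is zeroed later. */
	return malloc(nr * sizeof(struct page_stub));
#else
	/* Default: the allocator zeroes the whole range up front. */
	return calloc(nr, sizeof(struct page_stub));
#endif
}

/* Initialize one page; zero it here only if the arch opted in. */
static void init_one_page(struct page_stub *page)
{
#ifdef ARCH_WANT_PER_PAGE_INIT
	/* Small memset touching a single struct. */
	memset(page, 0, sizeof(*page));
#endif
	page->refcount = 1;
}

/* Allocate and initialize nr pages; returns NULL on allocation failure. */
static struct page_stub *init_memmap(size_t nr)
{
	struct page_stub *map;
	size_t i;

	assert(nr > 0);
	map = alloc_memmap(nr);
	if (!map)
		return NULL;
	for (i = 0; i < nr; i++)
		init_one_page(&map[i]);
	return map;
}
```

Either way the caller of init_memmap() sees fully initialized pages, and
no bool zero argument shows up in the allocation interface.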