Re: [PATCH v4 1/3] cacheinfo: Add arch specific early level initializer
From: Ricardo Neri
Date: Mon Aug 07 2023 - 19:21:18 EST

On Wed, May 31, 2023 at 10:03:36AM -0700, Ricardo Neri wrote:
> On Wed, May 31, 2023 at 01:22:01PM +0100, Sudeep Holla wrote:
> > On Thu, May 18, 2023 at 10:34:14AM +0100, Sudeep Holla wrote:
> > > On Wed, May 17, 2023 at 06:27:03PM -0700, Ricardo Neri wrote:
> > > > On Mon, May 15, 2023 at 10:36:08AM +0100, Sudeep Holla wrote:
> > > > > On Wed, May 10, 2023 at 12:12:07PM -0700, Ricardo Neri wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I had posted a patchset[1] for x86 that initializes
> > > > > > ci_cacheinfo(cpu)->num_leaves during SMP boot.
> > > > > >
> > > > >
> > > > > > It is not entirely clear to me whether this is just a cleanup or a fix
> > > > > > for some issue you faced. Just wanted to let you know that Prateek from
> > > > > > AMD has a couple of fixes [2]
> > > >
> > > > > My first patch is a bug fix. The second patch is a cleanup that results
> > > > > from fixing the bug in patch 1.
> > > >
> > > > >
> > > > > > This means that early_leaves and a late cache_leaves() are equal but
> > > > > > per_cpu_cacheinfo(cpu) is never allocated. Currently, x86 does not use
> > > > > > fetch_cache_info().
> > > > > >
> > > > > > I think that we should check here that per_cpu_cacheinfo() has been allocated to
> > > > > > take care of the case in which early and late cache leaves remain the same:
> > > > > >
> > > > > > - if (cache_leaves(cpu) <= early_leaves)
> > > > > > + if (cache_leaves(cpu) <= early_leaves && per_cpu_cacheinfo(cpu))
> > > > > >
> > > > > > Otherwise, in v6.4-rc1 + [1] I observe a NULL pointer dereference from
> > > > > > last_level_cache_is_valid().
> > > > > >
> > > > >
> > > > > > I think this is a different issue, as Prateek was just observing wrong
> > > > > > info after cpuhotplug operations. But his patches manage the
> > > > > > cpumap_populated state better. Can you please look at them as well?
> > > >
> > > > > I verified that the patches from Prateek fix a different issue. I was able
> > > > > to reproduce his issue. His patches fix it.
> > > >
> > > > I still see my issue after applying Prateek's patches.
> > >
> > > > Thanks, I thought it was a different issue, and it is good that you were
> > > > able to test them as well. Please post a proper patch for the NULL ptr
> > > > dereference you are hitting on x86.
> >
> > Gentle ping! Are you still observing NULL ptr dereference with v6.4-rcx ?
>
> Yes, I still observe it on v6.4-rc4.
>
> > If so, can you please post the fix as a proper patch ? Some of the patches
> > in v6.4-rc1 are being backported, so I prefer to have all the known issues
> > fixed before that happens. Sorry for the nag, but backport is the reason
> > I am pushing for this.
>
> > Sure. Sorry for the delay. I have the patch ready and will post it this
> > week. I will post it as part of my previous patches in [1].

I at last posted the patchset, Sudeep. You can take a look here:

https://lore.kernel.org/all/20230805012421.7002-1-ricardo.neri-calderon@xxxxxxxxxxxxxxx/

Sorry for the delay. I had to jump through various hoops before posting.

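For reference, the guard proposed earlier in this thread can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel code and not necessarily what the posted patchset does: the struct and function names below (cpu_cacheinfo_model, can_skip_alloc_unguarded(), can_skip_alloc_guarded(), ensure_allocated()) are hypothetical stand-ins. The info_list pointer plays the role of per_cpu_cacheinfo(cpu) and num_leaves the role of cache_leaves(cpu).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct cacheinfo_model {
	unsigned int level;
};

struct cpu_cacheinfo_model {
	unsigned int num_leaves;           /* models cache_leaves(cpu) */
	struct cacheinfo_model *info_list; /* models per_cpu_cacheinfo(cpu) */
};

/*
 * Original check: skip (re)allocation whenever the late leaf count does not
 * exceed the early one. It returns 1 even if nothing was ever allocated.
 */
static int can_skip_alloc_unguarded(const struct cpu_cacheinfo_model *ci,
				    unsigned int early_leaves)
{
	return ci->num_leaves <= early_leaves;
}

/* Proposed check: additionally require that the array already exists. */
static int can_skip_alloc_guarded(const struct cpu_cacheinfo_model *ci,
				  unsigned int early_leaves)
{
	return ci->num_leaves <= early_leaves && ci->info_list != NULL;
}

/* Allocate the leaf array unless the guarded check says it can be skipped. */
static int ensure_allocated(struct cpu_cacheinfo_model *ci,
			    unsigned int early_leaves)
{
	assert(ci != NULL);

	if (can_skip_alloc_guarded(ci, early_leaves))
		return 0;

	free(ci->info_list); /* free(NULL) is a no-op */
	ci->info_list = calloc(ci->num_leaves, sizeof(*ci->info_list));
	return ci->info_list ? 0 : -1;
}
```

With num_leaves already set to the early value and info_list still NULL, the unguarded check reports that allocation can be skipped, which is the state that later leads to the NULL pointer dereference. The guarded check falls through to the allocation instead.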
Thanks and BR,
Ricardo