Re: [PATCH v1 net 2/2] net: mscc: ocelot: fix possible memory conflict for vcap_props

From: Vladimir Oltean
Date: Sat Apr 30 2022 - 17:58:26 EST


On Sat, Apr 30, 2022 at 10:24:00AM -0700, Colin Foster wrote:
> Hi Vladimir,
>
> On Sat, Apr 30, 2022 at 02:24:57PM +0000, Vladimir Oltean wrote:
> > Hi Colin,
> >
> > On Fri, Apr 29, 2022 at 04:30:49PM -0700, Colin Foster wrote:
> > > Each instance of an ocelot struct has the ocelot_vcap_props structure being
> > > referenced. During initialization (ocelot_init), these vcap_props are
> > > detected and the structure contents are modified.
> > >
> > > In the case of the standard ocelot driver, there will probably only be one
> > > instance of struct ocelot, since it is part of the chip.
> > >
> > > For the Felix driver, there could be multiple instances of struct ocelot.
> > > In that scenario, the second time ocelot_init would get called, it would
> > > corrupt what had been done in the first call because they both reference
> > > *ocelot->vcap. Both of these instances were assigned the same memory
> > > location.
> > >
> > > Move this vcap_props memory to within struct ocelot, so that each instance
> > > can modify the structure to their heart's content without corrupting other
> > > instances.
> > >
> > > Fixes: 2096805497e2b ("net: mscc: ocelot: automatically detect VCAP
> > > constants")
> > >
> > > Signed-off-by: Colin Foster <colin.foster@xxxxxxxxxxxxxxxx>
> > > ---
> >
> > To prove an issue, you must come with an example of two switches which
> > share the same struct vcap_props, but contain different VCAP constants
> > in the hardware registers. Otherwise, what you call "corruption" is just
> > "overwriting with the same values".
> >
> > I would say that by definition, if two such switches have different VCAP
> > constants, they have different vcap_props structures, and if they have
> > the same vcap_props structure, they have the same VCAP constants.
> >
> > Therefore, even in a multi-switch environment, a second call to
> > ocelot_vcap_detect_constants() would overwrite the vcap->entry_width,
> > vcap->tg_width, vcap->sw_count, vcap->entry_count, vcap->action_count,
> > vcap->action_width, vcap->counter_words, vcap->counter_width with the
> > exact same values.
> >
> > I do not see the point in duplicating struct vcap_props per ocelot
> > instance.
> >
> > I assume you are noticing some problems with VSC7512? What are they?
>
> I'm not seeing issues, no. I was looking to implement the shared
> ocelot_vcap struct between the 7514 and (in-development 7512. In doing
> so I came across this realization that these per-file structures could
> be referenced multiple times, which was the point of this patch. If the
> structure were simply a const configuration there would be no issue, but
> since it is half const and half runtime populated it got more complicated.
>
> (that is likely why I didn't make it shared initially... which feels
> like ages ago at this point)
>
> Whether or not hardware exists that could be affected by this corner
> case I don't know.

VSC7512 documentation at the following link, VCAP constants are laid out
in tables 72-74 starting with page 112:
https://ww1.microchip.com/downloads/en/DeviceDoc/VMDS-10489.pdf

VSC7514 documentation at the following link, VCAP constants are laid out
in tables 71-73 starting with page 111:
https://ww1.microchip.com/downloads/en/DeviceDoc/VMDS-10491.pdf

As you can see, they are identical. Coincidence? I think not. After all,
they are from the same generation and have the same port count.
So even if the new vsc7512 driver reuses the vsc7514 structure for VCAP
properties, and is instantiated in a system where a vsc7514 switch is
also instantiated, I claim that nothing bad will happen. Are you
claiming otherwise? What is that bad thing, exactly?

>
> > Note that since VSC7512 isn't currently supported by the kernel, even a
> > theoretical corruption issue doesn't qualify as a bug, since there is no
> > way to reproduce it. All the Microchip switches supported by the kernel
> > are internal to an SoC, are single switches, and they have different
> > vcap_props structures.
>
> I see. So I do have a misunderstanding in the process.
>
> I shouldn't have submitted this to net, because it isn't an actual "bug"
> I observed. Instead it was a potential issue with existing code, and
> could have affected certain hardware configurations. How should I have
> sent this out? (RFC? net-next? separate conversation discussing the
> validity?)

I can't answer how you should have sent out this patch, since I don't
yet understand what is gained by making the change.

> Back to this patch in particular:
>
> You're saying there's no need to duplicate the vcap_props structure
> array per ocelot instance. Understood. Would it be an improvement to
> split up vcap into a const configuration section (one per hardware
> layout) and a detected set? Or would you have any other suggestion?

Maybe, although I assume the only reason why you're proposing that is
that you want to then proceed and make the detected properties unique
per switch, which again would increase the memory footprint of the
driver for a reason I am not following.

I suppose there's also the option of leaving code that isn't broken
alone?

> And, of course, I can drag this along with my 7512 patch set for now,

Why?

> or try to get this in now. This one feels like it is worth keeping
> separate...
>
> And thanks as always for your feedback!