Re: [PATCH 1/7] mips: dmi: Fix early remap on MIPS32

From: Serge Semin
Date: Tue Nov 28 2023 - 08:56:39 EST


Hi Arnd

On Tue, Nov 28, 2023 at 01:41:51PM +0100, Arnd Bergmann wrote:
> On Mon, Nov 27, 2023, at 17:23, Serge Semin wrote:
> > On Fri, Nov 24, 2023 at 10:03:49PM +0000, Jiaxun Yang wrote:
> >> 在2023年11月24日十一月 下午6:52,Serge Semin写道:
> >> > On Thu, Nov 23, 2023 at 05:33:31PM +0000, Jiaxun Yang wrote:
> >> >>
> >> [...]
> >> >> Actually dmi_setup() is called before cpu_cache_init().
> >> >
> >> > To preliminary sum the discussion, indeed there can be issues on the
> >> > platforms which have DMI initialized on the cached region. Here are
> >> > several solutions and additional difficulties I think may be caused by
> >> > implementing them:
> >>
> >> Thanks for such detailed conclusion!
> >> I'd prefer go solution 1, with comments below.
> >> >
> >> > 1. Use unmapped cached region utilization in the MIPS32 ioremap_prot()
> >> > method.
> >> > This solution a bit clumsy than it looks on the first glance.
> >> > ioremap_prot() can be used for various types of the cachability
> >> > mapping. Currently it's a default-cacheable CA preserved in the
> >> > _page_cachable_default variable and Write-combined CA saved in
> >> > boot_cpu_data.writecombine. Based on that we would have needed to use
> >> > the unmapped cached region utilized for the IO-remaps called with the
> >> > "_page_cachable_default" mapping flags passed only. The rest of the IO
> >> > range mappings, including the write-combined ones, would have been
> >> > handled by VM means. This would have made the ioremap_prot() a bit
> >> > less maintainable, but still won't be that hard to implement (unless I
> >> > miss something):
> >> > --- a/arch/mips/mm/ioremap.c
> >> > +++ b/arch/mips/mm/ioremap.c
> >> > /*
> >> > - * Map uncached objects in the low 512mb of address space using KSEG1,
> >> > - * otherwise map using page tables.
> >> > + * Map uncached/default-cached objects in the low 512mb of address
> >> > + * space using KSEG1/KSEG0, otherwise map using page tables.
> >> > */
> >> > - if (IS_LOW512(phys_addr) && IS_LOW512(last_addr) &&
> >> > - flags == _CACHE_UNCACHED)
> >> > - return (void __iomem *) CKSEG1ADDR(phys_addr);
> >> > + if (IS_LOW512(phys_addr) && IS_LOW512(last_addr)) {
> >> > + if (flags == _CACHE_UNCACHED)
> >> > + return (void __iomem *) CKSEG1ADDR(phys_addr);
> >> > + else if (flags == _page_cachable_default)
> >> > + return (void __iomem *) CKSEG0ADDR(phys_addr);
> >> > + }
> >> >
> >> > Currently I can't figure out what obvious problems it may cause. But
> >> > It seems suspicious that the cacheable IO-mapping hasn't been
> >> > implemented by the unmapped cacheable region in the first place. In
> >> > anyway this solution looks more dangerous than solution 2. because it
> >> > affects all the MIPS32 platforms at once.
> >>
> >> I just made a quick grep in tree, and it seems like we don't have much
> >> user of ioremap_cache (as well as ioremap_uc/wc) here so I think it is
> >> a safe assumption.
> >
> > I wouldn't say there aren't much users. ioremap_wc() and it's
> > devm-version is widely utilized in the GPU and network and some other
> > subsystems. ioremap_cache() isn't widespread indeed. In anyway even a
> > single user must be supported in safely calling the method if it's
> > provided by the arch-code, otherwise the method could be considered as
> > just a bogus stub to have the kernel successfully built. I bet you'll
> > agree with that. But that's not the point in this case.
>

> ioremap_wc() is useful for mapping PCI attached memory such as frame
> buffers,

Thanks for clarification. That's actually the reason why I originally
added the ioremap_wc() support to the MIPS32 arch. In one of the
projects we had SM750/SM768-based graphic cards attached to the
MIPS32-based SoC. Using ioremap_wc() for the framebuffer significantly
improved the graphic subsystem performance indeed. It was mostly
required for the SM750 chips though, which provided a narrow and slow
PCIe Gen.1 x1 interface.

> but ioremap_cache() is generally underspecified because the
> resulting pointer is neither safe to dereference nor to pass into
> readl()/writel()/memcpy_fromio() on all architectures.

I don't know about ARM64 (which for instance has it utilized to access
the DMI region), but at least in case of MIPS32 (a fortiori MIPS64
seeing the ioremap_cache() method actually returns a pointer to the
uncached region) I don't see a reason why it wouldn't be safe in both
cases described by you. All IO and memory regions are accessed by the
generic load and store instructions. The only difference is that the
MMIO-space accessors normally implies additional barriers, which just
slow down the execution, but shouldn't cause any other problem. Could
you clarify why do you think otherwise?

>
> There was an effort to convert the remaining ioremap_cache() calls
> into memremap() a few years ago, not sure if that's still being worked
> on but it would be the right thing to do.

I see. Thanks for the pointing out to that. I guess it could be done
for MIPS too (at least on our MIPS32 platform DMI is just a memory
region pre-initialized by the bootloader), but the conversion would
require much efforts. Alas currently I can't afford to get it
implemented in the framework of this patchset. (I saved your note in
my MIPS TODO list though. Let's hope eventually I'll be able to get
back to this topic.)

-Serge(y)

>
> Arnd