Re: How to fix WARN from drivers/base/dd.c in next-20200401 if CONFIG_MODULES=y?

From: Geert Uytterhoeven
Date: Fri Apr 03 2020 - 09:06:08 EST


Hi John,

On Fri, Apr 3, 2020 at 1:47 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> On Thu, Apr 2, 2020 at 7:27 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
> > On Thu, Apr 2, 2020 at 3:17 AM Yoshihiro Shimoda
> > <yoshihiro.shimoda.uh@xxxxxxxxxxx> wrote:
> > >
> > > I found an issue after applied the following patches:
> > > ---
> > > 64c775f driver core: Rename deferred_probe_timeout and make it global
> > > 0e9f8d0 driver core: Remove driver_deferred_probe_check_state_continue()
> > > bec6c0e pinctrl: Remove use of driver_deferred_probe_check_state_continue()
> > > e2cec7d driver core: Set deferred_probe_timeout to a longer default if CONFIG_MODULES is set
>
> Note that just setting deferred_probe_timeout = -1 like for the
> CONFIG_MODULES=n case doesn't help.
>
> > > c8c43ce driver core: Fix driver_deferred_probe_check_state() logic
> > > ---
> > >
> > > Before these patches, on my environment [1], some device drivers
> > > which has iommus property output the following message when probing:
> > >
> > > [ 3.222205] ravb e6800000.ethernet: ignoring dependency for device, assuming no driver
> > > [ 3.257174] ravb e6800000.ethernet eth0: Base address at 0xe6800000, 2e:09:0a:02:eb:2d, IRQ 117.
> > >
> > > So, since ravb driver is probed within 4 seconds, we can use NFS rootfs correctly.
> > >
> > > However, after these patches are applied, since the patches are always waiting for 30 seconds
> > > for of_iommu_configure() when IOMMU hardware is disabled, drivers/base/dd.c output WARN.
> > > Also, since ravb cannot be probed for 30 seconds, we cannot use NFS rootfs anymore.
> > > JFYI, I copied the kernel log to the end of this email.
> >
> > Hey,
> > Terribly sorry for the trouble. So as Robin mentioned I have a patch
> > to remove the WARN messages, but I'm a bit more concerned about why
> > after the 30 second delay, the ethernet driver loads:
> > [ 36.218666] ravb e6800000.ethernet eth0: Base address at
> > 0xe6800000, 2e:09:0a:02:eb:2d, IRQ 117.
> > but NFS fails.
> >
> > Is it just that the 30 second delay is too long and NFS gives up?
>
> I added some debug code to mount_nfs_root(), which shows that the first
> 3 tries happen before ravb is instantiated, and the last 3 tries happen
> after. So NFS root should work, if the network works.
>
> However, it seems the Ethernet PHY is never initialized, hence the link
> never becomes ready.

So the issue is not nfsroot in-se, but the ip-config that needs to
happen before that.

The call to wait_for_devices() in ip_auto_config() (which is a
late_initcall()) returns -ENODEV, as the network device hasn't probed
successfully yet, so ip-config is aborted.

The (whitespace-damaged) patch below fixes that, but may have unintended
side-effects.

--- a/net/ipv4/ipconfig.c
+++ b/net/ipv4/ipconfig.c
@@ -1469,7 +1469,11 @@ static int __init ip_auto_config(void)
/* Wait for devices to appear */
err = wait_for_devices();
if (err)
+#ifdef IPCONFIG_DYNAMIC
+ goto try_try_again;
+#else
return err;
+#endif

/* Setup all network devices */
err = ic_open_devs();

Probably we want at least some CONFIG_ROOT_NFS || CONFIG_CIFS_ROOT,
and ROOT_DEV == Root_NFS || ROOT_DEV == Root_CIFS checks.

Thanks for your comments!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds