Re: New warning in nvme_setup_discard
From: Oleksandr Natalenko
Date: Tue Jul 20 2021 - 05:09:38 EST
Hello, Ming.
On Monday, 19 July 2021 8:27:29 CEST Oleksandr Natalenko wrote:
> On Monday, 19 July 2021 3:40:40 CEST Ming Lei wrote:
> > On Sat, Jul 17, 2021 at 02:35:14PM +0200, Oleksandr Natalenko wrote:
> > > On Saturday, 17 July 2021 14:19:59 CEST Oleksandr Natalenko wrote:
> > > > On Saturday, 17 July 2021 14:11:05 CEST Oleksandr Natalenko wrote:
> > > > > On Saturday, 17 July 2021 11:35:32 CEST Ming Lei wrote:
> > > > > > Maybe you need to check if the build is OK, I can't reproduce it
> > > > > > in my VM, and BFQ is still builtin:
> > > > > >
> > > > > > [root@ktest-01 ~]# uname -a
> > > > > > Linux ktest-01 5.14.0-rc1+ #52 SMP Fri Jul 16 18:56:36 CST 2021 x86_64 x86_64 x86_64 GNU/Linux
> > > > > > [root@ktest-01 ~]# cat /sys/block/nvme0n1/queue/scheduler
> > > > > > [none] mq-deadline kyber bfq
> > > > >
> > > > > I don't think this is an issue with the build… BTW, with
> > > > > `initcall_debug`:
> > > > >
> > > > > ```
> > > > > [ 0.902555] calling bfq_init+0x0/0x8b @ 1
> > > > > [ 0.903448] initcall bfq_init+0x0/0x8b returned -28 after 507 usecs
> > > > > ```
> > > > >
> > > > > -ENOSPC? Why? Also re-tested with the latest git tip, same result :(.
> > > >
> > > > OK, one extra pr_info, and I see this:
> > > >
> > > > ```
> > > > [ 0.871180] blkcg_policy_register: BLKCG_MAX_POLS too small
> > > > [ 0.871612] blkcg_policy_register: -28
> > > > ```
> > > >
> > > > What does it mean please :)? The value seems to be hard-coded:
> > > >
> > > > ```
> > > > include/linux/blkdev.h
> > > > 60:#define BLKCG_MAX_POLS 5
> > > > ```
> > >
> > > OK, after increasing this to 6 I've got my BFQ back. Please see [1].
> > >
> > > [1] https://lore.kernel.org/linux-block/20210717123328.945810-1-oleksandr@natalenko.name/
> >
> > OK, after you fixed the issue in blkcg_policy_register(), can you
> > reproduce the discard issue on v5.14-rc1 with BFQ applied? If yes,
> > can you test the patch I posted previously?
>
> Yes, the issue is reproducible with both v5.13.2 and v5.14-rc1. I haven't
> managed to reproduce it with v5.13.2 + your patch. Now I will build
> v5.14-rc2 + your patch and test further.
I'm still hammering v5.14-rc2 + your patch, and I cannot reproduce the issue.
Since I do not have a reliable reproducer (I just run a kernel build, and the
issue eventually shows up, usually within the first couple of attempts), how
long should I keep hammering it before your fix can be considered proven?
Thanks.
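P.S. For anyone following along: the -28 (-ENOSPC) from
blkcg_policy_register() quoted above is what one would expect from a
fixed-size table of policy slots with no free entry left. Below is a minimal
userspace sketch of that behaviour. It is not the kernel code:
MODEL_MAX_POLS, struct model_policy, model_policy_register() and
model_register_n() are made-up names for illustration, and it assumes that
registration simply claims the first free slot.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified userspace model, NOT the kernel implementation. */
#define MODEL_MAX_POLS 5	/* mirrors the BLKCG_MAX_POLS value quoted above */

struct model_policy {
	const char *name;	/* hypothetical field, for illustration only */
	int plid;		/* slot index assigned on registration */
};

/* Fixed-size slot table; a NULL entry means the slot is free. */
static struct model_policy *model_slots[MODEL_MAX_POLS];

/* Claim the first free slot; return -ENOSPC when the table is full. */
static int model_policy_register(struct model_policy *pol)
{
	assert(pol != NULL);

	for (int i = 0; i < MODEL_MAX_POLS; i++) {
		if (model_slots[i] == NULL) {
			model_slots[i] = pol;
			pol->plid = i;
			return 0;
		}
	}
	return -ENOSPC;	/* ENOSPC is 28 on Linux, hence the -28 in the log */
}

/* Register the first n policies; return the result of the last attempt. */
static int model_register_n(struct model_policy *pols, int n)
{
	int ret = 0;

	for (int i = 0; i < n; i++)
		ret = model_policy_register(&pols[i]);
	return ret;
}
```

With five slots, registering five policies through model_register_n()
succeeds, and a sixth model_policy_register() call returns -ENOSPC. In this
model, raising the limit to 6 would let one more policy register.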
--
Oleksandr Natalenko (post-factum)