Re: [PATCH] lib/dynamic_queue_limits.c: relax BUG_ON to WARN_ON in dql_complete()

From: Eric Dumazet
Date: Wed Oct 18 2017 - 12:29:54 EST


On Wed, 2017-10-18 at 16:45 +0100, Ard Biesheuvel wrote:
> Even though calling dql_completed() with a count that exceeds the
> queued count is a serious error, it still does not justify bringing
> down the entire kernel with a BUG_ON(). So relax it to a WARN_ON()
> instead.
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx>
> ---
> lib/dynamic_queue_limits.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/dynamic_queue_limits.c b/lib/dynamic_queue_limits.c
> index f346715e2255..24ce495d78f3 100644
> --- a/lib/dynamic_queue_limits.c
> +++ b/lib/dynamic_queue_limits.c
> @@ -23,7 +23,7 @@ void dql_completed(struct dql *dql, unsigned int count)
> num_queued = ACCESS_ONCE(dql->num_queued);
>
> /* Can't complete more than what's in queue */
> - BUG_ON(count > num_queued - dql->num_completed);
> + WARN_ON(count > num_queued - dql->num_completed);
>
> completed = dql->num_completed + count;
> limit = dql->limit;

So instead of fixing the faulty driver, you'll have strange lockups, and
force your users to reboot anyway, after annoying periods where
"Internet does not work".

These kinds of errors should be found when testing a new device driver
or new kernel.

Have you found the root cause?