Re: [PATCH] coccinelle: semantic patch to check for inappropriate do_div() calls

From: Wen Yang
Date: Sat Jan 04 2020 - 08:51:21 EST




On 2020/1/4 4:55 äå, Julia Lawall wrote:
On Sat, 4 Jan 2020, Wen Yang wrote:



On 2020/1/4 3:00 äå, Julia Lawall wrote:
On Sat, 4 Jan 2020, Wen Yang wrote:

do_div() does a 64-by-32 division.
When the divisor is unsigned long, u64, or s64,
do_div() truncates it to 32 bits, this means it
can test non-zero and be truncated to zero for division.
This semantic patch is inspired by Mateusz Guzik's patch:
commit b0ab99e7736a ("sched: Fix possible divide by zero in avg_atom()
calculation")

Signed-off-by: Wen Yang <wenyang@xxxxxxxxxxxxxxxxx>
Cc: Julia Lawall <Julia.Lawall@xxxxxxx>
Cc: Gilles Muller <Gilles.Muller@xxxxxxx>
Cc: Nicolas Palix <nicolas.palix@xxxxxxx>
Cc: Michal Marek <michal.lkml@xxxxxxxxxxx>
Cc: Matthias Maennich <maennich@xxxxxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: cocci@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
scripts/coccinelle/misc/do_div.cocci | 66
++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
create mode 100644 scripts/coccinelle/misc/do_div.cocci

diff --git a/scripts/coccinelle/misc/do_div.cocci
b/scripts/coccinelle/misc/do_div.cocci
new file mode 100644
index 0000000..f1b72d1
--- /dev/null
+++ b/scripts/coccinelle/misc/do_div.cocci
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/// do_div() does a 64-by-32 division.
+/// When the divisor is unsigned long, u64, or s64,
+/// do_div() truncates it to 32 bits, this means it
+/// can test non-zero and be truncated to zero for division.
+///
+//# This makes an effort to find those inappropriate do_div () calls.
+//
+// Confidence: Moderate
+// Copyright: (C) 2020 Wen Yang, Alibaba.
+// Comments:
+// Options: --no-includes --include-headers
+
+virtual context
+virtual org
+virtual report
+
+@depends on context@
+expression f;
+long l;
+unsigned long ul;
+u64 ul64;
+s64 sl64;
+
+@@
+(
+* do_div(f, l);
+|
+* do_div(f, ul);
+|
+* do_div(f, ul64);
+|
+* do_div(f, sl64);
+)
+
+@r depends on (org || report)@
+expression f;
+long l;
+unsigned long ul;
+position p;
+u64 ul64;
+s64 sl64;
+@@
+(
+do_div@p(f, l);
+|
+do_div@p(f, ul);
+|
+do_div@p(f, ul64);
+|
+do_div@p(f, sl64);
+)
+
+@script:python depends on org@
+p << r.p;
+@@
+
+msg="WARNING: WARNING: do_div() does a 64-by-32 division, which may
truncation the divisor to 32-bit"
+coccilib.org.print_todo(p[0], msg)
+
+@script:python depends on report@
+p << r.p;
+@@
+
+msg="WARNING: WARNING: do_div() does a 64-by-32 division, which may
truncation the divisor to 32-bit"
+coccilib.report.print_report(p[0], msg)

A few small issues: You have WARNING: twice in each case, and truncation
should be truncate.


Thanks for your comments, we will fix it soon.

Is there any generic strategy for fixing these issues?


We have done some experiments, such as:
https://lkml.org/lkml/2020/1/2/1354

Thanks. Actually, I would appreciate knowing about such experiments when
the semantic patch is submitted, since eg in this case I am not really an
expert in this issue.


- avg = rec->time;
- do_div(avg, rec->counter);
+ avg = div64_ul(rec->time, rec->counter);

--> Function replacement was performed here,
and simple code cleanup was also performed.


- do_div(stddev, rec->counter * (rec->counter - 1) * 1000);
+ stddev = div64_ul(stddev,
+ rec->counter * (rec->counter - 1) * 1000);

--> Only the function replacement is performed here (because the variable
âstddevâ corresponds to a more complicated equation, cleaning it will reduce
readability).

Would it be reasonable to extend the warning to say "consider using
div64_ul instead"? Or do you think it is obvious to everyone?


Thank you for your comments.
We plan to modify it as follows:
msg="WARNING: do_div() does a 64-by-32 division, please consider using div64_ul, div64_long, div64_u64 or div64_s64 instead."

In addition, there are some codes that do not need to be modified:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/net/can/dev.c#n263

Would it be worth having a special case for constants and checking whether
the value is obviously safe and no warning is needed?

Thanks.
This is very valuable in reducing false positives, and we'll try to implement it.

--
Best Wishes,
Wen

So we just print a warning.
As for how to fix it, we need to analyze the code carefully.

--
Best Wishes,
Wen