Re: [RFC PATCH] coccinelle: check for integer overflow in binary search
From: Denis Efremov
Date: Thu Sep 05 2019 - 03:36:37 EST
On 05.09.2019 09:20, Julia Lawall wrote:
>
>
> On Thu, 5 Sep 2019, Denis Efremov wrote:
>
>> This is an RFC. I will resend the patch after feedback. Currently
>> I'm preparing big patchset with bsearch warnings fixed. The rule will
>> be a part of this patchset if it will be considered good enough for
>> checking.
>>
>> There is a known integer overflow error [1] in the binary search
>> algorithm. Google faced it in 2006 [2]. This rule checks midpoint
>> calculation in binary search for overflow, i.e., (l + h) / 2.
>> Not every match is an actual error since the array could be small
>> enough. However, a custom implementation of binary search is
>> error-prone and it's better to use the library function (lib/bsearch.c)
>> or to apply defensive programming for midpoint calculation.
>>
>> [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
>> [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
>>
>> Signed-off-by: Denis Efremov <efremov@xxxxxxxxx>
>> ---
>> scripts/coccinelle/misc/bsearch.cocci | 80 +++++++++++++++++++++++++++
>> 1 file changed, 80 insertions(+)
>> create mode 100644 scripts/coccinelle/misc/bsearch.cocci
>>
>> diff --git a/scripts/coccinelle/misc/bsearch.cocci b/scripts/coccinelle/misc/bsearch.cocci
>> new file mode 100644
>> index 000000000000..a99d9a8d3ee5
>> --- /dev/null
>> +++ b/scripts/coccinelle/misc/bsearch.cocci
>> @@ -0,0 +1,80 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/// Check midpoint calculation in binary search algorithm for integer overflow
>> +/// error [1]. Google faced it in 2006 [2]. Not every match is an actual error
>> +/// since the array can be small enough. However, a custom implementation of
>> +/// binary search is error-prone and it's better to use the library function
>> +/// (lib/bsearch.c) or to apply defensive programming for midpoint calculation.
>> +///
>> +/// [1] https://en.wikipedia.org/wiki/Binary_search_algorithm#Implementation_issues
>> +/// [2] https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
>> +//
>> +// Confidence: Medium
>> +// Copyright: (C) 2019 Denis Efremov, ISPRAS
>> +// Comments:
>> +// Options: --no-includes --include-headers
>> +
>> +virtual report
>> +virtual org
>> +
>> +@r depends on org || report@
>> +identifier l, h, m;
>> +statement S;
>> +position p;
>> +// to match 1 in <<
>> +// to match 2 in /
>> +// Can't use exact values, e.g. 2, because it fails to match 2L.
>> +// TODO: Is there an isomorphism for 2, 2L, 2U, 2UL, 2ULL, etc?
>> +constant c;
>
> As far as I can see, you aren't checking for 2 at all at the moment?
Yes, there are no false positives even without pinning constants to 1, 2.
However, it's better to express this in the rule.
> You
> should be able to say constant c = {2, 2L, etc};. Actually, we do
> consider several variants of 0, so it could be reasonable to allow eg 2 to
> match other variants as well.
It looks like integer literals aren't fully supported. When I'm trying to write
'constant c = {2L}; ' it fails with int_of_string error.
Denis