Re: [PATCH 12/24] getgrent_r.3: Use sizeof() to get buffer size (instead of hardcoding macro name)

From: Alejandro Colomar
Date: Fri Sep 11 2020 - 13:22:22 EST

On 2020-09-11 17:28, Alejandro Colomar wrote:
Hi Stefan,

On 2020-09-11 16:35, Stefan Puiu wrote:
> Hi,
> On Fri, Sep 11, 2020 at 12:15 AM Alejandro Colomar
> <colomar.6.4.3@xxxxxxxxx> wrote:
>> Signed-off-by: Alejandro Colomar <colomar.6.4.3@xxxxxxxxx>
>> ---
>>   man3/getgrent_r.3 | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> diff --git a/man3/getgrent_r.3 b/man3/getgrent_r.3
>> index 81d81a851..76deec370 100644
>> --- a/man3/getgrent_r.3
>> +++ b/man3/getgrent_r.3
>> @@ -186,7 +186,7 @@ main(void)
>>       setgrent();
>>       while (1) {
>> -        i = getgrent_r(&grp, buf, BUFLEN, &grpp);
>> +        i = getgrent_r(&grp, buf, sizeof(buf), &grpp);
> I'm worried that less attentive people might copy/paste parts of this
> in their code, where maybe buf is just a pointer, and expect it to
> work. Maybe leaving BUFLEN here is useful as a reminder that they need
> to change something to adapt the code?
> Just my 2 cents,
> Stefan.
That's a very good point.

So we have 3 options and I will propose now a 4th one.  Let's see all
of them and see which one is better for the man pages.

1.-    Use the macro everywhere.

- It is still valid when the buffer is a pointer and not an array.
- Hardcodes the initializer.  If the array is later initialized with a
  different value, it may produce a silent bug, or a compilation break.

2.-    Use sizeof() everywhere, and the macro for the initializer.

- It is valid as long as the buffer is an array.
- If the code gets into a function, and the buffer is then a pointer,
  it will definitively produce a silent bug.

3.-    Use sizeof() everywhere, and a magic number for the initializer.

The same as 2.

4.-    Use ARRAY_BYTES() macro

- It is always safe and when code changes, it may break compilation, but
  never a silent bug.
- Add a few lines of code.  Maybe too much complexity for an example.
  But I'd say that it is the only safe option, and in real code it
  should probably be used more, so maybe it's good to show a good practice.

Here's my definition for ARRAY_BYTES(), which is makes use of
must_be_array() similar to the kernel ARRAY_SIZE():


#define is_same_type(a, b)                    \
    __builtin_types_compatible_p(__typeof__(a), __typeof__(b))
#define is_array(a)            (!is_same_type((a), &(a)[0]))
#define must_be__(e, ...)    (                \
    0 * (int)sizeof(                    \
        struct {                    \
            _Static_assert((e)  __VA_OPT__(,)  __VA_ARGS__); \
            char ISO_C_forbids_a_struct_with_no_members__; \
        }                        \
    )                            \
#define must_be_array__(a)    must_be__(is_array(a), "Not an array!")
#define ARRAY_BYTES(arr)    (sizeof(arr) + must_be_array__(arr))

The macro makes use of quite a few GNU extensions, though, which might
be too much to ask.

Actually, I was also going to propose this macro for the kernel itself,
to make it a bit safer.

There's a much simpler version of ARRAY_BYTES(), which requires the
macro to be defined in a header that is not a system header (to avoid
silencing warnings), and also requires a recent version of the compiler
to show a warning:


#define ARRAY_SIZE(arr)        (sizeof(arr) / sizeof((arr)[0])
#define ARRAY_BYTES(arr)    (sizeof((arr)[0]) * ARRAY_SIZE(arr))

What do you all think about the 5 different options?  I don't know which
one is better.

I'd say 4.2 is the best one for the man pages. Just 2 one-line macro definitions, very good safety, and pretty clear code.

Your thoughts?