Re: [PATCH] modpost: remove use of non-standard strsep() in HOSTCC code
From: Masahiro Yamada
Date: Sun Jun 28 2020 - 10:21:09 EST
On Sun, Jun 28, 2020 at 5:20 PM H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
>
>
> > Am 28.06.2020 um 09:52 schrieb Masahiro Yamada <masahiroy@xxxxxxxxxx>:
> >
> > On Sun, Jun 28, 2020 at 3:17 PM H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >>> Am 28.06.2020 um 07:51 schrieb Masahiro Yamada <masahiroy@xxxxxxxxxx>:
> >>>
> >>> On Thu, Jun 25, 2020 at 5:47 PM H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
> >>>>
> >>>> strsep() is neither standard C nor POSIX and used outside
> >>>> the kernel code here. Using it here requires that the
> >>>> build host supports it out of the box which is e.g.
> >>>> not true for a Darwin build host and using a cross-compiler.
> >>>> This leads to:
> >>>>
> >>>> scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
> >>>> return strsep(stringp, "\n");
> >>>> ^
> >>>>
> >>>> and a segfault when running MODPOST.
> >>>>
> >>>> See also: https://stackoverflow.com/a/7219504
> >>>>
> >>>> So let's add some lines of code separating the string at the
> >>>> next newline character instead of using strsep(). It does not
> >>>> hurt kernel size or speed since this code is run on the build host.
> >>>>
> >>>> Fixes: ac5100f5432967 ("modpost: add read_text_file() and get_line() helpers")
> >>>> Signed-off-by: H. Nikolaus Schaller <hns@xxxxxxxxxxxxx>
> >>>> ---
> >>>> scripts/mod/modpost.c | 7 ++++++-
> >>>> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> >>>> index 6aea65c65745..8fe63989c6e1 100644
> >>>> --- a/scripts/mod/modpost.c
> >>>> +++ b/scripts/mod/modpost.c
> >>>> @@ -138,11 +138,16 @@ char *read_text_file(const char *filename)
> >>>>
> >>>> char *get_line(char **stringp)
> >>>> {
> >>>> + char *p;
> >>>> /* do not return the unwanted extra line at EOF */
> >>>> if (*stringp && **stringp == '\0')
> >>>
> >>> This check does not make sense anymore.
> >>>
> >>> Previously, get_line(NULL) returns NULL.
> >>>
> >>> With your patch, get_line(NULL) crashes
> >>> due to NULL-pointer dereference.
> >>
> >> Well, that is original code.
> >
> >
> > Sorry for confusion.
> >
> > I meant this:
> >
> > char *s = NULL;
> > get_line(&s);
> >
> >
> > In the current code, get_line(&s) returns NULL.
> > As 'man strsep' says this:
> > "If *stringp is NULL, the strsep() function returns NULL
> > and does nothing else."
> >
> > With your patch, **stringp will cause
> > NULL-pointer dereference.
>
> Ah, now I see. strsep() has a special case that is not covered
> by my patch.
>
> On the other hand, get_line() is only called as get_line(&pos) and
> pos = buf can not be NULL because that is checked before in read_dump().
> This is why I did not observe a segfault.
>
> But it is wise to make get_line() it more robust than needed. We do
> never know who will copy this code fragment... And I am tempted to
> handle the get_line(NULL) case as well.
No.
(stringp == NULL) is a bug.
Better to not handle.
(*stringp == NULL) has a good reason to be handled.
*stringp is updated to point to the next.
But, there is no more delimiter, *stringp reaches
the end of a string (**stringp == '\0').
In this case, we cannot advance *stringp any more.
(We cannot point the next character of '\0';
it would cause buffer overrun).
So,*stringp is set to NULL to represent no more string.
In the next iteration, (*stringp == NULL) is input,
then NULL is returned.
>
> >> I have only replaced the strsep() function.
> >> But yes, it looks to be better in addition to
> >> my patch.
> >>
> >>>
> >>>
> >>>
> >>>> return NULL;
> >>>>
> >>>> - return strsep(stringp, "\n");
> >>>> + p = *stringp;
> >>>> + while (**stringp != '\n')
> >>>> + (*stringp)++;
> >>>
> >>>
> >>> Is this a safe conversion?
> >>>
> >>> If the input file does not contain '\n' at all,
> >>> this while-loop continues running,
> >>> and results in the segmentation fault
> >>> due to buffer over-run.
> >>
> >> Ah, yes, you are right.
> >>
> >> We should use
> >>
> >> + while (**stringp && **stringp != '\n')
> >>
> >>>
> >>>
> >>>
> >>>> + *(*stringp)++ = '\0';
> >>>> + return p;
> >>>> }
> >>>
> >>>
> >>>
> >>> How about this?
> >>>
> >>> char *get_line(char **stringp)
> >>> {
> >>> char *orig = *stringp;
> >>
> >> ^^^ this still segfaults with get_line(NULL)
> >
> >
> > This is OK.
> >
> > get_line(NULL) should crash because we never expect
> > the case ' stringp == NULL'.
> >
> > We need to care about the case ' *stringp == NULL'.
> > In this case, get_line() should return NULL.
> >
> >
> >
> >
> >>> char *next;
> >>>
> >>> /* do not return the unwanted extra line at EOF */
> >>> if (!orig || *orig == '\0')
> >>> return NULL;
> >>>
> >>> next = strchr(orig, '\n');
> >>> if (next)
> >>> *next++ = '\0';
> >>>
> >>> *stringp = next;
> >>
> >> Yes, this code is easier to understand than my while loop.
> >> And strchr() is POSIX.
> >>
> >> So should I submit an updated patch or do you want to submit
> >> it (with a suggested-by: H. Nikolaus Schaller <hns@xxxxxxxxxxxxx>)
> >
> > Please send a patch.
> > (Co-developed-by if you want to give some credit to me)
>
> Yes, I will do in the next days.
>
> BR and thanks,
> Nikolaus Schaller
>
--
Best Regards
Masahiro Yamada