Re: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

From: David Malcolm
Date: Tue Sep 15 2020 - 15:57:19 EST


On Tue, 2020-09-15 at 11:59 -0700, Ian Rogers wrote:
> On Tue, Sep 15, 2020 at 5:19 AM Arnaldo Carvalho de Melo
> <acme@xxxxxxxxxx> wrote:
> > Em Tue, Sep 15, 2020 at 12:18:13PM +0900, Namhyung Kim escreveu:
> > > The evsel->unit borrows a pointer of pmu event or alias instead of
> > > owns a string. But tool event (duration_time) passes a result of
> > > strdup() caused a leak.
> > >
> > > It was found by ASAN during metric test:
> >
> > Thanks, applied.
>
> Thanks Namhyung and Arnaldo, just to raise a meta point. A lot of the
> parse-events asan failures were caused by a lack of strdup causing
> frees of string literals. It seems we're now adding strdup defensively
> but introducing memory leaks. Could we be doing this in a smarter way?
> For C++ I'd likely use std::string and walk away. For perf code the
> best source of "ownership" I've found is to look at the "delete"
> functions and figure out ownership from what gets freed there - this
> can be burdensome. For strings, the code is also using strbuf and
> asprintf. One possible improvement could be to document ownership next
> to the struct member variable declarations. Another idea would be to
> declare a macro whose usage would look like:
>
> struct evsel {
> ...
> OWNER(char *name, "this");
> ...
> UNOWNED(const char *unit);
> ...
>
> Maybe then we could get a static analyzer to complain if a literal
> were assigned to an owned struct variable. Perhaps if a strdup were
> assigned to an UNOWNED struct variable perhaps it could warn too, as
> presumably the memory allocation is a request to own the memory.
>
> There was a talk about GCC's -fanalyzer option doing malloc/free
> checking at Linux plumbers 2 weeks ago:
> https://linuxplumbersconf.org/event/7/contributions/721/attachments/542/961/2020-LPC-analyzer-talk.pdf
> I added David Malcolm, the LPC presenter, as he may have ideas on how
> we could do this in a better way.

Hi Ian.

Some ideas (with the caveat that I'm a GCC developer, and not a regular
on LKML): can you capture the ownership status in the type system?
I'm brainstorming here, but how about:

  typedef char *owned_string_t;
  typedef const char *borrowed_string_t;

This would at least capture the intent in human-readable form, and
*might* make things more amenable to checking by a machine. It's also
less macro cruft.
I take it that capturing the ownership status with a runtime flag next
to the pointer in a struct is too expensive for your code?
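
A minimal sketch of how the typedef idea might look in a struct and its
init/exit pair. Everything prefixed with demo_ is hypothetical and is
not the real perf evsel layout; only the two typedef names come from
the suggestion above.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Typedef names from the suggestion: the name records who frees. */
typedef char *owned_string_t;          /* freed by the struct's exit function */
typedef const char *borrowed_string_t; /* points at storage owned elsewhere */

/* Hypothetical stand-in for a struct such as evsel. */
struct demo_evsel {
	owned_string_t name;
	borrowed_string_t unit;
};

/* Returns 0 on success, -1 if the name could not be duplicated. */
static int demo_evsel__init(struct demo_evsel *e, const char *name,
			    borrowed_string_t unit)
{
	assert(e && name);
	e->name = strdup(name);  /* owned: the struct keeps its own copy */
	e->unit = unit;          /* borrowed: no strdup, so nothing to leak */
	return e->name ? 0 : -1;
}

static void demo_evsel__exit(struct demo_evsel *e)
{
	free(e->name);           /* only the owned member is freed */
	e->name = NULL;
	e->unit = NULL;
}
```

Typedefs are only aliases in C, so the compiler does not treat the two
names as distinct types. The const qualifier is what typically lets it
warn when a borrowed pointer is assigned to an owned member.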


Some notes on -fanalyzer:

Caveat: the implementation of -fanalyzer in GCC 10 is an early
prototype, and although it has found its first CVE, I don't recommend it
for use "in anger" yet. I'm working on making it more suitable for
general usage for C in GCC 11 (mostly scaling issues and other
bug fixes).

-fanalyzer associates state machines with APIs; one of these state
machines implements leak detection for malloc, along with e.g. double-
free detection. I'm generalizing this checker to other acquire/release
APIs: I have a semi-working patch under development (targeting GCC 11)
that exposes this via a fndecl attribute, currently named
"deallocated_by", so that fn decls can be labeled e.g.:

  extern void foo_release (foo *);
  extern foo *foo_acquire (void)
    __attribute__((deallocated_by(foo_release)));

and have -fanalyzer detect leaks, double-releases, use-after-release,
failure to check for NULL (alloc failure) etc.
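
As a rough illustration, here is a self-contained acquire/release pair
with one caller that follows the rules such a checker would enforce.
The attribute is still in development, so this sketch hides it behind a
hypothetical macro, DEMO_DEALLOCATED_BY, which expands to nothing. The
struct contents and demo_good() are likewise made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * "deallocated_by" is the in-development attribute name from this
 * message. It is treated as hypothetical here: the macro expands to
 * nothing, so the file builds with any C compiler.
 */
#define DEMO_DEALLOCATED_BY(fn)

typedef struct foo { int value; } foo;

extern void foo_release(foo *f);
extern foo *foo_acquire(void) DEMO_DEALLOCATED_BY(foo_release);

foo *foo_acquire(void)
{
	return calloc(1, sizeof(foo));  /* may fail and return NULL */
}

void foo_release(foo *f)
{
	free(f);
}

/* Checks for NULL, releases exactly once, and does not touch f afterwards. */
static int demo_good(void)
{
	foo *f = foo_acquire();
	int v;

	if (!f)
		return -1;
	f->value = 42;
	v = f->value;    /* read before the release, not after */
	foo_release(f);
	return v;
}
```

Dropping the NULL check, returning early without calling foo_release(),
or calling it twice are the kinds of mistakes the labeled declarations
are meant to let the analyzer report.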

Ultimately this attribute might land in the libc header for strdup (and
friends), but I can also special-case strdup so that the analyzer
"knows" that the result needs to be freed if non-NULL (and that it can
fail and return NULL).
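
For reference, a small sketch of the two obligations a strdup caller
has: check the result for NULL, and free it once when non-NULL. The
function name demo_copy_len is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of a private copy of s, or -1 if strdup failed. */
static int demo_copy_len(const char *s)
{
	char *copy;
	size_t n;

	assert(s != NULL);
	copy = strdup(s);
	if (!copy)               /* strdup can fail and return NULL */
		return -1;
	n = strlen(copy);
	free(copy);              /* the caller owns the result and frees it */
	return (int)n;
}
```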

Hope this is constructive
Dave

[...]