In fact, one user pointed out another potentially vulnerable callback:
On 2024-09-09 13:11, Alex Deucher wrote:
On Sun, Sep 8, 2024 at 7:23 AM Tobias Jakobi
<tjakobi@xxxxxxxxxxxxxxxxxxxxx> wrote:
On 9/8/24 09:35, Christopher Snowhill wrote:
On Mon Sep 2, 2024 at 2:40 AM PDT, tjakobi wrote:
From: Tobias Jakobi <tjakobi@xxxxxxxxxxxxxxxxxxxxx>
Hello,
this fixes a nasty race condition in the set_drr() callbacks for DCN10
and DCN35 that has existed for quite some time now; see this GitLab
issue for reference:
https://gitlab.freedesktop.org/drm/amd/-/issues/3142
The report just focuses on DCN10, but the same problem also exists in
the DCN35 code.

Does the problem not exist in the following references to funcs->set_drr?
drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c: if (pipe_ctx->stream_res.tg->funcs->set_drr)
drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c: pipe_ctx->stream_res.tg->funcs->set_drr(
drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c: pipe_ctx[i]->stream_res.tg->funcs->set_drr(
drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c: if (pipe_ctx->stream_res.tg->funcs->set_drr)
drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c: pipe_ctx->stream_res.tg->funcs->set_drr(
drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c: if (pipe_ctx->stream_res.tg->funcs->set_drr)
drivers/gpu/drm/amd/display/dc/hwss/dcn20/dcn20_hwseq.c: pipe_ctx->stream_res.tg->funcs->set_drr(
drivers/gpu/drm/amd/display/dc/hwss/dcn31/dcn31_hwseq.c: if (pipe_ctx->stream_res.tg->funcs->set_drr)
drivers/gpu/drm/amd/display/dc/hwss/dcn31/dcn31_hwseq.c: pipe_ctx->stream_res.tg->funcs->set_drr(
drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c: if (pipe_ctx->stream_res.tg->funcs->set_drr)
drivers/gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c: pipe_ctx->stream_res.tg->funcs->set_drr(
Maybe. But the big difference I see here is that in this code there
isn't even any kind of NULL check applied to tg, or to most of the
members of *pipe_ctx. If the same kind of problem really exists here,
then one would need to rewrite a bit more code to fix things.
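
To make that concrete: the guarded calls in the grep hits above are all
of roughly this shape (condensed; the NULL argument is just one example
of how the hook gets invoked):

	/* Only the set_drr hook itself is NULL-checked here;
	 * pipe_ctx->stream_res.tg and tg->funcs are dereferenced
	 * unconditionally. If dc_state_destruct() clears stream_res
	 * concurrently, this is a NULL pointer dereference.
	 */
	if (pipe_ctx->stream_res.tg->funcs->set_drr)
		pipe_ctx->stream_res.tg->funcs->set_drr(
				pipe_ctx->stream_res.tg, NULL);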
E.g. in the case of dcn31_hwseq.c, the questionable code is in
dcn31_reset_back_end_for_pipe(), which is static and only called from
dcn31_reset_hw_ctx_wrap(), which in turn is assigned to the
.reset_hw_ctx_wrap callback. And this specific callback, from what I
can see, is only invoked from dce110_apply_ctx_to_hw(), which is
assigned to the .apply_ctx_to_hw callback. That callback is only called
from dc_commit_state_no_check(), which is static again and called from
dc_commit_streams().
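
In code, that wiring looks roughly like this (abridged from the DCN31
init tables; all other members are omitted, so treat this as a sketch
rather than a verbatim excerpt):

	/* dcn31_init.c (abridged): the hwseq callback tables that tie
	 * the chain together.
	 */
	static const struct hw_sequencer_funcs dcn31_funcs = {
		/* invoked from dc_commit_state_no_check() */
		.apply_ctx_to_hw = dce110_apply_ctx_to_hw,
	};

	static const struct hwseq_private_funcs dcn31_private_funcs = {
		/* invoked via dce110_apply_ctx_to_hw(); this is what
		 * eventually reaches dcn31_reset_back_end_for_pipe()
		 */
		.reset_hw_ctx_wrap = dcn31_reset_hw_ctx_wrap,
	};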
I could trace this even further. My point is: I don't think this is
called from any IRQ handler code. And given the depth and complexity of
the callgraph, I have to admit that, at least at this point, this is a
bit over my head.
Sure, I could now sprinkle a bunch of x != NULL checks over the code,
but that would be mere voodoo. And I usually try to have a theoretical
basis for the changes I apply to code.
Maybe someone from the AMD display team could give some insight into
whether there is still potentially vulnerable code in some of the
instances that Christopher has posted; then I would gladly take a look.

With best wishes,
Tobias
As Tobias said, without extensive analysis and trace of the code in all
possible use cases it's hard to say there's no possible way the other
set_drr calls could potentially have a similar issue.

@Wentland, Harry can you confirm this?

Alex
I think Tobias' analysis is sound and this fixes a number of issues, hence
my RB.
Harry
With best wishes,
Tobias
Tobias Jakobi (2):
drm/amd/display: Avoid race between dcn10_set_drr() and
dc_state_destruct()
drm/amd/display: Avoid race between dcn35_set_drr() and
dc_state_destruct()
.../amd/display/dc/hwss/dcn10/dcn10_hwseq.c | 20 +++++++++++--------
.../amd/display/dc/hwss/dcn35/dcn35_hwseq.c | 20 +++++++++++--------
2 files changed, 24 insertions(+), 16 deletions(-)
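
For reference, the shape of the fix in both patches is roughly the
following (sketched from the dcn10_set_drr() loop; not the verbatim
diff):

	/* Snapshot the tg pointer once: dc_state_destruct() may clear
	 * pipe_ctx[i]->stream_res.tg concurrently, so read it a single
	 * time and NULL-check the local copy before any dereference.
	 */
	for (i = 0; i < num_pipes; i++) {
		struct timing_generator *tg = pipe_ctx[i]->stream_res.tg;

		if (tg == NULL || tg->funcs == NULL)
			continue;

		if (tg->funcs->set_drr)
			tg->funcs->set_drr(tg, &params);
	}

Note that this narrows the race to a single pointer load rather than
eliminating it entirely; the assumption is that the timing generator
object itself remains valid while the pointer in stream_res is cleared.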