Re: [PATCH v2 1/3] soc: qcom: rpmh-rsc: Remove tcs_is_free() and find_free_tcs() APIs
From: Doug Anderson
Date: Tue May 05 2020 - 16:05:14 EST

Hi,

On Mon, May 4, 2020 at 11:36 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
>
> > > -static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
> > > - const struct tcs_request *msg)
> > > +static int check_for_req_inflight_and_find_free(struct rsc_drv *drv,
> > > + const struct tcs_group *tcs, const struct tcs_request *msg)
> > > {
> > > unsigned long curr_enabled;
> > > u32 addr;
> > > - int i, j, k;
> > > - int tcs_id = tcs->offset;
> > > -
> > > - for (i = 0; i < tcs->num_tcs; i++, tcs_id++) {
> > > - if (tcs_is_free(drv, tcs_id))
> > > - continue;
> > > + int j, k;
> > > + int i = tcs->offset;
> > > + unsigned long max = tcs->offset + tcs->num_tcs;
> > > + int first_free = i;
> >
> > The way "first_free" is calculated definitely adds complexity to this
> > function. Are we sure it's justified compared to just calling
> > find_next_zero_bit() if the function doesn't return -EBUSY? If you
> > really like it this way I won't object too strongly, but I'm not
> > convinced that it makes the code size smaller (vs. jumping to a common
> > implementation in the kernel) and it seems unlikely to have any
> > real-world speed impact.
>
> I was trying to coalesce the double loop over the free bits here by
> adding a couple more lines to keep track of the first free bit and to
> set the bit when it's found. It almost feels like it would be better to
> inline this whole function into the one call site too.

Definitely a bike shed color issue.  I know it was double-looping before, but:

* Neither loop was very long, a few bits at most.

* The 2nd loop was in common code.  That means the "number of
  instructions" to implement this loop is small--just a function call.
  For code that isn't a hot spot it can be better to optimize for code
  size rather than speed since it means you're taking up fewer cache
  lines and thus less likely to kick out other code.  ;-)  ...but we're
  getting into micro optimization.

In any case, I probably haven't convinced you.  I'm fine with your
code and I'll shut up now.

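
For reference, here is a rough userspace sketch of the two shapes being
compared above.  It is only a model under stated assumptions: the bitmap
is a single unsigned long, model_next_zero_bit() is a made-up stand-in
for the kernel's find_next_zero_bit(), and both function names are
invented for illustration.  The in-flight address check is left out
since it is the same in both variants.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define MODEL_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for find_next_zero_bit() on a one-word bitmap. */
static unsigned int model_next_zero_bit(unsigned long word, unsigned int size,
					unsigned int start)
{
	unsigned int i;

	assert(size <= MODEL_BITS);
	for (i = start; i < size; i++) {
		if (!(word & (1UL << i)))
			return i;
	}
	return size;	/* "not found" is reported as size */
}

/* Single pass: track first_free while walking the set bits. */
static int first_free_single_pass(unsigned long in_use, unsigned int offset,
				  unsigned int num)
{
	unsigned int max = offset + num;
	unsigned int first_free = offset;
	unsigned int i;

	assert(max <= MODEL_BITS);
	for (i = offset; i < max; i++) {
		if (!(in_use & (1UL << i)))
			continue;	/* models for_each_set_bit_from() */
		if (first_free == i)
			first_free = i + 1;	/* maybe the next one is free? */
		/* the in-flight address comparison would go here */
	}
	return first_free < max ? (int)first_free : -EBUSY;
}

/* Two steps: after the in-flight walk, ask the common helper. */
static int first_free_two_step(unsigned long in_use, unsigned int offset,
			       unsigned int num)
{
	unsigned int max = offset + num;
	unsigned int bit = model_next_zero_bit(in_use, max, offset);

	return bit < max ? (int)bit : -EBUSY;
}
```

Both variants return the same answer for any bitmap in this model; the
difference is only where the second (short) loop lives.
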
> > > - curr_enabled = read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, tcs_id);
> > > + for_each_set_bit_from(i, drv->tcs_in_use, max) {
> > > + /* Find a free tcs to use in this group */
> > > + if (first_free == i)
> > > + first_free = i + 1; /* Maybe the next one is free? */
> > >
> > > + curr_enabled = read_tcs_reg(drv, RSC_DRV_CMD_ENABLE, i);
> > > for_each_set_bit(j, &curr_enabled, MAX_CMDS_PER_TCS) {
> > > - addr = read_tcs_cmd(drv, RSC_DRV_CMD_ADDR, tcs_id, j);
> > > + addr = read_tcs_cmd(drv, RSC_DRV_CMD_ADDR, i, j);
> > > for (k = 0; k < msg->num_cmds; k++) {
> > > if (addr == msg->cmds[k].addr)
> > > return -EBUSY;
> > > @@ -526,28 +514,11 @@ static int check_for_req_inflight(struct rsc_drv *drv, struct tcs_group *tcs,
> > > }
> > > }
> > >
> > > - return 0;
> > > -}
> > > + if (first_free >= max)
> > > + return -EBUSY;
> > >
> > > -/**
> > > - * find_free_tcs() - Find free tcs in the given tcs_group; only for active.
> > > - * @tcs: A pointer to the active-only tcs_group (or the wake tcs_group if
> > > - * we borrowed it because there are zero active-only ones).
> > > - *
> > > - * Must be called with the drv->lock held since that protects tcs_in_use.
> > > - *
> > > - * Return: The first tcs that's free.
> > > - */
> > > -static int find_free_tcs(struct tcs_group *tcs)
> > > -{
> > > - int i;
> > > -
> > > - for (i = 0; i < tcs->num_tcs; i++) {
> > > - if (tcs_is_free(tcs->drv, tcs->offset + i))
> > > - return tcs->offset + i;
> > > - }
> > > -
> > > - return -EBUSY;
> > > + set_bit(first_free, drv->tcs_in_use);
> >
> > Function is not documented to also set the bit. Do we really gain
> > anything by setting it in this function, or can it just stay with the
> > caller? I'd hate to call this function
> > check_for_req_inflight_and_find_free_and_claim_it().
>
> Maybe the function can be named claim_tcs_for_req() or something like
> that. Anything to make it shorter would be good!

Sure.  ...though moving the set_bit() here is just pure churn, right?
It can be here or it can be in the calling function and there's really
no advantage either way.  If you really like it here then fine.  I
just see no benefit and it's just an extra line to change.

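
To make the "no advantage either way" point concrete, here is a small
userspace model of the two placements.  Everything in it is assumed for
illustration: the bitmap is one unsigned long, a plain OR stands in for
set_bit(), and the model_* names are hypothetical (claim_tcs_for_req
only borrows the name suggested above).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical finder: lowest clear bit in [offset, offset + num). */
static int model_find_free(unsigned long in_use, unsigned int offset,
			   unsigned int num)
{
	unsigned int i;

	assert(offset + num <= sizeof(unsigned long) * CHAR_BIT);
	for (i = offset; i < offset + num; i++) {
		if (!(in_use & (1UL << i)))
			return (int)i;
	}
	return -EBUSY;
}

/* Placement A: the helper finds the slot and claims it too. */
static int model_claim_tcs_for_req(unsigned long *in_use, unsigned int offset,
				   unsigned int num)
{
	int id = model_find_free(*in_use, offset, num);

	if (id >= 0)
		*in_use |= 1UL << id;	/* stands in for set_bit() */
	return id;
}

/* Placement B: the helper only finds; the caller claims. */
static int model_caller_claims(unsigned long *in_use, unsigned int offset,
			       unsigned int num)
{
	int id = model_find_free(*in_use, offset, num);

	if (id < 0)
		return id;
	*in_use |= 1UL << id;	/* same set_bit() stand-in, one level up */
	return id;
}
```

In this model both placements leave the bitmap in the same state, so
the choice comes down to naming and how much of the diff moves.
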
-Doug