Hi,correct, we have unique requests in non-batch caches.
On Wed, Apr 1, 2020 at 4:30 AM Maulik Shah <mkshah@xxxxxxxxxxxxxx> wrote:
Sounds good. Overall I'll probably wait to spin until your series+ * Return: 0 if no problem, or -EAGAIN if the caller should try again in a bit.with [2] it will not return -EAGAIN, can you please remove this part.
+ * Caller should make sure to enable interrupts between tries.
+ * Only ever returns -EAGAIN for ACTIVE_ONLY transfers.
lands because trying to keep spinning this one in conjunction with
that one is going to be a nightmare. Hopefully that means:
a) Your series can land soon. I think it's in pretty good shape now.
I just sent a bunch of reviews. Might need one more spin for nits and
then we'll see if Bjorn thinks it's good to go.
b) Once I spin this it can get a quicker review so other things don't
pop up and change things.
...or, if you want, you can hijack my series and send the next version
out yourself. I won't object to that but please give me a heads up if
you're planning to do that so we don't duplicate work.
Right. The race comment still is somewhat interesting because the@@ -246,12 +288,35 @@ static struct tcs_group *get_tcs_for_msg(struct rsc_drv *drv,You can remove above comment, i already included change to enable IRQs
ret = rpmh_rsc_invalidate(drv);
if (ret)
return ERR_PTR(ret);
+
+ /*
+ * TODO:
+ * - Temporarily enable IRQs on the wake tcs.
+ * - Make sure nobody else race with us and re-write
+ * to this TCS; document how this works.
on wake TCS in my series.
only way we're race free is that we know that the sleep/wake values
are only programmed at a time when no more active transactions can be
started. I'll document that assumption. If we ever change that
assumption we'll have to think about this more since (at the moment)
programming sleep/wake doesn't set the "tcs_in_use" bit.
As per below I'm trying to understand the motivation for using+/**This comment will need to modify/removed after we use this in place of
+ * check_for_req_inflight() - Look to see if conflicting cmds are in flight.
+ * @drv: The controller.
+ * @tcs: A pointer to the tcs_group used for ACTIVE_ONLY transfers.
+ * @msg: The message we want to send, which will contain several addr/data
+ * pairs to program (but few enough that they all fit in one TCS).
+ *
+ * Only for use for ACTIVE_ONLY tcs_group, since those are the only ones
+ * that might be actively sending.
find_match().
see below for more details.
check_for_req_inflight() when writing sleep/wake sets, so changing
this is pending on that discussion.
OK.@@ -411,6 +533,20 @@ static int find_free_tcs(struct tcs_group *tcs)with [2] it will not return -EAGAIN, can you please remove this comment.
return -EBUSY;
}
+/**
+ * tcs_write() - Store messages into a TCS right now, or return -EBUSY.
+ * @drv: The controller.
+ * @msg: The data to be sent.
+ *
+ * Grabs a TCS for ACTIVE_ONLY transfers and writes the messages to it.
+ *
+ * If there are no free ACTIVE_ONLY TCSs or if a command for the same address
+ * is already transferring returns -EBUSY which means the client should retry
+ * shortly.
+ *
+ * Return: 0 on success, -EBUSY if client should retry, or an error.
+ * Client should have interrupts enabled for a bit before retrying.
+ */
static int tcs_write(struct rsc_drv *drv, const struct tcs_request *msg)
{
struct tcs_group *tcs;
@@ -418,16 +554,14 @@ static int tcs_write(struct rsc_drv *drv, const struct tcs_request *msg)
unsigned long flags;
int ret;
+ /* TODO: get_tcs_for_msg() can return -EAGAIN and nobody handles */
OK. I guess I felt like now that check_for_req_inflight() wastcs = get_tcs_for_msg(drv, msg);please keep above comment as it is.
if (IS_ERR(tcs))
return PTR_ERR(tcs);
spin_lock_irqsave(&tcs->lock, flags);
+
spin_lock(&drv->lock);
- /*
- * The h/w does not like if we send a request to the same address,
- * when one is already in-flight or being processed.
- */
documented it was just getting in the way, but I'll keep it if you
want.
OK.@@ -485,6 +635,63 @@ int rpmh_rsc_send_data(struct rsc_drv *drv, const struct tcs_request *msg)we are doing this from [1] onwards.
return ret;
}
+/**
+ * find_match() - Find if the cmd sequence is in the tcs_group
+ * @tcs: The tcs_group to search. Either sleep or wake.
+ * @cmd: The command sequence to search for; only addr is looked at.
+ * @len: The number of commands in the sequence.
+ *
+ * Searches through the given tcs_group to see if a given command sequence
+ * is in there.
+ *
+ * Two sequences are matches if they modify the same set of addresses in
+ * the same order. The value of the data is not considered when deciding if
+ * two things are matches.
+ *
+ * How this function works is best understood by example. For our example,
+ * we'll imagine our tcs group contains these (cmd, data) tuples:
+ * [(a, A), (b, B), (c, C), (d, D), (e, E), (f, F), (g, G), (h, H)]
+ * ...in other words it has an element where (addr=a, data=A), etc.
+ * ...we'll assume that there is one TCS in the group that can store 8 commands.
+ *
+ * - find_match([(a, X)]) => 0
+ * - find_match([(c, X), (d, X)]) => 2
+ * - find_match([(c, X), (d, X), (e, X)]) => 2
+ * - find_match([(z, X)]) => -ENODATA
+ * - find_match([(a, X), (y, X)]) => -EINVAL (and warning printed)
+ * - find_match([(g, X), (h, X), (i, X)]) => -EINVAL (and warning printed)
+ * - find_match([(y, X), (a, X)]) => -ENODATA
+ *
+ * NOTE: This function overall seems like it has questionable value.
+ * - It can be used to update a message in the TCS with new data, but I
+ * don't believe we actually do that--we always fully invalidate and
+ * re-write everything.
Sure, we could use find_match() to add this check. It definitelySpecifically it would be too limiting to forceWe still need to check there is no duplicate request in a TCS for SLEEP
+ * someone not to change the set of addresses written to each time.
+ * - This function could be attempting to avoid writing different data to
+ * the same address twice in a tcs_group. If that's the goal, it doesn't
+ * do a great job since find_match([(y, X), (a, X)]) return -ENODATA in my
+ * above example.
+ * - If you originally wrote [(a, A), (b, B), (c, C)] and later tried to
+ * write [(a, A), (b, B)] it'd look like a match and we wouldn't consider
+ * it an error that the size got shorter.
+ * - If two clients wrote sequences that happened to be placed in slots next
+ * to each other then a later check could match a sequence that was the
+ * size of both together.
+ *
+ * TODO: in light of the above, prehaps we can just remove this function?
and WAKE as well.
find_match() checks if the request already exist for a resource then
update with new value, in a way preventing new request to go in
when one already exists. I am ok to remove this function since after [1]
we always fully invalidate and then re-write and little point in
finding a match now. However we need to use check_for_req_inflight()
from tcs_ctrl_write() with little modification to ignore tcs_is_free()
check if is called from tcs_ctrlr_write().
feels a lot better than the current solution which seems to miss a
whole lot of corner cases.
Before I do that, maybe you can help me understand the motivation,
though? Where are you expecting to find the conflict? Certainly
there can't be any conflict in the normal (non-batch) sleep/wake sets,
right? We only have one entry in the RPMH cache per address so the
non-batch entries can't conflict with themselves. There also can't be
any previous async command still pending because we cache
async/non-async alike.
correct there is only one client interconnect using batch API.
For batch requests I believe that there's exactly one caller using the
batch API (otherwise rpmh_invalidate() would be a nightmare). That
one caller is the interconnect code, right? It feels like we could
assume that the one caller of the batch API won't write to the same
address more than one time?
On upstream interconnect i see they are using batch only requests till now.
So I guess you're expecting that an rpmh_write() would write to the
same address that someone wrote to with rpmh_write_batch() and it
should override it?
Does that actually happen?On upstream interconnect i see they are using batch only requests till now.
Isn't the batch API
used just for interconnect stuff and nobody should be using
rpmh_write() to mess with the interconnect stuff?
After this change on 9th change in your series, please move it beforeSure. I was trying to do all the documentation in the earlier patches
current patch in series.
please also keep dependency on [1] for 9th change.
to provide motivation for and help understand the later patches, but I
can change the order if need be. It didn't seem a big deal to add the
comments and delete them, but I can understand it being weird.
Right. I think this was trying to say that the request wasn't/**you can remove above line since responce is returned from this function.
- * rpmh_rsc_write_ctrl_data: Write request to the controller
- *
- * @drv: the controller
- * @msg: the data to be written to the controller
+ * rpmh_rsc_write_ctrl_data() - Write request to controller but don't trigger.
+ * @drv: The controller.
+ * @msg: The data to be written to the controller.
*
* There is no response returned for writing the request to the controller.
triggered and thus there was no response. I think it's clearer with
my addition of "but don't trigger." to the comment.
-Doug