Re: [PATCH] Revert "mailbox/pcc: support mailbox management of the shared buffer"
From: Adam Young
Date: Tue Sep 30 2025 - 18:12:52 EST
ACK to reverting. I will submit the changes again, and in separate patches.
I will try to address our comments in line, so we have continuity of discussion. We can point to these messages from future reviews.
On 9/30/25 05:37, Sudeep Holla wrote:
On Mon, Sep 29, 2025 at 01:11:23PM -0400, Adam Young wrote:
I posted a patch that addresses a few of these issues. Here is a top level
description of the issue.
The correct way to use the mailbox API would be to allocate a buffer for the
message, write the message to that buffer, and pass it in to
mbox_send_message. The abstraction is designed to then provide sequential
access to the shared resource in order to send the messages in order. The
existing PCC Mailbox implementation violated this abstraction. It requires
each individual driver re-implement all of the sequential ordering to access
the shared buffer.

Please, let us keep the avoiding duplication as a separate topic at least for
the discussion. We can take care of it even before merging if you prefer that
way but we need to explore what other drivers can use it. Otherwise it is
not yet duplication right ?
Why? Because they are all type 2 drivers, and the shared buffer is 64 bits in
length: 32 bits for signature, 16 bits for command, 16 bits for status. It
would be excessive to kmalloc a buffer of this size.

Sure, if there is only and first driver needing large buffers, it is still
not duplication yet. I agree it can be moved to PCC, but let's start with
your client driver code first and then take it from there.
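For reference, the 64-bit layout described above can be sketched as a plain C struct. This is a userspace model only: the struct name, the field names, and the packing helper are made up for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model of the 64-bit type 2 shared buffer: 32 bits of
 * signature, 16 bits of command, 16 bits of status. All names here are
 * assumptions made for this sketch.
 */
struct pcc_type2_shmem_sketch {
	uint32_t signature;	/* 32 bits */
	uint16_t command;	/* 16 bits */
	uint16_t status;	/* 16 bits */
};

/* The three fields occupy exactly 8 bytes with no padding. */
_Static_assert(sizeof(struct pcc_type2_shmem_sketch) == 8,
	       "header must be 64 bits");
_Static_assert(offsetof(struct pcc_type2_shmem_sketch, command) == 4,
	       "command follows signature");
_Static_assert(offsetof(struct pcc_type2_shmem_sketch, status) == 6,
	       "status follows command");

/* Pack the header into one 64-bit value, signature in the low 32 bits. */
static inline uint64_t
pcc_type2_pack_sketch(const struct pcc_type2_shmem_sketch *h)
{
	return (uint64_t)h->signature |
	       ((uint64_t)h->command << 32) |
	       ((uint64_t)h->status << 48);
}
```

The point of the sketch is only that the whole buffer fits in a single 64-bit value, which is why a kmalloc per message looks heavy for this case.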
This shows the shortcoming of the mailbox API. The mailbox API assumes that
there is a large enough buffer passed in to only provide a void * pointer to
the message. Since the value is small enough to fit into a single register,
the mailbox abstraction could provide an implementation that stored a
union of a void * and word. With that change, all of the type 2
implementations could have their logic streamlined and moved into the PCC
mailbox.

No, it is left to the client driver interpretation as it clearly varies even
within PCC type 1-5. Again, let us start with client driver code and see how
to standardise later. I agree with PCC being standard, there is scope for
avoiding duplication, but we will get to know that only if you first present
it with the client driver code and we can then see how and what to make
generic.
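The "union of a void * and word" idea mentioned above might look roughly like the following. This is a hypothetical type, not part of the mailbox API today; the names are invented for the sketch, and the word is fixed at 64 bits here so that it can hold the whole type 2 buffer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical message type: either a pointer to a caller-owned buffer
 * or a small value carried inline. Nothing in the union records which
 * member is valid; the channel type would have to imply that.
 */
union mbox_msg_sketch {
	void *ptr;
	uint64_t word;
};

/* Build a message that carries a small value inline. */
static inline union mbox_msg_sketch mbox_msg_from_word_sketch(uint64_t w)
{
	union mbox_msg_sketch m;

	m.word = w;
	return m;
}

/* Build a message that points at a larger buffer. */
static inline union mbox_msg_sketch mbox_msg_from_ptr_sketch(void *p)
{
	union mbox_msg_sketch m;

	m.word = 0;	/* clear any bytes the pointer does not cover */
	m.ptr = p;
	return m;
}
```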
However, I am providing an implementation for a type3/type4 based driver,
and I do need the whole management of the message buffer. In addition, I
know of at least one other subsystem (MPAM) that will benefit from a type3
implementation.

Don't even go there. It is much bigger beast with all sorts of things to
consider. Now that you have mentioned that, I am interested more to look
at MPAM driver usage as well before merging anything as generic as I know
MPAM is not so trivial. You pulled that topic into this, sorry 😉.
This actually got me to laugh. Nervously. I wonder what else is coming.
On 9/26/25 11:33, Sudeep Holla wrote:
This reverts commit 5378bdf6a611a32500fccf13d14156f219bb0c85.
Commit 5378bdf6a611 ("mailbox/pcc: support mailbox management of the shared buffer")
attempted to introduce generic helpers for managing the PCC shared memory,
but it largely duplicates functionality already provided by the mailbox
core and leaves gaps:
1. TX preparation: The mailbox framework already supports this via
->tx_prepare callback for mailbox clients. The patch adds
pcc_write_to_buffer() and expects clients to toggle pchan->chan.manage_writes,
but no drivers set manage_writes, so pcc_write_to_buffer() has no users.

tx_prepare is insufficient, as it does not provide access to the type3
flags. In addition, it forces the user to manage the buffer memory
directly. While this is a necessary workaround for type 2 non extended
memory regions, it does not make sense for a managed resource like the
mailbox.

Sorry if I am slow in understanding but I still struggle why tx_prepare won't
work for you. Please don't jump to solve 2 problems at the same time as it
just adds more confusion. Let us see if and how to make tx_prepare work for
your driver. And then we can look at standardising it as a helper function
that can be used in all the PCC mailbox client drivers if we can do that.
You are just adding parallel and optional APIs just to get your driver
working here. I am not against standardising to avoid duplication which
is your concern (very valid) but doesn't need to be solved by adding another
API when the existing APIs already provides mechanism to do that.
If you need information about the PCC type3/4, we can explore that as well.
I will submit a more detailed explanation when I resubmit that functionality.
The short of it is that the Type3 Register information is in the PCCT, and that is not available outside mailbox/pcc.c
For Type 2, there is an accessor function that exposes if the buffer is safe to write. For Type 3, you need the command completion fields and registers, and there are two of them: one for setting and one for reading.
Without this field, you cannot safely write to the shared buffer in tx_prepare. I am attempting to write it in such a way that it works for any extended memory driver. What I followed here is the PCC spec, not anything specific to the network driver I submitted to call it.
So, yes, the alternative is to create a new accessor function that returns the cmd completion bit, but that would need to be called from inside of a spin lock.
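A minimal userspace model of the constraint described above: the shared buffer may only be written once the command completion bit shows the platform is done with the previous message. The struct, its fields, and the function name are assumptions for this sketch, and the locking that a real driver would need around the check and the copy is left out because the model is single threaded.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SHMEM_SIZE 64

/* Hypothetical model of an extended (type 3/4) channel. */
struct pcc_chan_sketch {
	uint64_t cmd_complete;			/* nonzero: buffer is free to write */
	uint8_t shmem[SKETCH_SHMEM_SIZE];	/* stands in for the shared region */
};

/*
 * Copy a message into the shared buffer only if the previous command has
 * completed. Returns 0 on success, -EINVAL if the message does not fit,
 * and -EBUSY if the platform still owns the buffer.
 */
static int pcc_sketch_write(struct pcc_chan_sketch *ch,
			    const void *msg, size_t len)
{
	assert(ch != NULL && msg != NULL);

	if (len > sizeof(ch->shmem))
		return -EINVAL;
	if (!ch->cmd_complete)
		return -EBUSY;

	memcpy(ch->shmem, msg, len);
	ch->cmd_complete = 0;	/* hand the buffer over to the platform */
	return 0;
}
```

A second write before the platform sets the bit again returns -EBUSY and leaves the buffer untouched, which is the check that cannot be made from tx_prepare without access to that bit.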
You are correct that the manage_writes flag can be removed, but if (and only
if) we limit the logic to type 3 or type 4 drivers. I have made that change
in a follow on patch:

OK, but I would like to start fresh reverting this patch.
I can see the value in that, and support the decision.
2. RX handling: Data reception is already delivered through
mbox_chan_received_data() and client ->rx_callback. The patch adds an
optional pchan->chan.rx_alloc, which again has no users and duplicates
the existing path.

The change needs to go in before there are users. The patch series that
introduced this change requires this or a comparable callback mechanism.

Not always necessary. Yes if it is agreed to get the user merged. But I am
now questioning why you need it when you do have rx_callback.
RX callback is optional. For large buffers, we want to let the driver specify how to allocate the buffers. RX Callback will tell the driver that there is data, but would extend the pattern of requiring direct IO-memory access instead of using the message parameter.
However, the reviewers have shown that there is a race condition if the
callback is provided to the PCC mailbox Channel, and thus I have provided a
patch which moves this callback up to the Mailbox API.

Sorry if I have missed it. Can you please point me to the race condition in
question. I am interested to know more details.
The review in question was on Re: [PATCH net-next v28 1/1] mctp pcc: Implement MCTP over PCC Transport
Jeremy's comment was:
"Also: you're setting the client rx_callback *after* having set up the PCC channel. Won't this race with RX on the inbox?"
And he is right. If you bring up a driver when the platform has messages ready to send, the alloc function needs to be available as soon as the mailbox is active. If not, there will be a race between message delivery and the assignment of the alloc function.
That is why I am proposing a change to the mailbox API. I realize that this greatly increases the scope of the discussion. However, without providing the driver some way to specify how to allocate large buffers, message delivery becomes much more complicated. Essentially, the mailbox needs to hard code a message allocation scheme, and that means that a mechanism like PCC cannot handle different allocation schemes. Since the driver I am writing is a network driver, the right type of buffer is of type struct sk_buff. I would not want to make all PCC type 3 drivers use struct sk_buff, obviously. I wanted to limit the change to the PCC mailbox, but it does not appear to be possible without the race condition.
The change to the mailbox API was submitted in a change titled
[PATCH net-next v29 1/3] mailbox: add callback function for rx buffer allocation
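The ordering argument above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: the types, the callback signature, and every function name are invented here and are not the interface from the patch. The allocator is part of the client structure handed over when the channel is requested, so there is no window in which the channel is active without it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator callback: returns a buffer of at least len bytes. */
typedef void *(*rx_alloc_fn_sketch)(void *ctx, size_t len);

struct client_sketch {
	rx_alloc_fn_sketch rx_alloc;	/* must be set before the request */
	void *ctx;
	void *last_msg;			/* most recently delivered message */
	size_t last_len;
};

struct chan_sketch {
	struct client_sketch *cl;
	int active;
};

/* Bind the client, callbacks included, and only then enable delivery. */
static int chan_request_sketch(struct chan_sketch *ch, struct client_sketch *cl)
{
	if (!cl || !cl->rx_alloc)
		return -1;
	ch->cl = cl;
	ch->active = 1;
	return 0;
}

/* Deliver one message using the client's own allocation scheme. */
static int chan_deliver_sketch(struct chan_sketch *ch,
			       const void *data, size_t len)
{
	void *buf;

	if (!ch->active)
		return -1;
	buf = ch->cl->rx_alloc(ch->cl->ctx, len);
	if (!buf)
		return -1;
	memcpy(buf, data, len);
	ch->cl->last_msg = buf;
	ch->cl->last_len = len;
	return 0;
}

/* Example allocator; a network driver would return sk_buff data instead. */
static void *malloc_alloc_sketch(void *ctx, size_t len)
{
	(void)ctx;
	return malloc(len);
}
```

If the allocator were instead assigned after the channel became active, a message arriving in between would find no allocator to call, which is the race Jeremy pointed out.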
This change, which is obviously not required when returning a single byte,
is essential when dealing with larger buffers, such as those used by network
drivers.

I assume it can't be beyond the shmem area anyways. That can be read from the
rx_callback. Again I haven't understood your reasoning as why the allocation
and copy can't be part of rx_callback.
That is correct. It is only the shared memory region. Yes, it can be read from rx_callback. But the protocol is more complex than just reading the buffer, and I was trying to write it in a reusable fashion and in accordance with the intention of the mailbox API.
3. Completion handling: While adding last_tx_done is directionally useful,
the implementation only covers Type 3/4 and fails to handle the absence
of a command_complete register, so it is incomplete for other types.

Applying it to type 2 and earlier would require a huge lift of rewriting
code that is both multi architecture (CPPC) and on esoteric hardware
(XGene) and thus very hard to test.

True but you have changed the generic code which could break Type1/2 PCC.
I am not sure if it is tested yet.

I did basic testing: CPPC still works correctly on the systems that this code runs on.
While those drivers should make better use of the mailbox mechanism,
stopping the type 3 drivers from using this approach stops an effort to
provide a common implementation base. That should happen in future patches,
as part of reworking the type 2 drivers.

No you need to take care to apply your changes only for Type3/4 so that
Type1/2 is unaffected. You can expect to break and someone else to fix
the breakage later.
Agreed. The code was written to only affect the path for Type3/4 interfaces.
The manage_writes flag was originally for just that reason: make the code explicitly opt in. However, the write (outgoing) message flow is only changed for type 3, and thus manage_writes is not needed. I would suggest a standard that the mssg is NULL for cases where the driver is not going to actually send the data via mbox_send_message and instead is going to write the buffer directly.
The write_response path was only taken if the rx_alloc callback is set, and cannot happen for a pre-existing driver that does not set that callback.
Command Complete is part of the PCC specification for type 3 drivers.

Agreed, that's not the argument. The check is done unconditionally. I will
send the patch once we agree to revert this change and start fresh. And each
feature handled separately instead of mixing 3 different things in one patch.
The check happens in pcc_write_to_buffer and only
if (!pchan->chan.manage_writes)
However, we can and should make that check be that the channel is a type3/4 instead. I did that in another patch, which I will replicate post revert.
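Keying the check on the channel type rather than on a per-channel flag could look something like this. It is a sketch only: the enum and the helper are hypothetical names, with the values following the PCC subspace type numbers used throughout this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the PCC subspace types 0 through 5. */
enum pcc_subspace_type_sketch {
	PCC_SKETCH_GENERIC = 0,
	PCC_SKETCH_HW_REDUCED = 1,
	PCC_SKETCH_HW_REDUCED_TYPE2 = 2,
	PCC_SKETCH_EXT_INITIATOR = 3,
	PCC_SKETCH_EXT_RESPONDER = 4,
	PCC_SKETCH_HW_REGISTERS = 5,
};

/* True only for the extended (type 3 and type 4) subspaces. */
static inline bool pcc_type_is_extended_sketch(int type)
{
	return type == PCC_SKETCH_EXT_INITIATOR ||
	       type == PCC_SKETCH_EXT_RESPONDER;
}
```

With a helper like this, the managed write path would be taken only for types 3 and 4, and existing type 1/2 users would never reach it.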
Given the duplication and incomplete coverage, revert this change. Any new
requirements should be addressed in focused follow-ups rather than bundling
multiple behavioral changes together.

I am willing to break up the previous work into multiple steps, provided the
above arguments you provided are not going to prevent them from getting
merged. Type 3/4 drivers can and should make use of the Mailbox
abstraction. Doing so can lay the ground work for making the type 2 drivers
share a common implementation of the shared buffer management.

Sure. Lets revert this patch and start discussing your individual requirements
in individual patches and check why tx_prepare and rx_callback can't work for
you. Please share the client driver code changes you tried when checking
tx_prepare and rx_callback as well so that we can see why it can't work.
Thanks for your attention and feedback. Much appreciated.