On 2018-01-30 21:49, Corey Minyard wrote:
On 01/29/2018 07:57 AM, Wang, Haiyue wrote:Sure, the read/write have protected the critical data area with IRQ, and also, these
On 2018-01-26 22:48, Corey Minyard wrote:
On 01/26/2018 12:08 AM, Wang, Haiyue wrote:Since KCS is not a multi-reader protocol from BMC's view, you makes things complex. :-)
On 2018-01-25 01:48, Corey Minyard wrote:
On 01/24/2018 10:06 AM, Haiyue Wang wrote:
The KCS (Keyboard Controller Style) interface is used to perform in-band
IPMI communication between a server host and its BMC (BaseBoard Management
Controllers).
This driver exposes the KCS interface on ASpeed SOCs (AST2400 and AST2500)
as a character device. Such SOCs are commonly used as BMCs and this driver
implements the BMC side of the KCS interface.
Signed-off-by: Haiyue Wang <haiyue.wang@xxxxxxxxxxxxxxx>
---
v1->v2
- Divide the driver into two parts, one handles the BMC KCS IPMI 2.0 state;
ÂÂ the other handles the BMC KCS controller such as AST2500 IO accessing.
- Use the spin lock APIs to handle the device file operations and BMC chip
ÂÂ IRQ inferface for accessing the same KCS BMC data structure.
- Enhanced the phases handling of the KCS BMC.
- Unified the IOCTL definition for IPMI BMC, it will be used by KCS and BT.
---
+
+static void kcs_bmc_handle_data(struct kcs_bmc *kcs_bmc)
+{
+ÂÂÂ u8 data;
+
+ÂÂÂ switch (kcs_bmc->phase) {
+ÂÂÂ case KCS_PHASE_WRITE:
+ÂÂÂÂÂÂÂ set_state(kcs_bmc, WRITE_STATE);
+
+ÂÂÂÂÂÂÂ /* set OBF before reading data */
+ÂÂÂÂÂÂÂ write_data(kcs_bmc, KCS_ZERO_DATA);
+
+ÂÂÂÂÂÂÂ if (kcs_bmc->data_in_idx < KCS_MSG_BUFSIZ)
+ kcs_bmc->data_in[kcs_bmc->data_in_idx++] =
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ read_data(kcs_bmc);
I missed this earlier, you need to issue a length error if the data is too large.
Understood.+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case KCS_PHASE_WRITE_END:
+ÂÂÂÂÂÂÂ set_state(kcs_bmc, READ_STATE);
+
+ÂÂÂÂÂÂÂ if (kcs_bmc->data_in_idx < KCS_MSG_BUFSIZ)
+ kcs_bmc->data_in[kcs_bmc->data_in_idx++] =
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ read_data(kcs_bmc);
+
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_WAIT_READ;
+ÂÂÂÂÂÂÂ if (kcs_bmc->running) {
Why do you only do this when running is set? It won't hurt anything if it's not
set. As it is, you have a race if something opens the device while this code
runs.
Also, don't set the state to wait read until the "write" has finished (userland has
read the data out of the buffer. More on that later.
If host software sends the data twice such as a retry before the BMC's IPMI service starts,+ kcs_bmc->data_in_avail = true;
+ wake_up_interruptible(&kcs_bmc->queue);
+ÂÂÂÂÂÂÂ }
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case KCS_PHASE_READ:
+ÂÂÂÂÂÂÂ if (kcs_bmc->data_out_idx == kcs_bmc->data_out_len)
+ÂÂÂÂÂÂÂÂÂÂÂ set_state(kcs_bmc, IDLE_STATE);
+
+ÂÂÂÂÂÂÂ data = read_data(kcs_bmc);
+ÂÂÂÂÂÂÂ if (data != KCS_CMD_READ_BYTE) {
+ÂÂÂÂÂÂÂÂÂÂÂ set_state(kcs_bmc, ERROR_STATE);
+ÂÂÂÂÂÂÂÂÂÂÂ write_data(kcs_bmc, KCS_ZERO_DATA);
+ÂÂÂÂÂÂÂÂÂÂÂ break;
+ÂÂÂÂÂÂÂ }
+
+ÂÂÂÂÂÂÂ if (kcs_bmc->data_out_idx == kcs_bmc->data_out_len) {
+ÂÂÂÂÂÂÂÂÂÂÂ write_data(kcs_bmc, KCS_ZERO_DATA);
+ÂÂÂÂÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_IDLE;
+ÂÂÂÂÂÂÂÂÂÂÂ break;
+ÂÂÂÂÂÂÂ }
+
+ÂÂÂÂÂÂÂ write_data(kcs_bmc,
+ kcs_bmc->data_out[kcs_bmc->data_out_idx++]);
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case KCS_PHASE_ABORT_ERROR1:
+ÂÂÂÂÂÂÂ set_state(kcs_bmc, READ_STATE);
+
+ÂÂÂÂÂÂÂ /* Read the Dummy byte */
+ÂÂÂÂÂÂÂ read_data(kcs_bmc);
+
+ÂÂÂÂÂÂÂ write_data(kcs_bmc, kcs_bmc->error);
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_ABORT_ERROR2;
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case KCS_PHASE_ABORT_ERROR2:
+ÂÂÂÂÂÂÂ set_state(kcs_bmc, IDLE_STATE);
+
+ÂÂÂÂÂÂÂ /* Read the Dummy byte */
+ÂÂÂÂÂÂÂ read_data(kcs_bmc);
+
+ÂÂÂÂÂÂÂ write_data(kcs_bmc, KCS_ZERO_DATA);
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_IDLE;
+
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ default:
+ÂÂÂÂÂÂÂ set_state(kcs_bmc, ERROR_STATE);
+
+ÂÂÂÂÂÂÂ /* Read the Dummy byte */
+ÂÂÂÂÂÂÂ read_data(kcs_bmc);
+
+ÂÂÂÂÂÂÂ write_data(kcs_bmc, KCS_ZERO_DATA);
+ÂÂÂÂÂÂÂ break;
+ÂÂÂ }
+}
+
+static void kcs_bmc_handle_command(struct kcs_bmc *kcs_bmc)
+{
+ÂÂÂ u8 cmd;
+
+ÂÂÂ set_state(kcs_bmc, WRITE_STATE);
+
+ÂÂÂ /* Dummy data to generate OBF */
+ÂÂÂ write_data(kcs_bmc, KCS_ZERO_DATA);
+
+ÂÂÂ cmd = read_data(kcs_bmc);
Shouldn't you check the phase in all the cases below and do error
handling if the phase isn't correct?
Similar thing if the device here isn't open. You need to handle
that gracefully.
Also, you should remove data_in_avail and data_in_idx setting from
here, for reasons I will explain later.
then the two IPMI requests will be merged into one, if not clear data_in_idx after receving
KCS_CMD_WRITE_START. Most of the states are driven by host software (SMS). :(
True, but what if the host issues WRITE_START or a WRITE_END while this driver is in read
state? The spec is unclear on this, but it really only makes sense for the host to issue
WRITE_START in idle stat and WRITE_END in write state. IMHO it should go to error
state. You might make the case that a WRITE_START anywhere restarts the transaction,
but the feel of the error state machine kind of goes against that. WRITE_END is definitely
wrong anywhere but write state.
I just found the following in the spec (section 9.12):
ÂÂ Thus, since the interface will allow a command transfer to be
ÂÂ started or restarted
ÂÂ at any time when the input buffer is empty, software could elect to
ÂÂ simply retry
ÂÂ the command upon detecting an error condition, or issue a âknown goodâ
ÂÂ command in order to clear ERROR_STATE
So a WRITE_START anywhere is ok. A WRITE_END in the wrong state should probably
still go to error state. This means the user needs to be able to handle a write error at
any time. It also means it's very important to make sure the user does a read before
doing a write. If the host re-issues a WRITE_START and writes a new command
between the time the use reads the data and writes the response, the response would
be for the wrong command.
+ÂÂÂ switch (cmd) {
+ÂÂÂ case KCS_CMD_WRITE_START:
+ÂÂÂÂÂÂÂ kcs_bmc->data_in_avail = false;
+ÂÂÂÂÂÂÂ kcs_bmc->data_in_idxÂÂ = 0;
+ÂÂÂÂÂÂÂ kcs_bmc->phaseÂÂÂÂÂÂÂÂ = KCS_PHASE_WRITE;
+ÂÂÂÂÂÂÂ kcs_bmc->errorÂÂÂÂÂÂÂÂ = KCS_NO_ERROR;
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case KCS_CMD_WRITE_END:
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_WRITE_END;
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case KCS_CMD_ABORT:
+ÂÂÂÂÂÂÂ if (kcs_bmc->error == KCS_NO_ERROR)
+ÂÂÂÂÂÂÂÂÂÂÂ kcs_bmc->error = KCS_ABORTED_BY_COMMAND;
+
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_ABORT_ERROR1;
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ default:
+ÂÂÂÂÂÂÂ kcs_bmc->error = KCS_ILLEGAL_CONTROL_CODE;
+ÂÂÂÂÂÂÂ set_state(kcs_bmc, ERROR_STATE);
+ÂÂÂÂÂÂÂ write_data(kcs_bmc, kcs_bmc->error);
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_ERROR;
+ÂÂÂÂÂÂÂ break;
+ÂÂÂ }
+}
+
+int kcs_bmc_handle_event(struct kcs_bmc *kcs_bmc)
+{
+ÂÂÂ unsigned long flags;
+ÂÂÂ int ret = 0;
+ÂÂÂ u8 status;
+
+ÂÂÂ spin_lock_irqsave(&kcs_bmc->lock, flags);
+
+ÂÂÂ status = read_status(kcs_bmc) & (KCS_STATUS_IBF | KCS_STATUS_CMD_DAT);
+
+ÂÂÂ switch (status) {
+ÂÂÂ case KCS_STATUS_IBF | KCS_STATUS_CMD_DAT:
+ÂÂÂÂÂÂÂ kcs_bmc_handle_command(kcs_bmc);
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case KCS_STATUS_IBF:
+ÂÂÂÂÂÂÂ kcs_bmc_handle_data(kcs_bmc);
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ default:
+ÂÂÂÂÂÂÂ ret = -1;
+ÂÂÂÂÂÂÂ break;
+ÂÂÂ }
+
+ÂÂÂ spin_unlock_irqrestore(&kcs_bmc->lock, flags);
+
+ÂÂÂ return ret;
+}
+EXPORT_SYMBOL(kcs_bmc_handle_event);
+
+static inline struct kcs_bmc *file_kcs_bmc(struct file *filp)
+{
+ÂÂÂ return container_of(filp->private_data, struct kcs_bmc, miscdev);
+}
+
+static int kcs_bmc_open(struct inode *inode, struct file *filp)
+{
+ÂÂÂ struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
+ÂÂÂ int ret = 0;
+
+ÂÂÂ spin_lock_irq(&kcs_bmc->lock);
+
+ÂÂÂ if (!kcs_bmc->running) {
+ÂÂÂÂÂÂÂ kcs_bmc->runningÂÂÂÂÂÂ = 1;
+ÂÂÂÂÂÂÂ kcs_bmc->phaseÂÂÂÂÂÂÂÂ = KCS_PHASE_IDLE;
+ÂÂÂÂÂÂÂ kcs_bmc->data_in_avail = false;
If you do everything right, setting the phase and data_in_avail should not
be necessary here.
+ÂÂÂ } else {
+ÂÂÂÂÂÂÂ ret = -EBUSY;
+ÂÂÂ }
+
+ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
+
+ÂÂÂ return ret;
+}
+
+static unsigned int kcs_bmc_poll(struct file *filp, poll_table *wait)
+{
+ÂÂÂ struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
+ÂÂÂ unsigned int mask = 0;
+
+ÂÂÂ poll_wait(filp, &kcs_bmc->queue, wait);
+
+ÂÂÂ spin_lock_irq(&kcs_bmc->lock);
+
+ÂÂÂ if (kcs_bmc->data_in_avail)
+ÂÂÂÂÂÂÂ mask |= POLLIN;
+
+ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
+
+ÂÂÂ return mask;
+}
+
+static ssize_t kcs_bmc_read(struct file *filp, char *buf,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ size_t count, loff_t *offset)
+{
+ÂÂÂ struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
+ÂÂÂ ssize_t ret = -EAGAIN;
+
This function still has some issues.
You can't call copy_to_user() with a spinlock held or interrupts disabled.
To handle readers, you probably need a separate mutex.
Also, this function can return -EAGAIN even if O_NONBLOCK is not set if
kcs_bmc->data_in_avail changes between when you wait on the event
and when you check it under the lock.
You also clear data_in_avail even if the copy_to_user() fails, which is
wrong.
I believe the best way to handle this would be to have the spinlock
protect the inner workings of the state machine and a mutex handle
copying data out, setting/clearing the running flag (thus a mutex
instead of spinlock in open and release) and the ioctl settings (except
for abort where you will need to grab the spinlock).
After the wait event below, grab the mutex. If data is not available
and O_NONBLOCK is not set, drop the mutex and retry. Otherwise
this is the only place (besides release) that sets data_in_avail to false.
Do the copy_to_user(), grab the spinlock, clear data_in_avail and
data_in_idx, then release the lock and mutex. If you are really
adventurous you can do this without grabbing the lock using
barriers, but it's probably not necessary here.
With the state machine being able to be restarted at any time, you need
something a little different here. You still need the mutex to handle
multiple readers and the copy. I think the function should be something
like:
No, I don't think you understand. The primary purpose of the complexity
here is to protect the driver from the host system (on the other side of
the KCS interface). Without this protection, it is possible for the host
system to start a new write while the user on the BMC side is reading
data out, resulting in corrupt data being read.
I haven't thought too much about this. There may be a simpler way,
but the protection needs to be there.
And you may not think you need to protect the driver against a
malicious BMC side user code, but you would be wrong. You can
only have one opener, but with threads or a fork you can have
multiple readers. And you don't know if a malicious piece of
code has taken over userland. You always need to protect the
kernel.
functions should be thread local safe I believe.
spin_lock_irq(&kcs_bmc->lock);
...
spin_unlock_irq(&kcs_bmc->lock);
ÂÂ static ssize_t kcs_bmc_read(struct file *filp, char *buf,
ÂÂÂ ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ size_t count, loff_t *offset)
ÂÂ {
ÂÂÂ ÂÂÂ struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
ÂÂÂ ÂÂÂ ssize_t ret;
ÂÂÂ ÂÂÂ bool avail;
ÂÂÂ ÂÂÂ size_t data_size;
ÂÂÂ ÂÂÂ u8 *data;
ÂÂÂ ÂÂÂ data = kmalloc(KCS_MSG_BUFSIZ, GFP_KERNEL);
ÂÂÂ ÂÂÂ if (!data)
ÂÂÂ ÂÂÂÂÂÂÂ return -ENOMEM;
ÂÂ retry:
ÂÂÂ ÂÂÂ ret = -EAGAIN;
ÂÂÂ ÂÂÂ if (!(filp->f_flags & O_NONBLOCK))
ÂÂÂ ÂÂÂÂÂÂÂ wait_event_interruptible(kcs_bmc->queue,
ÂÂÂ ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kcs_bmc->data_in_avail);
ÂÂÂ ÂÂÂ mutex_lock(&kcs_bmc->read_mutex);
ÂÂÂ ÂÂÂ spin_lock_irq(&kcs_bmc->lock);
ÂÂÂ ÂÂÂ avail = kcs_bmc->data_in_avail;
ÂÂÂ ÂÂÂ if (avail) {
ÂÂÂ ÂÂÂÂÂÂÂ memcpy(data, kcs_bmc->data_in, kcs_bmc->data_in_idx);
ÂÂÂ ÂÂÂÂÂÂÂ data_size = kcs_bmc->data_in_idx;
ÂÂÂ ÂÂÂ }
ÂÂÂ ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
ÂÂÂ ÂÂÂ if (!avail) {
ÂÂÂ ÂÂÂÂÂÂÂ if (filp->f_flags & O_NONBLOCK)
ÂÂÂ ÂÂÂÂÂÂÂÂÂÂÂ goto out_mutex_unlock;
ÂÂÂ ÂÂÂÂÂÂÂ mutex_unlock(&kcs_bmc->read_mutex);
ÂÂÂ ÂÂÂÂÂÂÂ goto retry;
ÂÂÂ ÂÂÂ }
ÂÂÂ ÂÂÂ if (count < data_size) {
ÂÂÂ ÂÂÂÂÂÂÂ ret = -EOVERFLOW;
ÂÂÂ ÂÂÂ ÂÂ Â ? I'm not sure about the error, but userspace needs to know.
ÂÂÂ Â Â ÂÂÂ goto out_mutex_unlock;
Maybe a length error to the host side here?
Every time ipmids (or kcsd) crashed or killed, it needs start to call FORCE_ARBORT firstly, to sync with
ÂÂÂ ÂÂÂ }In practice, we do this as spec said in ipmid, NOT in driver, driver can't handle anything, let's
ÂÂÂ ÂÂÂ if (!copy_to_user(buf, data, data_size)) {
ÂÂÂ ÂÂÂÂÂÂÂ ret = -EFAULT;
ÂÂÂ Â Â ÂÂÂ goto out_mutex_unlock;
ÂÂÂ Â Â }
ÂÂÂ ÂÂÂ ret = data_size;
ÂÂÂ ÂÂÂ spin_lock_irq(&kcs_bmc->lock);
ÂÂÂ ÂÂÂ if (kcs_bmc->phase != KCS_PHASE_WRITE_END_DONE)
ÂÂÂ ÂÂÂÂÂÂÂ /* Something aborted or restarted the state machine. */
ÂÂÂ ÂÂÂÂÂÂÂ ? Maybe restart if O_NONBLOCK is not set and -EAGAIN if it is?
ÂÂÂ ÂÂÂÂÂÂÂ ret = -EIO;
ÂÂÂ ÂÂÂ } else {
ÂÂÂ Â ÂÂ ÂÂ kcs_bmc->phase = KCS_PHASE_WAIT_READ;
ÂÂÂ ÂÂÂÂÂÂÂ kcs_bmc->data_in_avail = false;
ÂÂÂ Â Â Â Â kcs_bmc->data_in_idx = 0;
ÂÂÂ ÂÂÂ }
ÂÂÂ ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
ÂÂ out_mutex_unlock:
ÂÂÂ ÂÂÂ mutex_unlock(&kcs_bmc->read_mutex);
ÂÂÂ ÂÂÂ kfree(data);
ÂÂÂ ÂÂÂ return ret;
ÂÂ }
Note that I added a state, KCS_PHASE_WRITE_END_DONE, which would be
set after the final byte from the host is received. You want the read here
done before you can do the write below to avoid the race I talked about.
There is a local copy made of the data. What you *never* want to happen
here is for the state machine to start processing a new write command
while the data is being copied. It could result in corrupt data being read
and some random operation being done by the BMC.
If you want to avoid the local copy, it could be done, but it's more complex.
The device just provides the read & write data, the transaction is handled in the KCS+ÂÂÂ if (!(filp->f_flags & O_NONBLOCK))
+ÂÂÂÂÂÂÂ wait_event_interruptible(kcs_bmc->queue,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kcs_bmc->data_in_avail);
+
+ÂÂÂ spin_lock_irq(&kcs_bmc->lock);
+
+ÂÂÂ if (kcs_bmc->data_in_avail) {
+ÂÂÂÂÂÂÂ kcs_bmc->data_in_avail = false;
+
+ÂÂÂÂÂÂÂ if (count > kcs_bmc->data_in_idx)
+ÂÂÂÂÂÂÂÂÂÂÂ count = kcs_bmc->data_in_idx;
+
+ÂÂÂÂÂÂÂ if (!copy_to_user(buf, kcs_bmc->data_in, count))
+ÂÂÂÂÂÂÂÂÂÂÂ ret = count;
+ÂÂÂÂÂÂÂ else
+ÂÂÂÂÂÂÂÂÂÂÂ ret = -EFAULT;
+ÂÂÂ }
+
+ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
+
+ÂÂÂ return ret;
+}
+
+static ssize_t kcs_bmc_write(struct file *filp, const char *buf,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ size_t count, loff_t *offset)
+{
+ÂÂÂ struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
+ÂÂÂ ssize_t ret = count;
+
+ÂÂÂ if (count < 1 || count > KCS_MSG_BUFSIZ)
+ÂÂÂÂÂÂÂ return -EINVAL;
+
+ÂÂÂ spin_lock_irq(&kcs_bmc->lock);
+
+ÂÂÂ if (kcs_bmc->phase == KCS_PHASE_WAIT_READ) {
+ÂÂÂÂÂÂÂ if (copy_from_user(kcs_bmc->data_out, buf, count)) {
+ÂÂÂÂÂÂÂÂÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
+ÂÂÂÂÂÂÂÂÂÂÂ return -EFAULT;
+ÂÂÂÂÂÂÂ }
+
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_READ;
+ÂÂÂÂÂÂÂ kcs_bmc->data_out_idx = 1;
+ÂÂÂÂÂÂÂ kcs_bmc->data_out_len = count;
+ÂÂÂÂÂÂÂ write_data(kcs_bmc, kcs_bmc->data_out[0]);
+ÂÂÂ } else if (kcs_bmc->phase == KCS_PHASE_READ) {
+ÂÂÂÂÂÂÂ ret = -EBUSY;
+ÂÂÂ } else {
+ÂÂÂÂÂÂÂ ret = -EINVAL;
Is there a reason you return -EINVAL here? Why not just -EBUSY in all
cases? Is there something that userland will need to do differently?
+ÂÂÂ }
+
+ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
+
+ÂÂÂ return ret;
+}
+
+static long kcs_bmc_ioctl(struct file *filp, unsigned int cmd,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂ unsigned long arg)
+{
+ÂÂÂ struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
+ÂÂÂ long ret = 0;
+
+ÂÂÂ spin_lock_irq(&kcs_bmc->lock);
+
+ÂÂÂ switch (cmd) {
+ÂÂÂ case IPMI_BMC_IOCTL_SET_SMS_ATN:
+ÂÂÂÂÂÂÂ update_status_bits(kcs_bmc, KCS_STATUS_SMS_ATN,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ KCS_STATUS_SMS_ATN);
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case IPMI_BMC_IOCTL_CLEAR_SMS_ATN:
+ÂÂÂÂÂÂÂ update_status_bits(kcs_bmc, KCS_STATUS_SMS_ATN,
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ 0);
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ case IPMI_BMC_IOCTL_FORCE_ABORT:
+ÂÂÂÂÂÂÂ set_state(kcs_bmc, ERROR_STATE);
+ÂÂÂÂÂÂÂ read_data(kcs_bmc);
+ÂÂÂÂÂÂÂ write_data(kcs_bmc, KCS_ZERO_DATA);
+
+ÂÂÂÂÂÂÂ kcs_bmc->phase = KCS_PHASE_ERROR;
+ÂÂÂÂÂÂÂ kcs_bmc->data_in_avail = false;
+ÂÂÂÂÂÂÂ break;
+
+ÂÂÂ default:
+ÂÂÂÂÂÂÂ ret = -EINVAL;
+ÂÂÂÂÂÂÂ break;
+ÂÂÂ }
+
+ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
+
+ÂÂÂ return ret;
+}
+
+static int kcs_bmc_release(struct inode *inode, struct file *filp)
+{
+ÂÂÂ struct kcs_bmc *kcs_bmc = file_kcs_bmc(filp);
+
What happens if the device gets closed in the middle of a transaction? That's
an important case to handle. If something is in process, you need to abort it.
controller's IRQ handler.
From the spec, section 9.14:
ÂÂ The BMC must change the status to ERROR_STATE on any condition where it
ÂÂ aborts a command transfer in progress.
So you need to do something here.
make it simple, thanks!
If ipmid crashes or is killed, how does it accomplish this?
host side software.
I think the final error handling solution is that kcsd (user land) runs, otherwise, the host software side still got stuck. We meet
Whenever the BMC is reset (from power-on or a hard reset), the State Bits are initialized to â11 - Error Stateâ. Doing so
allows SMS to detect that the BMC has been reset and that any message in process has been terminated by the BMC.
Right, that's fine, like it should be. But we are not talking about a reset.
this kind of issue, so in general, we just doesn't handle some mirror errors in driver, then in kcsd, when it can provide the real
IPMI service, it will reset the channel firstly to sync with host side software.
-corey
+ spin_lock_irq(&kcs_bmc->lock);
+
+ÂÂÂ kcs_bmc->running = 0;
+
+ÂÂÂ spin_unlock_irq(&kcs_bmc->lock);
+
+ÂÂÂ return 0;
+}
+