Re: [PATCH 3/3] s390/vfio-ap: Add write support to sysfs attr ap_config

From: Anthony Krowiak
Date: Mon Jan 29 2024 - 16:43:27 EST



On 1/26/24 9:35 AM, Jason J. Herne wrote:
Allow writing a complete set of masks to ap_config. Doing so will
cause the vfio-ap driver to replace the vfio-ap mediated device's entire
state with the given set of masks. If the given state cannot be set, then
no changes are made to the vfio-ap mediated device.

The format of the data written to ap_config is as follows:
{amask},{dmask},{cmask}\n

\n is a newline character.

amask, dmask, and cmask are masks identifying which adapters, domains,
and control domains should be assigned to the mediated device.

The format of a mask is as follows:
0xNN..NN

Where NN..NN is 64 hexadecimal characters representing a 256-bit value.
The leftmost (highest order) bit represents adapter/domain 0.

For an example set of masks that represent your mdev's current
configuration, simply cat ap_config.

This attribute is intended to be used by an mdevctl callout script
supporting the mdev type vfio_ap-passthrough to atomically update a
vfio-ap mediated device's state.

Signed-off-by: Jason J. Herne <jjherne@xxxxxxxxxxxxx>
---
Documentation/arch/s390/vfio-ap.rst | 27 ++++
drivers/s390/crypto/vfio_ap_ops.c | 188 +++++++++++++++++++++++---
drivers/s390/crypto/vfio_ap_private.h | 6 +-
3 files changed, 197 insertions(+), 24 deletions(-)


Maybe the following should be in its own patch since it is a doc change?


diff --git a/Documentation/arch/s390/vfio-ap.rst b/Documentation/arch/s390/vfio-ap.rst
index 929ee1c1c940..a94f13916dea 100644
--- a/Documentation/arch/s390/vfio-ap.rst
+++ b/Documentation/arch/s390/vfio-ap.rst
@@ -380,6 +380,33 @@ matrix device.
control_domains:
A read-only file for displaying the control domain numbers assigned to the
vfio_ap mediated device.
+ ap_config:
+ A read/write file that, when written to, allows the entire vfio_ap mediated
+ device's ap configuration to be replaced in one shot. Three masks are given,
+ one for adapters, one for domains, and one for control domains. If the
+ given state cannot be set, then no changes are made to the vfio-ap
+ mediated device.
+
+ The format of the data written to ap_config is as follows:
+ {amask},{dmask},{cmask}\n
+
+ \n is a newline character.
+
+ amask, dmask, and cmask are masks identifying which adapters, domains,
+ and control domains should be assigned to the mediated device.
+
+ The format of a mask is as follows:
+ 0xNN..NN
+
+ Where NN..NN is 64 hexadecimal characters representing a 256-bit value.
+ The leftmost (highest order) bit represents adapter/domain 0.
+
+ For an example set of masks that represent your mdev's current
+ configuration, simply cat ap_config.
+
+ This attribute is intended to be used by automation. End users would be
+ better served using the respective assign/unassign attributes for adapters
+ and domains.


s/automation/tooling/ ?

s/for adapters and domains/for adapters, domains and control domains/


* functions:
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 96293683b939..5ad134c1d5e6 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -764,10 +764,11 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
static void vfio_ap_mdev_link_queue(struct ap_matrix_mdev *matrix_mdev,
struct vfio_ap_queue *q)
{
- if (q) {
- q->matrix_mdev = matrix_mdev;
- hash_add(matrix_mdev->qtable.queues, &q->mdev_qnode, q->apqn);
- }
+ if (!q || vfio_ap_mdev_get_queue(matrix_mdev, q->apqn))
+ return;
+
+ q->matrix_mdev = matrix_mdev;
+ hash_add(matrix_mdev->qtable.queues, &q->mdev_qnode, q->apqn);
}


The above change has nothing to do with write support for the sysfs ap_config attribute. I suppose it could be considered an enhancement to ensure a queue that is already linked does not get re-linked, but the callers of this function all pass in a queue that has not yet been linked; either because it has just been probed or just assigned to the mdev. In any case, I don't think this change belongs in this patch.


static void vfio_ap_mdev_link_apqn(struct ap_matrix_mdev *matrix_mdev, int apqn)
@@ -1048,21 +1049,25 @@ static void vfio_ap_mdev_unlink_adapter(struct ap_matrix_mdev *matrix_mdev,
}
}
-static void vfio_ap_mdev_hot_unplug_adapter(struct ap_matrix_mdev *matrix_mdev,
- unsigned long apid)
+static void vfio_ap_mdev_hot_unplug_adapters(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *apids)


The commits for the patch series created to reset queues removed from guest's AP configuration are now merged into our master branch and will pre-req this series. I've made note of the changes you will need to make herein to conform with the changes to the hot unplug functions you modified. You may want to take a look in any case because you may be able to take advantage - or not - of some other functions without having to modify these; for example;

staticintreset_queues_for_apids(structap_matrix_mdev*matrix_mdev,

unsignedlong*apm_reset)


{
- int loop_cursor;
- struct vfio_ap_queue *q;
struct ap_queue_table *qtable = kzalloc(sizeof(*qtable), GFP_KERNEL);


The vfio_ap_queue struct now has a field for adding it to a list of queues that need to be reset, so this qtable is now defined:

structlist_headqlist;


+ struct vfio_ap_queue *q;
+ unsigned long apid;
+ int loop_cursor;
hash_init(qtable->queues);
- vfio_ap_mdev_unlink_adapter(matrix_mdev, apid, qtable);
- if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm)) {
- clear_bit_inv(apid, matrix_mdev->shadow_apcb.apm);
- vfio_ap_mdev_update_guest_apcb(matrix_mdev);
+ for_each_set_bit_inv(apid, apids, AP_DEVICES) {
+ vfio_ap_mdev_unlink_adapter(matrix_mdev, apid, qtable);
+
+ if (test_bit_inv(apid, matrix_mdev->shadow_apcb.apm))
+ clear_bit_inv(apid, matrix_mdev->shadow_apcb.apm);
}
+ vfio_ap_mdev_update_guest_apcb(matrix_mdev);
+
vfio_ap_mdev_reset_queues(qtable);


The call here is now:

vfio_ap_mdev_reset_qlist(&qlist)


hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) {
@@ -1073,6 +1078,15 @@ static void vfio_ap_mdev_hot_unplug_adapter(struct ap_matrix_mdev *matrix_mdev,
kfree(qtable);
}
+static void vfio_ap_mdev_hot_unplug_adapter(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apid)
+{
+ DECLARE_BITMAP(apids, AP_DEVICES);
+
+ set_bit_inv(apid, apids);
+ vfio_ap_mdev_hot_unplug_adapters(matrix_mdev, apids);
+}
+
/**
* unassign_adapter_store - parses the APID from @buf and clears the
* corresponding bit in the mediated matrix device's APM
@@ -1235,21 +1249,24 @@ static void vfio_ap_mdev_unlink_domain(struct ap_matrix_mdev *matrix_mdev,
}
}
-static void vfio_ap_mdev_hot_unplug_domain(struct ap_matrix_mdev *matrix_mdev,
- unsigned long apqi)
+static void vfio_ap_mdev_hot_unplug_domains(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long *apqis)
{
- int loop_cursor;
- struct vfio_ap_queue *q;
struct ap_queue_table *qtable = kzalloc(sizeof(*qtable), GFP_KERNEL);


The vfio_ap_queue struct now has a field for adding it to a list of queues that need to be reset, so this qtable is now defined:

structlist_headqlist;


+ struct vfio_ap_queue *q;
+ unsigned long apqi;
+ int loop_cursor;
hash_init(qtable->queues);
- vfio_ap_mdev_unlink_domain(matrix_mdev, apqi, qtable);
- if (test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm)) {
- clear_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm);
- vfio_ap_mdev_update_guest_apcb(matrix_mdev);
+ for_each_set_bit_inv(apqi, apqis, AP_DOMAINS) {
+ vfio_ap_mdev_unlink_domain(matrix_mdev, apqi, qtable);
+
+ if (test_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm))
+ clear_bit_inv(apqi, matrix_mdev->shadow_apcb.aqm);
}
+ vfio_ap_mdev_update_guest_apcb(matrix_mdev);
vfio_ap_mdev_reset_queues(qtable);


The call here is now:

vfio_ap_mdev_reset_qlist(&qlist)


hash_for_each(qtable->queues, loop_cursor, q, mdev_qnode) {
@@ -1260,6 +1277,15 @@ static void vfio_ap_mdev_hot_unplug_domain(struct ap_matrix_mdev *matrix_mdev,
kfree(qtable);
}
+static void vfio_ap_mdev_hot_unplug_domain(struct ap_matrix_mdev *matrix_mdev,
+ unsigned long apqi)
+{
+ DECLARE_BITMAP(apqis, AP_DOMAINS);
+
+ set_bit_inv(apqi, apqis);
+ vfio_ap_mdev_hot_unplug_domains(matrix_mdev, apqis);
+}
+
/**
* unassign_domain_store - parses the APQI from @buf and clears the
* corresponding bit in the mediated matrix device's AQM
@@ -1527,10 +1553,130 @@ static ssize_t ap_config_show(struct device *dev, struct device_attribute *attr,
return idx;
}
+/* Number of characters needed for a complete hex mask representing the bits in .. */
+#define AP_DEVICES_STRLEN (AP_DEVICES/4 + 3)
+#define AP_DOMAINS_STRLEN (AP_DOMAINS/4 + 3)
+#define AP_CONFIG_STRLEN (AP_DEVICES_STRLEN + 2 * AP_DOMAINS_STRLEN)
+
+static int parse_bitmap(char **strbufptr, unsigned long *bitmap, int nbits)
+{
+ char *curmask;
+
+ curmask = strsep(strbufptr, ",\n");
+ if (!curmask)
+ return -EINVAL;
+
+ bitmap_clear(bitmap, 0, nbits);
+ return ap_hex2bitmap(curmask, bitmap, nbits);
+}
+
+static int ap_matrix_length_check(struct ap_matrix_mdev *matrix_mdev)
+{
+ unsigned long bit;
+
+ for_each_set_bit_inv(bit, matrix_mdev->matrix.apm, AP_DEVICES) {
+ if (bit > matrix_mdev->matrix.apm_max)
+ return -ENODEV;
+ }
+
+ for_each_set_bit_inv(bit, matrix_mdev->matrix.aqm, AP_DOMAINS) {
+ if (bit > matrix_mdev->matrix.aqm_max)
+ return -ENODEV;
+ }
+
+ for_each_set_bit_inv(bit, matrix_mdev->matrix.adm, AP_DOMAINS) {
+ if (bit > matrix_mdev->matrix.adm_max)
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static void ap_matrix_copy(struct ap_matrix *dst, struct ap_matrix *src)
+{
+ bitmap_copy(dst->apm, src->apm, AP_DEVICES);
+ bitmap_copy(dst->aqm, src->aqm, AP_DOMAINS);
+ bitmap_copy(dst->adm, src->adm, AP_DOMAINS);
+}
+
static ssize_t ap_config_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
- return count;
+ struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev);
+ struct ap_matrix m_new, m_old, m_added, m_removed;
+ unsigned long newbit;
+ char *newbuf, *rest;
+ int rc = count;
+ bool do_update;
+
+ newbuf = rest = kstrndup(buf, AP_CONFIG_STRLEN, GFP_KERNEL);
+ if (!newbuf)
+ return -ENOMEM;
+
+ mutex_lock(&ap_perms_mutex);
+ get_update_locks_for_mdev(matrix_mdev);
+
+ /* Save old state */
+ ap_matrix_copy(&m_old, &matrix_mdev->matrix);
+
+ if (parse_bitmap(&rest, m_new.apm, AP_DEVICES) ||
+ parse_bitmap(&rest, m_new.aqm, AP_DOMAINS) ||
+ parse_bitmap(&rest, m_new.adm, AP_DOMAINS)) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ bitmap_andnot(m_removed.apm, m_old.apm, m_new.apm, AP_DEVICES);
+ bitmap_andnot(m_removed.aqm, m_old.aqm, m_new.aqm, AP_DOMAINS);
+ bitmap_andnot(m_added.apm, m_new.apm, m_old.apm, AP_DEVICES);
+ bitmap_andnot(m_added.aqm, m_new.aqm, m_old.aqm, AP_DOMAINS);


Rather than copying m_new to matrix-mdev->matrix and then possibly having to restore it later, why not modify ap_matrix_length_check to take a struct ap_matrix and pass in the m_new. If that fails, you can bail out without going through the copying steps. Note that you can use the vfio_ap_matrix_init(matrix_dev->info, m_new) function to populate the apm_max, aqm_max and adm_max fields of the matrix, or you can pass in matrix_mdev->matrix and retrieve those values from there.


+
+ /* Need new bitmaps in matrix_mdev for validation */
+ ap_matrix_copy(&matrix_mdev->matrix, &m_new);
+
+ /* Ensure new state is valid, else undo new state */
+ rc = vfio_ap_mdev_validate_masks(matrix_mdev);


If you do what I suggested above, you can skip all of the copying below.


+ if (rc) {
+ ap_matrix_copy(&matrix_mdev->matrix, &m_old);
+ goto out;
+ }
+ rc = ap_matrix_length_check(matrix_mdev);
+ if (rc) {
+ ap_matrix_copy(&matrix_mdev->matrix, &m_old);
+ goto out;
+ }
+ rc = count;
+
+ /* Need old bitmaps in matrix_mdev for unplug/unlink */
+ ap_matrix_copy(&matrix_mdev->matrix, &m_old);
+
+ /* Unlink removed adapters/domains */
+ vfio_ap_mdev_hot_unplug_adapters(matrix_mdev, m_removed.apm);
+ vfio_ap_mdev_hot_unplug_domains(matrix_mdev, m_removed.aqm);
+


I'm wondering if it might eliminate a lot of code duplication if you made changes similar to what you did for the vfio_ap_mdev_hot_unplug_adapter/domain functions. Maybe you could refactor the code in vfio_ap_mdev_assign_adapter/domain_store functions into other functions and call them; for example, something like this?:

staticssize_tassign_adapters_store(struct ap_matrix_mdev *matrix_mdev, unsigned long *apids)

{

unsigned long apid;

ssize_t apids_set = 0;

for_each_set_bit_inv(apid, apids, AP_DEVICES) {

if(apid > matrix_mdev->matrix.apm_max)

?????? We may have successfully set one or more prior to this, so what do we do?

if(test_bit_inv(apid, matrix_mdev->matrix.apm)) {

apids_set += 1

continue;

}

set_bit_inv(apid, matrix_mdev->matrix.apm);

ret = vfio_ap_mdev_validate_masks(matrix_mdev);

if(ret) {

clear_bit_inv(apid, matrix_mdev->matrix.apm);

return 0;

}

vfio_ap_mdev_link_adapter(matrix_mdev, apid);

if(vfio_ap_mdev_filter_matrix(matrix_mdev, apm_filtered)) {

vfio_ap_mdev_update_guest_apcb(matrix_mdev);

reset_queues_for_apids(matrix_mdev, apm_filtered);

}

apids_set += 1;

}

ret = apids_set;

}


+ /* Need new bitmaps in matrix_mdev for linking new adapters/domains */
+ ap_matrix_copy(&matrix_mdev->matrix, &m_new);
+
+ /* Link newly added adapters */
+ for_each_set_bit_inv(newbit, m_added.apm, AP_DEVICES)
+ vfio_ap_mdev_link_adapter(matrix_mdev, newbit);
+
+ for_each_set_bit_inv(newbit, m_added.aqm, AP_DOMAINS)
+ vfio_ap_mdev_link_domain(matrix_mdev, newbit);
+
+ /* filter resources not bound to vfio-ap */
+ do_update = vfio_ap_mdev_filter_matrix(matrix_mdev->matrix.apm,
+ matrix_mdev->matrix.aqm, matrix_mdev);


If you decide to reject my suggestion above, you will need to make a change to the call to filter above. The vfio_ap_mdev_filter_matrix function has changed, so the call would look like this:

    DECLARE_BITMAP(apm_filtered, AP_DEVICES);

    vfio_ap_mdev_filter_matrix(matrix_mdev, apm_filtered);

The apm_filtered bitmap may contain APIDs filterred from matrix_mdev->shadow_apcb, so the associated queues will need to be reset (see comment below).
+ do_update |= vfio_ap_mdev_filter_cdoms(matrix_mdev);
+
+ /* Apply changes to shadow apbc if things changed */
+ if (do_update)
+ vfio_ap_mdev_update_guest_apcb(matrix_mdev);


After the call above, you will need to call the function below to reset the queues associated with any APIDs filtered from matrix_mdev->shadow_apcb:

    reset_queues_for_apids(matrix_mdev, apm_filtered);



+out:
+ release_update_locks_for_mdev(matrix_mdev);
+ mutex_unlock(&ap_perms_mutex);
+ kfree(newbuf);
+ return rc;
}
static DEVICE_ATTR_RW(ap_config);
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 88aff8b81f2f..e2cf07c184bf 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -75,11 +75,11 @@ extern struct ap_matrix_dev *matrix_dev;
*/
struct ap_matrix {
unsigned long apm_max;
- DECLARE_BITMAP(apm, 256);
+ DECLARE_BITMAP(apm, AP_DEVICES);
unsigned long aqm_max;
- DECLARE_BITMAP(aqm, 256);
+ DECLARE_BITMAP(aqm, AP_DOMAINS);
unsigned long adm_max;
- DECLARE_BITMAP(adm, 256);
+ DECLARE_BITMAP(adm, AP_DOMAINS);
};
/**