Re: [PATCH v3 1/1] powerpc/pseries: Interface to represent PAPR firmware attributes

From: Pratik Sampat
Date: Fri Jul 16 2021 - 03:15:07 EST

Next message: Geert Uytterhoeven: "Re: [PATCH] soc: rockchip: ROCKCHIP_GRF should not default to y, unconditionally"
Previous message: Zeng Guang: "[PATCH 5/6] KVM: x86: Support interrupt dispatch in x2APIC mode with APIC-write VM exit"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 16/07/21 1:16 am, Fabiano Rosas wrote:

Pratik Sampat <psampat@xxxxxxxxxxxxx> writes:

Hello,

On 12/07/21 9:13 pm, Fabiano Rosas wrote:

"Pratik R. Sampat" <psampat@xxxxxxxxxxxxx> writes:

Hi, have you seen Documentation/core-api/kobject.rst, particularly the
part that says:

"When you see a sysfs directory full of other directories, generally each
of those directories corresponds to a kobject in the same kset."

Taking a look at samples/kobject/kset-example.c, it seems to provide an
overall structure that is closer to what other modules do when creating
sysfs entries. It uses less dynamic allocations and deals a bit better
with cleaning up the state afterwards.

Thank you for pointing me towards this example, the kset approach is
interesting and the example indeed does handle cleanups better.

Currently, we use "machine_device_initcall()" to register this
functionality, do you suggest I convert this into a tristate module
instead where I can include a "module_exit" for cleanups?

Ugh.. I was hoping we could get away with having all cleanups done at
kobject release time. But now I see that it is not called unless we
decrement the reference count. Nevermind then.

Sure I can keep the current approach as-is, while incorporating the rest of your
comments.

+ ret = plpar_hcall_norets(H_GET_ENERGY_SCALE_INFO, ESI_FLAGS_ALL, 0,
+ virt_to_phys(esi_buf), MAX_BUF_SZ);
+ esi_hdr = (struct h_energy_scale_info_hdr *) esi_buf;
+ if (ret != H_SUCCESS || esi_hdr->data_header_version != ESI_VERSION) {

I really dislike this. If you want to bail due to version change, then
at least include in the ABI document that we might not give the
userspace any data at all.

My only concern for having a version check is that, the attribute list
can change as well as the attributes itself may change.
If that is the case, then in a newer version if we do not bail out we
may parse data into our structs incorrectly.

Sure, that is a valid concern. But the documentation for the header
version field says:

"Version of the Header. The header will be always backward compatible,
and changes will not impact the Array of attributes. Current version =
0x01"

I guess this is a bit vague still, but I understood that:

1- header elements continue to exist at the same position;
2- the format of the array of attributes will not change.

Are you saying that my interpretation above is not correct or that you
don't trust the HV to enforce it?

Thanks for the clarification.
I understand now that my interpretation was incorrect. The version change
should not break anything as our code in kernel acts just as a pass through.

My argument only hinges on that we should likely give no data at all
instead of junk or incorrect data.

I agree. I just don't think it would be possible to end up with
incorrect data, unless the HV has a bug.

Maybe I could make this check after the return check and give out a
version mismatch message like the following?
pr_warn("hcall failed: H_GET_ENERGY_SCALE_INFO VER MISMATCH - EXP: 0x%x, REC: 0x%x",
ESI_VERSION, esi_hdr->data_header_version);

Yes, this will help with debug if we ever end up in this situation.

Understood, In case of a version mismatch and IDs are introduced, it can help
the userspace know that something has changed.

+ pr_warn("hcall failed: H_GET_ENERGY_SCALE_INFO");
+ goto out;
+ }
+
+ num_attrs = be64_to_cpu(esi_hdr->num_attrs);
+ /*
+ * Typecast the energy buffer to the attribute structure at the offset
+ * specified in the buffer
+ */

I think the code is now simple enough that this comment could be
removed.

ack

+ esi_attrs = (struct energy_scale_attribute *)
+ (esi_buf + be64_to_cpu(esi_hdr->array_offset));
+
+ pgs = kcalloc(num_attrs, sizeof(*pgs), GFP_KERNEL);

This is never freed.

+ if (!pgs)
+ goto out_pgs;
+
+ papr_kobj = kobject_create_and_add("papr", firmware_kobj);
+ if (!papr_kobj) {
+ pr_warn("kobject_create_and_add papr failed\n");
+ goto out_kobj;
+ }
+
+ esi_kobj = kobject_create_and_add("energy_scale_info", papr_kobj);
+ if (!esi_kobj) {
+ pr_warn("kobject_create_and_add energy_scale_info failed\n");
+ goto out_ekobj;
+ }
+
+ for (idx = 0; idx < num_attrs; idx++) {
+ char buf[4];
+ bool show_val_desc = true;
+
+ pgs[idx].pgattrs = kcalloc(MAX_ATTRS,
+ sizeof(*pgs[idx].pgattrs),
+ GFP_KERNEL);
+ if (!pgs[idx].pgattrs)
+ goto out_kobj;
+
+ pgs[idx].pg.attrs = kcalloc(MAX_ATTRS + 1,
+ sizeof(*pgs[idx].pg.attrs),
+ GFP_KERNEL);

I think the kobject code expects this to be statically allocated, so
you'd need to override the release function in some way to be able to
free this.

Right this and pgs both are never free'd because my understanding was
that as this functionality is invoked from machine_init, I'd expect it
to stay until shutdown.

Yep, I thought the kset code would improve this, but I misread it. So
I'm fine with keeping it like this.

Sure thing. Thanks!

However, if you believe that a module approach is cleaner, I can change
my implementation to accommodate for that and also include a
module_exit for cleanup of the above allocations

+ if (!pgs[idx].pg.attrs) {
+ kfree(pgs[idx].pgattrs);
+ goto out_kobj;
+ }
+
+ sprintf(buf, "%lld", be64_to_cpu(esi_attrs[idx].id));

Do you mean pgs[idx].name instead of buf? Otherwise you're passing this
stack allocated 'buf' to another function.

Yes you're right I should have either passed the pg struct or I should
have used strcpy, here the stack allocated buffer is being taken out of
scope which is incorrect.
Thanks for pointing this out!

+ pgs[idx].pg.name = buf;
+
+ /* Do not add the value description if it does not exist */
+ if (strlen(esi_attrs[idx].value_desc) == 0)
+ show_val_desc = false;
+
+ if (add_attr_group(be64_to_cpu(esi_attrs[idx].id),
+ MAX_ATTRS, &pgs[idx], show_val_desc)) {
+ pr_warn("Failed to create papr attribute group %s\n",
+ pgs[idx].pg.name);
+ goto out_pgattrs;
+ }
+ }
+
+ return 0;
+
+out_pgattrs:
+ for (i = 0; i < MAX_ATTRS; i++) {
+ kfree(pgs[i].pgattrs);
+ kfree(pgs[i].pg.attrs);
+ }
+out_ekobj:
+ kobject_put(esi_kobj);
+out_kobj:
+ kobject_put(papr_kobj);
+out_pgs:
+ kfree(pgs);
+out:
+ kfree(esi_buf);
+
+ return -ENOMEM;
+}
+
+machine_device_initcall(pseries, papr_init);

Next message: Geert Uytterhoeven: "Re: [PATCH] soc: rockchip: ROCKCHIP_GRF should not default to y, unconditionally"
Previous message: Zeng Guang: "[PATCH 5/6] KVM: x86: Support interrupt dispatch in x2APIC mode with APIC-write VM exit"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]