Re: [PATCH 2/7] x86/intel_rdt: Adds support for Class of service management

From: Vikas Shivappa
Date: Mon May 18 2015 - 15:46:08 EST




On Mon, 18 May 2015, Thomas Gleixner wrote:

On Mon, 18 May 2015, Vikas Shivappa wrote:
+static void __clos_get(unsigned int closid)

What's the purpose of these double underscores?


Similar to __get_rmid.

Errm. We use double underscore for such cases:

__bla()
{
do_stuff();
}

bla()
{
lock();
__bla();
unlock();
}

So you have two sorts of callers. Locked and unlocked. But I don't see
something like this. It's just random underscores for no value.

Ok - I saw this .. will remove underscores then since cpuset also dont use these anyways.


+{
+ struct clos_cbm_map *ccm = &ccmap[closid];
+
+ lockdep_assert_held(&rdt_group_mutex);

Do we really need all these lockdep asserts for a handfull of call
sites?

Well, we still have some calls some may be frequent depending on the usage..
should the assert decided based on number of times its called ?

We add these things for complex locking scenarios, but for single
callsites where locking is obvious its not really valuable. But that's
the least of my worries.

Ok. was just wanting to make sure help debug just in case as its not used by too many people as yet.


+static struct cgroup_subsys_state *
+intel_rdt_css_alloc(struct cgroup_subsys_state *parent_css)
+{
+ struct intel_rdt *parent = css_rdt(parent_css);
+ struct intel_rdt *ir;
+
+ /*
+ * Cannot return failure on systems with no Cache Allocation
+ * as the cgroup_init does not handle failures gracefully.

This comment is blatantly wrong. 5 lines further down you return
-ENOMEM. Now what?


Well , this is for cgroup_init called with parent=null which is when its
initializing the cgroup subsystem. I cant return error in this case but I can
otherwise.

And then you run into the same BUG_ON ...

May be I misled you - I pasted the code from cgroup.c which is not part of cache alloc below to show cgroup_init doesnt handle failure.. (cgroup_init_subsys ). The cache allocation code doesnt return BUG_ON. So I never run into BUG_ON.. (right?)


static void __init cgroup_init_subsys(struct cgroup_subsys *ss, bool early)
{

/* Create the root cgroup state for this subsystem */
ss->root = &cgrp_dfl_root;
css = ss->css_alloc(cgroup_css(&cgrp_dfl_root.cgrp, ss));
/* We don't handle early failures gracefully */
BUG_ON(IS_ERR(css));

I know. But still the comment is misleading.

/*
* cgroup_init() expects valid css pointer. Return
* the rdt_root_group.css instead of a failure code.
*/

Can you see the difference?

ok can fix the comment. Reminds me of my middle school days when English was my hardest subject.


+ */
+ if (!parent)
+ return &rdt_root_group.css;
+
+ ir = kzalloc(sizeof(struct intel_rdt), GFP_KERNEL);
+ if (!ir)
+ return ERR_PTR(-ENOMEM);

And why can't you return something useful here instead of letting the
thing run into a BUG?

This is a return for no memory - this is not for the call from cgroup_init_subsys where failure turns into a BUG.. this is when user is trying to create new cgroups..


maxid = c->x86_rdt_max_closid;
max_cbm_len = c->x86_rdt_max_cbm_len;

+ sizeb = BITS_TO_LONGS(maxid) * sizeof(long);
+ rdtss_info.closmap = kzalloc(sizeb, GFP_KERNEL);

What's the point of this bitmap? In later patches you have a
restriction to 64bits and the SDM says that CBM_LEN is limited to 32.

So where i the point for allocating a bitmap?

The cache bitmask is really a bitmask where every bit represents a cache way ,
so its way based mapping . Although currently the max is 32 bits we still need
to treat it as a bitmask.

In the first patch version you are the one who suggested to use all the bitmap
functions to check the presence of contiguous bits (although I was already
using the bitmaps, I had not used for some cases). So i made the change and
other similar code was changed to using bitmap/bit manipulation APIs. There
are scenarios where in we need to check for subset of bitmasks and other cases
but agree they can all be done without the bitmask APIs as well. But now your
comment contradicts the old comment ?

Oh well. You can still use bitmap functions on an u64 where
appropriate.

I see - I did not notice this bitmap is not for the CBM though.. this is the closmap which manages the closids. the cbm is just kept as a unsigned long - i dont do bitmap allocation for that. So we are on same page i guess.



+ if (!rdtss_info.closmap) {
+ err = -ENOMEM;
+ goto out_err;
+ }
+
+ sizeb = maxid * sizeof(struct clos_cbm_map);
+ ccmap = kzalloc(sizeb, GFP_KERNEL);
+ if (!ccmap) {
+ kfree(rdtss_info.closmap);
+ err = -ENOMEM;
+ goto out_err;

What's wrong with return -ENOMEM? We only use goto if we have to
cleanup stuff, certainly not for just returning err.

This was due to PeterZ's feedback that the return paradigm needs to not have
too many exit points or return a lot of times from the middle of code..
Both contradict now ?

There are different opinions about that. But again, that's the least
of my worries.

Great.



+ }
+
+ set_bit(0, rdtss_info.closmap);
+ rdt_root_group.clos = 0;
+ ccm = &ccmap[0];
+ bitmap_set(&ccm->cache_mask, 0, max_cbm_len);
+ ccm->clos_refcnt = 1;
+
pr_info("Max bitmask length:%u,Max ClosIds: %u\n", max_cbm_len,
maxid);

We surely do not want to sprinkle these all over dmesg.

This is just printed once! how is that sprinke all over? - we have a dmsg
print for Cache monitoring as well when cqm is enabled.

Sorry, mapped that to the wrong function. Though the message itself is
horrible.

"Max bitmask length:32,Max ClosIds: 16"

With some taste and formatting applied this would read:

"Max. bitmask length: 32, max. closids: 16"

Can you spot the difference?

Will fix comment.

Thanks,
Vikas


Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/