Re: [PATCH v11 00/29] x86: Support Intel Advanced Matrix Extensions

From: Thomas Gleixner
Date: Sat Oct 02 2021 - 21:06:27 EST


Len,

On Fri, Oct 01 2021 at 22:50, Chang Seok Bae wrote:
> Sending this version as it follows up the discussion [1] with some code
> changes from v10. This is not intended to ignore your comment on v10 at all.
> Appreciate your points on my oversights that I will address in v12 soon.

why on earth did you make Chang send these patches out when there are
fundamental review comments on the fly vs. the previous version?

The changes vs. V10 just try to address the recently discussed updates
to the permission interface, which we agreed on a couple of days ago,
but all of that still is built on top of something which has serious
flaws at all ends.

Rushing stuff out just to get an internal checkbox ticked off might be
interesting for Intel internal managerial reasons about which I don't
care at all.

But I very much care about the noise in my inbox and the time I have to
spend^W waste on this. Aside of that I care about how this shifts the
blame to someone else. See below.

Looking at the delta between v10 and v11, it's entirely clear that this
is just a hastily cobbled together update which hasn't even seen any
reasonable scrunity. Just looking at this gem:

> +long do_arch_prctl_state(struct task_struct *tsk, int option, unsigned long arg2,

...

> + case ARCH_GET_FEATURES_WITH_KERNEL_ASSISTANCE:
> + return put_user(features_mask, (unsigned long __user *)arg2);
> + case ARCH_SET_STATE_ENABLE: {

> + if (arg3 != ARCH_SET_STATE_ENABLE_ALLOC) {

A suboption in the same namespace? Why? What's wrong with having
explicit options to make the user space interface sensible?

> + tsk->group_leader->thread.fpu.state_perm |= state_perm;

Accessing tsk->group_leaader is safe here because?

> + return 0;
> + }
> +
> + for_each_thread(tsk, t) {

Walking the threads is safe here because?

> + spin_lock_irq(&t->thread.fpu.prealloc.lock);

Nesting the lock acquisitions for potentially hundreds of threads is
correct in which way?

Also what is disabling interrupts protecting against?

If nesting these locks would be sensible in any way, then how is _irq()
the correct mechanism to use?

> + if (!err)
> + err = prealloc_xstate_buffer(t, state_perm);

Surely continuing the list walk when an error happened is a brilliant
idea.

Aside of that this still calls into vzalloc() with interrupts
disabled which is wrong to begin with.

> + }
> +
> + for_each_thread(tsk, t) {
> + if (err)
> + free_xstate_prealloc_buffer(&t->thread.fpu);
> + spin_unlock_irq(&t->thread.fpu.prealloc.lock);

If this ever gets invoked from a process which has threads then the
unconditional enabling of interrupts here is protecting the rest of the
threads against interrupts in which way?

Not that the interrupt disable above is protecting anything, but that's
just beyond hillarious, really.

> + }
> +
> + if (err)
> + return -ENOMEM;
> +
> + tsk->group_leader->thread.fpu.state_perm |= state_perm;
> + set_tsk_thread_flag(tsk->group_leader, TIF_FPU_PREALLOC);

Again, accessing tsk->group_leaader is still safe here because?

> + return 0;

IOW a total of at least 5 obvious bugs in 60 lines of code (including
new lines and comment). That's an achievement.

Len, it's sad that an experienced kernel developer like you is sticking
a Reviewed-by tag on something like that. Either you forgot anything
about kernel development since you became a manager or you simply do not
care anymore and just want to tick your checkboxes. Neither one of these
options is acceptable.

What's even worse is that you made Chang to send that out, instead of
guiding him with your experience to do the right thing. IOW, you make
Chang look bad instead of helping him. Big coporate culture I assume,
but that does not justify that in any way.

Yours seriously grumpy

Thomas