Re: [PATCH v4 14/20] sched/core: Introduce a simple steal monitor
From: Yury Norov
Date: Thu Jun 18 2026 - 02:39:47 EST

On Thu, Jun 18, 2026 at 11:31:17AM +0530, Shrikanth Hegde wrote:
>
>
> On 6/18/26 11:02 AM, K Prateek Nayak wrote:
> > Hello Shrikanth, Yury,
> >
> > On 6/18/2026 10:14 AM, Shrikanth Hegde wrote:
> > > On 6/18/26 10:00 AM, Yury Norov wrote:
> > > > On Wed, Jun 17, 2026 at 11:11:33PM +0530, Shrikanth Hegde wrote:
> > > > > Start with a simple steal monitor.
> > > > >
> > > > > It is meant to look at steal time and make the decision to
> > > > > reduce/increase the preferred CPUs.
> > > > >
> > > > > It has
> > > > > - work function to execute the steal time calculations and decision
> > > > > making periodically.
> > > > > - low and high thresholds for steal time.
> > > > > - sampling period to control the frequency of steal time calculations.
> > > > > - cache the previous decision to avoid oscillations
> > > >
> > > > This monitor is the one implementation out of quite many possible,
> > > > right? I don't think it should live in the core scheduler files, it
> > > > should be a module.
> >
> > I agree that this tight an integration with the sched bits might
> > not be required.
> >
> > >
> > > You mean similar to drivers/cpuidle/? a new one drivers/steal_monitor/ ?
> >
> > Since steal time is a virtualization concept, somewhere in drivers/virt/
> > probably makes more sense unless we need some scheduler internal API to
> > implement it which shouldn't be the case.
> >
> > All the driver has to do is track steal-time (which should be available
> > via kcpustat_cpu_fetch()) periodically (using a workqueue?) and should
> > do set_cpu_preferred() (which needs to be made available for other use
> > cases anyways) so it should be possible.
>
> Yes. Seems doable.
>
> Do you think it would make sense to keep the debugfs in sched still?

The enable/disable part will be replaced with insmod/rmmod. The
statistics part - IDK. It is nice to have all stats in the same
place. On the other hand, without the driver loaded it would
always read zeroes. Anyway, it is just a single line in sched/core.c,
not a big deal.
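
To illustrate how little of this needs scheduler internals, here is a
rough userspace sketch of the decision step such a driver could run
from its periodic work function. It is not the code from the patch.
Every name in it (struct steal_monitor, steal_monitor_sample(), the
STEAL_* actions) is made up for illustration, and the damping rule is
only one possible reading of "cache the previous decision to avoid
oscillations". In a real driver the cumulative steal value would come
from something like kcpustat_cpu_fetch(), and the returned action
would be turned into set_cpu_preferred() calls.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these come from the patch. */
enum steal_action {
	STEAL_KEEP,	/* leave the preferred CPU set alone */
	STEAL_SHRINK,	/* steal is high: prefer fewer CPUs */
	STEAL_GROW,	/* steal is low: prefer more CPUs */
};

struct steal_monitor {
	uint64_t prev_steal_ns;		/* cumulative steal at the last sample */
	unsigned int low_pct;		/* at or below this, grow */
	unsigned int high_pct;		/* at or above this, shrink */
	enum steal_action prev;		/* cached previous decision */
};

void steal_monitor_init(struct steal_monitor *m,
			unsigned int low_pct, unsigned int high_pct)
{
	assert(low_pct < high_pct && high_pct <= 100);
	m->prev_steal_ns = 0;
	m->low_pct = low_pct;
	m->high_pct = high_pct;
	m->prev = STEAL_KEEP;
}

/* Steal over one period as a percentage of period_ns * nr_cpus. */
unsigned int steal_pct(uint64_t delta_steal_ns, uint64_t period_ns,
		       unsigned int nr_cpus)
{
	uint64_t total_ns = period_ns * nr_cpus;

	return total_ns ? (unsigned int)(delta_steal_ns * 100 / total_ns) : 0;
}

/*
 * One sampling step. steal_ns is the cumulative steal time summed
 * over all CPUs; the monitor keeps the previous value and works on
 * the delta.
 */
enum steal_action steal_monitor_sample(struct steal_monitor *m,
				       uint64_t steal_ns, uint64_t period_ns,
				       unsigned int nr_cpus)
{
	unsigned int pct = steal_pct(steal_ns - m->prev_steal_ns,
				     period_ns, nr_cpus);
	enum steal_action want = STEAL_KEEP;

	m->prev_steal_ns = steal_ns;

	if (pct >= m->high_pct)
		want = STEAL_SHRINK;
	else if (pct <= m->low_pct)
		want = STEAL_GROW;

	/*
	 * Damping: never reverse direction in back-to-back periods.
	 * Hold for one period instead, then act if the condition persists.
	 */
	if ((want == STEAL_SHRINK && m->prev == STEAL_GROW) ||
	    (want == STEAL_GROW && m->prev == STEAL_SHRINK))
		want = STEAL_KEEP;

	m->prev = want;
	return want;
}
```

With thresholds of 2% and 10%, a period that sees 15% steal asks to
shrink, and a quiet period right after it is held at "keep" rather
than growing straight back. Only the second quiet period grows.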

Thanks,
Yury