Comment by femto

Comment by femto 10 months ago

If you want to see the effect of the real-time kernel, build and run the cyclictest utility from the Linux Foundation.

https://wiki.linuxfoundation.org/realtime/documentation/howt...

It measures and displays the interrupt latency for each CPU core. Without the real-time patch, worst case latency can be double digit milliseconds. With the real-time patch, worst case drops to single digit microseconds. (To get consistently low latency you will also have to turn off any power saving states, as a transition between sleep states can hog the CPU, despite the RT kernel.) Cyclictest is an important tool if you're doing real-time with Linux.
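For anyone curious what such a measurement looks like, here is a minimal sketch of the idea in Python: sleep until a periodic deadline and record how late each wake-up is. It is not cyclictest and not how cyclictest is implemented; the function name `measure_wakeup_latency` and its parameters are made up for illustration, and interpreter overhead means the numbers will be far worse than what the real tool reports.

```python
import time


def measure_wakeup_latency(interval_us=1000, loops=200):
    """Return per-cycle wake-up latencies in microseconds.

    Hypothetical helper: each cycle sleeps until an absolute deadline on the
    monotonic clock, then records how far past the deadline it woke up.
    """
    interval_ns = interval_us * 1000
    latencies_us = []
    next_wake = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining_ns = next_wake - time.monotonic_ns()
        if remaining_ns > 0:
            time.sleep(remaining_ns / 1e9)
        woke_at = time.monotonic_ns()
        # Latency is the gap between the intended and the actual wake-up time.
        latencies_us.append((woke_at - next_wake) / 1000)
        # Advance the deadline by a fixed step so the period does not drift.
        next_wake += interval_ns
    return latencies_us


if __name__ == "__main__":
    samples = measure_wakeup_latency()
    avg = sum(samples) / len(samples)
    print(f"min {min(samples):.1f} us, avg {avg:.1f} us, max {max(samples):.1f} us")
```

The figure to watch is the maximum, not the average: a real-time kernel is about bounding the worst case.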

As an example, if you're doing processing for software defined radio, it's the difference between the system occasionally having "blips" and the system having rock solid performance, doing what it is supposed to every time. With the real-time kernel in place, I find I can do acid-test things, like running GNOME and LibreOffice on the same laptop as an SDR, and the SDR doesn't skip a beat. Without the real-time kernel it would be dropping packets all over the place.

aero-glide2 10 months ago

Interestingly, whenever I touch my touchpad, the worst case latency shoots up 20x, even with the RT patch. What could be causing this? It always happens on core 5.

femto 10 months ago

Perhaps the code associated with the touchpad has a priority greater than the one you used to run cyclictest (80?). Does it still happen if you boost the priority of cyclictest to the highest possible, using the option:
--priority=99
Apply priority 99 with care to your own code. A tight endless loop with priority 99 will override pretty well everything else, so about the only way to escape will be to turn your computer off. Been there, done that :-)
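For your own code, a rough sketch of requesting a real-time priority from Python on Linux might look like the following. The `os.sched_*` calls are from the standard library; the helper names (`fifo_priority_range`, `policy_name`, `try_set_fifo`) and the choice of priority 80 are assumptions for illustration only. Setting a real-time policy normally needs root or CAP_SYS_NICE, so the sketch reports failure instead of raising.

```python
import os


def fifo_priority_range():
    """Return the (min, max) static priority allowed for SCHED_FIFO."""
    return (os.sched_get_priority_min(os.SCHED_FIFO),
            os.sched_get_priority_max(os.SCHED_FIFO))


def policy_name(pid=0):
    """Return a readable name for the scheduling policy of pid (0 = this process)."""
    names = {
        os.SCHED_OTHER: "SCHED_OTHER",
        os.SCHED_FIFO: "SCHED_FIFO",
        os.SCHED_RR: "SCHED_RR",
    }
    policy = os.sched_getscheduler(pid)
    return names.get(policy, f"policy {policy}")


def try_set_fifo(priority, pid=0):
    """Try to switch pid to SCHED_FIFO at the given priority.

    Returns True on success and False if the kernel refuses permission.
    """
    lo, hi = fifo_priority_range()
    if not lo <= priority <= hi:
        raise ValueError(f"priority must be in {lo}..{hi}, got {priority}")
    try:
        # Policy and priority are passed together, so the policy change is explicit.
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    except PermissionError:
        return False
    return True


if __name__ == "__main__":
    print("before:", policy_name())
    # 80 is an arbitrary example value that leaves headroom below the maximum.
    print("switched to SCHED_FIFO:", try_set_fifo(80))
    print("after:", policy_name())
```

Staying below the maximum priority leaves room for something more important to preempt a runaway loop.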

- snvzz 10 months ago
  
  The most important thing is to set the policy, described in sched(7), rather than the priority.
  Note that without setting the priority, the default policy is SCHED_OTHER, the standard one most processes get unless they request otherwise.
  By setting the priority (while not specifying a policy), the policy becomes SCHED_FIFO, a real-time policy under which the task takes the CPU from any normal task immediately and keeps it until it blocks, yields, or is preempted by a higher-priority real-time task.
  This implicit change in policy is why you see such a brutal effect from setting the priority.
  
  femto 10 months ago
  
  Thanks.
  
robocat 10 months ago

Perhaps an SMM ring -2 touchpad driver?
If you're developing anything on x86 that needs realtime - how do you disable SMM drivers causing unexpected latency?

- jabl 10 months ago
  
  Buy HW that can be flashed with coreboot?
  And while it won't (completely) remove SMM, https://github.com/corna/me_cleaner might get rid of some stuff. I think that's more about getting rid of spyware and ring -1 security bugs than improving real-time behavior though.
  
angus-g 10 months ago

Maybe a PS/2 touchpad that is triggering (a bunch of) interrupts? Not sure how hardware interrupts work with RT!

- jabl 10 months ago
  
  One of the features of PREEMPT_RT is that it converts interrupt handlers to running in their own threads (with some exceptions, I believe), instead of being tacked on top of whatever thread context was active at the time like with the softirq approach the "normal" kernel uses. This allows the scheduler to better decide what should run (e.g. your RT process rather than serving interrupts for downloading cat pictures).
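Those interrupt threads are visible from user space as kernel threads named `irq/<number>-<name>`. As a rough sketch (Linux only, with hypothetical helper names `parse_irq_thread_name` and `list_irq_threads`), the following lists them with their scheduling policy and priority, which can then be compared with the priority given to an RT process. They commonly run as SCHED_FIFO at priority 50, but treat that as something to check on your own system, not a guarantee.

```python
import os
import re

# Threaded interrupt handlers are named "irq/<number>-<name>".
IRQ_THREAD_RE = re.compile(r"^irq/(\d+)-(.+)$")


def parse_irq_thread_name(comm):
    """Return (irq_number, handler_name) for an IRQ thread name, else None."""
    match = IRQ_THREAD_RE.match(comm.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def list_irq_threads(proc_root="/proc"):
    """Return (pid, irq, name, policy, priority) tuples for IRQ threads."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as comm_file:
                parsed = parse_irq_thread_name(comm_file.read())
            if parsed is None:
                continue
            pid = int(entry)
            policy = os.sched_getscheduler(pid)
            priority = os.sched_getparam(pid).sched_priority
        except OSError:
            # The thread vanished or cannot be inspected; skip it.
            continue
        found.append((pid, parsed[0], parsed[1], policy, priority))
    return sorted(found, key=lambda item: item[1])


if __name__ == "__main__":
    for pid, irq, name, policy, priority in list_irq_threads():
        kind = "SCHED_FIFO" if policy == os.SCHED_FIFO else f"policy {policy}"
        print(f"irq {irq:>4} {name:<12} pid {pid:<7} {kind} prio {priority}")
```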
  
monero-xmr 10 months ago

Touchpad support is very poor in Linux. I use System76 and the touchpad is always a roll of the dice with every kernel upgrade, despite it being a "good" distro / vendor.

dijit 10 months ago

Quiet reminder that "real-time" is almost best thought of as "consistent-time".

The problem space is such that it doesn't necessarily mean "faster" or lower latency in any way, just that where there is latency, it's consistent.

amiga386 10 months ago

I always viewed it as "the computer needs to control things that are happening in real time and won't wait for it if it's late".

PhilipRoman 10 months ago

Indeed, some of my colleagues worked on a medical device which must be able to reset itself within 10 seconds in case something goes wrong. 10 seconds is plenty of time on average; the real problem is eliminating the remaining 0.01% of cases.

froh 10 months ago

Consistent as in reliably bounded, that is.
