Comment by miki123211 10 months ago

26 replies

Are there any good resources on how this kind of real-time programming is done?

What goes into ensuring that a program is actually realtime? Are there formal proofs, or just experience and "vibes"? Is realtime coding any different from normal coding? How do modern CPU architectures, which have a lot of non-constant time instructions, branch prediction, potential for cache misses and such play into this?

throwup238 10 months ago

> What goes into ensuring that a program is actually realtime?

Realtime mostly means predictable runtime for code. As long as it’s predictable, you can scale the CPU/microcontroller to fit your demands or optimize your code to fit the constraints. It’s about making sure your code can always respond in time to hardware inputs, timers, and other interrupts.

Generally the Linux kernel’s scheduling makes the system very unpredictable. RT Linux tries to address that, along with several other subsystems. On embedded CPUs this usually means disabling advanced features like cache, branch prediction, and speculative execution (although I don’t remember if RT handles that part since it’s very vendor specific).

  • gmueckl 10 months ago

    "Responding in time" here means meeting a hard deadline under any circumstances, no matter what else may be going on simultaneously. The counterintuitive part is that this is about the worst case, not the best case or the average case. So in that code path you might not want a fancy algorithm that has insanely good average runtime but a tiny chance to blow up; you'd rather have one that is slower on average but has tightly bounded worst-case performance.

    Example: you'd probably want the airbags in your car to fire precisely at the right time to catch you and keep you safe rather than blow up in your face too late and give you a nasty neck injury in addition to the other injuries you'll likely get in a hard enough crash.

    • newqer 10 months ago

      Or it fires too soon and you get an explosion to the face and hit your head on the steering wheel.
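A rough way to see the worst-case versus average-case point made above is to count comparisons instead of measuring time. The sketch below is illustrative only: the two sorting routines, the input size, and the function names are assumptions, not anything from the thread. It is not about which sort is faster in practice, only about how far the worst case can sit from the typical case.

```python
import random

def quicksort_comparisons(values):
    """Count comparisons made by a first-element-pivot quicksort (iterative)."""
    comparisons = 0
    stack = [list(values)]
    while stack:
        part = stack.pop()
        if len(part) < 2:
            continue
        pivot, rest = part[0], part[1:]
        comparisons += len(rest)  # every remaining element is compared to the pivot
        stack.append([x for x in rest if x < pivot])
        stack.append([x for x in rest if x >= pivot])
    return comparisons

def mergesort_comparisons(values):
    """Count comparisons made by a top-down merge sort."""
    def sort(xs):
        if len(xs) < 2:
            return xs, 0
        mid = len(xs) // 2
        left, left_count = sort(xs[:mid])
        right, right_count = sort(xs[mid:])
        merged, count, i, j = [], left_count + right_count, 0, 0
        while i < len(left) and j < len(right):
            count += 1
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, count
    return sort(list(values))[1]

n = 512  # arbitrary size chosen for the illustration
random.seed(0)
shuffled = random.sample(range(n), n)
already_sorted = list(range(n))

for label, data in (("shuffled", shuffled), ("sorted", already_sorted)):
    print(label,
          "quicksort:", quicksort_comparisons(data),
          "mergesort:", mergesort_comparisons(data))
```

With this naive pivot choice, already-sorted input drives the quicksort to n(n-1)/2 comparisons, while the merge sort stays within roughly n·log2(n) for any input. A deadline has to be budgeted against the first kind of number, not the typical one.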

juliangmp 10 months ago

I'm not hugely experienced in the field personally, but from what I've seen, actually proving hard real time capabilities is rather involved. If something is safety critical (think brake systems, avionic computers, etc.) it likely means you also need some special certification or even formal verification. And (correct me if I'm wrong) I don't think you'll want to use a Linux kernel, even with the preempt rt patches. I'd say specialized rt operating systems, like FreeRTOS or Zephyr, would be more fitting (though I don't have direct experience with them).

As for the hardware, you can't really use a ‘regular’ CPU and expect completely deterministic behavior. The things you mentioned (and for example caching) absolutely impact this. iirc amd/xilinx actually offer a processor that has both regular arm cores, alongside some arm real time cores for these exact reasons.

actionfromafar 10 months ago

For things like VxWorks, it's mostly vibes and setting priority between processes. But there are other ways. You can "offline schedule" your tasks, i.e. you run a scheduler at compile time which decides all possible supported orderings and how long a slot each task gets to run.

Then, there's the whole thing of hardware. Do you have one or more cores? If you have more than one core, can they introduce jitter or slowdown to each other accessing memory? And so on and so forth.
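As a minimal sketch of the "offline schedule" idea above: the table is computed ahead of time, and the target would only walk it at run time. Everything here is an assumption made for illustration, including the task names, the periods and worst-case execution times (in abstract ticks), and the choice of earliest-deadline-first to fill the slots. It produces one valid ordering rather than all supported orderings, and it ignores multi-core and memory-contention effects.

```python
from functools import reduce
from math import gcd

def build_schedule(tasks):
    """Build a static tick-by-tick schedule over one hyperperiod.

    tasks maps a task name to (period, wcet), both in ticks, with wcet >= 1.
    Each job's deadline is the start of its next period.
    Raises ValueError if any job would miss its deadline.
    """
    hyperperiod = reduce(lambda a, b: a * b // gcd(a, b),
                         (period for period, _ in tasks.values()))
    pending = {}  # name -> [ticks still needed, absolute deadline]
    table = []
    for tick in range(hyperperiod):
        for name, (period, wcet) in tasks.items():
            if tick % period == 0:
                if name in pending:
                    raise ValueError(f"{name} misses its deadline at tick {tick}")
                pending[name] = [wcet, tick + period]
        if pending:
            # Earliest deadline first, decided offline rather than at run time.
            name = min(pending, key=lambda n: pending[n][1])
            table.append(name)
            pending[name][0] -= 1
            if pending[name][0] == 0:
                del pending[name]
        else:
            table.append(None)  # idle slot
    if pending:
        raise ValueError(f"unfinished work at end of hyperperiod: {sorted(pending)}")
    return table

# Hypothetical task set: name -> (period, worst-case execution time) in ticks.
TASKS = {"control": (4, 1), "sensor": (8, 2), "logging": (16, 3)}
schedule = build_schedule(TASKS)
print(schedule)
```

If the task set cannot fit, the build fails before anything ships, which is the main appeal of doing this at compile time.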

  • tonyarkles 10 months ago

    > it's mostly vibes and setting priority between processes

    I'm laughing so so hard right now. Thanks for, among other things, confirming for me that there isn't some magic tool that I'm missing :). At least I have the benefit of working on softer real-time systems where missing a deadline might result in lower quality data but there's no lives at risk.

    Setting and clearing GPIOs on task entry/exit are a nice touch for verification too.

    • nine_k 10 months ago

      Magic? Well, here's some: predictably fast interrupts, critical sections where your code cannot be preempted, but with a watchdog so if your code hits an infinite loop it's restarted, no unpredictable memory allocation delays, no unexpected page fault delays, things like that.

      These are relatively easy to obtain on an MCU, where there's no virtual memory, physical memory is predictable (if slow), interrupt hardware is simple, hardware watchdogs are a norm, and normally there's no need for preemptive multitasking.

      But when you try to make it work in a kernel that supports VMM, kernel / userland privilege separation, user sessions separation, process separation, preemptive multitasking, and has to work on hardware with a really complex bus and a complex interrupt controller, — well, here's where magic begins.

      • aulin 10 months ago

        VMM is one of the few things I really miss while working in embedded. I would happily trade memory allocation errors from a fragmented heap for some unpredictable malloc delay (which could maybe be mitigated with a timeout?).

        • nine_k 10 months ago

          Reminds me of the time of banked memory in 8-bit systems :) It's certainly doable, to some extent, and is a hassle to manage %) I suppose it can be implemented with an MCU + QSPI RAM at a cost of one extra SPI clock to access the RAM through a small SRAM that would store the page translation table.

          I just think that something like A0 (to say nothing of ATMega) usually has too little RAM for it to be worth the trouble, and A7 (something like ESP32) already has an MMU.

      • tonyarkles 10 months ago

        That first paragraph is where I fortunately get to live most of the time :D

  • rightbyte 10 months ago

    > If you have more than one core, can they introduce jitter or slowdown to each other accessing memory?

    DMA and fancy peripherals like UART, SPI etc, could be namedropped in this regard, too.

    • nine_k 10 months ago

      Plot twist: the very memory may be connected via SPI.

wheels 10 months ago

There's some difference between user space and kernel. I don't have much experience in the kernel, but I feel like it's more about making sure tasks are preemptable.

In user space it's often about complexity and guarantees: for example, you really try not to do mallocs in a real-time thread in user space, because the allocator may take a lock or make a system call, so it only returns after an unpredictable amount of time. Better to preallocate buffers or use the stack. Same for opening files, or stuff like that -- you want to avoid variable time syscalls and do them at thread / application setup.

Choice of algorithms needs to be such that, for whatever n you're working with, it can be processed inside one sample generation interval. I'm mostly familiar with audio -- e.g. if you're generating audio at 44100 Hz, you need your algorithms to be able to process each sample in less than about 22.7 microseconds (1/44100 s).
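The arithmetic behind that figure: one sample at 44100 Hz lasts 1/44100 s, a bit under 23 microseconds, and a callback that handles a buffer of N frames gets N times that. A small sketch follows; the buffer sizes are arbitrary examples and `deadline_us` is a name made up for this illustration.

```python
SAMPLE_RATE_HZ = 44_100

def deadline_us(frames_per_buffer, sample_rate_hz=SAMPLE_RATE_HZ):
    """Time available to produce one buffer before the next is due, in microseconds."""
    return frames_per_buffer / sample_rate_hz * 1_000_000

# Hypothetical buffer sizes; real values depend on the audio API and device.
for frames in (1, 64, 256, 1024):
    print(f"{frames:5d} frames -> {deadline_us(frames):9.1f} us")
```

Larger buffers relax the deadline at the cost of latency, but the budget still has to cover the worst-case processing time for the whole buffer.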

  • saagarjha 10 months ago

    Real-time performance is not really possible in userspace unless your kernel is kept in the loop, because preemption can happen at any time.

    • kaba0 10 months ago

      I guess we really have to add whether it is soft or hard realtime we are talking about. The former can be done in userspace (e.g. video games), the latter probably needs a custom OS (I don’t think rt-linux is good for actual hard realtime stuff)

  • dgan 10 months ago

    How do you handle runtime-defined sizes then? Just preallocate the maximum possible number of bytes?

    • wheels 10 months ago

      Well, usually in a realtime system you're required to produce something in a fixed amount of time. Designing the algorithms to not need variable amounts of memory is one of the challenges. Commonly you can have a buffer that's the largest you could reasonably work on in that time slice.
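The "largest buffer you could reasonably work on" approach might look roughly like the sketch below. Python itself allocates behind the scenes, so this only shows the shape of the idea: storage is sized once at setup, and the processing path reuses it and rejects oversized input instead of growing. The class and method names are hypothetical.

```python
from array import array

class PreallocatedBuffer:
    """Fixed-capacity float buffer: sized once at setup, reused afterwards."""

    def __init__(self, max_frames):
        self._storage = array("f", [0.0]) * max_frames  # the only sizing step
        self._length = 0

    @property
    def capacity(self):
        return len(self._storage)

    def fill(self, samples):
        """Copy samples into the existing storage; never grows the buffer."""
        count = len(samples)
        if count > self.capacity:
            raise ValueError(f"{count} frames exceed capacity {self.capacity}")
        for i in range(count):
            self._storage[i] = samples[i]
        self._length = count

    def apply_gain(self, gain):
        """In-place processing over the used part of the buffer."""
        for i in range(self._length):
            self._storage[i] *= gain

    def snapshot(self):
        """Copy of the used part, for inspection outside the time-critical path."""
        return self._storage[:self._length].tolist()

buf = PreallocatedBuffer(max_frames=4)
buf.fill([1.0, 2.0])
buf.apply_gain(0.5)
print(buf.snapshot(), "capacity:", buf.capacity)
```

In C or C++ the same structure would be a statically sized array or a buffer allocated during setup, with the capacity check playing the same role.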

monocasa 10 months ago

There are only a few projects I know of that provide formal proofs wrt their real time guarantees; seL4 being the only public example.

That being said, vibes and the KISS principle can get you remarkably far.

YZF 10 months ago

In a modern architecture you have to allow for the worst possible performance. Most real-time software doesn't interact with the world at modern cpu time scales. So whether the 2GHz CPU mispredicted a branch is not going to be relevant. You just budget for the worst case unless you can guarantee better by design.

rightbyte 10 months ago

On all the real time systems I've worked on, it has just been empirical measurements of cpu load for the different task periods and a good enough margin against overruns.
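A rough sketch of that kind of check, assuming periodic tasks where the load is the sum of execution time over period. The measurements and the 70% limit below are invented for illustration; a real project would plug in its own numbers and threshold.

```python
def cpu_load(tasks):
    """Sum of execution_time / period over all periodic tasks (same time unit)."""
    return sum(exec_time / period for exec_time, period in tasks)

def has_margin(tasks, load_limit):
    """True if the measured load stays at or below the chosen limit."""
    return cpu_load(tasks) <= load_limit

# Hypothetical measurements in milliseconds: (worst observed execution time, period)
MEASURED = [(0.2, 1.0), (1.0, 10.0), (5.0, 100.0)]
LOAD_LIMIT = 0.70  # assumed project-specific "OK CPU load" threshold

print(f"load = {cpu_load(MEASURED):.2%}, ok = {has_margin(MEASURED, LOAD_LIMIT)}")
```

Note that this only bounds the load that was observed. A worst observed execution time is not a proven worst case, which is why the margin matters.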

On an ECU I worked on, the cache was turned off to not have cache misses ... no cache no problem. I argued it should be turned on and the "OK cpu load" limit decreased instead. But nope.

I wouldn't say there is any conceptual difference from normal coding, except that you'd want to be kinda sure algorithms terminate in a reasonable time in a time constrained task. More online algorithms than normally, though.

Most of the strangeness in real time coding is actually about doing control theory stuff is my take. The program often feels like state-machine going in a circle.

  • tonyarkles 10 months ago

    > On an ECU I worked on, the cache was turned off to not have cache misses ... no cache no problem. I argued it should be turned on and the "OK cpu load" limit decreased instead. But nope.

    Yeah, the tradeoff there is interesting. Sometimes "get it as deterministic as possible" is the right answer, even if it's slower.

    > Most of the strangeness in real time coding is actually about doing control theory stuff is my take. The program often feels like state-machine going in a circle.

    Lol, with my colleagues/juniors I'll often encourage them to take code that doesn't look like that and figure out if there's a sane way to turn it into "state-machine going in a circle". For problems that fit that mold, being able to say "event X in state Y will have effect Z" is really powerful for being able to reason about the system. Plus, sometimes, you can actually use that state machine to more formally reason about it or even informally just draw out the states, events, and transitions and identify if there's anywhere you might get stuck.
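A small sketch of the "event X in state Y will have effect Z" idea as a transition table, plus a simple check for states that can never be left. The states and events are made up for the example, and the check is a plain lookup over the table rather than any formal method.

```python
# Hypothetical controller: (current state, event) -> next state
TRANSITIONS = {
    ("idle", "start"): "running",
    ("running", "stop"): "idle",
    ("running", "fault"): "error",
    ("error", "reset"): "idle",
}

def step(state, event, transitions=TRANSITIONS):
    """Apply one event; events with no entry leave the state unchanged."""
    return transitions.get((state, event), state)

def stuck_states(transitions):
    """States with no transition leading to a different state."""
    states = {state for state, _ in transitions} | set(transitions.values())
    can_leave = {state for (state, _), target in transitions.items()
                 if target != state}
    return states - can_leave

state = "idle"
for event in ("start", "fault", "reset"):
    state = step(state, event)
print("final state:", state)
print("stuck states:", stuck_states(TRANSITIONS))
```

Dropping the `("error", "reset")` row makes `stuck_states` report `error`, which is the kind of dead end that drawing out the states and transitions is meant to catch.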

stevemackinnon 10 months ago

Here’s a frequently cited article about real-time audio programming that should be generally applicable to other contexts: http://www.rossbencina.com/code/real-time-audio-programming-... In my experience in audio dev, enforcing hard real-time safety is mostly experience based: knowing to avoid locks, heap allocations, and sys calls from the real-time thread, etc.

candiddevmike 10 months ago

You don't break the electrical equipment/motor/armature/process it's hooked up to.

In rt land, you test in prod and hope for the best.

8bitsrule 10 months ago

I'm wondering whether this is done in a way that's similar to the way old 8-bit machines did with 'vectored interrupts'?

(That was very handy for handling incoming data bits to get finished bytes safely stashed before the next bit arrived at the hardware. Been a -long time- since I heard VI's mentioned.)

chasd00 10 months ago

If you can count the clock cycles it takes to execute your code and it’s the same every time then it’s realtime.