Comment by RealityVoid 4 days ago

> they were given the specs and not allowed to communicate with the other teams in any way during the hw and sw development.

Jeez, it would drive me _up the wall_. Let's say I could somewhat justify the security concerns, but this seems like it severely hampers the ability to design the system. And it seems like a safety concern.

lambdaone 4 days ago

What you are trying to minimize here is the error rate of the composite system, not the error rate of the individual modules. You take it as a given that all the teams are doing their human best to eliminate mistakes from their design. The idea of this is to make it likely that the mistakes that remain are different mistakes from those made by the other teams.

Provided the errors are independent, it's better to have three subsystems with 99% reliability in a voting arrangement than one system with 99.9% reliability.
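
A quick back-of-the-envelope sketch in Python (assuming the failures really are independent, which is the whole point of the separation) of why 2-of-3 voting over 99%-reliable channels beats a single 99.9%-reliable system:

    # 2-of-3 voting: the composite is correct when at least two channels are
    # correct. Assumes channel failures are statistically independent.
    p = 0.99                              # reliability of each channel
    p_voted = 3 * p**2 * (1 - p) + p**3   # exactly two correct, or all three
    print(p_voted)                        # ~0.999702 > 0.999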

  • saltcured 4 days ago

    This seems like it would need some referees who watch over the teams and intrude with, "no, that method is already claimed by the other team, do something else"!

    Otherwise, I can easily see teams doing parallel construction of the same techniques. So many developments seem to happen like this, due to everyone being primed by the same socio-technical environment...

K0balt 4 days ago

The idea was to build three completely different systems to produce the same data, so that an error or problem in one could not be reasonably replicated in the others. In a case such as this, even shared ideas about algorithms could result in undesirable similarities that end up propagating an attack surface, a logic error, or a hardware weakness or vulnerability. The desired result is that the teams solve the problem separately using distinct approaches and resources.
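
As a toy sketch (not the actual Airbus design; the routines here are hypothetical), "distinct approaches" might look like three independently written implementations of the same spec, cross-checked so a wrong answer from any one of them is outvoted by the other two:

    import math

    # Three dissimilar implementations of the same spec: integer square root.
    def isqrt_newton(n: int) -> int:      # "team A": Newton's iteration
        x = n
        while x * x > n:
            x = (x + n // x) // 2
        return x

    def isqrt_bisect(n: int) -> int:      # "team B": binary search
        lo, hi = 0, n
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if mid * mid <= n:
                lo = mid
            else:
                hi = mid - 1
        return lo

    def isqrt_vendor(n: int) -> int:      # "team C": off-the-shelf library
        return math.isqrt(n)

    def voted_isqrt(n: int) -> int:
        # 2-of-3 vote: a wrong answer from a single implementation is masked.
        a, b, c = isqrt_newton(n), isqrt_bisect(n), isqrt_vendor(n)
        if a == b or a == c:
            return a
        if b == c:
            return b
        raise RuntimeError("no majority: all three channels disagree")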

  • Filligree 4 days ago

    And did they?

    Sometimes the solution is obvious, such that if you ask three engineers to solve it you’ll get three copies of the same solution, whereas that might not happen if they’re able to communicate.

    I’m sure they knew what they were doing, but I wonder how they avoided that scenario.

    • addaon 4 days ago

      I can write (and have in the past written) a long explanation of my experience with this, but…

      Redundancy is a tool for reducing the probability of encountering statistical errors, which come from things like single-event upsets (SEUs).

      Dissimilarity is a tool for reducing the “probability” of encountering non-statistical errors (defects, bugs), but it’s a bit of a category error to discuss the probability of a non-probabilistic event: either the bug exists or it does not. At best you can talk about the state coverage that corresponds to its observability, but we don’t sample the state space uniformly.

      There has been a trend in the past few decades, somewhat informed by NASA studies, to favor redundancy as the (only, effective) tool for mitigating statistical errors, but to lean against heavy use of dissimilarity for software development in particular. This is because of a belief that (a) independent software teams implement the same bugs anyway and (b) an hour spent on duplication is better spent on testing. But at the absolute highest level of safety, where development hours are a relatively low cost compared to verification hours, I know it’s still used; and I don’t know how the hardware folks’ philosophy has evolved.
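
      To make the statistical half concrete, a minimal sketch of the classic bitwise 2-of-3 majority voter (a generic illustration, not any particular avionics design) that redundancy relies on to mask a single-event upset in one channel:

          def vote(a: int, b: int, c: int) -> int:
              # Bitwise majority: a bit flipped in any one channel is
              # outvoted by the matching bits in the other two.
              return (a & b) | (b & c) | (a & c)

          good = 0b10110100
          upset = good ^ 0b00001000        # SEU flips one bit in one channel
          assert vote(good, good, upset) == good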

    • cmckn 4 days ago

      Even with the same approach, I imagine the implementations could differ enough to still meet the goal. But I’m also curious whether the differences were actually quantified after the fact; it seems like an important step.

    • sandworm101 4 days ago

      Not at Airbus. Ask a German, a French, and a British engineer the same question and you will never, ever get the same answer from each.

    • K0balt 4 days ago

      I think this would come down to team selection. At Airbus they have the advantage of cultural diversity to lean on; I have no doubt the three systems would differ not only in implementation but also in design philosophy, compromises, and priorities.

lxgr 4 days ago

How so? It’s a safety measure at least as much as a security one.

It’s essentially a very intentional trade-off between groupthink and the wisdom of crowds, but it lands on a very different point on that scale than most other systems.

Arguably the track record of Airbus’s fly-by-wire vindicates that decision.