Fault Tolerance



 

A Duplex approach to fault tolerance doubles the amount of hardware and provides coverage for a single fault.

A Majority Vote approach to fault tolerance triples the amount of hardware and still provides coverage for only a single fault.

Our Fractional approach to fault tolerance requires at most a 33% increase in the amount of hardware and provides coverage for multiple faults.
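The hardware cost of the three approaches can be compared with a little arithmetic. The sketch below only restates the figures given above; the dictionary and function names are illustrative, and the value for the Fractional approach is the stated upper bound rather than a measured one.

```python
# Hardware multipliers relative to an unprotected design, as stated above.
HARDWARE_MULTIPLIER = {
    "duplex": 2.0,         # doubling
    "majority_vote": 3.0,  # tripling
    "fractional": 1.33,    # at most a 33% increase (upper bound)
}

def extra_hardware_percent(scheme):
    """Extra hardware over the unprotected design, in whole percent."""
    return round((HARDWARE_MULTIPLIER[scheme] - 1.0) * 100)

for scheme in HARDWARE_MULTIPLIER:
    print(f"{scheme}: +{extra_hardware_percent(scheme)}% hardware")
```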




A 54-port MetaRouter can be made fault-tolerant by adding another row of routers. This 33% increase in hardware yields a 3X increase in fault coverage.
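The 33% figure is consistent with adding one row to a design that starts with three rows. The sketch below assumes that starting point (inferred from the percentage and from the four rows mentioned later), assumes every row holds the same number of routers, and counts routers only. The function name is hypothetical.

```python
from fractions import Fraction

def added_row_overhead(base_rows, extra_rows=1):
    """Fractional hardware increase from adding rows of equal size."""
    if base_rows <= 0:
        raise ValueError("base_rows must be positive")
    return Fraction(extra_rows, base_rows)

# Assumed base design: three rows, one row added for fault tolerance.
overhead = added_row_overhead(base_rows=3)
print(f"{float(overhead):.0%}")  # 33%
```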

To put this in perspective, our approach to fault tolerance is similar to RAID (redundant array of inexpensive disks). RAID uses an error correction code to redundantly distribute the data over a linear array of disks. Our approach adds a Galois network and a row of routers to redundantly distribute the topology over a two-dimensional array of routers.




Breaking all of the links and routers shown in red will deplete the gene pool; however, complete connectivity will be maintained.

To break a connection you must break at least:

  • 3 links between the first and second rows or

  • 6 links between the second and third rows or

  • 3 links between the third and fourth rows.
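The thresholds in the list above can be written as a simple check. This is a minimal sketch of a literal reading of that list, not a model of the network itself: it reports only whether a given count of broken links per stage reaches any of the stated minimums. The names are illustrative.

```python
# Minimum number of broken links, per inter-row stage, needed before any
# connection can be lost (figures taken from the list above).
MIN_BREAKS = {
    (1, 2): 3,
    (2, 3): 6,
    (3, 4): 3,
}

def connection_may_break(broken_links):
    """Return True if the fault counts reach the threshold in some stage.

    broken_links maps a (row, next_row) stage to a count of broken links.
    False means that, by the stated minimums, every connection survives.
    True only means a break is possible, not that one has occurred.
    """
    return any(
        broken_links.get(stage, 0) >= needed
        for stage, needed in MIN_BREAKS.items()
    )
```

For example, two broken links in the first stage, five in the second, and two in the third stay below every threshold, so the check returns False.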



A fault-tolerant MetaRouter is capable of withstanding catastrophic damage.

All the routers in each column can be placed in the same room, building, etc. All the routers in that column, and all the links to and from it, can then be broken while complete connectivity is maintained between all the remaining inputs and outputs.



A 500-port fault-tolerant MetaRouter requires 2000 internal links, while a router based on a Duplex Banyan or Butterfly requires 8000 internal links.

The duplex design therefore needs four times as many internal links.