
triple redundancy

Status
Not open for further replies.

zaax

Member
Joined
8 Oct 2015
Messages
97
When building an important computer system, triple redundancy is built in, i.e. three processors normally work together, but one can do the job of three if the others break down.

Is this the same with the railway signalling system?
 

ComUtoR

Established Member
Joined
13 Dec 2013
Messages
9,445
Location
UK
You sneeze and the system breaks.

Leaf mulch can prevent a track circuit from operating.

I think you need to define further what you mean by 'redundancy' being built in. The system will tend to fail safe, so when a set of points fails it fails in a specific direction, and that will interlock with other points and signals. That is a form of redundancy, I suppose.
 

Joseph_Locke

Established Member
Joined
14 Apr 2012
Messages
1,878
Location
Within earshot of trains passing the one and half
When building an important computer system, triple redundancy is built in, i.e. three processors normally work together, but one can do the job of three if the others break down.

Is this the same with the railway signalling system?

In SSI there are three duplicated systems - three entirely separate Zilog Z80B processors in fact - and two of these must agree before the system takes action. In this case a complete failure of one processor subsystem can be tolerated, but not two.
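To make the "two of these must agree" rule concrete, here is a minimal sketch of a two-out-of-three vote. It is only an illustration of the principle, not SSI's actual comparison hardware; the function name and the string outputs are made up for the example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def vote_two_of_three(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value at least two of three channels agree on, else None."""
    if len(outputs) != 3:
        raise ValueError("expected exactly three channel outputs")
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None


# One faulty channel is outvoted by the other two.
print(vote_two_of_three(["clear", "clear", "danger"]))
# With no majority, nothing is commanded (None).
print(vote_two_of_three(["clear", "danger", "fault"]))
```

Returning `None` when there is no majority matches the point made above: one failed channel can be tolerated, two cannot.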

I'm not sure how CBI achieves its SIL compliance - you should remain on the platform and await the arrival of an interlocking technologist.
 

MarkyT

Established Member
Joined
20 May 2012
Messages
6,250
Location
Torbay
The successors to SSI are Westlock and Smartlock, and these use a similar technique but with more modern hardware. Not sure about other suppliers, although SSI and its derivatives almost completely dominate the mainline market. Note the triple redundancy only applies to the central interlocking modules, not the distributed trackside function modules (TFMs), which have only duplicated processors. A failure to agree in a TFM will result in a signals-at-red shutdown for that module, but will only affect the small quantity of trackside equipment that the one module interfaces to: usually no more than two signals or two sets of points.
 

edwin_m

Veteran Member
Joined
21 Apr 2013
Messages
24,920
Location
Nottingham
SSI actually has hardware to shut down the processor that disagrees with the other two and if only two remain to shut down both if they disagree. This is done by blowing a fuse I think. The data links between the processor and the trackside are also duplicated for reliability, but the data on each link is digitally encoded so it won't do anything unsafe if it is corrupted or, for example, connected to the trackside equipment for a different interlocking.
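As a rough sketch of why encoded data stays safe when corrupted or misrouted: if every message carries the identity of the interlocking and the addressed module plus a check value, a receiver can simply ignore anything that fails the check or is not meant for it. The frame layout, field sizes and the use of CRC-32 below are assumptions for illustration, not the real SSI telegram format.

```python
import struct
import zlib
from typing import Optional


def encode(interlocking_id: int, module_addr: int, command: int) -> bytes:
    """Build a hypothetical frame: three one-byte fields followed by a CRC-32."""
    payload = struct.pack(">BBB", interlocking_id, module_addr, command)
    return payload + struct.pack(">I", zlib.crc32(payload))


def decode(frame: bytes, my_interlocking: int, my_addr: int) -> Optional[int]:
    """Return the command only if the frame is intact and addressed to us."""
    if len(frame) != 7:
        return None
    payload = frame[:3]
    (crc,) = struct.unpack(">I", frame[3:])
    if zlib.crc32(payload) != crc:
        return None  # corrupted in transit: take no action
    interlocking_id, module_addr, command = struct.unpack(">BBB", payload)
    if (interlocking_id, module_addr) != (my_interlocking, my_addr):
        return None  # meant for a different interlocking or module
    return command


frame = encode(interlocking_id=5, module_addr=12, command=2)
print(decode(frame, my_interlocking=5, my_addr=12))  # accepted
print(decode(frame, my_interlocking=6, my_addr=12))  # wrong interlocking: ignored
```

The receiver never tries to "repair" a bad frame; it just does nothing with it, which is the safe outcome.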
 

Bletchleyite

Veteran Member
Joined
20 Oct 2014
Messages
97,877
Location
"Marston Vale mafia"
When building an important computer system, triple redundancy is built in, i.e. three processors normally work together, but one can do the job of three if the others break down.

Is this the same with the railway signalling system?

Provided it fails safe, there isn't really much need other than for reliability - "all trains stop now" is a safe if annoying failure outcome.

An Airbus's control systems are different - the plane can't just stop if there is a failure.

Edit: Though other more knowledgeable posters on this matter have confirmed it is indeed used.
 

edwin_m

Veteran Member
Joined
21 Apr 2013
Messages
24,920
Location
Nottingham
Provided it fails safe, there isn't really much need other than for reliability - "all trains stop now" is a safe if annoying failure outcome.

An Airbus's control systems are different - the plane can't just stop if there is a failure.

Edit: Though other more knowledgeable posters on this matter have confirmed it is indeed used.

Indeed. In aviation it's probably more important that a system keeps working even if it isn't quite doing the right thing - the pilot can usually compensate. For a railway signalling system if it can't be guaranteed to do the right thing it shouldn't do anything. Doing the wrong thing may not be evident to anyone until too late, and could have catastrophic results.
 

snowball

Established Member
Joined
4 Mar 2013
Messages
7,738
Location
Leeds
I assume it's no use trying to triplicate the unit that compares the outputs of the triplicated processors. :D
 

MarkyT

Established Member
Joined
20 May 2012
Messages
6,250
Location
Torbay
SSI actually has hardware to shut down the processor that disagrees with the other two and if only two remain to shut down both if they disagree. This is done by blowing a fuse I think. The data links between the processor and the trackside are also duplicated for reliability, but the data on each link is digitally encoded so it won't do anything unsafe if it is corrupted or, for example, connected to the trackside equipment for a different interlocking.

It's a kind of enhanced Mexican standoff in classic SSI. Each of the three MPMs (main processor modules) has some built-in hardware for checking its own output against the other two. If the outputs differ, the minority module attempts to blow its own fuse if it's capable. The other two modules are also hardwired to be able to kill the dissenting module, and both will attempt this if they detect a disagreement. The two remaining modules, if they survive the shoot-out, are capable of running the railway alone, but alarms are generated to summon technical assistance, because if any further disagreement is detected by either module, both are eliminated and the interlocking shuts down. Signals go to red. Points are immobile.
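The shoot-out logic can be sketched as a small function that takes the modules still alive and their outputs for one cycle, and returns who survives. This is a software caricature of what is really done in hardware with fuses; the module names are made up, and treating a three-way split as a full shutdown is my assumption rather than something stated above.

```python
from typing import Dict, Set


def resolve_disagreement(outputs: Dict[str, str], alive: Set[str]) -> Set[str]:
    """Return the set of modules still alive after one comparison cycle.

    outputs maps module name -> the output it computed this cycle.
    An empty result means the interlocking has shut down.
    """
    live_outputs = {name: outputs[name] for name in alive}
    distinct = set(live_outputs.values())
    if len(distinct) <= 1:
        return set(alive)  # everyone agrees, nothing to do
    if len(alive) == 3 and len(distinct) == 2:
        # Exactly one dissenter: it is killed, the other two carry on.
        values = list(live_outputs.values())
        minority = next(n for n, v in live_outputs.items() if values.count(v) == 1)
        return alive - {minority}
    # Two modules left and they disagree (or, by assumption, a three-way
    # split): both are eliminated and everything goes to its safe state.
    return set()


alive = {"MPM_A", "MPM_B", "MPM_C"}
alive = resolve_disagreement({"MPM_A": "x", "MPM_B": "x", "MPM_C": "y"}, alive)
print(sorted(alive))  # C has been voted out
alive = resolve_disagreement({"MPM_A": "x", "MPM_B": "y", "MPM_C": "y"}, alive)
print(sorted(alive))  # A and B now disagree: shutdown
```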

Central Interlocking - Duplicated for safety, triplicated for reliability.

Trackside Datalinks - Duplicated for reliability. A and B links are also preferably fed along the trackside from nodes at opposite ends of the scheme so even if both cables are cut midway all trackside objects remain connected to the interlocking by one or the other link. Similar architecture is preferred for equipment power supplies.

Trackside Function Modules (Distributed I/O) - Duplicated for safety alone. As to reliability, failure of one module in an interlocking otherwise fully functional is considered acceptable.

Other Control Centre Equipment - Systems usually contain duplicated boards or modules for reliability.
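To illustrate the point about feeding the A and B datalinks from opposite ends, here is a toy model. It assumes the trackside modules sit in a single line, numbered from 0, with link A fed from the low-numbered end and link B from the high-numbered end; a "cut" at k means the cable is severed between module k and module k+1. The function and its parameters are invented for the example.

```python
from typing import List, Optional


def reachable(n_modules: int, cut_a: Optional[int], cut_b: Optional[int]) -> List[bool]:
    """For each module, report whether it can still be reached on link A or B.

    cut_a / cut_b = k means that link is severed between module k and k+1;
    None means the link is intact.
    """
    result = []
    for i in range(n_modules):
        on_a = cut_a is None or i <= cut_a  # link A is fed from the low end
        on_b = cut_b is None or i > cut_b   # link B is fed from the high end
        result.append(on_a or on_b)
    return result


# Both cables cut at the same spot (e.g. a digger through the trough):
print(reachable(6, cut_a=2, cut_b=2))
# Two separate cuts at different places can still isolate the modules between them:
print(reachable(6, cut_a=1, cut_b=3))
```

If both links were fed from the same end, a single dig through both cables would isolate everything beyond it, which is why the opposite-ends arrangement is preferred.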
 

rf_ioliver

Member
Joined
17 Apr 2011
Messages
868
When building an important computer system, triple redundancy is built in, i.e. three processors normally work together, but one can do the job of three if the others break down.

Quick rough answers:

If you are specifically referring to redundancy, then in the simplest system all three processors will perform the same job, and voting circuitry will ensure that the "correct" result is given if one processor fails.

You can get redundancy in other ways, e.g. providing overcapacity, as might be done in certain "cloud" scenarios. Anyway, there are many, many forms of system redundancy. I think in your example, however, you're referring to a parallel processing system in which load can be shared - that's a different case from fault tolerance, which is really the case for railways.

There are a number of issues in such systems; this article gives a good overview of the most famous of these: https://en.wikipedia.org/wiki/Byzantine_fault_tolerance

Then someone mentioned avionics and fail-safe. In a simple system, e.g. a signal, if something fails then the system fails to a safer state, e.g. if a yellow bulb fails in a 4-aspect signal showing double yellow, then either no aspect is shown or only a single yellow is shown, both of which are more restrictive than a double yellow.
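The signal example can be written down as a tiny model: each aspect needs certain lamps, and whatever is left after a lamp failure must never be less restrictive than what was asked for. The lamp names and ranking are made up for the sketch, a dark signal is assumed to be treated like red, and real signals have lamp-proving circuitry that this ignores.

```python
from typing import Set

# Lamps needed for each aspect of a simplified 4-aspect signal.
LAMPS = {
    "red": frozenset({"R"}),
    "single_yellow": frozenset({"Y1"}),
    "double_yellow": frozenset({"Y1", "Y2"}),
    "green": frozenset({"G"}),
}
# Lower number = more restrictive; a dark signal is treated like red.
RANK = {"dark": 0, "red": 0, "single_yellow": 1, "double_yellow": 2, "green": 3}


def seen_by_driver(requested: str, failed: Set[str]) -> str:
    """Return what is actually displayed when some lamps have failed."""
    lit = LAMPS[requested] - failed
    if not lit:
        return "dark"
    if lit == LAMPS[requested]:
        return requested
    # Only double yellow uses two lamps, so a partial display is one yellow.
    return "single_yellow"


print(seen_by_driver("double_yellow", {"Y2"}))
print(seen_by_driver("double_yellow", {"Y1", "Y2"}))
```

In this model no combination of lamp failures can ever show a less restrictive aspect than the one requested.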

You can get variations of this, eg: circuitry which ensures that a red is shown for any failure.

The main point here is to get a system to fail *gracefully*. For example, in Airbus the general principle is that the system under failure returns more and more control, gracefully, to the pilot, eventually leaving the pilot with full control. And yes, Airbus aircraft can be flown fully manually.

One interesting point to note is that in such systems pretty much everything is done to avoid the system giving up and handing all control over to a human at once.

Big topic to discuss, if you have anything specific let me know by PM or reply here,

t.

Ian
 

mark-h

Member
Joined
14 Jan 2015
Messages
374
An Airbus's control systems are different - the plane can't just stop if there is a failure.

I think Airbus have the critical soft/firmware programmed by 3 different companies to reduce the risk of a bug causing an issue.

Three processors running the same software would give the same (wrong) response if there was a code issue.
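A toy demonstration of that common-mode problem: a majority voter cannot catch a bug that all three channels share, but it can outvote a bug that lives in only one independently written version. The functions here are deliberately trivial stand-ins and say nothing about how any real avionics software is built.

```python
from collections import Counter


def correct_square(n: int) -> int:
    return n * n


def buggy_square(n: int) -> int:
    # Deliberate bug for illustration: wrong sign for negative inputs.
    return n * n if n >= 0 else -(n * n)


def majority(outputs):
    """Return the value at least two channels agree on, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None


# Same buggy software on all three processors: unanimous and wrong.
print(majority([buggy_square(-3), buggy_square(-3), buggy_square(-3)]))
# Independently written versions, only one with the bug: it is outvoted.
print(majority([correct_square(-3), correct_square(-3), buggy_square(-3)]))
```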
 

najaB

Veteran Member
Joined
28 Aug 2011
Messages
30,820
Location
Scotland
I think Airbus have the critical soft/firmware programmed by 3 different companies to reduce the risk of a bug causing an issue.

Three processors running the same software would give the same (wrong) response if there was a code issue.
That does then introduce the problem of three different but equally valid solutions to the same set of inputs, initial condition and desired outcome. For example, if you're at FL20 and want to go to FL25 one program might calculate the minimum time solution while another calculates minimum fuel and the third does something somewhere in between.

I don't know about Airbus, but in other applications the three computers have to run identical software for the majority voting system to work. There will be a fourth, completely isolated backup system running a different code stack which takes over in the case of a system freeze/crash on the main computers.

Edit to add: What's more, three different code stacks make bugs *more* likely rather than less, as that's three times as much code to test and validate.
 

asylumxl

Established Member
Joined
12 Feb 2009
Messages
4,260
Location
Hiding in your shadow
In SSI there are three duplicated systems - three entirely separate Zilog Z80B processors in fact - and two of these must agree before the system takes action. In this case a complete failure of one processor subsystem can be tolerated, but not two.

I'm not sure how CBI achieves its SIL compliance - you should remain on the platform and await the arrival of an interlocking technologist.

Amazing how prolific Zilog Z80 variants still are!
 

asylumxl

Established Member
Joined
12 Feb 2009
Messages
4,260
Location
Hiding in your shadow
Now you mention it, they are actually Motorola 6800 variants in SSI not Z80 - even older I think!
Makes perfect sense I suppose. Stability is one of the most important qualities of an embedded system and the software/hardware combo must be pretty stable by now!
 

Tim M

Member
Joined
9 Jul 2016
Messages
182
The WESTRACE Computer Based Interlocking (now Siemens Trackguard WESTRACE) system uses a single processor with a true-and-complementary system to achieve safety to Safety Integrity Level 4. The Mk2 system has redundancy capabilities by duplicating both the interlocking module and the various input and output modules, with, as the link below says, hot-swap capabilities.
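For anyone unfamiliar with the general idea of true-and-complement checking, here is a generic sketch: every value is held twice, once as-is and once bit-inverted, and it is only trusted if the two copies are still exact complements. This is my own illustration of the textbook technique on 8-bit values, not a description of how WESTRACE implements it internally.

```python
from typing import Tuple

MASK = 0xFF  # 8-bit values, chosen only to keep the example small


def store(value: int) -> Tuple[int, int]:
    """Return the value together with its bitwise complement."""
    return value & MASK, ~value & MASK


def load(true_copy: int, comp_copy: int) -> int:
    """Return the value only if both copies are still exact complements."""
    if true_copy ^ comp_copy != MASK:
        raise ValueError("true/complement mismatch: go to safe state")
    return true_copy


t, c = store(0x5A)
print(hex(load(t, c)))
try:
    # A fault forcing bit 0 to zero in both copies breaks the relationship.
    load(t & 0xFE, c & 0xFE)
except ValueError as err:
    print(err)
```

The attraction is that a fault which disturbs both copies in the same way still shows up, because the two copies should never have the same bit in the same position.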

WESTRACE has been in service in many countries around the world for about 25 years.

https://www.mobility.siemens.com/mo...-interlockings/trackguard-westrace-mk2-en.pdf
 