
Cambrian line 20 Oct 2017: loss of ERTMS speed restrictions. RAIB report released

Status
Not open for further replies.

Nean

Member
Joined
28 Dec 2013
Messages
158
Location
Sheffield
1. A signal lamp for a red aspect.
2. TPWS.

Correct me if I'm wrong, but I was of the understanding that if there's a signal lamp failure (i.e. a blank signal) then that's automatically interpreted as a "Danger", in addition to being preceded by cautionary signals and the AWS magnet still being active making it subsequently a human failure if the driver overrides the AWS and carries on?
 

Wilts Wanderer

Established Member
Joined
21 Nov 2016
Messages
2,463
No arguments that it was a wrong side failure.

A wire count is a much simpler task than deciphering millions of lines of code but you are correct that one simple error can cause a dangerous situation.

This is one of the reasons why designing interlocking for modern signalling installations is so fraught with difficulty and delay. In the past it was straightforward to design a relay system with sufficient safeguards to achieve acceptable safety levels. In effect it is engineering, a tangible series of physical items, electrical circuitry etc.

Problem is, now everything is software and extremely specialised. Signalling engineers need to be software engineers as well. How exactly do you prove the safety of the interlocking logic in a system involving processor cores and memory? Presumably there is a redundancy aspect; for instance engineer the system so three separate identical ‘brains’ compute the same problem and compare solutions. (I’m not an expert here, just speculating.)
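The "three brains compare solutions" idea speculated about above can be sketched as a simple 2-out-of-3 vote. This is only an illustration under stated assumptions: the names `vote` and `RESTRICTIVE` and the aspect strings are hypothetical, not taken from any real interlocking product. The sketch falls back to the most restrictive output when the channels cannot agree.

```python
from collections import Counter
from typing import Sequence

# Hypothetical most-restrictive output, used whenever the channels disagree.
RESTRICTIVE = "danger"


def vote(outputs: Sequence[str], quorum: int = 2) -> str:
    """Return the value at least `quorum` channels agree on.

    If there is no such agreement (or no outputs at all), return the
    restrictive default instead of guessing.
    """
    if not outputs:
        return RESTRICTIVE
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= quorum else RESTRICTIVE


# Two channels agree, so the odd one out is outvoted.
print(vote(["proceed", "proceed", "danger"]))
# No two channels agree, so the restrictive default is used.
print(vote(["proceed", "caution", "danger"]))
```

Note that a vote like this only catches faults that affect the channels differently. If all three identical channels run the same faulty logic, or are fed the same bad data, they will agree with each other and the vote passes.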
 

Dave1987

On Moderation
Joined
20 Oct 2012
Messages
4,563
Waterloo?
Cardiff East Junction?
Watford tunnel?
Broad Oak level crossing, Kent?

That's just from RAIB reports published in 2017.

Waterloo was an installation error, so was Cardiff. Watford was an embankment collapse, nothing to do with signalling. Broad Oak the system was not put back into the correct state. Apart from Watford all are attributed to human error in some way because of time pressures etc. This failure of the digital signalling is because there seems to be a fundamental flaw somewhere in the whole system which is very scary. The others could not fail safe because they weren’t installed or put back correctly.
 

Dave1987

On Moderation
Joined
20 Oct 2012
Messages
4,563
The utterly scary thing about all this is there is clearly a bug in the coding. It could be buried in millions of lines of code. There is every possibility that there could be many many more bugs in the code which have yet to be discovered. This whole thing has not given me any confidence that digital signaling will be any better than what we already have.
 

Railsigns

Established Member
Joined
15 Feb 2010
Messages
2,488
Correct me if I'm wrong, but I was of the understanding that if there's a signal lamp failure (i.e. a blank signal) then that's automatically interpreted as a "Danger", in addition to being preceded by cautionary signals and the AWS magnet still being active making it subsequently a human failure if the driver overrides the AWS and carries on?

You're not wrong (although a failed signal lamp may hold the preceding signal at red), but the fact remains that a signal lamp that's gone out when it should be lighting a red aspect has not failed safe. This is an example of a 'protected wrong side failure', for the reasons you've given.
 

Wilts Wanderer

Established Member
Joined
21 Nov 2016
Messages
2,463
You're not wrong (although a failed signal lamp may hold the preceding signal at red), but the fact remains that a signal lamp that's gone out when it should be lighting a red aspect has not failed safe. This is an example of a 'protected wrong side failure', for the reasons you've given.

Precisely - the obvious test case here: what happens if an aspect is blank at night?
 

Railsigns

Established Member
Joined
15 Feb 2010
Messages
2,488
1. Yes it does. A blank aspect is the same as a red to a driver.

That doesn't make it failsafe. A lamp that's not lit when it should be showing red hasn't failed safe.

2. TPWS test is checked every time a driver opens their desk in the cab.

TPWS track equipment isn't failsafe. If a loop fails to energise, all protection is lost.
 

Dave1987

On Moderation
Joined
20 Oct 2012
Messages
4,563
That doesn't make it failsafe. A lamp that's not lit when it should be showing red hasn't failed safe.

It has because as a driver you are trained to deal with that by stopping immediately.

TPWS track equipment isn't failsafe. If a loop fails to energise, all protection is lost.

TPWS is only there to protect against a driver error or incapacitation. TPWS isn’t the main protection, the driver is. In this case the ETCS in-cab signalling is the main protection for those speed restrictions and it failed. It’s the whole mantra of some that technology is infallible and humans are defunct that annoys me about all this. Software is only as good as the human who programmed it.
 

Railsigns

Established Member
Joined
15 Feb 2010
Messages
2,488
It has because as a driver you are trained to deal with that by stopping immediately.

I assure you, a signal that's gone dark when it should be displaying a red aspect has not failed safe. Driver training regarding such an occurrence is a form of mitigation, that's all.

TPWS is only there to protect against a driver error or incapacitation. TPWS isn’t the main protection, the driver is. In this case the ETCS in-cab signalling is the main protection for those speed restrictions and it failed. It’s the whole mantra of some that technology is infallible and humans are defunct that annoys me about all this. Software is only as good as the human who programmed it.

So, we're in agreement that TPWS isn't failsafe then?
 

Bald Rick

Veteran Member
Joined
28 Sep 2010
Messages
29,070
I assure you, a signal that's gone dark when it should be displaying a red aspect has not failed safe. Driver training regarding such an occurrence is a form of mitigation, that's all.

Indeed, and I am aware of an incident where a driver was talked past a red signal, missed the next (blank) signal (that no one knew was blank, as an auto), and nearly collided with a train in front.
 

ComUtoR

Established Member
Joined
13 Dec 2013
Messages
9,399
Location
UK
I assure you, a signal that's gone dark when it should be displaying a red aspect has not failed safe. Driver training regarding such an occurrence is a form of mitigation, that's all.

Whilst I agree, isn't failing to a blank state still considered 'safe'? Failing unsafe would mean showing a proceed aspect when it should be showing a red. Can they do that? I have seen a signal do something very weird but I have never seen a clear into an occupied section. Anecdotally I have heard of it but I considered it more of an urban myth.
 

Bald Rick

Veteran Member
Joined
28 Sep 2010
Messages
29,070
Whilst I agree, isn't failing to a blank state still considered 'safe'? Failing unsafe would mean showing a proceed aspect when it should be showing a red. Can they do that? I have seen a signal do something very weird but I have never seen a clear into an occupied section. Anecdotally I have heard of it but I considered it more of an urban myth.

It definitely happens, albeit rarely. Usually a wiring error, or more rarely component degradation in the signal or in the controlling loc case.
 

Railsigns

Established Member
Joined
15 Feb 2010
Messages
2,488
Whilst I agree, isn't failing to a blank state still considered 'safe'?

A blank signal that should be at red is less safe than a signal showing red (it's more likely that a train will pass it). 'Failsafe' means reverting to the safer option upon failure, i.e. lit at red in the case of the example given.

Failing unsafe would mean showing a proceed aspect when it should be showing a red. Can they do that ? I have seen a signal do something very weird but I have never seen a clear into an occupied section. Anecdotally I have heard it but I considered it more of an urban myth.

It could happen with the right combination of circumstances, for example a signal failing to return to red behind a train, because of a wrong-side track circuit failure ahead.
 

ComUtoR

Established Member
Joined
13 Dec 2013
Messages
9,399
Location
UK
A blank signal that should be at red is less safe than a signal showing red (it's more likely that a train will pass it). 'Failsafe' means reverting to the safer option upon failure, i.e. lit at red in the case of the example given.

Granted, but failing to blank is far, far safer than failing to proceed. So I can understand the perception that it still fails 'safe', just not the 'safest'.

A quick google of the term 'fail-safe' gives: 'designed to return to a safe condition in the event of a failure or malfunction', and, in engineering, 'a design feature or practice that in the event of a specific type of failure inherently responds in a way that will cause no or minimal harm to other equipment, the environment or to people'.

Failing blank is both a safe condition and minimal (restricted on approach, Driver mitigation etc.)

I would also agree with Dave in that Red/Blank mean exactly the same. I would also argue that more Drivers pass Red than Blank.

I'm not here to argue over the semantics or play the pedant game and I can see both sides. So I will agree to disagree and thank you for the insight.

However; clear, when occupied is certainly not fail safe and not something I ever hope to see.
 

Railsigns

Established Member
Joined
15 Feb 2010
Messages
2,488
I would also argue that more Drivers pass Red than Blank.

I don't think anyone would disagree with you. It's true because drivers encounter far more red signals than they do blank signals.
 

Railsigns

Established Member
Joined
15 Feb 2010
Messages
2,488
The railway doesn't do things based on what's "likely" to happen (or not happen).

Sometimes it does. Risk assessments are an important part of the signalling design process nowadays. The two components of risk are magnitude and probability.
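As a rough illustration of the two components mentioned above, risk is commonly scored as probability multiplied by magnitude. The sketch below assumes that simple product. The hazard names and all the numbers are made up for the example and do not come from any real signalling risk assessment.

```python
def risk(probability_per_year: float, magnitude: float) -> float:
    """Score a hazard as probability multiplied by magnitude of harm."""
    return probability_per_year * magnitude


# Hypothetical hazards: (probability per year, magnitude in arbitrary units).
hazards = {
    "frequent_minor": (0.5, 1.0),
    "rare_severe": (0.001, 1000.0),
}

# Rank hazards from highest to lowest risk score.
ranked = sorted(hazards, key=lambda name: risk(*hazards[name]), reverse=True)
print(ranked)
```

With these made-up figures the rare but severe hazard scores higher than the frequent minor one, which is why something unlikely can still deserve attention.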
 
Joined
7 Feb 2008
Messages
285
Someone's tying themselves in knots here. Signals not showing an aspect are referred to as 'dark'

A signal can be dark for a variety of reasons. It is very serious and must be treated as an instruction not to proceed if it's a stop signal and proceed at caution if a distant. It has not failed safe but is a kind of wrong side failure. In the event of a power failure trains can be talked from signal to signal by phone, pilotman or time-interval, whichever system is implemented by the signaller.

It's all highly unusual and an emergency. Thankfully train radio can help to stop movements more quickly than in the past.

Red and dark are not the same. The train must stop for both but red indicates some or all protection is in place, whereas dark could mean no power, track circuit failures, detection problems etc.
 

sbt

Member
Joined
12 Oct 2011
Messages
268
The utterly scary thing about all this is there is clearly a bug in the coding.

Not necessarily. It could be a data and/or maintenance failure. A recent aircraft crash was due to blanking of a data table which then interacted with a safety feature and normal crew actions to cause engine shutdown on three (of four) engines. The issue was an improperly performed software update on the three engines - the instructions were not followed.

Ultimately the problem was one of 'incomplete assembly' of the software load and a safety feature that precluded recovery from, or mitigation of, the consequences of that error. Similar interactions could have occurred with mechanical and/or hydraulic links, levers and cams if they had been designed to act in similar ways (which were, fundamentally, quite simple) and someone had, say, replaced a part and not put a pivot pin back in the mechanism.

Code is generally tested against the design. Just as with a mechanical system if the design has flaws problems will occur without errors in the code. Similarly if parts of it, or the data (including data feeds) it relies upon, are missing then no error in coding is required for an accident to occur.

The sort of distinction I am trying to make is that between an accident due to improper design of a braking system, an accident due to improper manufacturing or use of incorrect or flawed material (equivalent to a code error) in the brake system, or an accident due to improper operation or maintenance of the brake system. The 2015 crash I reference above was caused by a combination of, firstly, improper maintenance (a component left out) and, secondly, an arguable design flaw in that a 'fail safe' mechanism that prevented a generally very inadvisable crew response was designed under the assumption that only one engine out of four would suffer a maintenance or other failure at any one time.

Just as in an investigation that does not involve software, you need to check 'was the device in its designed state' and 'is the design implicated' as well as 'is the code improperly constructed' (which may actually not be necessary if you find the culprit elsewhere). In the last case, the 'shape' of the accident gives you clues as to where to look in the code, just as it does for incorrect construction and materials, and where and how to test and forensically examine them (the RAIB doesn't go slicing every train that has an accident into molecule-thin slices and examining them all under an electron microscope). There may be 'millions' of lines of code but probably only hundreds or a few thousand will be looked at in detail. Before that, the code equivalent of 'test rigs' will be used to narrow down where any code or data problem lies.

Someone will inevitably mention self checking code. I do not work with Safety Critical software (although I know someone who has) but I do know that self checking in software can only go so far before the checking itself introduces potential failure paths, suffers from its own failures or overloads the system. Once your accident prevention systems and routines reach a certain level of complexity they begin to introduce failure possibilities into the system. This is, at least as I understand it, the key fundamental, or at least one of the fundamental, concepts behind the idea of 'Normal Accidents' - there is a minimum accident rate for any complex system that cannot be further reduced as once it has been reached every effort to reduce it in fact creates new ways for it to fail.
 

HSTEd

Veteran Member
Joined
14 Jul 2011
Messages
16,635
As a driver you know where signals are so you know something is wrong if you don’t see any light at all at night.

Unfortunately this is not foolproof - numerous drivers have lost situational awareness in dark conditions over the years.
For example that Class 92 driver that ended up rolling back on Shap without realising.
 

philthetube

Established Member
Joined
5 Jan 2016
Messages
3,749
CLJ was not an engineered system that failed. It was human error. No ifs and buts. That failure was caused by rogue wires not being cut back after changes to the system.

Waterloo was an installation error, so was Cardiff. Watford was an embankment collapse, nothing to do with signalling. Broad Oak the system was not put back into the correct state. Apart from Watford all are attributed to human error in some way because of time pressures etc. This failure of the digital signalling is because there seems to be a fundamental flaw somewhere in the whole system which is very scary. The others could not fail safe because they weren’t installed or put back correctly.

They failed, so they are not failsafe. Why doesn't matter, except that things should be put in place and lessons learnt to ensure that it doesn't happen again. All the way through railway history there are examples of failures which have led to changes and improvements.
 

Chris M

Member
Joined
4 Feb 2012
Messages
1,057
Location
London E14
Perhaps a better analogy for the ERTMS failure situation is the signal post that fell over and obstructed the adjacent tracks a few years ago, while still remaining electrically connected to the signalling system and with all lamps functioning as intended.
The mode of failure was not anticipated and so there was no provision made for detecting whether it had happened or not. There was plenty of error checking in the system that performed exactly as designed and correctly reported that the errors they were designed to detect were not present yet the system still failed to an unsafe condition.
 

sw1ller

Established Member
Joined
4 Jan 2013
Messages
1,567
A signal going from red to “dark” or blank is clearly not failsafe. Failsafe in its very wording means if it fails, it’s safe! This simply isn’t the case. I shall go past 4 signals tomorrow night that will be blank/dark. If it’s mega foggy like it can be, I won’t even know I’ve gone past them, they have no AWS or TPWS so there’s nothing to warn me of their approach. I know they’re there somewhere but if they’re not lit up, I’m not bothered about them.

My point is, should a signal normally show an aspect, and then it wasn't showing one, there’s every chance a driver could go past it, rendering the whole situation unsafe. I don’t know why anyone’s even arguing little details. It’s very simple.
 