
Cambrian line 20 Oct 2017: loss of ERTMS speed restrictions. RAIB report released

Status
Not open for further replies.

Muzer

Established Member
Joined
3 Feb 2012
Messages
2,773
Presumably even with ETCS the drivers still have to read Weekly Operating Notices (WONs) which contain details of TSRs? This would mean that the driver would at least have some way of recognising that there should be a TSR in place when there isn't, though I'll grant that something you read in a notice at the start of the week is quite likely to be forgotten in the actual moment.
 

Llanigraham

On Moderation
Joined
23 Mar 2013
Messages
6,103
Location
Powys
But the RAIB report says the signaller's indications showed the speeds as being correctly applied. So to me that says the RBC (Radio Block Centre) was not transmitting the revised/amended speed profiles to the trains, essentially ignoring that there was a slower speed in place.

At long last, someone gets it!
 

Llanigraham

On Moderation
Joined
23 Mar 2013
Messages
6,103
Location
Powys
Presumably even with ETCS the drivers still have to read Weekly Operating Notices (WONs) which contain details of TSRs? This would mean that the driver would at least have some way of recognising that there should be a TSR in place when there isn't, though I'll grant that something you read in a notice at the start of the week is quite likely to be forgotten in the actual moment.

Which is exactly what happened.
 

Chris M

Member
Joined
4 Feb 2012
Messages
1,057
Location
London E14
Indeed, as far as I can make out, the humans in the operating centre had no way of knowing that there was anything wrong until the report was received from the driver. It was only when actively investigating that report that it became apparent that this was affecting multiple trains (we don't know how this became apparent though). Prior to this point there were no human errors made by anyone in the control centre. However, after this point it is possible that they should have halted all trains until the failure was resolved. This is under investigation by the RAIB, so it is not for me to say whether their actions were a mistake or not.

All failure modes should have plans in place to prevent them from happening and/or to mitigate the consequences when they do happen. However this is only possible for failure modes that it is known or speculated can or might happen. The impression I get is that nobody, either operating staff or equipment suppliers knew that such a failure mode was even theoretically possible. It was, to use Donald Rumsfeld's terminology, an unknown unknown. It's now a known known and they have a procedure in place for if it happens again.
 

snowball

Established Member
Joined
4 Mar 2013
Messages
7,746
Location
Leeds
I'm not sure about the train side, but the old T69 trams on the Midland Metro made by AnsaldoBreda weren't the most reliable things ever, and the wiring was apparently quoted as being like spaghetti, if I'm remembering right.
And Manchester's T68s were withdrawn well before their originally planned lifetime.
 

theironroad

Established Member
Joined
21 Nov 2014
Messages
3,697
Location
London
Is this any different to a normal physical TSR sign, falling over / being stolen / painted over by vandals etc?

Well, they'd have to steal the AWS magnet as well. I don't recall ever seeing an ESR or TSR painted over, though I guess if it was long term then it's going to be a possibility, as PSR boards certainly seem to be used as drawing boards.

Edit: just realised that these points were covered in post #27, which I seem to have skipped!
 

daikilo

Established Member
Joined
2 Feb 2010
Messages
1,623
Most likely is human error, whoever was performing the routine system restart accidentally wiped the data, as RAIB correctly points out there should have been a process to check after restarts by the operator to see the data was still there and the system was operating normally.

That is a very unlikely scenario (but maybe not impossible). What is much more likely is that the restart did not call up all the files needed. That is the issue that RAIB will first address, along with why the controllers did not notice/were not informed.
 

WatcherZero

Established Member
Joined
25 Feb 2010
Messages
10,272
What RAIB is saying is there was no logging of files, so they don't know what data was uploaded to the system before the restart. The data was still present locally in internal displays after the restart but what was being transmitted was blank, which also suggests a system design flaw in allowing the possibility of data desync to occur.
 

GreatAuk

Member
Joined
16 Jan 2018
Messages
60
All failure modes should have plans in place to prevent them from happening and/or to mitigate the consequences when they do happen. However this is only possible for failure modes that it is known or speculated can or might happen. The impression I get is that nobody, either operating staff or equipment suppliers knew that such a failure mode was even theoretically possible. It was, to use Donald Rumsfeld's terminology, an unknown unknown. It's now a known known and they have a procedure in place for if it happens again.
If it is the case that the equipment supplier had not considered this failure mode (assuming it actually was a software failure), that would be rather worrying, and would raise all sorts of questions about the quality of safety analysis they performed on what is maybe the most safety critical system on the railway.

I don't know much about the details, but there are a whole range of techniques which can be applied to the analysis of supposedly high integrity software systems, and while complex it's not a brand new field. If it does turn out that a software fault caused this issue then I hope there will be a full review of what analysis was done, and how it was decided that it was sufficient... In any case a reminder against complacency about safety.
 

rebmcr

Established Member
Joined
15 Nov 2011
Messages
3,851
Location
St Neots
The data was still present locally in internal displays after the restart but what was being transmitted was blank, which also suggests a system design flaw in allowing the possibility of data desync to occur.

As someone responsible for systems' interactions with users and each other, this is mind-boggling. That sort of functionality would be in the very first draft version of anything I worked on, built-in to the very foundations of the software!
 

theageofthetra

On Moderation
Joined
27 May 2012
Messages
3,508
As someone responsible for systems' interactions with users and each other, this is mind-boggling. That sort of functionality would be in the very first draft version of anything I worked on, built-in to the very foundations of the software!
As I said earlier someone or a committee signed this off.
 

axlecounter

Member
Joined
23 Feb 2016
Messages
403
Location
Switzerland
I think you two are building things on unproven facts. This kind of software has a very precise way of being designed, written and verified, with the so-called V-model development process, which would eliminate most faults.
Nevertheless, like any other software, these can contain bugs too.
 

rebmcr

Established Member
Joined
15 Nov 2011
Messages
3,851
Location
St Neots
I think you two are building things on unproven facts. This kind of software has a very precise way of being designed, written and verified, with the so-called V-model development process, which would eliminate most faults.
Nevertheless, like any other software, these can contain bugs too.

I believe that's true for the train-to-processing-centre setup, but the primary facts from the RAIB release directly indicate that's not the case for signaller-to-processing-centre.
 

ToTheHills

New Member
Joined
23 Mar 2018
Messages
1
Which is exactly what happened.
What really puzzles me is why they are using TSRs to protect the approach speed to a level crossing. It's not as if the crossings will be removed, so why is this not coded into the static speed profile? Then it is unlikely to have been lost over the reset of the RBC. This all smacks of an engineering scheme afterthought to me, and nothing to do with the safety of a digital solution.
 

Llanigraham

On Moderation
Joined
23 Mar 2013
Messages
6,103
Location
Powys
What really puzzles me is why they are using TSRs to protect the approach speed to a level crossing. It's not as if the crossings will be removed, so why is this not coded into the static speed profile? Then it is unlikely to have been lost over the reset of the RBC. This all smacks of an engineering scheme afterthought to me, and nothing to do with the safety of a digital solution.

But crossings are being removed and others altered along The Cambrian, and things can and do change.
 

theironroad

Established Member
Joined
21 Nov 2014
Messages
3,697
Location
London
What really puzzles me is why they are using TSRs to protect the approach speed to a level crossing. It's not as if the crossings will be removed, so why is this not coded into the static speed profile? Then it is unlikely to have been lost over the reset of the RBC. This all smacks of an engineering scheme afterthought to me, and nothing to do with the safety of a digital solution.

TSRs are being used around the country where a foot crossing has been deemed to have insufficient warning time from seeing/hearing an approaching train to being able to cross safely.

I imagine long term the crossings will be closed if that is practical and legal or the permanent line speed will have to be reduced, which may have timetabling impacts.
 

Elecman

Established Member
Joined
31 Dec 2013
Messages
2,906
Location
Lancashire
Or they will be fitted with some sort of bolt-on advance warning system if they can’t be closed.
 

rebmcr

Established Member
Joined
15 Nov 2011
Messages
3,851
Location
St Neots
This all smacks of an engineering scheme afterthought to me, and nothing to do with the safety of a digital solution

Regardless of the appropriateness of these TSRs, the fact that any TSR could be 'lost' is indeed a major safety failing.
 

driver_m

Established Member
Joined
8 Nov 2011
Messages
2,248
The issue with TSRs being used for timings at level crossings or sighting issues is the length of time they're in place before they tend to become permanent. Prestatyn had a 75 imposed for a long time; Macclesfield still has one on the up. If NR won't spend the money on bridges or Banners, then the PSR should be introduced much more quickly. NR shouldn't be able to leave them on for such long periods. Obviously it doesn't properly solve why this failure happened but it would make it less likely.
 

Llanigraham

On Moderation
Joined
23 Mar 2013
Messages
6,103
Location
Powys
The issue with TSRs being used for timings at level crossings or sighting issues is the length of time they're in place before they tend to become permanent. Prestatyn had a 75 imposed for a long time; Macclesfield still has one on the up. If NR won't spend the money on bridges or Banners, then the PSR should be introduced much more quickly. NR shouldn't be able to leave them on for such long periods. Obviously it doesn't properly solve why this failure happened but it would make it less likely.

To defend NR a little here, there have been numerous "problems" with UWCs along a section of the Cambrian. The proposal from NR was to close several of them, and build a new road and bridge. The time taken, which may seem long to some, was because each farmer had to be negotiated with, and many had their own ideas and didn't/wouldn't talk to each other. NR then had to design a suitable bridge and route for the road, purchase the land for that road, obtain permission from the Highways Agency and the County Council to alter accesses, apply for Traffic Regulation Orders to close and divert Rights of Way, apply for Planning Permission and then contract for the work. Many of these things could not be done at the same time, as there are procedures laid down that specify the order in which they have to be done.
There are no short cuts to make it quicker in many cases! For example, it only needs the Ramblers Assoc to complain about diverting a Footpath and you can end up with a Public Enquiry that could take 2 years; it has happened.
 

Roast Veg

Established Member
Joined
28 Oct 2016
Messages
2,202
But by making them PSRs, NR can then not do the work required to remove the TSR as it would constitute an upgrade rather than routine maintenance.
 

driver_m

Established Member
Joined
8 Nov 2011
Messages
2,248
But by making them PSRs, NR can then not do the work required to remove the TSR as it would constitute an upgrade rather than routine maintenance.

That sounds very much like "computer says no" from Little Britain.
 

theironroad

Established Member
Joined
21 Nov 2014
Messages
3,697
Location
London
I'm sure someone can confirm or deny, but I seem to remember that after a certain period of time a TSR has to become a PSR. I was thinking it was 2 years, but then I recall there's been a TSR somewhere for years, so I could be wrong....
 

driver_m

Established Member
Joined
8 Nov 2011
Messages
2,248
The Macclesfield TSR has been on for at least 6 years. So I'd say no.
 

Jonny

Established Member
Joined
10 Feb 2011
Messages
2,562
Falling over/Missing in part/Vandalised would all be noticed. There would be physical evidence that a Driver would see. These tend to be reported quite frequently. Completely missing is different as it would appear that the restriction has been lifted (unless SPATEd). However, if just the signage was removed/stolen then there would still be the presence of AWS. I couldn't tell you if ERTMS has an AWS warning for a TSR.

How long would a SPATE be signposted for?
 

pt_mad

Established Member
Joined
26 Sep 2011
Messages
2,960
There is a similar thread going on alongside this one about the automatically controlled operation of trains which has started within the last week or two on Thameslink using a similar but updated signalling system.

Most scepticism on the thread has been put to bed by claims the system is designed to make no errors, be fail-safe and foolproof. When posters have asked who takes responsibility for errors in the system such as this, most answers have come back as "the driver", as they are expected to notice any such error and take control before an operating incident is able to occur.

References to this incident have led to responses of "well, that's why there will be a driver onboard automatically controlled trains", as it's ultimately their responsibility to prevent any incidents by noticing any problems.

Don't feel the excitement towards this system on this thread though.
 

OneOffDave

Member
Joined
2 Apr 2015
Messages
453
As someone responsible for systems' interactions with users and each other, this is mind-boggling. That sort of functionality would be in the very first draft version of anything I worked on, built-in to the very foundations of the software!

The issue you get is that in complex systems it becomes effectively impossible to model all the modes in which the system will operate and how it interacts. Charles Perrow in his safety research coined the term 'Normal Accidents' to describe the sort of incident that is probably inevitable with systems that are complex, tightly coupled and have catastrophic potential. His book of the same name is well worth a read if you are interested in that kind of thing. It meshes well with a lot of James Reason's work.
 

rebmcr

Established Member
Joined
15 Nov 2011
Messages
3,851
Location
St Neots
The issue you get is that in complex systems it becomes effectively impossible to model all the modes in which the system will operate and how it interacts.

Perfectly reasonable, but making a tightly-scoped "sync verification" module a mandatory part of the normal operating sequence would at least cause a right-side failure during a desync, no matter what unexpected scenario caused it.
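
To illustrate the sort of thing meant by a "sync verification" step, here is a rough sketch in Python. It is purely hypothetical and is not how the RBC or any real ETCS product works: the names `SpeedRestriction`, `verify_sync` and `issue_movement_authority` are made up for the example, and it assumes that both the restrictions shown to the signaller and the restrictions actually being transmitted can be read back and compared.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedRestriction:
    """Hypothetical record of one temporary speed restriction."""
    start_m: int    # start position along the line, in metres
    end_m: int      # end position along the line, in metres
    limit_kmh: int  # restricted speed


class DesyncError(Exception):
    """Raised when the displayed and transmitted data sets differ."""


def digest(restrictions):
    """Order-independent fingerprint of a set of restrictions."""
    canonical = sorted((r.start_m, r.end_m, r.limit_kmh) for r in restrictions)
    return hashlib.sha256(json.dumps(canonical).encode("utf-8")).hexdigest()


def verify_sync(displayed, transmitted):
    """Compare what the signaller sees with what is being sent to trains."""
    if digest(displayed) != digest(transmitted):
        raise DesyncError("displayed and transmitted restrictions differ")


def issue_movement_authority(displayed, transmitted):
    """Mandatory check before every authority (assumed policy, not a real API).

    Any mismatch gives the restrictive outcome, i.e. a right-side failure.
    """
    try:
        verify_sync(displayed, transmitted)
    except DesyncError:
        return "HOLD"
    return "PROCEED"


if __name__ == "__main__":
    shown = [SpeedRestriction(1000, 1500, 30)]
    # After a faulty restart the transmitted set is empty: trains are held.
    print(issue_movement_authority(shown, []))
    # When both sides agree, normal operation continues.
    print(issue_movement_authority(shown, list(shown)))
```

The point of the sketch is only that the check does not need to know why the two sides disagree. Any difference at all, from whatever unexpected cause, produces the restrictive outcome.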
 