train onboard software

Discussion in 'Traction & Rolling Stock' started by Ken H, 7 Oct 2019.

  1. Ken H

    Ken H Established Member

    Messages:
    2,121
    Joined:
    11 Nov 2018
    Location:
    in Greece on holiday
    Was reading an article in the October Modern Railways about how the railway coped with the power supply problems earlier this year.

    One of the problems was that the power supply frequency dropped, and the on-train protection tripped on some trains.

    The supply frequency quickly recovered as National Grid brought generating capacity online and switched out consumers. But the railway kept its supplies.

    The procedure after such a trip is to do a battery reset of the train, and the train should be good to go again.

    It seems that the software on some trains had been changed by a new release, so 2 versions were in service.

    The older version had a wider tolerance of frequency, so did not trip.

    The newer version had a narrower tolerance, so did trip. But the software had also been changed to disallow a battery reset after a supply frequency trip.

    They could not reload the old version on the new trains because of a reliability fix to the CCTV system.

    So some questions:

    1. Was the software not subject to User Acceptance Testing by the ROSCO or the TOC? Did no-one read the release documentation and think 'Hmm, that's quite a big change, I will escalate that'? Was there release documentation?

    2. Surely the software should be divided into applications. Upgrading one application should not affect the others. So the CCTV app should be upgradeable without affecting the power protection stuff.

    3. Are features like the tolerance levels of the power supply frequency not 'soft coded', i.e. kept in a parameter file rather than 'hard coded' in the programs?
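
    To make question 3 concrete, here is a minimal sketch of 'soft coded' limits: the thresholds sit in a parameter file and are sanity-checked when loaded. Everything in it is an assumption for illustration (the JSON format, the key names, the `load_frequency_limits` helper and the placeholder default figures); it says nothing about how the real train software stores its settings.

```python
import json

# Placeholder defaults, purely illustrative (not taken from any real release).
DEFAULTS = {"traction_reduce_hz": 49.0, "full_shutdown_hz": 48.5}


def load_frequency_limits(text: str) -> dict:
    """Parse a JSON parameter file and validate the frequency limits."""
    overrides = json.loads(text)

    # Reject keys we do not know about, so a typo cannot silently be ignored.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    params = {**DEFAULTS, **overrides}

    # Every limit must be a plain number (bool is excluded on purpose).
    for key, value in params.items():
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{key} must be a number, got {value!r}")

    # The complete shut-off must sit strictly below the point where
    # traction starts to back off, otherwise the two stages collapse.
    if not params["full_shutdown_hz"] < params["traction_reduce_hz"]:
        raise ValueError("full_shutdown_hz must be below traction_reduce_hz")

    return params


if __name__ == "__main__":
    print(load_frequency_limits('{"full_shutdown_hz": 48.0}'))
```

    In this sketch, a parameter file that set both limits to the same value would be rejected at load time instead of going into service.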
     
  3. ComUtoR

    ComUtoR Established Member

    Messages:
    6,321
    Joined:
    13 Dec 2013
    Location:
    UK
    I haven't read the article but are you sure that the sequence of events is correct?

    There have always been multiple versions of the train software in service. As I understand it, the units tripped and then got rebooted. Because some didn't come back in, they had to upload a software fix to get them to reboot. Is the article stating that the software uploaded before the incident also had issues?

    Did they report the other issue that pretty much caused the problem?
     
  4. edwin_m

    edwin_m Veteran Member

    Messages:
    17,270
    Joined:
    21 Apr 2013
    Location:
    Nottingham
    The article does say that the newer software release was the problem, and that somebody had intentionally taken away the ability of the driver to recover, "intended to protect some electronic components in the traction package". At the time of the article a patch was under test to restore this ability.

    Sounds very much like the sort of unintended consequence that comes from software changes. Discussion on another thread suggests a frequency deviation of this magnitude is pretty much unprecedented, so perhaps whoever it was just thought it wouldn't happen.
     
  5. Ken H

    Ken H Established Member

    Messages:
    2,121
    Joined:
    11 Nov 2018
    Location:
    in Greece on holiday
    The article stated the fleet was running with 2 versions of the software. The trains with the old version didn't trip, the new version did (because of different tolerances for line frequency). The new version did not allow the driver to do a battery reset. The article implies this was a feature of the new version. That meant each failed train had to be visited by a technician with a laptop to reboot the train.
     
  6. Ken H

    Ken H Established Member

    Messages:
    2,121
    Joined:
    11 Nov 2018
    Location:
    in Greece on holiday
    The article quotes Network Rail standards and conflicting Euronorm standards. It was argued the trains with the new software didn't conform to Network Rail standards.
     
  7. ComUtoR

    ComUtoR Established Member

    Messages:
    6,321
    Joined:
    13 Dec 2013
    Location:
    UK
    I find this quite odd as rebooting is pretty standard and is a button press in the cab. I'm not sure what the article is suggesting.

    This is interesting. Because although the old version tripped, they still didn't reboot.

    Some of the units that did trip were still able to reboot. The Drivers did do a battery reset and the unit rebooted correctly. I think it was more than just new version/old version.

    There are at least 3 versions currently running about.
     
  8. hwl

    hwl Established Member

    Messages:
    5,033
    Joined:
    5 Feb 2012
    The article is wrong...

    Several software and specification screw ups:

    EN 50163 permits traction electronics to start shutting down below 49Hz (a good idea), with shut-off for everything (e.g. auxiliaries) at 48.5Hz. However, they programmed in 49Hz as the complete shut-off value by mistake. The auxiliary power supplies should never have shut down. The second issue was resetting (or not) after the shutdown.

    At the time of the "700" incident there were at least 5 software variants in service on the 700s. In the latest 3 software versions pre-incident (3.27/28/29, circa 60% of the fleet) they managed to remove the ability to battery disconnect reset and didn't regression test. All the problem sit-down units had the later software (3.27+). Units with 3.25 and 3.26 were able to battery disconnect reset and get moving.
    In version 3.30 (roll-out started the night of the incident) and later, battery disconnect reset was restored.


    The newest software versions will have autoreset when the frequency returns to above 49.5Hz as well as setting the complete shutdown frequency to 48.5Hz instead of 49Hz.
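
    The thresholds described above can be put into a small state machine as a rough sketch. The three frequencies (49Hz, 48.5Hz, 49.5Hz) are the ones quoted in this post; the state names, the `Thresholds` class, the `next_state` and `run` functions and the sample frequencies are all made up for illustration and are not the actual train logic.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    TRACTION_REDUCED = "traction_reduced"
    SHUTDOWN = "shutdown"


@dataclass(frozen=True)
class Thresholds:
    traction_reduce_hz: float = 49.0  # below this, traction may start backing off
    full_shutdown_hz: float = 48.5    # at or below this, everything shuts off
    auto_reset_hz: float = 49.5       # above this, a shut-down unit resets itself


def next_state(state: State, freq_hz: float, t: Thresholds) -> State:
    """Return the protection state after one frequency sample."""
    if state is State.SHUTDOWN:
        # Hysteresis: stay shut down until the supply is clearly healthy again.
        return State.NORMAL if freq_hz > t.auto_reset_hz else State.SHUTDOWN
    if freq_hz <= t.full_shutdown_hz:
        return State.SHUTDOWN
    if freq_hz < t.traction_reduce_hz:
        return State.TRACTION_REDUCED
    return State.NORMAL


def run(samples, t: Thresholds):
    """Feed a list of frequency samples through the state machine."""
    state = State.NORMAL
    history = []
    for freq_hz in samples:
        state = next_state(state, freq_hz, t)
        history.append(state)
    return history


if __name__ == "__main__":
    samples = [50.0, 48.8, 49.2, 49.6]  # illustrative dip and recovery
    print(run(samples, Thresholds()))                       # intended limits
    print(run(samples, Thresholds(full_shutdown_hz=49.0)))  # mistaken 49Hz cut-off
```

    With the intended limits the illustrative dip only reduces traction; with the complete shut-off mistakenly set to 49Hz the same dip shuts the unit down, and in this sketch it stays down until the frequency climbs back above 49.5Hz. A regression test that replays a dip like this against each release is the kind of check that could flag a change in reset behaviour.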
     
  9. Ken H

    Ken H Established Member

    Messages:
    2,121
    Joined:
    11 Nov 2018
    Location:
    in Greece on holiday
    So how did 3.27 manage to get into production without proper version control and client sign-off, after UAT?
     
  10. DarloRich

    DarloRich Veteran Member

    Messages:
    23,335
    Joined:
    12 Oct 2010
    Location:
    Work - Fenny Stratford(MK) Home - Darlington
    It was probably just a cock-up rather than the conspiracy I suspect you would prefer!
     
  11. dosxuk

    dosxuk Member

    Messages:
    690
    Joined:
    2 Jan 2011
    A reliability fix for the CCTV could mean many things, including (off the top of my head, I've got no idea what they actually did) :-
    - Making the display of timestamps more accurate
    - Changing the way data is sent along the train
    - Altering the power switching to reduce glitches when the train switches between AC & DC

    That last idea though I could well see affecting other parts of the train's power systems - it's all very well saying things should be updated separately, but when systems are interconnected there will be updates that affect more than the 'headline' system in an update.
     
  12. Ken H

    Ken H Established Member

    Messages:
    2,121
    Joined:
    11 Nov 2018
    Location:
    in Greece on holiday
    If I put software into production that severely impacted my client's business, I would find my contract ended and find myself being sued for damages.
    Which is why we have UAT sign-off. Then it's the fault of the manager who signed it off.
    But one would expect said manager to be told of any material changes. Stuff like disabling battery reset and frequency tolerances.

    But what is the point of type testing if the manufacturer can change the characteristics of the train? All the tests done in acceptance testing of the hardware are invalidated by software changes, now that software is a core component, not a bolt-on goody. How do we know a (hypothetical) bug hasn't been installed that affects safety, like braking?
     
  13. jon0844

    jon0844 Veteran Member

    Messages:
    24,385
    Joined:
    1 Feb 2009
    Location:
    UK
    The next big update will be to the PIS, fixing the audio/stuttering issue. This may mean we can expect a return of the full-screen graphical images and speeches about engineering works, safety etc.

    Some will like this, some will hate it!
     
  14. ComUtoR

    ComUtoR Established Member

    Messages:
    6,321
    Joined:
    13 Dec 2013
    Location:
    UK
    They fixed the braking issue a few versions back....
     
  15. theageofthetra

    theageofthetra Established Member

    Messages:
    3,134
    Joined:
    27 May 2012
    Spot on.
     
  16. theageofthetra

    theageofthetra Established Member

    Messages:
    3,134
    Joined:
    27 May 2012
    I imagine this will be to ensure disability compliance?
     
  17. DarloRich

    DarloRich Veteran Member

    Messages:
    23,335
    Joined:
    12 Oct 2010
    Location:
    Work - Fenny Stratford(MK) Home - Darlington
    I know how IT projects work, thanks. The problem is that, sometimes, $hit happens and communications, understanding and sign off fail:

    There was an important job to be done and Everybody was sure that Somebody would do it. Anybody could have done it, but Nobody did it. Somebody got angry about that, because it was Everybody’s job. Everybody thought Anybody could do it, but Nobody realized that Everybody wouldn’t do it. It ended up that Everybody blamed Somebody when Nobody did what Anybody could have.

    The important thing is that the process fault is identified and fixed so it doesn't happen again. The lawyers can sort the rest out.
     
  18. edwin_m

    edwin_m Veteran Member

    Messages:
    17,270
    Joined:
    21 Apr 2013
    Location:
    Nottingham
    It sounds to me like an issue with the requirements not the software. Somebody changed the requirements relating to frequency-related shutdowns and resets, without realizing this put them arguably in breach of a standard. The requirement for the driver to be able to reset after frequency deviation was deleted, either unintentionally or because someone had considered the scenarios when it would be needed and decided they weren't likely enough to worry about. Once that happens the version control and sign-offs just ensure that it is doing the wrong thing very well.
     
  19. rebmcr

    rebmcr Established Member

    Messages:
    2,973
    Joined:
    15 Nov 2011
    Location:
    Cambridge
    I get your point, but this is not really a materially different situation to 1980s stock having a non-standard design of object deflector installed through routine maintenance, which later causes an incident. (I seem to remember that this actually happened in the north west, causing a great many Sprinters to be fixed overnight).
     
  20. coppercapped

    coppercapped Established Member

    Messages:
    2,322
    Joined:
    13 Sep 2015
    Location:
    Reading
    The contractual issue here is that the manufacturer's client is Cross London Trains, which in turn has a contract with the Department for Transport to supply trains to the franchisee, in this case GTR.

    Cross London Trains is a subsidiary of Siemens, the manufacturer.

    Who sues whom for damages in this case? :rolleyes:
     
  21. PG

    PG Member

    Messages:
    716
    Joined:
    12 Oct 2010
    I'm sure each party's lawyers will manage to work out who to claim against whilst lining their own pockets :smile:
    Another case of GIGO = Garbage In Garbage Out. If the specification against which something is being tested isn't right then neither will the end product.
     
  23. kkong

    kkong Member

    Messages:
    118
    Joined:
    8 Sep 2008
    Ahem. :s
     
