• Our new ticketing site is now live! Using either this or the original site (both powered by TrainSplit) helps support the running of the forum with every ticket purchase! Find out more and ask any questions/give us feedback in this thread!

Spreadsheet error resulted in missing Sars-Cov-2 cases for track and trace

Status
Not open for further replies.

HSTEd

Veteran Member
Joined
14 Jul 2011
Messages
18,913
Serco are getting paid £12bn to run the world beating Serco Test and Trace system. I'm sorry, but for £12bn I'd expect a bit more than a shonky CSV file and some sticky-backed plastic

What's wrong with a CSV file?

It's the best way of moving data between disparate systems simply because it is so simple.

The problem in this case is entirely at the PHE end.

Microsoft Office costs a couple of hundred quid. Where the hell has the rest of the cash gone?

Maybe ask PHE?
 
Sponsor Post - registered members do not see these adverts; click here to register, or click here to log in
R

RailUK Forums

Bantamzen

Established Member
Joined
4 Dec 2013
Messages
9,996
Location
Baildon, West Yorkshire
Serco are getting paid £12bn to run the world beating Serco Test and Trace system. I'm sorry, but for £12bn I'd expect a bit more than a shonky CSV file and some sticky-backed plastic.

Microsoft Office costs a couple of hundred quid. Where the hell has the rest of the cash gone?

Welcome to the world of public / private sector partnerships, the kind of thing we are seeing is all too common. Part of the problem is that PHE will probably not have access to Serco's network, and visa versa. And having had first hand experience in the difficulties of getting security clearance for data stores for shared, confidential data, end up with a .CSV file is no surprise to me at all. You'd be shocked how common it can actually be.
 

DelayRepay

Established Member
Joined
21 May 2011
Messages
2,929
I sometimes find myself having to use 'Sellotape and string' solutions like this, because when something needs to be put in place quickly, it's often the only way. In fact only last week I was complaining about having to use CSV files to move data between two systems. I hate doing it that way, but sometimes needs must.

But, I understand the risks. I understand the limitations of Excel, I understand that things can go wrong. Crucially I understand that you need to reconcile the output to the input. A very simple 'count in count out' control should have identified that there were tens of thousands of records missing. Personally I always manually check the first and last record when moving data this way. Even a sense check should have questioned why numbers appeared to be stable when other information suggested there should have been an increase.

Equally, using Excel in this way may have been acceptable in the beginning, but there have been months for someone to develop a more robust solution. When Covid came, we originally had to set up some basic Excel tools to track various things, but as weeks progressed IT provided better, supported solutions.

When this news broke, I asked my team and manager to read the coverage to understand what had happened. This incident will now be my stock answer when anyone asks me 'why do we spend so much time checking things?'...

In my company I must have seen hundreds of errors involving Excel tools - incorrect formula, formulas that don't cover the full range of data, pivot tables that haven't been refreshed, VBA code that stops working properly because someone's added an extra column... It's a dangerous tool, particularly because most users seem to ignore error messages and seem happy to correct any discrepancies manually rather than finding out what's gone wrong.
 

Bantamzen

Established Member
Joined
4 Dec 2013
Messages
9,996
Location
Baildon, West Yorkshire
I sometimes find myself having to use 'Sellotape and string' solutions like this, because when something needs to be put in place quickly, it's often the only way. In fact only last week I was complaining about having to use CSV files to move data between two systems. I hate doing it that way, but sometimes needs must.

Absolutely, I've done the same myself for my department this year. When you haven't got enough time to build bespoke systems, i.e. the results are needed yesterday, then getting the data from a CSV file and chucking some VBA at it to extract & display the data is something a decent coder can get going in literally no time at all.

But, I understand the risks. I understand the limitations of Excel, I understand that things can go wrong. Crucially I understand that you need to reconcile the output to the input. A very simple 'count in count out' control should have identified that there were tens of thousands of records missing. Personally I always manually check the first and last record when moving data this way. Even a sense check should have questioned why numbers appeared to be stable when other information suggested there should have been an increase.

This kind of thing sadly is often overlooked, some devs tend to work with smaller datasets whilst building & testing, and often overlook things like datasets getting larger over time. As you say, a simple sense-check like number of records at source vs records output would have highlighted the problem much sooner.

Equally, using Excel in this way may have been acceptable in the beginning, but there have been months for someone to develop a more robust solution. When Covid came, we originally had to set up some basic Excel tools to track various things, but as weeks progressed IT provided better, supported solutions.

Herein lies the problem with public sector IT development, the wheels often run very slowly when commissioning bespoke systems. And if PHE is anything like my department (and I'm pretty confident it is), then they will be transitioning from almost total dependency on external partners to a mixture of in-house & out-of-house developers. So it can often take longer than it should just to draw up a commission and decide who gets it, as well as who pays for it of course!

When this news broke, I asked my team and manager to read the coverage to understand what had happened. This incident will now be my stock answer when anyone asks me 'why do we spend so much time checking things?'...

My answer is a bit simpler, "Because users....". ;)

In my company I must have seen hundreds of errors involving Excel tools - incorrect formula, formulas that don't cover the full range of data, pivot tables that haven't been refreshed, VBA code that stops working properly because someone's added an extra column... It's a dangerous tool, particularly because most users seem to ignore error messages and seem happy to correct any discrepancies manually rather than finding out what's gone wrong.

Whenever I've worked with Office based apps these days, my number one rule is lock everything down. So no data stored in Excel but in Access at the minimum, limit all input & query fields, sense checking them as users type to make sure that they use only the expected characters / parameters & that there is no chance of unexpected SQL injections, and of course error trap absolutely everything. There will always be that user that somehow still knackers the show, but that's life in IT. But it does mean that once I have passed apps through beta testing, they generally just play nicely with the users giving me a simpler life. And when managers ask how long it will take, I do a Scotty from Star Trek I say "2 weeks" for a project I know I can knock out in 3 days.... :lol:
 

DelayRepay

Established Member
Joined
21 May 2011
Messages
2,929
Whenever I've worked with Office based apps these days, my number one rule is lock everything down. So no data stored in Excel but in Access at the minimum, limit all input & query fields, sense checking them as users type to make sure that they use only the expected characters / parameters & that there is no chance of unexpected SQL injections, and of course error trap absolutely everything.

I am a user, I don't work in IT, but my team manage large volumes of data so we sometimes have to resort to home made solutions. We also, separately, have dependencies on data managed by other parts of the organisation which is often contained in other home made solutions.

In general, Access would be a better tool, but most users don't know how to use it, whereas they think they know how to use Excel. In fact, we didn't even have Access as part of our standard desktop build until we moved to Office 365 - you had to make a special request to IT if you wanted it installed. So people use Excel because they're comfortable with it, not because it's the right tool for the job. The worse ones are those developed by someone who is a 'wizz in Excel' (but their actual job is sales or something completely unrelated). Their manager will initially be delighted that they've solved a problem without using any of the IT budget. Then the 'wizz' will leave, the spreadsheet will break or need changing, and nobody has a clue how it works. And of course the 'wizz' didn't document anything....

Don't get me started on SharePoint lists and workflows... <(
 

Bantamzen

Established Member
Joined
4 Dec 2013
Messages
9,996
Location
Baildon, West Yorkshire
I am a user, I don't work in IT, but my team manage large volumes of data so we sometimes have to resort to home made solutions. We also, separately, have dependencies on data managed by other parts of the organisation which is often contained in other home made solutions.

In general, Access would be a better tool, but most users don't know how to use it, whereas they think they know how to use Excel. In fact, we didn't even have Access as part of our standard desktop build until we moved to Office 365 - you had to make a special request to IT if you wanted it installed. So people use Excel because they're comfortable with it, not because it's the right tool for the job. The worse ones are those developed by someone who is a 'wizz in Excel' (but their actual job is sales or something completely unrelated). Their manager will initially be delighted that they've solved a problem without using any of the IT budget. Then the 'wizz' will leave, the spreadsheet will break or need changing, and nobody has a clue how it works. And of course the 'wizz' didn't document anything....

Don't get me started on SharePoint lists and workflows... <(

You don't need to know how to use Access as a user, indeed you don't even need a licence for it. All you need is for someone competent in VBA & SQL to port the data from Excel to Access (or indeed any other local or cloud based data server), the Excel front end can stay almost the same but the data is stored away from prying eyes and the temptation to add/remove columns and rows.
 

duncanp

Established Member
Joined
16 Aug 2012
Messages
4,856
Whenever I've worked with Office based apps these days, my number one rule is lock everything down. So no data stored in Excel but in Access at the minimum, limit all input & query fields, sense checking them as users type to make sure that they use only the expected characters / parameters & that there is no chance of unexpected SQL injections, and of course error trap absolutely everything. There will always be that user that somehow still knackers the show, but that's life in IT. But it does mean that once I have passed apps through beta testing, they generally just play nicely with the users giving me a simpler life. And when managers ask how long it will take, I do a Scotty from Star Trek I say "2 weeks" for a project I know I can knock out in 3 days.... :D

It is very important to validate everything that is put into a database or spreadsheet, because if you don't there is always one user who will try to put, for example, a non numeric character in a field that is supposed to be numeric.<(<(

One major problem with CSV files is that you have to make sure that the character which separates the fields does not appear in any of the data items.
ie. If you are using a comma to separate the fields, you cannot have, for example, a field called Name where the name is equal to Tom, Dick or Harry.

You wouldn't believe the number of times this can screw up the importation of data from a CSV file into a database, and I wonder whether this is happening at the moment.

I am a user, I don't work in IT, but my team manage large volumes of data so we sometimes have to resort to home made solutions. We also, separately, have dependencies on data managed by other parts of the organisation which is often contained in other home made solutions.

In general, Access would be a better tool, but most users don't know how to use it, whereas they think they know how to use Excel. In fact, we didn't even have Access as part of our standard desktop build until we moved to Office 365 - you had to make a special request to IT if you wanted it installed. So people use Excel because they're comfortable with it, not because it's the right tool for the job. The worse ones are those developed by someone who is a 'wizz in Excel' (but their actual job is sales or something completely unrelated). Their manager will initially be delighted that they've solved a problem without using any of the IT budget. Then the 'wizz' will leave, the spreadsheet will break or need changing, and nobody has a clue how it works. And of course the 'wizz' didn't document anything....

Don't get me started on SharePoint lists and workflows... <(

Oh I can so agree with every single word here, especially the bit in bold - been there, done that, and got a whole bloody wardrobe full of t shirts.

The politics that are involved in software development are unbelievable.

There is a lot of pressure to get "something done" quickly, but tight fisted management don't want to spend any money on getting anything done properly. They would rather pay for something strung together by Trotters Independent Testers than by a proper firm that knows what they are doing.

The worst type of manager to work for is someone who doesn't know anything about how software development works, and thinks that by shouting twice as loud you can get it done in half the time.

Rant over anyway - glad I am no longer working in IT (made redundant and then retired)
 

MotCO

Established Member
Joined
25 Aug 2014
Messages
5,210
Early on, I was downloading the daily CSV files produced by the DofH on to Excel to monitor the number of Covid cases just for the hell of it. I noticed that something strange was happening to the data - some data for previous days was being updated, so the totals didn't tally, so I built in a simple check. If I can do it, why can't PHE?
 

trebor79

Established Member
Joined
8 Mar 2018
Messages
4,768
The really horrifying thing for me as somenoe who knows a bit about computer, that when presented with a dialogue box error about a large file.... noone even bothered to google what it meant.
They just saw some of the file had loaded in and ignored it.
I suspect it will have run as an automated batch task so any such errors may not have been seen in real time
On Error Resume Next

I'd lay money on that being at the top of a VBA module... Makes all your macros run perfectly and gets rid of those pesky errors and warnings. The kind of thing you do when you're quickly hacking something together, but really ought not to.
 

Bantamzen

Established Member
Joined
4 Dec 2013
Messages
9,996
Location
Baildon, West Yorkshire
On Error Resume Next

I'd lay money on that being at the top of a VBA module... Makes all your macros run perfectly and gets rid of those pesky errors and warnings. The kind of thing you do when you're quickly hacking something together, but really ought not to.

Yeah, that's probably the case... <(
 
Status
Not open for further replies.

Top