I have been looking further into the data and I am sceptical about the credibility of some of the figures, for example:
Brondesbury - Brighton: 18 (I am really surprised that, if this is true, there are so few people living in North London who fancy a seaside trip - I have done this a number of times on a West Hampstead - Brighton ticket when the Thameslink core is closed)
Cricklewood - Caledonian Road & Barnsbury: 25
Cricklewood - Heathrow T5: 4 / Heathrow T2&3: 7 (I know this is pre-Elizabeth line, but wasn't the fastest way back then to take Thameslink to St Pancras and then the Heathrow Express from Paddington?)
Cricklewood - Southampton Central: 7
Cricklewood - Hackney Downs: 11 (Did everyone really use Hackney Central, even on the days the North London Line was closed?)
Cricklewood - Stratford International: 12 (I also did this a number of times; this was pre-Elizabeth line and the simple change at St Pancras beat all other routes on speed)
Farringdon - Folkestone West / Sandwich / Ramsgate: 2 (are you really serious that no one commutes between these Kent towns and Farringdon transferring at St Pancras?!)
London Bridge - London Road (Guildford): 7 (does really no one who lives near that suburban station commute to the City by changing at Waterloo?)
Gospel Oak - Southampton Central: 15
Gospel Oak - Haywards Heath: 18
Gospel Oak - Reading: 19
West Hampstead - Milton Keynes: 10
You are putting too much faith in the ODM actually representing people's origins and destinations. The entries are the origin and destination of each ticket leg, with some imposed system logic and some dodgy allocations.
"London Bridge - London Road (Guildford)": a Waterloo - London Road (Guildford) ticket plus a walk or Tube from Waterloo is more likely for the average user, given the Waterloo East - Waterloo interchange time and the walking time to London Bridge station.
The allocations will undercount Farringdon from the "London XXX" station pools. One would hope to see big improvements in this year's data, as it obviously hasn't been refreshed for the completion of the Thameslink Programme. DfT and Steer need to do some serious analysis here on the "London XXX" ticket pools.
The questionable imposed logic screwed up the Crossrail journeys as it was still breaking every Crossrail journey at Paddington in the passenger numbers releases and inflating Crossrail journey numbers despite
Greater use of split ticketing is going to really break the matrix.
Given how heavily TfL-branded the Overground is, I suspect it wouldn’t occur to many passengers to buy a through ticket; passengers using Oyster for the Overground leg and then an NR ticket to Brighton would register as separate journeys.
Agreed
I suspect similar logic applies to a lot of these, though I’m not familiar enough with London railway geography to be sure off hand.
It does.
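To make the point concrete, here is a minimal sketch of how counting by ticket leg hides a through journey. The journey list and the station pairs are made up for illustration and this is not how the published matrix is actually built; it only shows the effect being described.

```python
from collections import Counter

# Hypothetical data: each journey is the list of ticket legs (origin, destination)
# that one passenger bought or touched in for. The pairs are illustrative only.
journeys = [
    # One passenger with a through ticket.
    [("Brondesbury", "Brighton")],
    # Two passengers using Oyster for the Overground leg, then a separate NR ticket.
    [("Brondesbury", "West Hampstead"), ("West Hampstead Thameslink", "Brighton")],
    [("Brondesbury", "West Hampstead"), ("West Hampstead Thameslink", "Brighton")],
]

def count_ticket_legs(journeys):
    """Count every ticket leg as its own origin-destination pair."""
    counts = Counter()
    for legs in journeys:
        counts.update(legs)
    return counts

def count_true_journeys(journeys):
    """Count each journey once, from the first origin to the last destination."""
    counts = Counter()
    for legs in journeys:
        counts[(legs[0][0], legs[-1][1])] += 1
    return counts

leg_counts = count_ticket_legs(journeys)
true_counts = count_true_journeys(journeys)

print(leg_counts[("Brondesbury", "Brighton")])   # 1: only the through ticket shows up
print(true_counts[("Brondesbury", "Brighton")])  # 3: what people actually travelled
```

In this toy example three people made the trip, but a leg-based count shows one, with the other two split across a local flow and a Thameslink flow.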
There are lots of issues with the O-D matrix because of the way ticket sales are, or are not, captured. The allocation of tickets sold to London Terminals between the actual stations is to some extent based on modelling, not on what actually happens. The use of zonal tickets and, even worse, free concessionary passes in London and some other PTE areas again means lots of assumptions about the number of trips between individual stations. Some of the examples you give above may also relate to people using contactless to get to a main line station and then buying a ticket from there to, say, Brighton or Heathrow.
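As a rough illustration of the London Terminals point, the sketch below splits a pooled ticket total across stations using fixed modelled shares. The function name, the shares and the sales figure are all hypothetical, and the real allocation method may differ; the point is only that a station missing from the shares gets nothing, however many people actually use it.

```python
def allocate_pool(pool_sales, modelled_shares):
    """Split pooled ticket sales across stations in proportion to modelled shares."""
    total = sum(modelled_shares.values())
    return {
        station: pool_sales * share / total
        for station, share in modelled_shares.items()
    }

# Hypothetical, out-of-date shares for tickets sold to "London Terminals".
# Farringdon is absent, so it cannot receive any of the pool.
stale_shares = {
    "London Victoria": 0.5,
    "London Bridge": 0.25,
    "London Blackfriars": 0.25,
}

allocation = allocate_pool(1000, stale_shares)
print(allocation)
print(allocation.get("Farringdon", 0.0))  # 0.0 regardless of real usage
```

If the shares are not refreshed when the network changes, the matrix keeps reproducing the old pattern.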
Given that this data was previously treated as a state secret I'm just grateful it has now been published warts and all.
Some of the modelling is decades old and there may still be plenty of BR-era assumptions for trains that haven't existed for decades.
For example, I mentioned the King's Cross – Sheffield (188,481) example from the previous page to a seasoned industry expert who has been involved in designing what became Oyster and also in Project Oval, and who was one of the people who forced the last big set of ORR data improvements in ~2011 (improvements, but lots of warts remaining). Their comment was:
"oh **** the Sheffield Pullmans* via Retford are still assumed to run"
* including the Master Cutler, which hadn't run on that route since 1968.
Alternatively, the King's Cross St Pancras Tube station might be causing some data issues with the St Pancras data flows.
It appears there are lots of assumptions from the previous big data improvement exercise in 2002-03 still in the system and plenty from before that too.
There should be a lot fewer warts in this year's release. Significant staff turnover in ORR's stats department appears to have brought through new people who are willing to sort issues out.
I imagine the vast majority of these sorts of examples are explained by people using Oyster or contactless PAYG (or holding Travelcards) for the 'local' journey within London and then buying onward tickets from London Terminals.
Agreed, but things will start to change with Crossrail in this coming release and with Oval in the release in two years' time. It also takes a while for people to adjust their ticket-buying habits.
As has also been mentioned on numerous previous occasions, your travel habits are very different to those of the average person; just because these are the sorts of journeys you do regularly, that's not necessarily going to result in significant numbers of journeys in the statistics.