
ORR Origin-Destination Matrix 2021-22

crucible72

Member
Joined
5 Nov 2016
Messages
39
Here's a version sorted by passenger count, so any rows you lose will be flows with 1 or 2 passengers a year.


Oh, I also asked ORR about these; apparently most commonly caused by incomplete contactless journeys, but also something related to refunds - not entirely sure on the details there. There are a few outside Oysterland but it's much rarer.
That's a great data set, thanks for that. Is it possible to sort it by passenger count or origin nlc the other way round,
so that the remaining part of the data can be viewed in Excel? Unless someone knows a better tool for viewing it.
 

Gaelan

Member
Joined
3 Apr 2023
Messages
836
Location
St Andrews
That's a great data set, thanks for that. Is it possible to sort it by passenger count or origin nlc the other way round,
so that the remaining part of the data can be viewed in Excel? Unless someone knows a better tool for viewing it.
I’ve done so! It’s linked in the OP, under “edit”
 

crucible72

Member
Joined
5 Nov 2016
Messages
39
I’ve done so! It’s linked in the OP, under “edit”
Thanks but what I meant was have it sorted so that the combinations with smallest passenger counts appear at the top, so
that the rarest combinations can be viewed.
 

NorthOxonian

Established Member
Associate Staff
Buses & Coaches
Joined
5 Jul 2018
Messages
1,492
Location
Oxford/Newcastle
There are 103 stations with no journeys to any London Terminal recorded. As well as the usual suspects such as Teesside Airport, Reddish South and Altnabreac, it also includes a lot of Glasgow suburban stations, which seems to be a bug in the data, as I find it hard to believe that of the 103k people who used e.g. Hillington West, not a single one of them was travelling to London.
Argyle Street is probably the starkest case of this – 386,577 tickets were from there, yet just 24 were to destinations outwith Scotland (and most of these were to places like Newcastle, Windermere, or Lancaster which are relatively close to the border). None were to London, or indeed anywhere further south than Birmingham. I suppose this is a little more plausible as it's so close to Glasgow Central that longer distance passengers would just book to there instead (and those travelling from Glasgow would just walk to Central).

With a figure of 99.99%, Argyle Street had the highest percentage of its tickets being to destinations within the region/nation (I excluded a few very quiet stations where no tickets at all were sold to other regions; mostly these were in the Highlands). Other Glasgow suburban stations dominate this list including Ashfield (99.98%), Bridgeton (99.96%), and Anderston (99.96%). As mentioned, Merseyrail also has some stations with similar patterns, like Bootle Oriel Road: out of over quarter of a million tickets, just 130 were to places beyond the North West, and just four Merseyrail stations (Liverpool Central, Southport, Blundellsands & Crosby, Hunts Cross) accounted for half of all tickets. The flipside of this is Kings Sutton - technically this is in Northamptonshire and so the East Midlands, but most passengers were travelling to the South East (Banbury/Oxford), West Midlands (mostly Warwick or Birmingham), or to a lesser extent London. Just 38 out of its 22,766 tickets were to/from other stations in the East Midlands, with Northampton the main destination.
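As a rough check on the percentages above, the within-region share is just the within-region ticket count divided by the station total. A minimal sketch using only the figures quoted in this post; the function name is made up for illustration, and the Kings Sutton count is the to/from figure as stated above:

```python
def within_region_share(total_tickets, tickets_within_region):
    """Percentage of a station's tickets that stay inside its own region/nation."""
    return 100.0 * tickets_within_region / total_tickets


# Argyle Street: 386,577 tickets, of which 24 were to destinations outwith Scotland.
argyle_street = within_region_share(386_577, 386_577 - 24)

# Kings Sutton: 38 of 22,766 tickets were to/from other East Midlands stations.
kings_sutton = within_region_share(22_766, 38)

print(f"Argyle Street: {argyle_street:.2f}% within Scotland")
print(f"Kings Sutton: {kings_sutton:.2f}% within the East Midlands")
```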
 

rob.rjt

Member
Joined
13 Mar 2010
Messages
85
That's a great data set, thanks for that. Is it possible to sort it by passenger count or origin nlc the other way round,
so that the remaining part of the data can be viewed in Excel? Unless someone knows a better tool for viewing it.
Unless Gaelan is happy to do it, the best thing to do is probably put it into a database like Access and run queries on it that way. Took me 5 minutes, most of which was checking the syntax for the queries
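For anyone without Access, the same idea can be sketched with SQLite from Python. This is only an illustration: the table name `odm` and the column names `origin`, `destination` and `journeys` are assumptions, not the real headers in the ORR file, and the sample rows are a few flows quoted in this thread.

```python
import sqlite3

# A few flows quoted in this thread, standing in for the full matrix.
rows = [
    ("Reading", "Reading", 35109),
    ("Brondesbury", "Brighton", 18),
    ("Cricklewood", "Southampton Central", 7),
    ("Cricklewood", "Heathrow T5", 4),
]

conn = sqlite3.connect(":memory:")
# Hypothetical schema; adjust to match the real column headers.
conn.execute("CREATE TABLE odm (origin TEXT, destination TEXT, journeys INTEGER)")
conn.executemany("INSERT INTO odm VALUES (?, ?, ?)", rows)


def rarest_flows(connection, limit):
    """Return the flows with the fewest journeys, smallest first."""
    cursor = connection.execute(
        "SELECT origin, destination, journeys FROM odm "
        "ORDER BY journeys ASC, origin, destination LIMIT ?",
        (limit,),
    )
    return cursor.fetchall()


rarest = rarest_flows(conn, 2)
print(rarest)
```

Sorting ascending and applying a LIMIT keeps the result small enough for Excel while showing the rarest combinations first.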
 

MikeWM

Established Member
Joined
26 Mar 2010
Messages
4,438
Location
Ely
Fascinating stuff - many thanks for making this easily available! I've already found a few journeys I make that only have a single-digit number of sales, which is surprising (and interesting!).
 

b0b

Established Member
Joined
25 Jan 2010
Messages
1,335
Argyle Street is probably the starkest case of this – 386,577 tickets were from there, yet just 24 were to destinations outwith Scotland (and most of these were to places like Newcastle, Windermere, or Lancaster which are relatively close to the border).

Honestly, that makes a lot of sense: not many people live in that area, and anyone travelling to/from Argyle Street is likely commuting. For longer distances you'd simply walk to Central or Queen Street.
 

A S Leib

Member
Joined
9 Sep 2018
Messages
803
Probably better to set up a new thread when it's out, but the ORR station usage numbers for April 2022 to March 2023 will be published on 14 December.
 

b0b

Established Member
Joined
25 Jan 2010
Messages
1,335
This data is also super useful for software developers: it gives the ability to cache the most common flows.
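A minimal sketch of that caching idea, assuming the matrix has been loaded into a dict keyed by (origin, destination). The helper names and the `fake_fare_lookup` function are hypothetical stand-ins for whatever expensive query an application actually runs; the counts are ones quoted elsewhere in this thread.

```python
def top_flows(flow_counts, n):
    """Return the n busiest (origin, destination) pairs, busiest first."""
    ranked = sorted(flow_counts.items(), key=lambda item: (-item[1], item[0]))
    return [flow for flow, _count in ranked[:n]]


def warm_cache(flow_counts, n, lookup):
    """Precompute lookup(origin, destination) for the n busiest flows."""
    return {flow: lookup(*flow) for flow in top_flows(flow_counts, n)}


# Journey counts quoted elsewhere in this thread.
flow_counts = {
    ("King's Cross", "Sheffield"): 188_481,
    ("Brondesbury", "Brighton"): 18,
    ("Gospel Oak", "Reading"): 19,
    ("Cricklewood", "Southampton Central"): 7,
}


def fake_fare_lookup(origin, destination):
    # Hypothetical stand-in for an expensive fares or routing query.
    return f"result for {origin} to {destination}"


cache = warm_cache(flow_counts, 2, fake_fare_lookup)
print(list(cache))
```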
 

hwl

Established Member
Joined
5 Feb 2012
Messages
7,443
Probably better to set up a new thread when it's out, but the ORR station usage numbers for April 2022 to March 2023 will be published on 14 December.
And hopefully a new OD matrix shortly thereafter...

This month's planned ORR releases:
Passenger rail performance, Jul - Sep 2023 (Q2): 07/12/2023
Freight rail usage and performance, Jul - Sep 2023 (Q2): 12/12/2023
Estimates of station usage, Annual (2022-23): 14/12/2023
Passenger rail usage, Jul - Sep 2023 (Q2): 19/12/2023

The planned dates show ORR catching up as the Q2 releases are only 4-6 weeks post Q1.
 

telstarbox

Established Member
Joined
23 Jul 2010
Messages
5,968
Location
Wennington Crossovers
Thanks for posting this.

Comparing this to the station totals which ORR also provide suggests that this data is single direction (so the total return demand between stations A and B is the A to B row PLUS the B to A row).

For most station pairs outside London /Oyster, the A to B and B to A values are the same which is to be expected.
 

miklcct

On Moderation
Joined
2 May 2021
Messages
4,388
Location
Cricklewood
Out of interest, can a journey with a change of trains include non-National Rail legs in between, such as tube and buses?
 

greatkingrat

Established Member
Joined
20 Jan 2011
Messages
2,798
Actually, the chance of A-B and B-A being identical (for reasonably large flows at least) is tiny. I would expect them to be similar but not exactly the same. While the vast majority will buy returns, there will always be some single journeys.

I suspect they have taken the average of A-B and B-A and used that figure for both directions.
 

hwl

Established Member
Joined
5 Feb 2012
Messages
7,443
Thanks for posting this.

Comparing this to the station totals which ORR also provide suggests that this data is single direction (so the total return demand between stations A and B is the A to B row PLUS the B to A row).

For most station pairs outside London /Oyster, the A to B and B to A values are the same which is to be expected.
The non-TfL-supplied data has been normalised so that A-->B and B-->A match. This may not actually be correct in practice, but it is an underlying assumption. TfL provides good actual user data showing that flows are often not symmetric.

The higher-level ORR station user totals for 2021-22 are the sum of A-->? and ?-->A from the 2021-22 ODM.
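That relationship can be sketched as below. The flows are made-up placeholder numbers (stations A, B and C are not real), and the function names are only illustrative; the second helper flags pairs where the two directions differ, which per the above should only happen for TfL-supplied data.

```python
from collections import defaultdict


def station_totals(flows):
    """Sum A->? and ?->A for every station, as described for the ORR totals."""
    totals = defaultdict(int)
    for origin, destination, journeys in flows:
        totals[origin] += journeys       # counts towards A -> ?
        totals[destination] += journeys  # counts towards ? -> A
    return dict(totals)


def asymmetric_pairs(flows):
    """Return station pairs whose A->B and B->A counts differ."""
    counts = {(o, d): n for o, d, n in flows}
    pairs = {tuple(sorted((o, d))) for o, d in counts if o != d}
    return sorted(p for p in pairs if counts.get(p, 0) != counts.get(p[::-1], 0))


# Placeholder data for illustration only.
flows = [("A", "B", 10), ("B", "A", 10), ("A", "C", 3), ("C", "A", 5)]

totals = station_totals(flows)
uneven = asymmetric_pairs(flows)
print(totals, uneven)
```

Note that a same-station row (A to A) would be counted twice by `station_totals`; whether ORR does the same is not something this thread confirms.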
 

miklcct

On Moderation
Joined
2 May 2021
Messages
4,388
Location
Cricklewood
I wonder how many people can find journeys here that they are the only person/group that made.
I've found 2 within that time period.
I look forward to future versions, where I think I might have some more unique journeys.
Also I am surprised that Bournemouth to most North London Line stations has only single digit usage, e.g. 5 to Brondesbury, 1 to Hampstead Heath, 2 to Gospel Oak. I didn't make such journeys in 2021-2022 but made many more than that in 2022-2023, and I hope to see how the numbers change afterwards.
 

Craig1122

Member
Joined
14 May 2021
Messages
290
Location
UK
Why are there 182,057 journeys from Clapham Junction to Clapham Junction?
I live in the South East but outside the Oyster zone. Almost every train load getting off at my local station will see at least one person try to touch out on the TOC smart card reader with an oyster card. I presume this is duplicated across multiple stations each day. I'm guessing this is the most likely cause of the majority of same station journeys in the SE.
 

crucible72

Member
Joined
5 Nov 2016
Messages
39
Another thing notable about this data is that it includes journeys where the origin and destination
are the same. I am not sure how this is possible. For example, there are 35,109 recorded journeys
from Reading to Reading, despite there being no fare for this on brfares.
 

Gaelan

Member
Joined
3 Apr 2023
Messages
836
Location
St Andrews
Another thing notable about this data is that it includes journeys where the origin and destination
are the same. I am not sure how this is possible. For example, there are 35,109 recorded journeys
from Reading to Reading, despite there being no fare for this on brfares.
These journeys can appear for a number of reasons, most notably incomplete contactless journeys.
 

hwl

Established Member
Joined
5 Feb 2012
Messages
7,443
Another thing notable about this data is that it includes journeys where the origin and destination
are the same. I am not sure how this is possible. For example, there are 35,109 recorded journeys
from Reading to Reading, despite there being no fare for this on brfares.
Previously explained on page 1 of this thread and no doubt @Gaelan and others will have to keep repeating the explanation at least several times a page at the current rate...

Reading has been a TfL contactless station for over four years now, and every TfL contactless station has unresolved incomplete journeys. This means only one station is identified, and because they can't tell whether it is the origin or the destination, it is listed as both.

You have just discovered that BRfares doesn't list all fares, despite the claim at the top of their banner: "Look up fares for any train journey in Britain".
They also don't list advances with demand based pricing structures. "Look up fares for most train journeys in Britain" would be more accurate.
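If the same-station rows get in the way of an analysis, they are easy to separate out. A small sketch, assuming each row is an (origin, destination, journeys) tuple; the function name is made up and the figures are ones quoted in this thread:

```python
def split_same_station(flows):
    """Separate rows where origin == destination from genuine A-to-B flows."""
    same = [row for row in flows if row[0] == row[1]]
    real = [row for row in flows if row[0] != row[1]]
    return same, real


flows = [
    ("Reading", "Reading", 35_109),
    ("Clapham Junction", "Clapham Junction", 182_057),
    ("Gospel Oak", "Reading", 19),
]

same_station, genuine = split_same_station(flows)
unresolved_total = sum(journeys for _o, _d, journeys in same_station)
print(unresolved_total)
```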
 

Alex365Dash

Member
Joined
2 Jul 2019
Messages
677
Location
Brighton
They also don't list advances with demand based pricing structures.
Actually, they do - each Advance tier is shown on their site. TfL PAYG fares are also shown in theory (except maximum fares, but do those count?), although because it's based off the National Rail data feed rather than the TfL one, the data in there tends to be outdated.
 

Watershed

Veteran Member
Associate Staff
Senior Fares Advisor
Joined
26 Sep 2020
Messages
12,284
Location
UK
Actually, they do - each Advance tier is shown on their site. TfL PAYG fares are also shown in theory (except maximum fares, but do those count?), although because it's based off the National Rail data feed rather than the TfL one, the data in there tends to be outdated.
Indeed that's why there's a corresponding sister site, LT Fares, for TfL PAYG fares which takes data directly from the TfL feeds.
 

miklcct

On Moderation
Joined
2 May 2021
Messages
4,388
Location
Cricklewood
Previously explained on page 1 of this thread and no doubt @Gaelan and others will have to keep repeating the explanation at least several times a page at the current rate...

Reading has been a TfL contactless station for over four years now, and every TfL contactless station has unresolved incomplete journeys. This means only one station is identified, and because they can't tell whether it is the origin or the destination, it is listed as both.

You have just discovered that BRfares doesn't list all fares, despite the claim at the top of their banner: "Look up fares for any train journey in Britain".
They also don't list advances with demand based pricing structures. "Look up fares for most train journeys in Britain" would be more accurate.
I am investigating the data further and I am sceptical about the credibility of some of the figures, for example:

Brondesbury - Brighton: 18 (I am really surprised that, if this is true, there are so few people living in North London who fancy a seaside trip - I have done this a number of times on a West Hampstead - Brighton ticket when the Thameslink core is closed)
Cricklewood - Caledonian Road & Barnsbury: 25
Cricklewood - Heathrow T5: 4 / Heathrow T2&3: 7 (I know this is pre-Elizabeth line, but wasn't the fastest way back then to take Thameslink to St Pancras, then the Express from Paddington?)
Cricklewood - Southampton Central: 7
Cricklewood - Hackney Downs: 11 (Did everyone really use Hackney Central, even on the days the North London Line was closed?)
Cricklewood - Stratford International: 12 (I also did this a number of times; this was pre-Elizabeth line and the simple change at St Pancras trumps all other methods for speed)
Farringdon - Folkestone West / Sandwich / Ramsgate: 2 (are you really serious that no one commutes between these Kent towns and Farringdon transferring at St Pancras?!)
London Bridge - London Road (Guildford): 7 (does really no one living near that suburban station commute to the City by changing at Waterloo?)
Gospel Oak - Southampton Central: 15
Gospel Oak - Haywards Heath: 18
Gospel Oak - Reading: 19
West Hampstead - Milton Keynes: 10
 

Gaelan

Member
Joined
3 Apr 2023
Messages
836
Location
St Andrews
I am investigating the data further and I am sceptical about the credibility of some of the figures, for example:

Brondesbury - Brighton: 18 (I am really surprised that, if this is true, there are so few people living in North London who fancy a seaside trip - I have done this a number of times on a West Hampstead - Brighton ticket when the Thameslink core is closed)
Given how heavily TfL-branded the overground is, I suspect it wouldn’t occur to many passengers to buy a through ticket; passengers using Oyster for the overground leg then an NR ticket to Brighton would register as separate journeys.

I suspect similar logic applies to a lot of these, though I’m not familiar enough with London railway geography to be sure off hand.
 

deltic

Established Member
Joined
8 Feb 2010
Messages
3,265
I am investigating the data further and I am sceptical about the credibility of some of the figures, for example:

Brondesbury - Brighton: 18 (I am really surprised that, if this is true, there are so few people living in North London who fancy a seaside trip - I have done this a number of times on a West Hampstead - Brighton ticket when the Thameslink core is closed)
Cricklewood - Caledonian Road & Barnsbury: 25
Cricklewood - Heathrow T5: 4 / Heathrow T2&3: 7 (I know this is pre-Elizabeth line, but wasn't the fastest way back then to take Thameslink to St Pancras, then the Express from Paddington?)
Cricklewood - Southampton Central: 7
Cricklewood - Hackney Downs: 11 (Did everyone really use Hackney Central, even on the days the North London Line was closed?)
Cricklewood - Stratford International: 12 (I also did this a number of times; this was pre-Elizabeth line and the simple change at St Pancras trumps all other methods for speed)
Farringdon - Folkestone West / Sandwich / Ramsgate: 2 (are you really serious that no one commutes between these Kent towns and Farringdon transferring at St Pancras?!)
London Bridge - London Road (Guildford): 7 (does really no one living near that suburban station commute to the City by changing at Waterloo?)
Gospel Oak - Southampton Central: 15
Gospel Oak - Haywards Heath: 18
Gospel Oak - Reading: 19
West Hampstead - Milton Keynes: 10
There are lots of issues related to the O-D matrix due to the way ticket sales are or are not captured. The allocation of tickets sold to London Terminals between each of the actual stations is to some extent based on modelling, not on what actually happens. The use of zonal tickets and, even worse, free concessionary passes in London and some other PTE areas again means lots of assumptions about the number of trips between individual stations. Some of the examples you give above may also relate to people using contactless to get to a main line station and then buying a ticket from there to, say, Brighton or Heathrow.

Given that this data was previously treated as a state secret I'm just grateful it has now been published warts and all.
 

Watershed

Veteran Member
Associate Staff
Senior Fares Advisor
Joined
26 Sep 2020
Messages
12,284
Location
UK
I am investigating the data further and I am sceptical about the credibility of some of the figures, for example:

Brondesbury - Brighton: 18 (I am really surprised that, if this is true, there are so few people living in North London who fancy a seaside trip - I have done this a number of times on a West Hampstead - Brighton ticket when the Thameslink core is closed)
Cricklewood - Caledonian Road & Barnsbury: 25
Cricklewood - Heathrow T5: 4 / Heathrow T2&3: 7 (I know this is pre-Elizabeth line, but wasn't the fastest way back then to take Thameslink to St Pancras, then the Express from Paddington?)
Cricklewood - Southampton Central: 7
Cricklewood - Hackney Downs: 11 (Did everyone really use Hackney Central, even on the days the North London Line was closed?)
Cricklewood - Stratford International: 12 (I also did this a number of times; this was pre-Elizabeth line and the simple change at St Pancras trumps all other methods for speed)
Farringdon - Folkestone West / Sandwich / Ramsgate: 2 (are you really serious that no one commutes between these Kent towns and Farringdon transferring at St Pancras?!)
London Bridge - London Road (Guildford): 7 (does really no one living near that suburban station commute to the City by changing at Waterloo?)
Gospel Oak - Southampton Central: 15
Gospel Oak - Haywards Heath: 18
Gospel Oak - Reading: 19
West Hampstead - Milton Keynes: 10
I imagine the vast majority of these sorts of examples are explained by people using Oyster or contactless PAYG (or holding Travelcards) for the 'local' journey within London and then buying onward tickets from London Terminals. As has also been mentioned on numerous previous occasions, your travel habits are very different to those of the average person; just because these are the sorts of journeys you do regularly, that's not necessarily going to result in significant numbers of journeys in the statistics.
 

telstarbox

Established Member
Joined
23 Jul 2010
Messages
5,968
Location
Wennington Crossovers
To turn it around, if you look at all the flows from Cricklewood in rank order of journeys, are the top five destinations logical or surprising? I would not be surprised if 75% are to Thameslink stations.

Cricklewood to Southampton (and others) would fall into the "easier to drive" category for lots of people, especially if they're not going to central Soton.

West Hampstead to MK would probably be a bus to Euston first?
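The rank-order check suggested above could be sketched like this. The rows are only the handful of Cricklewood flows quoted in this thread, not the full list, so the output is illustrative; for the real question you would pass in the complete matrix and a set of Thameslink stations. Function names are made up.

```python
def top_destinations(flows, origin, n):
    """Return the n busiest destinations from one origin, busiest first."""
    from_origin = [(d, j) for o, d, j in flows if o == origin]
    return sorted(from_origin, key=lambda item: (-item[1], item[0]))[:n]


def share_to(flows, origin, stations):
    """Fraction of journeys from origin that go to any station in the given set."""
    total = sum(j for o, _d, j in flows if o == origin)
    matched = sum(j for o, d, j in flows if o == origin and d in stations)
    return matched / total if total else 0.0


# Only the flows quoted in this thread; the real Cricklewood list is much longer.
flows = [
    ("Cricklewood", "Caledonian Road & Barnsbury", 25),
    ("Cricklewood", "Heathrow T5", 4),
    ("Cricklewood", "Heathrow T2&3", 7),
    ("Cricklewood", "Southampton Central", 7),
    ("Cricklewood", "Hackney Downs", 11),
    ("Cricklewood", "Stratford International", 12),
    ("Gospel Oak", "Reading", 19),
]

top_two = top_destinations(flows, "Cricklewood", 2)
airport_share = share_to(flows, "Cricklewood", {"Heathrow T5", "Heathrow T2&3"})
print(top_two, round(airport_share, 3))
```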
 

hwl

Established Member
Joined
5 Feb 2012
Messages
7,443
I am investigating the data further and I am sceptical about the credibility of some of the figures, for example:

Brondesbury - Brighton: 18 (I am really surprised that, if this is true, there are so few people living in North London who fancy a seaside trip - I have done this a number of times on a West Hampstead - Brighton ticket when the Thameslink core is closed)
Cricklewood - Caledonian Road & Barnsbury: 25
Cricklewood - Heathrow T5: 4 / Heathrow T2&3: 7 (I know this is pre-Elizabeth line, but wasn't the fastest way back then to take Thameslink to St Pancras, then the Express from Paddington?)
Cricklewood - Southampton Central: 7
Cricklewood - Hackney Downs: 11 (Did everyone really use Hackney Central, even on the days the North London Line was closed?)
Cricklewood - Stratford International: 12 (I also did this a number of times; this was pre-Elizabeth line and the simple change at St Pancras trumps all other methods for speed)
Farringdon - Folkestone West / Sandwich / Ramsgate: 2 (are you really serious that no one commutes between these Kent towns and Farringdon transferring at St Pancras?!)
London Bridge - London Road (Guildford): 7 (does really no one living near that suburban station commute to the City by changing at Waterloo?)
Gospel Oak - Southampton Central: 15
Gospel Oak - Haywards Heath: 18
Gospel Oak - Reading: 19
West Hampstead - Milton Keynes: 10
You are putting too much faith in the ODM actually reflecting people's origins and destinations. They are the origin and destination of each ticket leg, with some imposed system logic and some dodgy allocations.

"London Bridge - London Road (Guildford)" Waterloo - London Road (Guildford) and walk/tube from Waterloo is more likely for the average users given Waterloo East - Waterloo interchange time and walking time to London Bridge station.

The allocations will undercount Farringdon from the "London XXX" station pools. One would hope to see big improvements in this year's data, as it obviously hasn't been refreshed for the completion of the Thameslink Programme. DfT and Steer need to do some serious analysis here on the "London XXX" ticket pools.

The questionable imposed logic screwed up the Crossrail journeys as it was still breaking every Crossrail journey at Paddington in the passenger numbers releases and inflating Crossrail journey numbers despite

Greater use of split ticketing is going to really break the matrix.
Given how heavily TfL-branded the overground is, I suspect it wouldn’t occur to many passengers to buy a through ticket; passengers using Oyster for the overground leg then an NR ticket to Brighton would register as separate journeys.
Agreed
I suspect similar logic applies to a lot of these, though I’m not familiar enough with London railway geography to be sure off hand.
It does.
There are lots of issues related to the O-D matrix due to the way ticket sales are or are not captured. The allocation of tickets sold to London Terminals between each of the actual stations is to some extent based on modelling, not on what actually happens. The use of zonal tickets and, even worse, free concessionary passes in London and some other PTE areas again means lots of assumptions about the number of trips between individual stations. Some of the examples you give above may also relate to people using contactless to get to a main line station and then buying a ticket from there to, say, Brighton or Heathrow.

Given that this data was previously treated as a state secret I'm just grateful it has now been published warts and all.
Some of the modelling is decades old, and there may still be plenty of BR-era assumptions for trains that haven't existed for decades.
For example, I mentioned the King's Cross – Sheffield (188,481) example from the previous page to a seasoned industry expert who was involved in designing what became Oyster and also in Project Oval, and who was one of the people who forced the last big set of ORR data improvements in ~2011 (improvements, but with lots of warts remaining). Their comment was "oh **** the Sheffield Pullmans* via Retford are still assumed to run" (* including the Master Cutler, which hasn't run on that route since 1968). Alternatively, King's Cross St Pancras tube station might be causing some data issues with St Pancras data flows.
It appears there are lots of assumptions from the previous big data improvement exercise in 2002-03 still in the system, and plenty from before that too.

There should be a lot fewer warts in this year's release. Significant staff turnover in ORR's stats department appears to have brought through new people willing to sort issues out.
I imagine the vast majority of these sorts of examples are explained by people using Oyster or contactless PAYG (or holding Travelcards) for the 'local' journey within London and then buying onward tickets from London Terminals.
Agreed, but things will start to change with Crossrail in this coming release and with Oval in the release two years from now. It also takes a while for people to adjust their ticket-buying habits.
As has also been mentioned on numerous previous occasions, your travel habits are very different to those of the average person; just because these are the sorts of journeys you do regularly, that's not necessarily going to result in significant numbers of journeys in the statistics.
 

DDB

Member
Joined
11 Sep 2011
Messages
486
I've been trying to wrestle my old laptop into being able to have a look at these. Thanks to the OP for making the sorted version available that allowed me to chop off the lowest number ones.

What I've seen so far is that I think it does break down whenever you have station groups and they would have been better just reporting it by station groups.

For instance, Dorchester West to Poole is listed. I would bet that no such ticket has actually been sold. Everyone going to Poole from Dorchester goes from Dorchester South, so I assume that a proportion from the "Dorchester Stations" group has been allocated.

If anyone has the computing power, it would be interesting to see if members of the major station groups always appear in the same proportion from long distance stations, i.e. do the Birmingham stations always appear in the same relative proportions from any station far enough away that the destination is Birmingham Stations?

Also, the two small stations between which I have my annual season ticket have 270 journeys. I wonder what proportion are due to me? How many would be allocated for an annual season ticket? In fact one of the stations is the boundary with my Robin Hood annual area season ticket, which I use in combination. So I'm not actually making any of the 270 journeys, but I doubt the system knows that. Although I was once surveyed by some travel stats people, so maybe they do?
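The station-group question above does not need much computing power once the rows are filtered. A minimal sketch under stated assumptions: the flows are made-up placeholder numbers (the origins "Origin X" and "Origin Y" are not real), the group membership is supplied by hand, and the function names are only illustrative. If the split really is a fixed allocation, every distant origin should show the same shares.

```python
def group_shares(flows, group):
    """For each origin, the share of its group-bound journeys going to each member."""
    per_origin = {}
    for origin, destination, journeys in flows:
        if destination in group:
            counts = per_origin.setdefault(origin, {})
            counts[destination] = counts.get(destination, 0) + journeys
    shares = {}
    for origin, counts in per_origin.items():
        total = sum(counts.values())
        if total:  # skip origins with no journeys to the group
            shares[origin] = {d: counts.get(d, 0) / total for d in sorted(group)}
    return shares


def shares_are_constant(shares, tolerance=0.01):
    """True if every origin shows the same split across the group, within tolerance."""
    rows = list(shares.values())
    if not rows:
        return True
    first = rows[0]
    return all(abs(row[d] - first[d]) <= tolerance for row in rows for d in first)


birmingham = {"Birmingham New Street", "Birmingham Moor Street", "Birmingham Snow Hill"}

# Placeholder numbers for illustration only, not taken from the ORR data.
flows = [
    ("Origin X", "Birmingham New Street", 800),
    ("Origin X", "Birmingham Moor Street", 150),
    ("Origin X", "Birmingham Snow Hill", 50),
    ("Origin Y", "Birmingham New Street", 80),
    ("Origin Y", "Birmingham Moor Street", 15),
    ("Origin Y", "Birmingham Snow Hill", 5),
]

shares = group_shares(flows, birmingham)
fixed_split = shares_are_constant(shares)
print(shares["Origin X"], fixed_split)
```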
 

Watershed

Veteran Member
Associate Staff
Senior Fares Advisor
Joined
26 Sep 2020
Messages
12,284
Location
UK
I've been trying to wrestle my old laptop into being able to have a look at these. Thanks to the OP for making the sorted version available that allowed me to chop off the lowest number ones.

What I've seen so far is that I think it does break down whenever you have station groups and they would have been better just reporting it by station groups.

For instance, Dorchester West to Poole is listed. I would bet that no such ticket has actually been sold. Everyone going to Poole from Dorchester goes from Dorchester South, so I assume that a proportion from the "Dorchester Stations" group has been allocated.

If anyone has the computing power, it would be interesting to see if members of the major station groups always appear in the same proportion from long distance stations, i.e. do the Birmingham stations always appear in the same relative proportions from any station far enough away that the destination is Birmingham Stations?

Also, the two small stations between which I have my annual season ticket have 270 journeys. I wonder what proportion are due to me? How many would be allocated for an annual season ticket? In fact one of the stations is the boundary with my Robin Hood annual area season ticket, which I use in combination. So I'm not actually making any of the 270 journeys, but I doubt the system knows that. Although I was once surveyed by some travel stats people, so maybe they do?
I believe an annual season ticket is counted as 232 return journeys a year. But yes, I agree that the group station allocation data seems rather suspect, especially since it is apparently reliant on a 2002 survey - when travel patterns and timetables on many lines were completely different.
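Taking that 232 figure at face value (it is only "I believe", so treat it as an assumption), the arithmetic for the 270-journey flow mentioned above can be sketched as follows. It also assumes the 270 counts one direction only, which matches the single-direction reading of the data discussed earlier in the thread.

```python
RETURNS_PER_ANNUAL_SEASON = 232  # figure quoted in this thread, not verified


def journeys_per_direction(annual_seasons):
    """Each return journey contributes one journey in each direction."""
    return annual_seasons * RETURNS_PER_ANNUAL_SEASON


def total_single_journeys(annual_seasons):
    """Both directions added together."""
    return 2 * journeys_per_direction(annual_seasons)


one_season = journeys_per_direction(1)
# Assumes the 270 journeys quoted above are for one direction only.
share_of_flow = one_season / 270

print(one_season, total_single_journeys(1), f"{share_of_flow:.0%}")
```

On those assumptions, a single annual season ticket would account for most of a 270-journey flow.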
 
