I hadn't seen this thread before. Looking back through the postings, I can't see an answer to the original question (apologies if I've missed it), so I'm offering one now.
The primary reason that Class 156 units can go places that 158s can't is that the 158 is fitted with a yaw damper, which extends outside the gauge and therefore limits where it can run. This presumably applies on the West Highland.
As a company looking to introduce new trains, gauge is a primary issue for us and we have studied it for some time. Network Rail has an aspiration to have a complete gauging database for the whole network, enabling absolute gauging (see Note 1) to be adopted for new trains, but is some years away from achieving that goal. In the meantime, the data that NR offers carries a caveat of a 100 mm (4") tolerance, particularly in the six foot (i.e. the space between one track and the adjacent one). This is no use whatsoever to a train designer, because it potentially means that the train must be 100 mm narrower than it actually needs to be. Bonkers!
At present, therefore, we have to use comparative gauging (see Note 2), which means finding a good comparator vehicle. When we carried out this exercise by trawling through the Sectional Appendices, we found that the most "go anywhere" train on the network is the Class 156 DMU, which can go places that Classes 14x, 150 and 158 (for differing reasons) can't. For this reason, our 23 metre products for the UK closely follow the existing Class 156 profile.
Note 1: absolute gauging: where a computer model of the dynamic
performance of the new train is superimposed on the computer model of
the infrastructure.
Note 2: comparative gauging: where a computer model of the new train is
compared to that of an existing train which already operates over the
routes in question.
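
For anyone who wants to see the idea behind Note 2 in concrete form, here is a very rough Python sketch. It is a simplification and not how a real gauging package works: it compares static half-widths at a few heights, whereas a real comparison uses the swept envelopes of both vehicles. The function name, the profile format and every dimension are made up for illustration and are not Class 156 or 158 data.

```python
from typing import Dict, List, Tuple

# Hypothetical profile format: height above rail (mm) -> half-width (mm).
Profile = Dict[int, float]


def exceedances(new: Profile, comparator: Profile) -> List[Tuple[int, float]]:
    """List (height, excess_mm) wherever the new vehicle is wider than the comparator."""
    result = []
    for height, half_width in sorted(new.items()):
        if height not in comparator:
            raise KeyError(f"comparator has no data at height {height} mm")
        excess = half_width - comparator[height]
        if excess > 0:
            result.append((height, excess))
    return result


# Illustrative numbers only, not real vehicle dimensions.
comparator = {500: 1300.0, 1000: 1350.0, 2000: 1350.0, 3000: 1100.0}

# A new design kept just inside the comparator at every height.
candidate = {h: w - 10.0 for h, w in comparator.items()}

# The same design with a low-level fitting that sticks out 40 mm.
protruding = dict(candidate)
protruding[500] = 1340.0

print(exceedances(candidate, comparator))   # nothing exceeds the comparator
print(exceedances(protruding, comparator))  # one exceedance, at the 500 mm height
```

An empty list means the new vehicle sits inside the comparator at every height checked, so on this crude test it could go wherever the comparator goes. Any entry in the list is a place where the new vehicle is bigger and would need looking at separately.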