Bad Performance Measurement on Tour (#1)

A friend at work told me that whilst giving blood recently, he noticed there were performance indicators on a nearby board showing the number of blood donations, in the dreaded ‘this week vs last week’ format. Nothing’s sacred, is it?

Anyway, this gave me an idea – since there are so many examples of bad performance measurement around in all walks of life (e.g. binary comparisons, meaningless measures, numerical targets and so on), I thought I’d start sharing a few on here as I come across them. They torment me, so I might as well inflict them on you as well.

I therefore bring you the first instalment from the ‘Bad Performance Measurement on Tour’ extravaganza, courtesy of a noticeboard on the wall of my local railway station.

This poster relates to overall performance for train journeys (I think). What does it tell us? Wow, compared to last month, something has changed by over one whole percent. (Note the big ‘down arrow’). Feel the burn, you naughty workers! Is this within the range of normal variation? Who knows. Is it part of a trend? No idea.

It’s certainly below average. So are about half of most things.

Great example of a meaningless poster. It made me chuckle whilst I was waiting for my train, which incidentally was on time. Maybe next month, the percentage might be slightly above average and this will be recognised with a nice big ‘up arrow’. Hurrah!

Here’s some more, from the same ‘noticeboard of shame’.

This chart claims to be able to identify ‘trends’ by comparing two values (i.e. ‘this month’s percentage vs the annual average’). Sorry, a trend can never be established by making a comparison like this. Nil points, as they say at Eurovision. Notice the only row in the top table that is awarded an ‘up arrow’ features the lowest percentage of the four rows, plus it is also below the average. So, is this good, or bad, or indifferent? Is performance getting better or worse? No one knows. The table doesn’t tell us anything useful.

Finally, here’s the third of the ugly sisters…

This table includes another flawed binary comparison made against an average, but it also features the added bonus of a totally arbitrary numerical target of 90%. Why 90%? What science dreamed this one up? Because it’s a nice round figure? Because it was drawn out of a hat? No one knows. Note also that a train is classed as being on time if it is ‘within ten minutes of its scheduled time’. Is this what ‘on time’ means to the customer? If it departs nine minutes early or nine minutes late then is that the same thing as being ‘on time’? Where does the ‘ten minutes either way’ threshold come from anyway? Another arbitrary target.

The funniest thing about this set of measures is that by setting this punctuality target at 90%, the Passenger’s Charter is effectively saying, ‘We plan for 10% of trains to be late’. Think about it. They’re happy with 90% punctuality. It’s in their plan! That’s what they aim for. Not 100%. Not the best they can do. 90%!

The major flaw in the way these performance measures are presented is that they don’t tell us anything about the capability of the system. Neither do they inform the reader of any actual trends, or help predict what performance will look like into the future. The numbers are of no practical use to the customer whatsoever. If the train companies’ management rely on this sort of thing to make decisions they may as well determine strategy based on the National Lottery numbers.

Conversely, by relying on the right measures and presenting them in a format that exposes the extent of variation, along with any trends or signals that might be present (i.e. on my old favourite, a control chart), managers can gain an understanding of how the system is performing and therefore take evidence-based steps to improve it. Measures that are derived from purpose (as defined from the customer’s perspective) can be used to elucidate an evidence base that informs method. Publishing the sort of meaningless pap that passes for ‘performance data’ in the examples above serves no purpose except to amuse saddos like me whilst I wait for trains.
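
For anyone wondering what the control chart arithmetic looks like, here is a rough sketch of one common flavour, the XmR (individuals and moving range) chart. The monthly percentages are invented purely for illustration (they are not taken from the posters), and the function names are my own. It is a minimal example of the idea rather than a full implementation: it only checks for points outside the limits, not for runs or other signal rules.

```python
from statistics import mean


def xmr_limits(values):
    """Return (centre, lower, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two data points")
    centre = mean(values)
    # Moving range: absolute difference between each pair of consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is the usual XmR scaling constant (3 / 1.128).
    spread = 2.66 * mr_bar
    return centre, centre - spread, centre + spread


def signals(values):
    """Return (index, value) pairs that fall outside the process limits."""
    _, lower, upper = xmr_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical monthly punctuality percentages, made up for this example.
monthly_pct = [91.2, 89.8, 90.5, 92.1, 88.9, 90.7,
               91.5, 89.4, 90.9, 89.6, 91.0, 89.7]

centre, lower, upper = xmr_limits(monthly_pct)
print(f"centre={centre:.2f} lower={lower:.2f} upper={upper:.2f}")
print("signals:", signals(monthly_pct))
```

With these made-up figures, the final month is more than one percentage point down on the month before, yet it sits comfortably inside the limits. In other words, the ‘down arrow’ would be flagging nothing more than routine variation.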

More cases of bad performance measurement next time, as and when I find them on my travels…

By the way, if you see any similar horrors, please feel free to send me a message on Twitter @SimonJGuilfoyle and I’ll post some of the daftest or most outrageous examples.

About InspGuilfoyle

I am a serving Police Inspector and systems thinker. I am passionate about doing the right thing in policing. I dislike numerical targets and unnecessary bureaucracy.
This entry was posted in Systems thinking.

8 Responses to Bad Performance Measurement on Tour (#1)

  1. 90% trains on time? Good!
    1 in 10 trains late? Bad!

    Does the doctor tell you the op has a 90% survival rate, or does he tell you there’s a 1 in 10 chance you might not survive? Feeling lucky?

    Do these figures actually mean anything to the travelling public?

    Punctuality is important. But there are other things that are also important to passengers. Getting a seat. Getting there. Cleanliness. Prices. Connections to other routes.

    In order to hit these targets, we can expect the usual games. (Oh, and a load of people employed to count, check, and churn the data.)

    When does a train actually become late? Perhaps we should allow a 5 minute margin before we classify a train as being late (just to be sure it is really late). What if it’s late for reasons beyond the company’s control? What if the train is late on the first half of the journey, but makes it up on the second half? Is it late, half-late, or on time? OK if you’re going all the way, but not useful if you are travelling the first half only.

    Then there’s the pressure to hit the targets. Sending out a train that’s dirty, with toilets not working, no food service, etc. becomes preferable to sending it out late. That may be OK for a short commuter journey, but not OK for a long distance route.

    How do you calculate the 90%? Is it an average of all the train journeys across the whole network? Maybe some routes are 98% on time, all the time. But others are 70% on time, all the time. Rather than fix the 70%, I just need to make sure the 98% routes keep to time. Perhaps I can adjust some of the train routes, so that I can cut the number of late ‘routes’, and bump up the number of prompt ‘routes’. Perhaps delete some of those more difficult services. Bingo, a performance improvement just by counting things differently.

    This is the age of the train, as someone used to say on TV.

  2. yo mo says:

    Let’s have a stab at the 90%.

    It’s a real world process, ergo there will be variation.
    One way to fix it would be to build in slack, by running the trains at half of their capable speed. So if there was a delay at any stage it could be made up before the next way point. But that would double journey times so probably not going to be popular. [Lower speed would also reduce fuel consumption, wear and tear, and could increase system capacity, but that’s by the by.]

    Where was I? Ah, variation.
    Plot the arrival times as a Bell Curve [Proper chart that, not one of those namby, pamby, soft, southern Control efforts 😉 ] and you get a spread. Now if you (mis)apply a little bit of Pareto, what comes out is 10% early, 10% late and 80% within bounds. Early is not a problem, nobody complains. So if we lump together the early and the within bounds, Bingo! 90%.

    Is it a valid method? Probably not, but it does get passengers a refund, which is better than the old system: trains late, hard luck.

    Why do they measure it? The contract says so. Why publish it? The contract again. Though I doubt the contract says how precisely they should do so.

    Having been behind the scenes at a TOC for reasons much too dull to relate, they do keep a lot more data. Most of which is not for public consumption, especially the events that end in meaty chunks.

  3. Pingback: Living by numbers | BrownhillsBob's Brownhills Blog

  4. Pingback: British Policing: A Crime In Progress! « Dave's Bankside Babble

  5. NickQ says:

    Simon, what is your view on single crewing policies? I’m not opposed to them as such, I just doubt whether anyone has actually thought about them in other than purely “accounting” terms rather than as part of a system, let alone what the public actually want from us…

    • Hi Nick, I suppose it depends on the type and frequency of predictable demand. I see examples of double-crewed bobbies dealing with straightforward front office jobs together, which in my opinion is inefficient, but on the other hand I’m not convinced that single-crewing in a large rural force (for example) makes sense due to the obvious risks involved. From a systems point of view, although it might at first appear to be more economically viable to single crew, the reality is that if a single crewed officer makes an arrest, then a second car is needed to attend to provide an officer to allow safe transport of the prisoner, leaving one police car in situ. Assuming the car doesn’t get smashed up, then there will often be a need for a third car to take the second bobby back to his or her car, so if I’m honest it seems like a bit of a false economy (i.e. what might seem like a good financial option at first actually ends up costing more). The best way to tackle it in my opinion is to learn about predictable demand then afford Sergeants and Inspectors the latitude to deploy their resources appropriately in order to provide the best service to the public.

      • NickQ says:

        Thanks Simon, pretty much my own view, just wanted to check I’m not completely mad before I challenge people on this issue! Keep up the good work, there are lots of us who agree with you.

  6. Pingback: Back on target « Becoming Better
