Top of the Table

When the Chair of the House of Commons Education Committee asked Michael Gove (the then Secretary of State for Education) about comparative performance measurement between schools, this happened:

Chair: If “good” requires pupil performance to exceed the national average, and if all schools must be good, how is this mathematically possible?

Michael Gove: By getting better all the time.

(Full transcript here)

Now, sniggers to one side, there are a few important points here. The first is that I don’t disagree with striving to get better all the time; nor do I think performance shouldn’t be measured. I also believe it can be useful to understand apparent differences in comparative peer performance.

So, what’s the problem?

Well, it’s the way it’s so often done – league tables.

Here’s an example using police forces, although you could replace them with schools, hospitals or other institutions, if you like.

[Image: Stick Child, top of the table 1]

League tables are over-simplified, misleading, fundamentally illegitimate charlatans of the performance world; they purport to convey information about comparative peer performance, when in fact they are little more than mirages. They lie to you. They tell you stuff that isn’t there. They set you off on thought processes and assumptions that are utterly unwarranted. (A bit like slightly more elaborate binary comparisons. Ugh!) But the most dangerous thing about them is that they appear so plausible.

A notable problem with league tables is that they are routinely methodologically unsound and notoriously unstable. (This is particularly true of league tables constructed from complex public sector data). Due to statistical considerations I won’t inflict on you here, it is often mathematically impossible to neatly rank institutions in the tidy fashion we are so used to (i.e. one at the top, one at the bottom, and the remainder nicely stacked in between, from best to worst). You see, in league table world, about half of those ranked end up as ‘below average’, and someone is always bottom. So not everyone can be above the national average! Why not? Because it’s an average.

What we should be doing is trying to establish if there are significant differences between peers, and this can be done very simply in a couple of ways, as demonstrated by Stick Child…

[Image: Stick Child, top of the table 2]

In this first example, the six police forces we saw earlier are assessed against each other, taking into account confidence intervals in the data. (Don’t worry if you’re unfamiliar with the term; just trust me that it’s important.) As you can see, this tells us that two forces are performing significantly differently to the other four (i.e. the confidence intervals of the two groups don’t overlap). We can’t, however, neatly rank them from ‘best’ to ‘worst’, because we can’t separate the ‘top’ two from each other, and we can’t separate the other four from each other.
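To make that concrete, here is a minimal sketch of the overlap check in Python. It is not the method behind the chart itself, and the force labels, estimates and 95% confidence intervals are invented purely for illustration:

    # Hypothetical figures: each force has a point estimate plus a 95%
    # confidence interval (estimate, lower bound, upper bound). In reality
    # the intervals would be calculated from the underlying data.
    forces = {
        "Force A": (62.0, 58.5, 65.5),
        "Force B": (61.2, 57.8, 64.6),
        "Force C": (48.9, 45.6, 52.2),
        "Force D": (50.3, 47.0, 53.6),
        "Force E": (49.5, 46.1, 52.9),
        "Force F": (51.0, 47.7, 54.3),
    }

    def overlaps(a, b):
        # Two intervals overlap unless one sits entirely above the other.
        return a[1] <= b[2] and b[1] <= a[2]

    names = list(forces)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if overlaps(forces[x], forces[y]):
                print(f"{x} vs {y}: intervals overlap - no defensible rank order")
            else:
                print(f"{x} vs {y}: no overlap - evidence of a real difference")

With those made-up figures, Forces A and B overlap with each other but not with the other four, and the other four all overlap with one another: two distinguishable groups, but no legitimate ordering within either group.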

Here’s another way of understanding comparative peer performance in a more contextualised manner:

[Image: Stick Child, top of the table 3]

This time we can observe that the six police forces are all within the boundaries of ‘normality’ (by applying Statistical Process Control methodology). If any of them were outside the dashed lines we might be concerned that that particular force was significantly different from its peers; however, in this case, all six forces are clustered around the mean average (solid horizontal line) and within the range of anticipated performance for the group.
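Again purely as an illustrative sketch: the rates below are invented, and for simplicity the control limits here are just the group mean plus or minus three standard deviations, whereas a proper control chart would derive its limits by the method appropriate to the data (e.g. moving ranges or binomial limits).

    import statistics

    # Hypothetical rates for the six forces - invented for illustration only.
    rates = {
        "Force A": 62.0, "Force B": 61.2, "Force C": 48.9,
        "Force D": 50.3, "Force E": 49.5, "Force F": 51.0,
    }

    mean = statistics.mean(rates.values())
    sd = statistics.stdev(rates.values())
    lower, upper = mean - 3 * sd, mean + 3 * sd  # the 'dashed lines'

    print(f"Mean {mean:.1f}; expected range {lower:.1f} to {upper:.1f}")
    for force, rate in sorted(rates.items()):
        status = ("within expected variation"
                  if lower <= rate <= upper
                  else "signal - worth investigating")
        print(f"{force}: {rate:.1f} ({status})")

With these figures all six forces sit inside the limits, which is exactly the ‘nothing to see here’ picture described above.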

Therefore, there is absolutely no way the forces should be placed in ranked order – they are likely to move positions each time a snapshot is taken because of normal variation, but as long as they stay within the lines (and ideally, improve as a group), it is wrong to judge performance based on apparent position.

You see, when this happens, we encounter the other big problem associated with the league table mindset – concern about someone’s position in a league table leads to unfair assumptions about performance, unnecessary ‘remedial’ activity to address the perceived deficiencies, pressure from management, sanctions, and so on. And all based on something that essentially isn’t there. Cue gaming and dysfunctional behaviour! Like clockwork.

And a final thought – if league tables are constructed using crime data, are we even measuring the right thing? See this.

About InspGuilfoyle

I am a serving Police Inspector and systems thinker. I am passionate about doing the right thing in policing. I dislike numerical targets and unnecessary bureaucracy.

3 Responses to Top of the Table

  1. trumpetmajor says:

    Here’s how we use league tables as a force for good in my partnership:

    We don’t do anything as silly as measuring our place against other boroughs/forces. That would be silly.

    What we DO do, though, is concoct Top Ten league tables of problematic premises – be they licensed premises, shops, businesses (e.g. betting shops, takeaways) etc. We rank them by the amount of demand they cause us, on a range of indicators, depending on what is causing us concern at the moment.

    We can then use this league table in a number of ways. Firstly, it helps us target resources – essentially, hotspotting. Secondly, it means we can explain to interested parties why we’re targeting resources at those places and not others – it’s an easy-to-understand format. Thirdly, we can use it with the premises themselves. Sometimes just seeing where they are in the league table can focus their minds on changing their approach; and if they don’t want to help us, then a league table of, say, the ten most violent pubs in town makes a very media-friendly press release 🙂

    • That makes sense to me (it’s a different concept to ranking work units according to performance), although I’d say it would be even better to Pareto your demand, as that would give a clearer picture (i.e. your 7th, 8th, 9th and 10th ‘problematic’ premises might only account for 3% of demand between them, so the mere fact that they feature in the ‘Top Ten’ might not warrant any additional focus). Hope this makes sense – there’s a rough sketch of what I mean below.
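      Purely to illustrate the Pareto point (the premises and call counts below are invented), a few lines of Python along these lines would show how quickly the cumulative share of demand concentrates at the top of the list:

          demand = {  # hypothetical calls for service per premises
              "Pub A": 120, "Club B": 95, "Takeaway C": 60, "Shop D": 40,
              "Pub E": 25, "Betting shop F": 12, "Pub G": 5, "Shop H": 4,
              "Takeaway I": 3, "Club J": 2,
          }

          total = sum(demand.values())
          running = 0
          for rank, (premises, calls) in enumerate(
                  sorted(demand.items(), key=lambda kv: kv[1], reverse=True), 1):
              running += calls
              print(f"{rank:2d}. {premises:16s} {calls:4d} calls  "
                    f"{100 * running / total:5.1f}% of demand so far")

      With those made-up numbers the last four entries in the ‘Top Ten’ account for under 4% of the demand between them, so a cumulative (Pareto) view, rather than an arbitrary cut-off of ten, shows where the effort is actually worth spending.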

  2. Cop says:

    Hampshire is currently ‘bottom’ for crime data quality. This has led to a car crash of common sense, with the law of unintended consequences looming large. Officers are trawling forms for un-crimed offences, and recently another 15 front-line officers have been brought in to audit their diminished colleagues. Fear of the Daily Mail league table has taken the eye off the ball of dealing with criminals. The cost of 3 Inspectors and a growing army of stats minions versus the damage caused to service delivery is heart-breaking. Moral cowardice, and a lack of senior leadership willing to say ‘so we lost all our crime counting staff in the cuts and this is a consequence.’
