Those of you outside the UK might be unfamiliar with REF, the Research Excellence Framework.
Lucky you.
This is the periodic evaluation of how good British-based academics are at research. With all the obvious issues about what that might mean and how you measure it.
It’s not as arrantly stupid as TEF, its newer teaching equivalent, since outputs – articles, books, etc. – actually get read by people working in the field, and there’s acknowledgement of the various aspects of research and dissemination (‘impact’) involved.
The most recent cycle – REF2021 – just reported its results last week, to a fanfare of, well, of not much, other than many, many tweets by over-excited marketing units. No media coverage, no government statement, nothing to reflect on either the significant effort that went into making it happen, or the breadth and depth of high-quality research that it found.
(Oh, and some institutions have got in early, using REF as a moment for restructuring – doubtless to be followed by others – so I can only ask you to sign the petition to help the excellent colleagues at DMU.)
But enough prelude: let’s look at the results for Politics and IR. The Times Higher have a more accessible datasheet here, with Strathclyde and Royal Holloway both jumping a long way up the rankings from 2014 to top the list on a GPA of the three main elements: outputs, impact and environment (funding, doctoral students and a descriptive statement of activities). LSE round out the top three.
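For the curious, here’s a minimal sketch in Python of how that headline GPA is assembled, assuming the REF2021 element weightings of 60% outputs, 25% impact and 15% environment, and using entirely made-up quality profiles:

```python
# Hypothetical sketch of the GPA arithmetic, with made-up quality profiles.
def profile_gpa(pct_4, pct_3, pct_2, pct_1):
    """GPA of one quality profile (percentages at 4*, 3*, 2* and 1*)."""
    return (4 * pct_4 + 3 * pct_3 + 2 * pct_2 + 1 * pct_1) / 100

def overall_gpa(outputs, impact, environment, weights=(0.60, 0.25, 0.15)):
    """Weighted average of the three element GPAs (REF2021 weightings assumed)."""
    w_out, w_imp, w_env = weights
    return w_out * outputs + w_imp * impact + w_env * environment

# Entirely illustrative profiles for one unit:
out_gpa = profile_gpa(45, 40, 13, 2)   # outputs
imp_gpa = profile_gpa(50, 38, 12, 0)   # impact
env_gpa = profile_gpa(38, 50, 12, 0)   # environment
print(round(overall_gpa(out_gpa, imp_gpa, env_gpa), 2))   # 3.3 with these numbers
```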
Several points are notable at this early stage.
Firstly, there has been a strong upward trend in overall GPAs (out of 4). That might sound unsurprising, given the general optimising that everyone does in such situations, but there has been a big change in methodology from last time.
In 2014, universities could choose whom to submit, and so often cherry-picked staff to bump up GPAs: the calculation was that the central government money they’d get from all this (calculated by multiplying GPA by staff numbers returned, with a marked weighting on higher-evaluated results) would be higher this way, plus everyone loves to do well in a ranking list. However, not only was this rather exclusive, but some universities felt their true worth was being hidden by it, so there was a (successful) push to return all research-active staff this time.
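To make that calculation concrete, here’s a rough sketch of the logic – the multipliers (4* worth four times 3*, with 2* and below earning nothing) follow the recent English QR formula, but treat them and all the numbers here as illustrative rather than the actual funding model:

```python
# Hypothetical sketch: funding scales with the quality profile times the
# FTE returned, with much heavier weight on the top grades.
def funding_index(pct_4, pct_3, fte, weight_4=4.0, weight_3=1.0):
    """Volume-weighted quality index: 4* counts for far more than 3*."""
    quality = (weight_4 * pct_4 + weight_3 * pct_3) / 100
    return quality * fte

# The 2014 trade-off in miniature (made-up profiles): returning only your
# strongest 20 FTE versus returning all 30 FTE.
selective = funding_index(pct_4=40, pct_3=50, fte=20)   # high GPA, low volume
inclusive = funding_index(pct_4=20, pct_3=45, fte=30)   # lower GPA, more volume
print(selective, inclusive)   # 42.0 vs 37.5 with these numbers
```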
So you’d expect the inclusion of staff who’d not previously been returned might drag down GPAs.
Nope.

As you can see here, almost all those units returned in 2014 (51 out of 56) out-performed their 2014 scores this time around: only two dropped. The overall GPA went from 2.83 in 2014 to 3.10 this year.
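For anyone wanting to replicate that comparison, something like the following would do it – the file names and column names are hypothetical stand-ins for wherever you keep the two sets of results:

```python
# Hypothetical sketch: compare unit-level GPAs across the two exercises,
# assuming two CSVs each with "institution" and "overall_gpa" columns.
import pandas as pd

gpa_2014 = pd.read_csv("ref2014_politics.csv")
gpa_2021 = pd.read_csv("ref2021_politics.csv")

both = gpa_2014.merge(gpa_2021, on="institution", suffixes=("_2014", "_2021"))
both["change"] = both["overall_gpa_2021"] - both["overall_gpa_2014"]

print("improved:", (both["change"] > 0).sum())
print("dropped: ", (both["change"] < 0).sum())
print("mean GPA 2014:", round(both["overall_gpa_2014"].mean(), 2))
print("mean GPA 2021:", round(both["overall_gpa_2021"].mean(), 2))
```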
What that rise reflects is moot. It shows up in institutions’ overall performance too, so it’s certainly not just a Politics/IR thing.
The second notable outcome was that, as in 2014, results in this panel showed a stronger performance by larger units.

If we log-plot the number of staff (by Full Time Equivalent) against overall GPA, the tendency is clear: very few units below 20 FTE got over a 3.0 GPA, while similarly few units above that size fell below 3.0.
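Here’s a minimal sketch of that plot, assuming a hypothetical ref2021_politics.csv with per-unit "fte" and "overall_gpa" columns:

```python
# Hypothetical sketch: scatter of unit size against overall GPA,
# with staff numbers on a log scale.
import pandas as pd
import matplotlib.pyplot as plt

units = pd.read_csv("ref2021_politics.csv")

fig, ax = plt.subplots()
ax.scatter(units["fte"], units["overall_gpa"])
ax.set_xscale("log")                             # log-plot staff numbers
ax.axvline(20, linestyle="--", color="grey")     # the ~20 FTE threshold
ax.axhline(3.0, linestyle="--", color="grey")    # the 3.0 GPA line
ax.set_xlabel("Staff returned (FTE, log scale)")
ax.set_ylabel("Overall GPA")
plt.show()
```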
The reasons for this aren’t so clear, especially as it’s not a pattern found in all other panels. Moreover, the size effect is found in all three elements of the assessment. This might be more understandable in impact and environment, where size allows for more capacity to target these activities than is possible in a smaller unit where the division of labour is a theory rather than a practical possibility. But it’s also found in outputs, which seems more odd.
Outputs are read individually and nominally without reference to the ‘quality’ of the outlet they are published in. In addition, units had to return an average of 2.5 outputs per FTE over the seven-year assessment period – a bit less than the equivalent of one article every other year – so you might have expected the effects of time-scarcity in smaller units to be attenuated. However, you can still see the correlation in the graphs below.
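As a quick check that the size effect runs through each element rather than just the overall score, something like this would do – assuming the same hypothetical CSV also carries per-element GPAs:

```python
# Hypothetical sketch: correlation of log unit size with each element's GPA,
# assuming columns "outputs_gpa", "impact_gpa" and "environment_gpa".
import numpy as np
import pandas as pd

units = pd.read_csv("ref2021_politics.csv")

for element in ["outputs_gpa", "impact_gpa", "environment_gpa"]:
    r = np.corrcoef(np.log(units["fte"]), units[element])[0, 1]
    print(f"{element}: correlation with log(FTE) = {r:.2f}")
```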



It’s still early days in the post-result process and doubtless I’ll be coming back to this again, but comments are always welcome, not least as the nature of the next REF exercise is still open for discussion, so insights now might be of use further down the line.
I don’t know who originally said this, but: when a measure becomes a target, it ceases to be a good measure.
I looked at the “Key Facts” infographic at https://ref.ac.uk/. The percentages on the quality of submissions reminded me of Lake Wobegon, a place where all the children are above average.