The REF’s star system leaves a black hole in fairness

I smiled and suppressed a giggle.

Other members of the panel were less amused. After all, the rating and ranking of academics’ outputs is serious stuff. Careers – indeed, the viability of entire departments, schools, institutes and universities – depend critically on the judgements made by peers on the REF panels.

Not only do the ratings directly influence the intangible benefits arising from the prestige of a high REF ranking, they also translate into cold, hard cash. An analysis by the University of Sheffield suggests that in my subject area, physics, the average annual value of a 3* paper for REF 2021 is likely to be roughly £4,300, whereas that of a 4* paper is £17,100. In other words, the formula for allocating “quality-related” research funding is such that a paper deemed 4* is worth four times one judged to be 3*; as for 2* (“internationally recognised”) or 1* (“nationally recognised”) papers, they are literally worthless.

We might have hoped that before divvying up more than £1 billion of public funds a year, the objectivity, reliability and robustness of the ranking process would be established beyond question. But, without wanting to cast any aspersions on the integrity of REF panels, I’ve got to admit that, from where I was sitting, Professor Aspire’s tongue-in-cheek answer regarding the difference between 3* and 4* papers seemed about as good as any – apart from, perhaps, “I don’t know”.

The solution certainly isn’t to reach for simplistic bibliometric numerology such as impact factors or SNIP indicators; anyone making that suggestion is not displaying even the level of critical thinking we expect of our undergraduates. But every academic also knows, deep in their studious soul, that peer review is far from wholly objective. Nevertheless, university senior managers – many of them practising or former academics themselves – are often all too willing, as part of their REF preparations, to credulously accept internal assessors’ star ratings at face value, with sometimes worrying consequences for the researcher in question (especially if the verdict is 2* or less).

Fortunately, my institution, the University of Nottingham, is a little more enlightened – last year it had the good sense to check the consistency of the internal verdicts on potential REF 2021 submissions via the use of independent reviewers for each paper. The results were sobering. Across seven scientific units of assessment, the level of full agreement between reviewers varied from 50 per cent to 75 per cent. In other words, in the worst cases, reviewers agreed on the star rating for no more than half of the papers they reviewed.

Granted, the vast majority of the disagreement was at the one-star level; very few pairs of reviewers were “out” by two stars, and none disagreed by more. But this is cold comfort. The REF’s credibility is based on an assumption that reviewers can quantitatively assess the quality of a paper with a precision better than one star. As our exercise shows, the effective error bar is actually ±1*.
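To make “full agreement” and “out by one star” concrete, here is a minimal Python sketch of how such a consistency check might be tallied. The function name `agreement_summary` and the eight rating pairs are invented purely for illustration; they are not the Nottingham data, and the actual analysis may have been done quite differently.

```python
from collections import Counter

def agreement_summary(pairs):
    """Return the share of papers at each absolute star gap between two reviewers."""
    gaps = Counter(abs(a - b) for a, b in pairs)
    total = len(pairs)
    return {gap: count / total for gap, count in sorted(gaps.items())}

# Hypothetical star ratings (reviewer A, reviewer B) for eight papers.
example_pairs = [(4, 4), (3, 4), (3, 3), (2, 3), (4, 3), (3, 3), (2, 4), (4, 4)]

summary = agreement_summary(example_pairs)
for gap, share in summary.items():
    print(f"gap of {gap} star(s): {share:.1%} of papers")
```

In this made-up sample, a gap of 0 is full agreement; the reviewers agree on half of the papers, which would sit at the bottom of the 50 to 75 per cent range reported above.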

That would be worrying enough if there were a linear scaling of financial reward. But the problem is exacerbated dramatically by both the 4x multiplier for 4* papers and the total lack of financial reward for anything deemed to be below 3*.
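A rough Python sketch shows how a one-star error bar interacts with this non-linear reward. It uses the approximate per-paper annual values for physics quoted earlier (£17,100 for 4*, £4,300 for 3*, nothing below that); the function `value_range` is a hypothetical helper, and this is an illustration under those assumptions rather than the actual funding formula.

```python
# Approximate annual values per paper quoted in the article (physics, REF 2021).
ANNUAL_VALUE_GBP = {4: 17_100, 3: 4_300, 2: 0, 1: 0}

def value_range(true_stars, error=1):
    """Return (lowest, nominal, highest) annual value if the rating is off by up to `error` stars."""
    candidates = [
        stars
        for stars in range(true_stars - error, true_stars + error + 1)
        if stars in ANNUAL_VALUE_GBP  # ignore ratings outside the 1* to 4* scale
    ]
    values = [ANNUAL_VALUE_GBP[stars] for stars in candidates]
    return min(values), ANNUAL_VALUE_GBP[true_stars], max(values)

for stars in (4, 3, 2):
    low, nominal, high = value_range(stars)
    print(f"{stars}* paper: nominal £{nominal:,}, plausible range £{low:,} to £{high:,}")
```

On these figures, a paper whose “true” rating is 3* could be worth anything from nothing to £17,100 a year, depending on which way a single reviewer’s judgement falls.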

The Nottingham analysis also examined the extent to which reviewers’ ratings agreed with authors’ self-scoring (let’s leave aside any disagreement between co-authors on that). The level of full agreement here was similarly patchy, varying between 47 per cent and 71 per cent. Unsurprisingly, there was an overall tendency for authors to “overscore” their papers, although underscoring was also common.

Some argue that what’s important is the aggregate REF score for a department, rather than the ratings of individual papers, because, according to the , any wayward ratings will “wash out” at the macro level. I disagree entirely. Individual academics across the UK continue to be coaxed and cajoled into producing 4* papers; there are even dedicated funding schemes to help them do so. And the repercussions arising from failure can be severe.

It is vital in any game of consequence that participants be able to agree when a goal has been scored or a boundary hit. Yet, in the case of research quality, there are far too many cases in which we just can’t. So the question must be asked: why are we still playing?

Reader's comments (4)

#1 Submitted by rejs71 on June 27, 2019 - 2:33pm

The answer would appear to be more independent peer-reviewing: assessing work with an external eye and no skin in the game. Obviously this has implications for time and resource management, though if you want a level playing field beyond reproach, what other alternative do you have?

#2 Submitted by David Parker2 on June 27, 2019 - 4:36pm

REF should be abolished and a different way of funding university research should be found. There is no evidence that it has improved the quality of research or generated more research with an enduring impact. The ancient method of distribution of funds by a University Grants Committee was certainly no worse and would liberate the universities from a wasteful bureaucracy and scholars from a philistine exercise in counting.

#3 Submitted by ke.chen... on June 28, 2019 - 10:01am

In my opinion, a fairer way is to have more independent experts from other nations in panels.

#4 Submitted by skiing on August 14, 2019 - 7:53am

"The solution certainly isn’t to reach for simplistic bibliometric numerology such as impact factors or SNIP indicators;" Why not? It would be no less flawed than the current system and free up around £0.25 billion.