Rankings, metrics, scores: numerical methods are now so widely used to judge and analyse different systems that we often forget there are alternatives. A number appears objective, and plugs neatly into algorithms to assess a whole range of human activities – and science is no different. Whether by h-index, m-index, citation count, or even a ResearchGate score, scientists are routinely gauged on their performance by numerical metrics. The relative merits and disadvantages of these scores have been widely described, but one relatively under-discussed aspect is the very basis on which they are calculated – which, for the most part, is the research papers the scientists have themselves written.
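To make concrete how mechanical these metrics are, here is a minimal sketch of the h-index calculation – the largest h such that h of a researcher's papers each have at least h citations. The citation counts used below are, of course, hypothetical.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts:
    the largest h such that at least h papers have >= h citations each.
    (Illustrative sketch only; `citations` is a hypothetical input.)"""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # this paper still clears the threshold
        else:
            break     # papers are sorted, so no later one can
    return h

print(h_index([10, 8, 5, 4, 3]))  # → 4
```

Note that the score depends only on the static, published record – exactly the property questioned in the rest of this essay.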
The professor and philosopher Marshall McLuhan famously argued that ‘the medium is the message’ – that the medium by which information is communicated shapes the information itself. The introduction of the telegraph, for example, didn’t just allow people to send messages over long distances – it allowed news to be reported immediately across continents, irrevocably changing the kind of news that was reported and, with it, the society that reported it. Technology is rarely neutral in its social effects. A simpler example: the electric lightbulb allowed people to see and work at night, giving society the chance to change its working hours.
Scientific journals and the papers within them have existed in some form since the mid-17th century. Many aspects have changed, including the important advent of peer review in the early 19th century, but in their simplest terms they remain the same: written text, published, and therefore unchanged after it goes into print. Rarely do we ask what this medium means for the research that is published and for the broader communication of science. In particular, if scientists are judged on what they have authored, what are the implications of this medium for our metrics?
A published paper is by its nature static, unchanging, and part of the historical record. Contrast a paper with a lecture or conference presentation: a public talk is seen once by an audience and, unless it is recorded, cannot be judged again or referred back to. A paper, on the other hand, can be consulted and cited from the moment it is published. These qualities are essential to modern science; we must have a record of prior work in order to justify the assumptions within novel studies.
However, once published, a research paper is left unaltered. This stands in contrast to science as a whole: no theory should go unquestioned, and new hypotheses should redress the issues with prior studies. Is it fair, then, to judge a scientist on older papers that may have been disproved – perhaps even by the researcher themselves? Given the tendency for older papers to be superseded, how are we to factor this in when assessing a researcher’s oeuvre? If a journalist pens a series of articles on an event that is still unfolding, would it be fair to assess them on pieces published before all the facts emerged? The parallel to science is clear, with the important caveat that scientific research is always evolving.
The tradition of published research predates Karl Popper, the philosopher of science, and I would argue that some aspects of the medium run contrary to the way he argued science should be conducted. Popper argued in the first half of the 20th century that for a statement to be scientific, it must be falsifiable. Providing definitive proof of a statement is not logically possible, owing to the problem of induction, and as such science should offer only falsifiable hypotheses that represent the best current understanding of a problem.
Other thinkers have built on this notion; I would particularly note Imre Lakatos’ contribution. In his paper ‘Falsification and the Methodology of Scientific Research Programmes’ he suggests that
“Intellectual honesty does not consist in trying to entrench, or establish one’s position by proving (or ‘probabilifying’) it – intellectual honesty consists rather in specifying precisely the conditions under which one is willing to give up one’s position.”
If one is judged by the research one has published – and in particular by the number of citations that work receives – one is hardly incentivised to state the precise conditions under which one would be prepared to admit being wrong. In fact, it encourages the opposite behaviour, since a more entrenched idea will likely persist longer and accumulate more citations. The essence of science (at least since logical positivism was largely discredited over the last 100 years) is to embrace being wrong in the search for a deeper understanding of the world at large, but this is certainly not mirrored in our publication model.
So how can we address this? Evolving scientific understanding could benefit from evolving accounts of science, moderated and curated by researchers. The internet provides a platform for continually updating our understanding, and in a fundamentally collaborative way. Wikipedia is a clear example of just such a platform, where the state of the art can be continually adjusted and revised. A hypothetical ‘unified compendium of knowledge’ could operate and evolve much like the code bases of software systems: changing in response to new discoveries, with archived versions showing the evolution of ideas.
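As one illustration of what ‘evolving like a code base’ might look like, here is a minimal sketch of a versioned compendium entry, in the Lakatosian spirit of this essay: each revision records both a claim and the conditions under which it would be given up, and older revisions remain archived. All names and structures here are hypothetical, not a real system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Revision:
    """One archived version of an entry: the claim as stated at some
    point in time, plus the conditions under which it would be abandoned."""
    text: str
    falsification_conditions: str
    timestamp: datetime

@dataclass
class Entry:
    """A hypothetical compendium entry whose full history is preserved,
    much like a file in a version-controlled code base."""
    topic: str
    revisions: list = field(default_factory=list)

    def revise(self, text, falsification_conditions):
        # A revision supersedes the current claim but never erases it.
        self.revisions.append(
            Revision(text, falsification_conditions,
                     datetime.now(timezone.utc)))

    def current(self):
        return self.revisions[-1]

    def history(self):
        # Archived versions show the evolution of the idea.
        return list(self.revisions)

entry = Entry("Planetary orbits")
entry.revise("Planets move in circular orbits.",
             "Observed positions deviating from circular paths.")
entry.revise("Planets move in elliptical orbits.",
             "Residual deviations unexplained by elliptical motion.")
print(entry.current().text)   # the latest claim
print(len(entry.history()))   # → 2
```

The design choice worth noting is that falsification conditions are a required field: a claim cannot enter the compendium without stating how it could be overturned.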
“But wait,” I hear many scientists interject, “what about peer review? How can we trust content that isn’t reviewed?” In response, I would again turn to the philosophers of science. Why should a one-time peer review guarantee the long-term validity of a study? Contrary evidence could arise at any later point (this, indeed, is the problem of induction), and I would argue that instead of relying on a one-time review, we should consider all work critically at all points, whether before or after peer review. This attitude would lend itself naturally to a perpetually updated repository of knowledge.
One can imagine, of course, that this kind of project could rapidly stagnate: if researchers disagree, they could demonstrate the false nature of each other’s ideas without significantly contributing to the knowledge base. Here, Lakatos can offer some guidance. He suggests we should only consider a theory falsified if an alternative theory is provided that explains the existing observations and “predicts novel facts” – i.e., improves upon the prior theory.
This centralised model would (in my eyes at least) increase collaborative work, and since new theories would have to explain the existing observations, there is a built-in mechanism to encourage testing the reproducibility of findings. Continual review and improvement would also be inherent to the model. Open access to data and methods would be necessary for this kind of model, and authors and contributors would need to be trained to state the conditions under which their findings would be falsified.
Metrics to gauge individual contributions to this kind of project would not depend on how fast a given field evolves, but they would likely look significantly different to those we currently work with. However, the volume of data that would be generated would provide ample opportunity to assess researchers in a fundamentally different way.
It remains to be seen whether such a model is even achievable. Science changes slowly, and the objectives that funding agencies pursue are not necessarily aligned with it; universities may not appreciate researchers sharing insight with competing institutions, much as commercial entities guard against corporate espionage. But if we genuinely value the advancement of science over local politicking, then these kinds of concerns should not prevent a shift.