Determinism in research code
If you don't feel comfortable defining what science is, please have a read of my post "what is science", as this article builds upon those concepts.
The last few years have been a dark time for people interested in the pursuit of truth. Politics and economics are downstream from culture, and the culture in many places has shifted away from that pursuit. Corrupt money along with bad incentive structures has made it much more economically profitable to be intellectually dishonest than it was a few decades ago. There are a lot of reasons why this is the case, many of which are a direct result of the completely terrible incentive structures that exist in modern academia. Following the money and power structures goes a long way to understanding what's going on.
Culture is very important, and the culture around academia has most certainly shifted in the last few decades. One aspect of this that has had an enormous impact on the world is the fall in academic standards that has occurred in many institutions around the world. This isn't a local problem; I think it is the result of a variety of both intellectual and economic trends.
Many people say they believe in "science" without knowing what science actually is. Around the pandemic era it even became a trendy throwaway phrase to say you believe in "the science". This was, and is, often a performative statement rather than one that represents an actual belief in the scientific method. Part of why these statements were often a strong signal of ignorance is that science is about maintaining a careful balance of skepticism and belief. Proclaiming by fiat that whatever other people tell you is the truth is strong evidence of a mindset that lacks the intellectual curiosity required to actually do science. Basically, a lot of people will publicly say they believe in the institutions of science but will reveal their true preferences via their actions, or when they meet resistance on the actual philosophy of science. Most of the references to science I've heard in the last few years have been deeply twisted. In many of these conversations "the science" was a proxy for an application of political power, something that any good science must be impartial to.
The personal cost to people who wish to uphold academic principles has unfortunately increased in the last few years. We have seen multiple cases where universities have not backed their tenured staff in matters of academic integrity. We have also seen many universities succumb to pressure from activists whose positions are not based in reality. In some cases this has been so bad as to have angry mobs, quite literally, demanding the resignation of tenured staff based on the results of their research, and universities folding to that pressure. We have also seen academic staff harassed through false accusations, and on some topics the universities have not supported proper process, let alone the staff who work in their departments.
While there are some systemic reasons the sciences have perhaps fared better than other areas in this broad assault on the very idea of truth, they are still facing pressures to turn away from academic integrity. The sciences have not been immune to the cultural shifts and we have seen an enormous replication crisis emerge. There has not been a corresponding uptrend in retracted papers, though we are starting to see sites like Retraction Watch emerge. Worse still, in some institutions this lack of rigor has become normalized: as long as the grant money and research funding keeps rolling in, many departments are turning a blind eye to poorly conducted or unethical research. There are a lot of economic and realpolitik reasons why this continues.
One of the most terrible policy mistakes, one that we will be paying for in the decades to come, is the practice of connecting certification and credentials to economic rewards and, in the worst cases, to immigration status. When I was at Melbourne University I saw the beginnings of this dual track system. There was one group of people who were primarily interested in the subject matter, and another that was really only there because a university degree was an important hurdle to get past. As time has gone on more jobs have started to require degrees as a condition of employment, and this has compelled many people who would never otherwise have gone to university to do so. Unfortunately requiring people to have degrees has increased wealth inequality quite considerably. But the less obvious thing, at least initially, was the strongly corrosive impact that this had on academic integrity in the tertiary education sector. When you have a pay-to-play system where students are now referred to as "customers" it is much harder to hold your ground on traditional core academic standards.
If you'll literally get kicked out of the country if you don't get some papers published, there's a very strong incentive to do whatever it takes to get those papers published. If you are a staff member there are very strong incentives to not complain about the foreign student gravy train, even if in specific cases there are real academic integrity issues with a student. It shouldn't come as any surprise that there's a very strong incentive here to either outright cheat or to cut back on the quality of the research, especially if you know you won't be held accountable. After all, you don't get rewarded for doing higher quality research compared to pumping out more output. Perhaps one of the largest conflicts this is creating is that people are getting prestige and funding based on how much they publish rather than the quality of what they publish.
The power of the scientific method is built on top of the mistrust of experts. Experts are fallible and can get things wrong, potentially for a very long time. Quantum physics is a great example of this: a lot of very intelligent experts didn't understand how it worked for a very long time. Progress in the field required extending knowledge past what the collective of experts knew at the time; such is the nature of scientific process [1]. The idea is that you can verify the truth of claims for yourself, and as such an extremely important part of science is reproducibility. If an idea requires the trust of experts then the idea by definition lacks scientific rigor. Replication is an incredibly important part of any scientific research work. If the work can't be replicated it can't be falsified, and if work can't be falsified it is not science.
Replication of results that involve code
As computers have become more ubiquitous we have started to see computer programs become more prominent in research. With this comes the challenge of dealing with computing machinery, something that I think should be treated in similar ways to other lab equipment. Papers that depend on code are one of the more interesting examples of replication issues.
Unfortunately far too few papers that draw results from code include their source code. I think if you build results directly on top of code and don't supply that code you are demonstrating a lack of scientific integrity. I would go so far as to say that not including the code with the paper is intellectually corrupt. In my mind this is like running a physics experiment and leaving out the details of the lab equipment and setup. Those details are crucial because they enable reproducibility by others. I think the terrible situation that has developed with printed academic journals, and the toxic freeloading predatory organizations that run some of them, has contributed to this.
Fortunately many people do still have integrity, as shown by the Papers with Code project, which shows which papers do have code available. But they are currently working against strong economic incentives to cheat and deceive in the quest to pump out more research papers. There are many people who are subject to extreme pressure to publish more papers. The publish-or-perish situation is all too real and causes an enormous amount of stress to researchers with integrity. Unfortunately it's these economics that cause many of the issues, and these issues will remain as long as the incentive structures reward poor integrity as much as they currently do.
Unfortunately even if you do supply the code, reproducibility can still fail for a number of hard-to-deal-with reasons. One of the main ones is determinism of code that involves random number generation.
Random number generation on computers is a very interesting topic, and it has a surprisingly large impact on results, including in projects where none of the first-party source code does any random number generation itself.
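To make that last point concrete, here is a minimal sketch using only Python's standard library. The function names `third_party_shuffle` and `run_experiment` are hypothetical stand-ins invented for this illustration, not code from any real project. The "experiment" never generates a random number itself, yet its result still depends on the state of the shared global generator because a dependency draws from it.

```python
import random


def third_party_shuffle(items):
    """Hypothetical stand-in for library code that quietly uses the global RNG."""
    items = list(items)
    random.shuffle(items)  # draws from Python's shared global generator
    return items


def run_experiment(seed=None):
    """First-party code that generates no random numbers of its own."""
    if seed is not None:
        random.seed(seed)  # fix the state of the shared global generator
    data = list(range(10))
    order = third_party_shuffle(data)
    # The "result" depends on the order the library handed back.
    return sum(i * x for i, x in enumerate(order))


# Seeding before each run makes the result repeatable.
assert run_experiment(seed=1234) == run_experiment(seed=1234)

# Without a seed, repeated runs are free to disagree with each other.
unseeded = {run_experiment() for _ in range(20)}
print(len(unseeded), "distinct results from 20 unseeded runs")
```

Seeding the global generator is the bluntest possible fix. Passing an explicit `random.Random(seed)` instance around is usually cleaner, but that only works if the libraries you depend on accept one.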
When I was working on the Persephone project one of our core goals was to make it as easy as possible for people to reproduce results if they wanted to. This turned out to be a substantial pain in the ass for various reasons. I fully understand why people don't do this if they aren't required to, as there is a lot more effort involved. But I think if the code is widely used the effort is entirely worth it in terms of the extra value it creates for the field of study.
The biggest source of problems was that the libraries we used, in particular TensorFlow, did not provide deterministic results in the versions available at the time. It's getting better as time goes on, but it is still possible to execute the exact same source code with the exact same random seed on two different computers running the exact same versions of libraries with the exact same Python version and get different results. This is of course very annoying, and we made a best-effort attempt to deal with it. But I think the fact we went to some effort was very important in terms of scientific integrity.
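One commonly cited reason that identical seeds can still give different numbers is that floating-point addition is not associative, so any code that sums the same values in a different order (a parallel reduction, for example) can round differently. The sketch below shows this effect in plain Python. It is an illustration of the general mechanism only, not a claim about which TensorFlow operations caused trouble in Persephone, and `reduce_left_to_right` is a hypothetical helper written for this example.

```python
import math


def reduce_left_to_right(values):
    """Accumulate in the given order, as a simple sequential loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same four numbers, summed in two different orders.
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1e16, -1e16, 1.0, 1.0]

# In order_a the first 1.0 is lost to rounding when added to 1e16.
print(reduce_left_to_right(order_a))  # 1.0
print(reduce_left_to_right(order_b))  # 2.0

# math.fsum tracks the lost low-order bits and is order-independent here.
print(math.fsum(order_a), math.fsum(order_b))  # 2.0 2.0
```

A difference like this looks tiny in isolation, but in iterative training it feeds back into every subsequent step, which is how two runs with the same seed can drift apart.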
Now if something really mattered in terms of exact replication this would be a major concern, a concern that would be amplified greatly by other research that builds on previous results. This is why I'm glad to see things like this project to track determinism issues in TensorFlow.
But what about the researchers who have non-deterministic code? Is it politically feasible for them to actually go and fix it? What happens if bugs in their code fundamentally change the result of their research? Are they able to take the time to build more deterministic code in the first place?
When I ask these questions of people I often get responses along the lines of "we would love to do this but we don't have enough time to do it". We need to fix the incentive structures of research jobs so that people are not punished for doing quality research.
-
[1] In practice scientific process is a messy affair because of the incentive structures of academia and the psychology of people. To build your reputation on a particular theory means you are staking a lot of time and effort on an idea. If this idea turns out to be incorrect it's very hard for people to pivot away. Psychologically this should be fairly obvious: it's hard to admit to yourself that your life's work is wrong, but this does happen. It's also hard to step back from an idea if you have structural incentives that reward you for not stepping back. This often happens when people have prestigious research labs or staff under them, where to reject the idea is to lose this prestige and this staff. Thomas Kuhn wrote extensively about this and I'd recommend reading more of his work if you are curious as to how scientific paradigms have changed over time, including the real practical reasons why there are generational dynamics in the sciences and other fields.