White House Office of Science and Technology Policy Opens Inquiry into Open Science

The meaning of “open science” is in the eye of the beholder. In other words, everyone favors more openness in science, but there are many views on what “openness” entails and how scholarly and scientific publishing should get there. For some, it is about publications. For others, it is about data. For others still, it is about peer review. And for yet others, it is about the software code used to conduct the research and develop the data. For many, it is about all of these. What it means to the US Government, however, is currently up for grabs.

Wednesday, May 6, was the due date for comments under the Office of Science and Technology Policy’s (OSTP) “Request for Information: Public Access to Peer-Reviewed Scholarly Publications, Data and Code Resulting from Federally Funded Research”, 85 Fed. Reg. 9488 (2/19/20). The OSTP, which sets science and technology policy for the Executive Branch in the United States, has been struggling with various constituencies since at least December 2019, when it came very close to removing its current twelve-month embargo on mandatory deposit and public accessibility of post–peer-reviewed scientific articles based on research funded by the US Government. It appears to have acted without consulting many of the key stakeholder groups, including affected learned societies, publishers and, apparently, Congress. After pausing in December, the OSTP has held several invitation-only, “no notes allowed” confidential meetings, reporting these out through its Twitter feed. One result of this unusual process is that the only opportunity for those not selected (including CCC) to engage on these issues is the above-referenced RFI.

Process questions aside, in the RFI the OSTP asks excellent questions about how the government can advance open science and, even more importantly in the days of COVID-19, science generally. Without taking up the space to repeat the questions here, the government essentially asks what it can (or ought to) do to promote open science, better communication of scientific results, and higher-quality science overall, looking specifically at publications, research data and code.

There are many ways that the US Government can help bring about improvement in the openness and accessibility of good science. For example, open data would help alleviate what is termed the replication crisis in science. As we know, humans bring bias into research. Machine learning and machine interpretation of data are subject to even greater risks of bias, both because flawed humans write the original software and select the input, and because algorithms may further amplify the bias. For science to be reliable, data must be available so that others can review it and identify errors in interpretation. Where code is involved in the research, others should be able to review the code, instructions and input criteria to search for error and bias.

As to data, let’s face it: data is messy. (‘Data are messy’? Never mind that right now.) For users to be able to trust and use open data, there needs to be some “dataset check-up” or other verification mechanism available so that they can be sure the data has not been manipulated. Ensuring quality in this way is a function that publishers provide with respect to final “versions of record” (VoRs) for articles. Extending it to data will almost certainly involve a combination of government mandates and financial and structural support. Governments acting alone, or through public/private partnerships, could ensure similar levels of quality for data.

Another huge win for more open science would result from the US Government providing meaningful incentives to encourage publication of confirmatory, negative and null results from US-funded research, along with the underlying data and code upon which such results are based. In times of pandemic, this necessary market intervention could prevent researchers from wasting time on failed theories. The publishing models exist for dissemination, especially gold road open access, but the researchers themselves need to be rewarded—not punished—for discovering and then acknowledging what does not work.

Finally, if the goal is immediate open access to scientific publications, there is a direct path to its achievement, should the US choose it: the federal government could fund the private investment made in articles by publishers through the payment of article publication charges. Given history and current budget priorities, this may not happen, but it stands as a logical extension of “open it all, and all at once” thinking.

Publishers who maintain subscription and mixed subscription/open access models have generally opposed the elimination of the currently applicable 12-month embargo, while fully open access publishers seem to support the elimination. I suspect that the fully open access publishers hope that “no-embargo” mandates will accelerate the move to full open access, which is a worthwhile goal for those disciplines and publishers that can make the switch. Moreover, even a significantly reduced embargo period, short of outright elimination, could well accelerate the switch to gold open access.

On the other hand, as the recent history of the newspaper industry indicates, uncompensated reuse of high-quality content online sometimes leads to unintended, and very negative, consequences for the continued production of such content. Now – perhaps especially now – we are justified in urging caution regarding the unintended consequences of mandated changes to the process of scientific dissemination. As Senator Thom Tillis (R-NC), the chair of the Senate Subcommittee on Intellectual Property, wrote in his response to the OSTP’s RFI, rescission of the current one-year embargo period (presumably if not accompanied by funding or a new dissemination model) “…would be harmful to American intellectual property interests and, as a result, would jeopardize the American exports, jobs, research, and innovation that our intellectual property system supports.”