State of Open Data 2022
The State of Open Data 2022 survey was conducted between May and July 2022 and received over 6,000 responses from researchers across different regions, fields of interest, and career stages.
Most trends in the adoption and acceptance of open data are encouraging. A majority of the respondents are, in general, in favour of open science (Digital Science, 2022).
Figure 1: Respondents’ attitudes towards open science
1. Authors’ motivation
In the State of Open Data 2022 survey, the top circumstances that would motivate respondents to share their data were:
- Citation of their research papers – 67%
- Increased impact and visibility of their papers – 61%
- Some form of public benefit – 56%
- Journal/publisher mandate – 56%
A recent paper in PLOS ONE found that linking a paper to its data in a repository is associated with a citation increase of up to 25.36% for the paper itself (Colavizza et al., 2020).
In the survey, 66% of the respondents who previously shared data stated that they received some form of recognition for their efforts – most commonly via full citation in another article (41%) (Digital Science, 2022).
[Image courtesy of いらすとや]
2. Mandates from funding organizations and policymakers
There has been a growing number of open data mandates from funding organizations and policymakers. 70% of respondents were required to follow a policy on data sharing for their most recent piece of research (Digital Science, 2022).
In the chapter “The US National Institutes of Health’s policies, programs, and partnerships to enhance data discoverability and reuse”, the National Institutes of Health (NIH) explains its new Data Management and Sharing (DMS) Policy, which will take effect on 25 January 2023, in compliance with the White House Office of Science and Technology Policy (OSTP) Memorandum (Digital Science et al., 2022).
Under the new DMS Policy, the NIH, the world’s largest funder of biomedical research, will require all NIH-supported research to have a data management and sharing plan that outlines how scientific data and accompanying metadata will be managed and shared. The NIH strongly encourages the use of open access data sharing repositories as a first choice. The resulting interconnected data assets would enable stewardship of relevant research data, with the ability to measure scientific impact through metrics for usage and utility.
3. Support in data management
72% of respondents would rely on an internal resource (research offices, peers, librarians) for help with managing data or making their data open. On the other hand, 41% would like publishers to help review, curate, and prepare their data for public use, and 38% would like that support from their own institutions.
In the chapter “Preparing for South Africa’s proposed open data strategy: Lessons from Stellenbosch University”, Samuel Simango discusses how Stellenbosch University responded to a proposed national policy and built a data repository.
In April 2021, South Africa’s Department of Communications and Digital Technologies invited the public to make written submissions on a Proposed National Data and Cloud Policy. Two of the proposed interventions are relevant to open data:
1. The development of an open data strategy for the sharing of data that is informed by ‘Data for Good’ principles.
2. The application of the FAIR Data Principles to South Africa’s open data.
To address both the ‘Data for Good’ and the FAIR Data Principles, Stellenbosch University developed a data repository known as SUNScholarData. Its data curation process includes data appraisal, assigning and managing metadata, and facilitating access to research data.
[Image courtesy of いらすとや]
In the chapter “Understanding and supporting data sharing in the humanities: new insights from a publisher survey”, authors from the publishing industry share their insights into open data for humanities researchers.
A public survey conducted in Spring 2022 revealed that respondents tended not to agree with policy mandates or to use data repositories. However, a majority of the respondents had experience sharing research data with others:
- By personal/email transfer – 76.95%
- In a data repository or online database – 36.88%
- On a website – 28.01%
Figure 2: Reported methods of sharing humanities data (Digital Science et al., 2022, p. 26)
The survey responses identified ways in which publishers could improve how they educate researchers about data sharing and facilitate it in practice. They also revealed overall enthusiasm and support for data sharing in principle.
The State of Open Data 2022 report revealed the current trends in open data and open science. While these trends are encouraging, governments, funders, researchers, institutions, and publishers continue to work together to accelerate the sharing of data.
To read the full State of Open Data 2022 report, please visit: https://doi.org/10.6084/m9.figshare.21276984
HKU Libraries also provides support for managing research data. Check out the following tools:
- DataHub – the cloud platform for storing, citing, sharing, and discovering research data and all scholarly outputs for HKU scholars (https://datahub.hku.hk/)
- Guide on DataHub (https://libguides.lib.hku.hk/researchdata/datahub)
- Guide on research data management (https://hub.hku.hk/researchdata/rdm.htm)
Colavizza, G., Hrynaszkiewicz, I., Staden, I., Whitaker, K., & McGillivray, B. (2020). The citation advantage of linking publications to research data. PLOS ONE, 15(4), e0230416. https://doi.org/10.1371/journal.pone.0230416
Digital Science. (2022). The State of Open Data Report 2022: Researchers need more support to assist with open data mandates. https://www.eurekalert.org/news-releases/967428
Digital Science, Goodey, G., Hahnel, M., Zhou, Y., Jiang, L., Chandramouliswaran, I., Hafez, A., Paine, T., Gregurick, S., Simango, S., Palma Peña, J. M., Murray, H., Cannon, M., Grant, R., McKellar, K., & Day, L. (2022). The State of Open Data 2022. https://digitalscience.figshare.com/articles/report/The_State_of_Open_Data_2022/21276984