How can we ensure that BOLD data become public as quickly as possible?

I would like to open a discussion on data sharing policy and principles. While this topic matters for iBOL and BIOSCAN as a whole, I believe each nation should also have a clear strategy on it. I had some discussion on this with Donald, and we decided to continue it here. I would be grateful for active participation from all of you.

In Finland, we have decided to make ALL barcode data, including sequences and metadata, public almost immediately. A short embargo will be applied to validate the correctness of the data after the sequences are available in BOLD, but after this step the data will be made public regardless of whether they have been published in the scientific literature. In Finland, barcoding is financially supported by governmental funders (presently the Academy of Finland), which means that we use taxpayers’ money to generate barcodes. Data generated with public money should also be made available to the public and not kept private by researchers for a long time.

This principle may of course result in a situation that an individual researcher finds undesirable, as the data (perhaps based on specimens collected by her/him) are made available to other researchers before they are published in the scientific literature. However, I find this risk rather small, especially when it comes to taxonomic research. I think we should recognize the importance of broad accessibility of data, as indicated by point 4 of the BIOSCAN Strategic Plan, which states:

“Ensure that data in BOLD and mBRAVE are to the fullest extent possible, well documented and accessible under open licenses and following the FAIR principles: Follow best practice internationally for accessibility and reuse of BIOSCAN data.”

If I understand correctly, the FAIR principles also help protect against data generated by other researchers being used in an undesirable manner.

I know that some researchers won’t be happy with the idea of making “their” data available that rapidly, but when I turned to the BOLD project managers in Finland and asked if they had anything against making the projects fully public, no one opposed. I believe that making the generated barcode data as widely open as possible would benefit not only the community but also individual researchers, and would enhance collaboration between researchers.

Indeed, it would be really nice to learn about your thoughts regarding this topic.

Many thanks, @mamutane, for posting your thoughts on this fundamental matter. I’ve moved this to a new topic of its own so that we can focus on it. As we expand the visibility of this forum, I plan to direct people here to add their thinking.

My own perspective is that we should aim for universal public access to BOLD data and work to resolve any issues that make this difficult for researchers and contributors, whether those issues relate to national practice, institutional expectations or personal wishes.

Clearly, the widest possible access to the fullest possible dataset of barcodes gives the best basis for confident application in barcoding and metabarcoding applications and offers a good foundation for detecting discrepancies and curating the data as a whole. Public open access to all BOLD projects would be good for science.

At a time when natural systems are also facing unprecedented pressures from climate change and the biodiversity crisis, making these data open is something we can all do to assist scientists and policy makers around the world to find ways to monitor change and respond to these pressures. If we keep these data hidden, we cannot support good planning and sustainable outcomes.

Thanks @ahausmann - you are correct that we should explore enhancements to the BOLD workflows and data presentation as a linked issue. The goal should be not just to make more data accessible but also to build the community around the data that helps to improve it as a reliable and trustworthy resource.

Excellent point, that’s exactly what I mean.

One addition: I also believe that being able to show that we not only publish a lot, but also publish in top journals, would help in gaining funding for barcoding in many countries. Of course, the open question is who would take responsibility for analysing the data. (S)he should be an expert in handling massive data sets and searching for interesting patterns in them. Well, I am clearly drifting too far onto a different topic now.

Thanks again @mamutane. I think such a paper may be a very useful tool for us. As well as analysing what we already have, it could serve as an overview of gaps and needs and how additional investments could point the way to delivering specific results and outcomes.