6 Where To Publish Data

6.1 Searching for Research data repositories

re3data is a global registry of research data repositories with over 3,000 entries (Pampel et al. 2023 [cito:citesAsAuthority]). If you don’t know of a suitable repository for your data, searching re3data for one that is a good fit is a good place to start.

FAIRsharing.org is a curated resource of educational material on databases, standards, and policies for data sharing.

6.3 Integrated publishing - a possible future

Data, analysis, prose, collaboration, pre-print, review and publication in one place with literate programming and single source publishing

You begin your project on an instance of a platform like Renku (Section 4.8). Start by uploading your raw data to a domain-specific data repository, which gives you a DOI or accession number for your dataset, and import this into your project. You perform your computational analyses in the reproducible computational environment, potentially documenting your analysis as a workflow, written with a pipeline management tool, that others could reuse. You write your manuscript in a literate programming format like Quarto, generating the statistics and graphics for inclusion in the manuscript with code from your data in the same reproducible computational environment. You work with your collaborators on the manuscript using a git hosting tool like GitLab, where you raise and discuss issues and share revised versions.

You publish a pre-print by making use of a static site generator like the one built into GitLab and simply setting the project to public. You tag this version 0.0.0 and associate it with a DOI from Zenodo. To manage reviews of your work you make use of GitLab issues in a manner similar to the review processes of JOSS, rOpenSci and F1000, but potentially independent of a particular publication venue, through community peer review projects like Peer Community In (PCI) and Review Commons. This approach permits author-led updates, errata and corrections whilst preserving a version of record (Kane and Amin 2023 [cito:agreesWith]).

Once reviewed and published you have the 1.0.0 version of your manuscript. For future minor corrections you increment the patch version, for example to 1.0.1, and your change-log reflects that you fixed a typo. If you add a new dataset or fix an error that changes an outcome you increment the minor version number. If the journal updates the version of record you increment the major version number.
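The versioning rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the change labels (`typo`, `new_dataset`, `outcome_changing_fix`, `version_of_record_update`) and the function names are hypothetical, and resetting the lower version components to zero on a minor or major bump follows the usual Semantic Versioning convention rather than anything stated in the text.

```python
from typing import Dict, Tuple

# Hypothetical labels for the kinds of change described in the text,
# mapped to the version component each one increments.
BUMP_FOR_CHANGE: Dict[str, str] = {
    "typo": "patch",
    "new_dataset": "minor",
    "outcome_changing_fix": "minor",
    "version_of_record_update": "major",
}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_version(version: str, change: str) -> str:
    """Return the next manuscript version for a given kind of change."""
    major, minor, patch = parse_version(version)
    bump = BUMP_FOR_CHANGE[change]  # raises KeyError for unknown labels
    if bump == "major":
        # Assumption: lower components reset to zero, as in Semantic Versioning.
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    print(bump_version("1.0.0", "typo"))  # 1.0.1
```

In practice the resulting string would be used as the git tag for the release and recorded in the change-log alongside a description of what changed.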

In this fashion the complete history of the project is documented from start to finish, and you never have to change medium from scripts, to manuscripts in word processors, to emailing PDFs, to publisher websites and so on. Review is handled with the same set of tools as your internal collaboration with co-authors. Pre-print publication amounts to creating a version tag and setting the repository to public. Anyone can pick up your project in its entirety and play around with their own variants of your analysis at the click of a button (specifically the ‘fork’ button).

Baker, Monya. 2021. “Five Keys to Writing a Reproducible Lab Protocol.” Nature 597 (7875): 293–94. https://doi.org/10.1038/d41586-021-02428-3.

Kane, Adam, and Bawan Amin. 2023. “Amending the Literature Through Version Control.” Biology Letters 19 (1). https://doi.org/10.1098/rsbl.2022.0463.

Lee, Jamie A., Josef Spidlen, Keith Boyce, Jennifer Cai, Nicholas Crosbie, Mark Dalphin, Jeff Furlong, et al. 2008. “MIFlowCyt: The Minimum Information about a Flow Cytometry Experiment.” Cytometry Part A 73A (10): 926–30. https://doi.org/10.1002/cyto.a.20623.

Pampel, Heinz, Nina Leonie Weisweiler, Dorothea Strecker, Michael Witt, Paul Vierkant, Kirsten Elger, Roland Bertelmann, et al. 2023. “Re3data Indexing the Global Research Data Repository Landscape Since 2012.” Scientific Data 10 (1). https://doi.org/10.1038/s41597-023-02462-y.

Sarkans, Ugis, Wah Chiu, Lucy Collinson, Michele C. Darrow, Jan Ellenberg, David Grunwald, Jean-Karim Hériché, et al. 2021. “REMBI: Recommended Metadata for Biological Images - Enabling Reuse of Microscopy Data in Biology.” Nature Methods 18 (12): 1418–22. https://doi.org/10.1038/s41592-021-01166-8.

Schmied, Christopher, Michael S. Nelson, Sergiy Avilov, Gert-Jan Bakker, Cristina Bertocchi, Johanna Bischof, Ulrike Boehm, et al. 2023. “Community-Developed Checklists for Publishing Images and Image Analyses.” Nature Methods, September. https://doi.org/10.1038/s41592-023-01987-9.

Spidlen, Josef, Karin Breuer, and Ryan Brinkman. 2012. “Preparing a Minimum Information about a Flow Cytometry Experiment (MIFlowCyt) Compliant Manuscript Using the International Society for Advancement of Cytometry (ISAC) FCS File Repository (FlowRepository.org).” Current Protocols in Cytometry 61 (1). https://doi.org/10.1002/0471142956.cy1018s61.

Swedlow, Jason R., Pasi Kankaanpää, Ugis Sarkans, Wojtek Goscinski, Graham Galloway, Leonel Malacrida, Ryan P. Sullivan, et al. 2021. “A Global View of Standards for Open Image Data Formats and Repositories.” Nature Methods 18 (12): 1440–46. https://doi.org/10.1038/s41592-021-01113-7.
