Reproducibility and reusability are vital aspects of modern science. Achieving these goals requires a variety of measures, including community best practices, data models, and terminologies, designed to provide a common basis for integrating and comparing datasets and analytic methods.
A primary focus of the MIDAS Coordination Center (MCC) is developing tools, guidance, and demonstration projects that help the infectious disease modeling community move toward greater reproducibility and reusability through the adoption of community approaches such as FAIR (Findable, Accessible, Interoperable, and Reusable) data.
These efforts are neither complete nor completed – they are preliminary works in progress. Constructive input from the community is always welcome. Please contact us at firstname.lastname@example.org to provide any comments, questions, or criticisms.
Open sharing of data and software is a cornerstone of reproducibility and reusability. Data and code that are held under tight proprietary control cannot be reused, and resulting analyses cannot be reproduced. Although data sharing in fields such as clinical informatics might be limited due to privacy and regulatory concerns, most datasets used in infectious disease modeling are not subject to any such limitations.
To be truly reusable and shareable, these artifacts must be accompanied by clear metadata describing their origin, scope, contents, and any licensing restrictions that affect their use. Our notes on Sharing Disease Modeling Software provide recommendations for sharing these resources to encourage reuse.
The MIDAS Ontology file is a hierarchy of terms for annotating datasets, software, and other digital objects curated in the MIDAS Online Portal for COVID-19 Modeling Research. The ontology builds on existing relevant ontologies such as Apollo-SV, adding terms for the contents and topics covered in COVID-19 datasets reviewed by the MCC team. For example, it has terms for case counts, death data, school closures, and phylogenetics data.
The MIDAS Coordination Center maintains a manually curated collection of metadata from more than 300 COVID-related resources, including datasets, repositories, dashboards, and software packages. This curated metadata can be browsed through the MIDAS data catalog. These datasets are described in a common metadata format. This format is compatible with Google Dataset Search and will soon be described in a schema to be published on the MCC site.
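Because the MCC schema has not yet been published, the following is only an illustrative sketch of what a dataset description compatible with Google Dataset Search might look like. It assumes the schema.org Dataset vocabulary serialized as JSON-LD, which is the markup Google Dataset Search reads. The function name, the example dataset, its URL, and the chosen fields are hypothetical and are not taken from the MIDAS catalog.

```python
import json


def build_dataset_record(name, description, url, license_url, keywords, creator_name):
    """Return a schema.org Dataset description as a JSON-LD-ready dict.

    Hypothetical helper: the field selection is an assumption, not the MCC schema.
    """
    # This sketch treats name and description as mandatory.
    for field, value in (("name", name), ("description", description)):
        if not value or not value.strip():
            raise ValueError(f"{field} is required")
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,                   # origin of the resource
        "license": license_url,       # licensing restrictions on reuse
        "keywords": list(keywords),   # contents and topics covered
        "creator": {"@type": "Organization", "name": creator_name},
    }


# Hypothetical example values, for illustration only.
record = build_dataset_record(
    name="Example county-level COVID-19 case counts",
    description="Daily confirmed case counts by county (illustrative record).",
    url="https://example.org/datasets/covid-case-counts",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["case counts", "COVID-19"],
    creator_name="Example Health Department",
)

print(json.dumps(record, indent=2))
```

The printed JSON could be embedded in a dataset landing page inside a `script` element of type `application/ld+json`. Keywords would ideally be drawn from a controlled vocabulary such as the MIDAS Ontology rather than free text.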