
Research Data Services

Guidance, tools, and training to support faculty and students working with research data.

Reproducibility 

Reproducibility is obtaining consistent computational results using the same input data, computational steps, methods, code, and conditions of analysis. Reproducibility is important not because it ensures that the results are correct, but rather because it ensures transparency and lends confidence in the quality of academic research.

Online Course: Research Data Management and Reproducible Research (UVic Libraries)


Evaluating Reproducibility 

Use this checklist to evaluate the reproducibility of your research project, including its data, code, and documentation.

Organization Yes / No / Maybe? (explain if necessary)
Are all files encapsulated within one directory?

Is the sub-directory structure clear and easy to navigate?

  • Are the names of subdirectories self-explanatory?
  • Is the raw / input data separated from the derived data?
  • Is the data separated from the code?
  • Are any outputs (figures, tables) provided? Are they contained in their own subdirectory?
Are file names self-explanatory? If not, how could they be improved?
Are there multiple versions of a file? If yes, are versions clearly enumerated?

Is there a README file?

  • If yes, does it specify author contact information, file contents, a directory overview, dependencies, etc.? What other information could it provide to improve reproducibility?
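
Some of the organization questions above can be checked automatically. The following is a minimal sketch in Python, assuming a hypothetical project layout with `data/raw`, `data/derived`, `code`, and `output` subdirectories. These names are illustrative only and are not prescribed by the checklist.

```python
from pathlib import Path

# Hypothetical layout: these subdirectory names are assumptions for
# illustration, not requirements of the checklist.
EXPECTED_DIRS = ["data/raw", "data/derived", "code", "output"]


def check_layout(project_root):
    """Return a dict mapping each expected item to True (found) or False."""
    root = Path(project_root)
    # Each expected subdirectory must exist inside the single project directory.
    results = {name: (root / name).is_dir() for name in EXPECTED_DIRS}
    # Accept any file whose name starts with README (README, README.md, README.txt).
    results["README"] = any(p.is_file() for p in root.glob("README*"))
    return results
```

Calling `check_layout(".")` from the top of a project returns one True / False entry per item. A script like this cannot judge whether names are self-explanatory, so those questions still need a human reviewer.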
Document Software Yes / No / Maybe? (explain if necessary)
Is the software environment specified?
Are dependencies needed to run scripts specified clearly?
Are relative paths used in scripts (vs. absolute paths)?
Are all file conversion, data cleaning and analysis steps documented by scripts?
Is the execution of all code automated by a master script?
Are the decisions behind data cleaning, analysis, and other scripts well documented, either as annotations within the code or as a reproducible report (e.g., R Markdown (*.Rmd))?
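
Several of the software questions above (a specified environment, relative paths, and a master script) can be illustrated together. The sketch below is one possible approach in Python, not a required one. The function names `describe_environment` and `run_pipeline` and the script names passed to them are hypothetical.

```python
import platform
import subprocess
import sys
from importlib import metadata
from pathlib import Path


def describe_environment():
    """List the Python version and installed packages as 'name==version' lines."""
    lines = [f"python=={platform.python_version()}"]
    packages = sorted(
        f"{dist.metadata.get('Name')}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata.get("Name")  # skip entries without a readable name
    )
    return lines + packages


def run_pipeline(project_root, steps):
    """Run each script in order from the project root; stop at the first failure."""
    root = Path(project_root).resolve()
    completed = []
    for step in steps:
        # Paths are given relative to the project root, never as absolute paths.
        script = root / step
        # cwd=root means relative paths inside each script also resolve
        # from the project root, wherever the project is copied to.
        subprocess.run([sys.executable, str(script)], cwd=root, check=True)
        completed.append(step)
    return completed
```

A master script placed at the top of the project could write `"\n".join(describe_environment())` to a text file and then call `run_pipeline` with the project directory and an ordered list such as `["code/01_clean.py", "code/02_analyze.py"]` (hypothetical file names). Because `check=True` raises an error when a step fails, later steps never run on incomplete outputs.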
Document Data Yes / No / Maybe? (explain if necessary)
Are the raw data provided? If only processed data are provided, is there sufficient description to understand transformations made to raw data?
Are all data files necessary to rerun analyses provided? If not, are links to containing repositories specified?
Are data provided in open file formats?
Is sufficient documentation provided to understand the data? (e.g. data dictionary, code book)
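
A data dictionary can be started from the data file itself. The following is a minimal sketch, assuming the data are stored in an open format (CSV) with a header row. The function name `draft_data_dictionary` and the output fields are hypothetical, and the descriptions must still be written by someone who knows the data.

```python
import csv


def draft_data_dictionary(csv_path):
    """Return one skeleton entry per column of a CSV file with a header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = reader.fieldnames or []
        rows = list(reader)

    entries = []
    for column in columns:
        values = [row[column] for row in rows]
        # Treat empty cells (and cells missing from short rows) as missing.
        non_empty = [v for v in values if v not in ("", None)]
        entries.append({
            "variable": column,
            "example": non_empty[0] if non_empty else "",
            "n_missing": len(values) - len(non_empty),
            "description": "",  # to be filled in by the researcher
        })
    return entries
```

The returned list can be saved alongside the data and completed by hand with units, allowed values, and the meaning of each variable.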
Licensing and Sharing Yes / No / Maybe? (explain if necessary)
Is a license specified for the software (e.g., in a README file or a separate license text file)?
Is a license specified for the data?
Is each repository containing the data and code registered with a unique DOI?
Are the repositories and the published article cross-linked with metadata?

References:
Broman, K. (n.d.). Initial steps toward reproducible research. Accessed September 05, 2019 from https://kbroman.org/steps2rr/

Clyburne-Sherin, A. (2019). Preparing data and code for reproducible publication using container technology. Workshop presented at the Research Data Access and Preservation Summit 2019, Coral Gables, FL. Slides accessed September 05, 2019 from http://bit.ly/rdap-workshop

Rokem, A., Marwick, B., & Staneva, V. (2018). Assessing reproducibility. In Kitzes, J., Turek, D., & Deniz, F. (Eds.), The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences. Accessed September 05, 2019 from https://www.practicereproducibleresearch.org/core-chapters/2-assessment.html



Creative Commons License
This work by The University of Victoria Libraries is licensed under a Creative Commons Attribution 4.0 International License unless otherwise indicated when material has been used from other sources.