Tools for Reproducible Research

Description

A key principle of sound scientific procedure is that an experiment or analysis can be repeated and lead to similar conclusions. Published research based on computational analysis, e.g. in bioinformatics or computational biology, has often suffered from incomplete method descriptions (e.g. missing lists of software versions); unavailable raw data; and incomplete, undocumented and/or unavailable code. This makes it practically impossible to reproduce the results of such studies. The term “reproducible research” describes the idea that a scientific publication based on computational analysis should be distributed along with all the raw data and metadata used in the study, all the code and/or computational notebooks needed to produce results from the raw data, and the computational environment or a complete description thereof. Reproducible research not only promotes proper scientific conduct but also lets other researchers build upon previous work. Most importantly, the person setting up a reproducible research project will quickly notice the immediate personal benefits: an organized and structured way of working. The person who most often has to reproduce your analysis is your future self!

Topics covered

The following topics and tools are covered in the course:

Data management
Project organisation
Git
Conda
Snakemake
Nextflow
Quarto
Jupyter
Docker
Apptainer

Learning outcomes

At the end of the course, students should be able to:

Use good practices for data analysis and management
Clearly organise their bioinformatic projects
Use the version control system Git to track and collaborate on code
Use the package and environment manager Conda
Use and develop workflows with Snakemake and Nextflow
Use Quarto and Jupyter Notebooks to document and generate automated reports for their analyses
Use Docker and Apptainer to distribute containerized computational environments

Pre-requisites

The only entry requirements for this course are basic knowledge of Unix systems (i.e. being able to work on the command line) and at least basic knowledge of either R or Python.

Level

Beginner

Course leaders

Erik Fasterius

John Sundh

edu.trr@nbis.se

Upcoming courses

Course	Date	Location	Apply by
Tools for Reproducible Research	2026-04-20 - 2026-04-24	Lund, Stockholm	2026-03-20

Previous courses

Course	Date	Location	Apply by
Tools for Reproducible Research	2025-04-07 - 2025-04-11	Stockholm	2025-03-14
Tools for Reproducible Research	2024-11-25 - 2024-11-29		2024-10-18
Tools for Reproducible Research	2024-04-22 - 2024-04-26		2024-03-18
Tools for Reproducible Research	2023-11-20 - 2023-11-24		2023-10-20
Tools for Reproducible Research	2023-04-24 - 2023-04-28
Tools for Reproducible Research	2022-11-21 - 2022-11-25
Tools for Reproducible Research	2022-04-25 - 2022-04-29
Tools for Reproducible Research	2021-11-15 - 2021-11-19