Reproducible data analysis pipelines in R


This training session is delivered by Dr Nicolas Payette from the Oxford Research Software Engineering Group.


Years ago, you wrote a data analysis script. Hundreds of lines of R code, all in a single file. It was not beautiful, but it worked, and you got a great paper out of it. But now a new version of one of the datasets you used has been released, and there is also that new statistical technique you have been meaning to try anyway. Assuming you still have your original code somewhere, can you still run it? Even on your new machine? Maybe. And do you need to re-run the whole thing if you only change parts of it? It did take ages to run...

In this one-day course, you will learn about packages and practices that can help you make your analyses reproducible and portable. The material is centred around the `targets` package for building computational pipelines, but we will also talk about `renv` for package management, `git` and GitHub for version control and remote execution, and Quarto for the production of final research outputs. We will also, of course, use the `tidyverse` throughout.
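To give a flavour of the approach, here is a minimal sketch of what a `targets` pipeline definition (a `_targets.R` file) can look like. The file and column names are purely illustrative, not part of the course materials:

```r
# _targets.R -- a minimal pipeline sketch (illustrative names)
library(targets)

# Packages each target may use when it runs
tar_option_set(packages = c("readr", "dplyr"))

list(
  # Read the raw data (a course would also cover tracking the
  # file itself with format = "file")
  tar_target(raw_data, read_csv("data/measurements.csv")),

  # Depends on raw_data: re-runs only when raw_data changes
  tar_target(clean_data, filter(raw_data, !is.na(value))),

  # Fit a simple model on the cleaned data
  tar_target(model, lm(value ~ group, data = clean_data))
)
```

Running `tar_make()` then builds the pipeline, skipping any targets whose upstream dependencies are unchanged, so a small edit no longer forces a full re-run.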

Some basic proficiency in R is required to make the most of this course.


Places will be allocated on a first-come, first-served basis, and once places are full, we will maintain a waiting list.

Please only register if you are certain of your availability and commitment to attend.

Booking will open in early December.