6.1 Introduction to Reproducible Research in Clinical Settings
Reproducibility is fundamental to clinical research integrity, particularly in regulated environments. In this chapter, we explore how to implement robust reproducible research workflows using R.
6.1.1 The Value of Reproducibility in Clinical Research
Reproducible research provides several key benefits in clinical settings:
Regulatory compliance: Meeting FDA and other regulatory requirements
Error reduction: Minimizing mistakes through automation and validation
Transparency: Enabling review and verification of methods and results
Efficiency: Streamlining updates when data or requirements change
Knowledge transfer: Facilitating collaboration and continuity
6.1.2 Key Elements of Reproducible Research
Code
library(knitr)
library(tidyverse)

# Create a table of reproducibility elements
reproducibility_elements <- tribble(
  ~Element, ~Description, ~R_Tools,
  "Version control", "Tracking changes to code and documents", "git, GitHub, GitLab",
  "Environment management", "Capturing software dependencies", "renv, packrat, Docker",
  "Code organization", "Structuring analysis code", "R packages, targets, drake",
  "Documentation", "Recording methods and decisions", "roxygen2, knitr, quarto",
  "Data management", "Tracking data provenance", "DataPackageR, pins, arrow",
  "Workflow automation", "Orchestrating analysis steps", "Make, targets, drake",
  # valtools replaces valr here: valr is a genome-interval package, not a validation tool
  "Validation", "Verifying analysis correctness", "testthat, valtools, riskmetric"
)

# Display the table
kable(reproducibility_elements)
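As a small illustration of the "Environment management" row, the sketch below records the R version and session details in a plain-text log using only base R. The function name `write_session_log` and the log location are hypothetical, chosen for this example; a project that uses renv would more likely rely on `renv::snapshot()` and its lockfile.

```r
# Hypothetical helper: write the R version and session details to a log file.
write_session_log <- function(log_path) {
  # Make sure the target directory exists before writing
  dir.create(dirname(log_path), recursive = TRUE, showWarnings = FALSE)
  lines <- c(
    paste("Logged at:", format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z")),
    paste("R version:", R.version.string),
    "",
    capture.output(print(sessionInfo()))
  )
  writeLines(lines, log_path)
  invisible(log_path)
}

# Example usage with a temporary directory (an assumption for this sketch)
log_file <- write_session_log(
  file.path(tempdir(), "output", "logs", "session_info.txt")
)
```

Storing such a log next to each set of results makes it possible to see later which R version and packages produced them.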
6.2 Setting Up a Reproducible Project Structure
6.2.1 Project Organization
Creating a well-organized project structure is the foundation of reproducibility:
Code
# Function to create a standardized project structure
create_clinical_project <- function(project_name, base_dir = "~/projects") {
  # Create main project directory
  project_dir <- file.path(base_dir, project_name)
  dir.create(project_dir, recursive = TRUE, showWarnings = FALSE)

  # Create standard subdirectories
  dirs <- c(
    "data/raw",        # Original unmodified data
    "data/processed",  # Cleaned and processed data
    "data/external",   # External reference data
    "R",               # R functions and scripts
    "analysis",        # Analysis scripts
    "reports/figures", # Generated figures
    "reports/tables",  # Generated tables
    "docs",            # Documentation
    "output/logs",     # Log files
    "renv/library"     # Isolated package library
  )

  # Create directories
  for (d in dirs) {
    dir.create(file.path(project_dir, d), recursive = TRUE, showWarnings = FALSE)
  }

  # Create files like README.md, .gitignore, and setup.R
  # (Code omitted for brevity - see online repository for full example)

  # Return the project directory path
  return(project_dir)
}

# Example usage
# project_path <- create_clinical_project("clinical_trial_2023")
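Once a project skeleton exists, it can be useful to confirm that the expected folders are actually present, for example at the start of an analysis script. The following is a minimal sketch using base R only; the helper name `find_missing_dirs`, the shortened directory list, and the temporary demo location are assumptions made for illustration.

```r
# Hypothetical helper: report which expected subdirectories are missing.
find_missing_dirs <- function(project_dir, expected_dirs) {
  present <- dir.exists(file.path(project_dir, expected_dirs))
  expected_dirs[!present]
}

# Demonstration in a temporary directory (paths are assumptions for this sketch)
demo_dir <- file.path(tempdir(), "demo_clinical_project")
expected <- c("data/raw", "data/processed", "R", "analysis", "docs")

# Create all but one of the expected directories
for (d in setdiff(expected, "docs")) {
  dir.create(file.path(demo_dir, d), recursive = TRUE, showWarnings = FALSE)
}

# Lists the directories that still need to be created
missing_dirs <- find_missing_dirs(demo_dir, expected)
print(missing_dirs)
```

Returning the missing paths, rather than stopping with an error, leaves it to the calling script to decide whether to create them or abort.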
6.2.2 Using the here Package
The here package builds file paths relative to the project root, so the same code runs unchanged on different machines and operating systems:
Code
library(here)

# Instead of this (not reproducible across systems):
# clinical_data <- read.csv("C:/Users/username/projects/clinical_trial/data/raw/baseline.csv")

# Use this (resolved relative to the project root, so it works the same on any system)
clinical_data <- read.csv(here("data", "raw", "baseline.csv"))

# Create a function that uses reproducible paths
save_analysis_result <- function(result, filename) {
  output_path <- here("output", filename)
  # Create the output directory if it does not exist yet
  dir.create(dirname(output_path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, output_path)
  return(output_path)
}

# Example usage (assumes baseline.csv contains outcome, treatment, age, and sex columns)
model_result <- lm(outcome ~ treatment + age + sex, data = clinical_data)
save_analysis_result(model_result, "primary_analysis_model.rds")
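A saved result is only useful if it can be read back unchanged. The sketch below checks this round trip for a model like the one passed to `save_analysis_result()`. To stay self-contained it uses simulated data and a temporary file rather than `here()`; all variable names and the simulated values are assumptions for illustration, not study data.

```r
# Simulated data for illustration only (not from any real study)
set.seed(2023)
n <- 40
sim_data <- data.frame(
  treatment = factor(rep(c("placebo", "active"), each = n / 2)),
  age = round(runif(n, min = 30, max = 75)),
  sex = factor(rep(c("F", "M"), times = n / 2))
)
sim_data$outcome <- 10 + 2 * (sim_data$treatment == "active") +
  0.1 * sim_data$age + rnorm(n)

# Fit the model, save it, and read it back
model_fit <- lm(outcome ~ treatment + age + sex, data = sim_data)
rds_path <- tempfile(fileext = ".rds")
saveRDS(model_fit, rds_path)
reloaded_fit <- readRDS(rds_path)

# The reloaded coefficients should match the original fit
coefs_match <- isTRUE(all.equal(coef(model_fit), coef(reloaded_fit)))
print(coefs_match)
```

A check of this kind could be added to a test suite so that changes to how results are stored are caught early.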
For further details on structuring clinical research projects, see the additional resources section at the end of this chapter.