8.1 Introduction to Case-Based Learning in Clinical Research
Case studies provide a powerful framework for applying the concepts, techniques, and tools we’ve explored throughout this book to real-world clinical research scenarios. In this chapter, we present comprehensive case studies that demonstrate how to implement reproducible R workflows in different clinical research contexts.
8.1.1 The Value of Case-Based Learning
Case-based learning offers several advantages for developing proficiency in clinical data analysis:
Integration: Combines multiple concepts and skills into cohesive workflows
Context: Places technical skills within realistic clinical research scenarios
Decision points: Highlights key decision points and their implications
Problem-solving: Develops critical thinking about real-world challenges
Translation: Bridges the gap between theoretical knowledge and practical application
8.2 Case Study 1: Phase II Oncology Trial
8.2.1 Study Background
This case study examines a Phase II oncology trial evaluating a novel treatment for advanced non-small cell lung cancer (NSCLC):
```r
library(tidyverse)
library(survival)
library(here)
library(knitr)

# Study design information
study_design <- tribble(
  ~Parameter, ~Value,
  "Design", "Randomized, double-blind, placebo-controlled Phase II trial",
  "Population", "Adults with stage IIIB/IV NSCLC with EGFR mutations",
  "Sample size", "120 patients (1:1 randomization)",
  "Primary endpoint", "Progression-free survival (PFS)",
  "Secondary endpoints", "Overall survival (OS), objective response rate (ORR), safety",
  "Follow-up", "Up to 24 months or until disease progression"
)

# Display study design
kable(study_design, caption = "Study Design Summary")
```
8.2.2 Data Preparation
First, we’ll set up our project structure following the reproducible research principles from Chapter 6:
```r
# Create a function to initialize the project (normally run once)
setup_oncology_project <- function() {
  # Create directory structure
  dirs <- c(
    "data/raw",
    "data/processed",
    "R",
    "analysis",
    "reports/figures",
    "reports/tables"
  )

  for (d in dirs) {
    dir.create(here(d), recursive = TRUE, showWarnings = FALSE)
  }

  # Initialize renv
  renv::init()

  # Create helper scripts
  file.create(here("R", "data_cleaning.R"))
  file.create(here("R", "analysis_functions.R"))
  file.create(here("R", "visualization_functions.R"))
}

# Load and clean data
load_oncology_data <- function() {
  # Load sample datasets (in practice, these would be actual trial data)
  demographics <- read_csv(here("data", "raw", "oncology_demographics.csv"))
  baseline <- read_csv(here("data", "raw", "oncology_baseline.csv"))
  efficacy <- read_csv(here("data", "raw", "oncology_efficacy.csv"))
  safety <- read_csv(here("data", "raw", "oncology_safety.csv"))

  # Join datasets
  patient_data <- demographics %>%
    left_join(baseline, by = "patient_id") %>%
    mutate(
      treatment = factor(treatment_group, levels = c("Placebo", "Active")),
      sex = factor(sex),
      ecog = factor(ecog_status),
      stage = factor(disease_stage)
    )

  # Create analysis datasets
  efficacy_data <- efficacy %>%
    left_join(
      patient_data %>% select(patient_id, treatment, age, sex, ecog, stage),
      by = "patient_id"
    )

  safety_data <- safety %>%
    left_join(
      patient_data %>% select(patient_id, treatment, age, sex),
      by = "patient_id"
    )

  # Save processed datasets
  write_csv(patient_data, here("data", "processed", "patient_data.csv"))
  write_csv(efficacy_data, here("data", "processed", "efficacy_data.csv"))
  write_csv(safety_data, here("data", "processed", "safety_data.csv"))

  # Return list of datasets
  return(list(
    patient_data = patient_data,
    efficacy_data = efficacy_data,
    safety_data = safety_data
  ))
}
```
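Because every analysis dataset above is built by joining on `patient_id`, it can be worth confirming that the key is unique in the lookup table before joining, since a duplicated key silently multiplies rows. The following is a minimal sketch on small made-up data frames. `check_unique_key()` is a hypothetical helper, not part of the workflow above, and base R `merge(..., all.x = TRUE)` stands in for `left_join()` so that the sketch has no package dependencies.

```r
# Hypothetical helper: stop if `key` is absent, missing, or duplicated in `df`
check_unique_key <- function(df, key = "patient_id") {
  if (!key %in% names(df)) {
    stop("Key column not found: ", key)
  }
  ids <- df[[key]]
  if (anyNA(ids)) {
    stop("Missing values found in key column: ", key)
  }
  dup_ids <- unique(ids[duplicated(ids)])
  if (length(dup_ids) > 0) {
    stop("Duplicated ", key, " values: ", paste(dup_ids, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up example data (not trial data)
demographics <- data.frame(patient_id = c("P001", "P002", "P003"),
                           age = c(61, 58, 70))
baseline_ok <- data.frame(patient_id = c("P001", "P002", "P003"),
                          ecog_status = c(0, 1, 1))
baseline_dup <- data.frame(patient_id = c("P001", "P002", "P002"),
                           ecog_status = c(0, 1, 2))

# A clean lookup table passes silently, and the join keeps one row per patient
check_unique_key(baseline_ok)
joined <- merge(demographics, baseline_ok, by = "patient_id", all.x = TRUE)
stopifnot(nrow(joined) == nrow(demographics))

# A duplicated key is caught before it can inflate the joined dataset
dup_result <- tryCatch(
  check_unique_key(baseline_dup),
  error = function(e) conditionMessage(e)
)
print(dup_result)
```

Calling a check like this inside `load_oncology_data()` before each join would make the function fail early instead of writing an inflated dataset to `data/processed`.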
Response analysis: Evaluating treatment efficacy using categorical outcomes
Safety assessment: Monitoring and visualizing adverse events
The integration of these approaches within a reproducible workflow provides a comprehensive analysis of the trial results, supporting accurate scientific conclusions and transparent reporting.
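The response analysis listed above compares a categorical outcome between treatment arms. As a minimal sketch, assuming simulated data and made-up response probabilities (none of these numbers come from the trial), objective response rates could be compared with Fisher's exact test:

```r
set.seed(2024)  # simulated data only; not trial results

n_per_arm <- 60
treatment <- factor(rep(c("Placebo", "Active"), each = n_per_arm),
                    levels = c("Placebo", "Active"))

# Assumed true response probabilities, chosen purely for illustration
p_response <- ifelse(treatment == "Active", 0.40, 0.15)
responded <- rbinom(length(treatment), size = 1, prob = p_response)

# Fix both factor levels so the table is always 2 x 2
response <- factor(responded, levels = c(0, 1),
                   labels = c("No response", "Response"))
orr_table <- table(treatment, response)

# Objective response rate (%) per arm
orr_by_arm <- tapply(responded == 1, treatment, mean) * 100

# Fisher's exact test. With rows ordered Placebo, Active and columns ordered
# No response, Response, the estimated odds ratio compares the odds of
# response in the Active arm with the odds in the Placebo arm.
orr_test <- fisher.test(orr_table)

print(orr_table)
print(round(orr_by_arm, 1))
print(orr_test)
```

Fisher's exact test is used here because it stays valid with small cell counts, which are common for response endpoints in Phase II trials of this size.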
8.3 Case Study 2: Longitudinal Observational Study
8.3.1 Study Background
This case study examines a multi-center observational study of patients with rheumatoid arthritis:
```r
library(tidyverse)
library(lme4)
library(here)
library(knitr)

# Study design information
study_design <- tribble(
  ~Parameter, ~Value,
  "Design", "Prospective, multi-center observational cohort study",
  "Population", "Adults with rheumatoid arthritis on various treatments",
  "Sample size", "500 patients from 20 centers",
  "Primary objective", "Identify predictors of treatment response over time",
  "Follow-up", "5 years with biannual assessments",
  "Key measurements", "Disease Activity Score (DAS28), Health Assessment Questionnaire (HAQ), Biomarkers"
)

# Display study design
kable(study_design, caption = "Study Design Summary")
```
We’ll create visualizations to explore longitudinal patterns:
```r
# Load processed data
ra_data <- read_csv(here("data", "processed", "ra_analysis_data.csv"))

# Create individual trajectory plot
trajectory_plot <- ggplot(
  ra_data,
  aes(x = time_point, y = das28, group = patient_id, color = treatment)
) +
  geom_line(alpha = 0.2) +
  geom_smooth(aes(group = treatment), method = "loess", se = TRUE) +
  scale_color_brewer(palette = "Set1") +
  facet_wrap(~ treatment) +
  labs(
    title = "DAS28 Trajectories by Treatment Group",
    subtitle = "Individual patient trajectories with treatment group trends",
    x = "Month",
    y = "Disease Activity Score (DAS28)",
    color = "Treatment Group"
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")

# Save plot
ggsave(here("reports", "figures", "das28_trajectories.png"),
       plot = trajectory_plot, width = 10, height = 8, dpi = 300)

# Display plot
print(trajectory_plot)

# Create boxplot of DAS28 by time and treatment
boxplot_das28 <- ggplot(
  ra_data,
  aes(x = factor(time_point), y = das28, fill = treatment)
) +
  geom_boxplot(alpha = 0.8) +
  scale_fill_brewer(palette = "Set1") +
  labs(
    title = "DAS28 Distribution Over Time by Treatment Group",
    x = "Month",
    y = "Disease Activity Score (DAS28)",
    fill = "Treatment Group"
  ) +
  theme_minimal() +
  theme(
    legend.position = "bottom",
    axis.text.x = element_text(angle = 0)
  )

# Save and display boxplot
ggsave(here("reports", "figures", "das28_boxplot.png"),
       plot = boxplot_das28, width = 10, height = 6, dpi = 300)
print(boxplot_das28)
```
8.3.4 Mixed Effects Modeling
We’ll use mixed effects models to analyze the longitudinal data:
```r
library(lme4)
library(lmerTest)

# Fit linear mixed effects model:
# random intercept and slope per patient, random intercept per center
lmm_model <- lmer(
  das28 ~ time_point * treatment + age + sex +
    (1 + time_point | patient_id) + (1 | center),
  data = ra_data
)

# Model summary
lmm_summary <- summary(lmm_model)
print(lmm_summary)

# Extract fixed effects (lmerTest adds the df and Pr(>|t|) columns)
fixed_effects <- as.data.frame(coef(summary(lmm_model)))
fixed_effects$p_value <- fixed_effects$`Pr(>|t|)`
fixed_effects$variable <- rownames(fixed_effects)

# Display fixed effects
kable(
  fixed_effects %>%
    select(variable, Estimate, `Std. Error`, `t value`, p_value),
  caption = "Fixed Effects from Linear Mixed Model",
  col.names = c("Variable", "Estimate", "Std. Error", "t value", "p-value")
)

# Plot model predictions.
# predict() returns one value per row used in the fit, so this assignment
# assumes no rows of ra_data were dropped for missing values.
ra_data$predicted <- predict(lmm_model)

prediction_plot <- ggplot(ra_data, aes(x = time_point, color = treatment)) +
  geom_point(aes(y = das28), alpha = 0.1) +
  geom_smooth(aes(y = predicted, group = treatment), method = "loess") +
  scale_color_brewer(palette = "Set1") +
  labs(
    title = "Mixed Effects Model Predictions of DAS28 Over Time",
    x = "Month",
    y = "Disease Activity Score (DAS28)",
    color = "Treatment Group"
  ) +
  theme_minimal()

# Save and display prediction plot
ggsave(here("reports", "figures", "lmm_predictions.png"),
       plot = prediction_plot, width = 8, height = 6, dpi = 300)
print(prediction_plot)
```
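Written out, the `lmer()` formula above corresponds to the model below. This is a sketch that assumes a two-level treatment factor and a binary sex variable for simplicity; with more treatment levels, the treatment and interaction terms become sets of indicator terms. The symbols are notation introduced here for illustration: $i$ indexes patients, $j$ assessments, and $k$ centers.

```latex
\begin{aligned}
\text{DAS28}_{ijk} &= \beta_0 + \beta_1\, t_{ij} + \beta_2\, \text{trt}_i
  + \beta_3\, (t_{ij} \times \text{trt}_i)
  + \beta_4\, \text{age}_i + \beta_5\, \text{sex}_i \\
&\quad + b_{0i} + b_{1i}\, t_{ij} + c_k + \varepsilon_{ijk}, \\
(b_{0i},\, b_{1i})^\top &\sim \mathcal{N}(\mathbf{0},\, \boldsymbol{\Sigma}_b), \qquad
c_k \sim \mathcal{N}(0,\, \sigma_c^2), \qquad
\varepsilon_{ijk} \sim \mathcal{N}(0,\, \sigma^2)
\end{aligned}
```

Here $b_{0i}$ and $b_{1i}$ are the patient-level random intercept and slope from `(1 + time_point | patient_id)`, and $c_k$ is the center-level random intercept from `(1 | center)`. The interaction coefficient $\beta_3$ describes how the average DAS28 slope over time differs between treatment groups.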
8.3.5 Treatment Response Analysis
We’ll analyze treatment response patterns over time:
```r
# Define treatment response criteria.
# Sort by visit first so that first(das28) is each patient's baseline value.
ra_data <- ra_data %>%
  arrange(patient_id, time_point) %>%
  group_by(patient_id) %>%
  mutate(
    baseline_das28 = first(das28),
    das28_change = das28 - baseline_das28,
    responder = das28_change <= -1.2 |
      (das28_change <= -0.6 & das28 <= 3.2)
  ) %>%
  ungroup()

# Calculate response rates by time and treatment.
# The denominator counts only assessments with a known response status,
# so missing values do not deflate the rate.
response_rates <- ra_data %>%
  filter(time_point %in% c(6, 12, 24, 36, 48, 60)) %>%
  group_by(time_point, treatment) %>%
  summarize(
    n = sum(!is.na(responder)),
    responders = sum(responder, na.rm = TRUE),
    response_rate = responders / n * 100,
    .groups = "drop"
  )

# Display response rates
kable(
  response_rates,
  caption = "Response Rates by Time Point and Treatment",
  col.names = c("Month", "Treatment", "N", "Responders", "Response Rate (%)")
)

# Plot response rates (linewidth replaces the deprecated size argument for lines)
response_plot <- ggplot(
  response_rates,
  aes(x = time_point, y = response_rate, color = treatment, group = treatment)
) +
  geom_line(linewidth = 1) +
  geom_point(size = 3) +
  scale_color_brewer(palette = "Set1") +
  labs(
    title = "Treatment Response Rates Over Time",
    x = "Month",
    y = "Response Rate (%)",
    color = "Treatment Group"
  ) +
  theme_minimal() +
  scale_x_continuous(breaks = c(6, 12, 24, 36, 48, 60))

# Save and display response plot
ggsave(here("reports", "figures", "response_rates.png"),
       plot = response_plot, width = 8, height = 6, dpi = 300)
print(response_plot)
```
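The response rule in the code above combines two conditions: a DAS28 improvement of at least 1.2 points, or an improvement of at least 0.6 points together with a current DAS28 of 3.2 or lower. The sketch below isolates that rule in a small function and applies it to four made-up patients to show how each condition behaves. `is_responder()` is a hypothetical helper introduced for illustration.

```r
# Hypothetical helper mirroring the response rule used above
is_responder <- function(baseline_das28, current_das28) {
  change <- current_das28 - baseline_das28
  change <= -1.2 | (change <= -0.6 & current_das28 <= 3.2)
}

# Made-up example values (not study data)
examples <- data.frame(
  baseline = c(5.8, 3.9, 5.0, 4.0),
  current  = c(4.4, 3.1, 4.2, NA)
)
examples$change <- examples$current - examples$baseline
examples$responder <- is_responder(examples$baseline, examples$current)

print(examples)
# Row 1: improvement of 1.4 points                       -> TRUE
# Row 2: improvement of 0.8 points and DAS28 of 3.1      -> TRUE
# Row 3: improvement of 0.8 points but DAS28 still 4.2   -> FALSE
# Row 4: missing follow-up value                         -> NA
```

Keeping the rule in one function makes it easier to unit test and to swap in an alternative response definition for a sensitivity analysis.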
8.3.6 Biomarker Analysis
We’ll examine the relationship between biomarkers and treatment response:
```r
# Restrict to the 24-month assessment (assumed to be one row per patient)
month24_data <- ra_data %>%
  filter(time_point == 24)

# Fit logistic mixed model for response prediction.
# With a single observation per patient, a patient-level random intercept
# is not identifiable, so only the center-level random intercept is kept.
biomarker_model <- glmer(
  responder ~ treatment * crp + age + sex + (1 | center),
  family = binomial,
  data = month24_data
)

# Model summary
bio_summary <- summary(biomarker_model)
print(bio_summary)

# Extract odds ratios; parm = "beta_" limits the Wald intervals to the
# fixed effects so the rows line up with fixef()
odds_ratios <- exp(fixef(biomarker_model))
ci <- exp(confint(biomarker_model, parm = "beta_", method = "Wald"))

# Create results table
or_table <- data.frame(
  variable = names(fixef(biomarker_model)),
  odds_ratio = odds_ratios,
  lower_ci = ci[, 1],
  upper_ci = ci[, 2],
  row.names = NULL
)

# Display odds ratios
kable(
  or_table,
  caption = "Odds Ratios for Treatment Response Prediction",
  col.names = c("Variable", "Odds Ratio", "95% CI Lower", "95% CI Upper")
)

# Create prediction plot; predictions are stored on the 24-month subset,
# which has the same number of rows as the prediction vector
month24_data$prob_response <- predict(
  biomarker_model,
  newdata = month24_data,
  type = "response"
)

biomarker_plot <- ggplot(
  month24_data,
  aes(x = crp, y = prob_response, color = treatment)
) +
  geom_point() +
  geom_smooth(method = "loess") +
  scale_color_brewer(palette = "Set1") +
  labs(
    title = "Probability of Treatment Response by CRP Level",
    x = "C-Reactive Protein (mg/L)",
    y = "Probability of Response",
    color = "Treatment Group"
  ) +
  theme_minimal()

# Save and display biomarker plot
ggsave(here("reports", "figures", "biomarker_response.png"),
       plot = biomarker_plot, width = 8, height = 6, dpi = 300)
print(biomarker_plot)
```
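The odds ratios and Wald intervals in the table above come from exponentiating the log-odds coefficients and their interval limits. As a minimal sketch of that arithmetic, the example below uses a plain `glm()` on simulated data instead of the mixed model, which is a simplification so that the code runs without lme4. The variable names, coefficients, and data-generating model are all assumptions made for illustration.

```r
set.seed(42)  # simulated data only; not study results

n <- 300
crp <- rexp(n, rate = 1 / 10)  # made-up CRP values (mg/L)
treatment <- factor(sample(c("A", "B"), n, replace = TRUE))

# Assumed data-generating model, for illustration only
lin_pred <- -0.5 + 0.6 * (treatment == "B") - 0.03 * crp
responder <- rbinom(n, size = 1, prob = plogis(lin_pred))

fit <- glm(responder ~ treatment + crp, family = binomial)

# Wald interval on the log-odds scale: estimate +/- z * standard error,
# then exponentiate to move to the odds-ratio scale
est <- coef(fit)
se <- sqrt(diag(vcov(fit)))
z <- qnorm(0.975)

or_table <- data.frame(
  variable = names(est),
  odds_ratio = exp(est),
  lower_ci = exp(est - z * se),
  upper_ci = exp(est + z * se),
  row.names = NULL
)
print(or_table)

# The same intervals from the built-in Wald method
wald_builtin <- exp(confint.default(fit))
print(wald_builtin)
```

Because the interval is built on the log-odds scale and then exponentiated, it is asymmetric around the odds ratio, which is expected.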
8.3.7 Key Insights
This longitudinal case study highlights several important aspects of observational research analysis:
Longitudinal data management: Transforming complex longitudinal data into analysis-ready formats
Mixed effects modeling: Accounting for within-subject correlations and center effects
Trajectory visualization: Creating informative visualizations of patient trajectories
Response prediction: Using biomarkers to predict treatment response
Time-varying effects: Capturing how treatment effects evolve over time
The methods demonstrated here can be applied to many types of longitudinal clinical studies, from small observational studies to large registry-based analyses.
8.4 Case Study 3: Registry-Based Pharmacovigilance Study
8.4.1 Study Background
This case study examines a large-scale pharmacovigilance study using registry data:
```r
library(tidyverse)
library(survival)
library(here)
library(knitr)

# Study design information
study_design <- tribble(
  ~Parameter, ~Value,
  "Design", "Registry-based retrospective cohort study",
  "Data source", "National health registry data",
  "Population", "Patients receiving one of four antihypertensive medications",
  "Sample size", "Approximately 50,000 patients",
  "Primary outcome", "Incidence of adverse events of special interest (AESI)",
  "Follow-up", "Up to 5 years from treatment initiation"
)

# Display study design
kable(study_design, caption = "Study Design Summary")
```
8.4.2 Data Preparation and Propensity Score Matching
Given the observational nature of registry data, we’ll implement propensity score matching:
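One way such a matching step could look is sketched below on simulated data. It simplifies the four-medication comparison to two groups, uses only base R, and implements greedy 1:1 nearest-neighbour matching on the logit of the propensity score with a caliper. The covariates, coefficients, and caliper width are assumptions chosen for illustration, and in practice a dedicated package such as MatchIt is commonly used instead of a hand-written loop.

```r
set.seed(123)  # simulated data only; not registry data

n <- 400
age <- round(rnorm(n, mean = 62, sd = 10))
diabetes <- rbinom(n, 1, 0.3)
# Treatment assignment depends on covariates (confounding by indication)
treated <- rbinom(n, 1, plogis(-3 + 0.04 * age + 0.8 * diabetes))
cohort <- data.frame(id = seq_len(n), age, diabetes, treated)

# Step 1: estimate propensity scores with logistic regression
ps_model <- glm(treated ~ age + diabetes, family = binomial, data = cohort)
cohort$ps <- fitted(ps_model)

# Step 2: greedy 1:1 nearest-neighbour matching without replacement,
# on the logit of the propensity score, within a caliper of 0.2 SD
logit_ps <- qlogis(cohort$ps)
caliper <- 0.2 * sd(logit_ps)
treated_idx <- which(cohort$treated == 1)
control_idx <- which(cohort$treated == 0)

pairs <- list()
for (i in treated_idx) {
  if (length(control_idx) == 0) break
  d <- abs(logit_ps[control_idx] - logit_ps[i])
  best <- which.min(d)
  if (d[best] <= caliper) {
    pairs[[length(pairs) + 1]] <- c(treated = i, control = control_idx[best])
    control_idx <- control_idx[-best]  # each control is used at most once
  }
}
matched_pairs <- do.call(rbind, pairs)
matched <- cohort[c(matched_pairs[, "treated"], matched_pairs[, "control"]), ]

# Step 3: check covariate balance with standardized mean differences
smd <- function(x, g) {
  (mean(x[g == 1]) - mean(x[g == 0])) /
    sqrt((var(x[g == 1]) + var(x[g == 0])) / 2)
}
balance <- data.frame(
  covariate = c("age", "diabetes"),
  smd_before = c(smd(cohort$age, cohort$treated),
                 smd(cohort$diabetes, cohort$treated)),
  smd_after = c(smd(matched$age, matched$treated),
                smd(matched$diabetes, matched$treated))
)
print(balance)
```

Standardized mean differences are reported before and after matching because balance on the measured covariates, not the fit of the propensity model, is what indicates whether matching has done its job. Unmeasured confounders are not addressed by this step.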
This registry-based pharmacovigilance case study demonstrates several important techniques for real-world evidence analysis:
Propensity score matching: Reducing confounding bias in observational data
Incidence rate calculation: Properly accounting for varying exposure times
Survival analysis for safety outcomes: Analyzing time-to-event data for adverse events
Risk stratification: Identifying patients at higher risk of adverse events
Sensitivity analyses: Testing the robustness of findings using alternative methods
These methods are essential for generating reliable evidence from real-world data sources, which increasingly complement traditional clinical trials in regulatory decision-making and clinical practice guidelines.
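The incidence rate calculation mentioned above divides the number of events by the total person-time at risk, which is how varying exposure times are accounted for. A minimal sketch with made-up counts and follow-up totals (the medication labels and all numbers are assumptions for illustration) could look like this:

```r
# Made-up event counts and follow-up, for illustration only
aesi <- data.frame(
  medication = c("Drug A", "Drug B"),
  events = c(48, 30),
  person_years = c(12500, 11800)
)

# Incidence rate per 1,000 person-years
aesi$rate_per_1000 <- aesi$events / aesi$person_years * 1000

# Exact Poisson confidence interval for each rate, on the same scale
ci <- t(mapply(
  function(x, py) poisson.test(x, T = py)$conf.int * 1000,
  aesi$events, aesi$person_years
))
aesi$lower_ci <- ci[, 1]
aesi$upper_ci <- ci[, 2]
print(aesi)

# Incidence rate ratio (Drug A relative to Drug B) with an exact test
irr <- poisson.test(aesi$events, T = aesi$person_years)
print(irr)
```

Using person-years as the denominator means that a patient followed for five years contributes five times as much exposure as a patient followed for one year, which a simple proportion of patients with an event would ignore.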
8.5 Conclusion
These case studies demonstrate how the principles, techniques, and tools covered throughout this book can be applied to solve real-world clinical research challenges. By integrating data preparation, statistical analysis, visualization, and reproducible workflows, we can effectively analyze and communicate clinical research findings.
The key lessons from these case studies include:
Integration of methods: Each case study required multiple analytical approaches working together
Reproducible workflows: Structured project organization and dependency management ensured reproducibility
Visualization as communication: Tailored visualizations effectively communicated complex findings
Context-specific approaches: Different clinical research contexts required adapting our analytical approach
In the next chapter, we’ll explore regulatory considerations for R-based clinical research, building on the practical applications demonstrated in these case studies.