Resume

Edie Espejo

Data scientist with a foundation in statistical thinking and experience in modern machine learning, focused on interpreting complex, real-world data and turning it into meaningful insight in healthcare and customer analytics.

Let’s Connect

LinkedIn | GitHub

Experience

Biostatistician

2025-Present
UCSF Division of Geriatrics

  • Designed and validated clinical prediction models for mortality, recovery, and readmission outcomes, using logistic regression and generalized linear mixed-effects models for hierarchical healthcare data

  • Addressed missing data using multiple imputation (MICE) and evaluated model performance through discrimination (AUC/ROC), calibration (intercept/slope), and internal–external validation across sites

  • Led COVID-19 analyses of mortality and LTACH outcomes, applying modeling frameworks to characterize recovery trajectories in critically ill patients

  • Analyzed large-scale EHR text data (~100K records) using NLP methods in Python to study linguistic patterns (e.g., pronoun usage) and their association with clinical outcomes

Senior Statistician

2021-2025
Northern California Institute for Research and Education

  • Investigated the impact of disruptive life events on dementia patients, using longitudinal and time-to-event models (interrupted time series, Cox models) and analysis pipelines in R (dplyr, ggplot2) to evaluate functional decline and mortality

  • Analyzed nationally representative datasets to study deprescribing, smoking epidemiology, and patient outcomes using survey-based methods in R

  • Studied bias in ICU clinical documentation, applying NLP methods in Python (transformer models including DeBERTa) to analyze language patterns in EHR notes

Data Science Intern

2019
Gap, Inc.

  • Analyzed customer behavior in large-scale datasets using SQL to inform business strategy

  • Developed visualizations and reports in R (ggplot2, dplyr) to communicate insights to stakeholders

  • Presented findings and collaborated with cross-functional teams, working in shared server environments to support scalable analysis

Education

Biostatistics, M.A.
University of California, Berkeley
2020

Statistics, B.S.
University of California, Davis
2017