Summary

Ph.D. statistician specializing in statistical modeling, experimental design, and causal inference. Experienced in study design evaluation, reproducible analytical workflows, and building statistical tools for evidence-based decision-making.

Experience

Lead Statistical Consultant
Boston University · Boston, MA
  • Managed a team of 12 graduate statistical consultants supporting 16 cross-functional research projects.
  • Designed and evaluated A/B tests; conducted power and sample-size analyses to ensure statistically valid study designs.
  • Built analytical workflows for high-dimensional, noisy, and grouped datasets using GLMs and XGBoost in R and Python.
  • Reviewed analyses, identified limitations in data and study design, and contributed to 2 peer-reviewed publications.
Ph.D. Researcher & Instructor
Boston University · Boston, MA
  • Built reproducible pipelines in R and Python for simulation, estimation, and model evaluation.
  • Designed R Shiny dashboards and visualizations to communicate complex statistical results.
  • Developed a PostgreSQL database for messy research data and wrote SQL queries for extraction.
  • Taught undergraduate statistics courses and mentored graduate students in statistical reasoning and experimental design.
Licensed Practical Nurse (68C), Sergeant
U.S. Army Reserve, 405th Field Hospital · United States

Education

Ph.D. in Statistics
Boston University · Boston, MA
B.S. in Mathematics
Stony Brook University · Stony Brook, NY
Magna Cum Laude

Selected Projects

Geo-Based Incrementality Testing Platform

Causal inference framework for geo-based incrementality experiments: study design, sample-size determination, power analysis, and counterfactual estimation across geographic markets. Includes a Python retrieval-augmented generation (RAG) assistant for non-technical users.

Modular Classification Pipeline for Tabular Data

Reusable Python ML pipeline for cross-validated model training, evaluation, and ensembling using XGBoost, LightGBM, CatBoost, and regularized logistic regression. Applied to real-world datasets in the NESS Statathon, placing 2nd in 2023, 4th in 2024, and 2nd in 2025.

Real-Time Computer Vision Automation System

Headless automation system on Linux using Python, OpenCV, OCR, and CNN-based screen parsing. Graph-based state controller for real-time perception and decision-making. Reduced manual supervision from hours per day to ~15 minutes per day; sustained >99% uptime over 6+ months.

Survival Analysis of Clinical Outcomes

Time-to-event analysis of clinical outcomes using Kaplan–Meier estimation and Cox proportional hazards modeling. Assessed associations between treatment exposure and survival while accounting for censored observations. Produced reproducible statistical reports with hazard ratios and confidence intervals.

Technical Skills

Statistical Methods

  • experimental design
  • power & sample-size analysis
  • hypothesis testing
  • causal inference
  • A/B testing
  • GLMs
  • survival analysis
  • simulation

Programming

  • Python
  • R
  • SQL
  • PostgreSQL
  • Bash

ML / Modeling

  • scikit-learn
  • XGBoost
  • LightGBM
  • CatBoost
  • PyTorch
  • transformers
  • NLP

Tools & Libraries

  • pandas
  • NumPy
  • tidyverse
  • R Shiny
  • OpenCV
  • Git
  • Linux

Domains

  • biomedical research
  • marketing analytics
  • network models
  • computer vision

Publications

Attractor-Based Coevolving Dot Product Random Graph Model

arXiv preprint arXiv:2505.02675, 2025

Modeled polarization and flocking behavior in dynamic networks using graph embedding methods. Proposed an attractor-based framework for coevolving latent-space network dynamics.

Simplex-Constrained Orthogonal Transformation Estimation

Manuscript in preparation, 2026

Introduced a penalty function to align point clouds with the simplex under orthogonality constraints. Targets applications in latent-space model identifiability and estimation.

Awards

2nd Place — NESS Statathon 2023: Predictive modeling for car insurance risk
4th Place — NESS Statathon 2024: Customer conversion prediction pipeline for marketing optimization
2nd Place — NESS Statathon 2025: Predictive modeling for car insurance risk