Comment by bitsofgrace

Here’s mine:

I’m building a small live NFL game-prediction tracker and writing up what I learn as I go:

https://michellepellon.com/portfolio/nfl-game-predictions

# What’s under the hood today

ELO translated to the NFL with margin-of-victory adjustments, a modest home-field term, and week-to-week recency weighting.
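
A minimal sketch of what one such rating update can look like, not the tracker's actual code. The K-factor, the home-field bonus in rating points, and the log-dampened margin multiplier are illustrative assumptions, and recency weighting is left out for brevity.

```python
import math

def expected_home_win(home_elo, away_elo, home_field=48.0):
    """Logistic Elo expectation for the home team.

    home_field is an assumed bonus in rating points, not a fitted value.
    """
    diff = home_elo + home_field - away_elo
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

def update(home_elo, away_elo, home_pts, away_pts, k=20.0, home_field=48.0):
    """Return the new (home_elo, away_elo) after one game."""
    p_home = expected_home_win(home_elo, away_elo, home_field)
    margin = home_pts - away_pts
    result = 1.0 if margin > 0 else 0.0 if margin < 0 else 0.5
    # Log-dampened margin multiplier: blowouts move ratings more, with
    # diminishing returns. A tie gives log(1) = 0, so ratings stay put.
    mov = math.log(abs(margin) + 1.0)
    delta = k * mov * (result - p_home)
    # Zero-sum: whatever the home team gains, the away team loses.
    return home_elo + delta, away_elo - delta
```

For example, `update(1500, 1500, 27, 20)` raises the home rating and lowers the away rating by the same amount.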

Post-hoc calibration with isotonic regression so 70% predictions land near 0.70 empirically.
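
For readers who have not seen isotonic calibration, here is a rough pure-Python sketch of the pool-adjacent-violators idea behind it. The function names are hypothetical, tied raw probabilities get no special handling, and in practice a library implementation would be the safer choice.

```python
def fit_isotonic(probs, outcomes):
    """Pool-adjacent-violators fit.

    Returns a list of (upper_raw_prob, calibrated_prob) knots, sorted by
    raw probability, describing a non-decreasing step function.
    """
    pairs = sorted(zip(probs, outcomes))
    blocks = []  # each block: [sum of outcomes, count, largest raw prob]
    for p, y in pairs:
        blocks.append([float(y), 1, p])
        # Merge backwards while neighbouring block means are out of order.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n, hi = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
            blocks[-1][2] = hi
    return [(hi, s / n) for s, n, hi in blocks]

def calibrate(knots, p):
    """Map a raw probability to the value of the first block that covers it."""
    for hi, value in knots:
        if p <= hi:
            return value
    return knots[-1][1]  # above the largest raw probability seen in training
```

The knots should be fit on held-out games so the calibration map is not tuned to the same data the ratings were trained on.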

Monte Carlo to roll games forward for distributions on weekly win odds and season outcomes, plus basic reliability/Brier/log-loss tracking.
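
A simplified sketch of the simulation and scoring pieces, under the assumption that games are independent with fixed win probabilities. A fuller version would update ratings inside each simulated season; the function names here are hypothetical.

```python
import math
import random

def simulate_wins(win_probs, n_sims=10_000, seed=0):
    """Estimate the distribution of season win totals for one team.

    win_probs holds that team's per-game win probabilities.
    Returns a list where index w is the estimated P(exactly w wins).
    """
    rng = random.Random(seed)
    counts = [0] * (len(win_probs) + 1)
    for _ in range(n_sims):
        wins = sum(rng.random() < p for p in win_probs)
        counts[wins] += 1
    return [c / n_sims for c in counts]

def brier(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def log_loss(probs, outcomes, eps=1e-12):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)
```

Lower is better for both scores; a constant 50% forecast scores 0.25 on Brier and about 0.693 on log-loss.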

# Where I’m taking it (ensemble ideas)

Blend a few complementary signals: (1) pure ELO strength; (2) schedule-adjusted EPA/Success Rate features; (3) injury/QB continuity and rest/travel effects; (4) a small “market prior” from closing lines; (5) weather and play-style/pace features.

Combine via a simple stacked model (regularized logistic, isotonic on top), or a Bayesian hierarchical model that lets team effects evolve with partial pooling.

Separate models for win prob vs. expected margin, then reconcile with a consistent link so the two don’t disagree.
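
One common way to get such a link, shown here only as an illustration, is to treat the final margin as normally distributed around its expectation, so win probability and expected margin become two views of the same number. The 13.5-point standard deviation is an assumed placeholder that would need to be estimated from data.

```python
from statistics import NormalDist

SIGMA = 13.5  # assumed std dev of final margin around its expectation

def win_prob_from_margin(expected_margin, sigma=SIGMA):
    """P(margin > 0) when margin ~ Normal(expected_margin, sigma)."""
    return NormalDist(0.0, sigma).cdf(expected_margin)

def margin_from_win_prob(win_prob, sigma=SIGMA):
    """Inverse of the link above; win_prob must be strictly between 0 and 1."""
    return NormalDist(0.0, sigma).inv_cdf(win_prob)
```

With a shared link like this, the margin model and the win-probability model can be compared on the same scale and any gap between them flagged.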

Emphasis on calibration over leaderboard-chasing: reliability diagrams, ECE, PIT histograms, and backtests that penalize regime drift.
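
As a concrete reference point, ECE with equal-width bins can be sketched in a few lines. The bin count is an arbitrary choice here, and small samples (a single NFL season, say) make the estimate noisy.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted average gap between mean forecast and hit rate per bin."""
    # Each bin tracks: sum of forecasts, sum of outcomes, count.
    bins = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[i][0] += p
        bins[i][1] += y
        bins[i][2] += 1
    n = len(probs)
    ece = 0.0
    for sum_p, sum_y, count in bins:
        if count:
            ece += (count / n) * abs(sum_p / count - sum_y / count)
    return ece
```

The same per-bin means (average forecast against observed frequency) are the points plotted on a reliability diagram.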

# Why I’m doing it

It’s a sandbox to teach myself Monte Carlo and ELO end-to-end—data ingest → feature plumbing → simulation → calibration → eval—on a domain with immediate feedback every week.

# How this connects to my day job (healthcare ops)

I work at BlueSprig, running ~150 ABA therapy clinics. I’m exploring whether ELO-like ideas can augment ops decisions:

“Strength” ratings for clinics, care teams, or scheduling templates based on outcome deltas and throughput (margin-of-victory ≈ effect size/efficiency).

Opponent/schedule ≈ case-mix, payer mix, staffing constraints, geography.

Monte Carlo for expansion planning (new-site ramp curves), capacity/OT forecasting, and risk-adjusted outcome monitoring with calibration so probabilities mean something.

Guardrails for fairness and interpretability so ratings don’t become blunt scorecards.

# Help

If you’ve shipped calibrated ensembles in sports or have pointers on applying rating systems to multi-site healthcare operations, I’d love to trade notes. And if you need someone to do this kind of work as their day job, email me at mgracepellon@gmail.com -- I would love to do this full-time.