Grammer/Graham BaseRuns + PythagenPat + FIP/SIERA

Two data sources: BACKTEST 2024–25 regular season, 782 bets — the validated performance baseline. LIVE Spring 2026, 55 graded games — calibration data, no bet signals until Opening Day Mar 27.

BACKTEST 2024–2025 regular season · 782 bets · $50 flat · validated
P&L
+$6,495
2024 + 2025
ROI
16.6%
782 bets · $50 unit
Bet Win Rate
56.1%
439 W / 343 L
Reg. Slope
0.448
ideal 1.0 · overconfident
Avg Odds
2.11
post underdog filter
LIVE Spring training 2026 · 55 graded · calibration only · bet signals suppressed
Pick Accuracy
56.4%
31/55 graded games
Last 10 Games
70%
7/10 correct picks
Last 20 Games
70%
14/20 correct picks
vs Market
61.8%
market baseline
Previous model (ERA as xFIP · $100 flat)

Bets: 1,001
Bet win rate: 50.8%
ROI: 10.4%
Odds 2.7+ included: 200 bets @ 28.5% WR
Edge floor: 10%

Current model (FIP/SIERA · $50 flat)

Bets: 782
Bet win rate: 56.1%
ROI: 16.6%
Underdog filter: ≤2.65 · 12% edge floor
Spring training: signals suppressed

By season BT
VALUE vs STRONG BT
Home vs Away BT
Bet win rate lifted 50.8% → 56.1% (backtest validated). Removing odds >2.65 cut 200 weak bets (28.5% WR). Raising the edge floor from 10% to 12% removed another tier of noise. ROI improved from 10.4% to 16.6% on a per-bet basis over two full seasons.
Spring 2026 pick accuracy tracking at 56.4%, on pace. The last 10 and last 20 games both show 70% pick accuracy, trending above the backtest average. The spring market baseline is 61.8%, so the model trails it by 5.4 points overall but is improving. No bet signals until Opening Day Mar 27, when regular season data takes over from Steamer projections.
Model overconfident above 63% (regression slope 0.448). When the model predicts 70%, actual win rate is ~60%. A soft probability cap at 0.72 addresses the worst extreme. Full calibration detail on the Calibration tab.
Spring training bets are suppressed, and this is intentional. The 11 spring bets placed before this fix ran at −$850 / −111% ROI. Spring STRONG bets on home short-odds favourites went 1/5, because Steamer projections don't account for spring roster decisions. Bet signals fire automatically from March 27.
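The headline backtest figures can be re-derived from the win/loss record and the flat stake. The sketch below assumes ROI means P&L divided by total amount staked, which reproduces the quoted 16.6%; the function name is illustrative and not part of the model's code.

```python
def flat_stake_summary(wins: int, losses: int, unit: float, pnl: float) -> dict:
    """Win rate and ROI for flat staking, where every bet risks the same unit."""
    bets = wins + losses
    staked = bets * unit  # total amount risked across all bets
    return {
        "bets": bets,
        "win_rate": wins / bets,
        "roi": pnl / staked,  # assumption: ROI = P&L / total staked
    }


# Backtest 2024-25 figures from the summary cards.
summary = flat_stake_summary(wins=439, losses=343, unit=50.0, pnl=6495.0)
print(
    f"{summary['bets']} bets, "
    f"win rate {summary['win_rate']:.1%}, "
    f"ROI {summary['roi']:.1%}"
)
```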

Spring 2026 Live Performance LIVE

55 graded games, March 10–21 2026. All spring training — calibration data only. Bet signals suppressed until Opening Day March 27.

Pick Accuracy
56.4%
31 correct / 55 games
Last 10
70%
7/10 · improving
Last 20
70%
14/20 · improving
Market baseline
61.8%
34/55 · spring inflated

Model pick accuracy vs market — spring 2026

Weekly breakdown. Week 11 (Mar 10–15): model 54.2%, market 62.5%. Week 12 (Mar 16–21): model 75.0%, market 68.8%. Strong improvement in the final week as the Steamer projections settle and pitching data improves.

Category | Games | Model correct | Model % | Market % | Gap
All 55 graded (spring) | 55 | 31 | 56.4% | 61.8% | −5.4%
Last 20 games | 20 | 14 | 70.0% | 65.0% | +5.0%
Last 10 games | 10 | 7 | 70.0% | 70.0% | ±0%
When model ≠ market (11 games) | 11 | 4 | 36.4% | | small sample
Spring bets (pre-fix): −$850 on 11 bets, now suppressed. Home STRONG bets at short odds (1.62–1.96) went 1/5 = 20%. The model generated probabilities of 75–85% from Steamer projections; the market priced the same teams at 1.77–1.96. Steamer doesn't account for spring rotation decisions or player rest. Away STRONG bets (backing underdogs) went 5/7 = 71%, which is the model working correctly.
Home win rate is elevated in spring (66%+), and this is normal. Home win rates in spring training consistently run above the regular season 53–54% baseline due to scheduling asymmetries, travel, and team-specific choices about who plays at home. The market accounts for this; the model will as well once live season stats replace Steamer projections.

Model Calibration BACKTEST

Calibration analysis from the 2024–25 regular season backtest (782 bets). This is the validated dataset. Spring 2026 calibration runs on only 55 games and is not shown separately here — sample size is insufficient for meaningful regression.

Model probability vs actual win rate (2024–25 · 782 bets)

Each bubble is a bin of bets grouped by raw model probability. Bubble size = number of bets. Green line = perfect calibration. Amber dashed = regression fit (slope 0.448). The model tracks well between 49–63% but is increasingly overconfident above that — predicting 74% when the actual win rate is ~58%.

Legend: actual win rate · market implied · perfect calibration · regression fit (slope 0.448)
Slope
0.448
ideal = 1.0 · overconfident
Intercept
0.293
baseline lift
R² / p-value
0.010
p = 0.005 · significant
Regression: Actual win% ≈ 0.293 + 0.448 × Raw model probability
When model says 70%: expected actual = 0.293 + 0.448×0.70 = 60.7%
When model says 55%: expected actual = 0.293 + 0.448×0.55 = 53.9%
Correction applied: soft probability cap at 0.72 — prevents predictions exceeding 72%, addressing the worst overconfidence without compressing mid-range signals that are profitable.
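The fitted line can be evaluated directly to see how far a raw probability is pulled toward the middle. This is a minimal sketch using only the published intercept and slope; the `crossover` helper is an addition for illustration, solving p = 0.293 + 0.448 × p for the point where the regression line meets perfect calibration. The soft cap itself is not modelled here because its exact form is not given.

```python
INTERCEPT = 0.293  # from the regression above
SLOPE = 0.448      # from the regression above


def expected_actual(model_prob: float) -> float:
    """Expected actual win rate for a raw model probability, per the fitted line."""
    return INTERCEPT + SLOPE * model_prob


def crossover() -> float:
    """Raw probability at which the fitted line equals perfect calibration.

    Solves p = INTERCEPT + SLOPE * p, giving p = INTERCEPT / (1 - SLOPE).
    """
    return INTERCEPT / (1.0 - SLOPE)


for p in (0.55, 0.70, 0.74):
    print(f"model {p:.0%} -> expected actual {expected_actual(p):.1%}")

print(f"fitted line crosses perfect calibration near {crossover():.1%}")
```

Below the crossover the fitted line sits above the model's own number, and above it the model is overconfident, which matches the direction of the gaps reported for the higher buckets.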

Calibration table

Model prob bucket | Bets | Avg model | Actual WR | Market prob | Gap | Verdict
42–49% | 118 | 46.3% | 43.2% | 40.4% | −3.1% | Watch
49–56% | 209 | 52.6% | 54.5% | 44.5% | +1.9% | Good
56–63% | 208 | 59.5% | 60.1% | 50.8% | +0.6% | Best
63–70% | 103 | 65.8% | 56.3% | 53.0% | −9.5% | Watch
70–78% | 55 | 74.3% | 58.2% | 51.9% | −16.1% | Capped
78–86% | 87 | 82.3% | 66.7% | 53.8% | −15.6% | Capped

Gap = actual WR minus model predicted. Negative = overconfident. The 0.72 soft cap prevents bets in the 70–86% range where overconfidence is worst. The 49–63% range is well calibrated and generates the bulk of P&L.
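The Gap column can be recomputed from the bucket averages as a quick consistency check. The sketch below copies the table values and flags buckets whose gap falls below a cut-off; the −5 point threshold is a hypothetical choice for illustration, not a rule taken from the model.

```python
# (bucket, bets, avg model prob %, actual win rate %), copied from the calibration table.
BUCKETS = [
    ("42-49%", 118, 46.3, 43.2),
    ("49-56%", 209, 52.6, 54.5),
    ("56-63%", 208, 59.5, 60.1),
    ("63-70%", 103, 65.8, 56.3),
    ("70-78%", 55, 74.3, 58.2),
    ("78-86%", 87, 82.3, 66.7),
]

OVERCONFIDENT_BELOW = -5.0  # hypothetical threshold in percentage points


def gap(avg_model_pct: float, actual_pct: float) -> float:
    """Gap = actual win rate minus average model probability (negative = overconfident)."""
    return round(actual_pct - avg_model_pct, 1)


gaps = {name: gap(avg, actual) for name, _bets, avg, actual in BUCKETS}
overconfident = [name for name, g in gaps.items() if g < OVERCONFIDENT_BELOW]

for name, g in gaps.items():
    print(f"{name}: gap {g:+.1f} pts")
print("overconfident buckets:", overconfident)
```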

Edge Analysis BACKTEST

Where does the edge come from in the 2024–25 regular season? The 14–20% zone is the sweet spot. Above 25%, model overconfidence inflates apparent edges that don't fully materialise.

Win rate & ROI by edge bucket (2024–25 · 782 bets)

Edge = (model prob − market prob) / market prob. The 14–20% range delivers both a 60% win rate and roughly 25% ROI (24.9% and 26.4% in its two buckets). The 35–50% bucket also performs well; these are genuine market mispricings at moderate odds rather than overconfident high-probability plays.
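The edge definition combines with the current model's two filters (odds ≤2.65 and a 12% edge floor) into a simple bet screen. This is a sketch under stated assumptions: market probability is taken as the raw implied probability 1/odds with no vig removal, and the edge floor is treated as inclusive. Neither detail is confirmed by the source, and the function names are illustrative.

```python
MAX_ODDS = 2.65    # underdog filter from the current model
EDGE_FLOOR = 0.12  # 12% edge floor from the current model


def implied_prob(decimal_odds: float) -> float:
    """Market probability implied by decimal odds (assumption: vig is not removed)."""
    return 1.0 / decimal_odds


def edge(model_prob: float, market_prob: float) -> float:
    """Relative edge: (model prob - market prob) / market prob."""
    return (model_prob - market_prob) / market_prob


def passes_filters(model_prob: float, decimal_odds: float) -> bool:
    """True when the odds are within the underdog filter and the edge clears the floor."""
    if decimal_odds > MAX_ODDS:
        return False
    return edge(model_prob, implied_prob(decimal_odds)) >= EDGE_FLOOR


# Illustrative inputs, not taken from the backtest.
for prob, odds in [(0.56, 2.11), (0.50, 2.11), (0.45, 2.80)]:
    e = edge(prob, implied_prob(odds))
    print(f"model {prob:.2f} @ {odds:.2f}: edge {e:+.1%}, bet={passes_filters(prob, odds)}")
```

With raw implied probabilities the edge reduces to model prob × odds − 1, so the third example shows a large apparent edge that is still rejected by the odds filter.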

Legend: win rate % · ROI %
Edge range | Bets | Win rate | P&L | ROI | Signal
10–12% (removed) | | ~51% | Low | ~5% | Floor raised
12–14% | 114 | 54.4% | $612 | 10.7% | Marginal
14–17% | 125 | 60.0% | $1,558 | 24.9% | Sweet spot
17–20% | 75 | 60.0% | $989 | 26.4% | Sweet spot
20–25% | 83 | 56.6% | $823 | 19.8% | Good
25–35% | 54 | 53.7% | $349 | 12.9% | Moderate
35–50% | 60 | 64.9% | $738 | 24.6% | Good
50%+ | 76 | 55.3% | $528 | 13.9% | Moderate
By market odds range BT
FIP/SIERA regression flag BT

Odds table: all four buckets are profitable. REGRESS flag (ERA − SIERA ≥ 1.0): 34 bets, 52.9% WR, +$251 — small sample, monitoring in 2026.
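The REGRESS flag is a threshold on the gap between a pitcher's ERA and SIERA. A minimal sketch of that rule follows; the function name and the sample pitcher values are hypothetical, and only the ≥ 1.0 threshold comes from the text.

```python
REGRESS_THRESHOLD = 1.0  # ERA - SIERA >= 1.0, as stated above


def regress_flag(era: float, siera: float) -> bool:
    """True when ERA exceeds SIERA by at least the threshold."""
    return (era - siera) >= REGRESS_THRESHOLD


# Hypothetical pitcher lines for illustration only.
print(regress_flag(era=4.90, siera=3.60))  # gap of about 1.3 runs
print(regress_flag(era=3.80, siera=3.50))  # gap of about 0.3 runs
```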

Parlay Strategy BACKTEST

Parlay P&L is tracked and reported completely separately from straight-bet P&L. These are optional daily suggestions — not placed automatically. Backtest: best +EV combo per leg-count per day, $12.50 stake.

📐
The compounding math. When multiple games each have positive expected value, combining them into a parlay multiplies the edge. Two legs at model prob 0.60 and odds 2.11 each: straight bet EV = $13.34 per bet. Parlay EV = $30.24, which is 113% of the $26.68 combined EV of the two straight bets. Leg criteria: STRONG/VALUE signal, or model×odds > 1.0 (EV+), or model prob ≥ 60% (DUMMY pad).
EV = (p₁ × p₂ × …) × (O₁ × O₂ × … − 1) × stake − (1 − p₁ × p₂ × …) × stake
Parlays compound edge — the EV is real over a large sample at the cost of lower win frequency and higher variance per bet.
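The EV formula above translates directly into code. The sketch below multiplies leg probabilities, which treats the legs as independent (as the formula does), and uses a $50 stake for the two-leg example; that stake is an assumption, since the text does not state it for this example. With the rounded odds of 2.11 it gives about $13.30 and $30.14, slightly below the quoted $13.34 and $30.24, which presumably come from unrounded average odds.

```python
from math import prod


def parlay_ev(probs: list[float], odds: list[float], stake: float) -> float:
    """Expected value of a parlay with decimal odds.

    EV = P(all legs win) * (combined odds - 1) * stake - P(any leg loses) * stake.
    Leg probabilities are multiplied, i.e. legs are assumed independent.
    """
    p_win = prod(probs)
    combined_odds = prod(odds)
    return p_win * (combined_odds - 1.0) * stake - (1.0 - p_win) * stake


STAKE = 50.0  # assumption: the worked example uses the straight-bet unit

straight = parlay_ev([0.60], [2.11], STAKE)            # a single leg is a straight bet
two_leg = parlay_ev([0.60, 0.60], [2.11, 2.11], STAKE)

print(f"straight bet EV: ${straight:.2f}")
print(f"2-leg parlay EV: ${two_leg:.2f}")
print(f"parlay EV vs two straight bets: {two_leg / (2 * straight):.0%}")
```

Passing a one-element list recovers the straight-bet case, so the same function covers every leg count.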
Backtest 2024–25 · $12.50 stake · best combo per day · completely separate from straight bets
2-Leg Parlays
−2.6%
188 picks · 27 wins (14.4%)
P&L: −$60.34
Avg combo odds: 7.72
2024: −$102 · 2025: +$42
3-Leg Parlays
+39.8%
185 picks · 16 wins (8.6%)
P&L: +$920.26
Avg combo odds: 19.22
2024: +$631 · 2025: +$290
4-Leg Parlays
+54.8%
181 picks · 8 wins (4.4%)
P&L: +$1,240
Avg combo odds: 45.48
2024: +$304 · 2025: +$936

ROI comparison — straight bets vs parlay tiers

Win frequency drops sharply with leg count (56% → 14% → 9% → 4%) but expected value per dollar increases due to compounded edge. 2-leg parlays underperformed in the backtest — insufficient margin after vig on both legs. 3-leg and 4-leg both positive.

Legend: win rate % · ROI %
Variance note: A 4-leg parlay wins 4.4% of the time even when +EV. Most individual days will be a loss. The EV is real across a full season sample but you will run losing streaks of 20–30+ consecutive parlays. These are supplementary bets alongside straight bets, not a replacement. The PARLAYS sheet in the daily model output shows the day's highest-EV combinations.
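The streak lengths in the variance note follow from the win rate alone. This sketch assumes each parlay is an independent trial with a constant win probability equal to the backtest rate of 8 wins in 181 picks; real results will not be perfectly independent or stationary, so treat the output as a rough guide.

```python
WIN_RATE_4_LEG = 8 / 181  # backtest: 8 wins in 181 four-leg picks (about 4.4%)


def prob_all_lose(win_rate: float, n: int) -> float:
    """Probability that the next n parlays all lose, assuming independent trials."""
    return (1.0 - win_rate) ** n


def expected_parlays_per_win(win_rate: float) -> float:
    """Average number of parlays placed per winning parlay."""
    return 1.0 / win_rate


for n in (20, 30):
    print(f"P(next {n} four-leg parlays all lose) = {prob_all_lose(WIN_RATE_4_LEG, n):.0%}")
print(f"about one win every {expected_parlays_per_win(WIN_RATE_4_LEG):.1f} parlays")
```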
💡
Live tracking note: Parlays are now stored in the tracker JSON and graded automatically each morning. The STATS sheet shows a running parlay log once results come in. Spring 2026: 20 combinations stored from March 17, 0 graded so far (results pending).