2024–2025 Backtest Results

Grammer/Graham BaseRuns + PythagenPat + FIP/SIERA · $50 flat unit · 2.65 underdog filter · 12% edge floor · Soft prob cap 0.72

Total P&L

+$6,495

2024 + 2025 combined

ROI

16.6%

782 bets · $50 unit

Win Rate

56.1%

439 W / 343 L

Reg. Slope

0.448

raw model · ideal 1.0

Avg Mkt Odds

2.11

post 2.65 filter
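The headline cards can be reproduced from the bet counts and unit size. The sketch below assumes ROI means total P&L divided by total amount staked; the function names are illustrative and not taken from the model's code.

```python
def roi(total_pnl: float, n_bets: int, unit: float) -> float:
    """Return on investment: profit divided by total amount staked."""
    return total_pnl / (n_bets * unit)


def win_rate(wins: int, losses: int) -> float:
    """Share of settled bets that won."""
    return wins / (wins + losses)


# Figures copied from the summary cards above.
print(f"ROI: {roi(6495, 782, 50):.1%}")        # ROI: 16.6%
print(f"Win rate: {win_rate(439, 343):.1%}")   # Win rate: 56.1%
```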

Previous model (ERA as xFIP · $100 flat)

Bets: 1,001
Win rate: 50.8%
ROI: 10.4%
Odds 2.7+ bets: 200 @ 28.5% WR
Edge floor: 10%

Current model (FIP/SIERA · $50 flat)

Bets: 782
Win rate: 56.1%
ROI: 16.6%
Underdog filter: ≤2.65
Edge floor: 12% · Soft cap: 0.72
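One way to read the current model's three settings together is as a single pass/fail check per game. This is a hedged sketch only: it assumes the soft cap is applied to the model probability before the edge is computed, and that market probability is simply 1 / decimal odds with no vig removal. Neither detail is confirmed by the report, and `FilterConfig` and `passes_filters` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterConfig:
    max_odds: float = 2.65    # underdog filter: skip anything priced longer
    edge_floor: float = 0.12  # minimum relative edge over the market
    prob_cap: float = 0.72    # soft cap on model probability


def passes_filters(model_prob: float, market_odds: float,
                   cfg: FilterConfig = FilterConfig()) -> bool:
    """True if a game clears the odds filter and the edge floor."""
    capped = min(model_prob, cfg.prob_cap)   # assumption: cap before edge
    market_prob = 1.0 / market_odds          # assumption: no vig removal
    edge = (capped - market_prob) / market_prob
    return market_odds <= cfg.max_odds and edge >= cfg.edge_floor


print(passes_filters(0.60, 2.11))  # True: edge is about 26.6%
print(passes_filters(0.45, 2.80))  # False: odds above 2.65
print(passes_filters(0.52, 2.00))  # False: edge is only 4%
```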

Charts: By season · VALUE vs STRONG · Home vs Away

✓

Win rate lifted from 50.8% → 56.1%

The 2.65 underdog filter removed 200 low-quality bets (28.5% WR). Fewer bets, much higher quality. ROI improved from 10.4% to 16.6% on a per-bet basis.

✓

STRONG now outperforms VALUE (17.6% vs 16.2% ROI)

With underdog STRONG bets removed, the signal correctly identifies higher-conviction picks. The 25% edge threshold is now justified by the data.

⚠

Model overconfident above 63% probability

Regression slope 0.448 (ideal 1.0) confirms the model is too confident at high probabilities. A soft cap of 0.72 prevents the most extreme predictions. This is an analytical observation; the raw signal (56.1% WR) remains profitable.

⚠

Edge sweet spot is 14–20%

The 14–20% edge range returns 25–26% ROI. Above 25%, overconfidence inflates apparent edges. The edge floor was raised to 12% to remove the weakest VALUE bets; the 10–12% bucket was returning only 5% ROI.

Model Calibration

How closely do model predictions match actual win rates? The closer the blue bubbles to the green line, the better calibrated the model is.

Model probability vs actual win rate (2024–2025 · 782 bets)

Each bubble is a bucket of bets grouped by raw model win probability. Bubble size = number of bets. The model tracks well in the 49–63% range but is significantly overconfident above 63% — predicted 74% actually wins only 58%.

Legend: actual win rate · market implied prob · perfect calibration · regression fit (slope 0.448)

Slope

0.448

ideal = 1.0 · overconfident

Intercept

0.293

baseline lift from model

R² / p-value

0.010

p = 0.005 · significant

Regression: Actual win% ≈ 0.293 + 0.448 × Raw model probability

When model says 70% → expected actual win rate: 0.293 + 0.448×0.70 ≈ 60.7%

When model says 55% → expected actual win rate: 0.293 + 0.448×0.55 = 53.9%

Correction applied: soft cap at 0.72 prevents predictions above 72%, addressing the worst overconfidence without shrinking profitable mid-range signals.
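The regression line and the cap can be written as two small functions. This is a sketch under assumptions: the coefficients are the rounded values shown above, and the "soft cap" is modelled here as a plain clamp at 0.72, which may be simpler than what the model actually does.

```python
SLOPE = 0.448      # rounded regression slope from the card above
INTERCEPT = 0.293  # rounded regression intercept from the card above
SOFT_CAP = 0.72


def expected_actual_win_rate(raw_prob: float) -> float:
    """Win rate the fitted line predicts for a given raw model probability."""
    return INTERCEPT + SLOPE * raw_prob


def apply_soft_cap(raw_prob: float, cap: float = SOFT_CAP) -> float:
    """Clamp the model probability at the cap (assumed behaviour)."""
    return min(raw_prob, cap)


print(f"{expected_actual_win_rate(0.55):.1%}")  # 53.9%
print(apply_soft_cap(0.80))                     # 0.72
print(apply_soft_cap(0.60))                     # 0.6
```

Note that the cap only limits the input probability; it does not rescale mid-range predictions the way a full recalibration using the regression line would.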

Calibration table

Model prob bucket	Bets	Model prob	Actual WR	Market prob	Gap	Verdict
42–49%	118	46.3%	43.2%	40.4%	−3.1%	Watch
49–56%	209	52.6%	54.5%	44.5%	+1.9%	Good
56–63%	208	59.5%	60.1%	50.8%	+0.6%	Best
63–70%	103	65.8%	56.3%	53.0%	−9.5%	Watch
70–78%	55	74.3%	58.2%	51.9%	−16.1%	Overconfident
78–86%	87	82.3%	66.7%	53.8%	−15.6%	Overconfident

Gap = actual WR minus model predicted. Negative = model overestimates. Overconfidence concentrated above 63%. Soft cap addresses the 70–86% range.
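The Gap column and a rough calibration slope can be recomputed from the table itself. The sketch below uses a bet-weighted least-squares fit on the six bucket averages; because the reported slope of 0.448 was presumably fitted on individual bets, a bucket-level fit should land nearby but need not match it exactly.

```python
# (bets, model prob %, actual win rate %) copied from the calibration table.
buckets = [
    (118, 46.3, 43.2),
    (209, 52.6, 54.5),
    (208, 59.5, 60.1),
    (103, 65.8, 56.3),
    (55, 74.3, 58.2),
    (87, 82.3, 66.7),
]

# Gap = actual win rate minus model probability, in percentage points.
gaps = [round(actual - model, 1) for _, model, actual in buckets]

# Bet-weighted least squares: each bucket counts in proportion to its bets.
total = sum(n for n, _, _ in buckets)
mean_x = sum(n * x for n, x, _ in buckets) / total
mean_y = sum(n * y for n, _, y in buckets) / total
sxy = sum(n * (x - mean_x) * (y - mean_y) for n, x, y in buckets)
sxx = sum(n * (x - mean_x) ** 2 for n, x, _ in buckets)
slope = sxy / sxx
intercept = mean_y - slope * mean_x  # in percentage points

print(gaps)
print(f"slope={slope:.3f}, intercept={intercept:.1f} pts")
```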

Edge Analysis

Where does the model's edge come from? The 14–20% zone is the sweet spot; 25%+ is where model overconfidence inflates apparent edges.

Win rate & ROI by model edge bucket

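Edge = (model prob − market prob) / market prob. The 14–20% range combines a 60% win rate with the best ROI (25–26%). Returns dip in the 25–35% bucket (12.9% ROI), where overconfidence inflates apparent edges, then recover at 35–50% (64.9% win rate, 24.6% ROI), which suggests genuine market mispricings at moderate odds rather than overconfidence alone. The 50%+ bucket falls back to 13.9% ROI.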

Legend: win rate % · ROI %

Edge range	Bets	Win rate	P&L	ROI	Signal
10–12% (removed)	—	~51%	Low	~5%	Floor raised to 12%
12–14%	114	54.4%	$612	10.7%	Marginal
14–17%	125	60.0%	$1,558	24.9%	Sweet spot
17–20%	75	60.0%	$989	26.4%	Sweet spot
20–25%	83	56.6%	$823	19.8%	Good
25–35%	54	53.7%	$349	12.9%	Moderate
35–50%	60	64.9%	$738	24.6%	Good
50%+	76	55.3%	$528	13.9%	Moderate
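The edge definition and the bucket boundaries in the table can be sketched as follows. Two assumptions are made here: market probability is taken as 1 / decimal odds, and each bucket includes its lower bound. The names `edge` and `edge_bucket` are illustrative.

```python
import bisect

# Lower bounds of the edge buckets used in the table above.
EDGE_BOUNDS = [0.12, 0.14, 0.17, 0.20, 0.25, 0.35, 0.50]
EDGE_LABELS = ["below floor", "12–14%", "14–17%", "17–20%",
               "20–25%", "25–35%", "35–50%", "50%+"]


def edge(model_prob: float, market_prob: float) -> float:
    """Relative edge: (model prob - market prob) / market prob."""
    return (model_prob - market_prob) / market_prob


def edge_bucket(e: float) -> str:
    """Map an edge to its table bucket (lower bound inclusive, assumed)."""
    return EDGE_LABELS[bisect.bisect_right(EDGE_BOUNDS, e)]


e = edge(0.60, 1 / 2.11)          # model 60% against decimal odds 2.11
print(f"{e:.1%}", edge_bucket(e))  # 26.6% 25–35%
print(edge_bucket(0.15))           # 14–17%
print(edge_bucket(0.08))           # below floor
```

When market probability is 1 / odds, the edge reduces to model prob × odds − 1, which is the expected profit per unit staked.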

Performance Splits

Detailed breakdown by odds range, season, and home/away. Parlay P&L is completely separate — see Parlay Strategy tab.

By market odds range

Season comparison

Odds range	Bets	Win rate	Model prob	Market prob	P&L	Gap
1.4–1.7	25	72.0%	78.1%	62.6%	+$184	−6.1%
1.7–2.0	334	62.3%	66.0%	53.3%	+$2,808	−3.7%
2.0–2.3	220	52.3%	57.3%	46.1%	+$1,492	−5.0%
2.3–2.65	203	48.3%	50.9%	40.2%	+$2,012	−2.6%

All four odds buckets are profitable. Gap = actual WR minus model predicted. Negative = slight overconfidence, still profitable due to market mispricing.

FIP/SIERA regression flag performance

Flag	Bets	Win rate	P&L	Interpretation
REGRESS	34	52.9%	+$251	ERA − SIERA ≥ 1.0 · pitcher overperforming, expect regression
No flag	748	56.4%	+$6,244	ERA and FIP/SIERA within 1.0 of each other — normal range

REGRESS flag now fires with real FIP/SIERA data. Small sample (34 bets) — continue monitoring over the 2026 season.
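The flag rule in the table is simple enough to state as code. This sketch implements only the condition shown (ERA − SIERA ≥ 1.0); how the model treats the opposite case, where SIERA exceeds ERA by a run or more, is not stated in the report, so it is left as "No flag" here. The function name is hypothetical.

```python
def regress_flag(era: float, siera: float, threshold: float = 1.0) -> str:
    """Return 'REGRESS' when ERA exceeds SIERA by at least the threshold."""
    return "REGRESS" if era - siera >= threshold else "No flag"


print(regress_flag(4.80, 3.60))  # REGRESS (gap is about 1.2)
print(regress_flag(3.90, 3.50))  # No flag (gap is about 0.4)
```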

Parlay Strategy

Parlay results are tracked and reported completely separately from straight-bet P&L above. These are optional supplementary bets — not placed every day.

📐

The math: compounding +EV bets

When multiple games each have positive expected value, combining them into a parlay compounds that edge. The combined EV exceeds the sum of individual straight bets, providing better theoretical return at the cost of higher variance and lower win frequency. Parlays also vary the profile of bets placed, which is a practical benefit when managing betting accounts.

EV = (p₁ × p₂ × ...) × (O₁ × O₂ × ... − 1) × stake − (1 − p₁ × p₂ × ...) × stake

At avg model prob 0.600 and avg odds 2.11: EV per 2-leg parlay = $30.24 vs $26.68 for two straight bets → parlay EV is about 113% of the combined straight-bet EV (30.24 / 26.68)
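The EV formula above translates directly into code. The sketch assumes decimal odds and statistically independent legs (so win probabilities multiply); `parlay_ev` and `straight_ev` are illustrative names. Because the quoted averages are rounded, the output need not reproduce the dollar figures above exactly.

```python
from math import prod


def parlay_ev(probs: list[float], odds: list[float], stake: float) -> float:
    """EV of a parlay: win the combined payout or lose the stake.

    Assumes decimal odds and independent legs.
    """
    p_all = prod(probs)          # probability that every leg wins
    combined_odds = prod(odds)   # decimal odds of the whole parlay
    return p_all * (combined_odds - 1) * stake - (1 - p_all) * stake


def straight_ev(prob: float, odd: float, stake: float) -> float:
    """A straight bet is a one-leg parlay."""
    return parlay_ev([prob], [odd], stake)


# EV per $1 staked at the rounded averages quoted above.
print(round(straight_ev(0.60, 2.11, 1.0), 4))             # 0.266
print(round(parlay_ev([0.60] * 2, [2.11] * 2, 1.0), 4))   # 0.6028
```

Per unit staked, the formula simplifies to the product of each leg's (probability × odds) minus 1, which is why edges compound as legs are added.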

Backtest results by leg count (2024–2025 · $12.50 stake · best combo per day · SEPARATE from straight bets)

2-leg parlays

−2.6%

188 combos · 27 wins (14.4%)
P&L: −$60.34
Avg combo odds: 7.72
2024: −$102 · 2025: +$42

3-leg parlays

+39.8%

185 combos · 16 wins (8.6%)
P&L: +$920.26
Avg combo odds: 19.22
2024: +$631 · 2025: +$290

4-leg parlays

+54.8%

181 combos · 8 wins (4.4%)
P&L: +$1,240
Avg combo odds: 45.48
2024: +$304 · 2025: +$936

ROI comparison — straight bets vs parlay tiers

Note: straight bets win ~56% of the time; 3-leg parlays win ~8.6%; 4-leg parlays win ~4.4%. Higher ROI reflects compounded edge at the cost of infrequent wins. Only the highest-EV combo per leg-count per day is counted.

Legend: win rate % · ROI %

Parlay leg qualification (three tiers)

✓

How legs are selected

STRONG / VALUE: has a bet signal (edge ≥ 12%).
EV+: model_prob × market_odds > 1.0 (positive EV, no signal required).
DUMMY: model_prob ≥ 60% (high-confidence pad).

Each leg comes from a different game. Stake: $12.50 per combination.
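The three tiers can be sketched as a single classifier. Several details are assumptions rather than facts from the report: the tiers are checked in the order STRONG, VALUE, EV+, DUMMY; STRONG is taken to start at the 25% edge threshold mentioned in the findings; and edge is computed as model_prob × market_odds − 1. Any additional conditions the real bet signal applies (such as the underdog filter) are ignored. `classify_leg` is a hypothetical name.

```python
from typing import Optional


def classify_leg(model_prob: float, market_odds: float,
                 strong_edge: float = 0.25,   # assumed STRONG threshold
                 value_edge: float = 0.12,    # edge floor for a bet signal
                 dummy_prob: float = 0.60) -> Optional[str]:
    """Return the parlay tier for a leg, or None if it does not qualify."""
    ev_multiple = model_prob * market_odds  # > 1.0 means positive EV
    edge = ev_multiple - 1.0
    if edge >= strong_edge:
        return "STRONG"
    if edge >= value_edge:
        return "VALUE"
    if ev_multiple > 1.0:
        return "EV+"
    if model_prob >= dummy_prob:
        return "DUMMY"
    return None


print(classify_leg(0.60, 2.11))  # STRONG
print(classify_leg(0.55, 2.10))  # VALUE
print(classify_leg(0.52, 2.00))  # EV+
print(classify_leg(0.62, 1.55))  # DUMMY
print(classify_leg(0.50, 1.90))  # None
```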

⚠

Variance warning

A 4-leg parlay wins only ~4.4% of the time even when +EV. The EV is real across a large sample, but individual sessions will show mostly losses. Treat parlays as supplementary, not a replacement for straight bets. The daily model output (PARLAYS sheet) shows the highest-EV combinations available that day; they are suggestions only.