Grammer/Graham BaseRuns + PythagenPat + FIP/SIERA

Two data sources: BACKTEST 2024–25 regular season, 782 bets — the validated performance baseline. LIVE Spring 2026, 55 graded games — calibration data, no bet signals until Opening Day Mar 27.

BACKTEST 2024–2025 regular season · 782 bets · $50 flat · validated

P&L

+$6,495

2024 + 2025

ROI

16.6%

782 bets · $50 unit

Bet Win Rate

56.1%

439 W / 343 L

Reg. Slope

0.448

ideal 1.0 · overconfident

Avg Odds

2.11

post underdog filter

LIVE Spring training 2026 · 55 graded · calibration only · bet signals suppressed

Pick Accuracy

56.4%

31/55 graded games

Last 10 Games

70%

7/10 correct picks

Last 20 Games

70%

14/20 correct picks

vs Market

61.8%

market baseline

Previous model (ERA as xFIP · $100 flat)

Bets: 1,001
Bet win rate: 50.8%
ROI: 10.4%
Odds 2.7+ included: 200 bets @ 28.5% WR
Edge floor: 10%

Current model (FIP/SIERA · $50 flat)

Bets: 782
Bet win rate: 56.1%
ROI: 16.6%
Underdog filter: ≤2.65 · 12% edge floor
Spring training: signals suppressed

Charts: By season · VALUE vs STRONG · Home vs Away (all backtest)

✓

Bet win rate lifted 50.8% → 56.1% (backtest validated)

Removing odds >2.65 cut 200 weak bets (28.5% WR). Raising the edge floor to 12% removed another tier of noise. ROI improved from 10.4% to 16.6% on a per-bet basis over two full seasons.

✓

Spring 2026 pick accuracy tracking at 56.4%: on pace

The last 10 and last 20 games both show 70% pick accuracy, trending above the backtest average. The market baseline in spring is 61.8%, so the model trails by 5.4 points overall but has matched or beaten the market over the last 20 games. No bet signals until Opening Day, Mar 27, when regular season data takes over from Steamer projections.

⚠

Model overconfident above 63% (regression slope 0.448)

When the model predicts 70%, actual win rate is ~60%. A soft probability cap at 0.72 addresses the worst extreme. Full calibration detail on the Calibration tab.

⚠

Spring training bets are suppressed: this is intentional

The 11 spring bets placed before this fix ran at −$850 / −111% ROI. Spring STRONG bets on home short-odds favourites went 1/5, because Steamer projections don't account for spring roster decisions. Bet signals fire automatically from March 27.

Spring 2026 Live Performance LIVE

55 graded games, March 10–21 2026. All spring training — calibration data only. Bet signals suppressed until Opening Day March 27.

Pick Accuracy

56.4%

31 correct / 55 games

Last 10

70%

7/10 · improving

Last 20

70%

14/20 · improving

Market baseline

61.8%

34/55 · spring inflated

Model pick accuracy vs market — spring 2026

Weekly breakdown. Week 11 (Mar 10–15): model 54.2%, market 62.5%. Week 12 (Mar 16–21): model 75.0%, market 68.8%. Strong improvement in the final week as the Steamer projections settle and pitching data improves.

Category	Games	Model correct	Model %	Market %	Gap
All 55 graded (spring)	55	31	56.4%	61.8%	−5.4%
Last 20 games	20	14	70.0%	65.0%	+5.0%
Last 10 games	10	7	70.0%	70.0%	±0%
When model ≠ market (11 games)	11	4	36.4%	—	small sample

✗

Spring bets (pre-fix): −$850 on 11 bets, now suppressed

Home STRONG bets at short odds (1.62–1.96) went 1/5 = 20%. The model generated probabilities of 75–85% from Steamer projections; the market priced the same teams at 1.77–1.96. Steamer doesn't account for spring rotation decisions or player rest. Away STRONG bets (backing underdogs) went 5/7 = 71%, which is the model working correctly.

ℹ

Home win rate is elevated in spring (66%+): this is normal

Home win rates in spring training consistently run above the regular season 53–54% baseline due to scheduling asymmetries, travel, and team-specific choices about who plays at home. The market accounts for this; the model will as well once live season stats replace Steamer projections.

Model Calibration BACKTEST

Calibration analysis from the 2024–25 regular season backtest (782 bets). This is the validated dataset. Spring 2026 calibration runs on only 55 games and is not shown separately here — sample size is insufficient for meaningful regression.

Model probability vs actual win rate (2024–25 · 782 bets)

Each bubble is a bin of bets grouped by raw model probability. Bubble size = number of bets. Green line = perfect calibration. Amber dashed = regression fit (slope 0.448). The model tracks well between 49–63% but is increasingly overconfident above that — predicting 74% when the actual win rate is ~58%.

Chart legend: actual win rate · market implied · perfect calibration · regression fit (slope 0.448)

Slope

0.448

ideal = 1.0 · overconfident

Intercept

0.293

baseline lift

R² / p-value

0.010

p = 0.005 · significant
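An R² of 0.010 looks negligible, yet the slope is still flagged as significant. A rough check of why, assuming the regression has one observation per backtest bet (n = 782, an assumption; the page does not state n) and using a normal approximation to the t distribution:

```python
import math

def slope_p_value(r_squared: float, n: int) -> float:
    """Two-sided p-value for a simple-regression slope, from R^2 and n.

    Uses t = sqrt(R^2 / (1 - R^2) * (n - 2)) with a normal approximation
    to the t distribution, which is reasonable when n is in the hundreds.
    """
    t_stat = math.sqrt(r_squared / (1.0 - r_squared) * (n - 2))
    return math.erfc(t_stat / math.sqrt(2.0))

# Assumption: n = 782 (one row per backtest bet).
p = slope_p_value(0.010, 782)
print(f"p = {p:.4f}")
```

Under those assumptions the result lands close to the p = 0.005 shown on the card: with several hundred bets, even a weak relationship is distinguishable from zero.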

Regression: Actual win% ≈ 0.293 + 0.448 × Raw model probability

When model says 70%: expected actual = 0.293 + 0.448×0.70 ≈ 60.7%

When model says 55%: expected actual = 0.293 + 0.448×0.55 = 53.9%

Correction applied: soft probability cap at 0.72 — prevents predictions exceeding 72%, addressing the worst overconfidence without compressing mid-range signals that are profitable.
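A minimal sketch of the two pieces above. `expected_actual` applies the fitted line directly. The exact form of the soft cap is not given on this page, so `soft_cap` is only one plausible shape: it leaves probabilities below a knee untouched and bends smoothly toward 0.72 above it. The knee value (0.63) and the exponential form are assumptions for illustration, not the model's actual code.

```python
import math

INTERCEPT = 0.293  # from the regression fit above
SLOPE = 0.448      # from the regression fit above
CAP = 0.72         # soft cap quoted in the text
KNEE = 0.63        # hypothetical: where compression starts (not from the source)

def expected_actual(raw_prob: float) -> float:
    """Expected actual win rate implied by the fitted calibration line."""
    return INTERCEPT + SLOPE * raw_prob

def soft_cap(raw_prob: float, knee: float = KNEE, cap: float = CAP) -> float:
    """Leave probabilities at or below the knee unchanged; above it, bend
    smoothly toward the cap so the output always stays below it."""
    if raw_prob <= knee:
        return raw_prob
    span = cap - knee
    return knee + span * (1.0 - math.exp(-(raw_prob - knee) / span))

for p in (0.55, 0.70, 0.85):
    print(f"raw {p:.2f} -> expected actual {expected_actual(p):.3f}, capped {soft_cap(p):.3f}")
```

With this shape, mid-range probabilities pass through unchanged, which matches the stated goal of not compressing the profitable 49–63% signals.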

Calibration table

Model prob bucket	Bets	Avg model	Actual WR	Market prob	Gap	Verdict
42–49%	118	46.3%	43.2%	40.4%	−3.1%	Watch
49–56%	209	52.6%	54.5%	44.5%	+1.9%	Good
56–63%	208	59.5%	60.1%	50.8%	+0.6%	Best
63–70%	103	65.8%	56.3%	53.0%	−9.5%	Watch
70–78%	55	74.3%	58.2%	51.9%	−16.1%	Capped
78–86%	87	82.3%	66.7%	53.8%	−15.6%	Capped

Gap = actual WR minus average model probability. Negative = overconfident. The 0.72 soft cap limits predictions in the 70–86% range, where overconfidence is worst. The 49–63% range is well calibrated and generates the bulk of P&L.
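For reference, a table like the one above can be built from per-bet records roughly as follows. This is a sketch: the `(model_prob, won)` record layout and the function name are assumptions, and the sample bets are made up rather than taken from the backtest.

```python
from collections import defaultdict

# Bucket edges taken from the calibration table (as probabilities).
BUCKET_EDGES = [0.42, 0.49, 0.56, 0.63, 0.70, 0.78, 0.86]

def calibration_rows(bets):
    """bets: iterable of (model_prob, won) pairs, with won as True/False.

    Returns one row per non-empty bucket:
    (low, high, n_bets, avg_model_prob, actual_win_rate, gap).
    """
    grouped = defaultdict(list)
    for prob, won in bets:
        for low, high in zip(BUCKET_EDGES, BUCKET_EDGES[1:]):
            if low <= prob < high:
                grouped[(low, high)].append((prob, won))
                break
    rows = []
    for (low, high), items in sorted(grouped.items()):
        n = len(items)
        avg_model = sum(p for p, _ in items) / n
        actual = sum(1 for _, w in items if w) / n
        # Gap follows the table's definition: actual minus model.
        rows.append((low, high, n, avg_model, actual, actual - avg_model))
    return rows

# Made-up sample bets, purely for illustration (not backtest data).
sample = [(0.58, True), (0.60, True), (0.61, False), (0.74, True), (0.75, False)]
for low, high, n, avg_model, actual, gap in calibration_rows(sample):
    print(f"{low:.0%}-{high:.0%}: n={n} model={avg_model:.1%} actual={actual:.1%} gap={gap:+.1%}")
```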

Edge Analysis BACKTEST

Where does the edge come from in the 2024–25 regular season? The 14–20% zone is the sweet spot. Above 25%, model overconfidence inflates apparent edges that don't fully materialise.

Win rate & ROI by edge bucket (2024–25 · 782 bets)

Edge = (model prob − market prob) / market prob. The 14–20% range delivers a 60% win rate and roughly 25% ROI. The 35–50% bucket also performs well; these are genuine market mispricings at moderate odds rather than overconfident high-probability plays.
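A short sketch of the edge formula combined with the current model's two filters (odds ≤ 2.65, edge ≥ 12%). It assumes the market probability is the raw 1/odds with no vig removal, which this page does not confirm; the function names are illustrative.

```python
ODDS_CEILING = 2.65  # underdog filter from the current-model summary
EDGE_FLOOR = 0.12    # 12% edge floor from the current-model summary

def market_prob(decimal_odds: float) -> float:
    """Implied probability from decimal odds.
    Assumption: raw 1/odds with no vig removal; the source does not say."""
    return 1.0 / decimal_odds

def edge(model_prob: float, decimal_odds: float) -> float:
    """Relative edge: (model prob - market prob) / market prob."""
    mkt = market_prob(decimal_odds)
    return (model_prob - mkt) / mkt

def passes_filters(model_prob: float, decimal_odds: float) -> bool:
    """True when a bet clears both the odds ceiling and the edge floor."""
    return decimal_odds <= ODDS_CEILING and edge(model_prob, decimal_odds) >= EDGE_FLOOR

print(round(edge(0.60, 2.11), 3))
print(passes_filters(0.60, 2.11), passes_filters(0.40, 3.00))
```

Under the 1/odds assumption the edge reduces to model prob × odds − 1, which is the expected profit per unit staked.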

Edge range	Bets	Win rate	P&L	ROI	Signal
10–12% (removed)	—	~51%	Low	~5%	Floor raised
12–14%	114	54.4%	$612	10.7%	Marginal
14–17%	125	60.0%	$1,558	24.9%	Sweet spot
17–20%	75	60.0%	$989	26.4%	Sweet spot
20–25%	83	56.6%	$823	19.8%	Good
25–35%	54	53.7%	$349	12.9%	Moderate
35–50%	60	64.9%	$738	24.6%	Good
50%+	76	55.3%	$528	13.9%	Moderate

Charts: By market odds range · FIP/SIERA regression flag (both backtest)

Odds table: all four buckets are profitable. REGRESS flag (ERA − SIERA ≥ 1.0): 34 bets, 52.9% WR, +$251 — small sample, monitoring in 2026.

Parlay Strategy BACKTEST

Parlay P&L is tracked and reported completely separately from straight-bet P&L. These are optional daily suggestions — not placed automatically. Backtest: best +EV combo per leg-count per day, $12.50 stake.

📐

The compounding math

When multiple games each have positive expected value, combining them into a parlay multiplies the edge. Two legs at model prob 0.60 and odds 2.11 each: straight bet EV = $13.34 per bet. Parlay EV = $30.24, or 113% EV efficiency per dollar staked. Leg criteria: STRONG/VALUE signal, or model×odds > 1.0 (EV+), or model prob ≥ 60% (DUMMY pad).

EV = (p₁ × p₂ × …) × (O₁ × O₂ × … − 1) × stake − (1 − p₁ × p₂ × …) × stake

Parlays compound edge — the EV is real over a large sample at the cost of lower win frequency and higher variance per bet.
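The EV formula can be checked with a few lines of Python. This sketch assumes independent legs and a $50 stake, which is the stake that roughly reproduces the worked figures; with odds rounded to 2.11 the results come out a few cents below the quoted $13.34 and $30.24, presumably because those use unrounded average odds (an assumption).

```python
from math import prod

def parlay_ev(probs, odds, stake):
    """Expected profit of a parlay, assuming independent legs.

    probs: model win probability per leg; odds: decimal odds per leg.
    A single-element list gives the straight-bet EV.
    """
    p_all = prod(probs)
    combined_odds = prod(odds)
    return p_all * (combined_odds - 1.0) * stake - (1.0 - p_all) * stake

straight = parlay_ev([0.60], [2.11], 50.0)
two_leg = parlay_ev([0.60, 0.60], [2.11, 2.11], 50.0)
print(f"straight EV: ${straight:.2f}")
print(f"2-leg parlay EV: ${two_leg:.2f}")
print(f"parlay EV / two straight bets: {two_leg / (2 * straight):.0%}")
```

The last line compares one parlay against the combined EV of the two straight bets, which appears to be how the 113% figure is derived.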

Backtest 2024–25 · $12.50 stake · best combo per day · completely separate from straight bets

2-Leg Parlays

−2.6%

188 picks · 27 wins (14.4%)
P&L: −$60.34
Avg combo odds: 7.72
2024: −$102 · 2025: +$42

3-Leg Parlays

+39.8%

185 picks · 16 wins (8.6%)
P&L: +$920.26
Avg combo odds: 19.22
2024: +$631 · 2025: +$290

4-Leg Parlays

+54.8%

181 picks · 8 wins (4.4%)
P&L: +$1,240
Avg combo odds: 45.48
2024: +$304 · 2025: +$936

ROI comparison — straight bets vs parlay tiers

Win frequency drops sharply with leg count (56% → 14% → 9% → 4%) but expected value per dollar increases due to compounded edge. 2-leg parlays underperformed in the backtest — insufficient margin after vig on both legs. 3-leg and 4-leg both positive.
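The ROI figures for each tier follow from profit divided by total stake under flat staking. A quick check using the counts, P&L, and stakes reported on this page (the function and dictionary names are illustrative):

```python
def flat_stake_roi(pnl: float, n_bets: int, stake: float) -> float:
    """ROI as profit divided by total amount staked."""
    return pnl / (n_bets * stake)

# (P&L, number of bets, flat stake) as reported in the backtest cards.
tiers = {
    "straight": (6495.00, 782, 50.00),
    "2-leg": (-60.34, 188, 12.50),
    "3-leg": (920.26, 185, 12.50),
    "4-leg": (1240.00, 181, 12.50),
}
for name, (pnl, n_bets, stake) in tiers.items():
    print(f"{name}: {flat_stake_roi(pnl, n_bets, stake):+.1%}")
```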

⚠

Variance note

A 4-leg parlay wins 4.4% of the time even when +EV. Most individual days will be a loss. The EV is real across a full season sample but you will run losing streaks of 20–30+ consecutive parlays. These are supplementary bets alongside straight bets, not a replacement. The PARLAYS sheet in the daily model output shows the day's highest-EV combinations.
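To put the streak warning in numbers, a small sketch that treats each parlay as an independent bet with the 4.4% backtest win rate (independence is a simplifying assumption):

```python
def prob_all_lose(win_rate: float, n_bets: int) -> float:
    """Probability that n independent bets all lose."""
    return (1.0 - win_rate) ** n_bets

def expected_bets_per_win(win_rate: float) -> float:
    """Average number of bets per win, assuming independence."""
    return 1.0 / win_rate

WIN_RATE = 0.044  # 4-leg backtest win rate quoted above
for n in (20, 30):
    print(f"P({n} straight losses) = {prob_all_lose(WIN_RATE, n):.1%}")
print(f"bets per win on average: {expected_bets_per_win(WIN_RATE):.1f}")
```

Under these assumptions, any given run of 20 four-leg parlays has roughly a 40% chance of containing no winner at all, so streaks of this length should be expected rather than read as a sign the model has broken.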

💡

Live tracking note

Parlays are now stored in the tracker JSON and graded automatically each morning. The STATS sheet shows a running parlay log once results come in. Spring 2026: 20 combinations stored from March 17, 0 graded so far (results pending).