◆ Methodology
A calibrated probability model. A confidence-weighted score. A curated handful of best bets every day. Every assumption is published, every miss is logged. Here’s the shape of how it works.
The engine produces, for every batter in every game, a calibrated probability that they get at least one hit. Not a tip. Not a feeling. A probability that means what it says - “62%” means the batter gets a hit in roughly 62 of every 100 similar setups.
Per-plate-appearance probabilities are combined across the batter’s expected trips to the plate to produce the game-level number you see on the row.
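One common way to combine per-plate-appearance probabilities is sketched below. It assumes plate appearances are independent, which the engine may or may not do; the function name and the numbers are illustrative, not taken from the engine.

```python
from math import prod

def game_hit_probability(pa_probs):
    """P(at least one hit) from per-plate-appearance hit probabilities.

    Assumption: plate appearances are treated as independent events.
    """
    # The batter goes hitless only if every plate appearance fails.
    p_hitless = prod(1.0 - p for p in pa_probs)
    return 1.0 - p_hitless

# Four expected trips at a 0.25 per-PA hit probability (illustrative numbers).
print(round(game_hit_probability([0.25, 0.25, 0.25, 0.25]), 3))  # 0.684
```

Under this assumption, a fifth expected trip at the same 0.25 rate would lift the game-level number further, which is why expected trips to the plate matter so much.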
Predictions run once per day, before lineups are confirmed. Once they’re written, they’re locked - no edits, no swaps, no retroactive changes. Every prediction is stored and every outcome is matched against it. The track record is built from those rows.
The model evaluates each batter against four broad families of MLB-wide signal. The strongest family for any pick becomes the ↑ FORM 9.8 chip on the player’s row.
How well the batter has been hitting the ball, their underlying skill, and how their recent form stacks against their established baseline.
The pitcher tonight, the handedness picture, and what history says about how the batter handles this kind of arm.
Where the game is played and the conditions on the field - stadium quirks, weather at first pitch, and the calling tendencies behind the plate.
Spot in the lineup, expected trips to the plate, and the broader rhythm of the game that shapes how many real chances the batter will see.
A typical Best Bet has signals from several families pulling in the same direction. The total magnitude of those signals is what the “Signal Pulse” bipolar bar chart visualizes when you tap a row.
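As a rough illustration of the chip and the total magnitude, the sketch below picks the family with the largest absolute contribution and sums absolute values across families. Only the FORM name and the 9.8 value come from the page; the other family names, their values, and the use of absolute values are assumptions.

```python
def strongest_family(contributions):
    """Return the (family, value) pair with the largest absolute contribution."""
    return max(contributions.items(), key=lambda kv: abs(kv[1]))

def total_magnitude(contributions):
    """Sum of absolute contributions across all families (an assumed definition)."""
    return sum(abs(v) for v in contributions.values())

# Hypothetical per-family contributions for one batter.
contributions = {
    "form": 9.8,
    "matchup": 3.1,
    "environment": -1.2,
    "opportunity": 2.4,
}

family, value = strongest_family(contributions)
print(family.upper(), value)  # FORM 9.8
```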
The Batter Advantage Score (0-100) is a single number designed for ranking picks; it is not itself a probability. It compresses the underlying signals into one confidence-weighted value.
The very top of the distribution is asymptotic - a truly elite day gets close to 100 without ever reaching it. That’s deliberate: an earlier version clamped at 100 and produced ties at the top on form-heavy slates. The current shape preserves order so the best pick stays distinguishable from the second-best.
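The difference between clamping and an asymptotic curve can be seen in a small sketch. The exponential form and the `scale` value here are assumptions chosen for illustration; the page does not publish the actual curve.

```python
import math

def clamped_score(raw):
    """Earlier-style behavior: anything above 100 collapses to 100."""
    return min(100.0, max(0.0, raw))

def asymptotic_score(raw, scale=40.0):
    """Strictly increasing map that approaches 100 as raw grows.

    Assumption: an exponential saturation curve with a hypothetical scale.
    """
    return 100.0 * (1.0 - math.exp(-max(0.0, raw) / scale))

# Two strong raw values that the clamp cannot tell apart.
print(clamped_score(105.0), clamped_score(120.0))  # 100.0 100.0
print(round(asymptotic_score(105.0), 1), round(asymptotic_score(120.0), 1))  # 92.8 95.0
```

Because the asymptotic map is strictly increasing, the rank order of the raw values survives at the top of the scale.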
The four tiers correspond to where the engine’s calibrated confidence places each pick. The thresholds are set by the daily calibration job - when the model drifts, the tier boundaries drift with it. What’s stable is the rank order, not the cut-points.
From the top tier down, typical volume is: a handful per day; 5-10 per day; a few more than that; the remainder of the slate.
The tier letter shows up on every row, GameCard, and TopSpot. When the calibrated tier isn’t available (legacy data), the UI falls back to a static threshold approximation.
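The fallback described above might look like the following sketch. The tier letters, the cut-points, and the record field names are all hypothetical; the page publishes none of them.

```python
# Hypothetical fallback cut-points and tier letters, highest tier first.
STATIC_CUTS = (("A", 0.70), ("B", 0.62), ("C", 0.55))
LOWEST_TIER = "D"

def static_tier(hit_probability):
    """Approximate a tier from fixed thresholds."""
    for letter, cut in STATIC_CUTS:
        if hit_probability >= cut:
            return letter
    return LOWEST_TIER

def display_tier(record):
    """Prefer the stored calibrated tier; fall back for legacy records."""
    tier = record.get("calibrated_tier")
    return tier if tier is not None else static_tier(record["hit_probability"])

print(display_tier({"calibrated_tier": "B", "hit_probability": 0.71}))  # B
print(display_tier({"hit_probability": 0.71}))  # A
```

The two calls show why the fallback is only an approximation: the same probability can land in a different tier once the calibrated boundaries have drifted away from the static ones.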
Of the roughly 270 batter predictions on a typical day, the engine surfaces a small set - usually five - as Best Bets. These are the day’s most confident calls by the model’s own internal scoring.
Selection happens pre-game and is persisted on every prediction record. Once it’s flagged, it’s flagged forever - we can’t add a Best Bet to a quiet day to look productive, and we can’t remove one that missed. Every Best Bet’s outcome is published on the track record.
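A minimal sketch of pre-game selection, assuming each prediction carries a single ranking score: the top few are flagged and the flag is written onto every record. The field names and the default of five are assumptions for illustration.

```python
def flag_best_bets(predictions, n=5):
    """Return copies of the predictions with an is_best_bet flag on each.

    Assumption: 'score' is the model's internal ranking value.
    """
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    top_ids = {p["id"] for p in ranked[:n]}
    # Every record gets the flag, so the selection is persisted either way.
    return [{**p, "is_best_bet": p["id"] in top_ids} for p in predictions]

# Hypothetical slate of four predictions, flagging the top two.
slate = [
    {"id": "b1", "score": 71.2},
    {"id": "b2", "score": 88.4},
    {"id": "b3", "score": 64.0},
    {"id": "b4", "score": 90.1},
]
flagged = flag_best_bets(slate, n=2)
print([p["id"] for p in flagged if p["is_best_bet"]])  # ['b2', 'b4']
```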
Live as of right now: 21/38 (55.3%) over the last 15 days.
Every hit probability ships with a confidence interval - a window around the central number reflecting how sure the model is. The chip next to a pick condenses that window into a one-word label such as MED.
A “65% hit · MED” pick is the model saying: I think this batter hits 65% of the time, and I’m moderately certain. Wider intervals come from small sample sizes (rookies, injury returns), volatile matchups, or sparse history against this pitcher type.
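One plausible way to turn an interval into a label is by its width, as in the sketch below. Only the MED label appears on the page; the HIGH and LOW labels and both width cut-offs are assumptions.

```python
def confidence_label(low, high, tight=0.10, wide=0.20):
    """Map a confidence interval to a one-word label by its width.

    Assumption: narrower intervals mean higher certainty; cut-offs are hypothetical.
    """
    width = high - low
    if width <= tight:
        return "HIGH"
    if width <= wide:
        return "MED"
    return "LOW"

# A 65% pick with an interval of 58% to 72% (illustrative numbers).
print(confidence_label(0.58, 0.72))  # MED
```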
Raw model output is recalibrated nightly against the games that actually resolved. The new fit is only deployed if it measurably improves accuracy - otherwise yesterday’s calibration stays in place. That gate prevents over-fitting to a noisy day.
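The deployment gate can be sketched as a comparison of two calibration functions on resolved games. The Brier score and the `min_gain` margin are assumptions; the page does not say which accuracy metric or margin the engine uses.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def choose_calibration(current, candidate, raw_probs, outcomes, min_gain=0.001):
    """Keep the current calibration unless the candidate is measurably better.

    Assumption: 'measurably' means a Brier improvement of at least min_gain.
    """
    current_error = brier_score([current(p) for p in raw_probs], outcomes)
    candidate_error = brier_score([candidate(p) for p in raw_probs], outcomes)
    return candidate if current_error - candidate_error >= min_gain else current

# Hypothetical resolved games: raw output of 0.8 that only hit half the time.
raw_probs = [0.8, 0.8, 0.8, 0.8]
outcomes = [1, 0, 1, 0]

def identity(p):
    return p

def shrink_to_half(p):
    return 0.5

chosen = choose_calibration(identity, shrink_to_half, raw_probs, outcomes)
print(chosen.__name__)  # shrink_to_half
```

In practice the comparison would presumably run on games the candidate fit has not seen, since scoring a new fit on its own training data would defeat the purpose of the gate.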
Tier boundaries flex with the prior week’s outcomes. When a tier outperforms its expected hit rate, the threshold loosens to admit more picks; when it underperforms, the threshold tightens. Calibration drifts toward whatever is actually true.
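The loosen-or-tighten rule might be sketched as a small step applied once per review window. The step size and the tolerance band are hypothetical values, not the engine's.

```python
def adjust_threshold(threshold, expected_rate, actual_rate, step=0.005, tolerance=0.02):
    """Nudge a tier's cut-point based on last week's results.

    Assumption: a fixed step, applied only when the gap exceeds a tolerance band.
    """
    gap = actual_rate - expected_rate
    if gap > tolerance:
        return threshold - step  # tier outperformed: loosen to admit more picks
    if gap < -tolerance:
        return threshold + step  # tier underperformed: tighten
    return threshold  # within noise: leave the boundary alone

# Tier expected to hit 66% actually hit 72% (illustrative numbers).
print(round(adjust_threshold(0.65, 0.66, 0.72), 3))  # 0.645
```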
The calibration page shows the actual curves - predicted probability versus actual hit rate. A perfectly calibrated model lies on the diagonal.
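A calibration curve of this kind is usually built by binning predictions and comparing each bin's average predicted probability with its observed hit rate. The sketch below uses equal-width bins; the bin count and the sample data are assumptions.

```python
def reliability_curve(probs, outcomes, n_bins=10):
    """Return (mean predicted, actual hit rate, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # A probability of exactly 1.0 goes into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    curve = []
    for items in bins:
        if items:
            mean_pred = sum(p for p, _ in items) / len(items)
            hit_rate = sum(y for _, y in items) / len(items)
            curve.append((mean_pred, hit_rate, len(items)))
    return curve

# Five hypothetical 65% predictions, three of which resolved as hits.
for mean_pred, hit_rate, count in reliability_curve([0.65] * 5, [1, 1, 1, 0, 0]):
    print(round(mean_pred, 2), round(hit_rate, 2), count)  # 0.65 0.6 5
```

Points where the hit rate sits below the mean prediction indicate overconfidence in that bin; points above it indicate underconfidence.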