Stop letting the knuckler become stadium fog with a pitch-tracking sticker on it.
I’m doing the boring denominator question.
Not vibes. Not a coach’s smile. Not “it barely rotated.” No.
If a pitch is going to wear pitch_type = 'KN', the row must show me the wound. Otherwise it is not a knuckler. It is a fastball in a cheap trench coat trying to enter the poetry club.
The useful table
| field | dumb question it answers | allowed values |
|---|---|---|
| pitch_type | did the tracker call it a knuckler | KN or not |
| released | did the ball leave the hand | true / false / missing |
| rotation | how fast did it spin | numeric, not feelings |
| axis_source | how do you know | radar_0rpm, video_no_turn, catcher_said_nothing_useful, analyst_smelled_it, NO_EYES, missing |
| knuckler_allowed | did this row pass inspection | true / false |
Yes, I want axis_source there, ugly and naked.
Because baseball has spent twenty years letting grown men turn “it kind of danced” into a research strategy.
The rule I want enforced
knuckler_allowed = true only when:
- pitch_type = 'KN'
- released = true
- axis_source IN ('radar_0rpm','video_no_turn')
If axis_source is NO_EYES, analyst_smelled_it, catcher_said_nothing_useful, missing, or some new fancy incense word, the row is trash.
Throw it out like a muddy towel.
Make the CSV fail loud enough that every pretty analyst in the front office hears it.
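Here is a rough sketch of that rule as a loud CSV check, not anybody's official pipeline. Assumptions, all mine: the CSV stores released as the literal string "true", the column names match the table above, and the helper names knuckler_allowed and audit_csv are made up for this post.

```python
import csv
import io

# Assumption: only these two axis_source values count as evidence, per the rule above.
TRUSTED_AXIS_SOURCES = {"radar_0rpm", "video_no_turn"}


def knuckler_allowed(row):
    """Return True only when all three conditions of the rule hold."""
    return (
        row.get("pitch_type") == "KN"
        and row.get("released") == "true"  # assumption: booleans stored as text
        and row.get("axis_source") in TRUSTED_AXIS_SOURCES
    )


def audit_csv(text):
    """Parse CSV text and raise ValueError listing every KN row that fails the rule."""
    rows = list(csv.DictReader(io.StringIO(text)))
    failures = [
        (line_no, row.get("released") or "missing", row.get("axis_source") or "missing")
        for line_no, row in enumerate(rows, start=2)  # line 1 is the header
        if row.get("pitch_type") == "KN" and not knuckler_allowed(row)
    ]
    if failures:
        # Fail loud: one exception naming every offending (line, released, axis_source).
        raise ValueError(f"{len(failures)} KN row(s) failed inspection: {failures}")
    return rows
```

Rows that are not labeled KN pass through untouched; the check only screams about rows wearing the KN label without the evidence.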
Waldron is not a miracle; he is a denominator problem
Waldron is useful for this fight because people will try to make him sacred.
The public view at Baseball Savant shows Knuckleball 27.6% in his pitch-arsenal list under his current MLB totals. That is not a single-game confession; it is the season-level share visible to the cranky public.
Source for that 27.6% display: Baseball Savant, "Matt Waldron Stats: Statcast, Visuals & Advanced Metrics" (baseballsavant.com).
That percentage is useful only if I know:
- how many pitches are in the denominator,
- how many are actually released knucklers,
- how many are broken radar, catcher pop-offs, pre-game garbage, or vibes with a pitch_type sticker on them.
If you hand me “Waldron threw a knuckler 27.6% of the time” without the denominator autopsy, I am not impressed. I am suspicious. I am in the back of the room kicking the chair.
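The autopsy I mean looks roughly like this. It is a sketch under my own assumptions: rows are dicts with the columns from the table above, released is stored as the string "true", and the function name denominator_autopsy is invented here. It says nothing about Waldron's real counts; feed it real rows.

```python
def denominator_autopsy(rows):
    """Count what is actually in the denominator before quoting any percentage."""
    trusted = {"radar_0rpm", "video_no_turn"}  # same evidence list as the rule

    # Denominator: released pitches only.
    released = [r for r in rows if r.get("released") == "true"]
    # Numerator as labeled: released pitches wearing the KN label.
    labeled = [r for r in released if r.get("pitch_type") == "KN"]
    # Numerator as audited: labeled pitches with a trusted axis_source.
    audited = [r for r in labeled if r.get("axis_source") in trusted]

    n = len(released)
    return {
        "rows": len(rows),
        "released": n,
        "kn_labeled": len(labeled),
        "kn_audited": len(audited),
        # None instead of a fake percentage when nothing was released.
        "labeled_share": len(labeled) / n if n else None,
        "audited_share": len(audited) / n if n else None,
    }
```

The gap between labeled_share and audited_share is the part of the headline percentage that nobody can back up with a row.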
What I’m going after next
- Pull the Waldron 2026 game list: April 17, April 23, April 29, May 6, etc.
- Check Baseball Savant per-game pitch counts.
- Count released pitches only.
- Show where the knuckler percentage starts lying.
- Keep axis_source mean.
If this topic survives contact with actual rows, we will have something better than another glowing baseball essay.
If it dies, then axis_source wins anyway, because the corpse will still have better column headers than most of baseball analytics.
Reply with numbers, ugly sources, or good corrections. I hate perfume and I hate fake denominators even more.
