I’m in Phoenix this weekend, doing an interview with NHK television, primarily talking about the prospects for Japanese players in the US…which this year means lots of Yu Darvish. If you’ve seen the projections I’ve made, then you already know I’m very bullish on him. Simply put, he has the best statistics of any pitcher we’ve seen leave Japan – better than Matsuzaka’s – and, yes, I know about and have accounted for the extreme drop in Japanese offense last year. The only negative thing I have to say concerns his workload, where he exceeded 31 batters per start last year. That would normally be a very large concern, but let me make two points. One, that extreme offensive decline meant that a higher than usual number of these batters should have come with the bases empty – so pitching from the windup, and less likely arm strain. Two, every scouting report glows over his smooth, repeatable mechanics, which I also think leads to a higher pitch capacity.
I’ve been doing these interviews with NHK for about eight years now, I think since Hideki Matsui first signed with the Yankees. They are always a lot of fun to do; they feed me really open-ended questions (So, Norichika Aoki?) and just let me go, and I’ll rattle off whatever I can think of, with digressions into whatever sabermetric points I think have relevance to the case or explanations of my process or…really whatever. We did the interview outside, at the Dodgers’ training camp offices, with Clayton Kershaw throwing warmup pitches on a mound right behind me, and various players staring out windows right at us while running on the treadmill inside. Pleasant as I make it sound, the conditions were actually kind of nasty – there was a 25-30 mph breeze blowing in with the occasional grain of sand or dust, and temps only in the low 50s. We had to seek a more sheltered location because the sound man couldn’t hear anything but wind through our mics.
And in the middle of the interview, I made a point about the ongoing discussion about expanding to a second wild card team this year – a decision that was officially announced while I was being interviewed. It took a little while to sort through the logic changes that come with having two wild cards, but the post-season odds calculator is now running that way. The biggest thing to like about it, for me, is re-introducing a real race for something like the expected Yankee-Red Sox collision – it really matters who wins the division and who gets the wild card, beyond just a one-game home advantage.
I sent the latest update in this afternoon. The most significant change from the programming side is a fix for a bug that was giving pitchers too many innings, by about 3%. So pitchers who did have 200 innings got moved down to about 194, without any of their hit/run/walk/strikeout numbers changing (those numbers had already been set by the program; literally the last step is to take all the batters faced that are left over, and convert those to outs, and that’s where the bug was). That had the side-effect of requiring me to pump about 30 innings back from the starters to the bullpens, with some slight rearrangement for teams who were stronger in one than the other.
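That last step can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the function name and inputs are hypothetical, the stat line is made up, and the sketch ignores double plays, errors, and caught stealing, all of which a real routine would have to handle.

```python
def innings_from_leftover_batters(batters_faced, hits, walks, hit_batsmen=0):
    """Convert the batters not already assigned an outcome into outs.

    Hypothetical helper: assumes every batter who did not reach on a hit,
    walk, or hit-by-pitch made exactly one out.
    """
    leftover = batters_faced - hits - walks - hit_batsmen
    if leftover < 0:
        raise ValueError("assigned outcomes exceed batters faced")
    return leftover / 3.0  # three outs per inning


# Made-up stat line, not a real projection.
innings = innings_from_leftover_batters(
    batters_faced=830, hits=180, walks=50, hit_batsmen=6
)
print(f"{innings:.1f} innings")
```

Because the hits, walks, and strikeouts are fixed before this step, an error here changes only the innings total, which is why the fix moved innings without touching the other numbers.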
There were also quite a few changes to the hitters, but not from the program. As I was preparing for my first draft (tomorrow!) I came to the realization that I was just being too conservative with the PA I was giving to the top, no doubt they’re starting as long as they’re healthy, types of players. Those players got their PA set by my eyeball rather than the computer-derived number, with a strong favoritism towards the median value for the past five years.
I also blew the dust off the playoff odds routine and ran the current projections through it, producing the results you’ll find here. That chart uses the basic projections to set the performance for each team, and then plays the season a million different times using the actual schedule. A key feature of this model is that it does not treat the Red Sox, for example, as a definite .577 team, even though that’s what their latest projection says. That .577 rating is an estimate of the Red Sox value; it may be higher (key players may do better than expected), although there are a lot more ways it could end up worse (injuries, in addition to just randomly worse performances). This model creates a spread of values used for each “year” run in the model – perhaps .578 in year 1, then .534 in year 2, .601 in year 3, and so on up to a million. The spread is a curve that has a median value of .577 (meaning that there are just as many scores below .577 as above it) – but it has a longer tail to the low side, so the mean value will be a little below .577.
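Here is a minimal sketch of that idea, not the actual routine. The split-normal shape, the two spread values (0.040 and 0.025), and the function names are all assumptions chosen for illustration, and the sketch plays each game as an independent coin flip instead of walking through the real schedule against real opponents.

```python
import random
import statistics


def draw_team_strength(median, sd_low, sd_high, rng):
    """Draw one season's 'true' winning percentage from a split normal.

    Half the draws fall below the median and half above, but the low side
    is wider (sd_low > sd_high), so the mean ends up under the median.
    """
    z = rng.gauss(0.0, 1.0)
    spread = sd_low if z < 0 else sd_high
    return median + z * spread


def simulate_wins(strength, games, rng):
    """Play one season as independent coin flips at the drawn strength."""
    return sum(rng.random() < strength for _ in range(games))


rng = random.Random(2012)
strengths = [draw_team_strength(0.577, 0.040, 0.025, rng) for _ in range(20000)]
wins = [simulate_wins(s, 162, rng) for s in strengths[:2000]]

print("median strength:", round(statistics.median(strengths), 3))
print("mean strength:  ", round(statistics.mean(strengths), 3))
print("average wins:   ", round(statistics.mean(wins), 1))
```

Drawing a fresh strength for every simulated season is what widens the range of outcomes compared with treating the team as a fixed .577 club.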
Unfortunately, the American League looks rather dull – my models clearly and unambiguously favor the Red Sox, Yankees, Tigers, and Rangers to make the playoffs, with only weak challenges led by the Angels and Rays. Fortunately, the National League looks to be almost totally up for grabs. The Phillies and Marlins provide a strong East race (the Nationals break .500, but there’s too much power in the division for them to contend). The Cardinals, Reds, and Brewers have been repeatedly swapping positions through my various updates, so it is no surprise that they all have solid chances to win. And the West is even tighter, between the Giants, Diamondbacks, and Rockies – I would say that is officially too close to call.
And I have to note, with some sadness, the passing of Gary Carter. For ten consecutive years, from 1977-86, I have Carter rated (by WARP) as the #1 catcher in the National League; seven of those he was the #1 catcher in both leagues, and in 1982 he was the #1 player in the NL (losing the major league title to Robin Yount’s spectacular season). That creates a strongly affirmative answer to the question, “Was he the best player at his position?”, the number 3 item on Bill James’ Keltner List. By one way I ran for comparing players all-time, I rated Carter as the #2 catcher – only behind Bench. It is ridiculous, beyond ridiculous, that he was not elected on the first ballot.
But Carter’s death holds a special poignancy for me, because, just a few months prior to his initial diagnosis, I had a similar run-in with the medical establishment – a shadow on a CT and overhearing a long and largely unintelligible word that ended with the suffix “-oma”. Mine, fortunately for me, turned out to be of a type that is as passive and benign as Carter’s was aggressive and malignant – but it took a couple of weeks to get the tests done to verify that, and I certainly have not forgotten the fear that came in the meantime. There, but for the grace of God, go I.
I know I haven’t written anything for a week, but I’ve been hard at work. For the last week I’ve been working on making improvements to the forecast algorithm, particularly the pitching side. Through December and January, I was able to incorporate the component scores into the hitter forecasts, and produce an improvement over the whole stat-line approach I had been using. I’ve been trying to do the same thing for pitchers, and just this morning cracked the ‘prior performance’ barrier. While I’m still working on the improvements, I felt good enough about them to incorporate them into a new model run. While the changes were dramatic for some pitchers, the effects on teams weren’t so large – the new method does not shake up the standings. But new standings, new depth charts, and new projections are on-line.
Like the old system, the projection is based on a Marcel-like baseline. Where it differs from Marcel is that the different statistics have different weighted averages, and I use the translated data throughout the process. Strikeouts are very heavily weighted towards the most recent season – roughly a 5-2-1, rounding off, with a small (~15%) regression to mean component. Walks and groundball rates are also highly slanted, though not as much as Ks. At the other extreme, hit rates have essentially no weight for seasons (1.2 – 1.1 – 1), and an 85% regression to mean, which is why stats like FIP work. Once the baselines are calculated for everybody, I go through a similar-player search, and then see how those similar players deviated from their baselines in the following year(s), and apply those deviations to the players. Once all this is done, I run the player through the translation routine backwards to get his stats back into an expected-2012 performance baseline.
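The baseline step can be sketched like this. It is a simplified illustration, not the actual code: the function name, the sample rates, and the league rates are made up, and unlike a full Marcel-style calculation it does not weight each season by playing time.

```python
def baseline_rate(rates, weights, league_rate, regression):
    """Weighted average of past rates, pulled part-way toward the league rate.

    rates and weights are ordered most recent season first.
    regression is the share of the final number that comes from the league rate.
    """
    if len(rates) != len(weights):
        raise ValueError("need one weight per season")
    weighted = sum(r * w for r, w in zip(rates, weights)) / sum(weights)
    return (1.0 - regression) * weighted + regression * league_rate


# Made-up pitcher: strikeouts per batter faced, most recent season first.
k_baseline = baseline_rate(
    [0.25, 0.20, 0.18], (5, 2, 1), league_rate=0.18, regression=0.15
)

# Same shape for hit rates, with nearly flat weights and heavy regression.
hit_baseline = baseline_rate(
    [0.320, 0.290, 0.300], (1.2, 1.1, 1.0), league_rate=0.300, regression=0.85
)

print(round(k_baseline, 4), round(hit_baseline, 4))
```

With the strikeout settings the result stays close to the most recent season, while the hit-rate settings leave almost nothing of the pitcher's own history in the final number.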
I’m testing the new projection system against the set of all pitchers who had 50 major league innings in 2011, who pitched for only one team in 2011, and who had a major league appearance in 2010. My note says that is 437 pitchers. I’m only looking at five top-level stats for judgment – hits, walks, strikeouts, homeruns, and runs allowed. The projection is normalized to the actual innings pitched in 2011, and I just look at the resulting errors, tabulated below.
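As a rough sketch of that test: each projection is rescaled to the pitcher's actual innings, and the error for each stat is a root-mean-square over all pitchers. The function names and dictionary layout are hypothetical, and the two pitchers are invented purely to show the shape of the data.

```python
import math

STATS = ("H", "HR", "BB", "SO", "R")


def scale_to_innings(projection, actual_ip):
    """Rescale a projected stat line to the innings the pitcher actually threw."""
    factor = actual_ip / projection["IP"]
    return {stat: projection[stat] * factor for stat in STATS}


def rmse_by_stat(pairs):
    """pairs is a list of (projection, actual) dicts, one pair per pitcher."""
    errors = {}
    for stat in STATS:
        squared = [
            (scale_to_innings(proj, act["IP"])[stat] - act[stat]) ** 2
            for proj, act in pairs
        ]
        errors[stat] = math.sqrt(sum(squared) / len(squared))
    errors["Sum"] = sum(errors[stat] for stat in STATS)
    return errors


# Two invented pitchers: (projection, actual).
pairs = [
    ({"IP": 180, "H": 170, "HR": 18, "BB": 55, "SO": 150, "R": 80},
     {"IP": 200, "H": 195, "HR": 22, "BB": 60, "SO": 160, "R": 92}),
    ({"IP": 60, "H": 55, "HR": 6, "BB": 25, "SO": 60, "R": 28},
     {"IP": 54, "H": 50, "HR": 4, "BB": 20, "SO": 58, "R": 22}),
]
for stat, err in rmse_by_stat(pairs).items():
    print(f"{stat:>3} {err:6.2f}")
```

Normalizing to actual innings keeps playing-time misses out of the comparison, so the errors reflect only how well the rates were projected.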
Here’s the root-mean-square error you get from just using the player’s 2008-10 (major league) stats as your 2011 projection:
Hits 14.03 HR 4.09 BB 8.74 SO 12.72 R 11.48 Sum= 51.06
Same thing, but using his translated stats for 2008-10 as the projection:
Hits 13.76 HR 3.57 BB 8.01 SO 14.26 R 10.66 Sum= 50.26
Lower is better, so this gives us the not terribly surprising result that using reasonably adjusted minor league data in addition to major league data is better than major league data alone. Incidentally, if I use the luck-free runs allowed instead of actual runs – that would be calculated runs, using a normal number of H/BIP and HR/FB – the run error would drop almost a run, to 9.85.
Here are the results of the program I’d been using to produce projections for the past two months:
Hits 11.66 HR 3.27 BB 7.62 SO 13.03 R 9.88 Sum= 45.46
And here are the results I’m getting from the new version, as of 11:00 PM Sunday night:
Hits 11.48 HR 3.33 BB 7.59 SO 13.16 R 9.47 Sum= 45.03
I’m more than a little annoyed at seeing the strikeout numbers trend backwards; on the other hand, the improvements everywhere else suggest that I’ve got a blind spot – a hole in my swing, as it were – probably a calculation error that should lead to a nifty improvement once I track it down.
In case you were wondering about over-fitting, I am also checking the routines against 2009 and 2010 pitchers, who are not part of the test set. The improvements there are about 3/4 the size of the 2011 improvements, which suggests some mild overfitting, but not enough for me to be worked up over. At least not yet.
The Nationals added Edwin Jackson to their rotation, which gave me a good reason to run another update.
He’s a no-doubt rotation member, and actually goes straight to #1 on my rankings for them if you rank by innings. He, Gio Gonzalez, Stephen Strasburg, and Jordan Zimmermann make a pretty strong front four – perhaps not quite a match for the Phillies, but very strong. It also puts them in a position where at least one of Ross Detwiler or John Lannan goes underutilized (I have no idea where Chien-Ming Wang fits in, but I haven’t been able to run any projections that like him for 2012).
He pushes their projected record up a notch to 80-82. Their projected run totals – 4th best in the majors in pitching, 6th-worst in batting – point to a seriously unbalanced team. Their offense has fallen into the “OK trap” – with the exception of Ryan Zimmerman at third, everybody on the roster looks to be an OK player for the position, neither good enough to help the team towards a title nor bad enough that an easy upgrade is available. Even the deservedly-maligned Roger Bernadina projects as an average major league hitter. You’re not going to do any better with a Marlon Byrd or Vernon Wells, who strike me as the kind of guys you might be able to get for a John Lannan. And no, I don’t think Bryce Harper is a savior for this season – the K rates he’s shown make him a bad risk for jumping straight to the majors.
Clay Davenport’s Projections for 2012
generated on 2- 2-2012
Three days later, and there’s a host of changes already made, which you can see here.
I spent six hours yesterday going through the pitcher projection codes, and did track down a couple of bugs, mostly in the way that the BABIP numbers (and thus hits) would track for each pitcher. This dramatically changes some pitchers, while others are barely changed; the main effect on the standings is that everything got narrower. The extreme teams moved 4-5 games back towards the middle, but I feel better about the way they look.
One flaw I noted – which I am guilty of every year – was that I was too conservative on the playing time for known, no doubt about it starters, listing too many guys at 80 or 85% instead of 85-90%. 90 is as high as I go. Cal Ripken, during the Streak, might have gotten a 95. Maybe. I went through and bounced a lot of hitters up a notch.
I was surprised by the Tigers making the deal with Prince Fielder. I rushed him into the projection in the simplest way – he gets the 1B slot, Cabrera gets the DH. I’m intrigued by the idea of Cabrera getting some time at third – it’s a bad-looking set of guys over there, currently. That will deserve a closer look. But it really solidifies the Tigers as Central favorites. Prior to the signing, I had them at 82 wins, with the White Sox at 80 and Indians at 78. Afterwards, they’re up to 88, while every other team in the Central is down 1, so they go from +2 to +9 over their nearest rival.
From the comments, SG says:
Clay, are the relief->starter projections adjusted for role? It seems like Chris Sale, Neftali Feliz, Aceves and Bard are projected to perform at the same rates they’d projected to have as relievers, but with more innings as starters. Relievers moved to the rotation should see a degradation in rate of performance, something like 15-20% higher in ERA. If that’s already in there, nevermind.
The master spreadsheet was making those role adjustments, but it turns out that the data being sent to the output was reading from the unadjusted area, not the adjusted area. That’s been fixed.
Not Anonymous says:
Slugging does look strangely low across the board for TB. Almost everyone is about 30-60 points lower than their zips projection. You may want to check those numbers.
There are a couple of ways to go with projections. The first time around, I ran them with adjustments built in so that the sum total of the AL projections, for instance, would equal the AL numbers from 2011. The slugging average of the AL last year was .408; the projections came out to .428. The corrections then went and knocked 20-30 points off everybody across the board.
That is a common problem with projections; they tend to be optimistic. The way the optimism fails, though, is not with an across the board cut, but because specific individuals fall dramatically short – think Ryan Zimmerman, Hanley Ramirez, or Stephen Drew last year. This time I ran the numbers without those adjustments, so the figures for individuals come up. On the downside, though, the numbers are now unbalanced – the sum totals for the hitters will not match the sum totals of the pitchers.
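The league-total correction described in this answer can be illustrated with a short sketch. It assumes a single multiplicative factor applied to every hitter, which may not be how the spreadsheet actually did it; the function name and the sample slugging averages are hypothetical, while the .428 and .408 league figures come from the text.

```python
def rescale_to_league(projected, projected_league_avg, target_league_avg):
    """Scale an individual projection so the league total matches a target.

    Assumes one multiplicative factor is applied to every player.
    """
    return projected * target_league_avg / projected_league_avg


# League figures from the text: projections came out to .428, 2011 actual was .408.
for slg in (0.450, 0.520, 0.600):
    adjusted = rescale_to_league(slg, 0.428, 0.408)
    print(f"{slg:.3f} -> {adjusted:.3f} ({(slg - adjusted) * 1000:.0f} points lower)")
```

Under that assumption, hitters projected between .450 and .600 lose somewhere in the 20-30 point range, which is the across-the-board cut the commenter noticed.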
RS says:
Any chance to get all the player projections linked in one excel sheet?
They’re now on the projections home page.
Anna says: Hi Clay, for the Giants, a couple of things stood out:
- Brandon Crawford is the projected starting SS, not Mike Fontenot
- Bochy-endorsed RF Nate Schierholtz needs a projection!
The Schierholtz thing was another bug – he was in there, but the NL players on each team with the most PA without leading in any position did not get printed out – a kind of DH mixup. And I did see Bochy’s comments on Crawford, I just missed adding them in from an even earlier version (I first set the master spreadsheet up around Thanksgiving). I still have my doubts about him lasting for the whole season, but I did go ahead and reverse the PT between those two.
Dan Lewis says:
A few comments on the Mets:
* I think the 72 win outcome is reasonable, and as a Mets fan, I’ll sadly take it.
* Barring injury, I’d be shocked if Ronny Cedeno gets 2x the starts at SS as Ruben Tejada.
* One of the Mets front office staff — I think DePodesta? — has said that the way the team intends to treat SP prospects is by giving them one-way tickets to NYC. That is, when a SP comes up, it is with the expectation that he not go back to AAA. I think you’ll see more of Hefner and Schwindin and less of Familia and perhaps Mejia. You may also see some Matt Harvey, if the team thinks he can go from September call-up to Opening Day 2013 rotation. But that’s a minor nit.
* Another minor nit: Beato almost certainly will not make the team out of camp.
My take on Cedeno is that he has “established starter” attached to his name, even though he sucks. A second glance does show that I probably was hasty in setting all the time to Ronny, so I will be revisiting that. And if the Mets are indeed tracking towards a low-70s (or, revised, mid-70s) figure, they’ll be a little more willing to let a guy who perhaps should be in AAA stick around.
This will be regularly updated in the “Projected Standings” tab on the header menu. You can access all teams and players from the links.
Clay Davenport’s Projections for 2012
generated on 1-21-2012
Yoennis Cespedes has made quite a splash since leaving the communist paradise of Cuba, aligning himself with some very clever marketers and well-produced promotional videos. Scouts are raving about his power and his speed, and there is some anticipation of a bidding war for his services once he clears the regulatory hurdles that come with being a Cuban emigre.
As well there should be. While I was initially pessimistic about his past performances, and the projections that can be reasonably made from them, I discovered a few things about how the system works. Those influences were holding him back, so to speak, for what I’m pretty sure were bad reasons. He looks like he should be a solidly above-average major league player – not likely to be a Hall of Famer and maybe not even an All-Star, but someone who could place in the top third of starting center fielders for the next several years.
I am not basing that conclusion on the videos prepared by his agents, nor on the reports of scouts. What I have done, that no one else seems to have done yet, is to compile his performance history from the Cuban Serie Nacional, along with the stats of everyone else in the Cuban leagues, and subject those statistics to the same kind of review we have for players in the American minor leagues. I have what I think is a complete record back to 2001 of Cuban batting and pitching performances, and fielding numbers from 2006 to now. You can see all of them by going through the ‘DTs by League’ tab; the drop-down menus will allow you to change the league to “Cuban Serie Nacional” (it’s at the bottom of the list), to change the year (the Cuban League typically runs from November to April; the year given is the calendar year in which the league finished, so “2011” is the 2010-2011 season), to switch between hitting or pitching stats, and to switch between Real stats, Translated stats, and Peak Translated Stats. The latter isn’t as useful for Cuba as it is for American leagues, since there are so many players whose age is unknown – at least, unknown to me. Even some of the ones we think we know turn out to be wrong. The drop-down menus also have a ‘Splits’ menu, but that won’t work with the Cuban stats – all I have are the top-line ‘ALL’ numbers.
Each of those pages is sortable. The stats for the entire league occupy the top of the page, with separate sortable stat boxes for each team down below. Each player is linked to his own page – accessible through the league pages, or through the search box at the top of any of the player pages. At least I think I have them all on their own page – the Cuban culture apparently takes a very lackadaisical attitude towards consistent spelling, which made it an enormous chore to link player stats from one season to the next. I am almost certain that there is somebody whose stats are split in two because I didn’t catch on that Yoandry Malleta and Joandi Mayeta were, in fact, the same person. Having these pages gives us a chance to look at what Yoennis Cespedes has actually done on the baseball field in a competitive environment.
His real statistics are reasonably impressive – a consistent .300 EqA, averaging 35 HR per 650 PA (the sum lines are scaled down to 650 PA to ease interpretation), with excellent center field defense. He’s pulling 350-ish at-bats a year, which – given that the Cuban season is 90 games, and the league leader in AB is around 380 – speaks well to his durability.
The 33 home runs he hit in 2011 represented a new Cuban league record. I have said before that records, as often as not, are not the product of a great individual effort, but a good effort carried out in especially favorable circumstances. Cespedes’ home run record is undoubtedly the latter. Having the full stats in hand allows us to look at the trend in total HR hit over the last few years in Cuba: how they’ve gone from 669 in 2007, to 1192, then 1292, 1498, and finally 1449 in 2011, all with roughly the same number of games and plate appearances. From 2001 to 2007, no hitter in Cuba had more than 28 home runs; it’s happened nine times in the last four years. In addition, Cespedes plays with Granma, which has had the highest park factors in Cuba over the last three years. (To be fair, he did hit 18 of his 33 home runs on the road, so the record is not simply a park effect). But a look at the leader board for home runs in 2011 makes it crystal clear:
| Name | Age | Year | Tm | Lge | AB | H | DB | TP | HR | BB | SO | R | RBI | SB | CS | Out | BA | OBP | SLG | EqA | EqR | POW | SPD | KRt | WRt | BIP | Defense | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Jose_Abreu | 0 | 2011 | Cfg | CBA | 212 | 96 | 14 | 0 | 33 | 58 | 32 | 79 | 93 | 2 | 1 | 119 | .453 | .597 | .986 | .438 | 75 | 81 | -4 | 1 | 16 | 35 | 56-1B | 1 | ||
| Yoennis_Cespedes | 25 | 2011 | Gra | CBA | 354 | 118 | 17 | 1 | 33 | 49 | 40 | 89 | 99 | 11 | 3 | 242 | .333 | .424 | .667 | .311 | 65 | 38 | 2 | 3 | 3 | -5 | 85-CF | 21 | ||
| Reutilio_Hurtado | 0 | 2011 | SCu | CBA | 321 | 104 | 18 | 0 | 30 | 64 | 45 | 77 | 77 | 0 | 0 | 220 | .324 | .457 | .660 | .340 | 74 | 45 | -6 | 0 | 10 | -6 | 87-CF | -4 | ||
| Joan_Carlos_Pedroso | 31 | 2011 | LTu | CBA | 253 | 79 | 9 | 0 | 29 | 60 | 67 | 51 | 83 | 1 | 0 | 177 | .312 | .452 | .692 | .341 | 60 | 64 | -6 | -20 | 14 | 0 | 49-1B | -4 | ||
| Frederich_Cepeda | 31 | 2011 | SSp | CBA | 305 | 121 | 25 | 3 | 28 | 77 | 36 | 84 | 81 | 0 | 1 | 188 | .397 | .519 | .774 | .379 | 83 | 48 | -5 | 4 | 16 | 22 | 74-LF | -16 | ||
| Alexander_Malleta | 0 | 2011 | Ind | CBA | 311 | 100 | 22 | 1 | 27 | 78 | 40 | 74 | 76 | 6 | 5 | 219 | .322 | .462 | .659 | .322 | 64 | 38 | -2 | 2 | 16 | -6 | 87-1B | -3 | ||
| Alfredo_Despaigne | 25 | 2011 | Gra | CBA | 261 | 93 | 7 | 0 | 27 | 33 | 42 | 56 | 74 | 1 | 2 | 172 | .356 | .439 | .693 | .319 | 49 | 46 | -6 | -5 | 2 | 8 | 27-LF | 4 | ||
| Edilse_Silva | 0 | 2011 | Hol | CBA | 331 | 111 | 22 | 1 | 25 | 55 | 66 | 60 | 87 | 0 | 4 | 227 | .335 | .433 | .634 | .309 | 60 | 41 | -6 | -12 | 7 | 13 | 78-1B | -7 | 10-LF |
Cespedes was tied for the record this year by Jose Abreu, who did it in 60% as many at-bats, and there are a whole slew of players right behind them. This looks like a pretty normal leaderboard, not the leaderboard of a record-setting season – which is how you can be pretty sure the record really belongs to the conditions, not the individual. Give him credit for leading the league in HR, but leave the record talk out of it.
(Aside: you really, really should click on Abreu’s link and look at his numbers. If I were an MLB exec I’d be tempted to hire an extraction team to go in and kidnap the guy so he could play for me).
The thing is, while there are certainly some high quality players in the Cuban league, enough to fill out an All-Star team that is strong in world competitions, the quality depth just isn’t there – a problem that isn’t helped by the continual exodus of top players like Cespedes (let us please leave the moral issues, governmental ideologies, personal freedoms, what is right and what is wrong out of this; I am explicitly and only considering the baseball issue here). In past analyses I have graded the Cuban league, as a whole, to be on par with low A ball in the States. This means that the translation process is going to take some very big bites out of these Cuban statistics, which you can see for yourself in the Regular DT portion of his page and which I’ve reproduced here:
Yoennis Cespedes Born 19851018 Age 25 Bats R Throws R Height 70 Weight 200 Regular DT
Year Team Lge AB H DB TP HR BB SO R RBI SB CS Out BA OBP SLG EqA EqR POW SPD KRt WRt BIP Defense
2004 Granma______ CBA 300 73 17 4 8 23 86 40 34 3 1 233 .243 .302 .407 .246 35 5 0 -18 -3 4 77-DH 0
2005 Granma______ CBA 358 92 22 3 13 26 83 55 42 4 1 273 .257 .314 .444 .261 48 10 1 -8 -3 0 93-DH 0
2006 Granma______ CBA 360 101 23 2 19 31 68 66 58 6 1 262 .281 .344 .514 .289 59 17 3 -1 -1 0 87-CF -3
2007 Granma______ CBA 361 91 22 2 17 30 84 67 57 12 5 278 .252 .317 .465 .267 51 15 8 -8 -2 -4 77-CF 8
2008 Granma______ CBA 378 82 14 1 18 21 79 57 52 3 2 301 .217 .258 .402 .225 36 13 3 -6 -6 -21 76-CF 9 1-LF 1
2009 Granma______ CBA 348 87 14 1 18 28 60 58 54 4 2 270 .250 .307 .451 .260 47 14 1 2 -2 -13 65-CF 2
2010 Granma______ CBA 358 93 19 3 14 30 70 61 45 4 1 267 .260 .321 .447 .264 48 9 2 -2 -2 -3 75-CF -9
2011 Granma______ CBA 375 92 16 1 22 34 67 60 65 8 1 287 .245 .311 .469 .267 53 18 2 1 -1 -16 85-CF 16
--------------------------------------------------------------------------------------------------------
Minors 591 148 31 4 27 46 124 97 85 9 3 2171 .251 .309 .451 .260 78 13 2 -5 -2 -7 97-CF 5
What we have is a guy who (over a 162-game season) has 25-30 HR power, which is worth roughly 15 runs more than an average major league player. The only evidence for good speed, which was on prominent display in his videos, is his fielding ratings – it is not apparent from his hit distribution or stolen base totals. Statistically speaking, he only rates as slightly speedier than average. His strikeout and walk rates both rate as “poor”, although his K rate has improved in recent years. Initially, at least, and especially so if a team tries to move him straight to the majors, the K rates are likely to be worse than this. The really bad mark on his record is his BIP numbers, which have become sharply worse recently, with a score at least 13 runs worse than major league average in three of the last four years. I am still in the process of understanding the way BIP fluctuates. The BIP score, by a wide margin, has the least continuity from one year to the next of the five component stats I have listed: correlations are only about 0.4, instead of the 0.8 ratings the others enjoy. While there are tendencies for different hitters, there is a pretty good chance that this rating will improve in the US.
So yes, the overall projection is for good power, low BA and OBA, and a good CF, with an overall EQA in the .260-.270 range. Per the EqA report, the average EqA for major league center fielders last year was .269.
There is a major league player who is, statistically, quite comparable to Cespedes. He is also a center fielder, has a Gold Glove, and is only about 2.5 months older. Compare their 2011 DTs, enlarging Cespedes’ to the same plate appearance total:
. ab h db tp hr bb so sb cs ba oba slg eqa
Cespedes 544 133 23 1 32 49 97 12 1 .244 .307 .467 .267
ML Player 562 162 25 2 25 31 100 11 5 .288 .325 .473 .274
The major league player gets 24 more hits, but gives most of that advantage back by drawing 20 fewer walks. He’s had major league EqAs of .246, .265, .262, and .274 in his career, and there is a near-constant expectation for him to break out and have a great season. Instead, we have a string of seasons which place him as the 7-10th best CF in the majors – very good, but short of All-Star caliber. The gap between the hype of what we expect from Adam Jones and the reality we’ve gotten seems to me like a very good lesson for Yoennis Cespedes.
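For anyone who wants to reproduce the enlarged Cespedes line, here is a small sketch. It assumes plate appearances can be approximated as AB + BB, since hit-by-pitch and sacrifices are not shown in the tables; whether the original scaling used exactly that definition is an assumption, and the function name is hypothetical. Under that assumption the scaled line matches the comparison table.

```python
def scale_to_plate_appearances(line, target_pa):
    """Scale every counting stat so the line covers target_pa plate appearances.

    Plate appearances are approximated as AB + BB (assumption: HBP and
    sacrifices are not in the table, so they are ignored here).
    """
    factor = target_pa / (line["AB"] + line["BB"])
    return {stat: round(value * factor) for stat, value in line.items()}


# Cespedes' 2011 translated line, from his Regular DT.
cespedes_2011 = {"AB": 375, "H": 92, "DB": 16, "TP": 1, "HR": 22,
                 "BB": 34, "SO": 67, "SB": 8, "CS": 1}

# Adam Jones' AB + BB from the comparison table.
scaled = scale_to_plate_appearances(cespedes_2011, target_pa=562 + 31)
print(scaled)
```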
When I run a projection for Cespedes – and, for that matter, Jones – I get a forecast that carries them from their current .270ish figure to something more .280ish. Combine that with being good-fielding center fielders, and you’re talking about 4-5 WARP, right on the border of All-Star status. A 4.3 WARP, which is his 50% projection, would have made him the 6th best CF in 2011, 5th in 2010, or 4th in 2009. The overwhelming majority of players with similar stats are in the majors; the Improve percentage and breakout/collapse ratios are both in his favor for the next few seasons.
Cespedes’ comps are interesting, in that there isn’t a lot of range to them. None of them are in the Hall of Fame, but a couple of them get mentioned in Hall discussions. Seven of the ten were very, very good – and the remaining three did just about nothing, with no gradation between the two groups. His ten best comps (based on performances from ages 23-25):
| Name | Year | WARP | VORP | EQAlast3 | EQAnext3 | POW | SPD | K | BB | BIP |
|---|---|---|---|---|---|---|---|---|---|---|
| Yoennis Cespedes | 2012 | | | .264 | | 14 | 2 | 0 | -2 | -11 |
| George Hendrick | 1976 | 20 | 275 | .278 | .297 | 18 | 0 | 2 | -4 | -2 |
| Joe Mather | 2009 | -1 | -2 | .264 | .220 | 13 | 0 | -1 | -2 | -9 |
| Vernon Wells | 2005 | 16 | 226 | .270 | .267 | 8 | 0 | 8 | -4 | -2 |
| Kevin McReynolds | 1986 | 30 | 269 | .270 | .301 | 13 | -1 | 4 | -4 | -5 |
| Jason Lane | 2003 | 3 | 28 | .269 | .273 | 11 | 2 | -3 | -2 | -1 |
| Andruw Jones | 2003 | 53 | 382 | .281 | .283 | 16 | 2 | -1 | 0 | -3 |
| Matt Luke | 1997 | 1 | 3 | .252 | .234 | 9 | -1 | -4 | -5 | -4 |
| Torii Hunter | 2002 | 36 | 265 | .247 | .263 | 5 | 3 | -4 | -7 | -3 |
| George Bell | 1986 | 12 | 182 | .267 | .285 | 11 | 1 | 1 | -7 | -3 |
| Vada Pinson | 1965 | 42 | 291 | .292 | .282 | 10 | 5 | 8 | -3 | 3 |
“Year” corresponds to the “current season”; WARP and VORP are career totals; EQAlast3 is the player’s translated EQA for the three seasons prior to “Year” (e.g., 1973-75 for Hendrick), while EQAnext3 is for Year plus the next 2 seasons (1976-78 for Hendrick); the component scores are also for the prior three seasons.
George Hendrick, not surprisingly, also comes up as Adam Jones’ #1 comp.
One thing that is always a question with Cuban players is age – how much do these stats change if Cespedes is not actually going on 26, but is in fact several years older? It actually isn’t that bad for him, compared to other players who were reportedly something like 22 when coming to the States. He’s reached what you might call the plateau portion of the aging curve, where expected performance stays fairly level for about six years. Changing his current age had almost no effect on the projections for 2012 or 2013; where it does have an effect is in how long he can play before he goes into the downhill portion of the aging curve. You can see in his projection that the breakout/collapse ratio goes below 1 at age 29, and continues getting worse; that’s a good indication of where his comps have tended to lose it.
Looking back at trades and moves since Christmas:
Red Sox: Traded Josh Reddick, Miles Head, and Raul Alcantara to Oakland for Andrew Bailey and Ryan Sweeney. Signed Rich Hill.
Adding Bailey, who should immediately take over as the closer (sorry, Mark Melancon), makes me reasonably certain that at least one of Alfredo Aceves or Daniel Bard will join the rotation. It’s been talked about all off-season, but in my experience these discussions happen about five times as often as the move actually happens. Good relievers turning to starters have a pretty good track record – Ogando last year, Ryan Dempster, CJ Wilson – and I would expect Bard, Aceves, or both to do fine. In right field, I suppose Sweeney just slots in directly for the playing time I expected would go to Reddick. In the mean forecast, I actually have Sweeney being a touch better than Reddick for 2012 – but that is conditioned by Sweeney having some demonstrated fragility, a much lower chance of surprising you to the upside this year, and a lower chance of improving in future years. Sweeney adds to the collection of OK but unexciting outfielders the Red Sox have in-system: Ryan Kalish (who will miss the first two months or so), Dan Nava, Alex Hassan.
Hill is a TJ-reclamation project five years removed from a good season, and I don’t expect him to contribute.
Cubs: Signed Andy Sonnanstine.
In the Tampa Bay system, Sonnanstine was the beneficiary of solid defensive teams that made him look, at times, almost respectable. Wrigley Field is not going to be his friend.
White Sox: Traded Carlos Quentin to San Diego for Simon Castro and Pedro Hernandez. Traded Jason Frasor to Toronto for Miles Jaye and Daniel Webb.
I am not sure what the Sox are trying to do with this, unless they really think that Quentin is damaged beyond repair or such a head case that they no longer want to deal with him. Who they expect to play the outfield for them has me baffled – I suppose they are committed to Alejandro De Aza and Dayan Viciedo, but Brent Lillibridge (or Adam Dunn(!)) are the only other players in-system with outfield experience and a projected EqA north of .245, which equates to “remarkably thin”. I had them as the best contender to the Tigers for the AL Central, but they’ll have to shore up that outfield. I’m wondering, given their experience with Viciedo and Alexei Ramirez, if they aren’t making a strong push for Yoenis Cespedes.
Of the pitchers they got, I’d expect Castro (who reached AAA with the Padres) and maybe Hernandez (who made AA) to make appearances in the majors this year, but not to remain up for very long. Jaye and Webb are much farther away, and none of the four has pitched like a sure-fire major leaguer.
Yankees: Signed Andruw Jones.
Jones steps in for what figures to be another 200 or so PA, backing up the heretofore durable Yankee outfielders and occasionally stepping in at DH. He’s a step up from Justin Maxwell, who pretty much drops off the depth chart.
Athletics: Traded Andrew Bailey and Ryan Sweeney to Boston for Josh Reddick, Miles Head, and Raul Alcantara. Signed Jorge Soler.
Reddick is the only player here I expect to see in Oakland this year. He’s a pretty average outfielder, roughly equal to Sweeney for this year, but with a fair bit more growth potential in the power department. Head turned in a really nice half-season in the Sally League last year, but none of his other stops point to a major league first baseman. Alcantara was in the Gulf Coast League last year, and did well enough for me to consider him a decent long-term prospect – but it’s a huge gap between the GCL and the majors.
Padres: Traded Simon Castro and Pedro Hernandez to the White Sox for Carlos Quentin.
I had Castro slotted for about 10 starts this year, which I’ll now pass on to Joe Wieland (who I like better, anyway) instead. I’m assuming that Quentin locks down left field, which should kick off a big fight between Yonder Alonso, Kyle Blanks, Jesus Guzman, and Anthony Rizzo for first base, a fight that could, in a stretch, also draw in Will Venable or Chris Denorfia. That’s a fight that Alonso probably wins, although the projections for all of them are close enough that the hottest March hand could take it. Quentin gives the Pads a proven power bat, but a) they look like a distant fifth-place team, b) he’s brittle, and c) has a pretty lousy projection going forward for a 28-year-old (probably because of injuries and lack of speed) …I don’t see how this fits in as part of a plan.
Giants: Signed Boof Bonser.
Hasn’t had a positive RAR since 2007.
Blue Jays: Traded Miles Jaye and Daniel Webb to the White Sox for Jason Frasor. Signed Darren Oliver. Signed Aaron Laffey.
When I first filled out a depth chart for the Blue Jays, I had real problems with the Toronto bullpen, and I wound up assigning 20 innings apiece to seven different guys with lousy (4.50+ ERA) projections. Frasor and Oliver (Laffey, not so much) wipe out most of those and cut a quarter run off the projected bullpen ERA.
Nationals: Signed Michael Ballard.
10 years, 254 Megabucks.
Ten years.
That’s an eternity in baseball, particularly when you are dealing with a player who has already passed the age of 30.
I have a new card for him up here. Let’s start with the projections for this year and beyond, down at the bottom of the page:
Year Age PA AB R H DB TP HR RBI BB SO SB CS BAvg OBP SLG EqA RAR WARP Defense MJ Brk Imp Col Att Drp
2012 32 596 520 90 160 25 1 34 98 71 58 9 4 .308 .396 .556 .329 57.3 7.1 139-1B 3 100 9 38 18 3 1
2013 33 592 518 88 159 25 1 32 96 70 58 7 4 .307 .394 .544 .326 54.1 6.7 138-1B 3 100 2 32 25 9 1
2014 34 575 503 84 153 24 1 30 90 68 59 6 3 .304 .391 .535 .323 50.7 6.3 134-1B 3 100 3 25 31 15 6
2015 35 553 483 78 144 23 1 28 84 66 56 6 3 .298 .387 .524 .319 45.9 5.8 129-1B 3 100 6 19 42 25 13
2016 36 485 423 69 126 20 1 25 75 58 49 5 3 .298 .388 .527 .319 40.6 5.0 113-1B 2 99 3 15 52 37 24
2017 37 447 392 64 116 18 1 23 69 52 46 4 2 .296 .383 .523 .316 36.1 4.5 104-1B 2 99 0 12 58 45 33
2018 38 407 357 56 104 17 1 21 62 47 42 4 2 .291 .378 .521 .315 31.8 4.0 95-1B 2 97 1 8 63 51 42
2019 39 360 316 50 91 15 1 18 54 41 37 3 2 .288 .375 .513 .311 26.6 3.4 84-1B 2 98 1 6 69 61 53
2020 40 300 263 39 76 10 0 15 44 35 32 2 1 .289 .377 .498 .309 21.3 2.7 70-1B 1 96 0 4 81 71 63
2021 41 248 221 31 63 9 0 12 36 26 27 1 1 .285 .363 .489 .301 15.2 1.9 58-1B 1 96 0 2 88 81 71
(I’ve cut some columns to make it fit here, and added more projection years than the pages are going to have). The top-line projection is for him to steadily drop from an EqA of .330-ish with a 7 WARP down to .300 with a 2 WARP – my gut feeling is that the long-term projections underestimate the EqA decline while exaggerating the PA decline, but that the overall level comes out about right. There’s 47.4 WARP showing on the board for the duration of the contract. The marginal value of those WARP needs to be about $5.35 million over the life of the contract for it to break even, which is pretty close to the current rate. If you allow for a 5% growth in the marginal value per year, then you only need to start around $4.15M to break even; since current marginal values are probably around 5.25, there’s a little (very little) bit of room to come up short in the expectations. The value looks OK for the expected production, as a total. Even for that, though, you are looking at a strong, strong likelihood of overpaying for the years at the end of the contract…you just have to hope you come out far enough ahead in the first few to make up for it.
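The break-even arithmetic can be sketched in a few lines of Python. This is a minimal sketch, not the calculation behind the figures above: the WARP values are copied from the projection table, and the function name `break_even_rate` and the compounding convention (no growth applied in the first contract year, no discounting of future dollars) are my own assumptions. A different convention will move the growth-adjusted starting rate, so it need not land exactly on the number quoted above.

```python
# WARP by contract year (2012-2021), copied from the WARP column of the projection table.
warp = [7.1, 6.7, 6.3, 5.8, 5.0, 4.5, 4.0, 3.4, 2.7, 1.9]
contract_millions = 254.0


def break_even_rate(warp_by_year, total_cost, growth=0.0):
    """Starting $M per WARP at which projected value equals total_cost.

    Assumption (not from the article): the marginal value in contract year i
    (0-based) is start * (1 + growth) ** i, so year one gets no growth.
    """
    weighted = sum(w * (1.0 + growth) ** i for i, w in enumerate(warp_by_year))
    return total_cost / weighted


total_warp = sum(warp)                                   # WARP over the whole deal
flat_rate = break_even_rate(warp, contract_millions)     # constant $/WARP
growing_rate = break_even_rate(warp, contract_millions, growth=0.05)

print(f"total WARP: {total_warp:.1f}")
print(f"flat break-even: ${flat_rate:.2f}M per WARP")
print(f"break-even start with 5% growth: ${growing_rate:.2f}M per WARP")
```

The useful part is the shape of the calculation: because the biggest WARP years come first, growth in the marginal rate helps less than you might hope, since the inflated dollars land on the thin years at the end.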
But the risks, oh, those risks. The drop rate (“Drp”), which is the percentage of his comparables who are completely out of the league, is already up to one-third through six years; that number is a big part of why the expected plate appearances drop so steadily through the forecast. His breakout (Brk) score, the chance of turning in a performance well above his 2009-2011 baseline, is only in the single digits throughout the contract; his collapse rate, the chance of underperforming that baseline, grows steadily, as expected. There is far more downside risk than there is upside, and the contract doesn’t leave much downside room to keep from becoming a stinker.
There is also the case – already factored into the total forecast, but something I’d like to look at a little closer – that we may have already seen some decline. Here’s a cut from the regular translation portion of his card:
Albert Pujols Born 19800116 Age 31 Bats R Throws R Height 75 Weight 230
Regular DT
Year Team Lge AB H DB TP HR BB SO R RBI SB CS Out BA OBP SLG EqA EqR POW SPD KRt WRt BIP
2009 St_Louis NL 571 191 36 1 50 109 57 129 144 16 5 395 .335 .442 .664 .357 150 31 0 16 12 1
2010 St_Louis NL 585 189 32 1 46 103 66 122 128 14 5 409 .323 .423 .617 .340 138 26 -1 14 10 1
2011 St_Louis NL 573 182 24 1 42 64 50 119 114 9 1 402 .318 .384 .583 .320 117 22 0 17 2 -3
The last five columns I’ve left in, after EqA and EqR, are component breakdowns. They are rate statistics, not totals, measured in runs per 650 plate appearances. To digress a moment, there was a comic book character in the DC universe by the name of Ultra Boy, who had all the powers of Superman – but he could only use them one at a time. If he turned on the strength, he wasn’t invulnerable; if he was invulnerable he couldn’t use the heat vision; if he was using X-ray vision he couldn’t fly to get a good vantage point, and so on and so on. I think Peter Petrelli had the same thing in Heroes. That’s kind of the idea here, with Pujols standing in for Superman. I have taken a perfectly average player, and replaced his averageness with one aspect of Pujols at a time. The POW column gives Mean Joe Mean Pujols’ power – mostly his home runs, with some doubles counting in as well – while leaving everything else perfectly average. SPD does the same thing for speed – stolen bases, triples, some doubles. KRt is strikeout rate, WRt is walk rate (which, for this purpose, includes HBP), and BIP is essentially singles.
What I really want to focus on is the drop in POW – from +31 in 2009, to +26 in 2010, and to +22 last year. I wanted to find players who underwent similar changes at the same ages (29-31), and to see what they did afterward. I’ve run cards like this for every player since 1954 (which you can find, for now, by going through the search bar at the top of each player’s page), so I started looking for players who had POW scores at ages 29-31 that matched Pujols’ scores, plus or minus some value. There weren’t any players found when that plus or minus value was 0, 1, or 2, which isn’t surprising since a +20 power score is a pretty high bar. I started to get some nibbles at +/- 3, a few more at 4, and by +/- 5 I had twelve players to work with. Chronologically, they are:
Ted Kluszewski, 1954-56 34, 24, 18
Stan Lopata, 1955-57 32, 27, 19
Bill Skowron, 1960-62 26, 23, 23
Lee May, 1972-74 28, 27, 21
Johnny Bench, 1977-79 31, 29, 17
Dwight Evans, 1981-83 27, 23, 17
Steve Balboni, 1986-88 28, 22, 24
Barry Bonds, 1994-96 32, 22, 27
Jeff Bagwell, 1997-99 31, 21, 27
Mo Vaughn, 1997-99 28, 28, 22
Manny Ramirez, 2001-03 37, 27, 21
Troy Glaus, 2006-08 26, 21, 20
ALBERT PUJOLS, 2009-11 31, 26, 22
(Note that for the older players, pre-1980, the “new” design player cards haven’t caught up yet.)
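For anyone who wants to play with the widening-tolerance search, here is a minimal Python sketch. It uses only a handful of rows from the list and the rounded scores as printed, and it assumes that "matched, plus or minus some value" means every one of the three seasons has to fall inside the tolerance. That reading, and the helper names `within` and `matches`, are mine; the real search ran over every player card since 1954.

```python
# Pujols' POW scores at ages 29-31, from his card.
pujols = (31, 26, 22)

# A few candidates copied from the list (rounded scores as printed there).
candidates = {
    "Ted Kluszewski": (34, 24, 18),
    "Johnny Bench": (31, 29, 17),
    "Mo Vaughn": (28, 28, 22),
    "Troy Glaus": (26, 21, 20),
}


def within(scores, target, tol):
    """True if every season is within +/- tol of the target season."""
    return all(abs(s - t) <= tol for s, t in zip(scores, target))


def matches(pool, target, tol):
    """Names in the pool whose three seasons all fall within the tolerance."""
    return [name for name, scores in pool.items() if within(scores, target, tol)]


# Widen the window one run at a time, as described in the text.
for tol in range(0, 6):
    print(tol, matches(candidates, pujols, tol))
```

Since the printed scores are rounded, a player sitting right at the edge of the window could land on either side of it in a sketch like this.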
These 12 players, between them, were able to meet or beat their worst age 29-31 POW 28 times over the remainder of their careers, an average of 2 and a third times per player, which doesn’t sound too bad. Nine of those 28, however, belong to Barry Bonds; six more belong to Manny Ramirez, which means that more than half of the good results come from two players with PED issues. The remaining ten players tally only 13 seasons, just a little over 1 per player:
Kluszewski had a 21 in half-time play for the Angels in 1961.
Lopata was done.
Skowron never reached 20 again.
May had a 29 in 1976, age 33, and had a couple more high-teens.
Bench bounced back with a 30 in 1980 and a 19 in 1981, so he gets credit for two.
Evans went 21-14-18-22 over the next four years, giving him three hits.
Balboni put up a 28 and 36 in the next two years, and then was out of the league.
Bonds went ape on the league for the next decade, including an incredible 68.
Bagwell went 26-21-18-20-12 for the remaining five full years he played, which counts for 2.
Vaughn had a 22 in 2000, missed 2001, had another 22 in 2002, then an 8 and then was done, so that counts for 2.
Ramirez had a 31-34-33 over the next three years, and then a 21, 25, and 33 later on, for a total of 6.
Glaus was pretty much done.
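The "hits" in the rundown above follow one rule: a later season counts if its POW meets or beats the player's worst score from ages 29-31. A short Python sketch of that rule follows, limited to the players whose later seasons are spelled out in full in the text. The function name `count_hits` is my own, and the later-season lists contain only the numbers quoted above.

```python
# (age 29-31 POW scores, later POW scores quoted in the text)
players = {
    "Johnny Bench": ((31, 29, 17), [30, 19]),
    "Dwight Evans": ((27, 23, 17), [21, 14, 18, 22]),
    "Steve Balboni": ((28, 22, 24), [28, 36]),
    "Jeff Bagwell": ((31, 21, 27), [26, 21, 18, 20, 12]),
    "Mo Vaughn": ((28, 28, 22), [22, 22, 8]),
    "Manny Ramirez": ((37, 27, 21), [31, 34, 33, 21, 25, 33]),
}


def count_hits(baseline, later):
    """Later seasons that meet or beat the worst age 29-31 POW score."""
    floor = min(baseline)
    return sum(1 for pow_score in later if pow_score >= floor)


hits = {name: count_hits(base, later) for name, (base, later) in players.items()}
for name, n in hits.items():
    print(f"{name}: {n}")
```

Note how low the bar is: a player only has to match his own worst season of the three, and most of the list still managed it only once or twice.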
This is, to some extent, factored into the general projection, although the comps there are based on similarity across all performance sectors, not just power (as well as height, weight, position). Pujols has significant advantages over many of the players on this list in terms of speed (just by being average, instead of -5 or -10), strikeout rates (very strong +15 typically, compared to -22 from Balboni), and walk rates (although the drop from 12 to 10 to 2 is troubling), which should help him remain a star player even if his power drops into the mid-teens. Compare him to someone like Balboni, for whom power was the only component worth anything; and even at a +36 the major leagues didn’t think it offset his high strikeouts (-29), lack of speed (-8), and lack of singles (-23) enough to justify bringing him back for more. We would expect that kind of broad toolset to age better than a narrow one.
A quick look at how the last 12 years of pennant races would have looked with a second wild card:
In six of the last eight NL races, the carte deux team finished within 1 game of the actual wild card winner. From 2001 until now, the fifth-best NL record has never been more than 4 games behind the wild card.
By contrast, this year’s race, between the Rays and Red Sox, is the only time since 2000 that the fifth-place AL team has finished within 1 game of the wild card. In eight of the last 12 years, the new card winner (assuming things played out as before) would have trailed the real winner by 5 or more games, with the 2001 Twins finishing a whopping 17 games behind the “102 win but still just a wild card” A’s.
Given that knowledge, you’d think the fifth-best NL teams would have a better record than their AL counterparts, but you’d be wrong. The AL wild cards averaged 95.42 wins to the NL’s 91.25 from 2000-2011, and their “tame cards” still led by 89.84-88.67.
In 2007 and 2002, the “Drive for Five” (“Drift for Fifth”?) would have ended in a tie in the AL, between Detroit and Seattle in ’07 and between Boston and Seattle in ’02.
In the 2007 NL, the Padres and Rockies finished the regular season in an 89-73 tie for the wild card, forcing a one-game playoff between them – exactly what would have happened under the new system.
In seven of the 24 leagues between 2000-2011, the second wild card actually had the fourth-best record in the league, beating out at least one of the division winners. In two more seasons the WC2 was tied with one of the division winners – so that quite often the “fifth” team has as good a claim to the playoffs as one of the division winners.
In the 2011 AL, 2010 AL, 2009 NL, 2008 AL, 2007 NL, 2006 AL, and 2002 NL, the two wild cards were from the same division. Drawing the wild card, who will be coming off a one-game playoff with likely disruptions to their starting rotation and bullpen, just got a little more valuable relative to facing a rested division winner. Will the team with the best record still be kept from facing the wild card if they come from the same division?
Similarly, holding the tiebreaker for a division lead in a “one team wins the division, one gets the wild card” scenario just got more important.
The Giants have been the “last team looking in” three times since 2000, in 2009, 2004, and 2001. The Red Sox held the distinction in both 2010 and 2011, plus they were tied for it in 2002. The Red Sox would also have had to defend their wild card win five times, more than anyone else. The Padres, Phillies, Dodgers, and Indians have been in the situation twice. The Mariners were that team once, and tied for being that team two more times.
NL
Year WC WC-2 GB
2011 STL (90) ATL 1
2010 ATL (91) SD 1
2009 COL (92) SF 4
2008 MIL (90) NYM 1
2007 COL (89+) SD 0 (not counting playoff)
2006 LA (88) PHI 3
2005 HOU (89) PHI 1
2004 HOU (92) SF 1
2003 FLA (91) HOU 4
2002 SF (95) LA 3
2001 STL (93) SF 3
2000 NYM (94) LA 8
AL
Year WC WC-2 GB
2011 TB (91) BOS 1
2010 NYY (95) BOS 6
2009 BOS (95) TEX 8
2008 BOS (95) NYY 6
2007 NYY (94) DET/SEA 6
2006 DET (95) CWS 5
2005 BOS (95) CLE 2
2004 BOS (98) OAK 7
2003 BOS (95) SEA 2
2002 ANA (99) BOS/SEA 6
2001 OAK (102) MIN 17
2000 SEA (91) CLE 1
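The counts quoted in the bullet points can be re-derived from the two tables. Here is a minimal Python sketch; the wins and games-behind figures are copied from the tables, with the 2007 NL entry taken as 89 wins and 0 games behind (before the playoff game), and the variable names are mine.

```python
# year: (wild card wins, games the second wild card finished behind), from the tables
nl = {2011: (90, 1), 2010: (91, 1), 2009: (92, 4), 2008: (90, 1),
      2007: (89, 0), 2006: (88, 3), 2005: (89, 1), 2004: (92, 1),
      2003: (91, 4), 2002: (95, 3), 2001: (93, 3), 2000: (94, 8)}
al = {2011: (91, 1), 2010: (95, 6), 2009: (95, 8), 2008: (95, 6),
      2007: (94, 6), 2006: (95, 5), 2005: (95, 2), 2004: (98, 7),
      2003: (95, 2), 2002: (99, 6), 2001: (102, 17), 2000: (91, 1)}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# NL: second card within 1 game over the last eight seasons (2004-2011).
nl_close = sum(1 for year in range(2004, 2012) if nl[year][1] <= 1)
# NL: largest gap from 2001 onward.
nl_max_gap = max(nl[year][1] for year in range(2001, 2012))
# AL: seasons where the second card trailed by 5 or more.
al_far = sum(1 for wins, gb in al.values() if gb >= 5)

# Average wins for the wild card, and for the second card (wins minus games behind).
al_wc_avg = mean(wins for wins, gb in al.values())
al_second_avg = mean(wins - gb for wins, gb in al.values())
nl_second_avg = mean(wins - gb for wins, gb in nl.values())

print(nl_close, nl_max_gap, al_far)
print(round(al_wc_avg, 2), round(al_second_avg, 2), round(nl_second_avg, 2))
```

Treating the second card's wins as "wild card wins minus games behind" only works because every gap in the tables is a whole number of games.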
