Until I return home from surgery. Maybe tomorrow, more likely Saturday.

 

I participated in another “expert” draft on Tuesday night, courtesy of an invitation from Grey Albright at Razzball. It was a 12-team mixed league straight draft, and it was a strange experience – without a doubt the shallowest draft I’ve played in several years.

About the only plan I had going in was to avoid the players I had taken in LABR’s mixed league – spread my risks around, if you will. I wound up breaking that rule, not once, but four times, with players who had been sitting at the top of my boards for a long time by the time I took them.

The most interesting part of the draft for me, though – and not in a good way – was the physical toll it took on me. It was only about three hours, and I did the whole thing from the comfort of my office chair, but by the last few rounds I was sweating like I was in the Amazon, my stomach was a solid constricted knot, and my head was a cross between a solid compressional pain and a loopy lightheadedness that had me barely able to sit up. It gives me a lot of worry about how I’ll handle upcoming drafts, in person, away from my house, for six+ hours at a time.

I had the six spot in this draft, a touch better than the 13 I drew in LABR – right in the middle of the pack, with a steady interval between picks. Braun, Cabrera, Trout, McCutchen, and Cano went before me. I settled for Matt Kemp readily enough. Those who know me know I tend to value pitchers more than most, so it should not come as a surprise that I took Justin Verlander in the second round (Kershaw had gone a couple of picks earlier, and I probably would have taken him if I could.)

The third round had me salivating for Adam Jones, who was taken right before my turn. Damn. (That was my only real non-self-inflicted Damn of the night, though, so that’s good). I reloaded with Jay Bruce instead, and when the fourth round got back to me I decided to take an infielder with Starlin Castro. The fifth round sent me back to pitchers, and I chose Craig Kimbrel – first closer taken, and with serious misgivings about how he’s looked this spring. Freddie Freeman is one of my favorite breakout guys for the season, and his spring numbers – .346 translated EQA in Florida games – haven’t done anything to dissuade me. I got him in the sixth. I took Kris Medlen in the 7th – he wasn’t actually next on my list, but only because of my doubts about his endurance, giving me three Braves in a row.

My 8th round went to Carlos Gomez, and I’m not sure why – I seem to have had a sudden panic attack about stolen bases and just flipped his name out there. Number 9 for me was Sergio Romo, who I clearly like a lot better than any other site I’ve seen. Tenth I took the “catcher” who is least likely to catch any games at all this season, Victor Martinez. I’m honestly not sure what to expect from him, but I do love the lineup slot he’s got. Eleven saw me go back to the mound for C.J. Wilson; here’s looking at the minor surgery clearing up the problem that dragged him down the last couple of months of 2012.

Halfway through, and I make my first repeat in Danny Espinosa. I really should have taken Howie Kendrick here – they were essentially equal players on my draft sheet, and I like what Kendrick (.409) has done this spring a lot more than Espinosa (.232). I think the physical issues were starting to come on and affect me. Went to Doug Fister for my 13th pick, Pedro Alvarez at 14 (there was a run on 3B going on, and I was getting panicky about being stuck with an even deeper option). Decided to take on some age and get Torii Hunter in the 15th. For 16 I hit my repeat board again, with another player I love as a 2013 breakout – Brandon Belt.

At 17 I went for Jason Vargas, who had actually been atop my board for a couple of rounds – I do love the outfield he’ll be pitching in front of. Alexei Ramirez at 18 wrapped up my infield. At this point I went looking for a pitcher whose forecast was tolerable – not necessarily the best that’s left, but someone who had value within the format – but was having a strong spring. And I came up with Jeff Niemann. I decided to take Joaquin Benoit next – I’ve got Rondon in another league, so I should have a closer in at least one of them. I did sort of the same thing for hitters, bringing up Aaron Hicks and his likely seizure of Minnesota’s center field. Hmm. Probably not my best choice of words. That left one more spot on my roster, and I filled it with a repeat (ugh) of Drew Stubbs (ugh ugh).

I was able to wash that sour taste out of my mouth by taking the hottest of hot bats, Jackie Bradley, with my first reserve pick. I liked him a lot coming into the season – the straight output forecast gave him a .274 eqa and 3.0 WARP, which was the 3rd-best total amongst Boston outfielders (Victorino and Ellsbury were higher). I certainly don’t like him less with his spring. Found my way to taking Hyunjin Ryu, another repeat, next, but I love what I get from the Korean numbers. And with the last pick, I went with a guy whose spring numbers are even better than Jackie Bradley’s, and in an organization where there is a lot less talent blocking his path to the majors – Christian Yelich of the Marlins.

And so to bed.

 

As so often happens, the rollout didn’t go quite as expected.

Spring data didn’t actually start loading until Monday, but should be on all player cards now. (Just the 2013 spring data).

The pitchers forecast pages were supposed to become analogous to the hitter pages – primarily through the addition of their top 20 comps. But it looks like that change didn’t make it all the way through the list – it seems to cut off somewhere between CC Sabathia and Jeff Samardzija. And there seem to be some odd problems – Trevor Bauer’s comp list has reasonable personnel but strange numbers, which I think might relate to having less than three years of work; I think innings are getting divided by 3, but runs are only getting divided by 2.

But in Miguel Gonzalez’s case, I find it very hard to believe that comp list belongs with that starting line – as though two jobs were running at once, and the other guy’s comp list made it in. Let me dig into those issues.

Update – Bauer problem solved, and yes, the less than three years bit had a lot to do with it.

 

 

 

Just as I’m getting ready to throw in the towel, I finally track down the bugs in the pitching projections that have been driving me up a wall for more than a month. And damned if it isn’t always the stupidest, piddlingest crap that is the hardest to find – I’d managed to switch the indices between the input league and the output league.

The way the program is supposed to work, the real stats are compared to the real league, and then adjusted to the new league. Somehow, don’t know when or how, the part that was supposed to move the stats from translation space back to the expected 2013 reality was going off on a weird angle.
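For anyone who wants to picture how small that kind of bug is, here is a minimal sketch, not the actual program. It assumes a translation is nothing more than "rate relative to the league it came from, rescaled to the league it is going to"; the function name, the two-element league list and the strikeout rates are all made up for illustration.

```python
def translate_rate(player_rate, league_rates, input_idx, output_idx):
    """Express a rate relative to the input league, then rescale it to the output league."""
    relative = player_rate / league_rates[input_idx]
    return relative * league_rates[output_idx]


# Hypothetical league-average strikeout rates (per batter faced), illustration only.
# Index 0 = the league the stats came from, index 1 = the league being projected into.
league_k_rates = [0.15, 0.20]

# Intended call: compare to league 0, express in league 1.
correct = translate_rate(0.18, league_k_rates, input_idx=0, output_idx=1)

# The same call with the two indices switched goes the wrong direction entirely.
swapped = translate_rate(0.18, league_k_rates, input_idx=1, output_idx=0)

print(round(correct, 3), round(swapped, 3))
```

Both calls run without complaint and both return plausible-looking rates, which is a big part of why an index swap like this can hide for weeks.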

And yes, that was holding me up from writing. I couldn’t face up to an audience without having a proper product (even if the product, with all its faults, was going out already.)

So start by checking out the Projected Standings tab. That’s the culmination of this winter’s work, as it stands now…early in the spring, with a lot of decisions yet to be settled. I would have liked to have had this done two months (and two of my own drafts) ago, but life doesn’t always move at our preferred pace.

AL East Won Lost Runs Runs A
TOR 87 75 778 716
NYY 86 76 735 689
TBY 85 77 662 624
BOS 84 78 789 754
BAL 76 86 687 732

Just to be clear about what goes into the projections: all of the performance rates are based on the computer output. No manual intervention allowed. The computer also sets the upper limit for playing time. For a clear starter, I’ll stay within that limit, with a few exceptions (Stephen Strasburg, for instance, I bounced a good 30 IP above the computer, which is still a conservative 180 innings). Deciding on who’s the starter – and how long they’ll hold it, and who the likely replacements are – that’s all me.

And when I put all of those things together, I get an absolute free for all in the AL East, with four out of five teams falling within an 84-87 win range. Looking at the Playoff Chances tab, which runs these projections a million times, they are all in the 37-50% range for making the playoffs. The Orioles are well back, around 10% – as much as I don’t want it to be so, their performance last year was built on so much good luck that a repeat seems impossible.

Going back to the ‘Projected Standings’, if you click on the TOR tab you’ll get my full projection for the team. The team stats are sortable; just click the header line. Each player should be a link; click there and you should get a full page for the player, with stats  for his full career, right through whatever he’s done in spring training so far this year. (I think – I’ve been able to run the ST stats manually, but they haven’t gone through an automated computer run yet). Everything is still subject to change, and I welcome your suggestions on what to add. And write on – topic selection has always been a weakness for me.

And so, to bed.

 

No sooner do I say I am back than I come down with an inflammation of my stomach that’s wiped me out for a week.

I estimate that my gas generation this past week has vaulted me into a top 10 source for climate change.

 

Not that I ever really left. Kept updating everything behind the scenes, just didn’t say anything about doing it. I guess I hit a point where I felt like I was saying the same things over and over again, and wasn’t coming up with anything new. So I walked away for a while, maintaining things as much as a reference for myself and inviting anyone who wants to use it as well.

My personal baseball season gets under way tonight, with my first draft of the year…which I’ll follow with an auction for another league this coming weekend.

I’m still not sure I have anything new to say, but maybe I’m over letting that bother me. At least temporarily.

 

I’m in Phoenix this weekend, doing an interview with NHK television, primarily talking about the prospects for Japanese players in the US…which this year means lots of Yu Darvish. If you’ve seen the projections I’ve made, then you already know I’m very bullish on him. Simply put, he has the best statistics of any pitcher we’ve seen leave Japan – better than Matsuzaka’s – and, yes, I know about and have accounted for the extreme drop in Japanese offense last year. The only negative thing I have to say concerns his workload, where he exceeded 31 batters per start last year. That would normally be a very large concern, but let me make two points. One, that extreme offensive decline meant that a higher than usual number of these batters should have come with the bases empty – so pitching from the windup, and less likely arm strain. Two, every scouting report glows over his smooth, repeatable mechanics, which we also think leads to a higher pitch capacity.

I’ve been doing these interviews with NHK for about eight years now, we think since Hideki Matsui first signed with the Yankees. They are always a lot of fun to do; they feed me really open-ended questions (So, Norichika Aoki?) and just let me go, and I’ll rattle off whatever I can think of, with digressions into whatever sabermetric points I think have relevance to the case or explanations of my process or…really whatever. We did the interview outside, at the Dodgers’ training camp offices, with Clayton Kershaw throwing warmup pitches on a mound right behind me, and various players staring out windows right at us while running on the treadmill inside. Pleasant as I make it sound, the conditions were actually kind of nasty – there was a 25-30 mph breeze blowing in with the occasional grain of sand or dust, and temps only in the low 50s. We had to seek a more sheltered location because the sound man couldn’t hear anything but wind through our mics.

And in the middle of the interview, I made a point about the ongoing discussion about expanding to a second wild card team this year – a decision that was officially announced while I was being interviewed. It took a little while to sort through the logic changes that come with having two wild cards, but the post-season odds calculator is now running that way. The biggest thing to like about it, for me, is re-introducing a real race for something like the expected Yankee-Red Sox collision – it really matters who wins the division and who gets the wild card, beyond just a one-game home advantage.

 

 

 

I sent the latest update in this afternoon. The most significant change from the programming side is a bugfix that was giving pitchers too many innings, by about 3%. So pitchers who did have 200 innings got moved down to about 194, without any of their hit/run/walk/strikeout numbers changing (those numbers had already been set by the program; literally the last step is take all the batters faced that are left over, and convert those to outs, and that’s where the bug was.) That had the side-effect of requiring me to pump about 30 innings back from the starters to the bullpens, with some slight rearrangement for teams who were stronger in one than the other.
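To make the arithmetic concrete, here is a rough sketch of that last step. It is a simplification: it assumes every batter who did not get a hit or a walk counts as exactly one out, which ignores double plays, hit batsmen, errors and so on. The function name and the sample stat line are invented for the example.

```python
def innings_from_batters_faced(batters_faced, hits, walks):
    """Convert the leftover batters faced into outs, then into innings.

    Simplification: each batter who neither hit nor walked is one out.
    """
    outs = batters_faced - hits - walks
    return outs / 3.0


# Made-up stat line: 830 batters faced, 190 hits, 58 walks leaves 582 outs.
ip = innings_from_batters_faced(batters_faced=830, hits=190, walks=58)

# A pitcher who had been carrying a 3% inflated total of 200 innings
# comes back down to roughly 194 once the inflation is removed.
corrected = 200 / 1.03

print(ip, round(corrected, 1))
```

Since hits, walks and strikeouts are fixed before this step runs, only the innings total moves when the bug is fixed, which matches what happened to the pitcher lines.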

There were also quite a few changes to the hitters, but not from the program. As I was preparing for my first draft (tomorrow!) I came to the realization that I was just being too conservative with the PA I was giving to the top, no doubt they’re starting as long as they’re healthy, types of players. Those players got their PA set by my eyeball rather than the computer-derived number, with a strong favoritism towards the median value for the past five years.

I also blew the dust off the playoff odds routine and ran the current projections through it, producing the results you’ll find here. That chart uses the basic projections to set the performance for each team, and then plays the season a million different times using the actual schedule. A key feature of this model is that it does not treat the Red Sox, for example, as a definite .577 team, even though that’s what their latest projection says. That .577 rating is an estimate of the Red Sox value; it may be higher (key players may do better than expected), although there are a lot more ways it could end up worse (injuries, in addition to just randomly worse performances). This model creates a spread of values used for each “year” run in the model – perhaps .578 in year 1, then .534 in year 2, .601 in year 3, etc etc up to a million. The spread is a curve that has a median value of .577 (meaning that there are just as many scores below .577 as above it) – but it has a longer tail to the low side, so the mean value will be a little below .577.
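Here is a toy version of that idea, a sketch under loud assumptions rather than the real routine. It uses a two-piece normal curve (wider below the median than above it) as a stand-in for the actual spread, it plays 162 independent games at the drawn winning percentage instead of using the real schedule and opponents, and the two spread widths, the seed and the 2,000 seasons are arbitrary numbers picked for illustration.

```python
import random
import statistics


def draw_strength(rng, median, sd_low, sd_high):
    """Draw one season's true strength from a two-piece normal curve.

    Half the draws fall below the median and half above, but the low side
    is stretched by sd_low > sd_high, giving the longer low tail.
    """
    z = rng.gauss(0.0, 1.0)
    return median + z * (sd_low if z < 0 else sd_high)


def simulate_wins(rng, strength, games=162):
    """Play one season as independent games won with probability `strength`."""
    return sum(rng.random() < strength for _ in range(games))


def run_model(median=0.577, sd_low=0.040, sd_high=0.025, seasons=2000, seed=1):
    """Draw a strength for each simulated season, then play that season out."""
    rng = random.Random(seed)
    strengths = [draw_strength(rng, median, sd_low, sd_high) for _ in range(seasons)]
    wins = [simulate_wins(rng, s) for s in strengths]
    return strengths, wins


strengths, wins = run_model()
print("median strength:", round(statistics.median(strengths), 3))
print("mean strength:  ", round(statistics.mean(strengths), 3))
print("mean wins:      ", round(statistics.mean(wins), 1))
```

The point of drawing a new strength for every simulated season is that the spread in win totals comes out wider than you would get from flipping a .577 coin 162 times, and a little lopsided toward the bad outcomes.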

Unfortunately, the American League looks rather dull – my models clearly and unambiguously favor the Red Sox, Yankees, Tigers, and Rangers to make the playoffs, with only weak challenges led by the Angels and Rays. Fortunately, the National League looks to be almost totally up for grabs. The Phillies and Marlins provide a strong East race (the Nationals break .500, but there’s too much power in the division for them to contend). The Cardinals, Reds, and Brewers have been repeatedly swapping positions through my various updates, so it is no surprise that they all have solid chances to win. And the West is even tighter, between the Giants, Diamondbacks, and Rockies – I would say that is officially too close to call.

And I have to note, with some sadness, the passing of Gary Carter. For ten consecutive years, from 1977-86, I have Carter rated (by WARP) as the #1 catcher in the National League; seven of those he was the #1 catcher in both leagues, and in 1982 he was the #1 player in the NL (losing the major league title to Robin Yount’s spectacular season). That creates a strongly affirmative answer to the question, “Was he the best player at his position?”, the number 3 item on Bill James’ Keltner List. By one way I ran for comparing players all-time, I rated Carter as the #2 catcher – only behind Bench. It is ridiculous, beyond ridiculous, that he was not elected on the first ballot.

But Carter’s death holds a special poignancy for me, because, just a few months prior to his initial diagnosis, I had a similar run-in with the medical establishment – a shadow on a CT and overhearing a long and largely unintelligible word that ended with the suffix “-oma”. Mine, fortunately for me, turned out to be of a type that is as passive and benign as Carter’s was aggressive and malignant – but it took a couple of weeks to get the tests done to verify that, and I certainly have not forgotten the fear that came in the meantime. There, but for the grace of God, go I.

 

 

 

I know I haven’t written anything for a week, but I’ve been hard at work. For the last week I’ve been working on making improvements to the forecast algorithm, particularly the pitching side. Through December and January, I was able to incorporate the component scores into the hitter forecasts, and produce an improvement over the whole stat-line approach I had been using. I’ve been trying to do the same thing for pitchers, and just this morning cracked the ‘prior performance’ barrier. While I’m still working on the improvements, I felt good enough about them to incorporate them into a new model run. While the changes were dramatic for some pitchers, the effects on teams weren’t so large – the new method does not shake up the standings. But new standings, new depth charts, and new projections are on-line.

Like the old system, the projection is based on a Marcel-like baseline. Where it differs from Marcel is that the different statistics have different weighted averages, and I use the translated data throughout the process. Strikeouts are very heavily weighted towards the most recent season – roughly a 5-2-1, rounding off, with a small (~15%) regression to mean component. Walks and groundball rates are also highly slanted, though not as much as Ks. At the other extreme, hit rates have essentially no weight for seasons (1.2 – 1.1 – 1), and an 85% regression to mean, which is why stats like FIP work. Once the baselines are calculated for everybody, I go through a similar-player search, and then see how those similar players deviated from their baselines in the following year(s), and apply those deviations to the players. Once all this is done, I run the player through the translation routine backwards to get his stats back into an expected-2012 performance baseline.
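As a rough illustration of the baseline step only (not the similar-player search or the back-translation), here is a sketch that takes three seasons of a rate, weights them, and pulls the result toward the league mean. It assumes equal playing time in each season, which real Marcel-style weighting would not, and the sample rates and league means are invented. The weights and regression amounts are the rounded ones quoted above.

```python
def baseline(rates, weights, regress, league_mean):
    """Weighted average of seasonal rates, regressed toward the league mean.

    `rates` and `weights` are ordered most recent season first.
    Assumes equal playing time in every season (a simplification).
    """
    weighted = sum(r * w for r, w in zip(rates, weights)) / sum(weights)
    return (1 - regress) * weighted + regress * league_mean


# Strikeout rate: heavy weight on the latest season, light regression.
k_baseline = baseline(
    rates=[0.25, 0.20, 0.18],   # hypothetical K per batter faced
    weights=[5, 2, 1],
    regress=0.15,
    league_mean=0.18,           # hypothetical league mean
)

# Hit rate on balls in play: nearly flat weights, heavy regression.
hit_baseline = baseline(
    rates=[0.33, 0.29, 0.30],   # hypothetical hits per ball in play
    weights=[1.2, 1.1, 1.0],
    regress=0.85,
    league_mean=0.30,           # hypothetical league mean
)

print(round(k_baseline, 4), round(hit_baseline, 4))
```

With these inputs the strikeout baseline stays close to the most recent season, while the hit rate lands almost on the league mean no matter what the pitcher did, which is the contrast the two sets of weights are meant to produce.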

I’m testing the new projection system against the set of all pitchers who had 50 major league innings in 2011, who pitched for only one team in 2011, and who had a major league appearance in 2010. My note says that is 437 pitchers. I’m only looking at five top-level stats for judgment – hits, walks, strikeouts, home runs, and runs allowed. The projection is normalized to the actual innings pitched in 2011, and I just tabulate the resulting errors.
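For anyone who wants to reproduce this kind of scoring on their own projections, the calculation is short. This is a sketch of the general approach as described, with three made-up pitchers standing in for the 437; the helper names are mine.

```python
import math


def scale_to_innings(projected_stat, projected_ip, actual_ip):
    """Rescale a projected counting stat to the innings the pitcher actually threw."""
    return projected_stat * actual_ip / projected_ip


def rmse(projected, actual):
    """Root-mean-square error between two equal-length lists."""
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up sample: (projected SO, projected IP, actual IP, actual SO)
pitchers = [
    (180, 200, 150, 140),
    (60, 70, 70, 57),
    (100, 120, 180, 146),
]

projected_so = [scale_to_innings(so, p_ip, a_ip) for so, p_ip, a_ip, _ in pitchers]
actual_so = [actual for _, _, _, actual in pitchers]

so_error = rmse(projected_so, actual_so)
print(round(so_error, 2))
```

Normalizing to actual innings first means the score measures how well the rates were projected, without also penalizing a miss on playing time.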

Here’s the root-mean-square error you get from just using the player’s 2008-10 (major league) stats as your 2011 projection:

Hits 14.03     HR  4.09     BB  8.74     SO  12.72     R  11.48       Sum= 51.06

Same thing, but using his translated stats for 2008-10 as the projection:

Hits 13.76     HR  3.57     BB  8.01     SO  14.26     R  10.66      Sum= 50.26

Lower is better, so this gives us the not terribly surprising result that using reasonably adjusted minor league data in addition to major league data is better than major league data alone. Incidentally, if I use the luck-free runs allowed instead of actual runs – that would be calculated runs, using a normal number of H/BIP and HR/FB – the run error would drop almost a run, to 9.85.

Here’s the results of the program I’d been using to produce projections for the past two months:

Hits 11.66     HR  3.27     BB  7.62     SO  13.03     R  9.88      Sum= 45.46

And here’s the results I’m getting from the new version, as of 11:00 PM Sunday night:

Hits 11.48     HR  3.33     BB  7.59     SO  13.16     R  9.47      Sum= 45.03

I’m more than a little annoyed at seeing the strikeout numbers trend backwards; on the other hand, the improvements everywhere else suggest that I’ve got a blind spot  – a hole in my swing, as it were – probably a calculation error that should lead to a nifty improvement once I track it down.

In case you were wondering about over-fitting, I am also checking the routines against 2009 and 2010 pitchers, who are not part of the test set. The improvements there are about 3/4 the size of the 2011 improvements, which suggests some mild overfitting, but not enough for me to be worked up over. At least not yet.

 

 

 

 

The Nationals added Edwin Jackson to their rotation, which gave me a good reason to run another update.

He’s a no-doubt rotation member, and actually goes straight to #1 on my rankings for them if you rank by innings. He, Gio Gonzalez, Stephen Strasburg, and Jordan Zimmermann make a pretty strong front four – perhaps not quite a match for the Phillies, but very strong. It also puts them in a position where at least one of Ross Detwiler or John Lannan goes underutilized (I have no idea where Chien-Ming Wang fits in, but I haven’t been able to run any projections that like him for 2012).
He pushes their projected record up a notch to 80-82. Their projected run totals – 4th best in the majors in the pitching, 6th-worst in batting – point to a seriously unbalanced team. Their offense has fallen into the “OK trap” – with the exception of Ryan Zimmerman at third, everybody on the roster looks to be an OK player for the position, neither good enough to help the team towards a title nor bad enough that an easy upgrade is available. Even the deservedly-maligned Roger Bernadina projects as an average major league hitter. You’re not going to do any better with a Marlon Byrd or Vernon Wells, who strike me as the kind of guys you might be able to get for a John Lannan.  And no, I don’t think Bryce Harper is a savior for this season – the K rates he’s shown make him a bad risk for jumping straight to the majors.

Clay Davenport’s Projections for 2012

generated on 2-2-2012

AL East Won Lost Runs Runs A
BOS 95 67 807 675
NYY 91 71 797 697
TOR 84 78 767 743
TBY 83 79 708 691
BAL 74 88 693 759
AL Cent Won Lost Runs Runs A
DET 89 73 771 696
CWS 79 83 697 717
CLE 78 84 699 728
MIN 74 88 679 742
KCR 71 91 681 780
AL West Won Lost Runs Runs A
TEX 96 66 818 675
LAA 86 76 719 675
OAK 76 86 673 724
SEA 69 93 609 712
NL East Won Lost Runs Runs A
PHI 90 72 695 621
MIA 88 74 712 650
ATL 84 78 666 641
WAS 80 82 642 649
NYM 77 85 648 688
NL Cent Won Lost Runs Runs A
CIN 87 75 722 673
MIL 85 77 686 655
STL 84 78 691 662
PIT 72 90 660 744
CHC 68 94 616 729
HOU 65 97 587 723
NL West Won Lost Runs Runs A
ARI 86 76 724 676
COL 86 76 761 718
SFG 81 81 639 639
LAD 78 84 632 660
SDP 75 87 611 666
 