Chase Utley: One of the Best Ever

If you are reading this, there is a remarkably decent chance that you are a somewhat invested baseball fan. As a baseball fan, you probably ...

American League MVP Sleepers

As I talked about in my National League article, there is nothing more fun than scanning through the betting odds for sporting awards. Sometimes you'll see something really silly (like Kyle Schwarber being one of the top favorites for National League MVP), and sometimes you'll find a hidden gem. Steven Kwan was +7500 to win Rookie of the Year last year. Sure, he didn't win, but he finished third, and he probably had something like the 50th best odds. I bet a lot on both him and Julio (along with some on Adley after his midseason breakout) and made a lot of money on Julio. Point is, Kwan at +7500 was an all-time value. Let's look at the current preseason odds for 2023 AL MVP and see if there are any goofy lines or hidden gems.

Jose Trevino +30000

There is a zero percent chance that Jose Trevino wins MVP. Not a 0.1% chance, not a 0.01% chance, a 0% chance. He is a very good player, and he is arguably better than certain players who do have the skillset to break out to an MVP level, but he simply does not have the skillset to have an MVP level season himself. It's crazy that he has better MVP odds than Corbin Carroll.

Danny Jansen +30000

Unlike Trevino, Jansen certainly has a path to win an MVP. It would start with a significant injury to Alejandro Kirk that allows Jansen to take over full time catching duties. This is not necessarily a likely outcome, but it is certainly a plausible one. I see Jansen as an offensive threat with a very strong all around skillset and a high ceiling. Is he one of the very best catchers in the league? No, but he certainly has the capability for a breakout. He has the defensive chops to at least variance his way into an excellent season, and he would be in the middle of an excellent lineup. Again, these are 300 to 1 odds, and I'm not saying the guy is a superstar. If you're a Blue Jays fan who isn't too fond of Alejandro Kirk for some reason, then this might be a fun little bet to toss a few bucks on.

Steven Kwan +25000

As much as I love Steven Kwan, I think he simply doesn't have the ability to win an MVP. +25000 odds might still be worth it, but it just doesn't seem probable at all to me. Let's put it this way: Steven Kwan has a significantly higher chance of being a Hall of Famer than being an MVP winner.

Riley Greene +25000

Riley Greene, on the other hand, seems like excellent value at +25000. The guy had a solid age 21 season, good enough that it inspires confidence for me without being good enough that it inspires the oddsmakers to move him to the front of the line. The guy is a star center field prospect with all of the skills needed to break out if he puts things together. Surely he would win once every 250 years.

Randy Arozarena +15000

There was a time that I would be chomping at the bit to throw down money on Arozarena. He certainly has the raw ability to become a true MVP player, but at this point, the guy is 28 and is just as frustrating as ever. He is still an excellent player, but his defensive and baserunning lapses make it hard to believe that he has a true shot at MVP. Still, once every 150 years doesn't seem too improbable. 

Gunnar Henderson +10000

Gunnar is the AL's version of Corbin Carroll. Truly excellent player that is already established despite still being a rookie. The oddsmakers respect him a bit more than Corbin, but I still think Gunnar easily wins this award once every 100 theoretical years. A superstar breakout season with plus defense at the hot corner for a blossoming Orioles team is exactly what the MVP voters ordered. 

Teoscar Hernandez +7500

I do NOT think this is a good bet. I'm a little confused as to why Teoscar has better odds than Randy when they are extremely similar players, with the exception that Randy is younger and better. 

Jose Altuve +7500

The dude is coming off of one of his best career seasons and has 75 to 1 odds? Yes, please. The playoffs are a concern but c'mon now. I think Altuve is a guy whose quality as a player exceeds his chances of actually winning an MVP at this stage in his career (not unlike Steven Kwan), but 75 to 1 for one of the best players in baseball is too juicy to pass up.

Adley Rutschman +2500

This isn't exactly a steal, but the dude has MVP written all over him. He is one of the best prospects we've seen, had an incredible rookie season, and is poised to follow it up with an exceptional sophomore year. Buster Posey won an MVP very easily, and Adley has a good chance to be on that level. Him winning once every 25 years seems like easy value.

Shohei Ohtani +200

The rest of the guys are solid bets, but Ohtani is the real value here. The dude just put up 9.0 and 9.6 rWAR in back to back seasons, and it just feels like his best is yet to come. His competition isn't nearly as stiff as you would think (Judge obviously took home the award last year, but it's hard to see him performing like that again. No one else is really built like that at the current moment, at least with Trout's massive injury concern.) In my opinion, he should be the odds on favorite every year until we have to ask whether or not he is in his prime. For now, especially with the shift being banned, this should be a no brainer. With the pitch clock and banned shift, Ohtani could easily pitch 150 innings of plus plus ball, steal 50 bases, hit over 300 and hit over 40 home runs. No one is winning the award over him unless they literally break a mainstream record yet again. 

Ranking Hall of Famers

Discussion over Hall of Fame candidacy is always interesting, but I think things often get a bit subjective. For the sake of simplicity, I will not be ranking these Hall of Fame candidates based on their Hall of Fame merit, but rather on their pure merit as baseball players who contribute to winning. I will not be considering factors like playoff performance, general impact on the welfare of the game, etc., despite the fact that I believe that said accomplishments do matter for a Hall of Fame candidacy.

The simplest way to rank players is to look at them at their peak. By peak, I do not mean the season in which the player produced the most. For example, in 2010, Josh Hamilton had a fantastic season. He put up a 175 wRC+ in 133 games and played good corner outfield defense, good for an 8.4 fWAR. Was Josh Hamilton really an 8 win player? Of course not. His fantastic 2010 was bookended by some very nice seasons and one very mediocre season in 2009, which is a strong indication that his 2010 output was a result of overperformance in a small sample. A much more reasonable look at his true talent would be his statistical production from 2007-2012, in which he posted a 135 wRC+ and 25.1 fWAR in 737 games. His regression after the 2012 season is much more easily explained by aging (among other things). I won't define a player's peak with some strict window, like a player's best 7 year stretch or his production from ages 24-30. Instead, I will use context to identify when changes in production are best explained by genuine drop offs or improvements in player talent.
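To put the Hamilton example on the same scale as the "WAR/150" estimates used for each player, here is a minimal sketch that scales a WAR total to a 150-game pace. It assumes WAR/150 is plain proportional scaling by games played; the function name is made up for illustration.

```python
def war_per_150(war: float, games: int) -> float:
    """Scale a WAR total to a 150-game pace (simple proportional scaling)."""
    return war / games * 150


# Figures quoted in the paragraph above.
hamilton_2010 = war_per_150(8.4, 133)      # single best season
hamilton_peak = war_per_150(25.1, 737)     # 2007-2012 stretch

print(f"2010 only: {hamilton_2010:.1f} WAR/150")
print(f"2007-2012: {hamilton_peak:.1f} WAR/150")
```

The single season works out to roughly a 9.5 win pace, while the six year stretch is closer to 5.1, which is the gap between "best season" and "true talent" that the paragraph is getting at.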

Tier 1

Alex Rodriguez

"True" WAR/150 Estimate: 8.4

Probably the best prospect of all time, ARod debuted at age 18. He struggled a little in his first few stints in the majors, and was sent down to Tacoma where he absolutely dominated. In classic ARod fashion, he came back to the majors at age 20 and hit a smooth 36 home runs, hitting at 59% above the league average and putting up a 9 win season. ARod really was the perfect baseball player. You can't ask for much more than a shortstop with monumental power, great defense and fantastic control of the plate.

In order to estimate ARod's true peak, I will have to do some mixing and matching. His defense fell off considerably after being traded to the Yankees and moving to third base. Some of this has to do with age, and some has to do with the fact that he wasn't a natural third baseman. I want to evaluate him as a shortstop, so I will just look at his defensive performance pre-trade. His offense is a slightly different story. While he started off with a bang at the plate, he wasn't quite as good in the next 3 seasons. Don't get me wrong, he was still one of the best players in all of baseball, but he wasn't quite that guy. Given a significant spike in walk rate at age 24, a very reasonable breakout age, I would say that his offensive peak began in the year 2000. He had his final MVP season in 2007 before slightly declining in 2008. However, since I don't want to cherry pick too much, I will include his 2008 season when I take a quick snapshot of his production. Combining his defensive and offensive peaks will basically be an evaluation of how good he was around 2002.

From 2000-2008, ARod hit 305/401/591 in 6233 plate appearances. Given average defense and baserunning, this would make him a 6.8 WAR/150 guy. Looking at some hazy total zone numbers and more accurate UZR numbers, he was probably a +5 defender at shortstop. With the positional adjustment at short, let's just say he was a +12 defender. He was also a nice baserunner, probably worth around 3 runs annually. 
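The pieces above can be added up in a quick sketch. The article does not state a runs-to-wins conversion, so the 10 runs per win used here is an assumption (a common rule of thumb), and the function name is made up for illustration.

```python
# Assumption: about 10 runs per win. The article does not state the
# conversion it used, so treat the result as a ballpark figure.
RUNS_PER_WIN = 10.0


def true_war_per_150(bat_only_war: float,
                     defense_runs: float,
                     baserunning_runs: float,
                     runs_per_win: float = RUNS_PER_WIN) -> float:
    """Add defensive and baserunning runs (converted to wins) to a
    WAR/150 figure that assumed average defense and baserunning."""
    return bat_only_war + (defense_runs + baserunning_runs) / runs_per_win


# 6.8 WAR/150 bat, +12 runs of defense (position included), +3 runs on the bases.
arod = true_war_per_150(6.8, 12, 3)
print(f"{arod:.1f} WAR/150")  # 8.3 under the 10 runs per win assumption
```

That lands right around the 8.4 listed at the top of his entry; a slightly lower runs-per-win figure closes the gap.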

Post-Roid Barry Bonds

"True" WAR/150 Estimate: 12.1

When Barry Bonds was consuming a balanced breakfast, no other player since integration touches him. Not Willie Mays, not Mike Trout, no one. From 2001-2004, Bonds walked 30% of the time and struck out just 9% while posting a comical 0.460 isolated power in 2443 plate appearances. This was all good for a 232 wRC+. The dude's offensive output was 132% higher than the league average. Despite his old age, he was an above average corner outfielder (still below average overall on defense) and broke even as a baserunner. Everyone knows just how good Barry Bonds was after he started taking steroids, but it is always fun to revisit. Bonds' performance did slow down after 2004, but he was 40 years old and dealt with a lot of injuries, so I think it is fair to say that his 2001-2004 stretch is very close to how good he truly was. He also had probably the best pure playoff run in MLB history in 2002, slashing 356/581/978 in 17 playoff games and very nearly carrying the Giants to a ring single-handedly.

This will be expanded on. 

Listing Tier 1 MLB Players

The goal here is simple: I have a system that I use to evaluate baseball players. It isn't very well fleshed out, but I have generally figured things out towards the top of the pyramid. Most of the best players in MLB at the moment are tier 2 players: Bryce Harper, Francisco Lindor, Vladimir Guerrero Jr., etc. Aaron Judge was a guy that I was pretty adamant about being tier 2; given his recent performance, that might be subject to change. A tier 2 guy is a bona fide annual MVP contender that could be the best player in MLB if there is ever a lack of a tier 1 guy. Tier 1 guys are rare; if you read my Chase Utley article, think of the guys that I put him next to. Willie Mays, Albert Pujols, Alex Rodriguez, Mike Trout, those kinda guys. Yes, Chase Utley was a tier 1 player. Entering this year, I believed there was an unusual number of tier 1 guys in MLB: Shohei Ohtani, Mike Trout, Fernando Tatis Jr., Ronald Acuna Jr., and Juan Soto.

Given Trout's depressingly concerning injuries, I might have to drop him to tier 2 in the near future. For the moment, he falls to the bottom of tier 1. I can't let him drop any further. Ohtani can stay where he is, because he is awesome. Tatis also stays, as I'm not as concerned about his injury issues. The shoulder(s) are terrifying, but it's not like he has a "rare back condition". His broken wrist was a matter of him being a dumbass off the field, and should be completely fine pretty soon. Heading forward, Tatis is the best pure position player in the league. I believe that wholeheartedly. Ronald Acuna Jr. is a different story. He is still certainly a cream of the crop talent, but the struggles coming off of a torn ACL are enough of a red flag to drop him. Remember, tier 1 is reserved for all time great talent. I'm willing to excuse Acuna's struggles this year, and if he returns to form in the near future, I'm happy to move him back up to tier 1. For now, he slides to the top of tier 2. Juan Soto is interesting. His corner outfielderness combined with mediocre baserunning make it such that he can't have "slumps" like he had for much of the first half this year if he wants to remain in tier 1. He is easily the most magical hitter in recent memory, and I'm hoping that playing in a more competitive environment in Sunny San Diego will boost the "small stuff" (defense/baserunning) while allowing him to access his god tier power more often. It's a little strange that he doesn't hit better than he has (which is crazy, given his 150ish wRC+), but his ceiling as a hitter is unlimited. This leads me to a caveat about my tiers.

Here's the thing about three of those guys: Tatis, Soto, and Acuna had unprecedentedly good starts to their careers, but said starts happened to coincide with this silly little virus causing some issues. The lack of a full 2020 diminishes the historical dominance of their careers, and it sucks. They were all in the zone that year, and a full season of that performance would inspire more confidence. They all followed that up with incredible 2021 seasons, launching them into the historical stratosphere. Acuna might have been the best of the three (eh, maybe not), but he went and fucked up his knee, and now he is certainly the worst. Tatis' injuries have also lessened his accomplishments. Soto is the only individual with a relatively clean bill of health, and he has suffered through confusing power outages despite his obvious raw power and incredible plate control. My point is that I'm doing a little projecting on these guys, and there is a lot of uncertainty. They (outside of Tatis, who is) aren't necessarily better in the moment than a few of the tier 2 guys, but their potential is too good.

Now it is time to go back in time, and identify who the tier 1 guys were. Let us begin. Not all of these guys are tier 1, but rather guys that deserve some discussion.

Mookie Betts (Yes, barely):

Entering 2018, Mookie was a high end tier 2 guy who did all of the small things phenomenally well while providing solid offense. He proceeded to have the most prolific (in terms of WAR) season since Barry Bonds, and followed that up with a slightly disappointing but still elite 2019. His 2020 with the Dodgers basically sealed the deal on his tier 1 status, just before injuries knocked him out of that tier for me. Betts might be making a bid to reenter tier 1 given his incredible performance so far in 2022. Given his injuries and worsening athleticism, it is my belief that we should be patient before adding him back to tier 1. He wasn't exactly a bona fide tier 1 guy when he was in tier 1.

Alex Bregman (No): 

Bregman's power outage is a modern tragedy. He never had great raw power, but he was able to leverage a juicier ball en route to a historic 2018-2019 stretch of dominance. Back to back 8 win seasons is nothing to scoff at. However, given that he might have been a beneficiary of something outside of his control, and that he only lasted for two years before regressing to being just an "excellent" player, I can say with a moderate amount of confidence that he wasn't as good of a baseball player as his production would indicate during the 2018-2019 sample. He deserves recognition for back to back MVP caliber seasons, but he isn't a tier 1 guy.

Aaron Judge (Yes): I've been an Aaron Judge skeptic. His career rate numbers are inflated by the fact that he debuted at age 24, a time in which many players are starting to enter their prime. His 2017 was obviously excellent, but it can be attributed to an unprecedented skillset being poorly handled by opposing pitchers. Pitchers "figured" him out, to an extent, and his performance from the 2017 All-Star break to 2020 was merely fantastic, and not otherworldly. Combine this with a smorgasbord of injuries, and there was no way in which one could confidently say this guy compares to the all time greats. In 2021, his production was on the level of his 2018-2020, but he was a little different. Judge came up as a super powerful, terrifying figure in the box. This hasn't changed. What has changed are the nuances in his hitting ability. Despite still being pretty erratic in 2021, Judge demonstrated some hope. Then, in 2022, he has done nothing short of blow the league out of the water. This will likely be his second straight fully healthy season, and his improved "hitterishness" has allowed him to incessantly terrorize opposing pitchers with his Greek god-like power. Combine that with the fact that he has effectively manned center field, and it's hard to not say that he has ascended to tier 1, at least for the moment. Will he age well? I have no idea. Where will he sign? If he ages well, hopefully not the Yankees. If he doesn't age well, hopefully the Yankees. The guy still has a lot of question marks, but that's more about the future. In the moment, he is a tier 1 player. Still not as good as Tatis, though.

Jose Altuve (No): I love Jose Altuve, and he deserved his 2017 MVP. He is one of the best second basemen of all time, and is currently the best second baseman in the league. However, he just isn't a tier 1 guy. He was a mid 6 win player that has dominated the postseason, he is a first ballot Hall of Famer, and he is one of the best players of his generation. I'm bringing him up because I am going to name every MVP winner that isn't an obvious no.

Cody Bellinger (No): Cody's 2019 campaign was one of the most promising seasons in recent memory. After a great rookie season and a disappointing but solid sophomore campaign, Bellinger had one of the best age 23 seasons that you will ever see. He combined his incredible power with fantastic plate control, murdering pitchers with his violent swing. His plus defense and positional versatility didn't hurt. He followed that up with a weak, but still great looking shortened 2020. At that point in time, I was not particularly concerned for him. Then, just one year later, the dude had a negative WAR. His already noisy swing, which was an asset when it worked, was out of whack. Things have improved this year, but he is not exactly Tony Gwynn in the box. His 2019 was tier 1 worthy, but given his wacky mechanics and lack of supporting seasons, I can't go out and say that he was ever among the best players ever.

Christian Yelich (No): Yelich's mid 2018 breakout was a sight to behold, and he rode that wave until a broken kneecap cut his MVP worthy (yes, he should have won) 2019 short. His breakout was allegedly caused by a minor swing tweak, but I'm always skeptical of these massive power breakouts that fizzle out. Yet again, I can't say that he was ever one of the best players ever.

Buster Posey (Yes): Catchers are a little different.

This article will be finished later (maybe)

In defense of Derek Jeter

Derek Jeter is easy to pick on. His quick ascent to prominence, amplified by playing for the Yankees, made his career a prime target to be picked apart. And a lot of the criticisms are valid. Even I was very amused when his Hall of Fame unanimity fell one vote short of becoming a reality. However, people tend to go way too far when criticizing the Hall of Fame shortstop.*

The first thing that needs to be addressed is his defense. It is well known that Derek Jeter has the "worst" defensive runs saved total of all time. Defensive runs saved only spans back to 2002, so "all time" is a bit of a stretch, but you get the point. The next logical jump from hearing that statistic is as follows: if he has allowed the most runs on defense of anyone ever, then he must be the worst defensive player ever. This is a perfectly reasonable conclusion to draw. That is, it would be a perfectly reasonable conclusion to draw if you were an Irish R&B singer whose knowledge of baseball consists of nothing but the notion that scoring runs is good, and that stopping the other team from scoring is bad. You hear that this player is allowing the other team to score more than anyone else, so naturally, he is the worst defensive player. Again, this is a reasonable conclusion to draw if you know very little about the game of baseball. If you're a fan, someone who watches baseball somewhat generally, you should at least question this notion a little.

Who would you rather have at shortstop: Derek Jeter, or David Ortiz? Derek Jeter, or Adam Dunn? Derek Jeter, or Vince Wilfork? You get the point. Vince Wilfork was actually an excellent defensive player, but unfortunately for the Jeter haters, he played a different sport. Adam Dunn and Ortiz are two guys whose careers spanned the DRS era, and they both had a higher DRS than Jeter, but were obviously inferior defenders. A lot of this might have to do with Ortiz being the Red Sox' full-time DH, but Adam Dunn played the outfield for much of his career. The difference is that Dunn played the corner outfield, a significantly less stressful defensive position with a much lower demand for defensive talent. What a lot of people seem to misunderstand about metrics like DRS, UZR, and OAA is that they are position-based. Jeter's -253 career fielding runs (as measured by total zone pre 2002 and DRS post 2002) are relative to the league average shortstop. The league average shortstop is an excellent defensive player. Jeter was a really bad defensive shortstop, but he was not a terrible defensive baseball player. People used to understand this a lot better, but I think that a fundamental misunderstanding of modern day analytics has led to some misrepresentation of the game.

This article is unfinished.

*Around the middle of 2021, I saw someone compare Derek Jeter to Keston Hiura. Keep in mind that this was just some prick on Twitter, but I'm sure other people have similar ideas about him.

Brad Hawpe

Remember Brad Hawpe? If you weren't a devout Rockies fan in the Tulo era, then probably not. But off-brand Marty McFly was a pretty fun little player. He played in a better era of baseball, in which hitters were not blatantly cheating but still had the upper hand over their counterparts on the mound. A late bloomer, Hawpe peaked in his age 26-30 seasons after limited playing time in his first two years. In that stretch, over 585 games and 2338 plate appearances, the sweet swinging lefty hit 288/384/518 for a 902 OPS. Unfortunately, this excellent 902 OPS comes out to just a 124 OPS+ due to the Coors Field park adjustment. Coors Field was actually at its "low" in terms of measurable hitter friendliness around this time. There are a few explanations for this, the main one being the humidor, installed in 2002 to keep baseballs from drying out and carrying as far. However, I would also theorize that this might have just coincided with an era of Rockies pitchers that were effectively tailored to suit Coors Field, but that's just me. Hawpe peaked at the same time that the Rockies peaked as a franchise.

Defensive runs saved was not a fan of his work in right field during this stretch. Per DRS, he cost the Rockies 56 fielding runs from 2006-2009, and that is not even counting the hefty positional adjustment penalty placed on corner outfielders. UZR was not any kinder to him. This defensive atrocity prevented Hawpe from being even a league average player despite his consistently excellent offense.

Here's the thing: Rockies outfielders have often been absolutely atrocious by measures of defensive production. Although some of this can be attributed to the Rockies ultimately being a poorly run organization that doesn't develop talent very well, it is reasonable to assume that the comical Coors Field outfield proportions combined with the altitude make defense at Coors a tricky endeavor. I don't know how much evidence there is to truly support this. The reason one would believe it is that DRS and UZR are fairly simple measures that track simple batted ball data, which isn't the case for Statcast's Outs Above Average measure, which very much incorporates fielder positioning and catch difficulty. I expected the Rockies to rank very poorly in terms of outfield DRS and UZR but decently well in terms of outfield OAA since 2016. Instead, the Rockies are 24th in rPM (plays made, DRS' measure of range), 14th in RngR (UZR's measure of range) and 28th in OAA (Statcast's measure of range, since it doesn't incorporate arm performance anyways). So, unless OAA is even more biased against teams with cavernous outfields, that level of anecdotal evidence doesn't mean much. I could always derive my own OAA using public Statcast data, and then compare the Rockies' home/road splits in that time, but I'm currently working on something else and am too lazy to do so.

Anyways, Brad Hawpe. Maybe he wasn't that bad of a defender. It is worth pointing out that the data has improved since the late 2000s, and DRS especially is more reliable these days. Hawpe was still a smooth ass hitter who walked a lot and peppered the gaps. He also hit a home run in the World Series. Mike Trout has not. Curious.

What's up with Tom? (Brady)

As a devout Tom Brady supporter, it goes without saying that this season has been distressing. The Buccaneers' offense has been an absolute travesty, and Brady looks as bad as he has ever been. Unfortunately, I don't think this is just a scheme issue. Brady is clearly worse; while he is taking fewer sacks than ever, his pocket mobility is clearly deteriorating to the point that he struggles to buy time to push the ball downfield. Furthermore, he is certainly missing a lot more throws than he used to, and his arm, while still good, is no longer the absolute cannon that it once was.

However, not all of this is on him. The Buccaneers' passing offense is actually 11th in total expected points added per Pro Football Reference, which absolutely shocked me when I looked it up. Their issue is an utterly abysmal run game, one that is somehow the worst on a per rush basis despite also having the fewest rushes in the league. This is truly incomprehensible. The run game is ultimately still on the quarterback to some extent, and Brady's inability to open up lanes with the deadly passing ability he used to have is certainly an explanatory variable for all of this nonsense. However, it could also be concluded that a lot of this is more of a scheme/coaching issue.

The "nerds" have often posited that "establishing the run" is not a truly impactful strategy that alters the efficacy of a passing game. I would counter, having not actually read the backup to their claims in quite a while, that although it might not be measurably obvious, an effective run game can certainly open up the passing game. Do I have any proof of this? No. However, it is still true. If Brady does go to the 49ers next season, which seems excitingly likely, I would fully expect him to thrive. Of course, I expected him to do much better in Tampa than he has. Sure, he won a ring, and sure, he should have won an MVP, but I was thinking that the Buccaneers' offense would be unrivaled. Instead, they were simply elite. Some of this has to do with the fact that Tampa Tom simply wasn't as good of a QB as late career Patriots Tom, but a lot had to do with coaching as well. Byron Leftwich, bless his heart, is not a good offensive coordinator in any form.

I'm not really sure what my point is. Brady is leading the league in attempts by a comical margin, and is still mustering some passable efficiency. The Bucs issue, outside of Tom not being the motherfucker he once was, lies more within the 1.5 yards per carry they're getting from their run game. It's weird to say that a team's issue is their run game (it probably still isn't, it's probably still Tom's fault) but it might just be true. I find it interesting that, whenever it is crunch time, the Buccaneers offense is an unstoppable force. Brady has led at least 100 different game tying TD drives late in the 4th (although they were often punctuated with failed two point conversions that were quite necessary) this year, and the offense always looks deadly when it matters. However, it fucking sucks otherwise, and it confuses me to no end. Whatever. Yet another blog to add to my resume. 

Corbin Carroll MVP? Early examination of the odds

Scanning through preseason betting odds is a truly entertaining way for me to spend my time. Juan Soto is the favorite at +600, meaning that he probably has around a 10% chance at the award (the break even rate is 14%, but I'm assuming there is a lot of juice here). Second place is Mookie Betts at +800, and then Fernando Tatis Jr., despite being sidelined until April 20th because the MLB is not based, is third at +1200. I believe that Tatis should probably be the favorite every year until he dies, but I understand why he isn't this year. For one, he won't be playing shortstop; the lack of defensive opportunities diminishes his chances of racking up a really high WAR. For another, he won't be active for all 162 games. This is arguably not a huge deal for a guy like Tatis, who would probably get injured at some point anyways and could use the lessened workload. My concern for him is rust, but still, at +1200 I think that is an easy bet. He is far and away the best player in the National League when he is actually on the field, so he should certainly win over 8% of the time in this economy. This isn't about Tatis, though.

This is about Corbin Carroll. The Diamondbacks rookie is +40000 to win MVP, meaning that he would have to win once every 400 years in a theoretical simulation in order to break even. In comparison, Ryan McMahon of the Colorado Rockies is +25000. So is Nick Madrigal. And Miguel Rojas. I digress.
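For anyone who wants to check the break even math, here is a minimal sketch that converts American odds into the win probability needed to break even. The function name is made up for illustration, and it ignores the book's juice entirely.

```python
def break_even_probability(american_odds: int) -> float:
    """Win probability at which a bet at these American odds breaks even."""
    if american_odds > 0:
        # +600 means a 100 unit stake wins 600 units.
        return 100 / (american_odds + 100)
    # -150 means a 150 unit stake wins 100 units.
    stake = -american_odds
    return stake / (stake + 100)


lines = [("Soto", 600), ("Betts", 800), ("Tatis", 1200), ("Carroll", 40000)]
for name, odds in lines:
    p = break_even_probability(odds)
    print(f"{name:8s} +{odds:<6d} break even {p:.2%} (about 1 win in {1 / p:.0f})")
```

Soto at +600 comes out to about 14.3%, Tatis at +1200 to about 7.7%, and Carroll at +40000 to roughly one win in 401 tries, which is where the "once every 400 years" shorthand comes from.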

Carroll was, in my opinion, last year's top prospect when he was called up. All he did was rake, as he hit something like 260/330/500 with truly excellent defense and baserunning. A lot of his offensive success was due to his fantastic speed, and the batted ball data isn't as promising, but that isn't really too important to me at this stage. The solid production he posted in his limited debut is just a launching pad for a true age 23 breakout that could shock the league.

I'm not saying that Carroll should be one of the favorites to win the award. Obviously there is not a huge record of success for rookies in the award; only two have won it, with one being Fred Lynn in 1975, and the other being Ichiro in 2001. I would point out that the sample is really small, and the voting is way different now than it was back then. A breakout rookie in the olden days was likely to be completely ignored by reporters, but things are obviously changing. Voting is becoming more and more objective as sabermetrics brainwash the average MVP voter. If the voting were done today, I don't see how Mike Trout's excellent 2012 rookie season doesn't take home MVP honors. Even though Aaron Judge didn't really deserve MVP in 2017 due to his atrocious clutch performance, he still probably wins it if the voting is done today.

There are also rookies that very certainly could have won if the context allowed for it. Albert Pujols' 2001 rookie campaign could have definitely taken home some hardware in weaker years, but he had to deal with Barry Bonds and Sammy Sosa at their peaks. The only player at that level in the National League right now for me is Tatis, but he is obviously a massive wild card this year. Obviously guys like Soto and maybe Mookie or Acuna could ascend to that level in a fringe scenario, but the point is that the field is relatively weak at the moment. Corey Seager finished third in MVP voting during his ROY campaign in 2016, although that was a fairly weak field and he wasn't a genuine threat to supplant Kris Bryant for the award. Fernando Tatis Jr., if his fielding was more refined, could have definitely won the 2019 award had he stayed healthy and led the Padres to a wild card spot (this is something of a fantasy world stretch for me, but like, it's certainly conceivable). Carroll is already a super refined fielder, and already has 112 plate appearances under his belt, so he is more of a super rookie than a rookie. He is also 22, which is older for a rookie of his caliber, which should be to his advantage.

Carroll could have a superstar level season on an up and coming team in a relatively weak NL. Oh yeah, I forgot to mention that the pitch clock will allow Carroll to steal as many bases as he wants. Imagine if the exciting rookie hitting .330/.400/.550 while playing elite center field defense also has 100 stolen bases? (Again, this is a bit of a fantasy, and it assumes that the shift ban and the pitch clock significantly boost offense.) That is an MVP right there, a perfect storm. For 400 to 1 odds, that is one of the easiest bets you can make. I currently have $5 on it, and I might throw down just a bit more because of how juicy it is.

Why are NFL teams scoring more?

The National Football League is the most popular professional sports league in the United States and has a strong grip on American culture. Due to its immense popularity, many statistics and other pieces of data have been meticulously collected since its inception. That data has gotten more and more detailed over the years as technology has improved and fan interest has grown. Simultaneously, the game of football itself has, at the highest level, changed immensely over a very short period of time. This is best reflected in the changes in league wide offensive success, also known as the scoring environment. The scoring environment of a league is best defined as the ease with which an average offense can score points. The changes seen in the NFL are often controversial amongst fans. Older fans will lament that their favorite players of yore had to play against tougher defenses, while newer fans will cite misleading raw statistics that favor modern players due to the nature of today’s game. Attempts to quantify the environmental changes in the league are often ignored by the vast majority of fans, but it is something that fascinates me and should fascinate you as well.

To examine these changes, I used an R package called “nflfastR” to quickly pull play by play data for the NFL. The package's “load_pbp” function automatically loads yearly play by play data from the nflverse data repository. The data contains an observation for every single play recorded in the NFL since 1999, with 373 variables for each play, giving me an immense amount of detail if need be. The data was already in tidy format, but it was not necessarily in the shape I needed in order to examine certain things. I performed various transformations on it, grouping by different variables depending on what I wanted to look at.
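The kind of grouping described above might look something like the sketch below. It uses base R on a tiny made-up table so that it runs without a download; the column names (season, posteam, yards_gained) mirror ones found in the nflfastR play by play data, but the rows are invented and this is only an illustration, not necessarily the exact transformation used for the charts.

```r
# With the package installed, the real data would come from something like:
#   pbp <- nflfastR::load_pbp(1999:2022)
# The toy rows below are invented so the sketch runs on its own.
pbp <- data.frame(
  season       = c(1999, 1999, 1999, 2022, 2022, 2022),
  posteam      = c("PHI", "PHI", "DAL", "PHI", "DAL", "DAL"),
  yards_gained = c(3, 7, 2, 9, 4, 11)
)

# Collapse to one row per team-season with the average yards gained per play.
team_seasons <- aggregate(yards_gained ~ season + posteam, data = pbp, FUN = mean)
names(team_seasons)[3] <- "yards_per_play"

print(team_seasons)
```

Swapping the grouping variables (for example, season alone, or season and play type) gives the other views used later on.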

In football, the goal of the team in possession of the ball is to accumulate yards as they advance towards the end zone. If the team reaches the end zone, they are awarded six points, with a point after attempt that could potentially (and very likely) make it seven. If a team is unable to score a touchdown within the number of tries allotted, they can either kick a field goal to earn three points or, if the kick would be too long, punt the ball to the other team. This simple set of rules has been played with by coaches doing all they can to gain a competitive advantage. Over the years, NFL teams have gotten smarter and more effective in the ways they move the ball in an attempt to score points. This is reflected in the significant increase in league wide scoring since 1999.

Teams as a whole have been scoring more points as time has passed, with the exception of the ongoing 2022 season, a season in which offenses have been mysteriously anemic. There are a few explanations, including a rule change allowing defenders to make more contact with offensive players, and stronger defensive strategies. Still, overall, there is a significant upward trend. The next thing I looked at is whether or not this trend is a league-wide phenomenon, or if it is driven by a change in behavior by a certain class of teams.

This graphic demonstrates a quantile time series regression meant to answer this question. Each data point is a single team in a single season. Based on the percentiles used, it looks like there has been a consistent league wide increase in scoring. The best and worst offenses are both improving at similar rates, indicating an environment change that equally impacts all types of teams. The next thing to consider is how certain offensive strategies are impacting the game.
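For a rough sense of what "the best and worst offenses improving at similar rates" means numerically, here is a simplified sketch on invented team-season totals. It is not a true quantile regression (in R that would typically go through something like quantreg::rq); it just takes the 10th, 50th, and 90th percentile of points in each season and fits an ordinary straight line through each, which is an assumption made purely for illustration.

```r
# Invented data: five hypothetical teams per season, each gaining 5 points per year.
team_seasons <- expand.grid(season = 1999:2002,
                            base   = c(250, 300, 350, 400, 450))
team_seasons$points <- team_seasons$base + 5 * (team_seasons$season - 1999)

taus <- c(0.1, 0.5, 0.9)

# For each percentile: compute it within every season, then fit a line over time.
slopes <- sapply(taus, function(tau) {
  q     <- tapply(team_seasons$points, team_seasons$season, quantile, probs = tau)
  years <- as.numeric(names(q))
  unname(coef(lm(as.numeric(q) ~ years))[2])
})
names(slopes) <- paste0("p", taus * 100)

print(slopes)  # points gained per season at each percentile
```

If the slopes come out roughly equal across percentiles, as they do by construction here, the shift is league wide rather than driven by one class of teams.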

The graphic above visualizes the league average effectiveness of pass and run plays. A pass play is defined as a play in which an offensive player, usually the quarterback, drops back with the intention of throwing a forward pass. This includes sacks, plays in which the passer is unable to actually throw a forward pass because a defender tackled them for a loss of yards. These are still considered pass plays, as the team had the intention of executing a forward pass. Rush plays are plays intended to gain yards without throwing a forward pass. They can be handoffs, pitches, or designed quarterback runs. If a quarterback runs on a designed pass play, it is not considered a rush play.

As time has gone on, teams have started running the ball less frequently in favor of putting the ball in the air. This is a fairly intuitive trend to expect, as pass plays average considerably more yards. Football is a complicated sport, and there are nuances to the play types that prevent teams from simply throwing the ball every time, but the general tendency as teams gain more information is to throw the ball more. Despite the increase in passing volume, passing effectiveness has not gone down. This could be due to changes in offensive strategies, rule changes allowing offensive players to get away with more, and other important factors. On the flip side, as teams run the ball less, they are experiencing more success in the run game. An increase in average yardage on both run and pass plays is a clear potential cause of this league wide increase in scoring. However, as seen in the graphic below, league pass rates have plateaued in recent years.
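The play type definitions above can be sketched in a few lines of base R. The rows below are invented, and the pass and sack columns are 0/1 flags modeled on the ones in the nflfastR data (treat the exact names as an assumption). The point is simply that a sack stays in the pass play average even though no pass was thrown.

```r
# Invented plays: one of the 2000 dropbacks is a sack for a 6 yard loss.
plays <- data.frame(
  season       = c(2000, 2000, 2000, 2000, 2020, 2020, 2020, 2020),
  pass         = c(1, 1, 0, 0, 1, 1, 1, 0),
  sack         = c(0, 1, 0, 0, 0, 0, 0, 0),
  yards_gained = c(12, -6, 4, 2, 8, 0, 10, 5)
)
plays$play_type <- ifelse(plays$pass == 1, "pass", "run")

# Average yards per play by season and play type (sacks count as pass plays).
ypp <- aggregate(yards_gained ~ season + play_type, data = plays, FUN = mean)

# Share of plays in each season that were dropbacks.
pass_rate <- tapply(plays$pass, plays$season, mean)

print(ypp)
print(pass_rate)
```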

Similar to league scoring, all kinds of teams have experienced substantial improvements in both their rushing and passing success. The worst, average, and best offenses have all tended to improve over time both on the ground and through the air. This further supports the relationship between the increase in yards per play and the increase in points scored.

The increase in rushing yards per play can be mostly attributed to the fact that teams are running the ball less frequently, per the law of diminishing marginal returns: the carries that teams give up are presumably the least favorable ones, so the runs that remain average more yards. However, if this is the case, then how is passing becoming more effective?

The chart above shows the average depth of target and the average yards after catch. Depth of target is a self evident term that measures how far down the field a receiver is when the quarterback targets him. Yards after the catch is also a self evident term, as it measures the average yards a receiver gains after catching a pass. This data only goes back to 2006, when tracking data became more sophisticated, but it still shows a clear trend. Teams are tending to throw fewer deep passes, and yards after catch has stagnated. If teams are throwing the ball shorter, but not gaining any more yards after the catch, then how are they gaining more yards overall? The key difference is the significant increase in completion percentage, the rate at which a forward pass is caught by its intended receiver. While depth of target has gone down with little increase in yards after catch, quarterbacks are completing considerably more passes, and this is allowing for a significant boost to their yards per attempt.
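The arithmetic behind that last point can be made concrete. Yards per attempt is completion percentage times the average yards gained on a completion (air yards plus yards after catch), so a shorter passing game can still come out ahead if enough extra passes are caught. The numbers below are hypothetical, chosen only to show the direction of the effect, not taken from the data.

```r
# Yards per attempt = completion rate x (air yards + YAC per completion).
yards_per_attempt <- function(comp_pct, air_yards_per_completion, yac_per_completion) {
  comp_pct * (air_yards_per_completion + yac_per_completion)
}

# Hypothetical "deeper but less accurate" passing game.
deep_game  <- yards_per_attempt(0.58, 7.0, 5.0)

# Hypothetical "shorter but more accurate" passing game with the same YAC.
short_game <- yards_per_attempt(0.66, 6.0, 5.0)

print(c(deep = deep_game, short = short_game))
```

In this made-up case the shorter game gains more per attempt even though each completion is worth a yard less.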

Based on the data presented, the league's tendency over time has been a steady boost in offensive performance. It is difficult to determine the specific causes of this boost, as there are a massive number of different factors that influence the game of football. However, I did find two key drivers of the increase in offensive success: higher rates of passing, and higher rates of completed short passes. The higher rate of passing allows for more efficient rushing while also leaning on the more effective play type more often. The higher rate of completed short passes helps keep the average yards per attempt high, allowing teams to move the ball more. There is a lot more nuance that can be examined further, but these are seemingly the two biggest factors driving this offensive boom in today’s league.


NFL Dataverse Repository

The Office: Cringe Comedy Quantified

The competition I analyzed was an August 2018 r/dataisbeautiful contest involving a raw dataset of every line in The Office. The data had the season, episode, scene, quoted character, and a text string of their line, along with a unique id for each line. It looked like this:

Reading the data into R was not particularly difficult, as I just downloaded the Google Sheets file as a CSV and then read the CSV into R. One aspect that was a bit more difficult was creating a new column titled “word count” whose function is self-evident. I accomplished this with the following R code:

library(dplyr)

# Add a word count column: number of non-word runs in each line, plus one.
office_lines <- office_lines %>%
  mutate(words = lengths(gregexpr("\\W+", line_text)) + 1)

which counted the number of word separators and added one to get a final word count. This allowed for, in my opinion, a more specific analysis. This way, a 5 minute Michael Scott rant is valued differently than a disdainful one word line from Stanley. For all of my charts, I used word count instead of line count for this exact reason.
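One thing worth knowing about the separator approach is that punctuation and apostrophes count as separators too, so it runs a little high on lines that end in punctuation. The sketch below compares it with a hypothetical alternative that counts the words themselves (runs of letters, digits, and apostrophes). The example lines are made up, and the alternative is not what was used for the charts.

```r
# Made-up example lines.
example_lines <- c("No.", "Hello world", "That's what she said!")

# Separator approach: runs of non-word characters, plus one.
separator_count <- lengths(gregexpr("\\W+", example_lines)) + 1

# Alternative sketch: count runs of letters, digits, and apostrophes directly.
token_count <- lengths(regmatches(example_lines,
                                  gregexpr("[[:alnum:]']+", example_lines)))

print(data.frame(example_lines, separator_count, token_count))
```

Since the overcount hits every character's lines in roughly the same way, shares of total words should be affected much less than raw counts.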

The first graphic I critiqued was the interactive chart showing how often every character talked. The visualization was just messy, and I didn’t think the interactive part added a ton. It was cool how it showed the most common lines by each character, though. The issue is that the only dots that can really be appreciated are the Michael dots, and they are placed randomly across the circle. Visually, it just doesn’t do much.

I believe my version is superior because it demonstrates the portion of words that were spoken by each character more effectively. I’m not a huge fan of pie charts, but I think the order is much clearer and presents the data a lot better. One issue was that, since I made it interactive by season when you hover your cursor over it, I couldn’t actually label each slice by the individual character. If I had figured that out in time, I would have done it, because it isn’t very visually appealing to have to check the legend if you want to know which character is which.

This chart was easily my favorite concept, but it had a few flaws. I love the fact that it attempts to quantify the impact of each character. Even if there are obvious issues with this approach, it is probably the best that can be done with this data. That said, I didn’t like a few things. For one, it only includes episodes that the character appeared in. This makes no sense! If you had x episodes, and a character was gone for half of them and those episodes were a lot worse, wouldn’t it be a reasonable conclusion to say that that character was very valuable to the show? Zero words spoken is still a number of words that should be factored in. The other issue was displaying the R² value instead of the R value, because R² is never negative and therefore doesn’t show whether or not the trend was negative. This is important in this context because we want to know if the character has a positive or negative effect.
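Both complaints are easy to see on a toy example. The numbers below are invented for a hypothetical character who appears in four episodes and is absent from two weaker ones. Within the episodes they appear in, the correlation happens to be negative, but R² hides that sign; once the absences are counted as zero words, the correlation turns positive.

```r
# Invented data: words spoken by one character and the episode's rating.
words  <- c(400, 350, 380, 360, 0, 0)
rating <- c(8.7, 8.8, 8.6, 8.9, 7.5, 7.4)

appeared <- words > 0

# Only episodes the character appeared in (the original chart's approach).
r_appeared <- cor(words[appeared], rating[appeared])

# All episodes, with absences kept as zero words spoken.
r_all <- cor(words, rating)

print(c(r_appeared = r_appeared, r2_appeared = r_appeared^2, r_all = r_all))
```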

I basically did the same thing as this guy but with the necessary changes outlined. As you can see, the R value for Michael Scott goes way up, because the show got dramatically worse after he left. I also believe that my chart is a little cleaner and nicer to look at.

This was easily the best chart, because it perfectly answers the question that needed to be answered and leaves nothing out. I have one small issue with characters who fell out of the top 10 still getting a continuous line connected to their next top 10 appearance, and Andy’s picture is really stupid. I also don’t like how Andy led the entire show in lines in season 8. That is just tragic. There was not much I could do to improve upon this chart, so I instead made one with a completely different concept.

This is a time series chart showing both the IMDb episode rating and the portion of words spoken by Michael Scott in that episode. I like this concept because it shows two different valuable time series at the same time. Although the two y variables are measured differently and this creates some confusion, they are somewhat related, and plotting them together gives an interesting look at that relationship. However, since they are still on truly different scales, things can be improved.

I think it was Eli who suggested scaling the data so we could see a stronger relationship. I had thought of this, but I didn’t have the mental energy to execute this idea after already being done with my work. However, it was much easier than I thought, and I think it shows a pretty strong relationship between the two variables. The obvious downside here is that we take away a quantitative meaning from each variable, but the upside is that we still get to experience the visual time series effect while also seeing a pretty strong relationship between the two variables. The exception is obviously towards the end of the show, when Michael was gone, because generally speaking, series finales tend to get very high ratings if the show has a passionate following. Outside of this blip, there is a very clear relationship between how often Michael spoke and how highly the episode was regarded.
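For reference, one common way to put two series on the same footing is to convert each to z-scores (subtract the mean, divide by the standard deviation). The sketch below does that on invented numbers with base R's scale(). Whether the chart used exactly this kind of scaling is an assumption here.

```r
# Invented episode data: rating and Michael's share of the words spoken.
rating      <- c(8.2, 8.8, 9.1, 7.4, 7.0)
michael_pct <- c(0.30, 0.38, 0.42, 0.05, 0.00)

# z-score both series so they share a common, unitless axis.
scaled <- data.frame(
  episode   = seq_along(rating),
  rating_z  = as.numeric(scale(rating)),
  michael_z = as.numeric(scale(michael_pct))
)

print(scaled)
```

The tradeoff is the one described above: the axis no longer reads as a rating or a percentage, but the correlation between the two series is unchanged and the lines can be compared directly.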