After seeing u/wormhole222's post highlighting the fact that, despite the Heat currently being up 2-0, ESPN only gives them a 35% chance of winning their series against the Celtics, I decided to do some investigation, and I think I figured out why this model is so risibly fallacious. I believe that, at the start of the playoffs, they created a set of probability values p for each team winning a single game against each other team, and from that extrapolated performance for the entire playoffs. My evidence is as follows.
At 0-0, they gave the Heat a 3.2% chance of winning their series. If they are deriving this value in the way I believe, then the implied single-game probability p can be found by solving the equation $\sum_{n=4}^{7} \binom{n-1}{n-4}\, p^4 (1-p)^{n-4} = 0.032$, or, written out, the 7th degree polynomial $p^4 + 4p^4(1-p) + 10p^4(1-p)^2 + 20p^4(1-p)^3 = 0.032$. This has one real solution in the interval [0, 1], p ≈ 0.197625, so from this we can assume that ESPN's model gives the Heat around a 20% chance of winning any given game against the Celtics.
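For anyone who wants to reproduce that number, here is a minimal sketch of the solve. It assumes a best-of-7 series and a constant, independent single-game win probability p. The function names are mine and have nothing to do with whatever ESPN actually runs.

```python
from math import comb

def series_win_prob(p: float) -> float:
    """P(winning a best-of-7 from 0-0) with constant single-game win probability p.

    Sums over the series ending in n = 4..7 games: the winner takes game n
    and exactly 3 of the first n - 1 games.
    """
    return sum(comb(n - 1, n - 4) * p**4 * (1 - p) ** (n - 4) for n in range(4, 8))

def solve_single_game_prob(target: float, tol: float = 1e-12) -> float:
    """Bisection on [0, 1]; valid because series_win_prob is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if series_win_prob(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p_hat = solve_single_game_prob(0.032)
print(round(p_hat, 4))  # 0.1976
```

Bisection is enough here because the series win probability only goes up as p goes up, so there is exactly one crossing of 0.032 on [0, 1].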
How can we test this hypothesis to see if it's true? The Heat are now up 2-0 against the Celtics, and ESPN now gives them a 35% chance of winning the series. We can calculate the probability of a team winning two games before its opponent wins four, given a constant probability p of it winning a single game. The formula I used for this is $1 - \left(3(1-p)^4 p + (1-p)^4\right)$, and sure enough, when you plug in p = 0.2, this expression evaluates to 0.344, a 34.4% chance of winning, very close to ESPN's assigned value and within the margin of error for their reported significant figures.
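As a sanity check on the 2-0 calculation, here is a minimal sketch that computes the same quantity by plain recursion over the remaining games, still assuming a constant single-game probability. The name `win_prob_from` is hypothetical, not anything from ESPN. One thing it surfaces: there are four orderings in which the Heat win exactly one of the next four games and then lose the fifth, so the exact figure at p = 0.2 comes out to about 26.3%, while the expression as written above (with a coefficient of 3) evaluates to about 34.5%.

```python
from functools import lru_cache

def win_prob_from(p: float, need: int, opp_need: int) -> float:
    """P(team gets `need` more wins before the opponent gets `opp_need`).

    Assumes every game is independent with the same win probability p.
    """
    @lru_cache(maxsize=None)
    def go(a: int, b: int) -> float:
        if a == 0:          # team has clinched the series
            return 1.0
        if b == 0:          # opponent has clinched the series
            return 0.0
        # Next game: team wins with probability p, otherwise the opponent wins.
        return p * go(a - 1, b) + (1 - p) * go(a, b - 1)

    return go(need, opp_need)

p = 0.2
exact = win_prob_from(p, 2, 4)                      # up 2-0: need 2, opponent needs 4
as_posted = 1 - (3 * (1 - p) ** 4 * p + (1 - p) ** 4)
print(round(exact, 3), round(as_posted, 3))  # 0.263 0.345
```

The same function handles any series score, for example `win_prob_from(p, 3, 4)` for a team up 1-0.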
In effect, ESPN has taken the laziest approach to statistical modeling and run with it. This approach can work for simplistic systems and would be fine if basketball tournaments were a fixed Markov process, but they are not. They are not updating the probabilities of individual game victories based on teams' recent performances, especially in head-to-head matchups. A more complex analysis would also factor in game location, as the NBA has a demonstrable home court advantage, and could also draw on historical series results given the current record to try to capture the concept of momentum.
As it is, in the same way their reporting, talking heads, and game broadcasts are often subpar, so too is their statistical modeling, and it shouldn't be used for any realistic assessment of a team's playoff odds.
EDIT: Woke up this morning, saw this had blown up, and decided to spend more than the 5 minutes this initial shitpost took to gather additional data, particularly as I was spurred on by a few commenters pointing out that my results were not consistent with the odds given after game 1. I no longer believe that ESPN is modeling their playoff odds as a fixed Markov process, and I chalk my original conclusion up to a remarkable coincidence in the data.
That being said, I still fervently believe they put as much effort into developing things like this as they put into the rest of their product, namely not much. I think that they need something with a higher update velocity to better account for playoff momentum, and I do not believe they are giving historical outcomes of teams up 2-0 in a series enough weight.