Using Pythagorean Win Percentage as a prediction tool
I blogged about defense, scoring differential, and the definition of Pythagorean Win Percentage in my first three blog posts of the 2023-2024 basketball season. You can reach each post by clicking on the linked words above.
Some of you may have read these discussions and said, "So what?" Others, who have a keen interest in statistics and their predictive power, devoured these posts. I don't guarantee that I will entertain everyone as we wait for the season to start.
I will be putting out pre-season rankings once the MSHSAA sets up the class assignments for all teams, which typically happens around mid-November. In the meantime, I am working on improving my early rankings with predictive statistics. Once the season starts, ratings will be based on head-to-head games and comparative game results, using a dynamic algorithm. My formulae are set up so that a team's rating for early-season wins can improve if the teams it beat get better. Conversely, a rating can be dragged down if the teams it beat early tank during the season.
But enough about ratings. I want to get back to the statistical mouthful... Pythagorean Win Percentage, or PWP for short. PWP is used by the NBA as a statistic to predict winning percentage based on point differential. The only problem is that it is what I would call a "rearview mirror" statistic: the teams have already played the games, so it accurately describes past results and nothing more.
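For readers who want to see the arithmetic, here is a minimal Python sketch of the general Pythagorean formula. The exponent is an assumption: values around 14 are commonly quoted for the NBA, and the right value for high school basketball may well differ. The function name and the sample point totals are hypothetical, made up purely for illustration.

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 14.0) -> float:
    """Estimate winning percentage from points scored and points allowed.

    General form: PF**x / (PF**x + PA**x).
    The default exponent of 14 is an assumption, not a tuned value.
    """
    if points_for < 0 or points_against < 0:
        raise ValueError("point totals must be non-negative")
    if points_for == 0 and points_against == 0:
        raise ValueError("at least one point total must be positive")
    if points_for == 0:
        return 0.0
    # Algebraically the same as PF**x / (PF**x + PA**x), but written with
    # a ratio so large season totals do not produce huge intermediate numbers.
    ratio = points_against / points_for
    return 1.0 / (1.0 + ratio ** exponent)


# Hypothetical team: scores 65 points per game and allows 55.
print(f"Estimated PWP: {pythagorean_win_pct(65, 55):.3f}")
```

A team that scores exactly as much as it allows comes out at .500, and the estimate climbs quickly as the scoring margin grows.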
What if you could somehow use historical PWP to predict future results? I have found that high school basketball programs are fairly consistent. Consistently good and sometimes consistently bad. Schools with great feeder programs reload with fresh, talented kids as quickly as talented seniors have graduated. Of course, some private schools can "recruit" talent through traditionally strong programs. Kids, and their parents, opt for a private education at schools with talented and successful sports programs. In both cases, the public and private schools are good or great, year after year. At some schools, the culture just isn't there for success, until some coach comes along with a fresh outlook and changes the culture both in the school and in the community. That has happened with Coach Eric Bennaka at Smithville, where they have turned the program around. Some schools like South Iron have had that success culture for years, starting kids in a feeder program in early elementary school.
As a statistician, I have seen this consistency (good and bad) in ratings over the six or seven years that I have conducted Gramps Ratings. I decided to look at PWP over different periods: a three-year average, a two-year average, last year's PWP on its own, and five-year and ten-year averages. I also looked at a five-game PWP for the beginning of last season. I threw out the five-year and ten-year PWP for this discussion.
So, do any of those PWP averages reliably predict a team's winning percentage for last year? The table is quite large, so I will drop it in at the end of the blog. Let's look at the correlation coefficients and scatter plots for three scenarios:
1. Three-year PWP averages - these are the Pythagorean Win Percentage predictions based on the team's point differential, averaged over the 2019-20, 2020-21, and 2021-22 seasons.
2. Two-year PWP averages - these are the Pythagorean Win Percentage predictions based on the team's point differential, averaged over the 2020-21 and 2021-22 seasons.
The two-year PWP appears to explain a little over 80% of the variation in teams' actual winning percentage, a bit better than the three-year PWP. That still leaves 20% unexplained by PWP, but it is better than any other predictive formula I have looked at. You may recall that I have a complex formula with return scoring, VPS values for returning players, and some other factors, which explained about 60% of winning percentage, so I will take 80%. In fact, I found that return scoring by itself explained only about 7% of a team's winning percentage.
4. Five-game PWP averages - these are the Pythagorean Win Percentage predictions based on the team's point differential, averaged over the first five games of the 2022-23 season.
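The comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the team names and every number below are invented, the helper names are hypothetical, and I am assuming that "percent explained" means the squared correlation coefficient (R-squared) between the averaged prior-season PWP and the following season's actual winning percentage.

```python
from math import sqrt
from statistics import mean


def trailing_average(season_pwps, n_seasons):
    """Average the most recent n_seasons values (list is oldest to newest)."""
    return mean(season_pwps[-n_seasons:])


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Invented data: PWP for three prior seasons (oldest first) and the
# actual winning percentage in the season being predicted.
prior_pwp = {
    "Team A": [0.82, 0.88, 0.91],
    "Team B": [0.55, 0.48, 0.60],
    "Team C": [0.20, 0.31, 0.25],
    "Team D": [0.67, 0.72, 0.58],
}
actual_win_pct = {"Team A": 0.86, "Team B": 0.52, "Team C": 0.33, "Team D": 0.70}

teams = sorted(prior_pwp)
for n in (3, 2, 1):
    predicted = [trailing_average(prior_pwp[t], n) for t in teams]
    actual = [actual_win_pct[t] for t in teams]
    r = pearson_r(predicted, actual)
    print(f"{n}-year average PWP: r = {r:.3f}, r squared = {r * r:.3f}")
```

With real data, each window's R-squared can be compared side by side, which is the same comparison the scatter plots make visually.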