NFL Prophet

About

Welcome to NFL Prophet

@author jcpoir
(NEW) View past prediction accuracy HERE.
This site is the endpoint of a cloud-based data pipeline that uses machine learning and Monte Carlo simulation to predict future NFL game outcomes and player statistics. Navigate to see picks for this week's games, end-of-season predictions, and fantasy football projections. Data are sourced from ESPN's NFL and Fantasy Football APIs.
To convert 300K+ rows of raw play-by-play data into actionable insights, I have developed a pipeline that (1) queries and cleans ESPN API data, (2) generalizes plays into smoothed probability density functions, and (3) uses Monte Carlo simulation to estimate future performance. Predictions are made using a naive Bayesian approach, meaning that the effects of individual factors (such as field position, time remaining, and injuries) on single-play outcomes are largely assumed to be independent. Note that all predictions are fully automated and do not represent my personal opinion.
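
The independence assumption can be illustrated with a minimal sketch. This is not the site's actual model: the outcome labels, the factor values, and every probability below are hypothetical numbers chosen for illustration. The idea is that each factor contributes its own likelihood term, and the terms are simply multiplied together.

```python
from math import prod

# Hypothetical prior over single-play outcomes (illustrative numbers only).
PRIOR = {"run": 0.40, "pass_complete": 0.38, "pass_incomplete": 0.22}

# Hypothetical likelihoods P(factor value | outcome), one table per factor.
LIKELIHOOD = {
    "field_position": {
        "red_zone": {"run": 0.30, "pass_complete": 0.15, "pass_incomplete": 0.20},
        "midfield": {"run": 0.70, "pass_complete": 0.85, "pass_incomplete": 0.80},
    },
    "time_remaining": {
        "under_2_min": {"run": 0.10, "pass_complete": 0.25, "pass_incomplete": 0.30},
        "over_2_min": {"run": 0.90, "pass_complete": 0.75, "pass_incomplete": 0.70},
    },
}


def outcome_posterior(observed):
    """Return P(outcome | observed factors), treating factors as independent."""
    scores = {
        outcome: prior * prod(
            LIKELIHOOD[factor][value][outcome] for factor, value in observed.items()
        )
        for outcome, prior in PRIOR.items()
    }
    total = sum(scores.values())  # normalize so the posterior sums to 1
    return {outcome: score / total for outcome, score in scores.items()}


posterior = outcome_posterior(
    {"field_position": "red_zone", "time_remaining": "under_2_min"}
)
for outcome, probability in posterior.items():
    print(f"{outcome}: {probability:.3f}")
```

Because each factor enters as a separate multiplicative term, adding a new factor (injuries, for instance) would only require one more likelihood table, at the cost of ignoring any interaction between factors.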

Getting Started

• How likely are the Cincinnati Bengals to make the playoffs? >> HERE
• How many rush attempts should we expect from Aaron Jones this week? >> HERE
• How likely is Lamar Jackson to throw two or more interceptions? >> HERE
• Which teams are the strongest picks to win this week? >> HERE

Reading the Swarm Plots

To illustrate how randomized game simulations are used to make predictions, I've employed a kind of interactive chart called a swarm plot. Each circle within these plots represents an individual simulated game; only the first 1,000 of the 10,000 total simulations are displayed. Rendering all 10,000 in one chart is impractical due to computational constraints, but the full set of simulations is used to produce the probability estimates.
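
The relationship between the displayed circles and the full simulation set can be sketched as follows. This is a toy model under stated assumptions, not the site's pipeline: the observed yardage sample, the attempt count, the completion rate, and the smoothing bandwidth are all hypothetical. Smoothing is approximated here by picking an observed value at random and adding Gaussian noise, which is equivalent to sampling from a Gaussian kernel density estimate.

```python
import random

N_SIMULATIONS = 10_000  # full set used for estimates (figure from the text)
N_DISPLAYED = 1_000     # first slice drawn as circles in the swarm plot

# Hypothetical sample of yards gained on completed passes.
OBSERVED_YARDS = [3, 5, 6, 7, 8, 9, 11, 12, 14, 18, 22, 35]


def sample_smoothed_yards(rng, bandwidth=2.0):
    """Draw one value from a Gaussian-kernel smoothing of OBSERVED_YARDS."""
    return rng.choice(OBSERVED_YARDS) + rng.gauss(0.0, bandwidth)


def simulate_passing_yards(rng, attempts=35, completion_rate=0.65):
    """Simulate one game's passing yards, treating each attempt as independent."""
    total = 0.0
    for _ in range(attempts):
        if rng.random() < completion_rate:
            total += sample_smoothed_yards(rng)
    return total


rng = random.Random(42)  # fixed seed so the sketch is reproducible
simulations = [simulate_passing_yards(rng) for _ in range(N_SIMULATIONS)]

displayed = simulations[:N_DISPLAYED]              # what the chart would render
sample_mean = sum(simulations) / len(simulations)  # computed from all 10,000

print(f"circles drawn: {len(displayed)}")
print(f"mean over all simulations: {sample_mean:.1f} yards")
```

Plotting only a slice keeps the chart responsive, while the estimates retain the lower sampling error of the full set.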


Fig 1a. A Sample Matchup Swarm Plot

Fig 1b. A Sample Player Swarm Plot

For each statistical category, the fixed vertical line marks the sample mean of the full dataset (10,000 simulations). To find the probability of reaching a statistical threshold (e.g. passing yards > 300), select the relevant stat from the blue dropdown and move your mouse to that point along the horizontal axis. The percentage values above the axis give the probability of the statistic falling above or below the chosen threshold.
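
The percentages shown above the axis can be thought of as simple proportions of simulated games on either side of the threshold. The sketch below assumes that interpretation; the function name `threshold_split` and the normally distributed stand-in data are hypothetical and are not taken from the site.

```python
import random
from statistics import fmean


def threshold_split(simulated_values, threshold):
    """Return (share at or below threshold, share above threshold)."""
    above = sum(1 for value in simulated_values if value > threshold)
    share_above = above / len(simulated_values)
    return 1.0 - share_above, share_above


# Stand-in for 10,000 simulated passing-yard totals (hypothetical distribution).
rng = random.Random(7)
simulated_passing_yards = [rng.gauss(260.0, 60.0) for _ in range(10_000)]

mean_line = fmean(simulated_passing_yards)  # position of the vertical mean line
p_below, p_above = threshold_split(simulated_passing_yards, 300)

print(f"mean: {mean_line:.1f} yards")
print(f"P(passing yards > 300) = {p_above:.1%}")
print(f"P(passing yards <= 300) = {p_below:.1%}")
```

Moving the mouse along the axis corresponds to calling the function again with a different `threshold` value.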