I've been planning to create this model for a while.
With it, you can calculate the probability that the home team will eventually prevail given the state of the match at a particular point in the game.
Each line corresponds to a particular lead enjoyed or deficit faced by the home team. As you move from left to right you're moving through time. The far left of each line corresponds to the point 5 minutes into the first term and the far right corresponds to the point 15 minutes into the final term.
So, for example, when the home team leads by 12 points at half time it can be expected to win about 76% of the time. If it kicks a goal just prior to the break and so leads by 18 points instead, its probability of victory rises to 84%. If, instead, it concedes a goal and so goes into the main break up by only 6 points, its probability of victory stands at just 67%.
The model is based on Brownian motion (no kidding) and the exact fitted equation is:
Prob(Home Team Wins) = A / (1+A), where
A = exp(-0.1569 + 0.0567*L/sqrt(1-T) + 0.5184*sqrt(1-T)),
L = Home Team Lead in points (negative when the home team trails), and
T = Time Elapsed in Game (as a proportion of the total game, so 0<=T<1, since the formula divides by sqrt(1-T))
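To make the equation concrete, here is a minimal Python sketch of it. The function name home_win_probability and the constant names are my own labels, not part of the fitted model, and I'm assuming that a deficit is entered as a negative lead.

```python
import math

# Fitted coefficients exactly as quoted in the equation above.
INTERCEPT = -0.1569
LEAD_COEF = 0.0567
TIME_COEF = 0.5184


def home_win_probability(lead, t):
    """Probability that the home team wins, given its lead (in points,
    negative for a deficit; an assumption) and the fraction t of the
    game that has elapsed."""
    if not 0 <= t < 1:
        # The formula divides by sqrt(1 - t), so t = 1 is excluded here.
        raise ValueError("t must satisfy 0 <= t < 1")
    root = math.sqrt(1 - t)
    z = INTERCEPT + LEAD_COEF * lead / root + TIME_COEF * root
    # A / (1 + A) with A = exp(z), written so that large |z| cannot overflow.
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    a = math.exp(z)
    return a / (1 + a)


# Half-time leads of 6, 12 and 18 points.
for lead in (6, 12, 18):
    print(lead, round(home_win_probability(lead, 0.5), 2))
```

With T = 0.5 this prints 0.67, 0.76 and 0.84, matching the half-time figures quoted earlier for leads of 6, 12 and 18 points.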
I fitted the model using all the data for seasons 2000 to 2009, including finals.
In the table below I've applied the model to the results for the first 10 rounds of this season, creating three predictions for the winner of each game: one based on the score at quarter time, another based on the score at half time, and a third based on the score at three-quarter time.
The cells highlighted are those game situations that have occurred in at least 10 games so far this season.
Overall I'd say the fit is acceptable, particularly at the points where there is more data.
Finally, let's take a look at how well the model fits the data on which it was built.
The table below is based on the 1,850 games that make up the seasons 2000 to 2009. Each cell reflects the results of at least 70 games.
The fit, I'd suggest, is excellent.
One thing that amazes me, which you can see in this table and in the earlier chart, is the huge difference that a single goal makes to the home team's probability of victory: in some cases as much as 10 percentage points. I doubt that in-running wagering markets adjust by an amount this large every time a goal is kicked.
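For anyone wanting to see how the fitted equation itself values a single goal, here's a rough Python sketch. The function names are hypothetical, a goal is taken to be worth 6 points, a deficit is entered as a negative lead, and I'm assuming the three breaks fall at T = 0.25, 0.5 and 0.75, which treats the quarters as equal in length. These are the equation's outputs, not the table's empirical percentages.

```python
import math


def home_win_probability(lead, t):
    """Fitted model: lead in points (negative = deficit), 0 <= t < 1."""
    root = math.sqrt(1 - t)
    z = -0.1569 + 0.0567 * lead / root + 0.5184 * root
    return 1 / (1 + math.exp(-z))


def goal_swing(lead, t, goal=6):
    """Change in home win probability if the home team kicks one goal."""
    return home_win_probability(lead + goal, t) - home_win_probability(lead, t)


# Swing from one home-team goal at each break, for a few starting leads.
breaks = ((0.25, "quarter time"), (0.5, "half time"), (0.75, "three-quarter time"))
for t, label in breaks:
    for lead in (-12, -6, 0, 6, 12):
        print(f"{label:>18}, lead {lead:+3d}: {goal_swing(lead, t):+.3f}")
```

Because the lead is divided by sqrt(1-T) in the equation, the same 6 points move the model's probability by more the later in the game they are scored.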
The model I've created links three variables: probability of home team victory, home team lead or deficit, and time elapsed in the game. In the earlier chart the relationship between home team victory probability and time elapsed is plotted for varying home team leads and deficits. Another way of looking at the model is to plot home team lead or deficit against time elapsed for different probabilities.
Each line on this chart is what I'll call an iso-prob line since it joins situations of equal probability, in much the same way as isobars join geographic locations of equal atmospheric pressure on a meteorological map.
One of the ways you could think about this chart is that it shows the minimum lead that the home team needs at any point in the game in order to be at least an X% chance of winning. Each line on the chart provides the answer to that question for a different value of X.
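The points on an iso-prob line can be found by rearranging the fitted equation to give the lead as a function of probability and time. The sketch below does that in Python. The function name lead_for_probability is my own, a negative result means a deficit, and the breaks are assumed to fall at T = 0.25, 0.5 and 0.75.

```python
import math


def lead_for_probability(p, t):
    """Home team lead (points) at which the fitted model gives probability p
    when a fraction t of the game has elapsed (0 <= t < 1, 0 < p < 1)."""
    root = math.sqrt(1 - t)
    logit = math.log(p / (1 - p))
    # Invert logit = -0.1569 + 0.0567*L/root + 0.5184*root for L.
    return root * (logit + 0.1569 - 0.5184 * root) / 0.0567


# Points along a few iso-prob lines, sampled at the three breaks.
for p in (0.5, 0.75, 0.9):
    leads = [round(lead_for_probability(p, t), 1) for t in (0.25, 0.5, 0.75)]
    print(p, leads)
```

Since the modelled probability rises with the lead, the value returned for a given X% is also the minimum lead the home team needs at that point in the game to be at least an X% chance.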