Wednesday, January 25, 2012

I see dead people. I mean, bell curves.

Those familiar with football (soccer to my North American friends) will know that most matches end with relatively low scores: 1-0, 2-1, 0-0. Usually the stronger team wins as expected.

Sometimes, though, things just go strangely: the scores mount up and up, the weaker team squeezes in impossible goals, the stronger team screws up easy shots. Such matches are rare, and commentators get excited trying to explain why events turned out so weirdly.

So I wondered, do football results have a normal distribution?

A normal distribution means that when you measure some variable, like height or weight, most of the measurements cluster around the middle, with ever-decreasing numbers spread out towards the fringes. Plotted, it gives the familiar bell curve, like the one supplied by Qwfp and Pbroks13 on Wikipedia. Height within a particular age group is a common example: most people are of fairly average, ordinary height, a few are very tall or very short, and tiny numbers are giants or dwarfs.

So, out of curiosity, I looked at every result Manchester United got in the 2010-11 season, in all competitive matches. For each match I subtracted the opposing team's score from United's, giving a goal difference of plus or minus some number of goals. I then plotted how often each goal difference occurred.

A normal distribution! More or less: the goal difference clusters around the centre, with dwindling numbers towards the fringes.
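
For anyone who wants to try the same tally, here is a minimal sketch in Python. The scorelines are made up for illustration and are not United's actual results, and the names `goal_differences` and `tally` are my own inventions rather than anything standard.

```python
from collections import Counter

# Hypothetical (United goals, opponent goals) pairs, NOT the real season data.
results = [(1, 0), (2, 1), (0, 0), (3, 1), (1, 2),
           (2, 0), (1, 1), (3, 2), (0, 1), (4, 1)]

def goal_differences(results):
    """Return United's goals minus the opponent's goals for each match."""
    return [ours - theirs for ours, theirs in results]

def tally(diffs):
    """Count how many matches ended with each goal difference."""
    return Counter(diffs)

diffs = goal_differences(results)
counts = tally(diffs)

# Crude text histogram: one row per goal difference, one '#' per match.
for d in range(min(diffs), max(diffs) + 1):
    print(f"{d:+d} {'#' * counts[d]}")
```

With real results in place of the invented list, the rows of `#` marks give a rough picture of whether the counts bunch in the middle and thin out at the edges.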

Mode = 1. That is, the most common result for a Manchester United match in the 2010-11 season was to win by one goal, be it a 1-0 win, or 2-1, 3-2, etc. United being a strong team, they tend to win.

Median = 1

Mean = 3.78

The mean is higher than the median because, while United were most likely to win by one goal, they also won quite often by two goals (5 times), three goals (5 times) and five goals (3 times), but only rarely lost by more than one goal (by three once, and by five once).
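
The effect of a lopsided tail on the three averages can be seen with Python's standard `statistics` module. This is only a sketch: the goal differences below are invented to have more big wins than big losses, and they are not the season's real figures.

```python
import statistics

# Hypothetical goal differences with a longer tail of wins than losses,
# NOT the real data from the season.
diffs = [-1, 0, 0, 1, 1, 1, 1, 2, 2, 3, 5]

mode = statistics.mode(diffs)      # most frequent goal difference
median = statistics.median(diffs)  # middle value once sorted
mean = statistics.mean(diffs)      # arithmetic average

print("mode:", mode)
print("median:", median)
print("mean:", round(mean, 2))
```

In this made-up list the mode and median both sit at a one-goal win, while the handful of bigger wins drags the mean above them.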

Those strange results I mentioned at the beginning sit at the edges of this graph. In 2011 United won a bizarre 8-2 victory against Arsenal, giving us their solitary 6-goal win. They were also hammered 6-1 by Manchester City, giving the 5-goal loss.

This is a small sample, and I would really need to look through hundreds or thousands of matches to see whether the distribution remains roughly normal. I suspect it would. So while commentators try to figure out what went wrong or right in individual matches, a probabilistic approach might say that if you play enough matches, sooner or later you will start seeing those freak upsets. Far from being impossible, a 6-1 defeat for Manchester United is highly likely - but only if you get them to play a sufficiently high number of games.
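
To put a rough number on "sooner or later", here is a small sketch. It assumes every match is independent and that a freak result has the same small chance `p` each time, which real football certainly does not guarantee. The value 0.002 is an invented figure for illustration, not an estimate from any data, and `prob_at_least_once` is a name I made up.

```python
def prob_at_least_once(p, n):
    """Chance that an event with per-match probability p happens
    at least once in n independent matches: 1 - (1 - p)**n."""
    return 1 - (1 - p) ** n

# p = 0.002 is a made-up per-match chance of a freak scoreline.
for n in (50, 500, 5000):
    print(n, "matches:", round(prob_at_least_once(0.002, n), 3))
```

Under those assumptions, the chance of seeing at least one freak result is small over a single season's worth of matches and creeps towards certainty as the number of matches grows.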
