Prediction: Who will win the 2018 World Cup?

Throughout the FIFA World Cup, this blog answers the question of who the top favorites for the title are - updated daily and based on a large amount of data and up-to-date statistical analyses.

"The likelihood of not winning the championship is greater than the likelihood of not escaping relegation." Dettmar Cramer (German football player, coach and "football professor")

Here Is Our Answer as of July 14th, 2018, 11 pm

Winners

World Cup Forecast 2018: Who will be World Champion? Can Data Science Beat the Animal Oracle?

All over the world, at newsstands, on public transport and above all in countless betting communities, smart football fans are asking themselves: Who will be the world champion of the 2018 Football World Cup?

Using statistical data science models, we simulated the 2018 FIFA World Cup 10,000 times to determine the probabilities for the next FIFA World Cup winner and thus the World Cup favorites.

The question is whether our INWT Statistics World Cup forecast will be better than the official World Cup oracle, the deaf cat Achilles.

World Cup Oracle Achilles: Do Animal World Cup Predictions Determine Who Will Be World Champion?

Experts of every kind have been arguing for weeks: Who is the World Cup favorite this time? Who will be crowned the new football world champion after the World Cup final on 15 July 2018, get to hold the "bottle-sized gold statue", the FIFA World Cup Trophy, in their hands and douse themselves with champagne, beer or perhaps vodka?

Until now, betting and expert discussions on the topic of "Who will win the World Cup" have not had to rely solely on gut feeling, the study of football literature and the expertise acquired over countless football matches. One could also rely on the recognized World Cup forecasts of football oracles. Paul the Octopus from Oberhausen (North Rhine-Westphalia) correctly predicted the outcome of all seven of the German team's World Cup matches in 2010 by choosing mussel meat from one of two glass containers labeled with team flags. His predictions were so good that Paul even received death threats from disappointed fans and was eventually given personal protection.

For the 2018 World Cup in Russia, the deaf white cat Achilles from the Hermitage in St. Petersburg is taking Paul's place. Achilles qualified as an official animal World Cup oracle at the 2017 Confed Cup, when he predicted Russia's victory over New Zealand. He therefore has local knowledge for the 2018 World Cup, which should certainly benefit his forecasts. Achilles announces his predictions by choosing between two bowls of cat food, each marked with a team flag.

Does Achilles know who will win the 2018 World Cup?

With a prediction accuracy of 50% (no oracle abilities, pure guessing), one should expect 3 to 4 correct predictions out of 7 German games (7 × 0.5 = 3.5). If we assume that Achilles has sound specialist knowledge and psychic abilities, his accuracy should be higher than 50%, for example 75%. In that case we would expect him to correctly predict the results of 5 to 6 out of 7 German games (7 × 0.75 = 5.25).

That's quite a lot. By the way, we also tacitly assume that Germany will reach the final, i.e. that they really will play 7 games. That's a long shot, of course. But despite Achilles' proven qualification and high hit rate, his World Cup predictions have a decisive catch: they are based only on his own, very personal assessment, however well-founded it may be.
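The expected hit counts quoted above can be checked with a simple binomial model. This is only a sketch: it assumes that each of the 7 predictions is independent and has the same hit probability, and the function names are ours, chosen for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k correct predictions out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_hits(n: int, p: float) -> float:
    """Expected number of correct predictions (mean of the binomial)."""
    return n * p


def prob_at_least(k_min: int, n: int, p: float) -> float:
    """Probability of at least k_min correct predictions out of n."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))


if __name__ == "__main__":
    games = 7  # assumes Germany really plays 7 games
    for accuracy in (0.5, 0.75):
        print(
            f"accuracy {accuracy:.0%}: "
            f"expected hits = {expected_hits(games, accuracy):.2f}, "
            f"P(all {games} correct) = {prob_at_least(games, games, accuracy):.4f}"
        )
```

Under these assumptions, pure guessing gets all 7 games right with probability 0.5^7 = 1/128, i.e. a little under 1%.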

World Cup Forecast 2018: Data Science vs. Animal Oracle - We Challenge Achilles

As a data science and predictive analytics company, INWT Statistics specializes in making qualified and very accurate predictions for our customers using large amounts of data and statistical models. This works for many topics: Last year we predicted the results of the 2017 federal election in Germany. Even the New York Times reported on our INWT Statistics election forecasting tool. Our forecasts were on average between 5 and 9 percent more accurate than the election forecasts of all well-known polling institutes.

Of course, one can object: Football matches are not just data. Maybe even: Football is our life. Football fans can rest assured: we are not only familiar with forecasts, but also have proven football knowledge. For several years now we have been supporting the betting provider Tipico in calculating their live betting odds, including in football.

We therefore hope that we will be able to build on last year's forecasting success with the questions "Who is the World Cup favourite?" and "Who will win the 2018 World Cup in Russia?". After all, for us, too, "the next game is always the next" (Matthias Sammer).

Data Science and Predictive Analytics in Football: World Cup Forecasts Based on Statistical Models

We have to admit one thing: We can't predict the future perfectly either (and here World Cup oracle Achilles could have a decisive advantage due to supernatural abilities - we'll see). So we can't say with complete certainty who will be world champion, how far Germany will go or which teams will reach the semi-finals. This is simply because the future is uncertain. One injury caused by the wrong foul in the wrong place at the wrong time - and everything is different.

This means for all football fans and competition enthusiasts at this point:

  • even a very unlikely event can happen,
  • and even a very probable event does not necessarily occur.

But what we can do with the help of data science and predictive analytics is forecast very accurately and reliably how likely it is, for example, that Argentina will hold the trophy in their hands, that Germany will win 3-1 against Mexico or that Saudi Arabia will make it to the semi-finals.

So we can and will tell you who is most likely to be world champion. Not Peru. But even if it is very unlikely - you can still keep your fingers crossed for Peru or any of your other favorite teams... because who knows who will have the last laugh.

INWT Statistics World Cup Forecast: Bookmakers' Odds, FIFA World Ranking and Historical Data as Data Basis

Three different data sources serve as the data basis for the modeling of the INWT Statistics forecast for the 2018 World Cup and the question "Who will win the 2018 World Cup":

  1. the results of all qualifier and friendly matches played by the national football teams in the last two years;
  2. the FIFA World Ranking List for the national football teams; and
  3. the bookmakers' odds for the World Cup winner and individual matches of the national football teams.

The bookmakers' odds in football capture how betting football fans and bookmakers assess the outcome of future football matches. The good thing about these data is that the odds closely reflect the expectations of football fans and experts. If they did not, the bookies could lose a lot of money.

Bookmakers charge for their services as betting providers by building a margin into the odds, on top of the real odds. We therefore remove this margin from the bookmakers' odds using a special procedure. We use Tipico's football bookmaker odds, which are publicly available on Tipico's website. Tipico is one of the three largest reputable providers of football betting worldwide and has been a customer of INWT for many years.
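To illustrate what removing the bookmaker's margin means, here is a minimal sketch. The special procedure used for our forecast is not described in this post, so the code shows only the simplest common approach, proportional normalization of the implied probabilities, as an assumption. The decimal odds are made up for illustration and are not real Tipico odds.

```python
def remove_margin(decimal_odds):
    """Turn decimal odds into probabilities that sum to 1.

    Assumption: the margin is spread proportionally over all outcomes
    (simple normalization). Other adjustment methods exist.
    Returns (fair_probabilities, margin).
    """
    raw = [1.0 / odd for odd in decimal_odds]  # implied probabilities incl. margin
    total = sum(raw)                           # > 1 when a margin is included
    fair = [r / total for r in raw]
    return fair, total - 1.0


if __name__ == "__main__":
    # Hypothetical odds for home win / draw / away win.
    odds = [2.10, 3.40, 3.60]
    fair, margin = remove_margin(odds)
    print("margin:", round(margin, 4))
    print("fair probabilities:", [round(p, 4) for p in fair])
```

The raw implied probabilities add up to more than 100%; the excess is the bookmaker's margin, and dividing by the total rescales the probabilities so that they sum to 1.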

The FIFA World Ranking (official designation FIFA/Coca-Cola World Ranking), which is maintained by FIFA, the world football association, forms the basis for dividing the national teams into seeding pots for the draw of the World Cup qualifying groups. The FIFA World Ranking takes into account the national teams' results from the past years and weights the matches according to their importance (friendly matches, for example, have a lower weight than World Cup matches).

We also include the results of past qualifier and friendly matches in our forecast, because these matches carry useful historical information as well. These results are incorporated into our statistical model together with various factors such as motivation, home advantage, team strength during the tournament and recent events.

World Cup Prediction With Data Science: Our Methodology for Statistical Modelling of the Results of the 2018 World Cup

The statistical modelling of our forecast essentially consists of three steps:

Step 1:

We weight the results of the three data sources or data models: (1) qualifier and friendly matches, (2) FIFA World Ranking List and (3) bookmakers' odds for the World Cup winner. From these we calculate, for every conceivable match between two national football teams, the probabilities of victory, defeat and draw as well as the number of goals that the two teams are likely to score.
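The idea of weighting several sources can be sketched as a weighted average of their outcome probabilities. This is an illustration only: the weights, the probabilities and the source names below are invented, and the actual weighting used in our model is not specified in this post.

```python
OUTCOMES = ("win", "draw", "loss")


def combine_forecasts(forecasts, weights):
    """Weighted average of outcome probabilities from several sources.

    forecasts: {source: {"win": p, "draw": p, "loss": p}}
    weights:   {source: non-negative weight}, normalized internally.
    """
    total_weight = sum(weights[source] for source in forecasts)
    combined = {outcome: 0.0 for outcome in OUTCOMES}
    for source, probs in forecasts.items():
        share = weights[source] / total_weight
        for outcome in OUTCOMES:
            combined[outcome] += share * probs[outcome]
    return combined


if __name__ == "__main__":
    # Hypothetical numbers for one match, seen from the first team's side.
    forecasts = {
        "past_matches": {"win": 0.50, "draw": 0.25, "loss": 0.25},
        "fifa_ranking": {"win": 0.55, "draw": 0.25, "loss": 0.20},
        "bookmaker_odds": {"win": 0.60, "draw": 0.22, "loss": 0.18},
    }
    weights = {"past_matches": 1.0, "fifa_ranking": 1.0, "bookmaker_odds": 2.0}
    print(combine_forecasts(forecasts, weights))
```

Because every source supplies probabilities that sum to 1 and the weights are normalized, the combined probabilities also sum to 1.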

Step 2:

However, the probabilities for victory, defeat, draw and number of goals are not sufficient on their own. Who finishes first or second in the group can depend on the goal difference, so the match result, including the goals scored by both teams, must be simulated.

We assume that the scoreline of a football match, i.e. the goals scored by the two teams, follows a certain statistical distribution: the (bivariate) Poisson distribution. The Poisson distribution is well suited for goal counts because it describes the "distribution of rare events". The probability that any one particular scoreline will occur in a football match is very low due to the large number of possible outcomes.

At the same time, the scoreline probabilities give a good sense of the orders of magnitude involved. We can thus calculate the probability that a certain scoreline will occur in a match. This is good for our 2018 World Cup prediction: it lets us estimate the probability of each team winning each match at the World Cup - and of becoming World Champion. Of course, in the knockout phase we also consider extra time and penalty shoot-outs.
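How scoreline probabilities arise from a Poisson model can be sketched as follows. As a simplification, the code treats the two teams' goals as independent Poisson variables, whereas a bivariate Poisson model can additionally allow for correlation between them. The expected goal values are invented for illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when lam goals are expected."""
    return lam**k * exp(-lam) / factorial(k)


def scoreline_probabilities(lam_a: float, lam_b: float, max_goals: int = 10):
    """Probability of every scoreline (a, b) up to max_goals per team.

    Simplifying assumption: the goals of both teams are independent.
    """
    return {
        (a, b): poisson_pmf(a, lam_a) * poisson_pmf(b, lam_b)
        for a in range(max_goals + 1)
        for b in range(max_goals + 1)
    }


def outcome_probabilities(grid):
    """Aggregate scoreline probabilities into win / draw / loss for team A."""
    win = sum(p for (a, b), p in grid.items() if a > b)
    draw = sum(p for (a, b), p in grid.items() if a == b)
    loss = sum(p for (a, b), p in grid.items() if a < b)
    return {"win": win, "draw": draw, "loss": loss}


if __name__ == "__main__":
    # Hypothetical expected goals: team A 1.6, team B 1.1.
    grid = scoreline_probabilities(1.6, 1.1)
    print("P(3-1):", round(grid[(3, 1)], 4))
    print(outcome_probabilities(grid))
```

Each individual scoreline gets a small probability, but summing over all scorelines in which a team scores more goals gives its probability of winning the match.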

Step 3:

But as already said: The concrete outcome of a single match is very uncertain. The course of a World Cup for a particular football team also depends on the results of matches between the other football teams. Will Brazil play Germany or maybe Mexico or Sweden in the round of 16? Small differences, e.g. in goal difference, can lead to a completely different course of the tournament. That is why we simulate the entire course of the World Cup in a so-called Monte Carlo simulation.

Monte Carlo simulation is a statistical method in which many similar random experiments are carried out. So we pretend that the World Cup will not be held just once, but 10,000 times. From this we can deduce which is the most likely outcome for certain events (for example: Who will be World Champion, who will probably be runner-up in Group A, or who will probably be eliminated in the round of 16).
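The principle can be sketched on a much smaller scale by simulating a single group of four teams many times and counting how often each team finishes first. This is a toy example, not our actual model: the team names and strengths are invented, each team's expected goals depend only on its own strength, and ties in the table are broken by goal difference, goals scored and then at random.

```python
import random
from collections import Counter
from itertools import combinations
from math import exp


def sample_poisson(lam, rng):
    """Draw a Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = exp(-lam)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def simulate_group(strengths, rng):
    """Play one round robin and return the name of the group winner."""
    table = {team: [0, 0, 0] for team in strengths}  # points, goal diff, goals
    for a, b in combinations(strengths, 2):
        goals_a = sample_poisson(strengths[a], rng)
        goals_b = sample_poisson(strengths[b], rng)
        table[a][1] += goals_a - goals_b
        table[b][1] += goals_b - goals_a
        table[a][2] += goals_a
        table[b][2] += goals_b
        if goals_a > goals_b:
            table[a][0] += 3
        elif goals_b > goals_a:
            table[b][0] += 3
        else:
            table[a][0] += 1
            table[b][0] += 1
    # Rank by points, goal difference, goals scored, then a random tie-break.
    ranked = sorted(
        strengths, key=lambda t: tuple(table[t]) + (rng.random(),), reverse=True
    )
    return ranked[0]


def group_winner_probabilities(strengths, n_sims=10_000, seed=2018):
    """Estimate each team's chance of winning the group by repeated simulation."""
    rng = random.Random(seed)
    wins = Counter(simulate_group(strengths, rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in strengths}


if __name__ == "__main__":
    # Hypothetical expected goals per match for four made-up teams.
    strengths = {"Team A": 2.0, "Team B": 1.2, "Team C": 1.0, "Team D": 0.6}
    print(group_winner_probabilities(strengths))
```

The share of simulated groups won by a team is the estimate of its probability of finishing first; the full forecast applies the same counting to the entire tournament.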