Bert Vaux and Marius L. Jøhndal (University of Cambridge, United Kingdom) have recently published some exciting results of the Cambridge Online Survey of World Englishes, which we try to analyse a bit further below.
Please note that the report below is generated automatically from a statistical report template: the results, map, tables and all of this text are generated in real time or served from cache. This means that you are now reading a non-proofread quick report written by computers.
First, let us plot the raw results for the question "Pop or soda?" gathered in the United Kingdom on a terrain map borrowed from Google:
The map above shows the raw results, geocoded by the Zip code of the respondents and marked by coloured stars for the 6 categories offered in the survey. See the legend in the top right corner for details; the number of cases for each category is shown in square brackets after each label.
Besides the 152 answers, 192 subdivisions of the United Kingdom are also shown in similar (slightly dimmer and transparent) colours assigned by the k-nearest neighbour algorithm with k = 3. This classification method uses the survey data to determine the most likely category for a given subdivision based on its k nearest neighbours.
This means that setting k to 1 would find the nearest point to each subdivision's centre and colour the polygons accordingly, while using a higher value of k would return a smoother map of colours.
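The classification step described above can be sketched in a few lines of Python. This is only an illustration, not the code behind this report: the function names, the great-circle (haversine) distance, the tie-breaking rule and the sample answers are all assumptions, since the report does not state which distance or tie-breaking rule is used.

```python
import math
from collections import Counter


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def knn_label(centre, answers, k=3):
    """Return the most common label among the k answers nearest to `centre`.

    `answers` is a list of ((lat, lon), label) pairs. Ties are resolved in
    favour of the label that appears first among the nearest answers
    (an assumption of this sketch).
    """
    nearest = sorted(answers, key=lambda ans: haversine_km(centre, ans[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical geocoded answers; the labels are invented for illustration.
answers = [
    ((53.41, -2.98), "soda"),   # near Liverpool
    ((53.48, -2.24), "pop"),    # near Manchester
    ((53.80, -1.55), "pop"),    # near Leeds
    ((51.51, -0.13), "other"),  # near London
]
centre = (53.40, -2.90)  # hypothetical subdivision centre

label_k1 = knn_label(centre, answers, k=1)  # follows the single nearest answer
label_k3 = knn_label(centre, answers, k=3)  # majority vote of three neighbours
```

With these invented points, k = 1 copies the label of the single closest answer, while k = 3 lets the two more distant "pop" answers outvote it, which is the smoothing effect mentioned above.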
Although the characteristics of the four countries addressed in this report may be seen in the above map, some more detailed descriptive statistics are also worth noting.
The table above shows the number of geocoded cases for each category in each country, which is not very informative on its own. A row-percentage table with the marginal distribution, and with cells emphasised based on the computed Pearson residuals, might be much more useful.
The last column of the table above shows the overall distribution of the answers to "Pop or soda?", which is worth comparing to the country-specific values. The 5 most interesting values are highlighted based on their residuals.
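For readers who want to reproduce this kind of table, the sketch below computes row percentages and Pearson residuals, (observed - expected) / sqrt(expected), and picks the cells with the largest absolute residuals. The function names and the counts are hypothetical, and the sketch assumes that no row or column of the table is empty.

```python
import math


def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    return [[100.0 * cell / sum(row) for cell in row] for row in table]


def pearson_residuals(table):
    """Pearson residuals, with expected counts computed under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[(obs - r * c / n) / math.sqrt(r * c / n)
             for obs, c in zip(row, col_totals)]
            for row, r in zip(table, row_totals)]


def most_extreme_cells(residuals, top=5):
    """Return (row, column) indices of the `top` largest absolute residuals."""
    cells = [(abs(res), i, j)
             for i, row in enumerate(residuals)
             for j, res in enumerate(row)]
    return [(i, j) for _, i, j in sorted(cells, reverse=True)[:top]]


# Hypothetical counts: rows are two regions, columns are three answer categories.
counts = [[30, 10, 10],
          [10, 20, 20]]

percentages = row_percentages(counts)
residuals = pearson_residuals(counts)
highlighted = most_extreme_cells(residuals, top=2)
```

A large positive residual marks a cell that is chosen more often than independence would predict, and a large negative one marks a cell chosen less often, which is why residuals are a reasonable basis for highlighting.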
It seems that a real association can be pointed out between the question and the country (χ² = 31.7 with 15 degrees of freedom, p = 0.00709). This means that there is a significant difference in how people answer "Pop or soda?" in the four analysed countries. This association seems to be weak based on Cramér's V (0.228).
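The two statistics quoted above can be computed from a table of counts as in the following sketch. It uses the usual definitions (Pearson's chi-square statistic and V = sqrt(χ² / (n · (min(rows, columns) - 1)))), but the function names and the counts are hypothetical and the sketch is not the code that produced the numbers in this report. It assumes at least two rows and two columns, none of them empty.

```python
import math


def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, r in zip(table, row_totals):
        for obs, c in zip(row, col_totals):
            expected = r * c / n  # expected count under independence
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def cramers_v(table):
    """Cramér's V: 0 means no association, 1 means perfect association."""
    chi2, _ = chi_square_statistic(table)
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical counts: rows are two regions, columns are three answer categories.
counts = [[30, 10, 10],
          [10, 20, 20]]

chi2, df = chi_square_statistic(counts)
v = cramers_v(counts)
```

The degrees of freedom follow the same rule as in the report: a table of four countries by six categories gives (4 - 1) · (6 - 1) = 15. The p-value is not computed here; with SciPy installed, `scipy.stats.chi2.sf(chi2, df)` would provide it.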
The most popular category in the United Kingdom for "Pop or soda?" was "pop", chosen by four tenths of the respondents.
And the most important differences between the countries can be summarised as: