Monday, May 30, 2011
My friend Andrea Kuszewsky posted on her Facebook page an interesting blog article on the "wisdom of crowds", the idea that a group of people is in general, and more often than not, smarter than single individuals. The article is interesting, among other things, because it contains a criticism by a working neuroscientist of a well-known amateur or, to use a popular term, citizen scientist: Jonah Lehrer, author of "Proust Was a Neuroscientist" and "How We Decide". The title of the critical blog article is "Jonah Lehrer is not a Neuroscientist". Peter Freed, MD, the author of the blog, did a good job of pointing out a clearly misleading and, let's say, amateurish misunderstanding of a paper with important social consequences, which Lehrer mentions (without a proper citation) in the Wall Street Journal. It is great that more journalists and bloggers without a formal background in professional research are discussing science and helping to spread the enthusiasm of discovery to the public. But when this is done at the price of oversimplifications, blatant errors and basic misunderstandings, professional scientists have the right to point out these errors. It seems, though, that the professional neuroscientist didn't completely grasp (by his own admission) some basic statistical properties of different types of means, and he arrives at the conclusion that not only is Lehrer's interpretation of the paper wrong, but even the scientists who wrote the paper are spinning the significance of their results. I don't quite agree. Here is my rebuttal to his blog article:
"While I applaud you for the investigative work you did in finding the silly and misleading mistake of Mr. Lehrer, who did not understand the significance of a median, I don't agree with your general conclusion. Let me start by saying that the questions asked of the students mentioned in the paper are a kind of problem (making an educated guess) called a Fermi problem. From Wikipedia: "In science, particularly in physics or engineering education, a Fermi problem, Fermi question, or Fermi estimate is an estimation problem designed to teach dimensional analysis, approximation, and the importance of clearly identifying one's assumptions." While the Swiss students didn't consciously go through the steps of a typical Fermi problem (I imagine they had to give quick, intuitive answers), they may have unconsciously done exactly that. But even if they had had the time and inclination to go through the steps, a typical Fermi problem allows for an error in the estimate of up to an order of magnitude. In other words, if the exact number is 10,000, guessing 30, 40, or even 90 thousand would be good in the context of a Fermi problem. So getting an error in the hundreds is actually pretty good (it means a factor of a few, which is basically nothing). Fermi problems are considered a gold standard in the context of a guesstimate, and in fact it takes a good deal of convincing to get students to accept that guessing wrong by a factor of 10, as a first rough estimate of the numerical answer to a problem where little or no information is given, is actually pretty good and a useful thing to do.
So the single individuals actually did pretty well in most cases in the experiment (with some exceptions, as in the assault estimate).
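To make the order-of-magnitude criterion concrete, here is a minimal Python sketch. The helper names fold_error and within_order_of_magnitude are my own illustrative choices, the factor-of-10 cutoff is the Fermi tolerance described above, and the 150,000 guess is an extra made-up value added only to show a miss.

```python
def fold_error(guess: float, truth: float) -> float:
    """Return the factor (>= 1) by which a guess misses the true value."""
    if guess <= 0 or truth <= 0:
        raise ValueError("guess and truth must be positive")
    return max(guess / truth, truth / guess)


def within_order_of_magnitude(guess: float, truth: float) -> bool:
    """True if the guess is off by at most a factor of 10 (assumed cutoff)."""
    return fold_error(guess, truth) <= 10


truth = 10_000  # the exact number used in the text above

# 30, 40 and 90 thousand come from the text; 150,000 is a hypothetical miss.
for guess in (30_000, 40_000, 90_000, 150_000):
    factor = fold_error(guess, truth)
    verdict = "fine" if within_order_of_magnitude(guess, truth) else "too far off"
    print(f"guess {guess}: off by a factor of {factor:g} ({verdict})")
```

Measuring the miss as a ratio rather than a difference is what makes a guess of 90,000 and a guess of roughly 1,100 count as equally wrong for a true value of 10,000.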
Second, there is a reason the geometric mean does so much better.
Again from the definition of Geometric Mean in Wikipedia:
"Although the geometric mean has been relatively rare in computing social statistics, in 2010 the United Nations Human Development Index switched to this mode of calculation, on the grounds that it better reflected the non-substitutable nature of the statistics being compiled and compared:
The geometric mean reduces the level of substitutability between dimensions [being compared] and at the same time ensures that a 1 percent decline in say life expectancy at birth has the same impact on the HDI as a 1 percent decline in education or income. Thus, as a basis for comparisons of achievements, this method is also more respectful of the intrinsic differences across the dimensions than a simple average."
The idea here is that if you have statistically independent samples, representing different processes and parameters of a problem, taking an arithmetic mean is not that meaningful. A geometric mean characterizes the different weights of the individual estimates in a more meaningful way. An intuitive way to understand how this could be useful in the context of crowd wisdom is to think about where the source of crowd wisdom may actually be (if it exists at all).
Imagine that in a small group of students who don't have any clue about how many immigrants there are in Zurich, there is one, or even a few, who are immigrants, or have studied the topic at school, or have read some material about the subject for themselves. Their answers will be much closer to the correct one. An arithmetic average would wipe out this "higher wisdom" if the number of experts is small (as is indeed the case for random crowds). Statistically, the answers of the non-experts would be all over the place, while the responses of the experts would fall in a narrow range. You can consider these two types of answers as different "dimensions", and they should be emphasized in different ways, which is what the geometric mean does. It is actually a significant result that the Swiss scientists were able to show that the geometric mean works much better in this case. Now, a more relevant and interesting question is: why, then, is there anecdotal evidence that we recognize this "geometric mean wisdom" in crowds? Do we do a geometric mean calculation in our heads when we ask the crowd for a word of wisdom on a particular question?
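A small numerical sketch may help here. The guesses below are invented for illustration and are not the data from the paper: three "experts" answer close to an assumed true value of 10,000, and six non-experts are scattered over more than two orders of magnitude. The geometric mean is computed as the exponential of the average logarithm, which is just its definition.

```python
import math

true_value = 10_000  # assumed correct answer for this toy example

# Invented guesses: 9,000 / 10,000 / 11,000 play the role of the experts,
# the remaining values are non-experts spread over a wide range.
guesses = [1_000, 3_000, 5_000, 9_000, 10_000, 11_000, 20_000, 50_000, 200_000]


def arithmetic_mean(values):
    """Plain average: sum divided by count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Exponential of the average logarithm (values must be positive)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


am = arithmetic_mean(guesses)
gm = geometric_mean(guesses)

print(f"arithmetic mean: {am:.0f}")
print(f"geometric mean:  {gm:.0f}")
print(f"true value:      {true_value}")
```

With these made-up numbers the arithmetic mean is pulled above 30,000 by the single guess of 200,000, while the geometric mean stays near 11,000. Averaging the logarithms means each guess counts by how many times too large or too small it is, not by how many units it is off.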
Well, I think it has to do with how the data is displayed. Let's take an example: the popular game "Who Wants to Be a Millionaire?". One of the lifelines is to ask the audience for help, in other words to use the wisdom of a crowd. I have noticed that the crowd is rarely wrong. How is the data displayed in this particular example? In a histogram, showing the counts for each of the possible answers. If most people don't have a clue, the distribution will be flat. And in fact, for the difficult questions, the distribution probably does look pretty flat. But if there are a few experts in the crowd, the right answer will stick out of the flat distribution like a sore thumb. Geometric means, and our eyes looking at a distribution, are good at picking up these spiky features. An arithmetic mean would be pretty useless in this context. If I'm right in this analysis, then it should also explain why crowds that communicate destroy the wisdom effect. Most people would listen to what the majority has to say, and the experts would not be listened to; in fact they too, being human, may feel compelled to change their initial guess (in particular if they were not truly experts and were simply better than the others at making educated guesses).
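Here is a toy version of that audience histogram, sketched in Python. The numbers are assumptions chosen for illustration: 88 people with no clue are spread evenly over the four answers, and 12 people who know the answer all pick "C".

```python
from collections import Counter

options = ["A", "B", "C", "D"]
correct = "C"  # assumed right answer for this toy example

votes = []
for option in options:
    votes += [option] * 22   # 88 clueless voters, spread evenly (assumption)
votes += [correct] * 12      # 12 voters who actually know the answer

counts = Counter(votes)

# Text histogram: the flat background plus the spike on the right answer.
for option in options:
    print(f"{option} {'#' * counts[option]} ({counts[option]})")

# Picking the tallest bar is what the contestant's eye does.
winner, top_count = counts.most_common(1)[0]
print("crowd answer:", winner)
```

Even though only 12 percent of this imaginary audience knows the answer, the bar for "C" rises clearly above the flat background of 22 votes per option, so reading off the tallest bar recovers the experts' answer.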
I think this idea of the wisdom of crowds deserves more investigation, and in particular the usefulness of the geometric mean, or of other statistical ways of extracting the wisdom from the crowd, should be explored. Indeed, what distinguishes a scientist, professional or citizen, is patience, tenacity and taking the time to understand both the general picture and the details of a problem."