Recent reports from Iraq suggest that doctors in the city of Fallujah are seeing a high level of birth defects, with some blaming weapons used by the US after the invasion of Iraq. The BBC’s World Today programme quotes a British-based Iraqi researcher as saying that doctors in Fallujah were witnessing a “massive unprecedented number” of heart defects and an increase in the number of nervous system defects. The researcher went on to say that, based on data from January this year, the rate of congenital heart defects was 95 per 1,000 births – 13 times the rate found in Europe. This is certainly cause for concern, but how significant is it?
The Operational Research team in Capgemini is trained in statistical methods and analysis, and employs these to help clients make sense of evidence that is often confusing or conflicting. Any application of statistical analysis to the Fallujah data will need to address:
– what the underlying incidence of defects was before the conflict
– whether the defects are afflicting all mothers or just a particular group
– how far afield the reports are being collated from, in order to establish the size of the underlying population.
In addition, some assumptions need to be made before statistical approaches can be deployed. In this case an appropriate probability distribution would be used to establish how many such defects might be expected, for a given population, in the normal run of events. A Poisson distribution might well be appropriate here, as it is often used to estimate the number of relatively unlikely events that could occur during a given time period. If the use of the chosen distribution is well justified, it also allows us to calculate the likelihood that, in any given time period, more than a chosen number of incidents might occur.
[Figure: Poisson distribution with mean (expected) value of 10]
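The "more than a chosen number of incidents" calculation can be sketched in a few lines. This is a minimal illustration and not the team's actual method: the function names `poisson_pmf` and `poisson_tail` are made up for this example, the mean of 10 is taken from the figure, and the thresholds are arbitrary.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P(X = k) for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_tail(threshold: int, mean: float) -> float:
    """P(X > threshold): chance of seeing more than `threshold` events."""
    cumulative = sum(poisson_pmf(k, mean) for k in range(threshold + 1))
    return 1.0 - cumulative


# Mean of 10 matches the figure; the thresholds are illustrative only.
mean = 10.0
for threshold in (10, 15, 20):
    print(f"P(X > {threshold}) = {poisson_tail(threshold, mean):.4f}")
```

With an expected value of 10, seeing more than 10 events is unremarkable, while seeing more than 20 would be rare. Everything here rests on the Poisson distribution being a reasonable model for the counts.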
(The Poisson distribution was discovered by Siméon Denis Poisson, and the classic example is the data set of von Bortkiewicz (1898) on the chance of a Prussian cavalryman being killed by the kick of a horse.) This essentially provides us with an objective means of estimating how significant any observed high (or in some cases low) value might be. All of this will be needed to establish that there is indeed a higher than expected incidence of defects, even before turning to the vexed issue of what might be causing it.
It is important, both for consultants doing business analysis and for academics, to understand and state explicitly the assumptions being made. In the real world, some of the theoretical assumptions needed to deploy statistical methods can become compromised. In a trivial example, rolling a “fair” die gives an equal chance of getting each of the numbers 1 through 6. If the die is rolled repeatedly, the assumption of fairness can be used to calculate the likelihood of getting, say, 10 “sixes” in a nominated sequence of 10 rolls (about 1 in 60 million). But in a long sequence of rolls there will be runs of sixes, perhaps 2, 3, 4 or 5 long, which wouldn’t be worthy of comment. How frequent and how long do these need to be before we start to doubt the underlying assumption of fairness? This inverting of the statistical approach is important in error and bias detection, and in exposing some fraudulent business systems in, for example, the Gaming and Lottery sectors and also in Insurance.
The other key assumption underpinning a good deal of probability and statistical methods is independence: the assumption that the probability of one event happening is not affected by (is independent of) the occurrence or non-occurrence of another event.
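The die example already leans on both assumptions: fairness gives each roll a 1 in 6 chance of a six, and independence lets those chances be multiplied. The sketch below checks the "1 in 60 million" figure exactly and then simulates a long sequence to show how often runs of sixes turn up by chance. The helper names, the seed and the sequence length of 100,000 are arbitrary choices made for illustration.

```python
import random
from collections import Counter
from fractions import Fraction


def prob_all_sixes(n_rolls: int) -> Fraction:
    """Chance that every one of n_rolls independent fair rolls is a six."""
    return Fraction(1, 6) ** n_rolls


def runs_of_sixes(rolls):
    """Return the length of each unbroken run of sixes in a roll sequence."""
    runs, current = [], 0
    for value in rolls:
        if value == 6:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # a run that reaches the end of the sequence
        runs.append(current)
    return runs


ten_sixes = prob_all_sixes(10)
print(f"10 sixes in 10 nominated rolls: 1 in {ten_sixes.denominator:,}")

# Simulate a long sequence; seed and length are arbitrary illustrative choices.
rng = random.Random(2010)
rolls = [rng.randint(1, 6) for _ in range(100_000)]
run_counts = Counter(runs_of_sixes(rolls))
for length in sorted(run_counts):
    print(f"runs of exactly {length} sixes: {run_counts[length]}")
```

For a fair die, 100,000 rolls would be expected to contain roughly ten runs of five or more sixes (100,000 × 5/6 × (1/6)^5 ≈ 10.7), so a handful of such runs is no reason to doubt fairness. It is a marked excess over that kind of expectation that would start to raise questions.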
Assuming independence holds, we can legitimately estimate the probability of two events both happening as the product of the two individual probabilities; we simply multiply them together. An infamous example of someone getting this wrong is Sir Roy Meadow, who coined the term “Munchausen’s syndrome by proxy”. A few years ago, about one baby in 8,400 died a “cot death” (also known as Sudden Infant Death Syndrome, or SIDS) in the UK every year. Among families with 2 or more children it is about 1 in 1,600. Sir Roy, an expert witness at a number of trials, has said that one cot death is a tragedy, two is suspicious and three is murder. He had calculated that the chance of two cot deaths occurring in the same family is 1 in 73 million, which is so small (if it is chance) that it must be murder.
The fundamental assumption Sir Roy was making was that the events were independent: in effect he multiplied the odds of a single cot death by themselves (8,400 × 8,400 is roughly 70 million). But they are not independent; the rate of cot deaths is higher in families in which there is a smoker, and among babies whose mothers put them to sleep face down. There is also a genetic component. So the chance of two cot deaths in one family is much greater than 1 in 73 million.
Because of this error, and because some evidence of genetic susceptibility was withheld from the defence, three convicted mothers have been released by the Appeal Court on the grounds that the murder convictions were unsafe. Sir Roy was struck off in 2005 by the General Medical Council for his “misleading” evidence. The case also prompted an unprecedented intervention by the Royal Statistical Society, which criticised the incorrect use of statistics. A renowned expert in his chosen field stumbled into the minefield of statistical assumptions and paid a heavy price.
We should, therefore, avoid a similar “rush to judgement” in the current Fallujah example.
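The cot-death arithmetic shows how much hangs on the independence assumption, and a short sketch makes the sensitivity visible. The 1 in 8,400 figure is the one quoted above. The `relative_risk` values are purely hypothetical "what if" multipliers, not estimates from any study; they stand in for shared factors such as smoking, sleeping position or genetics that raise the chance of a second death once a first has occurred.

```python
from fractions import Fraction

SINGLE_DEATH = Fraction(1, 8400)  # single cot-death rate quoted in the text


def joint_if_independent(p: Fraction) -> Fraction:
    """P(two deaths) when the two events are assumed independent."""
    return p * p


def joint_with_dependence(p: Fraction, relative_risk: int) -> Fraction:
    """P(two deaths) = P(first) * P(second | first).

    relative_risk is a HYPOTHETICAL multiplier: how many times more likely
    a second death is, given that a first has already happened.
    """
    conditional = min(Fraction(1), p * relative_risk)
    return p * conditional


naive = joint_if_independent(SINGLE_DEATH)
print(f"assuming independence: 1 in {int(1 / naive):,}")

# Illustrative multipliers only; none of these is a measured value.
for relative_risk in (1, 5, 10, 50):
    joint = joint_with_dependence(SINGLE_DEATH, relative_risk)
    print(f"relative risk x{relative_risk}: 1 in {int(1 / joint):,}")
```

A multiplier of 1 reproduces the naive product of about 1 in 70 million. Any dependence between the two events shrinks the odds in direct proportion, which is why the headline figure cannot be read as the chance of innocence.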
If the underlying rate of defects is four or five times higher in the Middle East than in Europe, then this seemingly extreme figure may be within “normal” variation. However, there is clearly a case to be answered, and the considered analysis needs to be done.
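As a rough sketch of what that analysis might look like, the reported figures can be combined with a Poisson model. Only the 95 per 1,000 rate and the "13 times Europe" ratio come from the report. The number of births behind the January data is not given, so the `n_births` values below are hypothetical, as are the baseline multipliers of 1, 4 and 5. The function names are made up for this example.

```python
import math

OBSERVED_RATE = 95 / 1000         # reported defects per birth
EUROPE_RATE = OBSERVED_RATE / 13  # implied by "13 times the rate found in Europe"


def poisson_at_least(observed: int, mean: float) -> float:
    """P(X >= observed) for a Poisson variable with the given mean."""
    below = sum(
        math.exp(-mean) * mean ** k / math.factorial(k) for k in range(observed)
    )
    return max(0.0, 1.0 - below)  # very small tails round down to 0.0


def chance_of_observed(n_births: int, baseline_multiplier: float) -> float:
    """Chance of at least the observed count if the true local rate were
    baseline_multiplier times the European rate (both inputs HYPOTHETICAL)."""
    observed = round(OBSERVED_RATE * n_births)
    expected = EUROPE_RATE * baseline_multiplier * n_births
    return poisson_at_least(observed, expected)


# HYPOTHETICAL sample sizes: the report does not say how many births were covered.
for n_births in (50, 200, 1000):
    for multiplier in (1, 4, 5):
        p = chance_of_observed(n_births, multiplier)
        print(f"births={n_births:5d} baseline x{multiplier}: P = {p:.2g}")
```

The answer depends heavily on both unknowns. A higher local baseline makes the observed rate less surprising, and a small sample leaves far more room for chance than a large one, which is why the size of the underlying population and the pre-conflict incidence have to be pinned down first.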