Statistical humbug

A paper appeared in last week’s JAMA that I just now got around to reading, but had already read about in a number of other publications. The study looked at the effect early-in-life obesity has on death from heart disease decades later. The paper is a real treasure trove of information worthy of a longer, more comprehensive post later on. For now I want to use it as an example of how statistics can be used to humbug the non-statistically inclined.

In brief the JAMA study was done with data pulled from the monster-sized Chicago Heart Association Detection Project in Industry that was begun in 1967. Subjects who were at least 31 years old were evaluated for a number of parameters including BMI, blood pressure, elevated cholesterol, and history of smoking. The researchers re-evaluated these subjects over the next several decades.

Researchers divided the subjects into five groups: low risk, moderate risk, intermediate risk, elevated risk, and highest risk. It’s not important to the point of this post to get into what specifically constituted these varying levels of risk, but in general, the greater the number or the more severe the risk factors, i.e., elevated cholesterol, high blood pressure, etc., the more high-risk the category. The researchers then divided the subjects into three other groups based solely on BMI: normal weight, overweight, and obese.

Within each of these three weight-related groups were spread subjects with varying degrees of risk. In other words, the normal weight group contained subjects who were low risk, moderate risk, intermediate risk, elevated risk and highest risk as a function of their cholesterol levels, blood pressures, smoking history, etc. It was the same for all the groups. The obese group was composed of obese subjects who ranged from low-risk to highest risk. The object of the study was to follow these subjects for many years to see if obesity was truly a risk factor for death from heart disease or if obesity led to elevated cholesterol, high blood pressure, and all the rest, which in turn caused the heart disease mortality.

If the subjects in the normal weight, low-risk group had no more heart disease than those in the obese, low-risk group, then it could be inferred that obesity by itself may not cause heart disease. If, on the other hand, the subjects in the obese, low-risk group had a much higher death rate from heart disease than did those subjects in the normal weight, low-risk group, then at least some of the heart disease could be attributed to the excess fat.

(There is really much, much more under the surface of this paper worthy of exploration, but it will have to wait.)

So what did the study show? It depends upon where you get your information.

According to a statistical analysis of the data, the odds of death by heart disease in the obese, low-risk subjects were 1.43 times those of the normal weight, low-risk subjects, implying an almost half again greater risk simply for being obese. And that’s how it was reported in the lay press.

WebMD allowed as to how

The researchers found that the risk of dying from heart disease was 43% higher for study participants who were obese but also met these qualifications for low cardiovascular risk than for normal-weight, low-risk participants.

It appears to be a pretty clear indictment against obesity.

But not if the statistics are analyzed correctly.

Before I get into that I want to produce a quote from Judge Samuel Alito that he uttered during his confirmation hearing before the Senate Judiciary Committee last week. Said he

Well, the analogy went to the issue of statistics and the use and misuse of statistics and the fact that statistics can be quite misleading. … And that’s what that was referring to. There’s a whole – I mean, statistics is a branch of mathematics, and there are ways to analyze statistics so that you draw sound conclusions from them and avoid erroneous conclusions from them.

Truer words were never spoken. But you’ve got to analyze the statistics, not take them at face value.

So let’s analyze the statistics used in our study under discussion to see if and how anyone went wrong.

Here is how the 1.43 ratio was written in the paper:

the odds ratio (95% confidence interval) for CHD [coronary heart disease] death for obese participants compared with those of normal weight in the same risk category was 1.43 (0.33-6.25).

What does this really mean?

First, it means roughly that for every 100 people who died from CHD in the normal weight, low-risk group, 143 died in an obese, low-risk group of the same size, giving a risk ratio of 143/100 or 1.43. (Strictly speaking the paper reports an odds ratio, but when deaths are rare the odds ratio and the risk ratio come out nearly the same.) It seems reasonable that if that were really the finding, then the risk of dying if you are obese with low risk (as these researchers define low risk) is 1.43 times greater than if you aren’t obese. Right? Not necessarily, and here’s why.

If you flip a coin 10,000 times the odds are that you will get about 5000 heads and 5000 tails since the odds are 50-50 of the coin landing on either side. But what about if you only flip it 40 times? Are you going to get exactly 20 heads and 20 tails? Probably not. What about if you only flip it six times? Will you get three and three? Maybe, but probably not. In fact I just flipped a coin 10 times and got five heads in a row, two tails, one head, and one tail, giving me seven heads and three tails. Now if this were a study on coin flipping I could confidently predict based on my data that if I flipped this same coin 10,000 times I would get 7000 heads and 3000 tails. But we all know this isn’t really true because the sample size I used (ten) was too small to be used to accurately predict the outcome for 10,000 flips. The fact that I went 7 heads and 3 tails came about strictly by chance, which plays a smaller and smaller role as the sample size gets larger.
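The coin-flip argument is easy to try for yourself. What follows is only a minimal sketch, not anything from the study: it simulates a fair coin with Python's standard `random` module, and the function names, the seed, and the number of repeats are arbitrary choices made for illustration. It repeats each "experiment" 100 times and reports how far apart the lowest and highest fraction of heads landed for each sample size.

```python
import random


def heads_fraction(n_flips, rng):
    """Flip a simulated fair coin n_flips times; return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


def spread_of_fractions(n_flips, rng, repeats=100):
    """Repeat the experiment and return (lowest, highest) fraction of heads seen."""
    fractions = [heads_fraction(n_flips, rng) for _ in range(repeats)]
    return min(fractions), max(fractions)


rng = random.Random(2006)  # arbitrary seed so the run is repeatable
for n in (10, 40, 1_000, 10_000):
    low, high = spread_of_fractions(n, rng)
    print(f"{n:>6} flips: heads fraction ranged from {low:.3f} to {high:.3f}")
```

With only ten flips, a run of 7 heads and 3 tails is nothing unusual; with 10,000 flips the fraction of heads should hug 0.5 much more tightly.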

Statisticians realize that virtually any outcome can be influenced by chance and have developed equations to quantify just how much chance is involved. One of the terms they have come up with (after some pretty complex mathematical maneuvers) is the confidence interval. Pretty much the gold standard for confidence intervals is what’s called the 95 percent confidence interval. What this means is that once a confidence interval has been established (a range between two numbers) you can be 95 percent confident that the true result falls within that range. More precisely, if the study were repeated many times, about 95 percent of the intervals calculated this way would contain the true value. Since chance can’t be totally eliminated, there will still be a 5 percent chance that the true result falls outside the range.
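To make the idea concrete, here is a rough sketch of a 95 percent confidence interval for the coin-flip example. It assumes the simple normal approximation (estimate plus or minus 1.96 standard errors), which is crude for a sample as small as ten and is not the method used in the JAMA paper; the function name is made up for illustration.

```python
import math


def proportion_ci_95(successes, n):
    """Approximate 95% confidence interval for a proportion.

    Uses the normal (Wald) approximation, which is rough for small n.
    """
    p = successes / n
    margin = 1.96 * math.sqrt(p * (1 - p) / n)
    # A proportion can't go below 0 or above 1, so clip the interval.
    return max(0.0, p - margin), min(1.0, p + margin)


# 7 heads in 10 flips: the interval is wide and still includes 0.5 (a fair coin).
print(proportion_ci_95(7, 10))

# 7,000 heads in 10,000 flips: the interval is narrow and excludes 0.5.
print(proportion_ci_95(7_000, 10_000))
```

The same 70 percent heads tells two very different stories depending on how many flips stand behind it, and the width of the interval is what shows the difference.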

Let me digress a little here to define what we mean by our result falling into or outside of this range. In our study the data showed that 43 percent more people died in the obese, low-risk group than in the normal weight, low-risk group. But so what? As callous as it seems, we don’t care about those people; they’re already dead. What we care about is how the data provided by all these dead people affects you and me and our loved ones and all the other people who aren’t dead yet. We all want to know if this 43 percent is just a chance finding like my flipping 7 out of 10 heads, in which case it’s meaningless, or do I really have a 43 percent greater chance of dying of heart disease if I’m obese even though I don’t have any other risk factors? Those are the questions the confidence interval addresses.

In this study the 95 percent confidence interval is (0.33-6.25). It’s usually stated the way it is in this case: 1.43 (0.33-6.25). This means that the risk ratio as applied to the population at large should come in 95 percent of the time between 0.33 and 6.25. So the risk of dying if you are obese with no other risk factors could be anywhere from 1/3 as much to 6.25 times as much.

Say what? 1/3 as much?

Yep. Even though the middle of this range is around 1.43, the data are also consistent with an actual risk of 0.5, which would mean you have half the chance of dying as someone who is normal weight without risk factors. In other words, you could be better off being fat.
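For the curious, the reported numbers can be unpacked a little further. The sketch below assumes the interval was built the usual way for odds ratios, symmetrically on the logarithmic scale with a 1.96 multiplier; the paper's exact method isn't described here, so treat that as an assumption, and the function name is invented for illustration. Under that assumption the "middle" of the range is the geometric mean of the two limits, and the standard error and an approximate p-value can be back-calculated from the limits.

```python
import math


def log_scale_summary(estimate, lower, upper, z_crit=1.96):
    """Back-calculate summary numbers from a ratio and its 95% CI.

    Assumes the interval is symmetric on the log scale (an assumption,
    not something confirmed by the source).
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    geometric_mid = math.sqrt(lower * upper)
    z = math.log(estimate) / se_log
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return se_log, geometric_mid, z, p_two_sided


se_log, mid, z, p = log_scale_summary(1.43, 0.33, 6.25)
print(f"geometric middle of the interval: {mid:.2f}")
print(f"standard error on the log scale:  {se_log:.2f}")
print(f"z = {z:.2f}, approximate two-sided p = {p:.2f}")
```

The geometric middle lands right around 1.43, and the approximate p-value comes out far above the conventional 0.05 cutoff.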

These numbers in the parentheses are critical. If you’ve got a number greater than 1 in front of the parentheses, as we do with the 1.43, then you want to make sure that both numbers within the parentheses are above 1. If the first number is below 1, that means the risk could actually be greater the other way, which negates your analysis.
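That rule of thumb is simple enough to write down in a few lines. This is just an illustrative sketch with a made-up function name: a ratio's interval counts as statistically significant only when both limits sit on the same side of 1, the value that means "no difference."

```python
def excludes_no_effect(lower, upper, null_value=1.0):
    """Return True when the whole interval lies on one side of null_value."""
    return lower > null_value or upper < null_value


# The interval reported in the paper straddles 1.
print(excludes_no_effect(0.33, 6.25))  # False

# A hypothetical interval that sits entirely above 1.
print(excludes_no_effect(1.01, 6.25))  # True
```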

In this case the first number is less than 1, which means the risk ratio is not statistically significant and can’t be taken at face value. What it means in this specific case is that the study found no reliable evidence that being overweight early in life makes any difference to your risk of dying of heart disease later in life, as long as your other risk factors (as identified by the researchers in this study) are normal. That’s a far cry from the 43 percent that the authors and the press that picked the story up proclaimed. Too bad the authors and all the medical press people weren’t a little more statistically honest.

But to tell you the truth, I suspect that the authors of this paper (and I know that the medical writers) don’t have the same understanding of the confidence limits and what they really mean that you do after reading this post. Most researchers run their data through a computerized statistical program and simply look at the risk ratio (the 1.43 in this case) without really having a clue what the numbers inside the parentheses mean.

I swear that over the next few weeks I’ll post in as simple a way as I can a basic (very basic) primer on statistics. If I don’t do this in a timely fashion you may write and cancel your subscription to this blog, and the unused portion of your subscription fee will be cheerfully returned.

4 Comments

Mike Dodge says:

January 20, 2006 at 3:27 pm

Good old statistics. I’ve spent most of my life as a quality or reliability engineer so I know a lot about statistics. Your analysis was good, but I would expand it a little.
As you showed, the ‘1.43 (0.33-6.25)’ means that there was no statistical difference between the two groups. But what if the results were ‘1.43 (1.01-6.25)’? In this case there is a statistically meaningful difference between the two groups. But is the difference really 43% as would be reported? No, the difference is statistically 1%. Why? Because, at the 95% confidence level that the limits were established at, the lower limit is the one that defines the only limit that is meaningful. Sure the 43% was the mean difference between the two groups, but it is not at the 95% confidence limit. If the 43% is to have a statistical significance, then the results would have to be something like 1.43 (1.42-1.45).
Unfortunately the mean difference is always reported and used as if it were a meaningful number. A 43% difference sounds a lot better than a 1% difference.

Michael R. Eades, M.D. says:

January 20, 2006 at 9:35 pm

Precisely.

Diana Fairbanks says:

February 18, 2006 at 11:05 am

Are you familiar with a product called “omegasentails” based on the research of German doctor Johanna Budwig? I found this product at http://www.mnwelldir.org/docs/nutrition/coc_recipes.htm#Omegasentials1
Just wondering if you would recommend it….

Michael R. Eades, MD says:

February 27, 2006 at 10:24 pm

I’m not crazy about Omegasentails because they are made from flax seeds, which means that the primary omega-3 fat is alpha-linolenic acid (ALA). According to the ads for the stuff there is some DHA added, but I don’t know how much and I don’t know what the quality is.
Humans don’t use ALA well; they must first convert it to EPA then to DHA. I’ve always figured that if that’s what it needs to get to and you’re trying to supplement, why not just go with the EPA and the DHA and not worry about the conversion.
Good quality fish oil is EPA and DHA with no ALA in the mix.
