Random significance question

Quinn_Inuit

Newly Enlightened
Joined
Mar 30, 2007
Messages
67
Location
Virginia
Random question based on this comic: https://xkcd.com/882/
Let's say they'd tested green jelly beans first and came up with the same spurious P-value. (Also assume there are twenty total jelly bean colors.) At that point, they declare victory and go back to playing Minecraft. Would the scientists still need to perform the Bonferroni Correction on the results because they could have tested 19 other colors, but didn't?
 

Borad

Enlightened
Joined
May 27, 2011
Messages
227
If you go to the address in the link text, it works. The target he's using is different from the link text. Try this.
 

ElectronGuru

Flashaholic
Joined
Aug 18, 2007
Messages
6,055
Location
Oregon
Random question based on this comic: https://xkcd.com/882/
Let's say they'd tested green jelly beans first and came up with the same spurious P-value. (Also assume there are twenty total jelly bean colors.) At that point, they declare victory and go back to playing Minecraft. Would the scientists still need to perform the Bonferroni Correction on the results because they could have tested 19 other colors, but didn't?
The problem with statistics is that it tries to use math to predict or describe human behavior. XKCD is showing what happens when reporters - who are incentivized to find significance - draw conclusions from information supplied by scientists who aren't, or aren't ready to.

Every answer has a certain amount of random chance, and the more times you ask a question, the more likely you are to hit that chance. And the more likely people are to see that hit as though it means something, as though it wasn't random - especially when we are looking for meaning (as the reporters are).
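To put rough numbers on that (a back-of-the-envelope sketch assuming independent tests and the usual 0.05 threshold - my assumptions, not anything stated in the comic):

```python
# Back-of-the-envelope: chance of at least one false positive when you
# keep asking the same question (independent tests, alpha = 0.05 assumed).
alpha = 0.05
for n_tests in (1, 5, 20):
    p_fluke = 1 - (1 - alpha) ** n_tests
    print(f"{n_tests:2d} tests -> P(at least one fluke) = {p_fluke:.2f}")
# 1 test -> 0.05,  5 tests -> 0.23,  20 tests -> 0.64
```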

Scientists have tools like Bonferroni, but the real problem lies not with the scientists or the reporters but with the consumers of the information. With so many layers of interpretation between the data and the reader, how does the reader deal with so much uncertainty?

For example, if the Bonferroni Correction is designed to compensate for the increased influence of random chance, applying it to only 1 of 19 results would be correcting that which needs no correction. But understanding the science is very time-consuming. I prefer mapping the incentives of each 'actor' in the information stream.
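For what it's worth, the correction itself is simple arithmetic: divide the significance threshold by the number of comparisons. A minimal sketch, assuming 20 colors and a made-up p-value for green (neither number comes from the comic's scientists):

```python
# Minimal Bonferroni sketch: the per-test threshold shrinks with the
# number of comparisons performed (20 colors assumed here).
alpha = 0.05
n_comparisons = 20
adjusted_alpha = alpha / n_comparisons      # 0.0025

p_green = 0.04                              # hypothetical p-value for green
print("significant without correction:", p_green < alpha)           # True
print("significant with Bonferroni:   ", p_green < adjusted_alpha)  # False
```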
 

Quinn_Inuit

Newly Enlightened
Joined
Mar 30, 2007
Messages
67
Location
Virginia
The problem with statistics is that it tries to use math to predict or describe human behavior. XKCD is showing what happens when reporters - who are incentivized to find significance - draw conclusions from information supplied by scientists who aren't, or aren't ready to.

Every answer has a certain amount of random chance, and the more times you ask a question, the more likely you are to hit that chance. And the more likely people are to see that hit as though it means something, as though it wasn't random - especially when we are looking for meaning (as the reporters are).

No argument here. My question, though, involves what happens if you happen to hit that chance early on. Is there a way to correct for that possibility beforehand, or is that baked into whatever you use to generate the p-value in the first place?

Scientists have tools like Bonferroni, but the real problem lies not with the scientists or the reporters but with the consumers of the information. With so many layers of interpretation between the data and the reader, how does the reader deal with so much uncertainty?

For example, if the Bonferroni Correction is designed to compensate for the increased influence of random chance, applying it to only 1 of 19 results would be correcting that which needs no correction. But understanding the science is very time-consuming. I prefer mapping the incentives of each 'actor' in the information stream.

That's a reasonable way to do it from a practical/personal perspective, but it opens you up to accusations of argumentum ad hominem if you try to use it in debate with others.
 

ElectronGuru

Flashaholic
Joined
Aug 18, 2007
Messages
6,055
Location
Oregon
It sounds like you are dealing with something specific you can't share, so I'm unable to address it specifically. But speaking generally, the confidence of an answer depends on sample size and quality. So trying to compensate for incomplete data with an otherwise reliable correction sounds like a good idea. But does knowing that a flaw in the data exists make you immune to it, if you don't know what the flaw is?

It's like with antibiotics. You're on a 10-day course, feel better at day 5, and stop taking the pills. You got the answer you wanted (feeling better), so why continue? Except that we know from other people who've tried it (more data) that stopping early leaves some of the buggers alive to come back stronger next time. But you wouldn't know that your first time without cross-checking with other patients' first times.
 

Quinn_Inuit

Newly Enlightened
Joined
Mar 30, 2007
Messages
67
Location
Virginia
It sounds like you are dealing with something specific you can't share, so I'm unable to address it specifically. But speaking generally, the confidence of an answer depends on sample size and quality. So trying to compensate for incomplete data with an otherwise reliable correction sounds like a good idea. But does knowing that a flaw in the data exists make you immune to it, if you don't know what the flaw is?

It's like with antibiotics. You're on a 10-day course, feel better at day 5, and stop taking the pills. You got the answer you wanted (feeling better), so why continue? Except that we know from other people who've tried it (more data) that stopping early leaves some of the buggers alive to come back stronger next time. But you wouldn't know that your first time without cross-checking with other patients' first times.

I'm sorry, I'm not trying to be opaque. I appreciate the assistance, and I'm really asking about the situation in the comic. Let's say they found significance the first time they tried a color and then stopped. Have they made any mistakes in their process or their calculations, or is this just the random chance that the p-value takes into account?
 

RetroTechie

Flashlight Enthusiast
Joined
Oct 11, 2013
Messages
1,007
Location
Hengelo, NL
and I'm really asking about the situation in the comic.
Fair enough... let's take that apart:

Statement (or theory, if you will): "Jelly beans cause acne!"
(gets refuted)
New theory: "It's only a certain color that causes it."
(gets refuted for a whole set of colors)

Reporter (who doesn't seem to understand the facts, their meaning, science or statistics) comes up with his own theory: "Green jelly beans linked to acne!"

Now a) That last 'theory' should be easy to confirm or refute by those scientists who've already tested 20 other jelly bean colors. :laughing:
b) Reporter's claim would actually hold if original claims are true (or assumed to be true), and those 20 colors + green are all possible jelly bean colors. A simple process of elimination, after which you're left with whatever must be true then (however likely or unlikely it seems).
c) Read closer: "I hear it's only a certain color ...". If that's indeed hearsay, all the other claims are built on quicksand.
d) Whatever you conclude, those scientists' findings (all those jelly bean colors tested) simply stand. And will keep standing, until the experiments themselves are shown to be flawed somehow.
e) Reporter's public will draw its own conclusions.

So science isn't always easy. It is in some ways, but hard in other ways. Math is what it is, and statistics can be used to 'prove' anything. Like ElectronGuru said, it matters a lot what data you've got, and how it was obtained. If you're obtaining data, all data obtained should be considered meaningful - even those 99 'fails' before your 100th 'success'.
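A quick simulation shows what those 'fails' and 'successes' look like when there is no real effect at all (a sketch under my own assumptions: 20 independent tests, a 0.05 threshold, and p-values uniform under the null - none of this comes from the comic):

```python
# Sketch: simulate the jelly bean experiment with NO real effect.
# Each "color" gets a p-value drawn uniformly from [0, 1], which is what
# p-values look like when the null hypothesis is true.
import random

random.seed(0)
alpha = 0.05
n_colors = 20
n_experiments = 10_000

hits = 0
for _ in range(n_experiments):
    p_values = [random.random() for _ in range(n_colors)]
    if any(p < alpha for p in p_values):   # at least one "significant" color
        hits += 1

print(f"Fraction of experiments with a spurious 'link': {hits / n_experiments:.2f}")
# Expect roughly 0.64, matching 1 - 0.95**20.
```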

So trying to compensate for incomplete data with an otherwise reliable correction sounds like a good idea.
I disagree. That's just trying to compensate for missing data by applying some correction that's worked for other data. Depending on context, that may make sense. Or it may attach more weight to the data than should be attached to it. Either way, it's no substitute for "obtain more / better data".

I'll spring my own theory: on average, reporters are about as dumb as the public they serve. Just more skilled when it comes to making money from their writings... :p
 