Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board



Old 04-29-2015, 11:27 AM   #1
traynor
Registered User
 
 
Join Date: Jan 2005
Posts: 6,626
Statistics Done Wrong

"What goes wrong most often in scientific research and data science? Statistics.

Statistical analysis is tricky to get right, even for the best and brightest. You'd be surprised how many pitfalls there are, and how many published papers succumb to them. Here's a sample:

Statistical power. Many researchers use sample sizes that are too small to detect any noteworthy effects and, failing to detect them, declare they must not exist. Even medical trials often don't have the sample size needed to detect a 50% difference in symptoms. And right turns at red lights are legal only because safety trials had inadequate sample sizes.
Truth inflation. If your sample size is too small, the only way you'll get a statistically significant result is if you get lucky and overestimate the effect you're looking for. Ever wonder why exciting new wonder drugs never work as well as first promised? Truth inflation.

The base rate fallacy. If you're screening for a rare event, there are many more opportunities for false positives than false negatives, and so most of your positive results will be false positives. That's important for cancer screening and medical tests, but it's also why surveys on the use of guns for self-defense produce exaggerated results.

Stopping rules. Why not start with a smaller sample size and increase it as necessary? This is quite common but, unless you're careful, it vastly increases the chances of exaggeration and false positives. Medical trials that stop early exaggerate their results by 30% on average."

Statistics Done Wrong: The Woefully Complete Guide
http://www.statisticsdonewrong.com/

Free.
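To see the power problem concretely, here is a minimal Monte Carlo sketch in plain Python (the 0.5-SD effect size, the 1.96 cutoff, and the group sizes are illustrative assumptions, not figures from the book):

```python
import random
import statistics

def two_sample_t(a, b):
    """Welch's t statistic for two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / na + vb / nb) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se

def power(n, effect=0.5, sims=2000, crit=1.96, seed=42):
    """Fraction of simulated trials whose |t| clears the critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(effect, 1.0) for _ in range(n)]
        if abs(two_sample_t(treated, control)) > crit:
            hits += 1
    return hits / sims

print(power(10))   # small trial: detects the real effect only rarely
print(power(100))  # larger trial: detects it most of the time
```

With these settings the n = 10 trial flags the very real 0.5-SD effect only a small fraction of the time, while n = 100 per group detects it almost always. The effect "must not exist" conclusion from the small trial would be dead wrong.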
Old 04-29-2015, 12:33 PM   #2
DeltaLover
Registered User
 
 
Join Date: Oct 2008
Location: FALIRIKON DELTA
Posts: 4,439
Quote:
Originally Posted by traynor
"What goes wrong most often in scientific research and data science? Statistics. [...] Medical trials that stop early exaggerate their results by 30% on average."
Other things that might be responsible for wrong statistical analysis:
- Garbage in, garbage out

- Wrong null hypothesis

- Lack of, or incorrect, data normalization
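On the normalization point: putting factors on a common scale before comparing or combining them is the usual first step, and skipping it quietly lets the largest-scaled factor dominate. A minimal z-score sketch (the handicapping numbers below are made up for illustration):

```python
import statistics

def zscore(values):
    """Standardize to mean 0, SD 1 so differently scaled factors are comparable."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical factors on very different raw scales:
speed_figs = [88, 92, 79, 95, 90]   # speed figures, roughly 75-100
odds = [2.5, 7.0, 15.0, 3.5, 9.0]   # morning-line odds, roughly 2-20

print(zscore(speed_figs))
print(zscore(odds))
```

After standardizing, both lists have mean 0 and SD 1, so a one-unit move means the same thing in either factor.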
__________________
Whereof one cannot speak, thereof one must be silent.
Ludwig Wittgenstein
Old 05-01-2015, 07:46 PM   #3
crestridge
Paladin & Fudge
 
Join Date: Jun 2007
Location: CALIFORNIA
Posts: 347
Wrong stats

You are so correct. In addition, using first and second true positives alongside the false positives smooths out the exaggeration. But the key is to place a reasonable probability on your own estimate, obtain further evidence independent of that estimate, and then find a "devil's advocate" (so to speak) in the form of a proper false-positive estimate. Doing the math with all three factors produces a single probability.

Usually our own estimates are too high, so we need other opinions, which may be more negative. Combining them probably gives a more realistic view of our predictions.

We must realize that a lot of what happens on the track is out of our control, and these factors impact our analysis.
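One way to "do the math" with all three numbers (only one possible reading of the above, not necessarily the exact method intended) is to average the estimates in log-odds space, which keeps the pooled figure between the most optimistic and most pessimistic inputs:

```python
import math

def logit(p):
    """Probability -> log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-x))

def pooled(estimates):
    """Average several probability estimates in log-odds space."""
    return inv_logit(sum(logit(p) for p in estimates) / len(estimates))

# Hypothetical numbers for illustration:
own = 0.40        # our own (usually optimistic) estimate
evidence = 0.25   # independent outside evidence
devil = 0.15      # the devil's-advocate / false-positive estimate

print(round(pooled([own, evidence, devil]), 3))
```

Here the pooled probability lands near 0.25, pulled down from the optimistic 0.40 by the two more skeptical inputs, which is exactly the corrective effect described above.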

Last edited by crestridge; 05-01-2015 at 07:47 PM.