Horse Racing Forum - PaceAdvantage.Com - Horse Racing Message Board

Old 11-04-2012, 06:03 PM   #3
GameTheory
Registered User
 
Join Date: Dec 2001
Posts: 6,128
(...cont)

Ok, moving on, so what about the science part? What CAN we take away from polls at face value?

Well, the direct questions and answers: If they ask who you are voting for, or your age, or what party or ideology you consider yourself, there is no reason to question those answers. Sure, people can lie, but that's got to be a pretty small number. So that's the actual "opinion of the people" part of the poll (which the turnout numbers are not), but we want to use a whole bunch of such q&a sessions to predict things like who's going to win and by how much. But because we know that the population of poll respondents does NOT represent the entire voting population at large (and the margin of error does not address this question, see my previous post about polling cats & dogs in the prez odds thread), that's where the art comes in and we start weighting our samples and looking at the main variable we are interested in (who are you voting for) cross-referenced with other traits like party id, sex, age, etc. And how we weight those samples controls the "top-line" results of the poll (e.g. Obama 47, Romney 45), because the opinion part of the polls is very consistent within regions and groups. For instance, if you break it down by (self-identified) party id, which correlates most strongly with breaking for one candidate or the other, the results are always around a 90-10 split for the "my party" guy and then the independents break wherever they break for that particular region and set of candidates. And you will generally see strong agreement in all the polls within these categories, even if their top-lines are completely different. (Independents vary the most, but the self-identified groups are always between 88-92 for their guy. 88 percent represents a very UNENTHUSIASTIC response to a party candidate within that party.) Close to the election you don't see significant changes in those numbers, so the top-line result is ultimately controlled almost completely by the makeup of the sample.

It is easiest to pick on party id as *the* factor to look for here, and that's the one people always talk about (which is exactly what I'm going to do as well), but realize there are an infinite number of ways to slice up this data.

Ok, so all of the above boils down to this: the top-line poll results you see are FAR more "turnout estimate models" than they are representative of "the opinion of the people". So just remember when comparing polls with vastly different results what the real differences are. If you drilled into those polls you'd find near identical numbers under the Republican, Democrat, and Independent categories, and the reason they have different top-line results is that the proportions of Republicans, Democrats, and Independents vary wildly between them. So on the science side, we've got numbers that are essentially fixed and that everybody agrees on, and so the difference between the polls is almost exclusively on the art side. (This is not necessarily true over time, but right near the end of the election it is.) So what are we looking at when we see a poll result? An estimate of the TURNOUT in the *opinion* of the pollster. That's it! But their opinions are not (usually) wild guesses -- they are based on the census, voter registrations, historical turnout, and their likely voter screen. But please realize how sensitive these numbers are. Just one percentage point in one party's column that should be in the other party's column has a major effect because they all vote 90-10 for their guy. So a poll with a 2 point spread means nothing. The exit polls aren't even that accurate, often off by 1.5 pts or more for each candidate. (Early voting is now a problem for exit polls too, but it is factored in.) The level of precision people ascribe to these polls is simply not there.
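To put a rough number on that sensitivity, here is a minimal Python sketch. The 90-10 partisan split is the one described above; the 50-50 independent split and the 35:35:30 starting mix are made up purely for illustration, and the function name is just a label I picked.

```python
def candidate_share(dri, splits):
    """Two-way vote share for candidate A.

    dri    : (dem, rep, ind) shares of the electorate, summing to 1.0
    splits : fraction of each group voting for candidate A
    """
    return sum(weight * split for weight, split in zip(dri, splits))


# Illustrative assumption: partisans split 90-10, independents 50-50.
splits = (0.90, 0.10, 0.50)

base = candidate_share((0.35, 0.35, 0.30), splits)     # even D/R mix
shifted = candidate_share((0.36, 0.34, 0.30), splits)  # one point moved R -> D

# Margin is A minus B; the two shares sum to 1, so margin = 2 * share - 1.
margin_base = 2 * base - 1
margin_shifted = 2 * shifted - 1

print(f"margin change: {(margin_shifted - margin_base) * 100:.1f} pts")
```

With a 90-10 split on both sides, each point that lands in the wrong party's column moves the margin by about 1.6 points, which is most of a 2 point spread all by itself.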

Which is why we make averages, right? Yes, that does improve things, but if the DRI (Democrat-Republican-Independent) mixes in most of the polls are systematically off in the same direction, then that error will still be baked into the average rather than canceled out between them, and canceling out is the usual justification for averaging in the first place.
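A toy illustration of that point, with every number invented: each hypothetical poll has its own noise, which averages away, plus a lean that all of them share, which does not.

```python
true_dem_share = 0.36  # hypothetical "real" Democratic share of the electorate

# Invented numbers: every poll leans 3 points Democratic for the same reason,
# and each one also has its own independent noise.
shared_bias = 0.03
own_noise = [-0.02, 0.01, 0.00, 0.02, -0.01]  # sums to zero, so it cancels

poll_dem_shares = [true_dem_share + shared_bias + e for e in own_noise]
average = sum(poll_dem_shares) / len(poll_dem_shares)

# The leftover error equals the shared bias; averaging only removed the noise.
leftover = average - true_dem_share
print(f"average of polls: {average:.3f}, error left over: {leftover:+.3f}")
```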

So what I do is to separate the art and science parts of the poll by making two weighted averages.

The first is the percentage of votes for each candidate within each party id sub-group. So I'll look at several polls, and maybe one of them will have Dems voting for Obama:Romney at 90:10, and another will have 92:8, and another will have 89:11, and then I'll weight those by sample size to get an estimate of how self-labeled Democrats are going to vote, and then do the same with Republicans and Independents. These numbers I really pay attention to. They are not opinions (of the pollster), and it doesn't even matter if the pollster is biased -- these numbers should still hold as long as the pollster isn't just fabricating data (which some of them clearly do, but mostly small-timers).
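A minimal sketch of that first average in Python. The three Democratic splits are the example figures above; the sample sizes are invented, and weighting by the number of Democrats in each sample (rather than by the whole poll's size) is an assumption on my part.

```python
def weighted_average(values, weights):
    """Average of values, each counted in proportion to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Share of self-identified Democrats backing Obama in three polls
# (the 90:10, 92:8 and 89:11 examples).
dem_for_obama = [0.90, 0.92, 0.89]

# Number of Democrats in each poll's sample: made-up figures.
dem_sample_sizes = [400, 250, 350]

dem_avg = weighted_average(dem_for_obama, dem_sample_sizes)
print(f"Dems for Obama, weighted: {dem_avg:.4f}")
```

The same function would be run again for the Republican and Independent sub-groups.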

Then I take a separate weighted average of the DRI mix used in each poll to get a grand average of all the polls' turnout estimates. And then I can put those two resulting sets of averages back together and do the math to come up with a weighted poll of polls (that is still using the internal models of the pollsters). If the pollsters are accurate in their models, this will give a very good projection. But I'm also free to ignore the DRI average and substitute what I might feel is a better turnout estimate and create a different projection that way -- substituting my art for theirs. Not all polls give out all the information I need to calculate the averages (or you have to pay for it), so I don't use Rasmussen and some others.
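Here is one way the two averages could be put back together, as a rough Python sketch and not necessarily the exact procedure used for the figures in this post. The three polls are invented, the `Poll` class and function names are hypothetical, and the within-group splits are assumed to be already normalized to a two-way race.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    n: int        # total respondents
    dri: tuple    # (dem, rep, ind) shares of the sample, summing to 1.0
    for_a: tuple  # two-way share for candidate A within (dem, rep, ind)


def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def average_splits(polls):
    """Science side: per-group support for A, weighted by group sample size."""
    result = []
    for g in range(3):
        group_sizes = [p.n * p.dri[g] for p in polls]
        result.append(weighted_average([p.for_a[g] for p in polls], group_sizes))
    return tuple(result)


def average_dri(polls):
    """Art side: the pollsters' turnout mix, weighted by poll size."""
    sizes = [p.n for p in polls]
    return tuple(
        weighted_average([p.dri[g] for p in polls], sizes) for g in range(3)
    )


def project(dri, splits):
    """Candidate A's two-way share for a given turnout mix."""
    return sum(w * s for w, s in zip(dri, splits))


polls = [  # invented polls, for illustration only
    Poll(1000, (0.36, 0.32, 0.32), (0.92, 0.07, 0.48)),
    Poll(800, (0.34, 0.34, 0.32), (0.93, 0.06, 0.50)),
    Poll(600, (0.38, 0.31, 0.31), (0.91, 0.08, 0.47)),
]

splits = average_splits(polls)
pollster_view = project(average_dri(polls), splits)  # the pollsters' own turnout
my_view = project((0.33, 0.34, 0.33), splits)        # a substituted turnout guess

print(f"pollsters' mix: {pollster_view:.3f}   substituted mix: {my_view:.3f}")
```

Only the `dri` argument changes between the two projections; the within-group splits stay fixed, which is the whole idea of separating the art from the science.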

So the first question is: are the pollsters' turnout estimates (DRI mix) any good? Well, that depends on whether you believe the exit polls -- it gets murky when you start comparing one poll to another to check for accuracy, especially since in states with early voting, adding the early voters to the exit poll is tricky business, and the turnout model of the exit poll could also be flawed. Whaddya gonna do? (And as we are about to find out, the polls' turnout projection and the reported turnout on the exit poll are wildly different.) I've concentrated most of my analysis on Ohio and Colorado. If Romney wins those, he wins. I think it is safe to say that if he wins Ohio in defiance of the polls and wins Colorado also (which I actually think might be tougher) then it is a pretty good bet he is also going to win all those other swing states where he is tied or leading and so will have at least 275 electoral votes. Both Ohio and Colorado are heavy early voting states.

Let's look at 2008.

According to my average of the 2008 polls taken just before election day in Colorado, the pollsters' projected DRI was 40:38:22. (Once again that's 40% Dems, 38% Reps, and 22% Inds.) What did the exit poll say? 30:31:39. Not even close to the projection, and that is a trend I see in most of the 2008 data -- the polls' projections have way too many Democrats and far too few Independents compared to the exit polls. And for Colorado anyway, the independent number from the exit poll is more plausible, as more people in Colorado (where I am) do think of themselves that way, but they lean Democrat. Independents leaned Democrat in Colorado even towards Kerry in 2004, and are projected to do so again this year, which is one reason I think Colorado may be tougher for Romney than Ohio even though it appears he is polling better here. Anyway, even though the projection mix and the exit poll mix are very much different, the projection made using the pollsters' model in 2008 was actually spot-on. I usually normalize the numbers as if all votes went to one of the two main candidates, so I'm not looking at exact vote numbers. My normalized 2008 projection for Colorado using the polls' turnout estimate: 54.7-45.2. Actual (normalized) results: 54.4-45.6. WOW that's close. That's closer than the exit poll itself predicted using its own DRI mix and vote-by-party id percentages. It predicted 53.3-46.7, which over-estimated McCain. Still, that could just be an accident. What if I plug in the exit poll's reported DRI mix with my calculated numbers for how each group would vote? Then I get 55.1-44.9, which over-estimates Obama. Of course, if you are doing this before the election, you don't have an exit poll to give you a DRI mix -- you either use the pollsters' or tweak it or base it on the previous election or just make one up.
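For anyone unsure what "normalized" means here, a short Python sketch: drop third-party and undecided votes and rescale the two main candidates so they sum to 100. The raw 52-46 input is a made-up example, not a result from any state.

```python
def normalize_two_way(a_pct, b_pct):
    """Rescale two candidates' percentages so they sum to 100."""
    total = a_pct + b_pct
    if total <= 0:
        raise ValueError("the two shares must sum to a positive number")
    return 100 * a_pct / total, 100 * b_pct / total


# Hypothetical raw result with 2% going to other candidates.
a_norm, b_norm = normalize_two_way(52.0, 46.0)
print(f"normalized: {a_norm:.1f}-{b_norm:.1f}")
```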

2008 in Ohio, the pollsters' projected DRI mix: 45:39:16. Again, independents so low. Exit poll reported mix: 39:31:30. That looks closer to reality for 2008. The pure poll projection for 2008 Ohio: 52.2-47.8. Actual results: 52.3-47.7. Wow, again. Predictions don't get closer than that. (Again, these numbers are normalized to sum to 100%.) I'm going to have a lot of trouble arguing against using the pollsters' numbers for 2012, aren't I? They nailed this thing in 2008, at least in CO & OH. (Actually, I nailed it using their numbers, but anyway.) Exit poll prediction was: 53.6-46.4, over-estimating Obama. And using the exit DRI mix with the poll numbers for how those groups would vote: 53.9-46.0, over-estimating Obama even more. But to repeat, there is no exit poll before the election so that's just a curiosity. However, we would like to have some historical numbers to use in order to help guide us as we come up with a turnout projection for this year, so it would be nice if the exit poll numbers were accurate. No way of knowing.

And now 2012.

In Colorado, my current DRI projection from the polls is: 34:34:32. Not as rosy for Obama as 2008's 40:38:22, but in Colorado independents lean Dem so it still points to an Obama win: 50.7-49.3 (normalized), although a thin one. (He won by nearly 9 points normalized in 2008.) But certainly not a lock this time. What if we use the 2008 exit poll turnout numbers (30:31:39)? Then we get dead solid even -- Obama ahead by only 1/3 of a normalized point.

So what happens in Colorado this year? By the pollsters' numbers, a thin Obama win. By the 2008 exit polls, a tie, uncallable. Move any of the turnout projections towards Romney at all, and Romney wins. But Colorado is very tough -- population is expanding with people moving from other states, mostly to the urban areas. (Denver public school district is currently the fastest growing in the nation.) Those new people probably lean Democrat, but conservative stronghold Colorado Springs is growing as well, and then we've got places like Aspen and Vail full of rich people and all the rural areas that always go red. And, enthusiasm for Obama *is* down overall if not necessarily among hard-core Democrats. Among independents it is definitely way down, from 61-39 to 53-47. And enthusiasm for Romney is much higher than for McCain among Reps: from 89-11 to 94-6. 94 is about as high as it gets for in-party numbers, and 89 is pretty bad actually. So I think the best scenario for Obama is that any possible (but not assured) favorable changes within the population in the state are canceled out by his lower popularity and Mitt's much better showing. Remember the exit poll in 2008 actually had 1 point more Republicans than Dems, with independents higher than either, and over-estimated McCain. Let's assume the Dem-Rep mix was a rounding error and give 1 pt back to the Dems from the Reps. That gives us 31:30:39, which would have predicted the 2008 election spot-on. Using that mix for 2012, we get a projection of 51.0-48.9, almost exactly the same as the current polls' projection of 50.7-49.3. Different roads to the same result.

Here's where the art really takes over. We have seen that the pollsters' turnout projections compared to the exit polls' turnout estimates are VASTLY different -- not even close. And they are all just polls. So at this point anybody inclined to argue simply that "pollsters know what they are doing, we should just go by their estimates" doesn't have a leg to stand on. You can't argue that pollsters are always right when their own polls are wildly differing on the equivalent figures. So is the sampling from an exit poll more representative of the voting population than pre-election sampling? Common sense says yes, and the exit polls have much larger samples than the normal pre-election polls. But still they could be skewed depending on where you do your exit polling, and how you account for early voting, which means you are adding telephone samples back into the mix. Polling is more difficult than ever -- these figures just can't be that precise. Anyway, I choose the exit polls as more plausible and less arbitrary than the pre-election turnout estimates. You may disagree, and start throwing "historical accuracy" at me, but certainly the exit polls' historical record at predicting elections is at least as good as that of pre-race polls, right?

Ok, so if a 31:30:39 DRI mix picked the 2008 election spot-on, and that same mix predicts a 2 pt win by Obama this time, what does that mean? I don't think anybody is going to argue that this election is going to be MORE favorable turnout-wise for Obama than last time, even in Colorado where maybe the Democrat population has increased. So Obama's best scenario is a thin win in Colorado. If he wins big in Colorado (he did win by 9 last time), then I am totally out to lunch and I apologize that you've read this far. But we've got to ask how plausible it is that turnout will even be as favorable to Obama this time. Not likely, I think. To get an idea of what high Republican turnout in Colorado might look like, consider that the 2004 exit polls were 29:38:33 -- R+9. Such things are possible in this state. If turnout is like that, Romney wins by 7 points. But, that's probably not plausible either. Obama is an incumbent and still well-regarded among his base, not a lame-o loser like John Kerry who wasn't well-regarded by anybody. But similarly, enthusiasm for Romney is well above that for McCain. Does it approach incumbent George Bush 2004 levels? Maybe, but he is the challenger so let's assume not. Independents are tough to figure -- they haven't gone over to Romney en masse (they went for Kerry even), but they are way less sweet on Obama. That could just mean the ones that voted for him last time have largely decided just to stay home. I'm going to say independents are going to be closer to 1/3 of the electorate this time rather than the very high figure of 39% reported last time. So let's lower independents, add some to Repubs, and also add some to Dems, but less than to Repubs. Remember these are percentages, so D's and R's can go up relatively simply by I's staying home. Basically I'm assuming that D's will turn out in the same numbers, R's will turn out in much greater numbers, and the independents will turn out less. How about 33:34:33? That gets us absolutely tied -- a re-count situation with Romney winning by 100 votes or something. These are not absolute predictions but just plausible scenarios -- more plausible I think than what the raw polling is giving us. I'm trying to be very conservative and be generous to Obama -- he's the incumbent and incumbents are hard to beat. But seriously, the range of the plausible goes from a very thin Obama win to a big Romney win. But the most likely is Romney by a point or two at least. But I do find Colorado hard to figure.
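To see how the Colorado scenarios compare side by side, here is a rough Python sketch that runs the turnout mixes discussed above through one set of within-group splits. The Republican (94-6) and independent (53-47) splits are the rounded figures quoted earlier; the 93-7 Democratic split is purely an assumption, since that number isn't given in this post. Because of that assumption and the rounding, the output will not reproduce the projections quoted above exactly. It only shows the mechanics.

```python
def project(dri, splits):
    """Obama's two-way share for a (dem, rep, ind) turnout mix."""
    return sum(w * s for w, s in zip(dri, splits))


# Share voting Obama within (Dems, Reps, Inds).
# 0.93 for Dems is an ASSUMED value; 0.06 and 0.53 are the rounded figures above.
splits_2012 = (0.93, 0.06, 0.53)

scenarios = {
    "2012 pollsters' average": (0.34, 0.34, 0.32),
    "2008 exit poll": (0.30, 0.31, 0.39),
    "2008 exit poll, tweaked": (0.31, 0.30, 0.39),
    "2004 exit poll": (0.29, 0.38, 0.33),
    "my guess for 2012": (0.33, 0.34, 0.33),
}

results = {name: project(dri, splits_2012) for name, dri in scenarios.items()}

for name, share in results.items():
    margin = (2 * share - 1) * 100  # Obama minus Romney, in points
    print(f"{name:26s} Obama {share * 100:5.1f}  margin {margin:+5.1f}")
```

Changing a single entry in `scenarios` is all it takes to try a different turnout guess, which is exactly why the top-line number is so easy to move.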

Objection: "But you're just pulling numbers out of your ass!" Well, yeah, that's the art side, and the pollsters do the same. That's my point. ALL TURNOUT ESTIMATES ARE PULLED OUT OF ASSES! So we look at history and current conditions, etc to guide us. But in the end there is always something arbitrary about them.

On to Ohio. It seems like it should be a slam-dunk for Romney, given that we are talking about Obama being tied in Colorado which he won by *9* last time, greater than the national average. He won Ohio by only 4 or 5 pts (depending on where you look -- I actually see at least 3 different results for Ohio 2008 from different sources), well less than the national average. And everybody agrees that Romney is winning Ohio independents handily. So what's going on? How could Obama possibly be winning Ohio?

Using all the pollster numbers gives us a projection of Obama winning by 3 points: 51.5-48.5. Using the 2008 exit polls, we get Obama by 4.5 points: 52.3-47.7. But the 08 exit polls also over-estimated Obama, so if we dial back the Dems a bit to give it a 37:32:31 DRI mix, that would have hit 2008 spot-on. If we compare the pollsters' turnout DRI mixes from 08 to 12 it looks like this: from 45:39:16 to 38:31:30, which like in Colorado actually ends up slightly rosier for Obama than a slight tweak to the exit poll to get it in line. Their projection actually looks much more normal this time as the independent number isn't absurdly low anymore. Anyway, using my modified 2008 exit poll DRI mix, the projection is now Obama by 2 points: 51-49.

So again we ask, is it really plausible that the turnout in 2012 will be just as favorable to Obama as it was in 2008? Once again, let's see what a high Republican turnout in Ohio might look like. The 2004 exit poll reports 35:40:25. If we use that, Romney wins by 6. But, as in Colorado, probably more realistic is to split the difference while still favoring Obama. Moving the turnout rates only a single point from 2008 towards Romney to 36:33:31 gets us an absolute tie. Isn't it reasonable to think that the shift towards Romney will be at least that much? If so, Romney wins.
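The "how far does turnout have to move to get a tie" question can be turned into a one-line formula. A sketch in Python, with the caveat that the within-group splits below are invented (the Ohio sub-group numbers aren't listed in this post), so the answer it prints is not a claim about Ohio.

```python
def project(dri, splits):
    """Candidate A's two-way share for a (dem, rep, ind) turnout mix."""
    return sum(w * s for w, s in zip(dri, splits))


def points_to_tie(dri, splits):
    """Points of turnout that must move from the Dem to the Rep column for 50-50.

    A positive result means candidate A currently leads.
    """
    s_dem, s_rep, _ = splits
    lost_per_unit = s_dem - s_rep  # A's share lost per unit moved D -> R
    if lost_per_unit <= 0:
        raise ValueError("expected Dems to back A more strongly than Reps do")
    return 100 * (project(dri, splits) - 0.5) / lost_per_unit


# The 37:32:31 mix is the tweaked 2008 exit poll mix from above.
# The splits are HYPOTHETICAL: 92% of Dems, 7% of Reps, 45% of Inds for A.
dri = (0.37, 0.32, 0.31)
splits = (0.92, 0.07, 0.45)

shift = points_to_tie(dri, splits)

# Check: apply the shift and confirm the projection lands on 50-50.
tied_dri = (dri[0] - shift / 100, dri[1] + shift / 100, dri[2])
print(f"shift needed: {shift:.2f} pts, share after shift: {project(tied_dri, splits):.3f}")
```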

I would like to sum up now, but I'm sick of writing this, so we'll cover any confusion in later discussion.

And now a final wrinkle -- the big-ass storm. Did it help Obama? Probably. Will it tip it in his direction if it needed tipping? Maybe. I've got no answers for that, and there isn't enough time for the polls to really reflect it. But a bunch of very late polls moving towards Obama (or vice-versa) could be the real deal. I had expected the election to have obviously broken for one guy or the other by now, and that still can happen very late, storm or not, reflected in the polling or not. Gallup suspended national polling early due to the storm. Hard to say what the final effect will be, if any, since the hardest hit areas are already going for Obama. It did kind of stop the Romney campaign in its tracks and take him off people's radar. Who knows?


Here's some more crap to read on similar subjects:

http://www.redstate.com/2012/10/31/o...ewed-unskewed/


Discuss.