POLMETH Archives

Political Methodology Society

POLMETH@LISTSERV.WUSTL.EDU

Subject: Re: [POLMETH] 2010 Senate Elections results
From: Michael McDonald <[log in to unmask]>
Reply To: Political Methodology Society <[log in to unmask]>
Date: Mon, 10 Jan 2011 14:38:07 -0500
As a data provider and as someone who works with election data, I
wholeheartedly concur with Charles and Bryan. Trust but verify keeps me on
my toes, too. I have also found a zen-like quality to doing hand data entry.
There is value to knowing your data beyond descriptive statistics.

This thread brings to mind an idea I have been kicking around to create a
social science data wiki, with the ability for trusted users to
collaboratively create and maintain social science databases, rather than
allowing only one person with the authority to manage each database, which
is the current model employed at ICPSR, the Dataverse, and other data
archives.

============
Dr. Michael P. McDonald
Associate Professor, George Mason University 
Non-Resident Senior Fellow, Brookings Institution

                             Mailing address:
(o) 703-993-4191             George Mason University
(f) 703-993-1399             Dept. of Public and International Affairs
[log in to unmask]               4400 University Drive - 3F4
http://elections.gmu.edu     Fairfax, VA 22030-4444


-----Original Message-----
From: Political Methodology Society [mailto:[log in to unmask]] On
Behalf Of Bryan Jones
Sent: Monday, January 10, 2011 1:45 PM
To: [log in to unmask]
Subject: Re: [POLMETH] 2010 Senate Elections results

Tobin Grant wrote:

> Would that warning against using others' data apply to data archived
> here?
>
> http://www.policyagendas.org/page/datasets-codebooks


YES, definitely!  The quality of the policy agendas datasets should be  
questioned by anyone who uses them.  Many of you have found mistakes  
we have made, or suggested improvements, and we have tried to address  
these.

But of course Tobin and James are correct--I should have added the  
following phrase: "without examining the reliability of the data  
collection system independently".  But I stand by the criticism that  
too many of us use datasets we know little about.

Bryan

On Jan 10, 2011, at 12:01 PM, Tobin Grant wrote:

> Would that warning against using others' data apply to data archived
> here?
>
> http://www.policyagendas.org/page/datasets-codebooks
>
> I'm kidding on the square.  The caution against simply using existing
> data is warranted, but it can be taken too far.  At some point most of
> us have to rely on someone else for our data.  I can get roll call
> data from the 111th Congress, but that data set probably used some
> documents compiled by the Clerk of the House of Representatives.  I
> can get aggregate public opinion data, but those marginals are often
> reported by the polling house.  Even if you had the data from the
> survey, we shouldn't demand that researchers see the original paper
> (for old surveys) or recordings of each interview to double-check the
> accuracy of the survey coding.  And we could further argue about
> whether we should use research assistants.
>
> Tobin
>
> On Mon, Jan 10, 2011 at 11:25 AM, Bryan Jones <[log in to unmask]> wrote:
>> Colleagues
>>
>> I want to add a strong second to Charles Stewart's old grump caveat.  He
>> only thinks he is a grump about data quality!  I am the Grand Grump.
>> Grabbing someone else's numbers and running analyses on them should no
>> longer be acceptable in political science (or anywhere else, for that
>> matter).  A little care in data quality can yield strong benefits in
>> improved test power.  But as Soroka, Wlezien, and McLean show, error in
>> data can also lead one to reject a null inappropriately.
>>
>> It is a great piece that shows how error in data can cause all sorts of
>> bad stuff, and should be required reading in research design courses.
>> Stuart Soroka, Christopher Wlezien, and Iain McLean. 2005.  How Measures
>> Matter.  Journal of the Royal Statistical Society A 169 Part 2: 255-71.
>>
>>
>> On Jan 8, 2011, at 8:57 AM, Charles Stewart III wrote:
>>
>>> At the risk of seeming like an old grump, could I step back and fret a
>>> bit about data quality, which this thread has left by the side of the
>>> road?
>>>
>>> Election return data at the local level is surprisingly difficult to
>>> acquire, even in this electronic age.  It requires a certain persistence
>>> that has to be learned and trained.  Learning to be persistent is
>>> difficult in these days when we teach our students that if they know how
>>> to ask, there's already an Excel spreadsheet available with the data
>>> they need.
>>>
>>> In the United States, all local jurisdictions must eventually report
>>> their canvassed returns to the state election officer, which then
>>> publishes them.  All states publish returns at the level of reporting,
>>> usually the county.  Most of these reports are on the web, in a variety
>>> of formats, which can now easily (if tediously) be manipulated with
>>> software that is essentially free (by which I mean Excel and Adobe's
>>> PDF reader).  I know this from personal experience, constructing
>>> datasets of turnout and vote totals for presidential elections at the
>>> county and town level each year since 2000.  While I have purchased some
>>> software to speed up the process, like Able2Extract and OCR programs,
>>> it's still the case that every format I've seen from a state can be
>>> manipulated with one of these programs, with a little hand-entry thrown
>>> in from time to time.
>>>
>>> We have a great debt to pay to news organizations and web sites that
>>> gather this data for us and post it up on their sites for our use.  In
>>> the days immediately following elections, these sites (NY Times, CNN,
>>> etc.) are often the only practical sources for scholars to use, if they
>>> want to be involved in the evolving story about the election.  (In other
>>> words, if you want to talk to reporters or write blogs, you often need
>>> to use these sites.)  These sites are often just mirroring the AP
>>> operation, which is an extraordinary operation.  HOWEVER, AP is not a
>>> scholarly operation.  That's where my concern lies.
>>>
>>> States take a long time to certify their election returns and release
>>> the final results, accounted for at the local level, sometimes taking
>>> months to do so.  (In 2008, for instance, Massachusetts did not make
>>> available its town-level returns, even if you asked really nice, until
>>> April 2009.)  AP will eventually mop up their data set, so that it
>>> reflects these certified returns.  However, I strongly suspect that the
>>> abandoned datasets sitting around on the CNN and NYT web sites are not
>>> updated, and so reflect all sorts of omissions, such as provisional and
>>> absentee ballots in many states.
>>>
>>> As an example, compare the results at the NYT site for California
>>> (http://elections.nytimes.com/2010/results/california) with the
>>> California statement of vote
>>> (http://www.sos.ca.gov/elections/sov/2010-general/complete-sov.pdf).
>>> The New York Times data is missing 1.6 million votes, or roughly 16% of
>>> the votes cast for senator.  That's because the NYT data ceased to be
>>> updated as California completed its count.
>>>
>>> I think it is generally a very bad idea to encourage graduate students
>>> to purchase datasets like this, or to rely on news web sites, but if
>>> there is one web site that bears watching, it's Dave Leip's Atlas of
>>> U.S. Presidential Elections, since the data are usually based on
>>> official returns and, if you dig deeply enough, he reports his data
>>> sources.  I've purchased data from his site, have cross-checked it with
>>> my own efforts, and found it very good.
>>>
>>> In conclusion, I hope we can make a distinction between the
>>> fast-and-dirty analysis we do on the fly, because we're trying to make
>>> sense of unfolding election counts or trying to gin up an example for
>>> class, and the scholarly analysis we do.  If the former, then the
>>> newspaper web sites are fine, and even indispensable.  If the latter, we
>>> have a duty to be careful about provenance.
>>>
>>> Charles
>>>
>>> ===============================================================
>>> Charles Stewart III
>>> Kenan Sahin Distinguished Professor of Political Science
>>> Housemaster of McCormick Hall
>>>
>>> Voice:  617.253.3127 / Facsimile:  617.258.8546
>>> e-mail:  [log in to unmask] / URL:  http://web.mit.edu/cstewart/www/
>>>
>>> Department of Political Science
>>> 30 Wadsworth Street
>>> Building E53-449
>>> Cambridge, Massachusetts   02139
>>>
>>>
>>> -----Original Message-----
>>> From: Political Methodology Society [mailto:[log in to unmask]] On
>>> Behalf Of John Henderson
>>> Sent: Friday, January 07, 2011 7:34 PM
>>> To: [log in to unmask]
>>> Subject: Re: [POLMETH] 2010 Senate Elections results
>>>
>>> I don't believe so.  Here's the full API list:
>>> http://developer.nytimes.com/docs
>>>
>>> On Fri, Jan 7, 2011 at 10:31 AM, Simon Jackman <[log in to unmask]> wrote:
>>>
>>>> Are these data part of what NYTimes exposes via its API?
>>>>
>>>> Simon Jackman
>>>> Dept of Political Science and, by courtesy, Statistics,
>>>> Stanford University, Stanford CA 94305-6044
>>>> http://jackman.Stanford.edu
>>>>
>>>> On Jan 7, 2011, at 8:41 AM, John Henderson <[log in to unmask]> wrote:
>>>>
>>>>> For a licensing fee, you can purchase the county data here:
>>>>> http://www.uselectionatlas.org/BOTTOM/store_data.php.
>>>>>
>>>>> For free, you can access county data on the nytimes webpage, but you'll
>>>>> have to enter the data by hand using their graphical interface:
>>>>> http://elections.nytimes.com/2010/results/senate.  Maybe you could get
>>>>> the data from the nytimes via email, however?
>>>>>
>>>>> Best,
>>>>> John
>>>>>
>>>>>
>>>>> On Fri, Jan 7, 2011 at 1:20 AM, Inaki Sagarzazu <[log in to unmask]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I was trying to find the county level results for the 2010 senate
>>>>>> elections and it has proven a little difficult. Does anybody have any
>>>>>> suggestions as to where these results can be found?
>>>>>>
>>>>>> Thanks in advance
>>>>>>
>>>>>> Iñaki
>>>>>>
>>>>>> -----------------------------------------------------------------
>>>>>> Iñaki Sagarzazu
>>>>>> Research Officer
>>>>>> Nuffield College
>>>>>> University of Oxford
>>>>>> OX1 1NF Oxford
>>>>>> United Kingdom
>>>>>>
>>>>>> Office: 44 (0) 1 865 614990
>>>>>> [log in to unmask]
>>>>>> http://www.nuffield.ox.ac.uk/users/sagarzazu/index.html
>>>>>>
>>>>>>
>>>>>> **********************************************************
>>>>>>         Political Methodology E-Mail List
>>>>>> Editors: Diana O'Brien        <[log in to unmask]>
>>>>>>        Jon C. Rogowski <[log in to unmask]>
>>>>>> **********************************************************
>>>>>>    Send messages to [log in to unmask]
>>>>>> To join the list, cancel your subscription, or modify
>>>>>>       your subscription settings visit:
>>>>>>
>>>>>>      http://polmeth.wustl.edu/polmeth.php
>>>>>>
>>>>>> **********************************************************
>>>>>>
>>
>> Bryan D. Jones
>> JJ Pickle Chair of Congressional Studies
>> Department of Government
>> University of Texas at Austin
>> Austin, TX 78712
>> Office: 512-471-9973
>> [log in to unmask]

Bryan D. Jones
JJ Pickle Chair of Congressional Studies
Department of Government
University of Texas at Austin
Austin, TX 78712
Office: 512-471-9973
[log in to unmask]




