I would just like to point out that I have seen some horrible, horrible
"manipulation" done with "Excel and Adobe's pdf reader". Just for a start,
such methods are by their very nature non-reproducible.
So, there's something positive about either joining forces (so it is
easy/cheaper to double or triple check) or selecting a trustworthy provider
of data rather than rolling your own.
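
To make the double-checking concrete, here is a minimal sketch of a scripted
cross-check between two sources of county totals. Everything in it is an
assumption for illustration: the county names and vote counts are made-up toy
numbers, the "county,votes" column layout is hypothetical, and load_totals and
compare_sources are names I invented. The point is only that a script, unlike
hand edits in a spreadsheet, can be rerun and checked by someone else.

```python
import csv
import io


def load_totals(csv_text):
    """Read 'county,votes' rows into a dict mapping county -> int votes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["county"].strip(): int(row["votes"]) for row in reader}


def compare_sources(official, secondary, tolerance=0.0):
    """Return (county, official_votes, secondary_votes) for every mismatch.

    A county missing from one source is reported with None for that source.
    tolerance is a relative threshold (0.0 means any difference is flagged).
    """
    mismatches = []
    for county in sorted(set(official) | set(secondary)):
        a = official.get(county)
        b = secondary.get(county)
        if a is None or b is None:
            mismatches.append((county, a, b))
        elif abs(a - b) > tolerance * max(a, 1):
            mismatches.append((county, a, b))
    return mismatches


# Made-up toy data, NOT real election returns.
OFFICIAL_CSV = """county,votes
Alpha,1200
Beta,800
Gamma,500
"""

SECONDARY_CSV = """county,votes
Alpha,1200
Beta,650
"""

official = load_totals(OFFICIAL_CSV)
secondary = load_totals(SECONDARY_CSV)
report = compare_sources(official, secondary)

for county, a, b in report:
    print(county, a, b)
```

In real use the two inputs would be files built from the certified state
returns and from whatever secondary source you are tempted to use, and the
list of mismatches is what a second person would review.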
Just my 2 cents.
-Eduardo
--
Eduardo Leoni
Analista - Gestão em Pesquisa
CD 2010 - IBGE/UE/BA
[log in to unmask]
71 2105-8654
On Mon, Jan 10, 2011 at 2:25 PM, Bryan Jones <[log in to unmask]> wrote:
> Colleagues
>
> I want to add a strong second to Charles Stewart's old grump caveat. He
> only thinks he is a grump about data quality! I am the Grand Grump.
> Grabbing someone else's numbers and running analyses on them should no
> longer be acceptable in political science (anywhere else, for that matter).
> A little care in data quality can yield strong benefits in improved test
> power. But as Soroka, Wlezien, and McLean show, error in data can also
> lead one to reject a null inappropriately.
>
> It is a great piece that shows how error in data can cause all sorts of
> bad stuff, and should be required reading in research design courses.
> Stuart Soroka, Christopher Wlezien, and Iain McLean. 2005. How Measures
> Matter. Journal of the Royal Statistical Society A 169 Part 2: 255-71.
>
>
>
> On Jan 8, 2011, at 8:57 AM, Charles Stewart III wrote:
>
>> At the risk of seeming like an old grump, could I step back and fret a bit
>> about data quality that this thread has left by the side of the road?
>>
>> Election return data at the local level is surprisingly difficult to acquire,
>> even in this electronic age. It requires a certain persistence that has to
>> be learned and trained. Learning to be persistent is difficult in these
>> days when we teach our students that if they know how to ask, there's
>> already an Excel spreadsheet available with the data they need.
>>
>> In the United States, all local jurisdictions must eventually report their
>> canvassed returns to the state election officer, which then publishes them.
>> All states publish returns at the level of reporting, usually the county.
>> Most of these reports are on the web, in a variety of formats, which can
>> easily (if tediously) be manipulated with software that is now
>> essentially free --- by which I mean Excel and Adobe's pdf reader. I know
>> this from personal experience, constructing datasets of turnout and vote
>> totals for presidential elections at the county and town level each year
>> since 2000. While I have purchased some software to speed up the process,
>> like able2extract and ocr programs, it's still the case that every format
>> I've seen from a state can be manipulated with one of these programs, with a
>> little hand-entry thrown in from time-to-time.
>>
>> We have a great debt to pay to news organizations and web sites that
>> gather this data for us and post it up on their sites for our use. In the
>> days immediately following elections, these sites --- NY Times, CNN, etc,
>> --- are often the only practical sources for scholars to use, if they want
>> to be involved in the evolving story about the election. (In other words,
>> if you want to talk to reporters or write blogs, you often need to use these
>> sites.) These sites are often just mirroring the AP operation, which is an
>> extraordinary operation. HOWEVER, AP is not a scholarly operation. That's
>> where my concern lies.
>>
>> States take a long time to certify their election returns and release the
>> final results, accounted for at the local level --- sometimes taking months
>> to do so. (In 2008, for instance, Massachusetts did not make available its
>> town-level returns, even if you asked really nice, until April 2009.) AP
>> will eventually mop up their data set, so that it reflects these certified
>> returns. However, I strongly suspect that the abandoned datasets sitting
>> around on the CNN and NYT web sites are not updated, and so reflect all
>> sorts of omissions, such as provisional and absentee ballots in many states.
>>
>> As an example, compare the results at the NYT site for California (
>> http://elections.nytimes.com/2010/results/california) with the California
>> statement of vote (
>> http://www.sos.ca.gov/elections/sov/2010-general/complete-sov.pdf). The
>> New York Times data is missing 1.6 million votes, or roughly 16% of the
>> votes cast for senator. That's because the NYT data ceased to be updated as
>> California completed its count.
>>
>> I think it is generally a very bad idea to encourage graduate students to
>> purchase datasets like this, or to rely on news web sites, but if there is
>> one web site that bears watching, it's Dave Leip's Atlas of U.S.
>> Presidential Elections, since the data are usually based on official returns
>> and, if you dig deeply enough, he reports his data sources. I've purchased
>> data from his site, have cross-checked it with my own efforts, and found it
>> very good.
>>
>> In conclusion, I hope we can make a distinction between the fast-and-dirty
>> analysis we do on the fly, because we're trying to make sense of unfolding
>> election counts or trying to gin-up an example for class, and the scholarly
>> analysis we do. If the former, then the newspaper web sites are fine, and
>> even indispensable. If the latter, we have a duty to be careful about
>> provenance.
>>
>> Charles
>>
>> ===============================================================
>> Charles Stewart III
>> Kenan Sahin Distinguished Professor of Political Science
>> Housemaster of McCormick Hall
>>
>> Voice: 617.253.3127 / Facsimile: 617.258.8546
>> e-mail: [log in to unmask] / URL: http://web.mit.edu/cstewart/www/
>>
>> Department of Political Science
>> 30 Wadsworth Street
>> Building E53-449
>> Cambridge, Massachusetts 02139
>>
>>
>> -----Original Message-----
>> From: Political Methodology Society [mailto:[log in to unmask]] On
>> Behalf Of John Henderson
>> Sent: Friday, January 07, 2011 7:34 PM
>> To: [log in to unmask]
>> Subject: Re: [POLMETH] 2010 Senate Elections results
>>
>> I don't believe so. Here's the full API list:
>> http://developer.nytimes.com/docs
>>
>> On Fri, Jan 7, 2011 at 10:31 AM, Simon Jackman <[log in to unmask]>
>> wrote:
>>
>> Are these data part of what NYTimes exposes via its API?
>>>
>>> Simon Jackman
>>> Dept of Political Science and, by courtesy, Statistics,
>>> Stanford University, Stanford CA 94305-6044
>>> http://jackman.Stanford.edu
>>>
>>> On Jan 7, 2011, at 8:41 AM, John Henderson <[log in to unmask]>
>>> wrote:
>>>
>>> For a licensing fee, you can purchase the county data here:
>>>> http://www.uselectionatlas.org/BOTTOM/store_data.php.
>>>>
>>>> For free, you can access county data on the nytimes webpage, but you'll have
>>>> to enter the data by hand using their graphical interface:
>>>> http://elections.nytimes.com/2010/results/senate. Maybe you could get the
>>>> data from the nytimes via email however?
>>>>
>>>> Best,
>>>> John
>>>>
>>>>
>>>> On Fri, Jan 7, 2011 at 1:20 AM, Inaki Sagarzazu <
>>>> [log in to unmask]> wrote:
>>>>
>>>> Hello,
>>>>>
>>>>> I was trying to find the county level results for the 2010 senate elections
>>>>> and it has proven a little difficult. Does anybody have any suggestions as
>>>>> to where these results can be found?
>>>>>
>>>>> Thanks in advance
>>>>>
>>>>> Iñaki
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------------------------------------------------
>>>>> Iñaki Sagarzazu
>>>>> Research Officer
>>>>> Nuffield College
>>>>> University of Oxford
>>>>> OX1 1NF Oxford
>>>>> United Kingdom
>>>>>
>>>>> Office: 44 (0) 1 865 614990
>>>>> [log in to unmask]
>>>>> http://www.nuffield.ox.ac.uk/users/sagarzazu/index.html
>>>>>
>>>>>
>>>>> **********************************************************
>>>>> Political Methodology E-Mail List
>>>>> Editors: Diana O'Brien <[log in to unmask]>
>>>>> Jon C. Rogowski <[log in to unmask]>
>>>>> **********************************************************
>>>>> Send messages to [log in to unmask]
>>>>> To join the list, cancel your subscription, or modify
>>>>> your subscription settings visit:
>>>>>
>>>>> http://polmeth.wustl.edu/polmeth.php
>>>>>
>>>>> **********************************************************
>>>>>
>>>>>
>
> Bryan D. Jones
> JJ Pickle Chair of Congressional Studies
> Department of Government
> University of Texas at Austin
> Austin, TX 78712
> Office: 512-471-9973
> [log in to unmask]
>
>
>
>
>
>