LISTSERV - POLMETH Archives

Anna Maria Ortiz · Wed, 7 Feb 2007 08:52:07 -0500

This dataset was developed for natural language processing but may be of
interest to political scientists as well. 

---------- Forwarded message ----------
Subject: [DDLBETAtag] congressional-speech dataset available

The "congressional speech" corpus and associated graph information
used in our EMNLP 2006 paper, "Get out the vote: Determining support
or opposition from Congressional floor-debate transcripts", are now
available.

Specifically, the data includes speeches as individual documents,
together with:

      * automatically-derived labels for whether the speakers supported
        the legislation under discussion or not, allowing for
        experiments with this kind of sentiment analysis

      * indications of which debate each speech comes from (and the
        position within the debate), allowing for consideration of
        conversational structure

      * indications of by-name references between speakers, allowing for
        experiments with agreement classification (if one determines the
        "true" labels from the support/oppose labels assigned to the
        pair of speakers in question)

      * the edge weights and other information we derived to create the
        graphs we used for our experiments upon this data, facilitating
        implementation of alternative graph-based classification methods
        upon the graphs we constructed
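
As a rough illustration of the agreement-classification idea in the
third item above, the sketch below derives an agree/disagree label for
each by-name reference by comparing the support/oppose labels of the
two speakers involved. The function name, the speaker identifiers, the
"support"/"oppose" strings, and the in-memory dictionary are all
assumptions made for the example; they do not reflect the corpus's
actual file format.

```python
from typing import Dict, List, Tuple


def agreement_labels(
    stance: Dict[str, str],
    references: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Label each (referrer, referee) pair as "agree" or "disagree".

    stance maps a hypothetical speaker id to "support" or "oppose".
    references lists (referrer, referee) pairs, one per by-name reference.
    Pairs with an unlabeled speaker are skipped rather than guessed.
    """
    labeled = []
    for referrer, referee in references:
        if referrer not in stance or referee not in stance:
            continue  # no "true" label can be derived for this pair
        same_side = stance[referrer] == stance[referee]
        labeled.append((referrer, referee, "agree" if same_side else "disagree"))
    return labeled


if __name__ == "__main__":
    # Hypothetical speakers and labels, for illustration only.
    stance = {"speaker_a": "support", "speaker_b": "oppose", "speaker_c": "support"}
    references = [("speaker_a", "speaker_b"), ("speaker_a", "speaker_c")]
    for row in agreement_labels(stance, references):
        print(row)
```

Skipping pairs with a missing label keeps the derived labels tied
strictly to the support/oppose annotations, at the cost of dropping
some references.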

The download site is:
http://www.cs.cornell.edu/home/llee/data/convote.html 

Matt Thomas, Bo Pang, and Lillian Lee

Anna Maria Ortiz, Ph.D.
Senior Statistician
Applied Research and Methods
U.S. Government Accountability Office
(202) 512-2788