Date: Wed, 7 Feb 2007 08:52:07 -0500
This dataset was developed for natural language processing but may be of
interest to political scientists as well.
---------- Forwarded message ----------
Subject: [DDLBETAtag] congressional-speech dataset available
The "congressional speech" corpus and associated graph information
used in our "Get out the vote: Determining support or opposition from
Congressional floor-debate transcripts" EMNLP 2006 paper are now
available.
Specifically, the data includes speeches as individual documents,
together with:
* automatically-derived labels for whether the speakers supported
  the legislation under discussion or not, allowing for
  experiments with this kind of sentiment analysis
* indications of which debate each speech comes from (and the
  position within the debate), allowing for consideration of
  conversational structure
* indications of by-name references between speakers, allowing for
  experiments with agreement classification (if one determines the
  "true" labels from the support/oppose labels assigned to the
  pair of speakers in question)
* the edge weights and other information we derived to create the
  graphs we used for our experiments upon this data, facilitating
  implementation of alternative graph-based classification methods
  upon the graphs we constructed
The download site is:
http://www.cs.cornell.edu/home/llee/data/convote.html
Matt Thomas, Bo Pang, and Lillian Lee
Anna Maria Ortiz, Ph.D.
Senior Statistician
Applied Research and Methods
U.S. Government Accountability Office
(202) 512-2788