Benjamin Freeman wrote:
> I am having some difficulty working with the IMF Government Finance
> Statistics Data from the ICPSR Study No. 8624. Specifically, I do not know
> exactly how to unpack the main data file. Has anyone written code to unpack
> this file? Or, does anyone know of an alternative source for this data that
> does not require unpacking?
>
You need to be more specific with your question. I don't have any
trouble with "unpacking".
$ unzip 6279173.zip
Archive: 6279173.zip
extracting: 6279173/ICPSR_08624/08624-manifest.txt
extracting: 6279173/ICPSR_08624/08624-related_literature.txt
extracting: 6279173/ICPSR_08624/08624-descriptioncitation.pdf
inflating: 6279173/ICPSR_08624/DS0001/08624-0001-Codebook.pdf
inflating: 6279173/ICPSR_08624/DS0001/08624-0001-Data.txt
inflating: 6279173/ICPSR_08624/DS0002/08624-0002-Data.txt
Now, concerning the file 6279173/ICPSR_08624/DS0001/08624-0001-Data.txt,
I do see trouble. It is a block of characters I don't recognize. It is
certainly not encoded in ASCII or Unicode. My colleagues here suspect
it might be encoded in the EBCDIC character set, but my first guess was
that somebody edited it in a program that was popular in 1998, say
WordPerfect or the like, and forgot to save it as plain text. I don't
think it is EBCDIC, because:
$ recode EBCDIC..MSDOS test.txt
recode: test.txt failed: Invalid input in step `ANSI_X3.4-1968..ISO-8859-1'
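For anyone without recode, a similar rough check can be sketched in Python.
This is only a heuristic under my own assumptions: the function name and the
90% threshold are made up for illustration, and "cp037" is just one common
EBCDIC code page that Python ships with, not necessarily the one used here.

```python
def looks_like_ebcdic_text(raw: bytes, codec: str = "cp037",
                           threshold: float = 0.9) -> bool:
    """Guess whether raw bytes are plain text in the given EBCDIC code page.

    Hypothetical helper: decodes the bytes and checks what fraction of the
    resulting characters are printable. Binary fields tend to decode to
    control characters and pull the fraction down.
    """
    if not raw:
        return False
    text = raw.decode(codec, errors="replace")
    printable = sum(ch.isprintable() or ch in "\r\n\t" for ch in text)
    return printable / len(text) >= threshold


if __name__ == "__main__":
    # Look only at the first few KB of the data file.
    with open("08624-0001-Data.txt", "rb") as fh:
        sample = fh.read(4096)
    print(looks_like_ebcdic_text(sample))
```

A False here does not prove much by itself; it only says the bytes are not
mostly printable text in that one code page.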
So, back to the codebook. I see this comment: "The data are
stored in packed zoned decimal format. A supplemental COBOL processing
program is available for use with this dataset." The COBOL program,
called Funpack, is provided in the directory DS0002, and my guess is that
one needs a COBOL compiler to run it. Here are the first few
lines, in case you need to be convinced.
EBCDIC LINE
000010 IDENTIFICATION DIVISION.
000020 PROGRAM-ID. 'FUNPACK'.
000030 AUTHOR. KATHLEEN NELICK.
000040 INSTALLATION. IMF BUREAU OF STATISTICS.
000050 DATE-WRITTEN. FEBRUARY 1972.
000060 DATE-COMPILED.
000070*REMARKS. REFORMAT PACKED IFS DF RECORD TO INTERNAL DF RECORD.
I tracked down a couple of COBOL compilers, but could not make any
progress compiling that code.
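If the COBOL route stays closed, the numeric fields could in principle be
decoded directly. Here is a minimal Python sketch of the two standard IBM
encodings the codebook's wording might refer to: packed decimal (two digits
per byte, sign in the last half-byte) and zoned decimal (one digit per byte,
sign in the top half of the last byte). I am assuming the usual IBM
conventions; I have not verified that this file follows them, and the function
names are mine. The record layout (field offsets and lengths) would still have
to come from the codebook or from the Funpack source.

```python
NEGATIVE_SIGNS = (0x0B, 0x0D)  # usual IBM convention; 0x0C and 0x0F are positive


def unpack_packed(field: bytes) -> int:
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    if not field:
        raise ValueError("empty field")
    digits = []
    for byte in field[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(field[-1] >> 4)      # last byte holds one digit...
    sign = field[-1] & 0x0F            # ...and the sign nibble
    if any(d > 9 for d in digits):
        raise ValueError("not a packed-decimal field")
    value = int("".join(str(d) for d in digits))
    return -value if sign in NEGATIVE_SIGNS else value


def unpack_zoned(field: bytes) -> int:
    """Decode a zoned-decimal field: one digit per byte, sign in the last zone nibble."""
    if not field:
        raise ValueError("empty field")
    value = 0
    for byte in field:
        digit = byte & 0x0F
        if digit > 9:
            raise ValueError("not a zoned-decimal field")
        value = value * 10 + digit
    sign = field[-1] >> 4
    return -value if sign in NEGATIVE_SIGNS else value


if __name__ == "__main__":
    print(unpack_packed(bytes([0x12, 0x34, 0x5C])))  # digits 1 2 3 4 5, sign C
    print(unpack_zoned(bytes([0xF1, 0xF2, 0xD3])))   # digits 1 2 3, sign D
```

Neither function knows anything about implied decimal points or where one
field ends and the next begins; that information is not in the bytes.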
As I google around on the internet, I gather that several IMF datasets
are in this format and that several people have asked what to do about
them. I don't have more time to spend on this, but if I were you, the
first thing I would try is PROC DATASOURCE in SAS, which claims it can
handle IMF files.
Good luck. I think if you are clearer about what goes wrong, you
are more likely to get useful answers. When you leave us to
track down what dataset you are using and then guess what might be going
wrong, you are asking an awful lot.
--
Paul E. Johnson email: [log in to unmask]
Dept. of Political Science http://pj.freefaculty.org
1541 Lilac Lane, Rm 504
University of Kansas Office: (785) 864-9086
Lawrence, Kansas 66044-3177 FAX: (785) 864-5700
**********************************************************
Political Methodology E-Mail List
Editor: Karen Long Jusko <[log in to unmask]>
**********************************************************