POLMETH Archives

Political Methodology Society

POLMETH@LISTSERV.WUSTL.EDU

Subject:
From: Gerald Wright <[log in to unmask]>
Reply To: Political Methodology Society <[log in to unmask]>
Date: Tue, 31 Oct 2006 04:58:32 -0500
Content-Type: text/plain
Parts/Attachments: text/plain (89 lines)

The regex approach is powerful, but may not be necessary for Barry's
problem.  I think the easiest way is to use the index function:
. gen hh = 1 if index(y, "Hoover") & index(y, "Herb")

This flags every observation whose string variable y contains both "Herb"
and "Hoover", including "Herbert Hoover" and "Hoover, Herb", but it also
flags some you may not want, such as "Herboof, Axelrod and Hooversmith".
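
A minimal sketch of that behavior on made-up data; the five names below are
illustrative and are not from Barry's file. It assumes a Stata release in
which index() is also available under the name strpos() (swap index() back
in if strpos() is not recognized), and it codes non-matches as 0 rather than
leaving them missing.

```stata
* Illustrative data only: these names are assumptions, not Barry's data
clear
input str40 y
"Hoover, Herbert"
"Hoover, Herb"
"Herbert Hoover"
"Hoover, J. Edgar"
"Herboof, Axelrod and Hooversmith"
end

* strpos() returns the position of the substring in y, or 0 if it is absent,
* so "> 0" turns each lookup into an explicit true/false test
generate byte hh = strpos(y, "Hoover") > 0 & strpos(y, "Herb") > 0

list y hh
```

Here the three Herbert Hoover spellings are flagged and "Hoover, J. Edgar" is
not, but the "Herboof ... Hooversmith" row is flagged as well, because both
substrings occur somewhere in it.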

I'd check whether any such cases slip through and extend the "if" condition
to exclude them.  For example, if the string above is the only one that does
not refer to the intended Hoover, amending the statement as follows would do
the trick:
. gen hh = 1 if index(y, "Hoover") & index(y, "Herb") & ~index(y, "oof")
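
One way to carry out that check-then-exclude step, sketched on made-up data
(the names are illustrative assumptions, and strpos() is used on the
assumption that your Stata release provides it as the newer name for
index()): tabulate the flagged strings first, then switch off the ones that
should not count.

```stata
* Illustrative data only: these names are assumptions, not Barry's data
clear
input str40 y
"Hoover, Herbert"
"Hoover, Herb"
"Herbert Hoover"
"Herboof, Axelrod and Hooversmith"
end

* First pass: flag anything containing both substrings
generate byte hh = strpos(y, "Hoover") > 0 & strpos(y, "Herb") > 0

* Inspect the distinct strings that were flagged to spot unwanted matches
tabulate y if hh

* Second pass: unflag the false positive identified by eye
replace hh = 0 if strpos(y, "oof") > 0

list y hh
```

The exclusion string "oof" is only as good as the inspection behind it, so
it is worth rerunning the tabulate step after the replace to confirm that
nothing legitimate was dropped.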

Jerry


----------------------------------------
Gerald Wright
Department of Political Science
Indiana University
Bloomington, IN 47405
phone:  (812) 855-6308
fax:    (812) 855-2027
Home page: http://mypage.iu.edu/~wright1/
-----------------------------------------



On 10/31/06 2:46 AM, "Christopher N. Lawrence" <[log in to unmask]> wrote:

> On 10/30/06, Barry C. Burden <[log in to unmask]> wrote:
>> I am trying to write some if/then statements in Stata that depend on the
>> value of a string variable.  The trouble is that variable contains names
>> that are not always spelled the same way or written out fully.  For
>> example, in one observation the name is written as "Hoover, Herbert" but
>> in another observation it has been shortened to "Hoover, Herb."  To
>> capture both of these possibilities I would like to tell Stata to recode
>> if y=="Hoov*" or some such thing.  Unfortunately, the user guide isn't
>> clear about the use of wildcards with string variables and my own
>> experiments have failed.  Perhaps this is a better question for the
>> Stata users list, but I thought another political scientist who has
>> worked with strings might be able to come through with a solution.
>
> You may want to try the regexm and/or strmatch functions; alas,
> Stata's online help doesn't really document the pattern matching
> syntax for either... but if regexm(y, "Hoov.*") should do it -- note
> you use '.*' instead of '*' in regular expressions to match any
> string.
>
> Hope this helps,
>
>
> Chris
> --over
> Christopher N. Lawrence <[log in to unmask]>
> Assistant Professor of Political Science (non-tenure-track)
> Saint Louis University
> 109 Fitzgerald Hall
> 3500 Lindell Boulevard
> St. Louis, Missouri 63103-1021
>
> Website: http://www.cnlawrence.com/
>
> **********************************************************
>              Political Methodology E-Mail List
>         Editor: Karen Long Jusko <[log in to unmask]>
> **********************************************************
>         Send messages to [log in to unmask]
>   To join the list, cancel your subscription, or modify
>            your subscription settings visit:
>
>           http://polmeth.wustl.edu/polmeth.php
>
> **********************************************************
