POLMETH Archives

Political Methodology Society

POLMETH@LISTSERV.WUSTL.EDU

Reply To:   Political Methodology Society <[log in to unmask]>
Date:       Fri, 3 Aug 2007 09:58:13 -0500
Title:      A default prior distribution for logistic and other
regression models

Authors:    Andrew Gelman, Aleks Jakulin, Maria Grazia Pittau,
Yu-Sung Su

Entrydate:  2007-08-03 09:30:57

Keywords:   Bayesian inference, generalized linear model, least
squares, hierarchical model, linear regression, logistic
regression, multilevel model, noninformative prior distribution

Abstract:   We propose a new prior distribution for classical
(non-hierarchical) logistic regression models, constructed by
first scaling all nonbinary variables to have mean 0 and
standard deviation 0.5, and then placing independent Student-$t$
prior distributions on the coefficients.  As a default choice, we
recommend the Cauchy distribution with center 0 and scale 2.5,
which in the simplest setting is a longer-tailed version of the
distribution attained by assuming one-half additional success
and one-half additional failure in a logistic regression.  We
implement a procedure to fit generalized linear models in R with
this prior distribution by incorporating an approximate EM
algorithm into the usual iteratively weighted least squares.  We
illustrate with several examples, including a series of logistic
regressions predicting voting preferences, an imputation model
for a public health data set, and a hierarchical logistic
regression in epidemiology.
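The two ingredients of the construction (rescaling nonbinary inputs to mean 0 and standard deviation 0.5, then placing a Cauchy prior with center 0 and scale 2.5 on each coefficient) can be sketched in a few lines of Python. The paper's own implementation is in R; the toy predictor values here are made up, and the use of the sample standard deviation is an assumption of this sketch:

```python
import math
import statistics

# Hypothetical nonbinary predictor; rescale to mean 0, sd 0.5
# before applying the default prior, as the abstract describes.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
m = statistics.fmean(x)
s = statistics.stdev(x)  # sample sd (an assumption of this sketch)
x_scaled = [0.5 * (xi - m) / s for xi in x]

def cauchy_logpdf(beta, center=0.0, scale=2.5):
    """Log-density of the default Cauchy(0, 2.5) prior on a coefficient."""
    z = (beta - center) / scale
    return -math.log(math.pi * scale * (1.0 + z * z))
```

On this scale, a coefficient of 2.5 (one prior scale) corresponds to a change of five standard deviations of the predictor moving the logit by 2.5, which is why the prior is weak enough for routine use yet still rules out absurdly large effects.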

We recommend this default prior distribution for routine applied
use.  It has the advantage of always giving answers, even when
there is complete separation in logistic regression (a common
problem, even when the sample size is large and the number of
predictors is small) and also automatically applying more
shrinkage to higher-order interactions.  This can be useful in
routine data analysis as well as in automated procedures such as
chained equations for missing-data imputation.
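The separation problem can be seen in a minimal Python sketch (the data are invented; a crude grid search stands in for the paper's approximate EM / iteratively weighted least squares procedure). With completely separated data the likelihood keeps increasing as the coefficient grows, so the maximum-likelihood estimate runs off to infinity, while the Cauchy-penalized objective has a finite interior mode:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Completely separated toy data: y = 1 exactly when x > 0.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]

def loglik(b):
    # Logistic log-likelihood for a single slope, no intercept.
    return sum(math.log(sigmoid(b * xi)) if yi == 1
               else math.log(sigmoid(-b * xi))
               for xi, yi in zip(x, y))

def logprior(b, scale=2.5):
    # Default Cauchy(0, 2.5) log-density, up to the same constant for all b.
    z = b / scale
    return -math.log(math.pi * scale * (1.0 + z * z))

grid = [i * 0.01 for i in range(2001)]             # b in [0, 20]
b_ml = max(grid, key=loglik)                        # hits the grid boundary
b_map = max(grid, key=lambda b: loglik(b) + logprior(b))  # finite mode
```

Here `b_ml` lands on the edge of the search grid (the unpenalized estimate diverges), while `b_map` settles at a moderate finite value, illustrating the "always gives answers" property claimed above.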

http://polmeth.wustl.edu/retrieve.php?id=717
