Interpreting Coefficients from a Logit-Linear Model with a Proportional Dependent Variable

Often researchers deal with proportional dependent variables that are logit-linear. In other words,

$$\operatorname{logit}(y) = \ln\left(\frac{y}{1-y}\right), \qquad y = \frac{n}{N}$$

where y is the observed number within some group (n) divided by the total number of observations (N). As an empirical example, in my own work, y is commonly the proportion of seats held by females in the national legislature, i.e. the number of women divided by the number of seats. 

Here’s a hypothetical distribution of a logit-normal variable that was generated using Stata’s random number generator:

[Figure: distribution of a simulated logit-normal variable]
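A variable like this can be simulated along the following lines. This is a minimal sketch, not the code behind the figure: the seed, the sample size, and the mean and standard deviation of the underlying normal draw are all assumptions.

```stata
* Simulate a logit-normal proportion (all parameter values are illustrative)
clear
set seed 12345
set obs 1000
* Normal draw on the logit scale
generate u = rnormal(-1, 0.75)
* Map the draw to the open interval (0, 1)
generate y = invlogit(u)
summarize y
histogram y, fraction
```

Because invlogit() maps any real number into the open interval (0, 1), the simulated y never hits exactly zero or one.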

Note that the logit transformation is defined only for values strictly between zero and one; it is undefined at exactly zero and exactly one. Researchers whose proportions lie on the closed interval [0, 1] therefore need to ‘winsorize’ their observations, replacing zeros with a value slightly above zero and ones with a value slightly below one. Because the replacement value is chosen arbitrarily, this introduces some arbitrary bias into the model.
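One way to apply this adjustment in Stata is sketched below. The bounds of 0.01 and 0.99 are an assumption for illustration, the four data values are made up, and y_w is a hypothetical variable name.

```stata
* Toy data: proportions on the closed interval [0, 1]
clear
input y
0
0.12
0.35
1
end
* Copy y, then pull the boundary values inside the open interval
generate y_w = y
replace y_w = 0.01 if y < 0.01
replace y_w = 0.99 if y > 0.99 & !missing(y)
* The logit is now defined for every nonmissing observation
generate logit_y = logit(y_w)
list
```

The !missing(y) condition matters because Stata treats missing values as larger than any number, so y > 0.99 alone would also overwrite missing observations.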

The logit-linear model can be estimated as:
$$\operatorname{logit}(y) = \ln\left(\frac{y}{1-y}\right) = \beta_0 + \beta_1 x + \varepsilon$$

A common problem, then, is interpreting the coefficients from this model: that is, the impact of a one-unit change in x on the dependent variable y itself, rather than on the logit transformation of y.

This is particularly complicated if we continue to think in terms of proportions rather than ratios. In effect, the logit transformation of any proportion reduces to the natural log of the ratio of some in-group to some out-group. The in-group is the group we are interested in studying (n). In the previous example, females are the in-group. The out-group is everyone else, i.e. the reference category (z = N – n). When looking at female representation, the out-group is males.

Some simple rearranging will make it clear. Let n denote the number of members within the in-group, let z denote the number of members in the out-group, and let N represent the total number of observations (i.e. both groups combined).
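$$\operatorname{logit}(y) = \ln\left(\frac{y}{1-y}\right) = \ln\left(\frac{n/N}{1-n/N}\right) = \ln\left(\frac{n}{z}\right)$$

because

$$1 - \frac{n}{N} = \frac{N-n}{N} = \frac{z}{N}$$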
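A quick numerical check that the logit of a proportion equals the log of the in-group to out-group ratio, using made-up counts (30 in-group members out of 100 observations):

```stata
* Hypothetical counts, chosen only for illustration
scalar n = 30
scalar N = 100
scalar z = N - n
* Logit of the proportion n/N
display logit(n/N)
* Natural log of the ratio n/z
display ln(n/z)
```

Both display lines should print the same value, since (n/N)/(z/N) simplifies to n/z.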

As a result, the logit-linear model for a proportional dependent variable can be thought of as a log-linear model for a ratio dependent variable of the in-group to the out-group.
$$\ln\left(\frac{n}{z}\right) = \beta_0 + \beta_1 x + \varepsilon$$

Interpretation of coefficients follows the same logic as any other log-linear model. This is easiest to interpret as a percentage change in the ratio of the in-group to the out-group.
$$\%\Delta\left(\frac{n}{z}\right) = 100 \times \left(e^{\beta_1} - 1\right) \quad \text{for a one-unit increase in } x$$
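To see how a coefficient maps to a percentage change, and why reading the coefficient directly as a percentage only works for small values, here is a sketch with a made-up coefficient of 0.5 (an assumption for illustration, not an estimate from any model in this post). It relies on the standard log-linear result that a one-unit change in x multiplies the ratio by exp(b).

```stata
* Hypothetical coefficient from a log-linear model
scalar b = 0.5
* Exact percentage change in the ratio for a one-unit increase in x
display 100*(exp(b) - 1)
* Naive reading of the coefficient as a percentage
display 100*b
```

The exact figure is roughly 64.9 percent, against 50 percent for the naive reading. The gap widens as the coefficient grows.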

Example using Stata
In 2008, Tripp & Kang published an article in Comparative Political Studies, which included data on women’s representation in 153 countries in 2006. They utilized a logit-linear model with the proportion of seats held by women as the dependent variable.

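$$\operatorname{logit}(\mathit{rep2006}) = \ln\left(\frac{\mathit{rep2006}}{1-\mathit{rep2006}}\right) = \mathbf{x}\boldsymbol{\beta} + \varepsilon$$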

where rep2006 is the proportion of seats held by women in 2006. Note, like many others, Tripp and Kang observe zeros and thus winsorize these data to 0.01. Their model reduces to the log-linear model of the ratio of females to males in the legislature:
$$\ln\left(\frac{\text{females}}{\text{males}}\right) = \mathbf{x}\boldsymbol{\beta} + \varepsilon$$

Using data from Kang’s website, I re-estimate their “Model C” in Stata. The raw output is below.
[Stata output: re-estimation of Tripp & Kang’s Model C]

Of course, these are identical to the results reported by Tripp & Kang in Table 2 of their publication. Their dataset contains the logit-transformed variable, rep06. Alternatively, one can calculate a dependent variable that is logit-transformed by typing the following command:

. gen logit_y = logit(y)

where y is the observed proportional dependent variable (PDV). Stata can calculate the inverse logit of a single value using the display command:

. display invlogit([value])

and can also recover the original proportion from a logit-transformed variable using the generate command:

. gen invlogit_y = invlogit(logit_y)
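A small round-trip check ties these functions together. The four proportions below are made up, and y_back and ratio are hypothetical variable names.

```stata
* Toy proportions strictly between zero and one
clear
input y
0.05
0.25
0.5
0.9
end
* Forward transformation to the logit scale
generate logit_y = logit(y)
* Inverse transformation: should reproduce y
generate y_back = invlogit(logit_y)
* Exponentiating the logit gives the in-group to out-group ratio y/(1 - y)
generate ratio = exp(logit_y)
list
```

The y_back column should match y up to floating-point precision, and ratio equals 1 where y is 0.5, i.e. where the two groups are the same size.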

Rather than using Clarify to simulate predicted changes in the proportion of seats held by women, I recommend interpreting these results in terms of the predicted percentage change in the ratio of females to males.

The two key variables in Model C are quota, a dummy variable coded as one (1) if the country had any sort of gender quota in place during the previous election, and prelect, which is coded as one (1) if the country utilized proportional representation during the previous election. Following the equation above, the coefficients for these variables can be interpreted in terms of a percentage change in the ratio of females to males.

Countries with gender quotas have, on average, an 86.3% higher ratio of females to males in the legislature, all else being equal.
[Equation image: percentage change in the female-to-male ratio computed from the quota coefficient]

Countries with proportional representation have, on average, a 53.8% higher ratio of females to males in the legislature, all else being equal.

[Equation image: percentage change in the female-to-male ratio computed from the prelect coefficient]
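This kind of calculation can be done directly from the stored estimates after regress. The sketch below runs on simulated data, not on Tripp & Kang’s dataset: the sample size of 153 echoes the example, but the intercept, the coefficient values (0.6 and 0.4), the error variance, and the variable name logit_rep are all assumptions made so that the block runs on its own.

```stata
* Simulated stand-in data (NOT the Tripp & Kang data)
clear
set seed 2014
set obs 153
generate quota = runiform() < 0.4
generate prelect = runiform() < 0.5
generate logit_rep = -2 + 0.6*quota + 0.4*prelect + rnormal(0, 0.8)
* Logit-linear model: OLS on the logit-transformed proportion
regress logit_rep quota prelect
* Point estimates of the percentage change in the ratio
display 100*(exp(_b[quota]) - 1)
display 100*(exp(_b[prelect]) - 1)
* Same quantities with delta-method standard errors
nlcom (quota_pct: 100*(exp(_b[quota]) - 1)) (prelect_pct: 100*(exp(_b[prelect]) - 1))
```

Using nlcom rather than display alone has the advantage of reporting a standard error and confidence interval for each percentage change.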

To wrap up, here’s a graph comparing the observed values to the fitted values in each interpretation method. As you can see, the slopes of the two lines are nearly identical, with some slight differences due to error.
[Figure: observed versus fitted values under each interpretation method]
