Chapter 10 Panel Data: Fixed Effects and some Random Effects
10.1 Seminar
In this seminar, you will be asked to work more on your own. Start by clearing your workspace and setting your working directory. We will then introduce the necessary R code for today using the example from the lecture. This will be brief and afterwards, you can analyse yourself whether more guns lead to less crime.
rm(list = ls())
setwd("Your directory")
We start by loading the resource curse data and checking it with the str() function.
a <- read.csv("resourcecurse.csv")
str(a)
'data.frame': 876 obs. of 10 variables:
$ country : Factor w/ 73 levels "Afghanistan",..: 1 1 1 1 1 1 1 1 1 1 ...
$ countrycode : Factor w/ 73 levels "AFG","ALB","ARG",..: 1 1 1 1 1 1 1 1 1 1 ...
$ year : int 1996 1998 2000 2002 2003 2004 2005 2006 2007 2008 ...
$ aid : num NA NA NA 1.15 1.21 ...
$ oil : Factor w/ 532 levels "..","0","0.000156118640282417",..: 1 1 1 57 58 61 66 48 44 47 ...
$ gdp.capita : num NA NA NA NA NA NA NA NA NA NA ...
$ institutions: num 2.06 2.09 2.13 1.75 1.58 ...
$ polity2 : int 7 7 7 NA NA NA NA NA NA NA ...
$ population : int 17822884 18863999 20093756 21979923 23064851 24118979 25070798 25893450 26616792 27294031 ...
$ mortality : num 106 104 104 103 104 ...
The oil variable is coded as a factor but it should be numeric. Missing values are coded as “..”. We recode them as NA and then convert the variable to numeric.
# recode missings
a$oil[which(a$oil == "..")] <- NA
# convert to numeric (via character, so that we get the values and not the factor level codes)
a$oil <- as.numeric(as.character(a$oil))
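Converting a factor to numeric is a classic trap in R, so here is a minimal sketch on a toy vector (the values are made up for illustration and are not taken from the resource curse data). It shows that as.numeric() applied directly to a factor returns the internal level codes, while going through as.character() first returns the numbers that were actually recorded.

```r
# Toy factor that mimics a numeric column read in with ".." as missing-value code
# (hypothetical values, for illustration only)
oil <- factor(c("..", "0.5", "12.3", "0.5"))

# Recode the missing-value marker to NA
oil[oil == ".."] <- NA

# Pitfall: as.numeric() on a factor returns the internal level codes
codes <- as.numeric(oil)

# Converting to character first recovers the recorded values
values <- as.numeric(as.character(oil))

print(codes)
print(values)
```

If the column was read in as character rather than factor (the default in recent R versions), the as.character() step does no harm.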
To estimate panel data models, we need to install the plm package. You only need to do this once.
install.packages("plm")
Every time we want to use the package (when we start a new R session), we load the plm library like so:
library(plm)
Loading required package: Formula
We log-transform GDP per capita and population size.
a$log.gdp <- log(a$gdp.capita)
a$log.pop <- log(a$population)
10.1.1 Our data
| Variable | Description |
| --- | --- |
| country | country name |
| countrycode | 3-letter country abbreviation |
| year | year of observation |
| aid | net aid flow (in per cent of GDP) |
| oil | oil rents (in per cent of GDP) |
| gdp.capita | GDP per capita in constant 2000 US dollars |
| institutions | world governance indicator index for quality of institutions |
| polity2 | polity IV project index |
| population | population size |
| mortality | mortality rate (per 1000 live births) |
We test the rentier states theory and the resource curse that we discussed in the lecture. It states that rentier capitalism can be a curse on the systemic level. States that extract rents from easily lootable resources instead of taxing their people develop institutions that become unresponsive to their citizens and provide fewer public goods. North and Weingast (academic heroes), for instance, relate the advent of democracy in Britain to the struggle for property rights.
10.1.2 Unit fixed effects (country fixed effects)
In class, our first fixed effects model was called m3. It was the unit fixed effects model. Recall that the unit fixed effects model is the same as including dummy variables for all countries except the baseline country. Therefore, we control for all potential confounders that vary across countries but are constant over time (e.g., the colonial heritage of a country).
# run fixed effects model
m3 <- plm(
institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality,
data = a,
index = c("country", "year"),
model = "within",
effect = "individual"
)
# model output
summary(m3)
Oneway (individual) effect Within Model
Call:
plm(formula = institutions ~ oil + aid + log.gdp + polity2 +
log.pop + mortality, data = a, effect = "individual", model = "within",
index = c("country", "year"))
Unbalanced Panel: n = 58, T = 1-12, N = 672
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-0.3936224 -0.0622048 0.0019414 0.0580157 0.3903817
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
oil 0.000077706 0.000092452 0.8405 0.400961
aid 0.002250157 0.000980402 2.2951 0.022065 *
log.gdp 0.190834199 0.032396694 5.8905 0.000000006374 ***
polity2 0.016004181 0.002707903 5.9102 0.000000005696 ***
log.pop 0.190493863 0.070707709 2.6941 0.007253 **
mortality 0.008294374 0.001553846 5.3380 0.000000132901 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 8.8269
Residual Sum of Squares: 7.3822
R-Squared: 0.16367
Adj. R-Squared: 0.077009
F-statistic: 19.8307 on 6 and 608 DF, p-value: < 0.000000000000000222
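The claim that unit fixed effects are the same as including a dummy for every unit can be checked on simulated data. The following is a minimal base R sketch, not the plm implementation: the data, the unit effects and the true slope of 0.5 are all made up. It compares the slope from a dummy variable regression with the slope from a regression on unit-demeaned data (the within transformation).

```r
set.seed(1)
n_units <- 5
n_periods <- 8

# Simulated panel (hypothetical data): 5 units observed over 8 periods
d <- data.frame(
  unit = rep(seq_len(n_units), each = n_periods),
  x = rnorm(n_units * n_periods)
)
alpha <- c(-2, -1, 0, 1, 2)  # time-constant unit effects
d$y <- 0.5 * d$x + alpha[d$unit] + rnorm(nrow(d), sd = 0.3)

# (1) Dummy variable regression: one dummy per unit except the baseline
fit_lsdv <- lm(y ~ x + factor(unit), data = d)

# (2) Within transformation: subtract unit means, then OLS without intercept
d$y_dm <- d$y - ave(d$y, d$unit)
d$x_dm <- d$x - ave(d$x, d$unit)
fit_within <- lm(y_dm ~ x_dm - 1, data = d)

b_lsdv <- unname(coef(fit_lsdv)["x"])
b_within <- unname(coef(fit_within)["x_dm"])
print(c(lsdv = b_lsdv, within = b_within))
```

Both slopes agree up to numerical precision. The standard errors of the naive demeaned regression are not directly comparable, because lm() does not know that unit means were estimated.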
Similar to the F-test, we use the Lagrange Multiplier test to check whether country fixed effects explain any variation at all.
# check for unit(country) fixed effects
plmtest(m3, effect="individual")
Lagrange Multiplier Test - (Honda) for unbalanced panels
data: institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality
normal = 53.332, p-value < 0.00000000000000022
alternative hypothesis: significant effects
The null hypothesis is that country fixed effects do not have any effect and that would mean, statistically, that we could leave them out. However, in this case we reject the null hypothesis and hence we do need to control for country fixed effects.
10.1.3 Time fixed effects
We now estimate the time fixed effects model to illustrate how this would be done. However, we already know that we do need to include country fixed effects. Not estimating country fixed effects would be a mistake. The time fixed effects model does not include country fixed effects and, therefore, it makes that mistake. Generally, in the time fixed effects model, we control for all sources of confounding that vary over time but are constant across the units (the countries) such as technological change, for instance (you can argue whether technological change really affects all countries in our sample in the same way). The time fixed effects model includes a dummy variable for every time period except the baseline.
# time fixed effects model
m4 <- plm(
institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality,
data = a,
index = c("country", "year"),
model = "within",
effect = "time")
# model output time fixed effects
summary(m4)
Oneway (time) effect Within Model
Call:
plm(formula = institutions ~ oil + aid + log.gdp + polity2 +
log.pop + mortality, data = a, effect = "time", model = "within",
index = c("country", "year"))
Unbalanced Panel: n = 58, T = 1-12, N = 672
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-1.196568 -0.282023 0.028316 0.291527 0.865248
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
oil 0.00094474 0.00010632 8.8855 < 0.00000000000000022 ***
aid 0.01147113 0.00307715 3.7278 0.0002099 ***
log.gdp 0.45007149 0.01913597 23.5197 < 0.00000000000000022 ***
polity2 0.03248425 0.00280650 11.5746 < 0.00000000000000022 ***
log.pop 0.01333510 0.01052619 1.2668 0.2056601
mortality 0.00360009 0.00119458 3.0137 0.0026806 **

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 457.29
Residual Sum of Squares: 106.31
R-Squared: 0.76752
Adj. R-Squared: 0.76148
F-statistic: 359.866 on 6 and 654 DF, p-value: < 0.000000000000000222
Notice that adjusted R^2 is much larger in the time fixed effects model than in the country fixed effects model. That does not mean that the time fixed effects model is better. In fact, adjusted R^2 cannot be compared between country fixed effects and time fixed effects models. In the country fixed effects model, adjusted R^2 is the share of the variation in the dependent variable within countries that is explained by our independent variables. It is the explained within-country variation. In a time fixed effects model, adjusted R^2 gives us the explained within-time variation.
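To see what "explained within-unit variation" means, here is a small base R sketch on simulated data (all numbers are made up, and it uses the plain rather than the adjusted R^2). It computes R^2 once for the regression on unit-demeaned data and once for the dummy variable regression, which also counts the variation absorbed by the unit dummies as explained.

```r
set.seed(2)
n_units <- 20
n_periods <- 10
unit <- rep(seq_len(n_units), each = n_periods)

# Hypothetical data: large differences between units, weak effect of x
alpha <- rnorm(n_units, sd = 3)[unit]
x <- rnorm(length(unit))
y <- 0.3 * x + alpha + rnorm(length(unit))

# R^2 of the dummy variable regression (dummies count as explanatory)
r2_lsdv <- summary(lm(y ~ x + factor(unit)))$r.squared

# R^2 of the within regression (variation around unit means only)
y_dm <- y - ave(y, unit)
x_dm <- x - ave(x, unit)
r2_within <- summary(lm(y_dm ~ x_dm - 1))$r.squared

print(c(lsdv = r2_lsdv, within = r2_within))
```

Both regressions have the same residuals, but the within R^2 is measured against a smaller total sum of squares, so it can never exceed the dummy variable R^2. The same logic applies when demeaning by time period instead of by unit, which is why the two figures in the summaries above refer to different baselines.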
The time fixed effects model gives us different results than the country fixed effects model. We don’t like the time fixed effects model here because we already saw from plmtest() that we need to include country fixed effects. We can, however, check whether we need to include time fixed effects or, put differently, whether time fixed effects matter jointly. We do this using plmtest() again.
# test for time fixed effects
plmtest(m4, effect="time")
Lagrange Multiplier Test - time effects (Honda) for unbalanced panels
data: institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality
normal = 1.5508, p-value = 0.06048
alternative hypothesis: significant effects
The test comes back insignificant at the 0.05 level. That means, statistically speaking, we do not need to control for time fixed effects to have a consistent model. The test gives you justification to stick with the country fixed effects model. But we will ignore the test. In the country fixed effects model, we have 608 degrees of freedom (see the F-statistic line of the summary), so we can afford to estimate time fixed effects in addition. There are 12 time periods (indicated by the capital T in the summary output) and you can verify this like so:
# frequency table of year (i.e., number of observations per period)
table(a$year)
1996 1998 2000 2002 2003 2004 2005 2006 2007 2008 2009 2010
73 73 73 73 73 73 73 73 73 73 73 73
# number of time periods
length(table(a$year))
[1] 12
With 608 degrees of freedom, we can easily afford to estimate another 11 parameters (1 for each year where 1 year is the baseline category). Having 608 degrees of freedom is like having 608 free observations (that is a lot of information).
We do not make a mistake by controlling for potential confounders that vary across countries and are constant over time (unit fixed effects) and confounders that vary across time but are constant across units (time fixed effects). Therefore, we do that.
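The degrees of freedom can be counted by hand: observations used in estimation, minus the number of fixed effects, minus the number of slope coefficients. The sketch below takes N = 672, n = 58 and 6 regressors from the summary output and 12 periods from the frequency table. The variable names are ours, and the counting rule is the usual one for within models, not something taken from the plm source code.

```r
N <- 672         # observations used in estimation
n <- 58          # countries in the estimation sample
n_periods <- 12  # time periods
k <- 6           # slope coefficients

# Country fixed effects: one effect per country
df_country <- N - n - k

# Time fixed effects: one effect per period
df_time <- N - n_periods - k

# Two-way: countries plus 11 additional year dummies (one year is the baseline)
df_twoway <- N - n - (n_periods - 1) - k

print(c(country = df_country, time = df_time, twoway = df_twoway))
# country = 608, time = 654, twoway = 597
```

The first two numbers match the residual degrees of freedom on the F-statistic lines of the country and time fixed effects summaries.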
10.1.4 Two-way fixed effects
We now estimate the two-way fixed effects model. We control for all confounders that vary across units (countries) but are constant over time and we control for all confounders that vary over time but are constant across units.
# two-way fixed effects model
m5 <- plm(
institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality,
data = a,
index = c("country", "year"),
model = "within",
effect = "twoways"
)
summary(m5)
Twoways effects Within Model
Call:
plm(formula = institutions ~ oil + aid + log.gdp + polity2 +
log.pop + mortality, data = a, effect = "twoways", model = "within",
index = c("country", "year"))
Unbalanced Panel: n = 58, T = 1-12, N = 672
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-0.37357541 -0.06093757 0.00020216 0.05919668 0.45397954
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
oil 0.000013209 0.000096684 0.1366 0.891380
aid 0.002925254 0.000984319 2.9719 0.003079 **
log.gdp 0.298727506 0.038019594 7.8572 0.00000000000001837 ***
polity2 0.016062925 0.002665367 6.0265 0.00000000293299449 ***
log.pop 0.016589819 0.080469078 0.2062 0.836733
mortality 0.004167650 0.001725965 2.4147 0.016049 *

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 8.3506
Residual Sum of Squares: 6.967
R-Squared: 0.16568
Adj. R-Squared: 0.062262
F-statistic: 19.7587 on 6 and 597 DF, p-value: < 0.000000000000000222
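For intuition on what the two-way model does to the data, here is a base R sketch on a simulated balanced panel (data and true slope of 1.5 are made up). In a balanced panel, subtracting unit means and period means and adding back the grand mean gives the same slope as a regression with both sets of dummies. Our panel above is unbalanced, where this simple double-demeaning shortcut no longer holds exactly, so treat this as an illustration of the idea and not of what plm does internally.

```r
set.seed(3)
n_units <- 6
n_periods <- 5

# Balanced simulated panel (hypothetical data)
d <- expand.grid(time = seq_len(n_periods), unit = seq_len(n_units))
unit_eff <- rnorm(n_units)[d$unit]
time_eff <- rnorm(n_periods)[d$time]
d$x <- rnorm(nrow(d)) + unit_eff  # regressor correlated with the unit effect
d$y <- 1.5 * d$x + unit_eff + time_eff + rnorm(nrow(d), sd = 0.5)

# Two-way within transformation for a balanced panel
twoway_demean <- function(v, unit, time) {
  v - ave(v, unit) - ave(v, time) + mean(v)
}

# (1) Regression with unit and time dummies
fit_lsdv <- lm(y ~ x + factor(unit) + factor(time), data = d)

# (2) Regression on double-demeaned data
y_dd <- twoway_demean(d$y, d$unit, d$time)
x_dd <- twoway_demean(d$x, d$unit, d$time)
fit_dd <- lm(y_dd ~ x_dd - 1)

b_lsdv <- unname(coef(fit_lsdv)["x"])
b_dd <- unname(coef(fit_dd)["x_dd"])
print(c(lsdv = b_lsdv, double_demeaned = b_dd))
```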
10.1.5 Serial correlation/autocorrelation
In a panel model, we always have serial correlation. Maybe always is an overstatement but just maybe. Serial correlation means that a variable at time t (let’s say 2000) and in country i (let’s say Greece) is related to its value at t-1 (in 1999). Anything that is path dependent would fall into this category. Surely, institutional quality is path dependent. There is a statistical test for autocorrelation but really your default assumption should be that autocorrelation is present.
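A short simulation shows what serial correlation looks like. This is a sketch with made-up parameters: we generate a path-dependent series in which each value is 0.8 times the previous value plus a fresh shock, and compare its correlation with its own lag to that of the shocks themselves.

```r
set.seed(4)
n_periods <- 500
rho <- 0.8  # persistence (an assumed value, for illustration)

# Fresh shocks: no relation between t and t-1
shock <- rnorm(n_periods)

# Path-dependent series: today's value carries over 80% of yesterday's value
e <- numeric(n_periods)
e[1] <- shock[1]
for (t in 2:n_periods) {
  e[t] <- rho * e[t - 1] + shock[t]
}

# Correlation between the value at t and the value at t-1
lag1_cor <- cor(e[-1], e[-n_periods])
white_cor <- cor(shock[-1], shock[-n_periods])
print(c(path_dependent = lag1_cor, shocks = white_cor))
```

The path-dependent series is strongly correlated with its own past, while the shocks are not. If the errors of a panel model behave like the first series, observations from the same unit contain less independent information than their number suggests.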
Let’s carry out the test. The null hypothesis is that we do not have autocorrelation.
# Breusch-Godfrey test
pbgtest(m5)
Breusch-Godfrey/Wooldridge test for serial correlation in panel models
data: institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality
chisq = 229.21, df = 1, p-value < 0.00000000000000022
alternative hypothesis: serial correlation in idiosyncratic errors
Clearly, we do have autocorrelation, so we need to correct our standard errors. We need two libraries for this: first sandwich and second lmtest.
library(sandwich)
library(lmtest)
# heteroskedasticity and autocorrelation consistent standard errors
m5.hac <- coeftest(m5, vcov = vcovHC(m5, method = "arellano", type = "HC3"))
m5.hac
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
oil 0.000013209 0.000148966 0.0887 0.929375
aid 0.002925254 0.001684554 1.7365 0.082989 .
log.gdp 0.298727506 0.132230608 2.2591 0.024234 *
polity2 0.016062925 0.006201084 2.5903 0.009822 **
log.pop 0.016589819 0.307818867 0.0539 0.957037
mortality 0.004167650 0.005529193 0.7538 0.451294

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The difference is noticeable. It is a mistake not to correct for serial correlation. With the corrected standard errors, we now fail to reject the null hypothesis for the effects of aid and mortality at the 0.05 level.
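Why do the standard errors grow? The sketch below simulates a panel in which both the regressor and the error are serially correlated within units, and computes a cluster-robust sandwich variance by hand in base R. It is a simplified stand-in: a pooled regression on made-up data, with no small-sample adjustment (closer to "HC0" than to the "HC3" used above), so it illustrates the idea behind clustering by unit and does not reproduce the plm computation.

```r
set.seed(5)
n_units <- 100
n_periods <- 10
unit <- rep(seq_len(n_units), each = n_periods)

# AR(1) series of length n with persistence rho
ar1 <- function(n, rho) {
  as.numeric(stats::filter(rnorm(n), rho, method = "recursive"))
}

# Regressor and error are both persistent within each unit (hypothetical data)
x <- unlist(lapply(seq_len(n_units), function(i) ar1(n_periods, 0.8)))
u <- unlist(lapply(seq_len(n_units), function(i) ar1(n_periods, 0.8)))
y <- 1 + 0.5 * x + u

fit <- lm(y ~ x)
X <- model.matrix(fit)
res <- residuals(fit)

# Sandwich: bread = (X'X)^-1, meat = sum over units of (X_g' u_g)(X_g' u_g)'
bread <- solve(crossprod(X))
scores <- rowsum(X * res, group = unit)  # one row per unit
meat <- crossprod(scores)

se_naive <- sqrt(diag(vcov(fit)))
se_cluster <- sqrt(diag(bread %*% meat %*% bread))

# Ratio of clustered to conventional standard error for the slope
ratio <- unname(se_cluster[2] / se_naive[2])
print(ratio)
```

The conventional standard error treats all 1000 rows as independent. Allowing the errors of a unit to be correlated over time gives a noticeably larger standard error for the slope.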
10.1.6 Cross-sectional dependence/spatial dependence
Spatial dependence is common in panel data sets but unlike serial correlation, it is not always present. Spatial correlation means that some units that cluster together (usually geographically) are affected by some external shock in the same way. For instance, the Arab Spring affected countries in the MENA region in the same way.
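To make the idea concrete, the following base R sketch simulates residuals for 20 units over 30 periods, once independent and once with a common shock that hits all units in the same period. It then computes the Pesaran CD statistic in its textbook form for balanced panels, sqrt(2T / (N(N-1))) times the sum of all pairwise correlations between units. The data are made up and the function name pesaran_cd is ours; unbalanced panels need a more careful treatment.

```r
set.seed(6)
n_units <- 20
n_periods <- 30

# CD statistic for a balanced panel; res is a (periods x units) matrix
pesaran_cd <- function(res) {
  n_t <- nrow(res)
  n_u <- ncol(res)
  rho <- cor(res)  # pairwise correlations between units
  sqrt(2 * n_t / (n_u * (n_u - 1))) * sum(rho[upper.tri(rho)])
}

# Independent residuals (hypothetical data)
independent <- matrix(rnorm(n_periods * n_units), n_periods, n_units)

# Add a common shock: the same value is added to every unit in a given period
common_shock <- rnorm(n_periods)
dependent <- independent + common_shock

cd_independent <- pesaran_cd(independent)
cd_dependent <- pesaran_cd(dependent)
print(c(independent = cd_independent, dependent = cd_dependent))
```

Under the null hypothesis of no cross-sectional dependence the statistic is approximately standard normal, so the value for the data with the common shock is far out in the tail.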
We test for cross-sectional dependence. If it exists, we need to correct for it. The null hypothesis is that we do not have spatial dependence.
# Pesaran test for cross-sectional dependence
pcdtest(m5)
Warning in pcdres(tres = tres, n = n, w = w, form =
paste(deparse(x$formula)), : Some pairs of individuals (7 percent) do
not have any or just one time period in common and have been omitted from
calculation
Pesaran CD test for cross-sectional dependence in panels
data: institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality
z = 2.2516, p-value = 0.02435
alternative hypothesis: cross-sectional dependence
The test comes back significant. Therefore, we need to adjust our standard errors for serial correlation, heteroskedasticity and spatial dependency.
Some political scientists like to estimate the so-called panel corrected standard errors (PCSE). In fact, Beck and Katz (1995) is one of the most cited political science papers of all time. However, Driscoll and Kraay (1998) propose standard errors that work even better in short panels (where we have few observations per unit). Their standard errors are sometimes called the SCC estimator. We correct for spatial correlation using SCC standard errors.
# Driscoll and Kraay SCC standard errors
m5.scc <- coeftest(m5, vcov = vcovSCC(m5, type = "HC3", cluster = "group"))
m5.scc
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
oil 0.000013209 0.000143564 0.0920 0.926725
aid 0.002925254 0.001628816 1.7959 0.073010 .
log.gdp 0.298727506 0.133686296 2.2345 0.025817 *
polity2 0.016062925 0.005974345 2.6887 0.007374 **
log.pop 0.016589819 0.314543279 0.0527 0.957955
mortality 0.004167650 0.004991541 0.8349 0.404084

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This is our final model. We find no evidence for hypotheses 1 and 2. Both oil and aid are unrelated to institutional quality (note that this is different from what you saw in the lecture: I had an error in the code. This version is correct).
10.1.7 The random effects model
We show you the random effects model only because you see it applied often in political science. However, the model rests on a heroic assumption. Recall from our lecture that the random effects model assumes that the time-invariant confounders are unrelated to our regressors. The assumption says: “There are no confounders. By assumption. Basta!” That’s unsatisfactory. In fact, this assumption will almost always be violated. The random effects model is weak from a causal inference standpoint. However, it tends to do well in prediction tasks where we are interested in predicting outcomes but don’t really care whether X is causally related to Y.
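The following base R sketch illustrates what happens when that assumption fails. We simulate a time-invariant confounder that is correlated with the regressor (all numbers are made up, the true slope is 1) and compare pooled OLS, which ignores the confounder entirely, with the within estimator. Pooled OLS serves as the extreme case here: the random effects estimator combines within and between variation, so it inherits part of the same bias.

```r
set.seed(7)
n_units <- 200
n_periods <- 5
unit <- rep(seq_len(n_units), each = n_periods)

# Time-invariant confounder, correlated with the regressor (hypothetical data)
alpha <- rnorm(n_units)[unit]
x <- 0.8 * alpha + rnorm(length(unit))
y <- 1 * x + alpha + rnorm(length(unit))  # true slope is 1

# Pooled OLS: the confounder ends up in the error term
b_pooled <- unname(coef(lm(y ~ x))["x"])

# Within estimator: demeaning removes everything that is constant within units
x_dm <- x - ave(x, unit)
y_dm <- y - ave(y, unit)
b_within <- unname(coef(lm(y_dm ~ x_dm - 1))["x_dm"])

print(c(pooled = b_pooled, within = b_within))
```

The within estimate is close to the true slope, while the pooled estimate is pushed upwards because x picks up the effect of the confounder.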
Let’s estimate the random effects model.
# random effects model
ran.effects <- plm(
institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality,
data = a,
index = c("country", "year"),
model = "random")
# model output
summary(ran.effects)
Oneway (individual) effect Random Effect Model
(Swamy-Arora's transformation)
Call:
plm(formula = institutions ~ oil + aid + log.gdp + polity2 +
log.pop + mortality, data = a, model = "random", index = c("country",
"year"))
Unbalanced Panel: n = 58, T = 1-12, N = 672
Effects:
var std.dev share
idiosyncratic 0.01214 0.11019 0.071
individual 0.15870 0.39837 0.929
theta:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.7334 0.9204 0.9204 0.9194 0.9204 0.9204
Residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-0.42270 -0.06989 0.00032 0.00070 0.08034 0.37430
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 1.33884120 0.62959170 2.1265 0.033827 *
oil 0.00021206 0.00009369 2.2634 0.023933 *
aid 0.00206725 0.00103411 1.9991 0.046009 *
log.gdp 0.31213762 0.02902202 10.7552 < 0.00000000000000022 ***
polity2 0.01942826 0.00273234 7.1105 0.000000000002993849 ***
log.pop 0.09216364 0.03199441 2.8806 0.004097 **
mortality 0.01026414 0.00129407 7.9317 0.000000000000009172 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 11.862
Residual Sum of Squares: 9.0511
R-Squared: 0.23701
Adj. R-Squared: 0.23013
F-statistic: 34.4236 on 6 and 665 DF, p-value: < 0.000000000000000222
As mentioned, you will have an extremely hard time convincing anyone of a causal claim made based on a random effects model. However, sometimes you cannot estimate a fixed effects model. For instance, if you wish to estimate the effect of the electoral system on some outcome, you have the problem that the electoral system does not vary within countries (countries tend to choose an electoral system and stick with it). That means you cannot estimate a unit fixed effects model. You can, however, estimate the random effects model in that case.
The absolute minimum hurdle that you need to pass to be allowed to use the random effects model is to carry out the Hausman test. The test assesses whether the errors are correlated with the X variables. It thus tests the assumption that the random effects model is based on.
However, we have to caution against the Hausman test! The Hausman test does not take heteroskedastic errors into account and it does not take serial correlation into account. That’s a big problem. Even if the Hausman test confirms that the random effects model is consistent, it may be wrong. We should always be skeptical of the random effects model (when it’s used to make a causal claim).
Let’s run the Hausman test. Its null hypothesis is that the errors and the X’s are uncorrelated and hence the random effects model is consistent.
# hausman test
phtest(m5, ran.effects)
Hausman Test
data: institutions ~ oil + aid + log.gdp + polity2 + log.pop + mortality
chisq = 136.39, df = 6, p-value < 0.00000000000000022
alternative hypothesis: one model is inconsistent
The Hausman test rejects the null hypothesis. The random effects model is inconsistent. You now have all the tools to carry out your own analysis. Go ahead and show us whether more guns lead to less crime or not.
10.1.8 More guns, less crime
More guns, less crime. This is the claim of an (in)famous book. It argues that violent crime rates in the United States decrease when gun ownership restrictions are relaxed. The data used in Lott’s research compares violent crimes, robberies, and murders across 50 states to determine whether the so-called “shall” laws that remove discretion from license-granting authorities actually decrease crime rates. So far, 41 states have passed these “shall” laws, under which a person applying for a licence to carry a concealed weapon doesn’t have to provide justification or “good cause” for requiring a concealed weapon permit.
Load the guns.csv dataset directly into R by running the following line:
a <- read.csv("http://philippbroniecki.github.io/philippbroniecki.github.io/assets/data/guns.csv")
The data includes the following variables:
| Variable | Description |
| --- | --- |
| mur | Murder rate (incidents per 100,000) |
| shall | = 1 if state has a shall-carry law in effect in that year, 0 otherwise |
| incarc_rate | Incarceration rate in the state in the previous year (sentenced prisoners per 100,000 residents) |
| pm1029 | Percent of state population that is male, ages 10 to 29 |
| stateid | ID number of states (Alabama = 1, Alaska = 2, etc.) |
| year | Year (1977-1999) |
10.1.9 Question 1
Estimate the effect of shall using a simple linear model and interpret it.
summary(lm(mur~shall+incarc_rate+pm1029,data=a))
Call:
lm(formula = mur ~ shall + incarc_rate + pm1029, data = a)
Residuals:
Min 1Q Median 3Q Max
-18.020 -2.486 0.161 2.123 40.141
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 26.676300 1.527265 17.467 < 0.0000000000000002 ***
shall -1.964093 0.316082 -6.214 0.000000000718 ***
incarc_rate 0.037136 0.000814 45.624 < 0.0000000000000002 ***
pm1029 1.641943 0.087414 18.784 < 0.0000000000000002 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.441 on 1169 degrees of freedom
Multiple R-squared: 0.6524, Adjusted R-squared: 0.6515
F-statistic: 731.4 on 3 and 1169 DF, p-value: < 0.00000000000000022
Answer: According to our simple linear model, lax gun laws reduce the murder rate. It decreases by roughly 2 incidents per 100,000.
10.1.10 Question 2
Estimate a unit fixed effects model and a random effects model. Are both models consistent? If not, which is the appropriate model? Use a consistent model to estimate the effect of the shall laws on the murder rate.
# panel data library
library(plm)
# fixed effects
m.fe <- plm(mur ~ shall + incarc_rate + pm1029,
data = a,
index = c("stateid", "year"),
model = "within",
effect = "individual")
# random effects
m.re <- plm(mur ~ shall + incarc_rate + pm1029,
data = a,
index = c("stateid", "year"),
model = "random")
# hausman test
phtest(m.fe, m.re)
Hausman Test
data: mur ~ shall + incarc_rate + pm1029
chisq = 147.59, df = 3, p-value < 0.00000000000000022
alternative hypothesis: one model is inconsistent
# effect
summary(m.fe)
Oneway (individual) effect Within Model
Call:
plm(formula = mur ~ shall + incarc_rate + pm1029, data = a, effect = "individual",
model = "within", index = c("stateid", "year"))
Balanced Panel: n = 51, T = 23, N = 1173
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-21.102428 -0.958945 0.016047 1.082008 29.031961
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
shall -1.4513886 0.3154300 -4.6013 0.000004678 ***
incarc_rate 0.0174551 0.0011261 15.4998 < 0.00000000000000022 ***
pm1029 0.9582993 0.0859610 11.1481 < 0.00000000000000022 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 12016
Residual Sum of Squares: 9800
R-Squared: 0.18444
Adj. R-Squared: 0.14581
F-statistic: 84.3526 on 3 and 1119 DF, p-value: < 0.000000000000000222
Answer: We reject the null hypothesis of the Hausman test, which states that both the random effects model and the fixed effects model are consistent. The unit-specific errors u_i are correlated with the regressors. Therefore, we must rely on the fixed effects model.
The effect of the shall laws has decreased slightly but is still significantly related to the murder rate. Lax gun laws reduce the murder rate by 1.45 incidents per 100,000.
10.1.11 Question 3
Think of a theoretical reason to control for time fixed effects (what confounding sources could bias our estimate of the shall laws?). Test for time fixed effects using the appropriate test. If time fixed effects are required, reestimate the fixed effects model as a twoway fixed effects model and interpret the effect of lax gun laws.
m.tfe <- plm(
mur ~ shall + incarc_rate + pm1029,
data = a,
index = c("stateid", "year"),
model = "within",
effect = "time"
)
plmtest(m.tfe, effect = "time")
Lagrange Multiplier Test - time effects (Honda) for balanced panels
data: mur ~ shall + incarc_rate + pm1029
normal = 16.104, p-value < 0.00000000000000022
alternative hypothesis: significant effects
# two-way FE model
m.2wfe <- plm(
mur ~ shall + incarc_rate + pm1029,
data = a,
index = c("stateid", "year"),
model = "within",
effect = "twoway")
summary(m.2wfe)
Twoways effects Within Model
Call:
plm(formula = mur ~ shall + incarc_rate + pm1029, data = a, effect = "twoway",
model = "within", index = c("stateid", "year"))
Balanced Panel: n = 51, T = 23, N = 1173
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-19.2097691 -0.9748749 0.0069663 1.0119176 27.1354552
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
shall 0.5640474 0.3325054 1.6964 0.0901023 .
incarc_rate 0.0209756 0.0011252 18.6411 < 0.00000000000000022 ***
pm1029 0.7326357 0.2189770 3.3457 0.0008485 ***

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 11263
Residual Sum of Squares: 8519.4
R-Squared: 0.24357
Adj. R-Squared: 0.19186
F-statistic: 117.746 on 3 and 1097 DF, p-value: < 0.000000000000000222
Answer: In the 90s, crime rates in inner cities dropped across many Western countries. This trend will have affected U.S. states in a relatively similar way. This source of confounding will be correlated with the murder rate. Such a strong theoretical foundation for confounding should be controlled for using time fixed effects independent of the test for time fixed effects.
We reject the null hypothesis that time fixed effects are insignificant (make no difference). We, therefore, control for time fixed effects to reduce omitted variable bias from sources that vary over time but are constant across states.
The effect of the shall laws is indistinguishable from zero (at the 0.05 alpha level). We conclude that the shall laws do not increase or decrease the murder rate.
10.1.12 Question 4
Correct the standard errors to account for heteroskedasticity and serial correlation. Does the conclusion regarding the effect of the shall laws change?
m.2wfe.hac <- coeftest(m.2wfe, vcov = vcovBK(m.2wfe, type = "HC3", cluster = "group"))
m.2wfe.hac
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
shall 0.5640474 0.7662556 0.7361 0.4618
incarc_rate 0.0209756 0.0028249 7.4253 0.0000000000002254 ***
pm1029 0.7326357 0.5118496 1.4313 0.1526

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: The standard error more than doubled. Our substantive conclusion does not change: The shall laws have no effect on the murder rate in our sample.
10.1.13 Question 5
Test for cross-sectional dependence and, if present, use the SCC estimator to correct for heteroskedasticity, serial correlation, and spatial dependence. Does our conclusion regarding the effect of the shall laws change?
# test for cross-sectional dependence
pcdtest(m.2wfe)
Pesaran CD test for cross-sectional dependence in panels
data: mur ~ shall + incarc_rate + pm1029
z = 3.9121, p-value = 0.00009148
alternative hypothesis: cross-sectional dependence
# correct standard errors
m.2wfe.scc <- coeftest(m.2wfe, vcov = vcovSCC(m.2wfe, type = "HC3", cluster = "group"))
m.2wfe.scc
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
shall 0.564047 0.542698 1.0393 0.29888
incarc_rate 0.020976 0.010321 2.0324 0.04236 *
pm1029 0.732636 0.551066 1.3295 0.18396

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Answer: The effect of the shall laws remains insignificant. The standard error decreased slightly.
Overall, we find no evidence for the claim made in the book. Guns do not appear to decrease the number of violent crimes. There is also no evidence for the opposite effect.