This vignette takes a more detailed look at the matched.set object, the core object within the package that captures all of the information associated with treated units and their matched control units.
First, we will create a smaller subset of the dem data set, which is included in the package. This is just to make our results easier to read. We use the DisplayTreatment function to get a sense of treatment variation within the subset of data.
library(PanelMatch)
uid <- unique(dem$wbcode2)[1:10]
subdem <- dem[dem$wbcode2 %in% uid, ]
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem)
We can use the PanelMatch function with refinement.method set to "none" to obtain a PanelMatch object, from which we will extract a matched.set object. PanelMatch returns an S3 object of the PanelMatch class. These objects are just lists with some additional attributes. Here, we will focus on one element contained within PanelMatch objects: matched.set objects. Within the PanelMatch object, this element is always named either att or atc. When qoi = "ate", the resulting PanelMatch object contains two matched.set objects, named att and atc, respectively.
In implementation, the matched.set is just a named list with some added attributes (lag, names of treatment, unit, and time variables) and a structured naming scheme. Each entry in the list is a vector containing the unit ids of the control units in a matched set. Each entry also corresponds to a unit id/time pair (the unit id of a treated unit and the time at which treatment occurred). This is reflected in the names of the list elements, which follow the scheme [id variable].[time variable].
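To make the structure concrete, here is a minimal base R sketch of a list shaped like the one described above. It is a mock built by hand, not the package's constructor: the object name mock.sets is hypothetical, and the attribute names (lag, id.var, t.var, treatment.var) are taken from the verbose print output shown later in this vignette.

```r
# Mock of the structure described above, built with base R only.
# Illustration only: PanelMatch creates real matched.set objects itself.
mock.sets <- list(
  "4.1992" = c("3", "13"),
  "4.1997" = "7",
  "7.1998" = character(0)  # a treated observation with no matched controls
)
attr(mock.sets, "lag") <- 4
attr(mock.sets, "id.var") <- "wbcode2"
attr(mock.sets, "t.var") <- "year"
attr(mock.sets, "treatment.var") <- "dem"

# Recover the treated unit id and the treatment time from each name,
# relying on the [id variable].[time variable] naming scheme.
parts <- strsplit(names(mock.sets), ".", fixed = TRUE)
treated <- data.frame(
  id = vapply(parts, `[`, character(1), 1),
  time = as.integer(vapply(parts, `[`, character(1), 2)),
  stringsAsFactors = FALSE
)
names(treated) <- c(attr(mock.sets, "id.var"), attr(mock.sets, "t.var"))
print(treated)
```

Splitting on the first "." assumes that unit ids themselves contain no periods, which holds for the integer ids used here.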
Matched set objects are implemented as lists, but the default printing behavior resembles that of a data frame. One can toggle a verbose option on the print method to print as a list and also display a less summarized version of the matched set data.
## [1] "4.1992" "4.1997" "6.1973" "6.1983" "7.1991" "12.1992" "13.2003"
## [8] "7.1998"
#data frame printing view: useful as a summary view with large data sets
# first column is unit id variable, second is time variable, and
# third is the number of controls in that matched set
print(msets)
## wbcode2 year matched.set.size
## 1 4 1992 2
## 2 4 1997 1
## 3 6 1973 1
## 4 6 1983 2
## 5 7 1991 4
## 6 12 1992 2
## 7 13 2003 2
## 8 7 1998 0
## $`4.1992`
## [1] "3" "13"
## attr(,"weights")
## 3 13
## 0.5 0.5
##
## $`4.1997`
## [1] "7"
## attr(,"weights")
## 7
## 1
##
## $`6.1973`
## [1] "13"
## attr(,"weights")
## 13
## 1
##
## $`6.1983`
## [1] "4" "13"
## attr(,"weights")
## 4 13
## 0.5 0.5
##
## $`7.1991`
## [1] "3" "4" "12" "13"
## attr(,"weights")
## 3 4 12 13
## 0.25 0.25 0.25 0.25
##
## $`12.1992`
## [1] "3" "13"
## attr(,"weights")
## 3 13
## 0.5 0.5
##
## $`13.2003`
## [1] "3" "12"
## attr(,"weights")
## 3 12
## 0.5 0.5
##
## $`7.1998`
## character(0)
##
## attr(,"lag")
## [1] 4
## attr(,"t.var")
## [1] "year"
## attr(,"id.var")
## [1] "wbcode2"
## attr(,"treatment.var")
## [1] "dem"
## attr(,"refinement.method")
## [1] "none"
## attr(,"match.missing")
## [1] TRUE
Note that in the verbose print view above, each control unit has an associated weight. These are the weights assigned by the refinement process. In this example, no refinement has been applied, so each control unit in a matched set has equal weight. See the Using PanelMatch vignette for more about refinement.
Weights are attributes of each element in the matched.set list. As such, they can also be extracted as follows:
## 3 13
## 0.5 0.5
Note that this returns a vector of weights. The name of each element in the vector corresponds to the control unit with which that weight is associated. Weights are normalized and should always sum to 1 within each matched set.
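As a rough illustration of how such a weights attribute behaves, the sketch below builds a single mock matched set by hand and reads its weights back with attr(). The object names are hypothetical, and the equal weights simply mirror the no-refinement case described above. Applying attr(x, "weights") to an element of a real matched.set should behave the same way, since the verbose output shows the weights stored under that attribute name.

```r
# Mock of one matched set: control unit ids carrying a "weights" attribute.
# Illustration only; PanelMatch computes and attaches real weights itself.
controls <- c("3", "13")

# With no refinement, every control in the set gets the same weight.
equal.weights <- rep(1 / length(controls), length(controls))
names(equal.weights) <- controls
attr(controls, "weights") <- equal.weights

# Extract the named weight vector from the attribute.
w <- attr(controls, "weights")
print(w)

# Weights within a matched set should sum to 1.
stopifnot(isTRUE(all.equal(sum(w), 1)))
```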
The ‘[’ and ‘[[’ operators are implemented and should work intuitively. Using ‘[’ returns a subsetted matched.set object (list). The custom operator also copies the additional attributes to the result. Note how, by default, it prints like the full form of the matched.set. Using ‘[[’ returns the unit ids of the control units in the specified matched set.
Since matched.set objects are just lists with attributes, you can expect the [ and [[ functions to work much as they would with a list. For instance, users can extract information about matched sets using numerical indices or by taking advantage of the naming scheme.
## wbcode2 year matched.set.size
## 1 4 1992 2
## [1] "3" "13"
## attr(,"weights")
## 3 13
## 0.5 0.5
## wbcode2 year matched.set.size
## 1 4 1992 2
## [1] "3" "13"
## attr(,"weights")
## 3 13
## 0.5 0.5
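The attribute-preserving behavior of ‘[’ can be sketched in base R. The class name mock.set and the method below are hypothetical stand-ins written for this illustration. They are not the package's implementation, which may differ in detail. The point is only the contrast with a plain list, where ‘[’ keeps names but drops other attributes.

```r
# Mock class (hypothetical name "mock.set"), base R only.
mock <- structure(
  list("4.1992" = c("3", "13"), "4.1997" = "7"),
  lag = 4, id.var = "wbcode2", t.var = "year",
  class = "mock.set"
)

# A `[` method that subsets the list and re-attaches the extra attributes.
`[.mock.set` <- function(x, i) {
  extra <- attributes(x)
  extra$names <- NULL                  # names must come from the subset
  out <- unclass(x)[i]                 # plain list subsetting
  attributes(out) <- c(list(names = names(out)), extra)
  out
}

by.index <- mock[1]          # subset by position
by.name <- mock["4.1992"]    # subset by [id variable].[time variable] name

print(attr(by.index, "lag"))           # attribute survives the subset
print(attr(unclass(mock)[1], "lag"))   # plain list subsetting drops it
print(mock[["4.1992"]])                # `[[` returns the control unit ids
```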
Calling plot on a matched.set object will display a histogram of the sizes of the matched sets. By default, the number of empty matched sets (treated unit/time id pairs with no suitable controls for a match) is noted with a vertical bar at x = 0. One can include empty sets in the histogram by setting the include.empty.sets argument to TRUE.
# Use full data
PM.results.full <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2",
                              treatment = "dem", refinement.method = "none",
                              data = dem, match.missing = TRUE,
                              qoi = "att", outcome.var = "y",
                              lead = 0, forbid.treatment.reversal = FALSE)
# Extract the matched.set object and plot the distribution of set sizes
plot(PM.results.full$att)
The summary function provides a variety of information about the sizes of matched sets, the unit and time ids of treated units, the number of empty sets, and the lag window size. The summary function also has an option to print only the overview data frame. Toggle this by setting verbose = FALSE.
## $overview
## wbcode2 year matched.set.size
## 1 4 1992 2
## 2 4 1997 1
## 3 6 1973 1
## 4 6 1983 2
## 5 7 1991 4
## 6 12 1992 2
## 7 13 2003 2
## 8 7 1998 0
##
## $set.size.summary
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 1.00 2.00 1.75 2.00 4.00
##
## $number.of.treated.units
## [1] 8
##
## $num.units.empty.set
## [1] 1
##
## $lag
## [1] 4
## wbcode2 year matched.set.size
## 1 4 1992 2
## 2 4 1997 1
## 3 6 1973 1
## 4 6 1983 2
## 5 7 1991 4
## 6 12 1992 2
## 7 13 2003 2
## 8 7 1998 0
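The quantities in the summary output can be reproduced by hand from the matched set sizes, which may help clarify what each element reports. The sketch below uses only base R and the sizes from the overview table above. The variable names are hypothetical, and this is not how the package's summary method is implemented.

```r
# Matched set sizes taken from the overview table above,
# named with the [id variable].[time variable] scheme.
set.sizes <- c("4.1992" = 2, "4.1997" = 1, "6.1973" = 1, "6.1983" = 2,
               "7.1991" = 4, "12.1992" = 2, "13.2003" = 2, "7.1998" = 0)

size.summary <- summary(set.sizes)  # min, quartiles, median, mean, max
n.treated <- length(set.sizes)      # treated unit/time observations
n.empty <- sum(set.sizes == 0)      # matched sets with no controls

print(size.summary)
print(c(treated = n.treated, empty = n.empty))
```

The mean of 1.75 and the single empty set agree with the set.size.summary and num.units.empty.set elements shown in the output.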
DisplayTreatment function
Passing a matched set (one treated unit and its corresponding set of controls) to the DisplayTreatment function will visually highlight the lag window histories used to create that matched set. There is also an option to display only units from the matched set (and the treated unit), which can be achieved by setting show.set.only to TRUE.
#pass matched.set object using the `[` operator
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem, matched.set = msets[1])
## Warning: Vectorized input to `element_text()` is not officially supported.
## ℹ Results may be unexpected or may change in future versions of ggplot2.
# only show matched set units
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem',
data = subdem, matched.set = msets[1],
show.set.only = TRUE, y.size = 15, x.size = 13)
## Warning: Vectorized input to `element_text()` is not officially supported.
## ℹ Results may be unexpected or may change in future versions of ggplot2.