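Definition of PSM

In medicine, economics, finance, and related fields, once a public policy has been implemented we usually want to evaluate its impact, that is, its role and effect, so as to better guide implementation and inform public decision-making. Examples include studying how a worker's income is affected by receiving higher education or skills training, or how a firm's performance is affected by introducing an incentive scheme. Typically we compare a "treatment group" of units subject to the policy with a "control group" in order to estimate the policy's treatment effect. In the social sciences, however, randomized experiments are hard to set up; we rely mostly on observation and quasi-experiments, so our data usually come from non-random observation. Because of selection bias and the counterfactual problem (the outcome a treated unit would have had without treatment is never observed), a direct comparison can give a biased estimate of the policy effect.

1. What is selection bias? The treatment and control groups do not start from the same initial conditions, so a selection bias problem arises. We only observe how unit A behaves after experiencing some event, and we compare this with units B that did not experience it. This is not a sound comparison, because A and B do not share the same baseline.

2. What is the counterfactual framework? Rubin proposed the counterfactual framework in 1974. The World Bank describes the problem as follows: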

The main challenge of an impact evaluation is to determine what would have happened to the beneficiaries if the program had not existed. That is, one has to determine the per capita household income of beneficiaries in the absence of the intervention. A beneficiary's outcome in the absence of the intervention would be its counterfactual. (World Bank, p. 22)
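The two problems above can be tied together in one line. The following is a sketch in LaTeX, using notation the text only introduces later (D is the treatment indicator, y_1 and y_0 are the potential outcomes with and without treatment). It shows that the naive difference in observed group means equals the effect on the treated plus a selection bias term, and that this term depends on the counterfactual E[y_0 | D = 1], which is never observed.

```latex
\begin{align*}
E[y \mid D=1] - E[y \mid D=0]
  &= E[y_1 \mid D=1] - E[y_0 \mid D=0] \\
  &= \underbrace{E[y_1 - y_0 \mid D=1]}_{\text{effect on the treated}}
   + \underbrace{E[y_0 \mid D=1] - E[y_0 \mid D=0]}_{\text{selection bias}}
\end{align*}
```

The selection bias term vanishes only if treated and untreated units would have had the same average outcome without treatment, which is what randomization delivers and what matching tries to approximate.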

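How can selection bias and the counterfactual problem be addressed? Take myself as an example: suppose I want to know how attending graduate school affects my income. I have already attended graduate school, so how can I estimate the effect of attending versus not attending? This brings us to today's topic: propensity score matching (PSM). The method uses the propensity score function to compress the information in a multidimensional covariate vector into a single dimension and then matches on that score. Conditional on the observed covariates, this makes treated and control units as similar as possible and so mitigates selection bias in the estimated treatment effect. Concretely, the method estimates each person's probability of attending graduate school, then picks from the people who did not attend (a subset of the full sample) someone whose estimated probability is very close to mine, say my classmate Li Hua (the familiar character from exam essay prompts), to serve as my comparison, and then looks at the difference between us. Once every graduate student "me" in the sample has been matched to a non-graduate "Li Hua", the two groups can be compared on a fair footing.

How PSM works

Each individual i has two potential outcomes depending on whether it is treated: if treated (Di = 1), the outcome is yi = y1i; if untreated (Di = 0), the outcome is yi = y0i. Given the observed covariates xi, the conditional probability that individual i enters the treatment group, the propensity score, is p(xi) = Pr(Di = 1 | x = xi) = E(Di | xi). The average treatment effect is then ATE = E(y1i − y0i), and the average treatment effect on the treated is ATT = E(y1i − y0i | Di = 1).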
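Why matching on the one-dimensional score is enough can be sketched as follows. This is the Rosenbaum and Rubin (1983) result, stated here without proof and assuming that conditional independence and overlap hold: if treatment is independent of the potential outcomes given x, it is also independent given p(x), so the ATT can be written using only observed outcomes of treated and untreated units with the same score.

```latex
\begin{align*}
(y_{0i}, y_{1i}) \perp D_i \mid x_i
  \;&\Longrightarrow\;
(y_{0i}, y_{1i}) \perp D_i \mid p(x_i), \\
\text{ATT}
  &= E\big[\, E(y_i \mid D_i = 1,\, p(x_i)) - E(y_i \mid D_i = 0,\, p(x_i)) \,\big|\, D_i = 1 \big].
\end{align*}
```

The outer expectation averages over the distribution of p(x_i) among the treated, which is why each treated unit is compared with controls that have a similar score.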

Assumptions of PSM

The validity of PSM depends on two or three conditions: (1) conditional independence (namely, that unobserved factors do not affect participation); (2) sizable common support or overlap in propensity scores across the participant and nonparticipant samples; and (3) the balancing condition.

(1) Conditional independence

Conditional independence states that given a set of observable covariates X that are not affected by treatment, potential outcomes Y are independent of treatment assignment. If Yi^D represent outcomes for participants and Yi^C outcomes for nonparticipants, conditional independence implies (Yi^D, Yi^C) ⊥ Di | Xi.

The conditional independence assumption is also called the ignorability assumption: conditional on the covariates, treatment assignment is strictly exogenous, so there is no endogeneity problem.

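For random experiments, the outcomes are independent of treatment: y0, y1 ⊥ D. The treatment variable needs to be exogenous.

In a randomized experiment, treatment assignment is strictly exogenous: whether a unit ends up in the treatment or the control group is unrelated to its potential outcomes.

For observational studies, the outcomes are independent of treatment, conditional on x: y0, y1 ⊥ D | x. We need treatment assignment that ignores the outcomes.

In observational studies, such as quasi-experiments, treatment assignment is strictly exogenous once we condition on x.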

This assumption is also called unconfoundedness (Rosenbaum and Rubin 1983), and it implies that uptake of the program is based entirely on observed characteristics. To estimate the treatment effect on the treated (TOT), as opposed to the average treatment effect (ATE), a weaker assumption is needed.

Conditional independence of the control group outcome and treatment, y0 ⊥ D | x, is a weaker assumption than the full conditional independence assumption.

Conditional independence is a strong assumption: it means the regression equation includes all relevant variables, so there are no omitted variables. In addition, we do not know whether xi enters the equation nonlinearly.

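Conditional independence is a strong assumption and is not a directly testable criterion; it depends on specific features of the program itself. If unobserved characteristics determine program participation, conditional independence will be violated, and PSM is not an appropriate method.

What if the conditional independence assumption is violated?

All matching estimators rely on the ignorability assumption, that is, selection on observables, and do not apply when selection is on unobservables. For observational data where selection on unobservables is suspected, the options include:

(a) Use as many relevant observed variables as possible. (If xi contains a rich set of covariates, ignorability can more plausibly be taken as satisfied.)

(b) If the unobservables that drive selection into treatment Di are time-invariant and panel data are available, use PSM combined with difference-in-differences (DID-PSM).

(c) Use regression discontinuity design (RDD), in particular fuzzy RDD.

(d) Use instrumental variables (IV) estimation.

(e) Use the degree of selection on observables to gauge the influence of selection on unobservables.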

On its own, PSM is a useful approach when only observed characteristics are believed to affect program participation. Whether this belief is actually the case depends on the unique features of the program itself, in terms of targeting as well as individual takeup of the program. Assuming selection on observed characteristics is sufficiently strong to determine program participation, baseline data on a wide range of preprogram characteristics will allow the probability of participation based on observed characteristics to be specified more precisely. Some tests can be conducted to assess the degree of selection bias or participation on unobserved characteristics.

(2) Overlap assumption (sizable common support or overlap)

For each value of x, there are both treated and control observations. For each treated observation, there is a matched control observation with similar x.

This assumption means that the treated and control subsamples overlap. Because overlap is a precondition for matching, it is also called the "matching assumption". It guarantees that the ranges of the P-scores in the treatment and control groups share a common region (common support).

Overlap assumption: for every value of x, 0 < p(x) < 1.

This condition ensures that treatment observations have comparison observations "nearby" in the propensity score distribution (Heckman, LaLonde, and Smith 1999). Specifically, the effectiveness of PSM also depends on having a large and roughly equal number of participant and nonparticipant observations so that a substantial region of common support can be found. For estimating the TOT, this assumption can be relaxed to P(Ti = 1 | Xi) < 1, where Ti is the treatment indicator (Di in the notation used here).

There is overlap between p-score of participants and nonparticipants.

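When matching, to improve match quality we usually keep only units whose P-scores lie in the overlapping region, at the cost of losing some observations. If the common range of propensity scores is too small, the estimate will be biased.

Bias may also result from dropping nonparticipant observations that are systematically different from those retained; this problem can also be alleviated by collecting data on a large sample of nonparticipants, with enough variation to allow a representative sample. Otherwise, examining the characteristics of the dropped nonparticipant sample can refine the interpretation of the treatment effect.

Unlike the conditional independence assumption discussed above, common support is a precondition for matching and has no remedy. If the common support is too small, the data are not suitable for matching.

(3) Balancing condition

The balancing condition is closely tied to common support and can be regarded as part of it: it addresses the possible sampling bias introduced when units outside the overlapping P-score region are dropped.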

(a) Assignment to treatment is independent of the x characteristics, given the same propensity score: D ⊥ x | p(x).

(b) The balancing condition is testable.

Treatment units will therefore have to be similar to nontreatment units in terms of observed characteristics unaffected by participation; thus, some nontreatment units may have to be dropped to ensure comparability. However, sometimes a nonrandom subset of the treatment sample may have to be dropped if similar comparison units do not exist (Ravallion 2008). This situation is more problematic because it creates a possible sampling bias in the treatment effect. Examining the characteristics of dropped units may be useful in interpreting potential bias in the estimated treatment effects.

Heckman, Ichimura, and Todd (1997) encourage dropping treatment observations with weak common support. Only in the area of common support can inferences be made about causality. Figure 4.2 reflects a scenario where the common support is weak.

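Steps in PSM

1. Estimate the propensity score (using a logit or probit regression).

2. Match on the score. Matching methods include:

(1) Nearest neighbor matching (NNM), with or without a caliper. Using the propensity score as the criterion, search the control sample in both directions (scores above and below) for the unit whose score is closest to that of each treated unit, and form a pair.

(2) Radius matching. Set a constant r (a range or tolerance, commonly no more than one quarter of the standard deviation of the propensity score) and match each treated unit with the control units whose scores differ from its own by no more than r.

(3) Kernel matching. Each treated unit is matched to a counterfactual outcome constructed from all control units: a weighted average of the control outcomes, with weights computed by a kernel function of the distance between the treated unit's score and each control unit's score.

3. Assess balance after matching.

4. Compute the average treatment effect on the treated (ATT).

5. Run a sensitivity analysis.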
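Steps 1 to 4 can be tried end to end on simulated data before touching a real dataset. The following Stata sketch is only an illustration: the variables x1, x2, d, y and ps, the sample size, and the true effect of 2 are all made up for the example, and the block assumes the user-written psmatch2 package is already installed.

```stata
* Hypothetical simulated data; every variable name here is illustrative
clear
set seed 12345
set obs 2000
gen x1 = rnormal()
gen x2 = rnormal()
* Treatment depends only on observed covariates (selection on observables)
gen byte d = runiform() < invlogit(0.5*x1 - 0.5*x2)
* Outcome with a true treatment effect of 2
gen y = 1 + 2*d + x1 + x2 + rnormal()

* Step 1: estimate the propensity score with a logit model
logit d x1 x2
predict double ps, pr

* Steps 2 and 4: one-to-one nearest-neighbor matching on ps, reporting the ATT
* (psmatch2 is user-written: ssc install psmatch2)
psmatch2 d, pscore(ps) outcome(y) neighbor(1) common

* Step 3: covariate balance before and after matching
pstest x1 x2, both
```

Because selection here depends only on x1 and x2, the matched ATT should land close to the simulated effect, while the unmatched difference reported in the same table is contaminated by the covariates.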

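PSM in Stata

1. Preparation

PSM here relies on the user-written command psmatch2, and all later steps build on it, so we first install it in Stata:

Type the following in the Stata command window (the computer must be online):

ssc install psmatch2

If the installation succeeds, Stata displays: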

checking psmatch2 consistency and verifying not already installed...

installing into .\ado\plus\...

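installation complete. (This message means the installation has finished.)

To verify the installation and view the help file for psmatch2, type in the command window:

help psmatch2

If the help file opens, the installation succeeded.

Note: to export estimation results from Stata to Word, install the user-written command asdoc: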

ssc install asdoc, replace

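2. Data processing

(1) Import the data and set up the variables

Enter the commands to load the data, first changing to the data directory:

cd ×××××××××

Or open the downloaded data file directly in Stata: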

use "E:\倾向得分匹配PSM\20200727.dta"

xtset ID YEAR

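(2) Describe the data

describe

The log-transformed versions of the original variables are collected in a global macro:

global xlist "Ln_gdp Ln_gdp2 Ln_indus1 Ln_imex Ln_ppc Ln_ppc2 Ln_freight Ln_indus2"

(3) Estimate the propensity score

To make the results reproducible, first set the seed and sort the data in random order: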

set seed 0001

gen tmp = runiform()

sort tmp

psmatch2 TREAT $xlist, out(Lnfdi) logit neighbor(1) common ate

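Note: this uses one-to-one nearest-neighbor matching, with the propensity score estimated by a logit model.

The output includes the logit regression results, the average treatment effect on the treated (ATT), the average treatment effect on the untreated (ATU), the average treatment effect (ATE), and the common support summary.

The output has three parts. The first is the logit regression. The second is the difference between the treatment and control groups before and after matching, together with its significance: before matching the difference is 1.11262849 with a t-value of 1.18; after matching it is 0.044156431 with a t-value of 0.09. The third summarizes which observations fall within the common support.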
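The run above uses one-to-one nearest-neighbor matching. The other matching methods listed in the steps section could be requested from psmatch2 along the following lines. This is a sketch that assumes the dataset, TREAT, Lnfdi and $xlist from the commands above are still in memory; the caliper and bandwidth values are illustrative, not recommendations.

```stata
* Assumes the data, TREAT, Lnfdi and $xlist defined above are in memory.
* Caliper and bandwidth values below are illustrative only.

* One-to-four nearest-neighbor matching within a caliper
psmatch2 TREAT $xlist, outcome(Lnfdi) logit neighbor(4) caliper(0.05) common

* Radius matching: all controls whose score lies within 0.05 of the treated unit
psmatch2 TREAT $xlist, outcome(Lnfdi) logit radius caliper(0.05) common

* Kernel matching with a normal (Gaussian) kernel
psmatch2 TREAT $xlist, outcome(Lnfdi) logit kernel kerneltype(normal) bwidth(0.06) common
```

Each call overwrites the variables created by the previous one (_pscore, _weight and so on), so comparing ATT estimates across methods means recording the result after every run.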

(4) Balance test

Next we use the pstest command to check whether matching balanced the covariates well, that is, whether the balancing condition holds.

pstest $xlist, both graph

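Note: the output lets us judge whether the balancing condition holds. If, after matching, the standardized bias (%bias) of every variable is below 5% and every t-test fails to reject the null hypothesis of no systematic difference between the treatment and control groups, the balancing condition is taken to be satisfied.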
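For reference, the %bias column can be read as the standardized mean difference of Rosenbaum and Rubin (1985). The LaTeX sketch below reflects my reading of the pstest help file rather than its source code: the means are taken over the full or the matched sample, and the denominator is the square root of the average of the treated and control sample variances.

```latex
\[
\%\text{bias} \;=\; 100 \times
\frac{\bar{x}_{\text{treated}} - \bar{x}_{\text{control}}}
     {\sqrt{\left(s^{2}_{\text{treated}} + s^{2}_{\text{control}}\right)/2}}
\]
```

A covariate whose group means differ by one tenth of a pooled standard deviation therefore shows a bias of 10%.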

(5) Propensity score analysis

Next, psgraph plots the propensity scores so we can inspect the common support visually:

psgraph

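The pstest output reports, for each covariate, the treated and control means before and after matching, the standardized bias, the percentage reduction in bias, and the t-test, followed by a summary comparison of the unmatched and matched samples. The psgraph histogram shows the distribution of propensity scores for treated and untreated units, on and off the common support.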
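The common support can also be checked numerically. A minimal sketch, assuming psmatch2 has just been run so that the variables it creates (_pscore, _treated, _support) are in memory:

```stata
* Assumes psmatch2 has just been run, so _pscore, _treated and _support exist.

* Range of the estimated propensity score in each group
tabstat _pscore, by(_treated) statistics(n min max mean)

* Number of treated and control observations on and off the common support
tabulate _treated _support
```

If the score ranges of the two groups barely overlap, or many treated observations are off support, the warnings in the overlap section apply.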

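Notes

With the steps above we have completed the simplest case, one-to-one propensity score matching. psmatch2 also offers other matching methods, such as nearest-neighbor matching within a given radius (caliper) and matching to all controls within a given propensity score threshold. Type help psmatch2 in Stata to see all available options. Note, however, that psmatch2 regenerates _id on every run. If you add an if condition to psmatch2 and run it repeatedly in a loop, save the match results at the end of each iteration into other variables keyed by your own data's ID. Otherwise the _id and _n1 information from that run is overwritten by the next one, and you can no longer tell which control was matched to which treated unit.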
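One way to keep the match information is sketched below. It assumes one-to-one nearest-neighbor matching with psmatch2 has just been run, that ID and YEAR are numeric and identify the observations as in the xtset call above, and that _id numbers the observations consecutively, as the psmatch2 help file's own x[_n1] example implies. matched_ID and matched_YEAR are hypothetical names chosen for this illustration.

```stata
* Assumes one-to-one psmatch2 has just been run; ID and YEAR identify observations.
* matched_ID and matched_YEAR are hypothetical variable names.

* _n1 holds the _id of the matched control, so sort by _id before indexing
sort _id
gen matched_ID   = ID[_n1]   if _treated == 1 & !missing(_n1)
gen matched_YEAR = YEAR[_n1] if _treated == 1 & !missing(_n1)
```

Inside a loop, give the saved variables a different name for each iteration (or drop them after exporting the results), since the next psmatch2 call replaces _id and _n1.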
