Adaptation of Statistical Learning Theory as an Information Theoretic, Finite Sample, Solution to Propensity Score Analysis
Propensity score analysis seeks to resolve the bias that arises when the intervention conditions (treatment and control) are not randomly assigned to subjects. In this situation, the pre-intervention characteristics of the subjects are not naturally equivalent across the two conditions. When these pre-intervention characteristics are related to the post-intervention outcome of interest, they act as confounders ($X$) that bias the estimated causal effect of the intervention on the outcome. Resolving the bias requires a process in which all of the $X$ are equated across the intervention conditions. The propensity score is defined as the conditional probability that a subject chooses the treatment condition rather than the control condition, given $X$. Under certain conditions, successfully equating the propensity scores between intervention conditions also equates $X$, and the bias is resolved. In finite samples, the propensity score must be estimated. The problem with existing propensity score estimation methods lies in their assumption of a direct, error-free relationship between the data and the function that governs the observed treatment selection. This assumption holds in only two situations: (1) the sample size approaches infinity, so that the sample becomes asymptotically equivalent to the population, or (2) an infinite number of samples is drawn from the population. In my dissertation, I propose an estimator based on statistical learning theory that requires neither infinite replications nor infinite sample size. The methodology does not rely on asymptotic expectation; instead, it seeks to minimize both sources of overfitting, or bias, as an explicit part of the regularized estimation process.
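For concreteness, the quantities described above can be written in standard notation. This is a sketch under stated assumptions, not the dissertation's own formulation: the symbols $T$ (treatment indicator), $e(X)$, the function class $\mathcal{F}$, the loss $\ell$, the complexity penalty $\Omega$, and the tuning constant $\lambda$ are introduced here for illustration only. The propensity score and its balancing property are

$$
e(X) = \Pr(T = 1 \mid X), \qquad T \perp X \mid e(X).
$$

A regularized estimator in the general spirit of structural risk minimization could then take the form

$$
\hat{e} = \arg\min_{f \in \mathcal{F}} \left[ \frac{1}{n} \sum_{i=1}^{n} \ell\big(T_i, f(X_i)\big) + \lambda\, \Omega(f) \right],
$$

where the first term is the empirical risk on the $n$ observed subjects and the second term penalizes the complexity of $f$, so that fit to the finite sample is traded off against capacity rather than justified by an asymptotic argument.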
Results from the real data analysis and the simulation studies demonstrate accurate recovery of the least empirically biased estimate of the causal effect, with the smallest standard errors of estimation, as expected from the properties of structural risk minimization. Sample size has minimal effect on the first- and second-order consistency of the proposed process, for arbitrary function characteristics. The manuscript concludes with a discussion of the connections and applications to hierarchical data estimation, a common scenario in propensity score analysis in the social sciences.
Hurley, Landon, "Adaptation of Statistical Learning Theory as an Information Theoretic, Finite Sample, Solution to Propensity Score Analysis" (2019). ETD Collection for Fordham University. AAI22619650.