Propensity Score
The propensity score means you don’t have to condition on the entirety of X to make the potential outcomes independent of the treatment. It is sufficient to condition on a single variable, the propensity score: $$(Y_{0},Y_{1}) \perp T|e(x)$$ The propensity score is the conditional probability of receiving the treatment, right? So we can think of it as a function that maps X to the probability of treatment T. The propensity score makes this middle ground between th...
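To make the claim about conditioning on $e(x)$ alone concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the two binary covariates, the hypothetical `propensity` function, the outcome model, and the true effect of 2.0. The propensity score is taken as known rather than estimated, which keeps the sketch short but is rarely the case in practice.

```python
import random
from collections import defaultdict
from statistics import fmean

def propensity(x1, x2):
    """Hypothetical known propensity score e(x) = P(T=1 | x1, x2)."""
    return 0.2 + 0.3 * (x1 + x2)

rng = random.Random(7)
TRUE_EFFECT = 2.0  # assumed treatment effect used to generate the data

rows = []
for _ in range(20000):
    x1, x2 = rng.randint(0, 1), rng.randint(0, 1)
    e = propensity(x1, x2)
    t = 1 if rng.random() < e else 0
    # Both covariates raise the outcome AND the chance of treatment (confounding).
    y = 3 * x1 + 5 * x2 + TRUE_EFFECT * t + rng.gauss(0, 1)
    rows.append((e, t, y))

# Naive comparison of treated vs. untreated ignores the confounding.
naive = (fmean(y for _, t, y in rows if t == 1)
         - fmean(y for _, t, y in rows if t == 0))

# Stratify on e(x) only: (0,1) and (1,0) share a stratum even though X differs.
strata = defaultdict(lambda: {0: [], 1: []})
for e, t, y in rows:
    strata[round(e, 2)][t].append(y)

# Size-weighted average of the within-stratum differences in means.
ate = sum(
    (len(g[0]) + len(g[1])) / len(rows) * (fmean(g[1]) - fmean(g[0]))
    for g in strata.values()
)

print(f"naive difference:   {naive:.2f}")
print(f"stratified on e(x): {ate:.2f}")
```

In this setup the naive difference overstates the effect, while the estimate stratified on the three values of $e(x)$ lands close to the assumed effect, even though one stratum mixes two different values of X.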
Stats Review
“Some equations are dangerous if you know them, and others are dangerous if you do not. The first category may pose danger because the secrets within its bounds open doors behind which lies terrible peril. The obvious winner in this is Einstein’s iconic equation $E=mc^2$, for it provides a measure of the enormous energy hidden within ordinary matter. […] Instead I am interested in equations that unleash their danger not when we know about them, but rather when we do not. Kept close at h...
Beyond Confounders
Good Control: Sometimes the treatment’s effect on the outcome is much smaller than that of other factors. To figure out the effect of the treatment, we should control for those other factors, because if a variable is a good predictor of the outcome, it will explain away a lot of its variance. To demonstrate this, let’s resort to the partialling out way of breaking regression into 2 steps. First, we will regress the treatment, email, and the outcome, payments, on the additional controls, credit limit and risk scor...
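The two-step partialling out idea can be sketched in plain Python. This is a simplified illustration, not the post's own code: it uses simulated data, a single hypothetical control standing in for credit limit and risk score, a randomized 0/1 `email` treatment, and an assumed true effect of 1.0.

```python
import random
from statistics import fmean, variance

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y):
    """Residuals of y after regressing it on x (the partialling out step)."""
    intercept, slope = simple_ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

rng = random.Random(42)
n = 5000
control = [rng.gauss(0, 1) for _ in range(n)]          # hypothetical control
email = [rng.randint(0, 1) for _ in range(n)]          # randomized treatment
payments = [1.0 * t + 5.0 * c + rng.gauss(0, 1)        # assumed outcome model
            for t, c in zip(email, control)]

# Step 1: regress treatment and outcome on the control, keep the residuals.
email_res = residuals(control, email)
payments_res = residuals(control, payments)

# Step 2: regress the outcome residuals on the treatment residuals.
_, effect = simple_ols(email_res, payments_res)

print(f"outcome variance before controlling: {variance(payments):.2f}")
print(f"outcome variance after controlling:  {variance(payments_res):.2f}")
print(f"estimated treatment effect:          {effect:.2f}")
```

Because the control predicts the outcome well, most of the outcome's variance disappears in step 1, which is what makes the small treatment effect easier to detect in step 2.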
Evidence-Based Medicine: Weeks 7-8 Review
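The binomial and Poisson distributions and their applications. Binomial distribution: $$ P(X=x) = C^x_{n} \pi^x (1-\pi)^{n-x}$$ $$\mu = n\pi, \sigma^2={ n\pi(1-\pi) }$$ The standard error of a sample rate is computed the same way as the standard error of the mean in the normal case: $$S_{p}=\sqrt{ \frac{p(1-p)}{n} }$$ Confidence interval for the population rate: the table lookup method, or the normal approximation (sample size > 100, $\pi \approx 0.5$) $$\begin{aligned} u &= \frac{{p-\pi_{0}}}{\sigma_{p}} \\ \sigma_{p} &= \sqrt{ \frac{\pi_{0}(1-\pi_{0})}{n} } \end{aligned}$$ *The historical mortality rate is 40%; in the trial, 30 of 120 patients died. Statistical inference: $H_0$: $\pi = \pi_{0} = 0.4$ (the rates are equal); $H_1$: $\pi \neq \pi_{0}$ (the rates differ). Set alpha to 0.05, two-tailed test.* $$\begin{aligned} \sigma_{p}&=\sqrt{ \frac{\pi_{0}(1-\pi_{0})}{n} }\\ &=\sq...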
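The worked example above (historical mortality 40%, 30 deaths among 120 patients, two-tailed test at alpha 0.05) can be checked numerically. This is a small sketch of the normal-approximation test using only the formulas for $u$ and $\sigma_{p}$ given above; the function name `proportion_z_test` is made up for this illustration.

```python
import math
from statistics import NormalDist

def proportion_z_test(events, n, pi0):
    """Normal-approximation test of a sample rate against a known rate pi0."""
    p = events / n                              # observed sample rate
    sigma_p = math.sqrt(pi0 * (1 - pi0) / n)    # standard error under H0
    u = (p - pi0) / sigma_p                     # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(u)))  # two-tailed p-value
    return p, sigma_p, u, p_value

p, sigma_p, u, p_value = proportion_z_test(events=30, n=120, pi0=0.40)
print(f"p = {p:.3f}, sigma_p = {sigma_p:.4f}, u = {u:.3f}, p-value = {p_value:.4f}")
```

With these numbers $u$ is about $-3.35$, which lies beyond the two-tailed critical value of 1.96, so $H_0$ is rejected at alpha 0.05.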
Evidence-Based Medicine: Weeks 5-6 Review
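ANOVA: comparing the means of several groups. Multiple comparisons: the SNK-q, Dunnett-t, and LSD-t tests; of these, SNK-q is the hardest to reach significance with and LSD-t the easiest. SNK-q: compares the means of every pair of groups. Dunnett-t: compares k-1 treatment groups with one control group. LSD-t: compares a few specific groups. Prerequisites: normality and homogeneity of variance. Bartlett's test: for normally distributed data. Levene's test: for data of any distribution. Two-way ANOVA: analysis of variance does not mean analyzing variances; it analyzes means. Like one-way ANOVA, it can only tell whether all the groups are the same. ANOVA for factorial designs: first determine whether there is an interaction effect. If there is none, analyze the main effects; if there is one, analyze the simple effects. Q&A: How should we understand the univariate regression the teacher used in class? Is its role to pull populations with different normal distributions onto the same baseline? For the ANOVA of a factorial design, the data contain several normally distributed groups, so the pooled data do not follow a normal distribution; we therefore need to check whether the residuals after fitting satisfy normality and homogeneity of variance. $$\begin{aligned}Y = \beta_{0} + \beta_{1} X_{1} + \beta_{2}X_{2}\end{aligned}$$ If $X_{1}$: liver=...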
Evidence-Based Medicine: Weeks 3-4 Review
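Hypothesis testing. Object of the test: the mean of a drawn sample. The distribution of the sample mean is approximately normal, even though the sampled observations themselves need not be. From the normal distribution to the u distribution to the t distribution (used when the variance is unknown, which is most of the time). Computing the confidence interval (CI): relies on the standard normal distribution $\mathcal{N}(0,1)$; distinguish two-sided from one-sided tests. Transformation: $$\begin{aligned}X \sim \mathcal{N}(\mu, \sigma^2)\\X-\mu \sim \mathcal{N}(0, \sigma^2)\\\frac{X-\mu}{\sigma} \sim \mathcal{N}(0, 1)\end{aligned}$$ Parametric tests for comparing two means. One-sample t-test: comparison with a known mean. Suppose the sample mean I measured is $\bar{X} \sim \mathcal{N}(\mu_{1}, \frac{\sigma_{1}^2}{n})$ and the population mean is the constant $\mu_{2}$. Taking the difference of the two gives: $$\bar{X}-\mu_{2}\sim \mathcal{N}(\mu_{1}-\mu_{2}, \frac{\sigma_{1}^2}{n})$$ What I want is to show that the difference...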
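As a rough illustration of the one-sample t-test described above, the sketch below standardizes $\bar{X}-\mu_{2}$ by the estimated standard error. The ten measurements and the reference mean of 5.0 are invented for the example, and the critical value 2.262 is the two-sided 5% point of the t distribution with 9 degrees of freedom.

```python
import math
from statistics import fmean, stdev

def one_sample_t(sample, mu_known):
    """t statistic for comparing a sample mean with a known mean."""
    n = len(sample)
    x_bar = fmean(sample)
    se = stdev(sample) / math.sqrt(n)   # estimated standard error of the mean
    return (x_bar - mu_known) / se, n - 1

# Hypothetical measurements, made up for this sketch.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.9, 5.2, 5.7]
t_stat, df = one_sample_t(sample, mu_known=5.0)

T_CRIT_DF9 = 2.262  # two-sided critical value for alpha = 0.05, df = 9
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
print("reject H0" if abs(t_stat) > T_CRIT_DF9 else "fail to reject H0")
```

The sample standard deviation replaces the unknown $\sigma_{1}$, which is why the statistic is compared with the t distribution rather than with $\mathcal{N}(0,1)$.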
My First Post
This is my first post. alert('Hello World!');
Evidence-Based Medicine: Weeks 1-2 Review
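The main point of the first week is really to appreciate the relationship between probability theory and mathematical statistics. Basic concepts in statistics. PPT-1, basic concepts: population, parameters, $\mu, \sigma$; sample, statistics, $\bar{X}, \bar{Y}$; homogeneity and heterogeneity; sampling error. Measurement data: ordered; count data: unordered; ranked data: ordered. Causation and association: RCTs sit at the top of the evidence pyramid. The RCT is the most important method for revealing causal relationships, but because of its cost, most of the time we can only use substitutes for it. A fun fact about sampling error, the SE: why do we need the SE? As the figure below shows, a US study of the relationship between class size and class average score found that the high-scoring classes were the small ones. In fact, this was a spurious causal conclusion produced by the larger SE that comes with very small classes! PPT-2: which kind of mean suits which situation, and why. Arithmetic mean: the arithmetic mean suits the normal distribution because it is the maximum likelihood estimate of the normal distribution's mean $$P(x)=\frac{1}{\sqrt{ 2\pi }\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$$$\begin{aligned}l(x_{1},x_{2},\dots,x_{n})&...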
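The class-size story can be reproduced with a short simulation. This is only a sketch under invented assumptions: every student's score is drawn from the same normal distribution (mean 70, standard deviation 10), so class size has no real effect on scores, and the class sizes of 10 and 100 are arbitrary.

```python
import random
from statistics import fmean, stdev

def class_means(class_size, n_classes, rng, mu=70.0, sigma=10.0):
    """Average score of each simulated class; all students share one distribution."""
    return [
        fmean(rng.gauss(mu, sigma) for _ in range(class_size))
        for _ in range(n_classes)
    ]

rng = random.Random(0)
small = class_means(class_size=10, n_classes=500, rng=rng)
large = class_means(class_size=100, n_classes=500, rng=rng)

sd_small, sd_large = stdev(small), stdev(large)
print(f"spread of class averages, size 10:  {sd_small:.2f}")   # theory: 10/sqrt(10)
print(f"spread of class averages, size 100: {sd_large:.2f}")   # theory: 10/sqrt(100)
print(f"best small class: {max(small):.1f}, best large class: {max(large):.1f}")
```

The top-scoring classes come from the small group purely because $SE = \sigma/\sqrt{n}$ is larger there; the same mechanism also puts small classes at the bottom of the ranking.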