[Mathematical Statistics] 11. Estimators

2019. 3. 30. 08:00


A statistic used to estimate a parameter is called an estimator. This post covers the properties an estimator should have.

Let $\hat{\theta}$ be an estimator of a parameter $\theta$. If $E(\hat{\theta})=\theta$, then $\hat{\theta}$ is called an unbiased estimator of $\theta$; if $E(\hat{\theta})\neq\theta$, then $\hat{\theta}$ is called a biased estimator of $\theta$. The difference $E(\hat{\theta})-\theta$ is called the bias and is denoted $\text{Bias}(\hat{\theta})$.

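An estimator with bias is systematically off target: on average it lands away from $\theta$. Note that bias and variance are separate properties. The bias measures how far the center of the estimator's distribution is from $\theta$, while the variance measures the spread around that center, so neither one determines the other.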

For example, since $E(\overline{X})=\mu$, the sample mean $\overline{X}$ is an unbiased estimator of $\mu$. Likewise, if $X\,\sim\,B(n,\,p)$ and $\displaystyle\hat{p}=\frac{X}{n}$, then$$E(\hat{p})=E\left(\frac{X}{n}\right)=\frac{E(X)}{n}=\frac{np}{n}=p$$so $\hat{p}$ is an unbiased estimator of $p$.
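The unbiasedness of $\hat{p}$ can also be checked by summing over the binomial distribution directly. The following is a minimal sketch; the function names `binomial_pmf` and `expected_p_hat` and the values $n=10$, $p=3/10$ are illustrative choices, not part of the original text. Exact rational arithmetic is used so that the equality holds without rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p); exact when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_p_hat(n, p):
    """E(p_hat) = sum over k of (k / n) * P(X = k)."""
    return sum(Fraction(k, n) * binomial_pmf(n, p, k) for k in range(n + 1))


# Illustrative values (assumptions, not from the text).
n, p = 10, Fraction(3, 10)
print(expected_p_hat(n, p))  # 3/10, i.e. exactly p
```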

When a random sample $X_{1},\,X_{2},\,\cdots,\,X_{n}$ of size $n$ is drawn from a population with mean $\mu$ and variance $\sigma^{2}$, the following two estimators are used to estimate the population variance $\sigma^{2}$.$$S^{2}=\frac{1}{n-1}\sum_{i=1}^{n}{(X_{i}-\overline{X})^{2}},\,\hat{\sigma^{2}}=\frac{1}{n}\sum_{i=1}^{n}{(X_{i}-\overline{X})^{2}}$$Since $\sigma^{2}=\text{Var}(X_{i})=E((X_{i}-\mu)^{2})=E(X_{i}^{2})-\mu^{2}$, $\displaystyle\frac{\sigma^{2}}{n}=\text{Var}(\overline{X})=E(\overline{X}^{2})-\mu^{2}$, and $\displaystyle\sum_{i=1}^{n}{X_{i}}=n\overline{X}$, we have$$\begin{align*}E\left(\sum_{i=1}^{n}{(X_{i}-\overline{X})^{2}}\right)&=E\left(\sum_{i=1}^{n}{(X_{i}^{2}-2\overline{X}X_{i}+\overline{X}^{2})}\right)\\&=E\left(\sum_{i=1}^{n}{X_{i}^{2}}-2n\overline{X}^{2}+n\overline{X}^{2}\right)\\&=\sum_{i=1}^{n}{E(X_{i}^{2})}-nE(\overline{X}^{2})\\&=n(\sigma^{2}+\mu^{2})-n\left(\mu^{2}+\frac{1}{n}\sigma^{2}\right)\\&=(n-1)\sigma^{2}\end{align*}$$and therefore$$E(S^{2})=\sigma^{2},\,E(\hat{\sigma^{2}})=\frac{n-1}{n}\sigma^{2}$$so $S^{2}$ is an unbiased estimator of $\sigma^{2}$, whereas $\hat{\sigma^{2}}$ is a biased estimator.
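The two expectations can be verified exactly on a small discrete population by enumerating every possible sample. This is only a sketch under stated assumptions: the population `values = [0, 1, 2, 5]` (each value equally likely, sampled independently) and the sample size `n = 3` are made up for illustration, as are the function names.

```python
from fractions import Fraction
from itertools import product


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample)


def expected_variance_estimators(values, n):
    """Exact (E(S^2), E(sigma_hat^2)) for n iid draws, uniform on `values`."""
    samples = list(product(values, repeat=n))  # all equally likely samples
    weight = Fraction(1, len(samples))
    e_ss = sum(weight * sum_sq_dev(s) for s in samples)
    return e_ss / (n - 1), e_ss / n


# Hypothetical population and sample size (assumptions for illustration).
values = [0, 1, 2, 5]
n = 3
mu = Fraction(sum(values), len(values))
sigma2 = sum((v - mu) ** 2 for v in values) / len(values)

e_s2, e_sigma_hat2 = expected_variance_estimators(values, n)
print(sigma2, e_s2, e_sigma_hat2)  # 7/2 7/2 7/3
```

Here $E(S^{2})$ equals $\sigma^{2}=7/2$, while $E(\hat{\sigma^{2}})=\frac{2}{3}\cdot\frac{7}{2}=\frac{7}{3}$, in agreement with the factor $\frac{n-1}{n}$.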

The standard deviation of an estimator $\hat{\theta}$ is called its standard error and is denoted $\text{SE}(\hat{\theta})$. That is, $\text{SE}(\hat{\theta})=\sqrt{\text{Var}(\hat{\theta})}$.

For example, $\displaystyle\text{SE}(\overline{X})=\frac{\sigma}{\sqrt{n}},\,\text{SE}(\hat{p})=\sqrt{\frac{p(1-p)}{n}}$.

*The mean and variance of the sample proportion $\hat{p}$ are $\displaystyle p$ and $\displaystyle\frac{p(1-p)}{n}$, respectively.
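The two standard-error formulas translate directly into code. The sketch below uses hypothetical helper names (`se_mean`, `se_proportion`) and made-up inputs; it assumes $\sigma$ and $p$ are known, which in practice they usually are not.

```python
from math import sqrt


def se_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def se_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)


# Illustrative inputs (assumptions): quadrupling n halves the standard error.
for n in (100, 400):
    print(n, se_mean(2.0, n), se_proportion(0.5, n))
```

Because both formulas scale as $1/\sqrt{n}$, halving the standard error requires four times as many observations.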

For a parameter $\theta$ and its estimator $\hat{\theta}$, the quantity $\text{MSE}(\hat{\theta})=E((\hat{\theta}-\theta)^{2})$ is called the mean squared error. Write $E(\hat{\theta})=\mu$ (here $\mu$ denotes the mean of the estimator, not the population mean). Then $\mu$ and $\theta$ are constants, $E(\hat{\theta}-\mu)=0$, $\text{Var}(\hat{\theta})=E((\hat{\theta}-\mu)^{2})$, and $\text{Bias}(\hat{\theta})=E(\hat{\theta})-\theta=\mu-\theta$, so$$\begin{align*}\text{MSE}(\hat{\theta})&=E((\hat{\theta}-\theta)^{2})=E((\hat{\theta}-\mu+\mu-\theta)^{2})\\&=E((\hat{\theta}-\mu)^{2})+2(\mu-\theta)E(\hat{\theta}-\mu)+(\mu-\theta)^{2}\\&=E((\hat{\theta}-\mu)^{2})+(\mu-\theta)^{2}\\&=\text{Var}(\hat{\theta})+(\text{Bias}(\hat{\theta}))^{2}\end{align*}$$
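The decomposition $\text{MSE}=\text{Var}+\text{Bias}^{2}$ can be checked exactly for any estimator based on $X\,\sim\,B(n,\,p)$. The sketch below does this for a deliberately biased estimator, $(X+1)/(n+2)$. That estimator, the function names, and the values $n=8$, $p=1/5$ are illustrative assumptions and do not appear in the original text.

```python
from fractions import Fraction
from math import comb


def pmf(n, p, k):
    """P(X = k) for X ~ B(n, p); exact when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def moments(estimator, n, p):
    """Return (bias, variance, mse) of estimator(X, n) for X ~ B(n, p)."""
    probs = [pmf(n, p, k) for k in range(n + 1)]
    vals = [estimator(k, n) for k in range(n + 1)]
    mean = sum(q * v for q, v in zip(probs, vals))
    var = sum(q * (v - mean) ** 2 for q, v in zip(probs, vals))
    mse = sum(q * (v - p) ** 2 for q, v in zip(probs, vals))
    return mean - p, var, mse


def shrunk(k, n):
    """A hypothetical biased estimator of p, used only for illustration."""
    return Fraction(k + 1, n + 2)


bias, var, mse = moments(shrunk, 8, Fraction(1, 5))
print(bias, var, mse)
print(mse == var + bias**2)  # True
```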

For a random sample $X_{1},\,X_{2},\,\cdots,\,X_{n}$ of size $n$ drawn from a population with mean $\mu$ and variance $\sigma^{2}$, let $T_{1}=\overline{X},\,T_{2}=X_{1}$. Since$$\text{Bias}(T_{1})=E(\overline{X})-\mu=0,\,\text{Bias}(T_{2})=E(X_{1})-\mu=0$$both $T_{1}$ and $T_{2}$ are unbiased estimators of $\mu$. However,$$\text{MSE}(T_{1})=\text{Var}(\overline{X})=\frac{\sigma^{2}}{n},\,\text{MSE}(T_{2})=\text{Var}(X_{1})=\sigma^{2}$$so for $n\geq2$ the estimator $T_{1}=\overline{X}$, which has the smaller mean squared error, is a better estimator of $\mu$ than $T_{2}=X_{1}$.
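A small Monte Carlo experiment makes the comparison between $T_{1}$ and $T_{2}$ concrete. This is a rough sketch under assumptions that are not in the text: the population is taken to be normal with $\mu=5$ and $\sigma=2$, the sample size is $n=10$, and the seed and number of repetitions are arbitrary. The simulated values only approximate the theoretical ones.

```python
import random


def simulate_mse(mu=5.0, sigma=2.0, n=10, reps=20000, seed=0):
    """Estimate MSE(T1) and MSE(T2) by simulation (normal population assumed)."""
    rng = random.Random(seed)
    sq_err_t1 = 0.0
    sq_err_t2 = 0.0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        t1 = sum(sample) / n  # T1: the sample mean
        t2 = sample[0]        # T2: the first observation only
        sq_err_t1 += (t1 - mu) ** 2
        sq_err_t2 += (t2 - mu) ** 2
    return sq_err_t1 / reps, sq_err_t2 / reps


mse_t1, mse_t2 = simulate_mse()
# Theory: MSE(T1) = sigma^2 / n = 0.4 and MSE(T2) = sigma^2 = 4.0
print(mse_t1, mse_t2)
```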

For the population considered above, let$$S^{2}=\frac{1}{n-1}\sum_{i=1}^{n}{(X_{i}-\overline{X})^{2}},\,\hat{\sigma^{2}}=\frac{1}{n}\sum_{i=1}^{n}{(X_{i}-\overline{X})^{2}}$$Then $\displaystyle E(S^{2})=\sigma^{2},\,E(\hat{\sigma^{2}})=\frac{n-1}{n}\sigma^{2}$, so$$\text{Bias}(S^{2})=E(S^{2})-\sigma^{2}=0,\,\text{Bias}(\hat{\sigma^{2}})=E(\hat{\sigma^{2}})-\sigma^{2}=-\frac{1}{n}\sigma^{2}$$Now assume the population is normal. Then $\displaystyle\frac{(n-1)S^{2}}{\sigma^{2}}\,\sim\,\chi^{2}(n-1)$, whose variance is $2(n-1)$, so $\displaystyle\text{Var}(S^{2})=\frac{2}{n-1}\sigma^{4}$. Using $\displaystyle\hat{\sigma^{2}}=\frac{n-1}{n}S^{2}$, the mean squared errors are$$\begin{align*}\text{MSE}(S^{2})&=\text{Var}(S^{2})=\frac{2}{n-1}\sigma^{4}\\ \text{MSE}(\hat{\sigma^{2}})&=\text{Var}\left(\frac{n-1}{n}S^{2}\right)+(\text{Bias}(\hat{\sigma^{2}}))^{2}\\&=\left(\frac{n-1}{n}\right)^{2}\frac{2}{n-1}\sigma^{4}+\left(-\frac{1}{n}\sigma^{2}\right)^{2}\\&=\frac{2n-1}{n^{2}}\sigma^{4}\end{align*}$$and$$\begin{align*}\text{MSE}(S^{2})-\text{MSE}(\hat{\sigma^{2}})&=\frac{2}{n-1}\sigma^{4}-\frac{2n-1}{n^{2}}\sigma^{4}\\&=\frac{3n-1}{n^{2}(n-1)}\sigma^{4}>0\,(n\geq2)\end{align*}$$so in terms of mean squared error, $\hat{\sigma^{2}}$ is a better estimator than $S^{2}$, even though it is biased.
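The algebra in this comparison is easy to get wrong, so here is a short sketch that recomputes both mean squared errors in units of $\sigma^{4}$ with exact fractions and checks the closed form of their difference. The function names and the chosen values of $n$ are illustrative, and the formulas rely on the normality assumption stated above.

```python
from fractions import Fraction


def mse_s2(n):
    """MSE(S^2) / sigma^4 under normality: 2 / (n - 1)."""
    return Fraction(2, n - 1)


def mse_sigma_hat2(n):
    """MSE(sigma_hat^2) / sigma^4, computed as variance plus squared bias."""
    var = Fraction(n - 1, n) ** 2 * Fraction(2, n - 1)
    bias = Fraction(-1, n)
    return var + bias**2


for n in (2, 5, 10, 100):
    gap = mse_s2(n) - mse_sigma_hat2(n)
    # The gap should match (3n - 1) / (n^2 (n - 1)) and be positive.
    assert gap == Fraction(3 * n - 1, n * n * (n - 1))
    print(n, mse_s2(n), mse_sigma_hat2(n), gap)
```

The gap shrinks roughly like $3/n^{2}$, so the advantage of $\hat{\sigma^{2}}$ matters mainly for small samples.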

An estimator $T$ of $\theta$ is called a consistent estimator of $\theta$ if, for every $\epsilon>0$,$$\lim_{n\,\rightarrow\,\infty}{P(|T-\theta|<\epsilon)}=1$$and it is called a mean-squared-error consistent estimator of $\theta$ if$$\lim_{n\,\rightarrow\,\infty}{\text{MSE}(T)}=\lim_{n\,\rightarrow\,\infty}{E((T-\theta)^{2})}=0$$Here $T$ depends on the sample size $n$.

The mode of convergence expressed by the limit in the definition of a consistent estimator is called convergence in probability.
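To see convergence in probability numerically, the probability $P(|\hat{p}-p|<\epsilon)$ for the sample proportion $\hat{p}=X/n$ with $X\,\sim\,B(n,\,p)$ can be computed from the binomial distribution for increasing $n$. The sketch below is illustrative only: the values $p=3/10$ and $\epsilon=1/20$ and the function names are assumptions. The probabilities are evaluated in log space to avoid overflow for large $n$, and the inequality is tested with exact fractions so that boundary cases are not affected by rounding.

```python
from fractions import Fraction
from math import exp, lgamma, log


def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p) with 0 < p < 1, computed via logarithms."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def prob_within(n, p, eps):
    """P(|X/n - p| < eps); p and eps are Fractions so the test is exact."""
    pf = float(p)
    return sum(binomial_pmf(n, pf, k)
               for k in range(n + 1)
               if abs(Fraction(k, n) - p) < eps)


# Illustrative values (assumptions, not from the text).
p, eps = Fraction(3, 10), Fraction(1, 20)
for n in (10, 100, 1000, 10000):
    print(n, prob_within(n, p, eps))
```

The printed probabilities increase toward 1 as $n$ grows, which is what the definition requires for this particular $\epsilon$.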

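For $X\,\sim\,B(n,\,p)$, the sample proportion $\displaystyle\hat{p}=\frac{X}{n}$ was used as an estimator of $p$. Suppose instead that the constant $\displaystyle\tilde{p}=\frac{1}{2}$ is used to estimate $p$. Then$$\text{Var}(\hat{p})=\frac{p(1-p)}{n},\,\text{Bias}(\hat{p})=0,\,\text{Var}(\tilde{p})=0,\,\text{Bias}(\tilde{p})=\frac{1}{2}-p$$so$$\begin{align*}\text{MSE}(\hat{p})&=\text{Var}(\hat{p})+(\text{Bias}(\hat{p}))^{2}=\frac{p(1-p)}{n}\\ \text{MSE}(\tilde{p})&=\text{Var}(\tilde{p})+(\text{Bias}(\tilde{p}))^{2}=\left(\frac{1}{2}-p\right)^{2}\end{align*}$$and, for $\displaystyle p\neq\frac{1}{2}$,$$\lim_{n\,\rightarrow\,\infty}{\text{MSE}(\hat{p})}=\lim_{n\,\rightarrow\,\infty}{\frac{p(1-p)}{n}}=0,\,\lim_{n\,\rightarrow\,\infty}{\text{MSE}(\tilde{p})}=\lim_{n\,\rightarrow\,\infty}{\left(\frac{1}{2}-p\right)^{2}}=\left(\frac{1}{2}-p\right)^{2}\neq0$$Hence $\hat{p}$ is mean-squared-error consistent, and since $\displaystyle P(|T-\theta|\geq\epsilon)\leq\frac{\text{MSE}(T)}{\epsilon^{2}}$ by Markov's inequality, it is also a consistent estimator. In contrast, $\tilde{p}$ is not consistent when $\displaystyle p\neq\frac{1}{2}$: for any $\displaystyle\epsilon\leq\left|\frac{1}{2}-p\right|$ we have $P(|\tilde{p}-p|<\epsilon)=0$ for every $n$.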
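The two mean squared errors from this example can be tabulated to show why consistency matters. In the sketch below, the true value $p=2/5$ and the function names are illustrative assumptions. For this $p$, the constant estimator has $\text{MSE}=1/100$ regardless of $n$, so it beats $\hat{p}$ for small samples but loses once $n$ exceeds 24.

```python
from fractions import Fraction


def mse_p_hat(p, n):
    """MSE of the sample proportion X / n: p(1 - p) / n."""
    return p * (1 - p) / n


def mse_p_tilde(p):
    """MSE of the constant estimator 1/2: (1/2 - p)^2, independent of n."""
    return (Fraction(1, 2) - p) ** 2


p = Fraction(2, 5)  # hypothetical true value, chosen for illustration
for n in (10, 24, 100, 1000):
    print(n, mse_p_hat(p, n), mse_p_tilde(p))
```

The first column of errors tends to 0 as $n$ grows, while the second stays fixed at $1/100$ no matter how much data is collected.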

References:

John E. Freund's Mathematical Statistics with Applications, 8th edition, Irwin Miller, Marylees Miller, Pearson

수리통계학 (Mathematical Statistics), 허문열, 송문섭, 박영사



Posted by skywalker222
