Let \(X_1,X_2,X_3,\ldots\) be iid random variables from some unknown probability distribution with mean \(0\) and finite variance \(\sigma^2\). Suppose we know that \[ \frac{X_1+X_2+\cdots+X_n}{\sqrt{n}}\stackrel{d}\rightarrow Y,\mbox{ }Y\sim F. \] Then \(Y\sim \mathcal{N}(0,\sigma^2)\).
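Before the proof, a quick empirical sanity check may help: even starting from a non-normal distribution, \((X_1+\cdots+X_n)/\sqrt{n}\) already looks normal for large \(n\). A minimal Monte Carlo sketch in Python (assuming NumPy and SciPy; the sample sizes are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 1_000, 5_000

# X_i ~ Uniform(-1, 1): mean 0, variance sigma^2 = 1/3 (a non-normal start).
x = rng.uniform(-1.0, 1.0, size=(reps, n))
s = x.sum(axis=1) / np.sqrt(n)            # realizations of S_n / sqrt(n)

# Compare the simulated distribution with N(0, sigma^2).
sigma = np.sqrt(1.0 / 3.0)
print(stats.kstest(s, "norm", args=(0.0, sigma)))
```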
Proof: Note that \(\frac{X_1+X_2+\cdots+X_{2n}}{\sqrt{2n}}\stackrel{d}\rightarrow Y\), i.e., \[\frac{1}{\sqrt{2}}\biggl\{\frac{X_1+X_2+\cdots+X_{n}}{\sqrt{n}}+\frac{X_{n+1}+X_{n+2}+\cdots+X_{2n}}{\sqrt{n}}\biggr\}\stackrel{d}\rightarrow Y.\] Since the two terms inside the braces are independent, identically distributed, and each converges in distribution to \(Y\), we also know that \[\frac{1}{\sqrt{2}}\biggl\{\frac{X_1+X_2+\cdots+X_{n}}{\sqrt{n}}+\frac{X_{n+1}+X_{n+2}+\cdots+X_{2n}}{\sqrt{n}}\biggr\}\stackrel{d}\rightarrow \frac{1}{\sqrt{2}}Y_1+ \frac{1}{\sqrt{2}}Y_2,\mbox{ }\mbox{ where } Y_1,Y_2 \stackrel{\text{iid}}\sim F.\] Because a limit in distribution is unique (by the uniqueness of the characteristic function associated with \(F\)), we conclude that \(\frac{1}{\sqrt{2}}Y_1+ \frac{1}{\sqrt{2}}Y_2\stackrel{d}= Y\), \(\mbox{ }\mbox{ where } Y_1,Y_2,Y\stackrel{\text{iid}}\sim F.\)
Let \(\varphi_X(t)\) denote the characteristic function of a random variable \(X\) evaluated at \(t\). Note that \[ \varphi_{(Y_1+Y_2)/\sqrt{2}}(t)= \varphi_{Y}(t) =\varphi_F(t), \] where \(\varphi_F(t)\) means the characteristic function of any random variable with the distribution \(F\). It means \[\begin{align*} &E\biggl[\exp\biggl(i\frac{t}{\sqrt{2}}(Y_1+Y_2)\biggr)\biggr]=\varphi_F(t)\\ \implies& E\biggl[\exp\biggl(i\frac{t}{\sqrt{2}}Y_1\biggr)\biggr]E\biggl[\exp\biggl(i\frac{t}{\sqrt{2}}Y_2\biggr)\biggr]=\varphi_F(t) \mbox{ }\mbox{ (by independence)}\\ \implies& \varphi_F\biggl(\frac{t}{\sqrt{2}}\biggr)^2=\varphi_F(t). \end{align*}\] Note that, applying the same identity at \(t/\sqrt{2}\), \(\varphi_F\bigl(\frac{t}{\sqrt{2}}\bigr)=\varphi_F\bigl(\frac{t}{\sqrt{2}}/\sqrt{2}\bigr)^2=\varphi_F\bigl(\frac{t}{2}\bigr)^2\).
Then, \[\begin{align*} \varphi_F(t)=E\biggl[\exp\biggl(i\frac{t}{\sqrt{2}}(Y_1+Y_2)\biggr)\biggr]&= \biggl\{\varphi_F\biggl(\frac{t}{2}\biggr)^2\biggr\}^2\\ &= \biggl\{\varphi_F\biggl(\frac{t}{2}\biggr)\biggr\}^4\\ &= \biggl\{\varphi_F\biggl(\frac{t}{4}\biggr)\biggr\}^{4\cdot 4} \mbox{ }\mbox{ }\mbox{ (with the same logic as above)}\\ &\mbox{ }\mbox{ }\vdots\\ &= \biggl\{\varphi_F\biggl(\frac{t}{2^n}\biggr)\biggr\}^{4^n}. \end{align*}\]
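As a quick numeric check of the functional equation and its iterate, one can verify that the \(\mathcal{N}(0,\sigma^2)\) characteristic function \(\exp(-\sigma^2t^2/2)\) satisfies both (a sketch; the values of \(\sigma\), \(t\), and \(n\) are arbitrary):

```python
import numpy as np

sigma, t, n = 1.7, 2.5, 6
phi = lambda s: np.exp(-sigma**2 * s**2 / 2)   # cf of N(0, sigma^2)

# One step: phi(t/sqrt(2))^2 = phi(t).
print(np.isclose(phi(t / np.sqrt(2)) ** 2, phi(t)))    # True
# Iterated n times: phi(t/2^n)^(4^n) = phi(t).
print(np.isclose(phi(t / 2**n) ** (4**n), phi(t)))     # True
```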
Note that \(\varphi_F(t)=\varphi_F\Bigl(\frac{t}{\sqrt{2}}\Bigr)^2\). By taking the derivative w.r.t. \(t\) (chain rule), we have \[ \varphi'_F(t)=\frac{2}{\sqrt{2}}\,\varphi_F\Bigl(\frac{t}{\sqrt{2}}\Bigr)\varphi'_F\Bigl(\frac{t}{\sqrt{2}}\Bigr)=\sqrt{2}\,\varphi_F\Bigl(\frac{t}{\sqrt{2}}\Bigr)\varphi'_F\Bigl(\frac{t}{\sqrt{2}}\Bigr). \] Thus \(\varphi'_F(0)=\sqrt{2}\,\varphi_F(0)\varphi'_F(0)=\sqrt{2}\cdot 1\cdot \varphi'_F(0)\) \(\mbox{ }\) because \(\varphi_F(0)=1\). Since \(\sqrt{2}\neq 1\), this forces \(\varphi'_F(0)=0\); equivalently, \(E(Y)=\varphi'_F(0)/i=0\).
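A symbolic check of the differentiated identity, using the normal characteristic function as a stand-in for \(\varphi_F\) (a sketch assuming SymPy):

```python
import sympy as sp

t, sigma = sp.symbols("t sigma", positive=True)
phi = sp.exp(-sigma**2 * t**2 / 2)              # cf of N(0, sigma^2)
dphi = sp.diff(phi, t)

# Verify phi'(t) = sqrt(2) * phi(t/sqrt(2)) * phi'(t/sqrt(2)).
rhs = sp.sqrt(2) * phi.subs(t, t / sp.sqrt(2)) * dphi.subs(t, t / sp.sqrt(2))
print(sp.simplify(dphi - rhs))                  # 0: the identity holds
print(dphi.subs(t, 0))                          # 0: phi'(0) = 0
```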
Now, we expand \(\bigl\{\varphi_F\bigl(\frac{t}{2^n}\bigr)\bigr\}^{4^n}\) using the second-order Taylor expansion of \(\varphi_F\) around \(0\), writing \(R(s)\) for the remainder (so that \(R(s)/s^2\rightarrow 0\) as \(s\rightarrow 0\)): \[\begin{align*}
\biggl\{\varphi_F\biggl(\frac{t}{2^n}\biggr)\biggr\}^{4^n}&=\biggl\{1+iE(Y)\frac{t}{2^n}+\frac{i^2}{2}E(Y^2)\frac{t^2}{4^n}+R\biggl(\frac{t}{2^n}\biggr) \biggr\}^{4^n}\\
&=\biggl\{1+0-\frac{1}{2}\cdot\sigma^2\cdot\biggl(\frac{t}{2^n}\biggr)^2 +R\biggl(\frac{t}{2^n}\biggr)\biggr\}^{4^n} \mbox{ }\mbox{ }\mbox{ (using \(E(Y)=0\) from the previous step and \(E(Y^2)=\sigma^2\))}\\
&= \biggl\{1-\frac{\sigma^2 t^2}{2\cdot 4^{n}}+R\biggl(\frac{t}{2^n}\biggr) \biggr\}^{4^n}\\
&= \biggl\{1-\frac{\sigma^2 t^2}{2}\biggl(1-\frac{2R(t/2^n)}{\sigma^2(t/2^n)^2}\biggr)\bigg/4^n\biggr\}^{4^n}.
\end{align*}\]
Let \(A_n= \frac{\sigma^2 t^2}{2}\Bigl(1-\frac{2R(t/2^n)}{\sigma^2(t/2^n)^2}\Bigr)\). Then \(A_n\rightarrow \frac{\sigma^2 t^2}{2}\) as \(n\rightarrow\infty\), because the truncated Taylor expansion theorem for characteristic functions gives \(R(s)/s^2\rightarrow 0\) as \(s\rightarrow 0\). Since the left-hand side below does not depend on \(n\), letting \(n\rightarrow\infty\) and using \((1-a_n/k_n)^{k_n}\rightarrow e^{-a}\) whenever \(a_n\rightarrow a\) and \(k_n\rightarrow\infty\), we obtain \[
\varphi_F(t)=\biggl\{\varphi_F\biggl(\frac{t}{2^n}\biggr)\biggr\}^{4^n}=\biggl\{1-\frac{A_n}{4^n}\biggr\}^{4^n}\rightarrow
\exp\Bigl(-\frac{\sigma^2t^2}{2}\Bigr),
\] which is the characteristic function of the normal distribution with mean \(0\) and variance \(\sigma^2\). Hence \(Y\sim\mathcal{N}(0,\sigma^2)\) (QED).
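The last step is the elementary limit \((1-a_n/k_n)^{k_n}\rightarrow e^{-a}\) with \(k_n=4^n\) and \(a_n=A_n\rightarrow\sigma^2t^2/2\); a quick numerical illustration (the values of \(\sigma\) and \(t\) are arbitrary):

```python
import numpy as np

sigma, t = 1.0, 2.0
a = sigma**2 * t**2 / 2                 # the limit sigma^2 t^2 / 2 of A_n

# (1 - a/4^n)^(4^n) approaches exp(-a) as n grows.
for n in [1, 2, 4, 8, 12]:
    print(n, (1 - a / 4**n) ** (4**n), np.exp(-a))
```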
(Lindeberg-Feller CLT.) Suppose that we have a triangular array of random variables \[\begin{align*} &X_{1,1},\ldots,X_{1,m_1}\\ &X_{2,1},X_{2,2},\ldots,X_{2,m_2}\\ &\mbox{ }\mbox{ }\mbox{ }\mbox{ }\mbox{ }\mbox{ }\mbox{ }\vdots\\ &X_{n,1},X_{n,2},X_{n,3},\ldots,X_{n,m_n}, \end{align*}\] where the variables within each row are independent. Suppose that \(E(X_{nj})=0\) for all \(j\), that \(\sum_{j=1}^{m_n}E(X_{nj}^2)\rightarrow 1\), and that Lindeberg's condition holds: for every \(\epsilon>0\), \[ \lim_{n\rightarrow\infty}\sum_{j=1}^{m_n}E\bigl(X_{nj}^2 I_{\{|X_{nj}|>\epsilon\}}\bigr)=0. \]
Then \(\sum_{j=1}^{m_n}X_{nj}\stackrel{d}{\rightarrow} N(0,1)\) as \(n\rightarrow \infty\).
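A simulation sketch of the theorem with a concrete triangular array: take row \(n\) to be \(X_{nj}=(U_j-\tfrac{1}{2})/\sqrt{n/12}\), \(j=1,\ldots,n\), with \(U_j\stackrel{\text{iid}}\sim U(0,1)\), so that \(E(X_{nj})=0\) and \(\sum_{j=1}^{n}E(X_{nj}^2)=1\) (assuming NumPy and SciPy; sizes are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 1_000, 5_000

# Row n of the array: X_{nj} = (U_j - 1/2) / sqrt(n/12), Var(U_j) = 1/12,
# so the row variances sum to exactly 1.
u = rng.uniform(0.0, 1.0, size=(reps, n))
row_sums = ((u - 0.5) / np.sqrt(n / 12.0)).sum(axis=1)

print(stats.kstest(row_sums, "norm"))   # should be consistent with N(0, 1)
```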
The classical CLT is the special case of the Lindeberg-Feller CLT with \(m_1=1,m_2=2,\ldots,m_n=n\) (to be shown as HW).
The meaning of Lindeberg’s Condition: note that \[\begin{align*} & X_{nj}^2=X_{nj}^2I_{\{|X_{nj}|\le\epsilon\}}+ X_{nj}^2I_{\{|X_{nj}|>\epsilon\}}\le \epsilon^2+ X_{nj}^2I_{\{|X_{nj}|>\epsilon\}}\\ &\implies E(X_{nj}^2)\le \epsilon^2+ E(X_{nj}^2I_{\{|X_{nj}|>\epsilon\}})\\ & \implies \max_{1\le j\le m_n} E(X_{nj}^2) \le \epsilon^2+ \max_{1\le j\le m_n}E(X_{nj}^2I_{\{|X_{nj}|>\epsilon\}})\\ & \implies \max_{1\le j\le m_n} E(X_{nj}^2) \le \epsilon^2+ \sum_{j=1}^{m_n}E(X_{nj}^2I_{\{|X_{nj}|>\epsilon\}}). \end{align*}\] If Lindeberg’s Condition holds, \[\begin{equation*} \limsup_{n\rightarrow \infty}\max_{1\le j\le m_n} E(X_{nj}^2) \le \epsilon^2+ \lim_{n\rightarrow \infty}\sum_{j=1}^{m_n}E(X_{nj}^2I_{\{|X_{nj}|>\epsilon\}})= \epsilon^2, \end{equation*}\] and since \(\epsilon>0\) is arbitrary, \(\max_{1\le j\le m_n}E(X_{nj}^2)\rightarrow 0\). Therefore, if Lindeberg’s condition holds, the variance contribution of any single random variable is asymptotically negligible (we will call this uniformity from now on).
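For the array used in the simulation above, Lindeberg’s condition can also be checked directly: \(|X_{nj}|\le\sqrt{3/n}\), so the indicator \(I_{\{|X_{nj}|>\epsilon\}}\) is identically zero once \(n>3/\epsilon^2\), and the Lindeberg sum vanishes. A Monte Carlo sketch (sizes and \(\epsilon\) are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
eps, reps = 0.1, 10_000

for n in [10, 100, 1_000]:
    # X_{nj} = (U_j - 1/2)/sqrt(n/12), so |X_{nj}| <= sqrt(3/n).
    x = (rng.uniform(size=(reps, n)) - 0.5) / np.sqrt(n / 12.0)
    # Monte Carlo estimate of sum_j E(X_{nj}^2 I{|X_{nj}| > eps}).
    lindeberg_sum = (x**2 * (np.abs(x) > eps)).mean(axis=0).sum()
    print(n, lindeberg_sum)   # decreases to exactly 0 once n > 3/eps^2
```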