The Univariate Case
The material on the page Random Variables and Distribution Functions provides helpful background for this page. Let \(X\) be a random variable, and let \(g:\mathbb{R}\rightarrow \mathbb{R}\) be a measurable function such that \(Y=g(X)\) is a random variable. Given the distribution of \(X\), we want to determine the distribution of \(Y\). Let \(\mathbb{P}_{X}\) and \(\mathbb{P}_{Y}\) be the distributions of \(X\) and \(Y\), respectively. For \(B\in {\cal B}\) and \(A=g^{-1}(B)\), we have
\[\mathbb{P}_{Y}(B)=\mathbb{P}(Y\in B)=\mathbb{P}(X\in A)=\mathbb{P}_{X}(A).\]
Let \(X\) be a discrete random variable taking values \(x_{i}\) for \(i=1,2,\cdots\), and let \(Y=g(X)\). Then \(Y\) is also a discrete random variable, taking values \(y_{i}\) for \(i=1,2,\cdots\). We wish to determine
\[f_{Y}(y_{i})=\mathbb{P}(Y=y_{i}).\]
By taking \(B=\{y_{i}\}\), we have \(A=\{x_{j}:g(x_{j})=y_{i}\}\). Therefore, we obtain
\begin{align*} f_{Y}(y_{i}) & =\mathbb{P}(Y=y_{i})\\ & =\mathbb{P}_{Y}(\{y_{i}\})\\ & =\mathbb{P}_{X}(A)\\ & =\sum_{x_{j}\in A}f_{X}(x_{j}),\end{align*}
where \(f_{X}(x_{j})=\mathbb{P}(X=x_{j})\).
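The identity above translates directly into computation: accumulate \(f_{X}(x_{j})\) over all \(x_{j}\) that map to the same value \(y_{i}\). Below is a minimal Python sketch (the helper name is ours; the illustrative pmf is taken from the example that follows):

```python
from collections import defaultdict

def pushforward_pmf(pmf_x, g):
    """pmf of Y = g(X): sum f_X(x_j) over all x_j with g(x_j) = y_i."""
    pmf_y = defaultdict(float)
    for x, p in pmf_x.items():
        pmf_y[g(x)] += p
    return dict(pmf_y)

# X uniform on {-n, ..., -1, 1, ..., n}, Y = X^2 (the example below):
n = 4
pmf_x = {x: 1 / (2 * n) for x in range(-n, n + 1) if x != 0}
print(pushforward_pmf(pmf_x, lambda x: x ** 2))  # each of 1, 4, 9, 16 has mass 1/n
```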
Example. Let \(X\) take on the values \(-n,\cdots ,-1,1,\cdots ,n\), each with probability \(1/2n\), and let \(Y=X^{2}\). Then \(Y\) takes on the values \(1,4,\cdots ,n^{2}\). If \(B=\{r^{2}\}\) for \(r=\pm 1,\cdots ,\pm n\), then
\begin{align*} \mathbb{P}_{Y}(B) & =\mathbb{P}_{X}(A)\\ & =\mathbb{P}_{X}(g^{-1}(B))\\ & =\mathbb{P}_{X}(\{-r\})+\mathbb{P}_{X}(\{r\})\\ & =\frac{1}{2n}+\frac{1}{2n}\\ & =\frac{1}{n}.\end{align*}
Therefore, we obtain
\[\mathbb{P}(Y=r^{2})=\frac{1}{n}\]
for \(r=1,\cdots ,n\). \(\sharp\)
Example. Let \(X\) have a Poisson distribution with parameter \(\lambda\), and let
\[Y=g(X)=X^{2}+2X-3.\]
Then \(Y\) takes values \(y=x^{2}+2x-3\) for \(x=0,1,2,\cdots\). From \(x^{2}+2x-3=y\), we get \(x=-1\pm\sqrt{y+4}\). Therefore, we have \(x=-1+\sqrt{y+4}\), the root \(-1-\sqrt{y+4}\) being rejected, since it is negative. For \(B=\{y\}\), we have
\begin{align*} \mathbb{P}(Y=y) & =\mathbb{P}_{Y}(B)\\ & =\mathbb{P}_{X}(A)\\ & =\mathbb{P}_{X}(g^{-1}(\{y\}))\\ & =\mathbb{P}_{X}(\{-1+\sqrt{y+4}\})\\ & =\frac{e^{-\lambda}\cdot\lambda^{-1+\sqrt{y+4}}}{(-1+\sqrt{y+4})!}.\end{align*}
\(\sharp\)
Example. Let \(Y=g(X)=X^{2}\). Then, we have
\begin{align*} F_{Y}(y) & =\mathbb{P}(Y\leq y)\\ & =\mathbb{P}(X^{2}\leq y)\\ & =\mathbb{P}(-\sqrt{y}\leq X\leq\sqrt{y})\\
& =\mathbb{P}(X\leq\sqrt{y})-\mathbb{P}(X<-\sqrt{y})\\ & =F_{X}(\sqrt{y})-F_{X}(-\sqrt{y}-),\end{align*}
which shows
\[F_{Y}(y)=F_{X}(\sqrt{y})-F_{X}(-\sqrt{y}-)\]
for \(y\geq 0\) and it is zero for \(y<0\). \(\sharp\)
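As a quick numerical sanity check of this formula, one can compare it with an empirical estimate. The sketch below assumes \(X\) is standard normal (an illustrative choice; since \(F_{X}\) is then continuous, the left limit \(F_{X}(-\sqrt{y}-)\) equals \(F_{X}(-\sqrt{y})\)):

```python
import numpy as np
from scipy.stats import norm

# Check F_Y(y) = F_X(sqrt(y)) - F_X(-sqrt(y)) for Y = X^2, X ~ N(0,1).
rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
for y in (0.5, 1.0, 2.0):
    empirical = np.mean(x ** 2 <= y)
    formula = norm.cdf(np.sqrt(y)) - norm.cdf(-np.sqrt(y))
    print(y, round(empirical, 4), round(formula, 4))  # the two columns agree
```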
Let \(X\) be a continuous random variable. We can consider a function of \(X\), say \(Y=u(X)\). Then \(Y\) is also a random variable with its own distribution. If we can find its distribution function
\begin{align*} F(y) & =P(Y\leq y)\\ & =P(u(X)\leq y),\end{align*}
then its p.d.f. is given by \(f(y)=F'(y)\). This approach is called the distribution function technique.
Let \(X\) be a continuous random variable with p.d.f. \(f(x)\) and support \(c_{1}<x<c_{2}\). We take \(Y=u(X)\) to be a continuous increasing function of \(X\) with inverse function \(X=v(Y)\). Since \(u(x)\) is an increasing function, the support of \(X\), i.e., \(c_{1}<x<c_{2}\), maps onto \(d_{1}=u(c_{1})<y<d_{2}=u(c_{2})\), which is the support of \(Y\). Since \(u\) and \(v\) are continuous and increasing functions, the distribution function of \(Y\) is given by
\begin{align*} G(y) & =P(Y\leq y)\\ & =P(u(X)\leq y)\\ & =P(X\leq v(y))\end{align*}
for \(d_{1}<y<d_{2}\). Moreover, we have \(G(y)=0\) for \(y\leq d_{1}\) and \(G(y)=1\) for \(y\geq d_{2}\). For \(d_{1}<y<d_{2}\), we have
\[G(y)=\int_{c_{1}}^{v(y)} f(x)dx.\]
Using the fundamental theorem of calculus, we have
\[G'(y)=g(y)=f(v(y))v'(y)\]
for \(d_{1}<y<d_{2}\). We also see \(G'(y)=g(y)=0\) for \(y<d_{1}\) or \(y>d_{2}\). We may let \(g(d_{1})=g(d_{2})=0\).
Suppose now that the function \(Y=u(X)\) and its inverse \(X=v(Y)\) are continuous and decreasing functions. Then, the mapping of \(c_{1}<x<c_{2}\) would be \(d_{1}=u(c_{1})>y>d_{2}=u(c_{2})\). Since \(u\) and \(v\) are decreasing functions, we have
\begin{align*} G(y) & =P(Y\leq y)\\ & =P(u(X)\leq y)\\ & =P(X\geq v(y))\\ & =\int_{v(y)}^{c_{2}}f(x)dx\end{align*}
for \(d_{2}<y<d_{1}\). By the fundamental theorem of calculus, we have
\[G'(y)=g(y)=f(v(y))(-v'(y))\]
for \(d_{2}<y<d_{1}\), and \(G'(y)=g(y)=0\) elsewhere. In both the increasing and decreasing cases, we could write
\[g(y)=f(v(y))|v'(y)|\]
for \(y\in R_{Y}\), where \(R_{Y}\) is the support of \(Y\) found by mapping the support of \(X\), say \(R_{X}\), onto it. The absolute value \(|v'(y)|\) assures that \(g(y)\) is nonnegative.
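The formula \(g(y)=f(v(y))|v'(y)|\) is easy to verify by simulation. Here is a sketch under an illustrative choice not taken from the text: \(X\) exponential with mean \(1\) and \(Y=u(X)=\sqrt{X}\), so that \(v(y)=y^{2}\), \(|v'(y)|=2y\), and \(g(y)=2ye^{-y^{2}}\) for \(y>0\):

```python
import numpy as np

# Monte Carlo check of g(y) = f(v(y)) |v'(y)| for an increasing u.
rng = np.random.default_rng(0)
y_samples = np.sqrt(rng.exponential(1.0, 200_000))   # Y = sqrt(X), X ~ Exp(1)
hist, edges = np.histogram(y_samples, bins=50, range=(0, 3), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
g = 2 * mid * np.exp(-mid ** 2)                      # the density derived above
print(np.max(np.abs(hist - g)))                      # small: histogram tracks g
```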
Example. Let \(X\) have the p.d.f. \(f(x)=3(1-x)^{2}\) for \(0<x<1\). Let \(Y=(1-X)^{3}=u(X)\) be a decreasing function of \(X\). Therefore, we have \(X=1-Y^{1/3}=v(Y)\). The support \(0<x<1\) is mapped onto \(0<y<1\). Since
\[v'(y)=-\frac{1}{3y^{2/3}},\]
we have
\[g(y)=3[1-(1-y^{1/3})]^{2}\left |-\frac{1}{3y^{2/3}}\right |\]
for \(0<y<1\); that is, we have
\[g(y)=3y^{2/3}\left (\frac{1}{3y^{2/3}}\right )=1\]
for \(0<y<1\), which shows that \(Y=(1-X)^{3}\) has the uniform distribution \(U(0,1)\). \(\sharp\)
Example. Let \(X\) have the p.d.f.
\[f(x)=\frac{x^{2}}{3}\mbox{ for }-1<x<2.\]
The random variable \(Y=X^{2}\) will have the support \(0\leq y<4\). However, for \(0<y<1\), we obtain the two-to-one transformation represented by \(x_{1}=-\sqrt{y}\) for \(-1<x_{1}<0\), and \(x_{2}=\sqrt{y}\) for \(0<x_{2}<2\). On the other hand, if \(1<y<4\), the one-to-one transformation is represented by \(x_{2}=\sqrt{y}\), \(1<x_{2}<2\). Since
\[\frac{dx_{1}}{dy}=-\frac{1}{2\sqrt{y}}\mbox{ and }\frac{dx_{2}}{dy}=\frac{1}{2\sqrt{y}},\]
the p.d.f. of \(Y=X^{2}\) is given by
\[g(y)=\left\{\begin{array}{ll}
{\displaystyle \frac{(-\sqrt{y})^{2}}{3}\left |-\frac{1}{2\sqrt{y}}\right |+\frac{(\sqrt{y})^{2}}{3}\left |\frac{1}{2\sqrt{y}}\right |=
\frac{\sqrt{y}}{3}}, & 0<y<1,\\
{\displaystyle\frac{(\sqrt{y})^{2}}{3}\left |\frac{1}{2\sqrt{y}}\right |=\frac{\sqrt{y}}{6}}, & 1<y<4.
\end{array}\right .\]
For \(0<y<1\), the p.d.f. is the sum of two terms, but there is only one term when \(1<y<4\). These different expressions in \(g(y)\) correspond to the two types of transformations, namely the two-to-one and the one-to-one transformations, respectively.
(Second method). Since \(-2<-\sqrt{y}<-1\) for \(1<y<4\), we consider the distribution function
\begin{align*}
G(y) & =\mathbb{P}(Y\leq y)\\ & =\mathbb{P}(X^{2}\leq y)\\ & =\mathbb{P}(-\sqrt{y}\leq X\leq\sqrt{y})\\ & ={\displaystyle \int_{-\sqrt{y}}^{\sqrt{y}}f(x)dx}\\
& =\left\{\begin{array}{ll}
{\displaystyle \int_{-\sqrt{y}}^{\sqrt{y}} f(x)dx} & \mbox{if \(0<y<1\)}\\
{\displaystyle \int_{-1}^{\sqrt{y}} f(x)dx} & \mbox{if \(1<y<4\)}\end{array}\right .\\
& =\left\{\begin{array}{ll}
\mathbb{P}(-\sqrt{y}\leq X\leq\sqrt{y}) & \mbox{if \(0<y<1\)}\\
\mathbb{P}(-1\leq X\leq\sqrt{y}) & \mbox{if \(1<y<4\)}\end{array}\right .
\end{align*}
Therefore, we obtain
\begin{align*} G'(y) & =g(y)\\ & =\left\{\begin{array}{ll}
{\displaystyle f(\sqrt{y})\cdot\frac{1}{2\sqrt{y}}+f(-\sqrt{y})\cdot\frac{1}{2\sqrt{y}}} & \mbox{if \(0<y<1\)}\\
{\displaystyle f(\sqrt{y})\cdot\frac{1}{2\sqrt{y}}} & \mbox{if \(1<y<4\)}
\end{array}\right .\end{align*}
The p.d.f. of \(Y=X^{2}\) is then given by
\[g(y)=\left\{\begin{array}{ll}
{\displaystyle\frac{(-\sqrt{y})^{2}}{3}\left |-\frac{1}{2\sqrt{y}}\right |+
\frac{(\sqrt{y})^{2}}{3}\left |\frac{1}{2\sqrt{y}}\right |=\frac{\sqrt{y}}{3}}, & 0<y<1,\\
{\displaystyle\frac{(\sqrt{y})^{2}}{3}\left |\frac{1}{2\sqrt{y}}\right |=\frac{\sqrt{y}}{6}}, & 1<y<4.
\end{array}\right .\]
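This example can also be checked by simulation. Since \(F(x)=(x^{3}+1)/9\) on \(-1<x<2\), \(X\) can be sampled by inverse transform as \(X=F^{-1}(U)=(9U-1)^{1/3}\); the sketch below compares a histogram of \(Y=X^{2}\) with the piecewise \(g(y)\):

```python
import numpy as np

# Simulate Y = X^2 with f(x) = x^2/3 on (-1, 2), sampling X by inverse transform.
rng = np.random.default_rng(0)
x = np.cbrt(9 * rng.uniform(size=300_000) - 1)       # F^{-1}(u) = (9u - 1)^(1/3)
y = x ** 2
hist, edges = np.histogram(y, bins=60, range=(0, 4), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
g = np.where(mid < 1, np.sqrt(mid) / 3, np.sqrt(mid) / 6)
print(np.max(np.abs(hist - g)))  # small, apart from the bin at the jump y = 1
```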
Theorem. Let \(X\) be \(N(\mu ,\sigma^{2})\). Then \(Z=(X-\mu )/\sigma\) is \(N(0,1)\).
Proof. The distribution function of \(Z\) is given by
\begin{align*} P(Z\leq z) & =P\left (\frac{X-\mu}{\sigma}\leq z\right )\\ & =P(X\leq z\sigma+\mu )\\ & =\int_{-\infty}^{z\sigma +\mu}\frac{1}{\sigma\sqrt{2\pi}}\exp\left [-\frac{(x-\mu )^{2}}{2\sigma^{2}}\right ]dx.\end{align*}
Using the change of variable of integration \(w=(x-\mu )/\sigma\) (i.e., \(x=w\sigma +\mu\)), we obtain
\[P(Z\leq z)=\int_{-\infty}^{z}\frac{1}{\sqrt{2\pi}}e^{-w^{2}/2}dw,\]
which is the expression for \(\Phi (z)\), i.e., the distribution function of a standardized normal random variable. This completes the proof. \(\blacksquare\)
This theorem can be used to find probabilities for \(X\), which is \(N(\mu ,\sigma^{2})\), as follows:
\begin{align*} P(a\leq X\leq b) & =P\left (\frac{a-\mu}{\sigma}\leq\frac{X-\mu}{\sigma}\leq\frac{b-\mu}{\sigma}\right )\\ & =\Phi\left (\frac{b-\mu}{\sigma}\right )-\Phi\left (\frac{a-\mu}{\sigma}\right ),\end{align*}
since \((X-\mu )/\sigma\) is \(N(0,1)\).
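In code, both sides of this identity can be evaluated directly; the parameter values below are illustrative:

```python
from scipy.stats import norm

# P(a <= X <= b) for X ~ N(mu, sigma^2), via standardization and directly.
mu, sigma, a, b = 10.0, 2.0, 9.0, 13.0
via_phi = norm.cdf((b - mu) / sigma) - norm.cdf((a - mu) / sigma)
direct = norm.cdf(b, loc=mu, scale=sigma) - norm.cdf(a, loc=mu, scale=sigma)
print(via_phi, direct)  # identical values
```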
Theorem. Let the random variable \(X\) be \(N(\mu ,\sigma^{2})\). Then, the random variable \(V=[(X-\mu )/\sigma ]^{2}=Z^{2}\) is \(\chi^{2}(1)\).
Proof. Since \(V=Z^{2}\), where \(Z=(X-\mu )/\sigma\) is \(N(0,1)\), the distribution function \(F(v)\) of \(V\) is given by
\begin{align*} F(v) & =P(Z^{2}\leq v)\\ & =P(-\sqrt{v}\leq Z\leq\sqrt{v}).\end{align*}
For \(v\geq 0\), we have
\begin{align*} F(v) & =\int_{-\sqrt{v}}^{\sqrt{v}}\frac{1}{\sqrt{2\pi}}e^{-z^{2}/2}dz\\ & =2\int_{0}^{\sqrt{v}}\frac{1}{\sqrt{2\pi}}e^{-z^{2}/2}dz.\end{align*}
Using the change of variable of integration \(z=\sqrt{y}\), we obtain
\[F(v)=\int_{0}^{v}\frac{1}{\sqrt{2\pi y}}e^{-y/2}dy\]
for \(0\leq v\). We also have \(F(v)=0\) for \(v<0\). Therefore, using the fundamental theorem of calculus, the p.d.f. \(f(v)=F'(v)\) of the continuous random variable \(V\) is given by
\[f(v)=\frac{1}{\sqrt{\pi}\sqrt{2}}v^{1/2-1}e^{-v/2}\]
for \(0<v<\infty\). Since \(f(v)\) is a p.d.f., we must have
\[\int_{0}^{\infty}\frac{1}{\sqrt{\pi}\sqrt{2}}v^{1/2-1}e^{-v/2}dv=1.\]
The change of variables \(x=v/2\) yields
\begin{align*} 1 & =\frac{1}{\sqrt{\pi}}\int_{0}^{\infty}x^{1/2-1}e^{-x}dx\\ & =\frac{1}{\sqrt{\pi}}\Gamma (1/2).\end{align*}
Since \(\Gamma (1/2)=\sqrt{\pi}\), this shows that \(f(v)\) is the p.d.f. of the \(\chi^{2}(1)\) distribution, i.e., \(V\) is \(\chi^{2}(1)\). This completes the proof. \(\blacksquare\)
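A quick empirical check of this theorem (a Kolmogorov-Smirnov test of squared standard normal draws against \(\chi^{2}(1)\)):

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(0)
v = rng.standard_normal(100_000) ** 2   # V = Z^2, Z ~ N(0, 1)
print(kstest(v, "chi2", args=(1,)))     # small KS statistic: consistent with chi^2(1)
```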
Let \(X\) be \(N(\mu ,\sigma^{2})\), and let \(W=e^{X}\). Then \(X=\ln W\), and \(W\) is said to have a lognormal distribution. The p.d.f. of \(W\) is derived below.
Theorem. Let \(X\) be \(N(\mu ,\sigma^{2})\), and let \(W=e^{X}\). Then, the p.d.f. of \(W\) is given by
\[g(w) =\frac{1}{\sigma w\sqrt{2\pi}}\exp\left [-\frac{(\ln w-\mu )^{2}}{2\sigma^{2}}\right ]\]
for \(w>0\).
Proof. The distribution function of \(W\) is given by
\begin{align*} G(w) & =P(W\leq w)\\ & =P(e^{X}\leq w)\\ & =P(X\leq\ln w)\end{align*}
for \(0<w\); that is,
\[G(w)=\int_{-\infty}^{\ln w}\frac{1}{\sigma\sqrt{2\pi}}\exp\left [-\frac{(x-\mu )^{2}}{2\sigma^{2}}\right ]dx\]
for \(0<w\). Therefore, the p.d.f. of \(W\) is given by
\begin{align*} g(w) & =G'(w)\\ & =\frac{1}{\sigma w\sqrt{2\pi}}\exp\left [-\frac{(\ln w-\mu )^{2}}{2\sigma^{2}}\right ]\end{align*}
for \(0<w\). This completes the proof. \(\blacksquare\)
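The derived p.d.f. agrees with the usual lognormal parameterization; a sketch with illustrative \(\mu\) and \(\sigma\) (note that SciPy's lognorm uses shape \(s=\sigma\) and scale \(e^{\mu}\)):

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 0.5, 0.8   # illustrative parameters
w = np.linspace(0.1, 5.0, 5)
g = np.exp(-(np.log(w) - mu) ** 2 / (2 * sigma ** 2)) / (sigma * w * np.sqrt(2 * np.pi))
print(np.allclose(g, lognorm.pdf(w, s=sigma, scale=np.exp(mu))))  # True
```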
Theorem. Let
\[T=\frac{Z}{\sqrt{U/r}},\]
where \(Z\) is \(N(0,1)\) and \(U\) is \(\chi^{2}(r)\) such that \(Z\) and \(U\) are independent. Then, the p.d.f. of \(T\) is
\[f(t)=\frac{\Gamma ((r+1)/2)}{\sqrt{\pi r}\Gamma (r/2)(1+t^{2}/r)^{(r+1)/2}}.\]
Proof. The independence of \(Z\) and \(U\) says that the joint p.d.f. of \(Z\) and \(U\) is given by
\[g(z,u)=\frac{1}{\sqrt{2\pi}}e^{-z^{2}/2}\frac{1}{\Gamma (r/2)2^{r/2}}u^{r/2-1}e^{-u/2}\]
for \(-\infty <z<\infty\) and \(0<u<\infty\). The distribution function \(F(t)=P(T\leq t)\) of \(T\) is given by
\begin{align*} F(t) & =P(Z/\sqrt{U/r}\leq t)\\ & =P(Z\leq t\sqrt{U/r})\\ & =\int_{0}^{\infty}\int_{-\infty}^{t\sqrt{u/r}} g(z,u)dzdu;\end{align*}
that is, we have
\[F(t)=\frac{1}{\sqrt{\pi}\Gamma (r/2)}\int_{0}^{\infty}\left [\int_{-\infty}^{t\sqrt{u/r}}\frac{e^{-z^{2}/2}}{2^{(r+1)/2}}dz\right ]u^{r/2-1}e^{-u/2}du.\]
The p.d.f. of \(T\) is the derivative of the distribution function. Applying the fundamental theorem of calculus, we have
\begin{align*} f(t) & =F'(t)\\ & =\frac{1}{\sqrt{\pi}\Gamma (r/2)}\int_{0}^{\infty}\frac{e^{-(u/2)(t^{2}/r)}}{2^{(r+1)/2}}\sqrt{\frac{u}{r}}u^{r/2-1}e^{-u/2}du\\ & =\frac{1}{\sqrt{\pi r}\Gamma (r/2)}\int_{0}^{\infty}\frac{u^{(r+1)/2-1}}{2^{(r+1)/2}}e^{-(u/2)(1+t^{2}/r)}du.\end{align*}
Using the change of variable by taking \(y=(1+t^{2}/r)u\), we have
\[\frac{du}{dy}=\frac{1}{1+t^{2}/r}\]
and
\[f(t)=\frac{\Gamma ((r+1)/2)}{\sqrt{\pi r}\Gamma (r/2)}\left [\frac{1}{(1+t^{2}/r)^{(r+1)/2}}\right ]\int_{0}^{\infty}\frac{y^{(r+1)/2-1}}{\Gamma ((r+1)/2)2^{(r+1)/2}}e^{-y/2}dy.\]
The integral in the last expression for \(f(t)\) is equal to \(1\), since the integrand is exactly the p.d.f. of a chi-square distribution with \(r+1\) degrees of freedom. Therefore, the p.d.f. is given by
\[f(t)=\frac{\Gamma ((r+1)/2)}{\sqrt{\pi r}\Gamma (r/2)(1+t^{2}/r)^{(r+1)/2}}\]
for \(-\infty <t<\infty\). This completes the proof. \(\blacksquare\)
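A simulation sketch of this theorem, with an illustrative choice of \(r\):

```python
import numpy as np
from scipy.stats import kstest

# T = Z / sqrt(U/r), Z ~ N(0,1) and U ~ chi^2(r) independent, tested against t(r).
rng = np.random.default_rng(0)
r = 5
t = rng.standard_normal(100_000) / np.sqrt(rng.chisquare(r, 100_000) / r)
print(kstest(t, "t", args=(r,)))  # small KS statistic: consistent with t(r)
```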
Theorem. Let
\[W=\frac{U/r_{1}}{V/r_{2}},\]
where \(U\) and \(V\) are independent chi-square random variables with \(r_{1}\) and \(r_{2}\) degrees of freedom, respectively. The p.d.f. of \(W\) is given by
\[f(w)=\frac{(r_{1}/r_{2})^{r_{1}/2}\Gamma ((r_{1}+r_{2})/2)w^{r_{1}/2-1}}{\Gamma (r_{1}/2)\Gamma (r_{2}/2)[1+(r_{1}w/r_{2})]^{(r_{1}+r_{2})/2}}.\]
Proof. The independence of \(U\) and \(V\) says that the joint p.d.f. of \(U\) and \(V\) is given by
\[g(u,v)=\frac{u^{r_{1}/2-1}e^{-u/2}}{\Gamma (r_{1}/2)2^{r_{1}/2}}\cdot\frac{v^{r_{2}/2-1}e^{-v/2}}{\Gamma (r_{2}/2)2^{r_{2}/2}}\]
for \(0<u<\infty\) and \(0<v<\infty\). The distribution function \(F(w)=P(W\leq w)\) of \(W\) is given by
\begin{align*} F(w) & =P\left (\frac{U/r_{1}}{V/r_{2}}\leq w\right )\\ & =P\left (U\leq\frac{r_{1}}{r_{2}}wV\right )\\ & =\int_{0}^{\infty}\int_{0}^{(r_{1}/r_{2})wv}g(u,v)dudv;\end{align*}
that is, we have
\[F(w)=\frac{1}{\Gamma (r_{1}/2)\Gamma (r_{2}/2)}\int_{0}^{\infty}\left [\int_{0}^{(r_{1}/r_{2})wv}\frac{u^{r_{1}/2-1}e^{-u/2}}{2^{(r_{1}+r_{2})/2}}du\right ]v^{r_{2}/2-1}e^{-v/2}dv.\]
The p.d.f. of \(W\) is the derivative of the distribution function. Applying the fundamental theorem of calculus, we have
\begin{align*} f(w) & =F'(w)\\ & =\frac{1}{\Gamma (r_{1}/2)\Gamma (r_{2}/2)}\int_{0}^{\infty}\frac{[(r_{1}/r_{2})vw]^{r_{1}/2-1}}{2^{(r_{1}+r_{2})/2}}e^{-r_{1}vw/(2r_{2})}\left (\frac{r_{1}}{r_{2}}v\right )v^{r_{2}/2-1}e^{-v/2}dv\\ & =\frac{(r_{1}/r_{2})^{r_{1}/2}w^{r_{1}/2-1}}{\Gamma (r_{1}/2 )\Gamma (r_{2}/2)}\int_{0}^{\infty}\frac{v^{(r_{1}+r_{2})/2-1}}{2^{(r_{1}+r_{2})/2}}e^{(-v/2)[1+(r_{1}/r_{2})w]}dv.\end{align*}
Using the change of variable by taking
\[y=\left (1+\frac{r_{1}}{r_{2}}w\right )v,\]
we have
\[\frac{dv}{dy}=\frac{1}{1+(r_{1}/r_{2})w}\]
and
\begin{align*} f(w) & =\frac{(r_{1}/r_{2})^{r_{1}/2}\Gamma ((r_{1}+r_{2})/2)w^{r_{1}/2-1}}{\Gamma (r_{1}/2)\Gamma (r_{2}/2)[1+(r_{1}w/r_{2})]^{(r_{1}+r_{2})/2}}\int_{0}^{\infty}\frac{y^{(r_{1}+r_{2})/2-1}e^{-y/2}}{\Gamma ((r_{1}+r_{2})/2)2^{(r_{1}+r_{2})/2}}dy\\ & =\frac{(r_{1}/r_{2})^{r_{1}/2}\Gamma ((r_{1}+r_{2})/2)w^{r_{1}/2-1}}{\Gamma (r_{1}/2)\Gamma (r_{2}/2)[1+(r_{1}w/r_{2})]^{(r_{1}+r_{2})/2}}.\end{align*}
The integral in the last expression for \(f(w)\) is equal to \(1\), since the integrand is exactly the p.d.f. of a chi-square distribution with \(r_{1}+r_{2}\) degrees of freedom. This completes the proof. \(\blacksquare\)
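A simulation sketch, with illustrative degrees of freedom \(r_{1}\) and \(r_{2}\):

```python
import numpy as np
from scipy.stats import kstest

# W = (U/r1) / (V/r2) with independent chi-square U, V, tested against F(r1, r2).
rng = np.random.default_rng(0)
r1, r2 = 4, 9
w = (rng.chisquare(r1, 100_000) / r1) / (rng.chisquare(r2, 100_000) / r2)
print(kstest(w, "f", args=(r1, r2)))  # small KS statistic: consistent with F(r1, r2)
```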
Theorem. Let \(Y\) have a distribution that is \(U(0,1)\), and let \(F(x)\) have the properties of a distribution function of the continuous type with \(F(a)=0\) and \(F(b)=1\) such that \(F(x)\) is strictly increasing on the support \(a<x<b\), where \(a\) and \(b\) can be \(-\infty\) and \(\infty\), respectively. Then, the random variable defined by \(X=F^{-1}(Y)\) is a continuous random variable with distribution function \(F(x)\).
Proof. The distribution function of \(X\) is given by
\[\mathbb{P}(X\leq x)=\mathbb{P}(F^{-1}(Y)\leq x)\]
for \(a<x<b\). Since the function \(F(x)\) is strictly increasing, the event \(\{F^{-1}(Y)\leq x\}\) is equivalent to \(\{Y\leq F(x)\}\). Therefore, we have
\[\mathbb{P}(X\leq x)=\mathbb{P}(Y\leq F(x))\]
for \(a<x<b\). Since \(Y\) is \(U(0,1)\), we have \(\mathbb{P}(Y\leq y)=y\) for \(0<y<1\). Therefore, we obtain
\[\mathbb{P}(X\leq x)=\mathbb{P}(Y\leq F(x))=F(x)\]
for \(0<F(x)<1\), which shows that the distribution function of \(X\) is \(F(x)\). This completes the proof. \(\blacksquare\)
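This theorem is the basis of inverse transform sampling. A sketch under an illustrative choice of \(F\), the standard exponential distribution, for which \(F(x)=1-e^{-x}\) and \(F^{-1}(y)=-\ln (1-y)\):

```python
import numpy as np
from scipy.stats import kstest

# X = F^{-1}(Y) with Y ~ U(0,1) should have distribution function F.
rng = np.random.default_rng(0)
y = rng.uniform(size=100_000)
x = -np.log(1 - y)                 # F^{-1}(y) for the standard exponential
print(kstest(x, "expon"))          # small KS statistic: X is exponential
```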
The following probability integral transformation theorem is the converse of the above theorem.
Theorem. Let \(X\) have the distribution function \(F(x)\) of the continuous type that is strictly increasing on the support \(a<x<b\). Then, the random variable \(Y\), defined by \(Y=F(X)\), has the \(U(0,1)\) distribution.
Proof. Since \(F(a)=0\) and \(F(b)=1\), the distribution function of \(Y\) is given by
\[\mathbb{P}(Y\leq y)=\mathbb{P}(F(X)\leq y)\]
for \(0<y<1\). Since \(\{F(X)\leq y\}\) is equivalent to \(\{X\leq F^{-1}(y)\}\), we have
\[\mathbb{P}(Y\leq y)=\mathbb{P}(X\leq F^{-1}(y))\]
for \(0<y<1\). Since \(\mathbb{P}(X\leq x)=F(x)\), we also have
\begin{align*} \mathbb{P}(Y\le y) & =\mathbb{P}(X\leq F^{-1}(y))\\ & =F(F^{-1}(y))=y\end{align*}
for \(0<y<1\), which is the distribution function of a \(U(0,1)\) random variable. \(\blacksquare\)
Although in the statements and proofs of the above two theorems we required \(F(x)\) to be strictly increasing, this restriction can be dropped, and both theorems are still true.
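A quick empirical check of the probability integral transformation, with an illustrative \(F\) (the \(N(3,4)\) distribution function):

```python
import numpy as np
from scipy.stats import norm, kstest

# Y = F(X) should be U(0,1) when X has the continuous distribution function F.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=100_000)
y = norm.cdf(x, loc=3.0, scale=2.0)
print(kstest(y, "uniform"))        # small KS statistic: Y is U(0,1)
```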
The Multivariate Case
Now, we consider the transformation of two or more random variables. In the case of a single-valued inverse, the rule is about the same as that in the one-variable case, with the derivative being replaced by the Jacobian.
Theorem. Let the random vector \({\bf X}\) have the continuous p.d.f. \(f_{\bf X}\) except possibly for finitely many \({\bf x}\)’s, and let
\[{\bf y}={\bf g}({\bf x})=(g_{1}({\bf x}),\cdots ,g_{n}({\bf x}))\]
be a measurable transformation defined on \(\mathbb{R}^{n}\) into \(\mathbb{R}^{n}\) such that \({\bf Y}={\bf g}({\bf X})\) is a random vector. Let \(S\) be the set of positivity of \(f_{\bf X}\). Suppose that \({\bf g}\) is one-to-one on \(S\) onto \(T\) (the image of \(S\) under \({\bf g}\)) such that the inverse transformation
\[{\bf x}={\bf g}^{-1}({\bf y})=(h_{1}({\bf y}),\cdots ,h_{n}({\bf y}))\]
exists for \({\bf y}\in T\). It is further assumed that the partial derivatives
\[h_{ji}({\bf y})=\frac{\partial}{\partial y_{i}}h_{j}({\bf y})\]
exist for \(i,j=1,\cdots ,n\) and are continuous on \(T\). Then the p.d.f. \(f_{\bf Y}\) of \({\bf Y}\) is given by
\[f_{\bf Y}({\bf y})=\left\{\begin{array}{ll}
f_{\bf X}({\bf g}^{-1}({\bf y}))\cdot |J|=f_{\bf X}(h_{1}({\bf y}),\cdots ,
h_{n}({\bf y}))\cdot |J| & {\bf y}\in T\\
0 & \mbox{otherwise},
\end{array}\right .\]
where the Jacobian \(J\) is a function of \({\bf y}\) and is defined by
\[J=\left |\begin{array}{cccc}
h_{11} & h_{12} & \cdots & h_{1n}\\
h_{21} & h_{22} & \cdots & h_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
h_{n1} & h_{n2} & \cdots & h_{nn}
\end{array}\right |.\]
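The Jacobian can be computed symbolically. A sketch using SymPy, applied to the inverse map \(x_{1}=y_{1}-y_{2}\), \(x_{2}=y_{2}\) of the first example below:

```python
import sympy as sp

y1, y2 = sp.symbols("y1 y2")
inverse_map = sp.Matrix([y1 - y2, y2])        # (h_1(y), h_2(y))
J = inverse_map.jacobian([y1, y2]).det()      # determinant of the partials
print(J)  # 1, matching the example below
```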
Example. Let \(X_{1}\) and \(X_{2}\) be i.i.d. random variables distributed as \(U(\alpha ,\beta )\). Set \(Y_{1}=X_{1}+X_{2}\) and find the p.d.f. of \(Y_{1}\). We first have
\begin{align*} f_{X_{1},X_{2}}(x_{1},x_{2}) & =f_{X_{1}}(x_{1})\cdot f_{X_{2}}(x_{2})\\ & =
\left\{\begin{array}{ll}
1/(\beta -\alpha )^{2} & \alpha <x_{1},x_{2}<\beta\\
0 & \mbox{otherwise}.
\end{array}\right .\end{align*}
Consider the transformation
\[g:\left\{\begin{array}{l}
y_{1}=x_{1}+x_{2}\\
y_{2}=x_{2}\end{array}\right .\]
for \(\alpha <x_{1},x_{2}<\beta\). Then, we have
\[\left\{\begin{array}{l}
Y_{1}=X_{1}+X_{2}\\
Y_{2}=X_{2}\end{array}\right .\]
From \(g\), we get \(x_{1}=y_{1}-y_{2}\) and \(x_{2}=y_{2}\). The Jacobian is given by
\[J=\left |\begin{array}{cc}1 & -1\\ 0 & 1\end{array}\right |=1.\]
We also have \(\alpha <y_{2}<\beta\). Since \(y_{1}-y_{2}=x_{1}\) and \(\alpha <x_{1}<\beta\), it follows that \(\alpha <y_{1}-y_{2}<\beta\). Therefore, the limits of \(y_{1}\) and \(y_{2}\) are specified by \(\alpha <y_{2}<\beta\) and \(\alpha <y_{1}-y_{2}<\beta\). Now, we have
\[f_{Y_{1},Y_{2}}(y_{1},y_{2})=\left\{\begin{array}{ll}
1/(\beta -\alpha )^{2} & 2\alpha <y_{1}<2\beta ,\alpha<y_{2}<\beta ,
\alpha <y_{1}-y_{2}<\beta\\
0 & \mbox{otherwise}.
\end{array}\right .\]
Therefore, we obtain
\[f_{Y_{1}}(y_{1})=\left\{\begin{array}{ll}
{\displaystyle \frac{1}{(\beta -\alpha )^{2}}\cdot
\int_{\alpha}^{y_{1}-\alpha}dy_{2}=\frac{y_{1}-2\alpha}
{(\beta -\alpha )^{2}}} & 2\alpha <y_{1}<\alpha +\beta\\
{\displaystyle \frac{1}{(\beta -\alpha )^{2}}\cdot
\int_{y_{1}-\beta}^{\beta}dy_{2}=\frac{2\beta -y_{1}}
{(\beta -\alpha )^{2}}} & \alpha +\beta<y_{1}<2\beta\\
0 & \mbox{otherwise}.
\end{array}\right .\]
This density is known as the triangular p.d.f. \(\sharp\)
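A simulation sketch of the triangular density, with illustrative endpoints \(\alpha\) and \(\beta\):

```python
import numpy as np

# Y1 = X1 + X2 with X_i i.i.d. U(alpha, beta), compared with the triangular p.d.f.
rng = np.random.default_rng(0)
alpha, beta = 1.0, 3.0
y1 = rng.uniform(alpha, beta, 200_000) + rng.uniform(alpha, beta, 200_000)
hist, edges = np.histogram(y1, bins=50, range=(2 * alpha, 2 * beta), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
f = np.where(mid < alpha + beta,
             (mid - 2 * alpha) / (beta - alpha) ** 2,
             (2 * beta - mid) / (beta - alpha) ** 2)
print(np.max(np.abs(hist - f)))  # small: the histogram traces the triangle
```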
Example. Let \(X_{1}\) and \(X_{2}\) be i.i.d. from \(U(1,\beta )\). Set \(Y_{1}=X_{1}X_{2}\) and find the p.d.f. of \(Y_{1}\). We consider the transformation
\[g:\left\{\begin{array}{l}y_{1}=x_{1}x_{2}\\ y_{2}=x_{2}\end{array}\right .\]
Then, we have
\[\left\{\begin{array}{l}Y_{1}=X_{1}X_{2}\\ Y_{2}=X_{2}\end{array}\right .\]
From \(g\), we get \(x_{1}=y_{1}/y_{2}\) and \(x_{2}=y_{2}\). The Jacobian is given by
\[J=\left |\begin{array}{cc}1/y_{2} & -y_{1}/y_{2}^{2}\\ 0 & 1\end{array}\right |=\frac{1}{y_{2}}.\]
The region
\[S=\{(x_{1},x_{2}):f_{X_{1},X_{2}}(x_{1},x_{2})>0\}\]
is transformed by \(g\) onto the region
\[T=\{(y_{1},y_{2}):1\leq (y_{1}/y_{2})\leq\beta ,1\leq y_{1}\leq\beta^{2},1\leq y_{2}\leq\beta\}.\]
Since
\[f_{Y_{1},Y_{2}}(y_{1},y_{2})=\left\{\begin{array}{ll}
{\displaystyle \frac{1}{(\beta -1)^{2}}\cdot\frac{1}{y_{2}}} & (y_{1},y_{2})\in T\\
0 & \mbox{otherwise},
\end{array}\right .\]
we have
\[f_{Y_{1}}(y_{1})=\left\{\begin{array}{ll}
{\displaystyle \frac{1}{(\beta -1)^{2}}\cdot\int_{1}^{y_{1}}\frac{1}{y_{2}}
dy_{2}=\frac{1}{(\beta -1)^{2}}\cdot\ln y_{1}} & 1<y_{1}<\beta\\
{\displaystyle \frac{1}{(\beta -1)^{2}}\cdot\int_{y_{1}/\beta}^{\beta}
\frac{1}{y_{2}}dy_{2}=\frac{1}{(\beta -1)^{2}}\cdot (2\ln\beta -\ln y_{1})}
& \beta <y_{1}<\beta^{2}.
\end{array}\right .\]
\(\sharp\)
Example. Let \(X_{1}\) and \(X_{2}\) be i.i.d. from \(N(0,1)\). Set \(Y_{1}=X_{1}/X_{2}\) and find the p.d.f. of \(Y_{1}\). Let \(Y_{2}=X_{2}\) and consider the transformation \(y_{1}=x_{1}/x_{2}\) and \(y_{2}=x_{2}\) for \(x_{2}\neq 0\). Then, we have \(x_{1}=y_{1}y_{2}\) and \(x_{2}=y_{2}\). The Jacobian is given by
\[J=\left |\begin{array}{cc}y_{2} & y_{1}\\ 0 & 1\end{array}\right |=y_{2}.\]
Since \(-\infty <x_{1},x_{2}<\infty\) implies \(-\infty <y_{1},y_{2}<\infty\), we have
\begin{align*} f_{Y_{1},Y_{2}}(y_{1},y_{2}) & =f_{X_{1},X_{2}}(y_{1}y_{2},y_{2})\cdot |y_{2}|\\ & =\frac{1}{2\pi}\cdot\exp\left (-\frac{y_{1}^{2}y_{2}^{2}+y_{2}^{2}}
{2}\right )\cdot |y_{2}|.\end{align*}
Therefore, we obtain
\begin{align*} f_{Y_{1}}(y_{1}) & =\frac{1}{2\pi}\int_{-\infty}^{\infty}\exp\left (-\frac{y_{1}^{2}y_{2}^{2}+y_{2}^{2}}{2}\right )\cdot |y_{2}|dy_{2}\\ & =\frac{1}{\pi}\int_{0}^{\infty}\exp\left [-\frac{(y_{1}^{2}+1)y_{2}^{2}}{2}\right ]\cdot y_{2}dy_{2}.\end{align*}
Let \(t=(y_{1}^{2}+1)y_{2}^{2}/2\). We also obtain
\begin{align*} f_{Y_{1}}(y_{1}) & =\frac{1}{\pi}\cdot\frac{1}{y_{1}^{2}+1}\cdot\int_{0}^{\infty}e^{-t}dt\\ & =\frac{1}{\pi}\cdot\frac{1}{y_{1}^{2}+1}\end{align*}
for \(-\infty <y_{1}<\infty\), which is the p.d.f. of the standard Cauchy distribution. \(\sharp\)
Example. Let \(X_{1}\) and \(X_{2}\) have independent gamma distributions with parameters \(\alpha\), \(\theta\) and \(\beta\), \(\theta\), respectively. That is, the joint p.d.f. of \(X_{1}\) and \(X_{2}\) is given by
\begin{align*} f_{X_{1},X_{2}}(x_{1},x_{2}) & =f_{X_{1}}(x_{1})f_{X_{2}}(x_{2})\\ & =\frac{1}{\Gamma (\alpha )\Gamma (\beta )\theta^{\alpha +\beta}}
x_{1}^{\alpha -1}x_{2}^{\beta -1}\exp\left (-\frac{x_{1}+x_{2}}{\theta}\right )\end{align*}
for \(0<x_{1}<\infty, 0<x_{2}<\infty\). Consider
\[Y_{1}=\frac{X_{1}}{X_{1}+X_{2}}\mbox{ and }Y_{2}=X_{1}+X_{2}.\]
Then, we have
\[X_{1}=Y_{1}Y_{2}\mbox{ and }X_{2}=Y_{2}-Y_{1}Y_{2}.\]
The Jacobian is given by
\[J=\left |\begin{array}{cc}
y_{2} & y_{1}\\ -y_{2} & 1-y_{1}
\end{array}\right |=y_{2}(1-y_{1})+y_{1}y_{2}=y_{2}.\]
Therefore, the joint p.d.f. of \(Y_{1}\) and \(Y_{2}\) is given by
\[f_{Y_{1},Y_{2}}(y_{1},y_{2})=|y_{2}|\frac{1}{\Gamma (\alpha )\Gamma (\beta )
\theta^{\alpha +\beta}}(y_{1}y_{2})^{\alpha -1}(y_{2}-y_{1}y_{2})^{\beta -1}e^{-y_{2}/\theta},\]
where the support is \(0<y_{1}<1\) and \(0<y_{2}<\infty\). The marginal p.d.f. of \(Y_{1}\) is given by
\[f_{Y_{1}}(y_{1})=\frac{y_{1}^{\alpha -1}(1-y_{1})^{\beta -1}}
{\Gamma (\alpha )\Gamma (\beta )}\int_{0}^{\infty}
\frac{y_{2}^{\alpha +\beta -1}}{\theta^{\alpha +\beta}}e^{-y_{2}/\theta}dy_{2}.\]
The integrand in this expression is, except for the missing factor \(\Gamma (\alpha +\beta )\) in the denominator, a gamma p.d.f. with parameters \(\alpha +\beta\) and \(\theta\); hence the integral equals \(\Gamma (\alpha +\beta )\), and
\[f_{Y_{1}}(y_{1})=\frac{\Gamma (\alpha +\beta )}{\Gamma (\alpha )\Gamma (\beta )}y_{1}^{\alpha -1}(1-y_{1})^{\beta -1}\]
for \(0<y_{1}<1\). We say that \(Y_{1}\) has a Beta p.d.f. with parameters \(\alpha\) and \(\beta\). \(\sharp\)
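A simulation sketch of this result, with illustrative parameters (note that \(Y_{1}\) does not depend on the common scale \(\theta\)):

```python
import numpy as np
from scipy.stats import kstest

# Y1 = X1/(X1 + X2) with independent Gamma(alpha, theta), Gamma(beta, theta),
# tested against Beta(alpha, beta).
rng = np.random.default_rng(0)
a, b, theta = 2.0, 5.0, 1.5
x1 = rng.gamma(a, theta, 100_000)
x2 = rng.gamma(b, theta, 100_000)
print(kstest(x1 / (x1 + x2), "beta", args=(a, b)))  # small KS statistic
```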
Example. (Box-Muller transformation). Consider the following transformation
\[Z_{1}=\sqrt{-2\ln X_{1}}\cos (2\pi X_{2})\mbox{ and }Z_{2}=\sqrt{-2\ln X_{1}}\sin (2\pi X_{2}),\]
where \(X_{1}\) and \(X_{2}\) are observations of a random sample from \(U(0,1)\). Then, we have
\[X_{1}=\exp\left (-\frac{Z_{1}^{2}+Z_{2}^{2}}{2}\right )=e^{-q/2}\mbox{ and }
X_{2}=\frac{1}{2\pi}\tan^{-1}\left (\frac{Z_{2}}{Z_{1}}\right ),\]
where \(q=z_{1}^{2}+z_{2}^{2}\).
The Jacobian is given by
\[J=\left |\begin{array}{cc}
-z_{1}e^{-q/2} & -z_{2}e^{-q/2}\\
{\displaystyle \frac{-z_{2}}{2\pi (z_{1}^{2}+z_{2}^{2})}} &
{\displaystyle \frac{z_{1}}{2\pi (z_{1}^{2}+z_{2}^{2})}}
\end{array}\right |=\frac{-1}{2\pi}e^{-q/2}.\]
Since the joint p.d.f. of \(X_{1}\) and \(X_{2}\) is \(f_{X_{1},X_{2}}(x_{1},x_{2})=1\) for \(0<x_{1}<1\) and \(0<x_{2}<1\), the joint p.d.f. of \(Z_{1}\) and \(Z_{2}\) is given by
\[f_{Z_{1},Z_{2}}(z_{1},z_{2})=\left |\frac{-1}{2\pi}e^{-q/2}\right |\cdot 1=\frac{1}{2\pi}\exp\left (-\frac{z_{1}^{2}+z_{2}^{2}}{2}\right )\]
for \(-\infty <z_{1}<\infty\) and \(-\infty <z_{2}<\infty\). This means that, from two independent \(U(0,1)\) random variables, the Box-Muller transformation generates two independent \(N(0,1)\) random variables. \(\sharp\)
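A minimal implementation sketch of the Box-Muller transformation, with a normality check on the output:

```python
import numpy as np
from scipy.stats import kstest

def box_muller(u1, u2):
    """Map two independent U(0,1) arrays to two independent N(0,1) arrays."""
    r = np.sqrt(-2 * np.log(u1))
    return r * np.cos(2 * np.pi * u2), r * np.sin(2 * np.pi * u2)

rng = np.random.default_rng(0)
z1, z2 = box_muller(rng.uniform(size=100_000), rng.uniform(size=100_000))
print(kstest(z1, "norm"), kstest(z2, "norm"))  # both consistent with N(0,1)
```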
Example. Let \(X_{1},X_{2},X_{3}\) be independent with p.d.f. \(f(x)=e^{-x}\) for \(x>0\). Let
\begin{align*} Y_{1} & =\frac{X_{1}}{X_{1}+X_{2}},\\ Y_{2} & =\frac{X_{1}+X_{2}}{X_{1}+X_{2}+X_{3}},\\ Y_{3} & =X_{1}+X_{2}+X_{3}.\end{align*}
Then, we have
\begin{align*} X_{1} & =Y_{1}Y_{2}Y_{3},\\ X_{2} & =-Y_{1}Y_{2}Y_{3}+Y_{2}Y_{3},\\ X_{3} & =-Y_{2}Y_{3}+Y_{3}.\end{align*}
The Jacobian is given by
\[J=\left |\begin{array}{ccc}
y_{2}y_{3} & y_{1}y_{3} & y_{1}y_{2}\\
-y_{2}y_{3} & -y_{1}y_{3}+y_{3} & -y_{1}y_{2}+y_{2}\\
0 & -y_{3} & -y_{2}+1
\end{array}\right |=y_{2}y_{3}^{2}.\]
According to the transformation, it follows that \(x_{1},x_{2},x_{3}\in (0,\infty )\) implies \(0<y_{1}<1\), \(0<y_{2}<1\) and \(0<y_{3}<\infty\). Then, we have
\[f_{Y_{1},Y_{2},Y_{3}}(y_{1},y_{2},y_{3})=y_{2}y_{3}^{2}e^{-y_{3}}\]
for \(0<y_{1}<1\), \(0<y_{2}<1\) and \(0<y_{3}<\infty\). Therefore, we obtain
\begin{align*} f_{Y_{1}}(y_{1}) & =\int_{0}^{\infty}\int_{0}^{1}y_{2}y_{3}^{2}e^{-y_{3}}dy_{2}dy_{3}=1\mbox{ for }0<y_{1}<1,\\
f_{Y_{2}}(y_{2}) & =\int_{0}^{\infty}\int_{0}^{1}y_{2}y_{3}^{2}e^{-y_{3}}dy_{1}dy_{3}\\ & =y_{2}\int_{0}^{\infty}y_{3}^{2}e^{-y_{3}}dy_{3}=2y_{2}\mbox{ for }0<y_{2}<1,\\
f_{Y_{3}}(y_{3}) & =\int_{0}^{1}\int_{0}^{1}y_{2}y_{3}^{2}e^{-y_{3}}dy_{1}dy_{2}\\ & =y_{3}^{2}e^{-y_{3}}\int_{0}^{1}y_{2}dy_{2}=\frac{1}{2}y_{3}^{2}
e^{-y_{3}}\mbox{ for }0<y_{3}<\infty .\end{align*}
Since
\[f_{Y_{1},Y_{2},Y_{3}}(y_{1},y_{2},y_{3})=f_{Y_{1}}(y_{1})f_{Y_{2}}(y_{2})f_{Y_{3}}(y_{3}),\]
it follows that \(Y_{1},Y_{2},Y_{3}\) are independent. Furthermore, we see that \(Y_{1}\) is \(U(0,1)\), \(Y_{2}\) has the Beta p.d.f. \(2y_{2}\) on \((0,1)\) (parameters \(\alpha =2\), \(\beta =1\)), and \(Y_{3}\) has a gamma distribution with \(\alpha =3\) and \(\theta =1\). \(\sharp\)


