Yumi's Blog

Inverse transform sampling and other sampling techniques

Random number generation is important techniques in various statistical modeling, for example, to create Markov Chain Monte Carlo algorithm, or simple Monte Carlo simulation. Here, I make notes on some standard sampling techiniques, and demonstrate its useage in R.

Inverse Transform Sampling

Inverse Transform Sampling is a powerful sampling technique because you can generate samples from any distribution with this technique, as long as its cumulative distribution function exists.


  • generate random samples $X$ from some distribution $F_X$.


  • Step 1: generate random sample $U \sim Unif(0,1)$
  • Step 2: Set $x = F^{-1}_X (u)$

Then $x$s are from $F_X$!! Simple!

Proof: $$ Pr(F_X^{-1}(U)\le t) = Pr(U\le F_X(t)) = F_X(t) $$

The last equality is true because $U$ is from uniform distribution and uniform CDF has:

$$ F_U(t) = Pr(U \le t) = t, \;\;\; t \in (0,1) $$

So $F_X^{-1}(U) \sim F_X$.

Example in R

Let's try to generate gamma distribution with shape 1, scale 1. In R,

    qgamma(p, shape, scale)

evalutes the inverse CDF at point $p$. I will generate 1000 samples from Uniform distribution, and inverse transform using Gamma inverse CDF.

In [1]:
U <- runif(0,1,n=5000)
G <- pgamma(U,shape=1,scale=1) 
options(repr.plot.width=10, repr.plot.height=4)

QQ plot

To evaluate weather $G$ is actually from the Gamma(scale=1,shape=1), I use quantile-quantile plot. A Q-Q plot is a scatterplot created by plotting two sets of quantiles against one another. If both sets of quantiles came from the same distribution, we should see the points forming a line that’s roughly straight.

In [2]:
probs <- seq(0,1,0.001)
quantile_sample <- quantile(G, probs=probs)
quantile_theory <- pgamma(probs, scale=1,shape=1)

The quantiles of the random sample almost agrees to the Gamma's CDF.

In [3]:
options(repr.plot.width=4, repr.plot.height=4)
plot(quantile_sample,quantile_theory )
cat("the maximum difference in qq plot",max(abs(quantile_sample - quantile_theory)))
the maximum difference in qq plot 0.008835873

OK. We learned that Inverse Transform Sampling is useful to generate random numbers. But this method works only if we can generate random numbers from uniform distribution. What if we want to generate uniform random numbers when other random number generator is available?

Generate a uniform number using a nonuniform distributed function

This is actually even easier, assuming that inverse of the nonuniform distribution function exists!

Goal: generate a uniform number number using a nonuniform distributed function


  • Step 1: generate X from $F_X$, nonuniform
  • Step 2: set $u = F_X(X)$

Then $u$ is from Uniform distribution!


$$ x = F_X(F_X^{-1}(x)) = Pr(X \le F_X^{-1}(x)) = Pr(F_X(X) \le x) $$

Therefore, $$ Pr(F_X(X) \le x) = x $$ ,meaning that $F_X(X)$ is from Uniform distribution.

Example in R

Once again, I will use gamma(scale=1,shape=1) as the example of nonuniform distributed function.

In [4]:
G <- rgamma(scale=1,shape=1,n=5000)
U <- pgamma(G,scale=1,shape=1)
options(repr.plot.width=10, repr.plot.height=4)

QQ-plot, again

The difference in the two distribution is very little.

In [5]:
probs <- seq(0,1,0.001)
quantile_sample <- quantile(U, probs=probs)
quantile_theory <- punif(probs)
options(repr.plot.width=4, repr.plot.height=4)
cat("the maximum difference in qq plot",max(abs(quantile_sample - quantile_theory)))
the maximum difference in qq plot 0.008019295

OK. These sampling methods are cool, but they both depend on the invertibility of CDF. Many times, CDF is not invertible. For example, discrete random variable's CDFs are not invertible. Then you may need to customize the random number generator to generate numbers from target distribution. Here is such example.

Write a function to sample from a multinomial distribution

In probability theory, the multinomial distribution is a generalization of the binomial distribution. For example, it models the probability of counts of each side for rolling a k-sided dice n times.

Here is one example of how I would do it.

In [6]:
rmult <- function(n,size,prob){
    # prob: a vector of length k
    # size: the number of samples
    U <- runif(0,1,n=n)
    cprob <- c(0,cumsum(prob))
    out <- NULL
    for (i in 2:length(cprob)){
        out <- c(out,sum(cprob[i-1] < U & U < cprob[i]))

The sampling distribution seems to agree to its theoretical probabilities for large enough sample size.

In [7]:
probs <- c(0.1,0.2,0.3,0.4)
n = 1000
sample <- rmult(n=n,size=2,prob=probs)
cat("theoretical prob=",probs)
cat("\n sampling prop=",sample/n)
theoretical prob= 0.1 0.2 0.3 0.4
 sampling prop= 0.089 0.177 0.304 0.43