Monte Carlo Simulations (MC)
Introduction to Monte Carlo
Methods which make use of random numbers are often called Monte Carlo methods, named after the Monte Carlo Casino in Monaco, which has long been famous for games of chance. MC methods vary, but tend to follow a particular pattern:
- Define a domain of possible inputs
- Generate inputs randomly from a probability distribution
- Perform a deterministic computation on the inputs
- Aggregate the results
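As a minimal illustration of this pattern, the following Python sketch (a standard textbook example; the sample count is an arbitrary choice) estimates \(\pi\) by sampling the unit square:

```python
import random

def estimate_pi(n_samples=1_000_000):
    """Estimate pi following the four-step MC pattern."""
    # 1. Domain: the unit square [0, 1] x [0, 1].
    # 2. Generate inputs randomly (uniform distribution).
    # 3. Deterministic computation: is the point inside the quarter circle?
    hits = 0
    for _ in range(n_samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    # 4. Aggregate: the hit fraction estimates the area ratio pi/4.
    return 4.0 * hits / n_samples

print(estimate_pi())  # ~3.14 for large n_samples
```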
Mathematical Foundations of MC Methods
1. Law of Large Numbers (LLN)
Assume \(\xi_1, \xi_2, \ldots, \xi_n\) is a sequence of i.i.d. random variables with expectation \(E(\xi_i) = \mu\). Then for arbitrary \(\epsilon > 0\), we have: \begin{equation} \lim_{n \to \infty} p \lbrace |\frac{1}{n}\sum_{i=1}^{n} \xi_i-\mu| < \epsilon \rbrace = 1. \end{equation} This means that as \(n\) grows large, the sample average converges (in probability) to the expectation.
2. Central Limit Theorem (CLT)
In addition to the above, if the variance \(D(\xi_i) = \sigma^2\) exists, then for an arbitrary real number \(\lambda\): \begin{equation} \lim_{n \to \infty} p \lbrace \frac{\frac{1}{n}\sum_{i=1}^n \xi_i-\mu}{\sigma / \sqrt{n}} < \lambda \rbrace = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\lambda e^{-(x^2 /2)} dx, \end{equation} which means that as \(n\) grows large, the mean of the i.i.d. random variables approximately follows a normal distribution with mean \(\mu\) and variance \(\sigma^2 /n\).
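Both theorems are easy to check numerically; here is a minimal sketch, using an exponential distribution chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 1.0            # mean and std of the Exp(1) distribution
n, n_means = 10_000, 2_000      # samples per mean, number of sample means

samples = rng.exponential(scale=mu, size=(n_means, n))
means = samples.mean(axis=1)

# LLN: the sample means cluster around mu.
print(abs(means.mean() - mu))               # ~0

# CLT: standardized sample means are approximately N(0, 1),
# i.e. the means themselves have variance ~ sigma^2 / n.
z = (means - mu) / (sigma / np.sqrt(n))
print(z.std())                              # ~1
```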
MC Evaluation of Statistical Mechanics Integrals
The ensemble average of a quantity \(A(r,p)\) can be calculated for a given distribution function. For the \(NVT\) ensemble we have the distribution function \(\rho (\vec r^N, \vec p^N)\), \begin{equation} \rho (\vec r^N, \vec p^N)=\frac{1}{Z}\text{exp}(-\frac{E(\vec r^N, \vec p^N)}{k_B T}) \end{equation} \begin{equation} \langle A(\vec r^N,\vec p^N)\rangle = \int A(\vec r^N, \vec p^N) \rho (\vec r^N,\vec p^N)d\vec r^N d\vec p^N. \end{equation} The energy can always be expressed as a sum of kinetic and potential contributions. The kinetic part is trivial (the momentum integrals are Gaussian and factor out), so we can consider the integral over the \(3N\)-dimensional configuration space only, where \(Z\) is the configurational integral. \begin{equation} \langle A(\vec r^N)\rangle = \frac{1}{Z} \int A(\vec r^N)\text{exp}( -\frac{U(\vec r^N)}{k_B T} ) d\vec r^N \end{equation} \begin{equation} Z=\int e^{-\frac{U(\vec r^N)}{k_B T}} d\vec r^N \end{equation}
Statistical-mechanics integrals typically receive significant contributions only from a very small fraction of the \(3N\)-dimensional configuration space, so uniform random sampling is highly inefficient.
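A 1D toy illustration of this point (the potential, temperature, and sampling box are arbitrary choices); in \(3N\) dimensions the analogous useful fraction shrinks roughly exponentially with \(N\):

```python
import numpy as np

rng = np.random.default_rng(0)
kT = 0.05

def U(x):
    """Toy 1D potential (illustrative choice)."""
    return x**2 / 2

# Sample configurations uniformly over [-10, 10].
x = rng.uniform(-10, 10, size=100_000)
w = np.exp(-U(x) / kT)          # Boltzmann factors

# Fraction of samples with a non-negligible Boltzmann weight:
# only |x| < ~0.8 contributes, so roughly 8% of the samples are useful.
print((w > 1e-3).mean())
```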
Markov-Chain Monte Carlo (MCMC)
The Markov chain Monte Carlo method takes the opposite extreme approach: it generates points such that each point depends directly on the previous one. Thus the points are correlated with each other.
Markov Chain
A Markov chain is a sequence of elements chosen from a fixed set according to a probabilistic rule: the choice of the next element depends only on the most recently added element, and not on the previous history of the construction process.
Transition Matrix
Let \(P\) be the transition matrix of a Markov chain. The \(ij\)th entry \(p_{ij}^{(n)}\) of the matrix \(P^n\) gives the probability that the Markov chain, starting in state \(s_i\), will be in state \(s_j\) after \(n\) steps.
Stationary Distribution
A stationary distribution (also called an equilibrium distribution) of a Markov chain is a probability distribution \(\pi^{\ast}\) such that \(\pi^{\ast} = \pi^{\ast} P\). So if a chain reaches a stationary distribution, it maintains that distribution for all future time.
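A small numerical illustration of both properties, using a hypothetical 3-state transition matrix:

```python
import numpy as np

# A hypothetical 3-state transition matrix (each row sums to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# The (i, j) entry of P^n is the probability of being in state s_j
# after n steps when starting from state s_i.
print(np.linalg.matrix_power(P, 50))  # every row converges to pi*

# pi* = pi* P means pi* is a left eigenvector of P with eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()
print(pi)                             # matches the rows of P^50
```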
Random Walk
Statistical Behavior of 1D Walks
The average over a large number of \(n\)-step walks (with unit step length): \begin{equation} \langle x_n\rangle=0 \end{equation} \begin{equation} \langle x_n^2\rangle = n \end{equation}
Diffusion Constant \(D\)
The diffusion constant is defined by \(\langle x_n^2\rangle = 2Dn\). For this 1D walk, \(D=\frac{1}{2}\).
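A quick simulation check of these statistics (sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_walks = 1_000, 10_000

# Each step is +1 or -1 with equal probability (unit step length).
steps = rng.choice([-1, 1], size=(n_walks, n_steps))
x = steps.sum(axis=1)                 # final positions x_n

print(x.mean())                       # <x_n>     ~ 0
print((x**2).mean() / n_steps)        # <x_n^2>/n ~ 1  =>  D = 1/2
```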
Continuum Limit
The basic random walk can be rewritten as a continuum diffusion equation by taking the limit in which the lattice spacing \(l\) and the time step \(\tau\) go to zero. From the master equation \begin{equation} P(i,N)=\frac{1}{2}P(i+1,N-1)+\frac{1}{2}P(i-1,N-1), \end{equation} we can identify \(t=N\tau\) and \(x=il\). After Taylor-expanding both sides (sketched below) we obtain the diffusion equation: \begin{equation} \frac{\partial P(x,t)}{\partial t}=D\frac{\partial ^2 P(x,t)}{\partial x^2}, \end{equation} where \(D=l^2/2\tau\), and the Einstein relation \(\langle x^2(t)\rangle=2Dt\) follows.
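The omitted operations are a short Taylor expansion. Writing the master equation in terms of \(x\) and \(t\), \begin{equation} P(x,t)=\frac{1}{2}P(x+l,t-\tau)+\frac{1}{2}P(x-l,t-\tau), \end{equation} and expanding to first order in \(\tau\) and second order in \(l\), \begin{equation} P(x\pm l,t-\tau)\approx P-\tau\frac{\partial P}{\partial t}\pm l\frac{\partial P}{\partial x}+\frac{l^2}{2}\frac{\partial^2 P}{\partial x^2}, \end{equation} the \(\pm l\,\partial P/\partial x\) terms cancel in the average, leaving \(\tau\,\partial P/\partial t=(l^2/2)\,\partial^2 P/\partial x^2\).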
Self-Avoiding Random Walk (SRW)
A self-avoiding random walk is a random walk that is forbidden from visiting any site more than once; it is a standard model for polymer chains.
Diffusion-Limited Aggregation (DLA)
In diffusion-limited aggregation, randomly walking particles stick irreversibly to a growing cluster, producing characteristic fractal structures.
Metropolis Monte Carlo (MMC)
We want to generate points distributed according to a probability density \(\rho(x)\) that is known, possibly only up to a normalization constant.
Basic Idea: construct a Markov chain whose stationary distribution equals \(\rho (x)\).
Steps of MMC Algorithm
1. Initialization:
   - select an initial configuration \(x\) for which \(\rho (x)>0\)
   - choose a maximum displacement value \(\Delta x_{\text{max}}\)
   - calculate the initial \(\rho = \rho (x)\)
2. Store the current state as \(x_0=x\)
3. Generate a random number vector \(\vec u = (u_1,u_2,...,u_M)\), where each \(u_i\) is a uniform random number between \(-1\) and \(1\), and construct the trial state \(x' = x + \Delta x_{\text{max}} \vec u\)
4. Calculate the value of the density in the trial state: \(\rho ' = \rho (x')\)
5. Choose whether to move to the new state as follows (Metropolis criterion):
   - if \(\rho '\geq \rho\): accept the move
   - if \(\rho '< \rho\): accept the move only if \(r < \frac{\rho '}{\rho}\), where \(r \sim \text{Uniform}[0,1]\)
6. If the move was accepted, set \(x = x'\) and \(\rho = \rho '\)
7. Do statistics on the current value of \(x\) or other properties
8. Return to step 2 (a code sketch of the full loop follows this list)
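A minimal Python sketch of this loop; the target density (an unnormalized standard normal) and the step size are arbitrary illustrative choices:

```python
import math
import random

def rho(x):
    """Unnormalized target density: a standard normal, chosen arbitrarily."""
    return math.exp(-x * x / 2)

def metropolis(n_steps=100_000, dx_max=1.0):
    x = 0.0                      # initial configuration with rho(x) > 0
    rho_x = rho(x)
    samples = []
    for _ in range(n_steps):
        # Trial move: uniform displacement in [-dx_max, dx_max].
        x_trial = x + dx_max * random.uniform(-1, 1)
        rho_trial = rho(x_trial)
        # Metropolis criterion: always accept if rho' >= rho,
        # otherwise accept with probability rho'/rho.
        if rho_trial >= rho_x or random.random() < rho_trial / rho_x:
            x, rho_x = x_trial, rho_trial
        samples.append(x)        # statistics on the current state
    return samples

s = metropolis()
print(sum(s) / len(s))                    # ~0 (mean of the normal)
print(sum(v * v for v in s) / len(s))     # ~1 (its variance)
```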
Why does this work?
- If the system is at a minimum of \(\rho (x)\), any new trial state will be accepted.
- If the system is next to a minimum, a move toward the minimum is accepted only with reduced probability. Therefore, the system is more likely to move to states with larger \(\rho\) values!
The sequence of states is highly correlated! Thus, when statistics are low, this can be a poor way of generating points from a distribution.
Detailed Balance Condition
\(P(m,t)\) is the probability of being in configuration \(m\) at time \(t\), and \(W(m \to n)\) is the probability per unit time of going from state \(m\) to state \(n\) (the transition probability). Then we have \begin{equation} P(m,t+1)=P(m,t)+\sum_n [W(n\to m)P(n,t)-W(m\to n)P(m,t)]. \end{equation} At large \(t\), a sufficient (but not necessary) condition for an equilibrium probability distribution is the so-called detailed balance condition: \begin{equation} W(n\to m)P(n,t)=W(m\to n)P(m,t) \end{equation}
This can be applied to any probability distribution, but if we choose the Boltzmann distribution we have \begin{equation} \label{dbc} \frac{W(m\to n)}{W(n\to m)}=\frac{P(n)}{P(m)}=\text{exp}(-\frac{U_n-U_m}{k_BT}), \end{equation} where the configurational integral \(Z\) conveniently does not appear.
MMC for Ensemble Averages
The MMC transition probabilities for the \(NVT\) ensemble, with \(U_{nm} \equiv U_n - U_m\): \begin{equation} W(m\to n)=\frac{\rho_n}{\rho_m}=\text{exp}(-\frac{U_{nm}}{k_BT}), \quad U_{nm}>0 \end{equation} \begin{equation} W(m\to n)=1, \quad U_{nm}\leq 0, \end{equation} which satisfies the detailed balance condition (\(\ref{dbc}\)).
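As a quick check, take the case \(U_{nm} > 0\) (then \(U_{mn} < 0\), so the reverse move is always accepted): \begin{equation} \frac{W(m\to n)}{W(n\to m)} = \frac{\text{exp}(-U_{nm}/k_BT)}{1} = \text{exp}(-\frac{U_n-U_m}{k_BT}), \end{equation} which is exactly the ratio required by (\(\ref{dbc}\)); the case \(U_{nm} \leq 0\) follows by the same argument with \(m\) and \(n\) exchanged.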
Now the ensemble average of a physical quantity \(A\) can be calculated as the arithmetic mean over samples drawn from the Boltzmann distribution: \begin{equation} \langle A\rangle \approx \frac{1}{N_{\text{samples}}}\sum_{i=1}^{N_{\text{samples}}} A(\vec r_i^N). \end{equation} The key points are:
- Taking the Boltzmann distribution as the desired distribution: \(\rho(\vec r^N) \propto e^{-\frac{U(\vec r^N)}{k_BT}}\).
- Constructing the sample chain with Metropolis MC.
- The Boltzmann weights of the samples are accounted for by the sampling frequency, so no explicit reweighting is needed (a sketch follows below).
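Putting these pieces together, here is a sketch for a single degree of freedom, using a toy harmonic potential \(U(x)=x^2/2\) and \(k_BT=1\) (chosen because the exact answer is \(\langle x^2\rangle = k_BT\)):

```python
import math
import random

kT = 1.0

def U(x):
    """Toy harmonic potential; the exact result is <x^2> = kT."""
    return x * x / 2

def average_x2(n_steps=200_000, dx_max=2.0):
    x, u = 0.0, U(0.0)
    total = 0.0
    for _ in range(n_steps):
        x_trial = x + dx_max * random.uniform(-1, 1)
        du = U(x_trial) - u              # U_nm = U_n - U_m
        # Accept if du <= 0, otherwise with probability exp(-du / kT).
        if du <= 0 or random.random() < math.exp(-du / kT):
            x, u = x_trial, u + du
        total += x * x                   # accumulate A = x^2 every step
    return total / n_steps

print(average_x2())                      # ~1.0 = kT
```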