Monte Carlo integration by Vegas¶
Vegas Algorithm¶
We proved before that the best weight function has the property: \begin{eqnarray} \frac{|f(x_1,x_2,\cdots,x_n)|^2}{(w(x_1,x_2,\cdots,x_n))^2} = const \end{eqnarray}
The Vegas method is primarily based on the importance sampling algorithm with the above-mentioned self-adapting strategy. The basic idea is to use a separable weight function: instead of a complicated $w(x_1,x_2,x_3,\cdots)$ one uses the ansatz $w=w_1(x_1)w_2(x_2)w_3(x_3)\cdots$.
The optimal separable weight functions are \begin{eqnarray} (w_1(x_1))^2 \propto \int dx_2 dx_3\cdots dx_n \frac{|f(x_1,x_2,x_3,\cdots,x_n)|^2}{w_2(x_2)w_3(x_3)\cdots w_n(x_n)} \end{eqnarray} as follows from our ansatz.
Proof of optimal weight function for separable weight:
As before, we want to minimize $$\langle(\frac{f}{w})^2\rangle-\langle\frac{f}{w}\rangle^2$$ with the constraints $\int w_i(x_i)\, dx_i=1$ for each dimension.
\begin{eqnarray} \frac{\delta}{\delta w_i(x_i)}\left[ \int \frac{f(x_1\cdots x_n)^2}{(w_1(x_1)\cdots w_n(x_n))^2} w_1(x_1)\cdots w_n(x_n)dV - \left(\int \frac{f(x_1\cdots x_n)}{w_1(x_1)\cdots w_n(x_n)} w_1(x_1)\cdots w_n(x_n)dV\right)^2 +\sum_i \lambda_i\left(\int w_i(x_i) dx_i -1 \right) \right]=0 \end{eqnarray}simplifies to
\begin{eqnarray} \frac{\delta}{\delta w_i(x_i)}\left[ \int \frac{f(x_1\cdots x_n)^2}{w_1(x_1)\cdots w_n(x_n)} dV - \left(\int f(x_1\cdots x_n) dV\right)^2 +\sum_i \lambda_i\left(\int w_i(x_i) dx_i -1 \right) \right]=0 \end{eqnarray}and gives
\begin{eqnarray} -\frac{1}{(w_i(x_i))^2}\int \frac{f(x_1\cdots x_n)^2}{\prod_{j\ne i} w_j(x_j)} \prod_{j\ne i} dx_j +\lambda_i =0 \end{eqnarray}Finally
\begin{eqnarray} (w_i(x_i))^2 \propto \int \frac{f(x_1\cdots x_n)^2}{\prod_{j\ne i} w_j(x_j)^2} \prod_{j\ne i} w_j(x_j) dx_j \end{eqnarray}The power of Vegas is that, by iteration, it can resolve any divergent feature that is separable, i.e., parallel to one of the axes. However, when the divergence runs along a diagonal, Vegas performs similarly to ordinary MC sampling. Note that the separable ansatz avoids the explosion of the number of stratified regions, which scales as $K^d$, while the separable ansatz scales as $K\times d$ ($K$ is the number of points in one dimension; typically $K\approx 10^2$ and $d\approx 10$, i.e., roughly a ZettaByte versus a KiloByte of storage).
Vegas also does not compute the optimal weight functions $w_i(x_i)$ derived above exactly. It rather approximates them on a fairly rough mesh (any number of grid points makes the algorithm more efficient than naive Monte Carlo), and this does not introduce a systematic error: in the limit of an infinite number of random points the integral is exact, regardless of how well the weight functions are approximated on the finite mesh.
We will slightly change the notation now, and instead of weight functions $w_i(x_i)$, we will talk about grid functions $g_i(x_i)$, which are related to $w_i(x_i)$ by $\frac{1}{w_i(x_i)}=\frac{dg_i(x_i)}{dx_i}$.
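To make this relation concrete, here is a minimal one-dimensional illustration (not part of the Vegas code below, with an arbitrarily chosen weight $w(x)\propto e^{-x}$): the grid function $g$ is the inverse of the cumulative distribution of $w$, so that uniformly distributed points $x$ mapped through $g(x)$ are distributed according to $w$, and indeed $\frac{dg}{dx}=\frac{1}{w(g(x))}$.
from numpy import *
def g_of_x(x, lam=1.0):
    # grid function for the weight w(t) ~ lam*exp(-lam*t), normalized on [0,1]:
    # it solves x = int_0^g w(t) dt, i.e., g is the inverse cumulative of w,
    # so that dg/dx = 1/w(g(x))
    norm = 1.0 - exp(-lam)
    return -log(1.0 - x*norm)/lam
x = random.random(100000)   # uniform points in [0,1)
g = g_of_x(x)               # a histogram of g follows w(g) ~ exp(-g)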
The algorithm starts with the product of grid functions $g_i(x)$ (separable weight functions). We want to evaluate \begin{eqnarray} &&\int_{a_1}^{b_1} dg_1 \int_{a_2}^{b_2} dg_2\cdots\int_{a_n}^{b_n} dg_n f(g_1,g_2,\cdots,g_n)= \nonumber\\ &&\int_0^1 dx_1\int_0^1 dx_2\cdots\int_0^1 dx_n f(g_1(x_1),g_2(x_2),\cdots,g_n(x_n)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\cdots\frac{dg_n}{dx_n} \end{eqnarray} Since we cast the problem into multiple 1D problems, we can simply view this as a change of integration variables. Note that all $x_i$ are now distributed in $[0,1]$, i.e., on the unit hypercube.
Note that $g_1$ depends only on $x_1$, and $g_2$ only on $x_2$, so that the Jacobian of such separable set of functions is just the product of all derivatives.
We first start with the grid functions $g_i(x)=a_i + x (b_i-a_i)$, so that the integration at the first iteration is equivalent to the usual Monte Carlo sampling.
We generate a few thousand sets of random points $(x_1,x_2,\cdots,x_n)$ and evaluate $f$ on these points. During the sampling we evaluate the integral \begin{eqnarray} ⟨f⟩=\frac{1}{N}\sum_{x_1,x_2,\cdots,x_n} f(g_1(x_1),g_2(x_2),\cdots,g_n(x_n)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\cdots\frac{dg_n}{dx_n}, \label{vegas_int} \end{eqnarray} where $N$ is the number of sampled points. Note that the random points are uniformly distributed on the mesh $x_i$ ($x_i \in [0,1)$), therefore the volume is unity, $\int dx_1 dx_2 \cdots =1$.
We will also construct the projection of the function $f$ onto each dimension. We start by sampling the following functions: \begin{eqnarray} f_1(x_1) = \sum_{x_2,x_3} |f(g_1(x_1),g_2(x_2),g_3(x_3)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\frac{dg_3}{dx_3}|^2\\ f_2(x_2) = \sum_{x_1,x_3} |f(g_1(x_1),g_2(x_2),g_3(x_3)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\frac{dg_3}{dx_3}|^2\\ f_3(x_3) = \sum_{x_1,x_2} |f(g_1(x_1),g_2(x_2),g_3(x_3))\frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\frac{dg_3}{dx_3}|^2 \label{vegas_weight} \end{eqnarray} and then we normalize these projected functions and take the square root, because we squared the function before (notice that the optimal weight function squared is proportional to $f^2$). The projection of the function onto each dimension is then \begin{eqnarray} \widetilde{f}_i(x)= \sqrt{\frac{f_i(x)}{\int_0^1 f_i(x) dx}}, \end{eqnarray} so that we map the interval $[0,1]\rightarrow [0,1]$. This $\widetilde{f}_i$ is chosen as our weight function, $w_i\propto \widetilde{f}_i$.
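As a toy illustration of how such projections can be accumulated (a two-dimensional sketch with an arbitrary test function and a still-uniform grid, so that $\frac{dg}{dx}=1$; the actual accumulation for our 3D example is done inside Vegas_step2 below):
from numpy import *
def ftest(x):                        # an arbitrary test integrand on [0,1]^2
    return 1.0/(0.1 + x[:,0]*x[:,1])
nbins, n = 50, 100000
xr   = random.random((n,2))          # uniform random points
bins = (xr*nbins).astype(int)        # bin index of each point along each dimension
wfun = ftest(xr)**2                  # |f * dg_1/dx_1 * dg_2/dx_2|^2 with dg/dx = 1 here
fproj = zeros((2,nbins))
for dim in range(2):
    add.at(fproj[dim], bins[:,dim], wfun)        # accumulate f_dim(x_dim)
ftilde = sqrt(fproj/sum(fproj,axis=1)[:,None])   # normalize and take the square root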
In Vegas the weight functions are $$w_i(x) = \frac{1}{\frac{dg_i(x)}{dx}} \propto \widetilde{f}_i(g_i(x))$$ This requires that \begin{eqnarray} \widetilde{f}_i(g_i(x))\frac{dg_i(x)}{dx}=const \end{eqnarray}
This can also be understood as simple 1D problem involving a change of variable. We aim to find a refined grid $g(x)$, which is the best mesh for function $\widetilde{f}$. As proven in statistical part of the course, this requires the cumulative distribution to be constant, $$ \int_{g_{l}}^{g_{l+1}} \widetilde{f}(g) dg = const$$ for every $l$. In another words, we want each small interval of the integral to contribute equally to the integral, when using the improved grid function $g(x)$.
We can also write \begin{eqnarray} \int \widetilde{f}(g)dg = \int \widetilde{f}(g(x))\frac{dg}{dx}dx, \end{eqnarray} and we would like to have \begin{eqnarray} \widetilde{f}(g(x))\frac{dg}{dx} = const, \end{eqnarray} which is the same as the requirement above.
To determine the new grid $g_l$, we find the grid points numerically so that \begin{eqnarray} \int_0^1 \widetilde{f}(g_{old}) dg_{old} = I\\ \int_{g^{new}_{l-1}}^{g^{new}_l} \widetilde{f}(g_{old})dg_{old} = \frac{I}{N_g} \label{f_int} \end{eqnarray} where $N_g$ is the number of grid points. Hence, we require exactly the same weight between each pair of consecutive points (between $g_{0}$ and $g_1$, between $g_1$ and $g_2$, ...).
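A compact numpy sketch of this equal-weight construction on a small made-up example (the RefineGrid method added to the Grid class later in the notebook does the same thing with an explicit loop and linear interpolation):
from numpy import *
def equal_weight_grid(ftilde, g_old):
    # ftilde[l] ~ weight of widetilde{f} accumulated in the old bin ending at g_old[l];
    # returns a new grid with equal weight I/N_g between consecutive points
    N = len(ftilde)
    cumf  = concatenate(([0.0], cumsum(ftilde)))  # cumulative weight, starting at 0
    gridp = concatenate(([0.0], g_old))           # old grid including g(0)=0
    targets = cumf[-1]*arange(1,N+1)/N            # I/N_g, 2 I/N_g, ..., I
    g_new = interp(targets, cumf, gridp)          # invert the cumulative distribution
    g_new[-1] = 1.0                               # enforce g(1)=1
    return g_new
g_old  = arange(1,9)/8.                           # uniform starting grid with 8 bins
ftilde = array([4.,3.,2.,1.,1.,1.,1.,1.])         # more weight in the first bins
print(equal_weight_grid(ftilde, g_old))           # the new grid is denser where ftilde is large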
Once we have a new grid, we restart the sampling of the integral of $f$, just like in the Eqs. for $⟨f⟩$, $f_1(x_1)$, $f_2(x_2)$, $f_3(x_3)$, but using $g_{new}$. We generate again a few thousand random sets of points $(x_1,x_2,x_3)$ and obtain new $\widetilde{f}_i(x)$ functions. We repeat this procedure approximately 10 times, and we can slightly increase the number of random points each time, as the grid functions become more and more precise. At the end, we can do a single long run with roughly 10-times longer sampling to reduce the sampling error bar.
In practice, we will implement one more trick to make the algorithm more stable. When we compute the separable function $\widetilde{f}_i(x)$, we will smooth it, because we want to avoid excessive discontinuities due to the finite number of random points. We just average over nearest neighbors \begin{eqnarray} \widetilde{f}_i \leftarrow \frac{\widetilde{f}_{i-1}+\widetilde{f}_i +\widetilde{f}_{i+1}}{3} \end{eqnarray} Note: be careful at the endpoints, where you need to average over two points only.
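A compact way to write this smoothing (shown only as an illustration; the Smoothen routine below does the same with explicit slicing):
from numpy import *
def smooth3(f):
    # nearest-neighbor average (f[i-1]+f[i]+f[i+1])/3, with only two points at the endpoints
    fs = convolve(f, ones(3), mode='same')             # sums of up to three neighbors
    nn = convolve(ones(len(f)), ones(3), mode='same')  # 3 in the interior, 2 at the endpoints
    return fs/nn
print(smooth3(array([0., 10., 0., 10., 0.])))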
Finally, for the grid $g(x)$ we will use grid points distributed between $[0,1]$ so that $x_0 = \frac{1}{N}$, $x_1=\frac{2}{N}$, $\cdots$, $x_{N-2}=\frac{N-1}{N}$, $x_{N-1}=1$. We know that $g(x=0)=0$, hence we do not need to save the point $x=0$, but we need to be careful when interpolating near the point $x_0$. For such an equidistant mesh, given a point $x\in [0,1]$ we can compute $i=int(x N)$, and then we know that $x$ lies between $x_{i-1}$ and $x_i$, and linear interpolation gives \begin{eqnarray} &&g(x) = g_{i-1} + (g_{i}-g_{i-1})\frac{(x-i/N)}{1/N}\\ &&g(x) = g_{0}\; x N \quad if\quad i=0 \end{eqnarray}
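Here is a vectorized sketch of this interpolation, together with the corresponding weight $\frac{dg}{dx}=(g_i-g_{i-1})N$, for a whole batch of random points at once (the same formulas are used, point by point, inside the Vegas routines below):
from numpy import *
def interpolate_grid(x, g):
    # g[0..N-1] holds g(1/N),...,g(1); g(0)=0 is implicit
    N    = len(g)
    pos  = x*N
    i    = pos.astype(int)                   # i = int(x*N)
    gim1 = where(i > 0, g[i-1], 0.0)         # g_{i-1}, with g_{-1}=0
    gx   = gim1 + (g[i] - gim1)*(pos - i)    # linear interpolation of g at x
    dgdx = (g[i] - gim1)*N                   # weight dg/dx at this point
    return gx, dgdx
g = arange(1,129)/128.                       # identity grid g(x)=x with N=128 bins
x = random.random(5)
gx, dgdx = interpolate_grid(x, g)
print(gx - x, dgdx)                          # ~0 and ~1 for the identity grid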
Finally, we want to discuss the calculation of the error and the confidence in the result. We will perform $M$ outer iterations (which update the grid). Each such iteration consists of $n_i$ function evaluations (of the order of a few thousand to ten thousand). From all these $\sum_i n_i$ evaluations we want to obtain the best estimate of the integral, and an estimate of its error.
At each iteration, we will sample the following quantities: \begin{eqnarray} && ⟨f_w⟩_i = \frac{1}{n_i}\sum_{j=1}^{n_i} f(g_1({x_1}_j),g_2({x_2}_j),g_3({x_3}_j))\frac{dg_1}{dx_1}({x_1}_j)\frac{dg_2}{dx_2}({x_2}_j)\frac{dg_3}{dx_3}({x_3}_j)\\ && ⟨f_w^2⟩_i = \frac{1}{n_i}\sum_{j=1}^{n_i} \left(f(g_1({x_1}_j),g_2({x_2}_j),g_3({x_3}_j))\frac{dg_1}{dx_1}({x_1}_j)\frac{dg_2}{dx_2}({x_2}_j)\frac{dg_3}{dx_3}({x_3}_j)\right)^2 \end{eqnarray} Then the estimate for the variance of the MC sampling is \begin{eqnarray} \sigma_i^2 = \frac{ ⟨f_w^2⟩_i-⟨f_w⟩_i^2}{n_i-1} \end{eqnarray} Note that the standard deviation of the sampled function, $\sqrt{⟨f_w^2⟩_i-⟨f_w⟩_i^2}$, approaches a constant when the number of sampled points $n_i$ goes to infinity, while the error of the MC sampling itself falls off as $\sigma_i \propto \frac{1}{\sqrt{n_i}}$, as expected.
From all accumulated evaluations of the function (in $M$ iterations), we can construct the best estimate of the integral. Naively, we would just calculate $1/M\sum_i ⟨f_w⟩_i$. However, in the first few iterations the error is much larger than in the last one, and therefore we want to down-weight those early, less accurate estimates. We achieve that by \begin{eqnarray} I_{best}=\frac{\sum_{i=1}^M \frac{⟨f_w⟩_i}{\sigma_i^2}}{\sum_{i=1}^M\frac{1}{\sigma_i^2}} \end{eqnarray} Similarly, the errors do not add up; the combined error is smaller than the smallest error (that of the last iteration). We have \begin{eqnarray} \frac{1}{\sigma_{best}^2} = \sum_{i=1}^M \frac{1}{\sigma_i^2} \end{eqnarray} and finally the $\chi^2$ can be estimated by \begin{eqnarray} \chi^2 = \frac{1}{M-1}\sum_{i=1}^M \frac{(⟨f_w⟩_i -I_{best})^2}{\sigma_i^2} \end{eqnarray}
It measures the weighted squared deviations between the observed data ($⟨f_w⟩_i$) and the model prediction ($I_{best}$). If the model prediction is good, $\chi^2$ should be close to unity, because $⟨f_w⟩_i$ should on average be about $\sigma_i$ away from $I_{best}$.
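As a short illustration of this combination with made-up per-iteration numbers (the Cumulants class below accumulates the same sums incrementally during the run):
from numpy import *
avg   = array([1.370, 1.399, 1.402, 1.391])   # <f_w>_i for each iteration (made-up values)
sigma = array([0.012, 0.006, 0.004, 0.003])   # sigma_i for each iteration (made-up values)
w        = 1.0/sigma**2                       # inverse-variance weights
I_best   = sum(w*avg)/sum(w)
sig_best = sqrt(1.0/sum(w))
chisq    = sum(w*(avg - I_best)**2)/(len(avg)-1)
print(I_best, '+-', sig_best, 'chi^2=', chisq)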
Brute force Monte Carlo¶
First we will implement the simple Monte Carlo method.
# simple class to take care of statistically determined quantities
class Cumulants:
def __init__(self):
self.sum=0.0 # f_0 + f_1 +.... + f_N
self.sqsum=0.0 # f_0^2 + f_1^2 +....+ f_N^2
self.avg = 0.0 # I_best when many iterations, otherwise <f> = 1/N\sum_i f_i
self.err = 0.0 # sigma of I_best when many iterations, otherwise sqrt( <f^2>-<f>^2 )/sqrt(N)
self.chisq = 0.0
self.weightsum=0.0 # \sum_i 1/sigma_i^2
self.avgsum=0.0 # \sum_i <f>_i/sigma_i^2
self.avg2sum=0.0 # \sum_i <f>_i^2/sigma_i^2
def SimpleMC(integrant, ndim, unit, maxeval, cum):
""" integrant -- function to integrate. It should be vectorized function!
        ndim -- how many dimensions the function has
unit -- the size of the box we are using for integration [[0,L],[0,L]...]
maxeval -- total number of function evaluations we will perform
        cum -- class Cumulants, which will hold all information we want to obtain.
"""
    nbatch=1000 # function will be evaluated in batches of 1000 evaluations at one time (for efficiency and storage issues)
neval=0
for nsamples in range(maxeval,0,-nbatch): # loop over all_nsample evaluations in batches of nbatch
# nsamples counts remaining number of function evaluations
n = min(nbatch,nsamples) # How many evaluations in this pass?
xr = unit*random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
wfun = integrant(xr) # n function evaluations required in single call
        neval += n # We just added so many function evaluations
cum.sum += sum(wfun) # sum_i f_i = <fw> * neval
cum.sqsum += sum(wfun*wfun) # sum_i f_i^2 = <fw^2> * neval
cum.avg = cum.sum/neval
w0 = sqrt(cum.sqsum/neval) # sqrt(<f^2>)
cum.err = sqrt((w0-cum.avg)*(w0+cum.avg)/neval) # sqrt(sqsum-sum**2)
cum.avg *= unit**ndim # adding units if not [0,1] interval
cum.err *= unit**ndim # adding units if not [0,1] interval
return cum
We used here a simple trick to avoid overflow, i.e.,
\begin{eqnarray} \sqrt{\frac{\langle f^2\rangle-\langle f\rangle^2}{N}} = \sqrt{\frac{(\sqrt{\langle f^2\rangle}-\langle f\rangle)(\sqrt{\langle f^2\rangle}+\langle f\rangle)}{N}} \end{eqnarray}For testing, we will integrate $$ \frac{1}{\pi^3}\int_0^\pi dx\int_0^\pi dy \int_0^\pi dz \frac{1}{1-\cos(x) \cos(y) \cos(z)}$$
def my_integrant2(x):
""" For testing, we are integration the function
1/(1-cos(x)*cos(y)*cos(z))/pi^3
in the interval [0,pi]**3
"""
#nbatch,ndim = shape(x)
return 1.0/(1.0-cos(x[:,0])*cos(x[:,1])*cos(x[:,2]))/pi**3
from scipy import *
from numpy import *
unit=pi
ndim=3
maxeval=200000
exact = 1.3932 # exact value of the integral
cum = Cumulants()
SimpleMC(my_integrant2, ndim, pi, maxeval, cum)
print(cum.avg, '+-', cum.err, 'exact=', exact)
print('how many sigma away', abs(cum.avg-exact)/cum.err)
1.396295798059595 +- 0.016560211756764396 exact= 1.3932 how many sigma away 0.18694193679802276
Vegas¶
First we define the grid points $g(x)$. At the beginning, we just set $g(x)=x$.
class Grid:
"""Contains the grid points g_n(x) with x=[0...1], and g=[0...1]
for Vegas integration. There are n-dim g_n functions.
Constraints : g(0)=0 and g(1)=1.
"""
def __init__(self, ndim, nbins):
self.g = zeros((ndim,nbins+1))
# a bit dirty trick: We will later use also g[-1] in interpolation, which should be set to zero, hence
# we allocate dimension nbins+1, rather than nbins
self.ndim=ndim
self.nbins=nbins
# At the beginning we set g(x)=x
# The grid-points are x_0 = 1/N, x_1 = 2/N, ... x_{N-1}=1.0. Note x_N=x_{-1}=0
# Note that g(0)=0, and we skip this point on the mesh.
for idim in range(ndim): # over all 3 dimensions
self.g[idim,:nbins] = arange(1,nbins+1)/float(nbins)
def Vegas_step1(integrant, unit, maxeval, nstart, nincrease, grid, cum):
"""
integrant -- function to integrate. It should be vectorized function!
unit -- the size of the box we are using for integration [[0,L],[0,L]...]
maxeval -- total number of function evaluations we will perform
nstart -- number of MC steps in the first iteration (before refining the grid)
nincrease -- how many points we add once grid is refined
grid -- the class which contains current grid, and is self-consistently refined.
    cum -- class Cumulants, which will hold all information we want to obtain.
"""
ndim, nbins = grid.ndim,grid.nbins # dimension of the integral, size of the grid for binning in each direction
unit_dim = unit**ndim # converts from unit cube integration to generalized cube with unit length
    nbatch=1000 # function will be evaluated in batches of 1000 evaluations at one time (for efficiency and storage issues)
neval=0
print ("""Vegas parameters:
ndim = """+str(ndim)+"""
unit = """+str(unit)+"""
maxeval = """+str(maxeval)+"""
nstart = """+str(nstart)+"""
nincrease = """+str(nincrease)+"""
nbins = """+str(nbins)+"""
nbaths = """+str(nbatch)+"\n")
bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
all_nsamples = nstart
for nsamples in range(all_nsamples,0,-nbatch): # loop over all_nsample evaluations in batches of nbatch
# nsamples is the number of points left to evaluate
n = min(nbatch,nsamples) # How many evaluations in this pass?
# We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
# This is represented as 1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
# where dg_1/dx = diff*NBINS
wgh = zeros(nbatch) # weights for each random point in the batch
xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
# This part takes quite a lot of time and it would be nice to rewrite with numba!
for i in range(n): # over the batch, which is n-items long
weight = 1.0/all_nsamples # weight in this point of the batch, i.e., one configuration
            for dim in range(ndim): # over all dimensions of the function evaluation
# We want to evaluate the function f at point g(x), i.e, f(g_1(x_1),g_2(x_2),...)
# Here we transform the points x_1,x_2,x_3 -> g_1(x_1), g_2(x_2), g_3(x_3)
# We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
# The discretized g(t) is defined on the grid :
# t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
# We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
# To interpolate g at x, we first compute i=int(x*N) and then we use linear interpolation
# g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-i) ; if i>0
# g(x) = 0 + (g[0]-0)*(x*N-0) ; if i=0
#
pos = xr[i,dim]*nbins # which grid would it fit ? (x*N)
ipos = int(pos) # the grid position is ipos : int(x*N)==i
diff = grid.g[dim,ipos] - grid.g[dim,ipos-1] # g[i]-g[i-1]
# linear interpolation for g(x) :
xr[i,dim] = (grid.g[dim,ipos-1] + (pos-ipos)*diff)*unit # g(xr) ~ ( g[i-1]+(g[i]-g[i-1])*(x*N-i) )*[units]
bins[i,dim]=ipos # remember in which bin this random number falls.
weight *= diff*nbins # weight for this dimension is dg/dx = (g[i]-g[i-1])*N
# because dx = i/N - (i-1)/N = 1/N
            wgh[i] = weight # total weight for this point is (dg/dx)*(dg/dy)*(dg/dz).../N_{samples}
# Here we evaluate function f on all randomly generated x points above
fx = integrant(xr) # n function evaluations required in single call
        neval += n # We just added so many function evaluations
# Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
wfun = wgh * fx # weight * function ~ f_i*w_i, here w_i has 1/N in its weight, hence actually we are
# evaluating 1/N * sum_i f_i*w_i
cum.sum += sum(wfun) # 1/N sum_i f_i*w_i = <fw>
wfun *= wfun # carefull : we need 1/N (f_i w_i)^2, while this gives (f_i w_i/N)^2
cum.sqsum += sum(wfun) # sum_i (f_i*w_i/N)^2 = <fw^2>/all_nsamples
#
w0 = sqrt(cum.sqsum*all_nsamples) # w0 = sqrt(<fw^2>)
w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
w = (all_nsamples-1)/w1 # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
    # Note that the variance of the MC sampling is Var(monte-f) = (<fw^2>-<fw>^2)/(N-1) == sigma_i^2, hence w = 1/sigma_i^2
cum.weightsum += w # weightsum ~ \sum_i 1/sigma_i^2
cum.avgsum += w*cum.sum # avgsum ~ \sum_i <fw>_i / sigma_i^2
cum.avg = cum.avgsum/cum.weightsum # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
cum.err = sqrt(1/cum.weightsum) # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))
cum.avg *= unit**ndim
cum.err *= unit**ndim
from scipy import *
from numpy import *
unit=pi
ndim=3
maxeval=200000
exact = 1.3932 # exact value of the integral
cum = Cumulants()
nbins=128
nstart =10000
nincrease=5000
grid = Grid(ndim,nbins)
#random.seed(0)
#SimpleMC(my_integrant2, ndim, pi, maxeval, cum)
Vegas_step1(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
print (cum.avg, '+-', cum.err, 'exact=', exact)
Vegas parameters:
ndim = 3
unit = 3.141592653589793
maxeval = 200000
nstart = 10000
nincrease = 5000
nbins = 128
nbaths = 1000
1.3618807993509006 +- 0.025336066258706214 exact= 1.3932
Now we are going to insert the sampling of the projections of the function $f(x)$ onto all axes, $f_1(x), f_2(y)...$, from which we will calculate the new grids $g_1(x), g_2(y)...$.
def Vegas_step2(integrant, unit, maxeval, nstart, nincrease, grid, cum):
ndim, nbins = grid.ndim,grid.nbins # dimension of the integral, size of the grid for binning in each direction
unit_dim = unit**ndim # converts from unit cube integration to generalized cube with unit length
nbatch=1000 # function will be evaluated in bacthes of 1000 evaluations at one time (for efficiency and storage issues)
neval=0
print ("""Vegas parameters:
ndim = """+str(ndim)+"""
unit = """+str(unit)+"""
maxeval = """+str(maxeval)+"""
nstart = """+str(nstart)+"""
nincrease = """+str(nincrease)+"""
nbins = """+str(nbins)+"""
nbaths = """+str(nbatch)+"\n")
bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
all_nsamples = nstart
# fxbin is not the function being integrated; it is an envelope for importance sampling.
fxbin = zeros((ndim,nbins)) #new2: after each iteration we reset the average function being binned
for nsamples in range(all_nsamples,0,-nbatch): # loop over all_nsample evaluations in batches of nbatch
n = min(nbatch,nsamples) # How many evaluations in this pass?
# We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
# This is represented as 1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
# where dg_1/dx = diff*NBINS
wgh = zeros(nbatch) # weights for each random point in the batch
xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
for i in range(n):
weight = 1.0/all_nsamples
for dim in range(ndim):
# We want to evaluate the function f at point g(x), i.e, f(g_1(x),g_2(y),...)
# Here we transform the points x,y,z -> g_1(x), g_2(y), g_3(z)
# We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
# The discretized g(t) is defined on the grid :
# t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
# We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
# To interpolate g at x, we first compute i=int(x*N) and then we use linear interpolation
# g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-i) ; if i>0
# g(x) = 0 + (g[0]-0)*(x*N-0) ; if i=0
#
pos = xr[i,dim]*nbins # which grid would it fit ? (x*N)
ipos = int(pos) # the grid position is ipos : int(x*N)==i
                diff = grid.g[dim,ipos] - grid.g[dim,ipos-1] # g[i]-g[i-1]
# linear interpolation for g(x) :
xr[i,dim] = (grid.g[dim,ipos-1] + (pos-ipos)*diff)*unit # g(xr) ~ ( g[i-1]+(g[i]-g[i-1])*(x*N-i) )*[units]
bins[i,dim]=ipos # remember in which bin this random number falls.
weight *= diff*nbins # weight for this dimension is dg/dx = (g[i]-g[i-1])*N
# because dx = i/N - (i-1)/N = 1/N
            wgh[i] = weight # total weight is (dg/dx)*(dg/dy)*(dg/dz).../N_{samples}
# Here we evaluate function f on all randomly generated x points above
fx = integrant(xr) # n function evaluations required in single call
neval += n # We just added so many fuction evaluations
# Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
wfun = wgh * fx # weight * function ~ f_i*w_i
cum.sum += sum(wfun) # sum_i f_i*w_i = <fw>
wfun *= wfun # carefull : this is like (f_i * w_i/N)^2 hence 1/N (1/N (f_i*w_i)^2)
cum.sqsum += sum(wfun) # sum_i (f_i*w_i)^2 = <fw^2>/all_nsamples
#
for dim in range(ndim): #new2
# projection of the sampled function^2 to each dimension, which will be used to improve the grid.
for i in range(n): #new2
# loop over x_i: fxbin[dim,x_i] += sum_{x1_1,x_2,..but not x_i}|f(g_1(x_1),g_2(x_2),...)*dg_1/dx*dg_2/dx...|^2
fxbin[dim, bins[i,dim] ] += wfun[i] #new2: just bin the function f. We saved the bin position before.
w0 = sqrt(cum.sqsum*all_nsamples) # w0 = sqrt(<fw^2>)
w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
w = (all_nsamples-1)/w1 # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
    # Note that the variance of the MC sampling is Var(monte-f) = (<fw^2>-<fw>^2)/(N-1) == sigma_i^2, hence w = 1/sigma_i^2
cum.weightsum += w # weightsum ~ \sum_i 1/sigma_i^2
cum.avgsum += w*cum.sum # avgsum ~ \sum_i <fw>_i / sigma_i^2
cum.avg = cum.avgsum/cum.weightsum # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
cum.err = sqrt(1/cum.weightsum) # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))
cum.avg *= unit**ndim
cum.err *= unit**ndim
return fxbin
from scipy import *
from numpy import *
unit=pi
ndim=3
maxeval=200000
exact = 1.3932 # exact value of the integral
cum = Cumulants()
nbins=128
nstart =10000
nincrease=5000
grid = Grid(ndim,nbins)
#SimpleMC(my_integrant2, ndim, pi, maxeval, cum)
fxbin = Vegas_step2(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
print(cum.avg, '+-', cum.err, 'exact=', exact)
Vegas parameters:
ndim = 3
unit = 3.141592653589793
maxeval = 200000
nstart = 10000
nincrease = 5000
nbins = 128
nbaths = 1000
1.3189788965606613 +- 0.01902487587708361 exact= 1.3932
from pylab import *
%matplotlib inline
plot(fxbin[0])
plot(fxbin[1])
plot(fxbin[2])
#xlim([0,128])
ylim([0,1e-7])
show()
def Smoothen(fxbin):
(ndim,nbins) = shape(fxbin)
final = zeros(shape(fxbin))
for idim in range(ndim):
fxb = copy(fxbin[idim,:]) # widetilde{f}(x) copied
#**** smooth the widetilde{f} value stored for each bin ****
# f[i] <- (f[i+1]+f[i]+f[i-1])/3.
fxb[:nbins-1] += fxbin[idim,1:nbins]
fxb[1:nbins] += fxbin[idim,:nbins-1]
fxb[1:nbins-1] *= 1/3.
fxb[0] *= 1/2.
fxb[nbins-1] *= 1/2.
norm = sum(fxb)
if( norm == 0 ):
print ('ERROR can not refine the grid with zero grid function')
return # can not refine the grid if the function is zero.
fxb *= 1.0/norm # we normalize the function.
# Note that normalization is such that the sum is 1.
        # And then we take the square root of the absolute value, just in case the value is negative at some point.
final[idim,:] = sqrt(abs(fxb))
return final
from pylab import *
%matplotlib inline
imp = Smoothen(fxbin)
plot(imp[0])
plot(imp[1])
plot(imp[2])
xlim([0,128])
show()
Note that we require $$f(g(x))\frac{dg}{dx} = const$$ which is equivalent to saying that the weight function will be $$w(x) \propto f(g(x))$$ because $\frac{dg}{dx}=\frac{1}{w(x)}$
Next we rewrite this identity in a more useful way: $$f(g(x))\frac{dg}{dx} = const = \int_{g(x)}^{g(x)+\frac{dg}{dx}dx} f(t) dt$$ which when discretized becomes
$$\int_{g_{l-1}}^{g_l} f(g_{old}) dg_{old} = \frac{I}{N_g}$$where $I=\int_0^1 f(g_{old}) dg_{old}$ and $N_g$ is the number of discrete points $l$.
class Grid:
"""Contains the grid points g_n(x) with x=[0...1], and g=[0...1]
for Vegas integration. There are n-dim g_n functions.
Constraints : g(0)=0 and g(1)=1.
"""
def __init__(self, ndim, nbins):
self.g = zeros((ndim,nbins+1))
# a bit dirty trick: We will later use also g[-1] in interpolation, which should be set to zero, hence
        # we allocate dimension nbins+1, rather than nbins
self.ndim=ndim
self.nbins=nbins
# At the beginning we set g(x)=x
# The grid-points are x_0 = 1/N, x_1 = 2/N, ... x_{N-1}=1.0.
# Note that g(0)=0, and we skip this point on the mesh.
for idim in range(ndim):
self.g[idim,:nbins] = arange(1,nbins+1)/float(nbins)
def RefineGrid(self, imp):
# imp[idim,ibin] is the input function \widetilde{f}(g) from which we construct the grid
(ndim,nbins) = shape(imp)
gnew = zeros((ndim,nbins+1))
for idim in range(ndim):
avgperbin = sum(imp[idim,:])/nbins
#**** redefine the size of each bin ****
newgrid = zeros(nbins)
cur=0.0
newcur=0.0
thisbin = 0.0
ibin = -1
# we are trying to determine
# Int[ f(g) dg, {g, g_{i-1},g_i}] == I/N_g
# where I == avgperbin
for newbin in range(nbins-1): # all but the last bin, which is 1.0
while (thisbin < avgperbin) :
ibin+=1
thisbin += imp[idim,ibin] # adding widetilde{f}(g)
prev = cur # g^old_{l-1}
cur = self.g[idim,ibin] # g^old_{l}
# Explanation is in order :
# g^new_l should be somewhere between g^old_l and g^old_{l-1}, because if we add the last point
                # we exceeded I/Ng value, while without the last point, we do not have enough. Hence need to interpolate.
# prev -- g^{old}_{l-1}
# cur -- g^{old}_l
# thisbin -- Sm = f_{l-k}+.... +f_{l-2}+f_{l-1}+f_l
# we know that Sm is just a bit more than we need, i.e., I/N_g, hence we need to compute how much more
# using linear interpolation :
# g^{new} = g_l - (g_l-g_{l-1}) * (f_{l-k}+....+f_{l-2}+f_{l-1}+f_l - I/N_g)/f_l
# clearly
# if I/N_g == f_{l-k}+....+f_{l-2}+f_{l-1}+f_l
# we will get g^{new} = g_l
# and if I/N_g == f_{l-k}+....+f_{l-2}+f_{l-1}
# we will get g^{new} = g_{l-1}
# and if I/N_g is between the two possibilities, we will get linear interpolation between
# g_{l-1} and g_l
#
thisbin -= avgperbin # thisbin <- (f_{l-k}+....+f_{l-2}+f_{l-1}+f_l - I/N_g)
dx=thisbin/imp[idim,ibin] # dx is between 0 and 1
# cur=g_l and prev=g_{l-1} and dx is the fraction of each we need to take for linear interpolation
newgrid[newbin] = cur + (prev-cur)*dx # linear interpolation between g_l and g_{l-1}
newgrid[nbins-1]=1.0
gnew[idim,:nbins]= newgrid
self.g = gnew
return gnew
We have now updated the class Grid so that it can also refine the grid (the new method RefineGrid).
from scipy import *
from numpy import *
unit=pi
ndim=3
maxeval=200000
exact = 1.3932 # exact value of the integral
cum = Cumulants()
nbins=128
nstart =10000
nincrease=5000
grid = Grid(ndim,nbins)
fxbin = Vegas_step2(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
imp = Smoothen(fxbin)
grid.RefineGrid(imp)
print(cum.avg, '+-', cum.err, 'exact=', exact)
plot(grid.g[0,:nbins])
plot(grid.g[1,:nbins])
plot(grid.g[2,:nbins])
xlim([0,128])
show()
Vegas parameters:
ndim = 3
unit = 3.141592653589793
maxeval = 200000
nstart = 10000
nincrease = 5000
nbins = 128
nbaths = 1000
1.3875011307357779 +- 0.0376606993288532 exact= 1.3932
def Vegas_step3(integrant, unit, maxeval, nstart, nincrease, grid, cum):
ndim, nbins = grid.ndim,grid.nbins # dimension of the integral, size of the grid for binning in each direction
unit_dim = unit**ndim # converts from unit cube integration to generalized cube with unit length
nbatch=1000 # function will be evaluated in bacthes of 1000 evaluations at one time (for efficiency and storage issues)
neval=0
print ("""Vegas parameters:
ndim = """+str(ndim)+"""
unit = """+str(unit)+"""
maxeval = """+str(maxeval)+"""
nstart = """+str(nstart)+"""
nincrease = """+str(nincrease)+"""
nbins = """+str(nbins)+"""
nbaths = """+str(nbatch)+"\n")
bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
all_nsamples = nstart
for iter in range(1000): # NEW in step 3
wgh = zeros(nbatch) # weights for each random point in the batch
fxbin = zeros((ndim,nbins)) # after each iteration we reset the average function being binned
for nsamples in range(all_nsamples,0,-nbatch): # loop over all_nsample evaluations in batches of nbatch
n = min(nbatch,nsamples) # How many evaluations in this pass?
# We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
# This is represented as 1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
# where dg_1/dx = diff*NBINS
xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
for i in range(n):
weight = 1.0/all_nsamples
for dim in range(ndim):
# We want to evaluate the function f at point g(x), i.e, f(g_1(x),g_2(y),...)
# Here we transform the points x,y,z -> g_1(x), g_2(y), g_3(z)
# We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
# The discretized g(t) is defined on the grid :
# t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
# We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
# To interpolate g at x, we first compute i=int(x*N) and then we use linear interpolation
# g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-i) ; if i>0
# g(x) = 0 + (g[0]-0)*(x*N-0) ; if i=0
#
pos = xr[i,dim]*nbins # which grid would it fit ? (x*N)
ipos = int(pos) # the grid position is ipos : int(x*N)==i
diff = grid.g[dim,ipos] - grid.g[dim,ipos-1] # g[i]-g[i-1]
# linear interpolation for g(x) :
xr[i,dim] = (grid.g[dim,ipos-1] + (pos-ipos)*diff)*unit # g(xr) ~ ( g[i-1]+(g[i]-g[i-1])*(x*N-i) )*[units]
bins[i,dim]=ipos # remember in which bin this random number falls.
weight *= diff*nbins # weight for this dimension is dg/dx = (g[i]-g[i-1])*N
# because dx = i/N - (i-1)/N = 1/N
                wgh[i] = weight # total weight is (dg/dx)*(dg/dy)*(dg/dz).../N_{samples}
# Here we evaluate function f on all randomly generated x points above
fx = integrant(xr) # n function evaluations required in single call
neval += n # We just added so many fuction evaluations
# Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
wfun = wgh * fx # weight * function ~ f_i*w_i
cum.sum += sum(wfun) # sum_i f_i*w_i = <fw>
wfun *= wfun # carefull : this is like (f_i * w_i/N)^2 hence 1/N (1/N (f_i*w_i)^2)
cum.sqsum += sum(wfun) # sum_i (f_i*w_i)^2 = <fw^2>/all_nsamples
#
for dim in range(ndim): #new2
# Here we make a better approximation for the function, which we are integrating.
for i in range(n): #new2
fxbin[dim, bins[i,dim] ] += wfun[i] #new2: just bin the function f. We saved the bin position before.
w0 = sqrt(cum.sqsum*all_nsamples) # w0 = sqrt(<fw^2>)
w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
w = (all_nsamples-1)/w1 # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
        # Note that the variance of the MC sampling is Var(monte-f) = (<fw^2>-<fw>^2)/(N-1) == sigma_i^2, hence w = 1/sigma_i^2
cum.weightsum += w # weightsum ~ \sum_i 1/sigma_i^2
cum.avgsum += w*cum.sum # avgsum ~ \sum_i <fw>_i / sigma_i^2
        cum.avg2sum += w*cum.sum**2 # avg2sum ~ \sum_i <fw>_i^2/sigma_i^2
cum.avg = cum.avgsum/cum.weightsum # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
cum.err = sqrt(1/cum.weightsum) # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))
# NEW in this step3
if iter>0:
cum.chisq = (cum.avg2sum - 2*cum.avgsum*cum.avg + cum.weightsum*cum.avg**2)/iter
print ("Iteration {:3d}: I= {:10.8f} +- {:10.8f} chisq= {:10.8f} number of evaluations = {:7d} ".format(iter+1, cum.avg*unit_dim, cum.err*unit_dim, cum.chisq, neval))
imp = Smoothen(fxbin)
grid.RefineGrid(imp)
cum.sum=0 # clear the partial sum for the next step
cum.sqsum=0
all_nsamples += nincrease # for the next time, increase the number of steps a bit
if (neval>=maxeval): break
cum.avg *= unit**ndim
cum.err *= unit**ndim
from scipy import *
from numpy import *
unit=pi
ndim=3
maxeval=20000000
exact = 1.3932 # exact value of the integral
cum = Cumulants()
nbins=128
nstart =100000
nincrease=5000
grid = Grid(ndim,nbins)
#random.seed(0)
Vegas_step3(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
print (cum.avg, '+-', cum.err, 'exact=', exact, 'real error=', abs(cum.avg-exact)/exact)
plot(grid.g[0,:nbins])
plot(grid.g[1,:nbins])
plot(grid.g[2,:nbins])
show()
Vegas parameters:
ndim = 3
unit = 3.141592653589793
maxeval = 20000000
nstart = 100000
nincrease = 5000
nbins = 128
nbaths = 1000
Iteration 1: I= 1.37185497 +- 0.01219075 chisq= 0.00000000 number of evaluations = 100000
Iteration 2: I= 1.39930206 +- 0.00628292 chisq= 6.90258448 number of evaluations = 205000
Iteration 3: I= 1.40244764 +- 0.00433868 chisq= 3.69086103 number of evaluations = 315000
Iteration 4: I= 1.40122918 +- 0.00332122 chisq= 2.52407193 number of evaluations = 430000
Iteration 5: I= 1.39782439 +- 0.00272196 chisq= 2.69332894 number of evaluations = 550000
Iteration 6: I= 1.39440311 +- 0.00231822 chisq= 3.30510111 number of evaluations = 675000
Iteration 7: I= 1.39247026 +- 0.00201408 chisq= 3.22680709 number of evaluations = 805000
Iteration 8: I= 1.39367398 +- 0.00182466 chisq= 3.05049301 number of evaluations = 940000
Iteration 9: I= 1.39244159 +- 0.00166689 chisq= 3.01383010 number of evaluations = 1080000
Iteration 10: I= 1.39169324 +- 0.00153112 chisq= 2.82227410 number of evaluations = 1225000
Iteration 11: I= 1.39227460 +- 0.00147042 chisq= 2.72554480 number of evaluations = 1375000
Iteration 12: I= 1.39228851 +- 0.00139759 chisq= 2.47785223 number of evaluations = 1530000
Iteration 13: I= 1.39273632 +- 0.00131795 chisq= 2.34863197 number of evaluations = 1690000
Iteration 14: I= 1.39193851 +- 0.00124003 chisq= 2.41360925 number of evaluations = 1855000
Iteration 15: I= 1.39209278 +- 0.00117397 chisq= 2.25186744 number of evaluations = 2025000
Iteration 16: I= 1.39195469 +- 0.00111669 chisq= 2.11143293 number of evaluations = 2200000
Iteration 17: I= 1.39199769 +- 0.00106459 chisq= 1.98048511 number of evaluations = 2380000
Iteration 18: I= 1.39224558 +- 0.00101696 chisq= 1.90044675 number of evaluations = 2565000
Iteration 19: I= 1.39226155 +- 0.00097467 chisq= 1.79503445 number of evaluations = 2755000
Iteration 20: I= 1.39215354 +- 0.00093873 chisq= 1.70948666 number of evaluations = 2950000
Iteration 21: I= 1.39174991 +- 0.00090295 chisq= 1.74763427 number of evaluations = 3150000
Iteration 22: I= 1.39201916 +- 0.00088861 chisq= 1.79876934 number of evaluations = 3355000
Iteration 23: I= 1.39193942 +- 0.00086626 chisq= 1.72437605 number of evaluations = 3565000
Iteration 24: I= 1.39175239 +- 0.00083569 chisq= 1.67863878 number of evaluations = 3780000
Iteration 25: I= 1.39179166 +- 0.00080836 chisq= 1.61012556 number of evaluations = 4000000
Iteration 26: I= 1.39193725 +- 0.00078270 chisq= 1.56648987 number of evaluations = 4225000
Iteration 27: I= 1.39177483 +- 0.00076036 chisq= 1.53567400 number of evaluations = 4455000
Iteration 28: I= 1.39183237 +- 0.00074168 chisq= 1.48316902 number of evaluations = 4690000
Iteration 29: I= 1.39178878 +- 0.00072642 chisq= 1.43322770 number of evaluations = 4930000
Iteration 30: I= 1.39161003 +- 0.00070920 chisq= 1.42837145 number of evaluations = 5175000
Iteration 31: I= 1.39178860 +- 0.00069055 chisq= 1.42145998 number of evaluations = 5425000
Iteration 32: I= 1.39187229 +- 0.00067370 chisq= 1.38543875 number of evaluations = 5680000
Iteration 33: I= 1.39204059 +- 0.00065766 chisq= 1.38358451 number of evaluations = 5940000
Iteration 34: I= 1.39204148 +- 0.00064133 chisq= 1.34165884 number of evaluations = 6205000
Iteration 35: I= 1.39217355 +- 0.00062549 chisq= 1.32776639 number of evaluations = 6475000
Iteration 36: I= 1.39202398 +- 0.00061224 chisq= 1.32880030 number of evaluations = 6750000
Iteration 37: I= 1.39212769 +- 0.00059926 chisq= 1.31088728 number of evaluations = 7030000
Iteration 38: I= 1.39226711 +- 0.00058736 chisq= 1.31268450 number of evaluations = 7315000
Iteration 39: I= 1.39229800 +- 0.00057577 chisq= 1.28000361 number of evaluations = 7605000
Iteration 40: I= 1.39235635 +- 0.00056399 chisq= 1.25368055 number of evaluations = 7900000
Iteration 41: I= 1.39226205 +- 0.00055255 chisq= 1.23974393 number of evaluations = 8200000
Iteration 42: I= 1.39215539 +- 0.00054107 chisq= 1.23160718 number of evaluations = 8505000
Iteration 43: I= 1.39227960 +- 0.00053165 chisq= 1.23863367 number of evaluations = 8815000
Iteration 44: I= 1.39226526 +- 0.00052344 chisq= 1.21038076 number of evaluations = 9130000
Iteration 45: I= 1.39224601 +- 0.00051438 chisq= 1.18376739 number of evaluations = 9450000
Iteration 46: I= 1.39225536 +- 0.00051427 chisq= 1.17542647 number of evaluations = 9775000
Iteration 47: I= 1.39225319 +- 0.00051322 chisq= 1.14996767 number of evaluations = 10105000
Iteration 48: I= 1.39247655 +- 0.00050376 chisq= 1.23578846 number of evaluations = 10440000
Iteration 49: I= 1.39245623 +- 0.00049464 chisq= 1.21098841 number of evaluations = 10780000
Iteration 50: I= 1.39246964 +- 0.00048563 chisq= 1.18669024 number of evaluations = 11125000
Iteration 51: I= 1.39248006 +- 0.00047990 chisq= 1.16334846 number of evaluations = 11475000
Iteration 52: I= 1.39259921 +- 0.00047305 chisq= 1.18321110 number of evaluations = 11830000
Iteration 53: I= 1.39255013 +- 0.00046496 chisq= 1.16656505 number of evaluations = 12190000
Iteration 54: I= 1.39267805 +- 0.00045753 chisq= 1.18959615 number of evaluations = 12555000
Iteration 55: I= 1.39280301 +- 0.00044997 chisq= 1.20970349 number of evaluations = 12925000
Iteration 56: I= 1.39272051 +- 0.00044270 chisq= 1.20676952 number of evaluations = 13300000
Iteration 57: I= 1.39266996 +- 0.00043550 chisq= 1.19244091 number of evaluations = 13680000
Iteration 58: I= 1.39262597 +- 0.00042841 chisq= 1.17705560 number of evaluations = 14065000
Iteration 59: I= 1.39259513 +- 0.00042186 chisq= 1.15971001 number of evaluations = 14455000
Iteration 60: I= 1.39261472 +- 0.00041574 chisq= 1.14132155 number of evaluations = 14850000
Iteration 61: I= 1.39250347 +- 0.00040953 chisq= 1.16252478 number of evaluations = 15250000
Iteration 62: I= 1.39246565 +- 0.00040372 chisq= 1.14843251 number of evaluations = 15655000
Iteration 63: I= 1.39250951 +- 0.00039812 chisq= 1.13682304 number of evaluations = 16065000
Iteration 64: I= 1.39249895 +- 0.00039268 chisq= 1.11918950 number of evaluations = 16480000
Iteration 65: I= 1.39252391 +- 0.00038719 chisq= 1.10397899 number of evaluations = 16900000
Iteration 66: I= 1.39252339 +- 0.00038171 chisq= 1.08699571 number of evaluations = 17325000
Iteration 67: I= 1.39256877 +- 0.00037668 chisq= 1.07870883 number of evaluations = 17755000
Iteration 68: I= 1.39259264 +- 0.00037438 chisq= 1.06753048 number of evaluations = 18190000
Iteration 69: I= 1.39268362 +- 0.00037074 chisq= 1.09667271 number of evaluations = 18630000
Iteration 70: I= 1.39257656 +- 0.00036586 chisq= 1.12702497 number of evaluations = 19075000
Iteration 71: I= 1.39259940 +- 0.00036129 chisq= 1.11316580 number of evaluations = 19525000
Iteration 72: I= 1.39252010 +- 0.00035652 chisq= 1.12335027 number of evaluations = 19980000
Iteration 73: I= 1.39249992 +- 0.00035224 chisq= 1.10961017 number of evaluations = 20440000
1.3924999192458365 +- 0.0003522366641009809 exact= 1.3932 real error= 0.0005024983880013646
Here we are going to speed up the code by using vectorized numpy operations and numba.
Numba will be used to fill the fxbin array from the wfun array, which contains the projection of the integrated function onto each axis.
To interpolate the grid at the random points xr, we use numpy's vectorized operations and fancy indexing to remove the loop over the batch. All loops over the batch size n are gone in the final version of the code.
from numba import jit
@jit(nopython=True)
def SetFxbin(fxbin,bins,wfun):
(n,ndim) = bins.shape
for dim in range(ndim):
# Here we make a better approximation for the function, which we are integrating.
for i in range(n):
fxbin[dim, bins[i,dim] ] += abs(wfun[i]) # just bin the function f. We saved the bin position before.
def Vegas_step3b(integrant, unit, maxeval, nstart, nincrease, grid, cum):
ndim, nbins = grid.ndim,grid.nbins # dimension of the integral, size of the grid for binning in each direction
unit_dim = unit**ndim # converts from unit cube integration to generalized cube with unit length
nbatch=1000 # function will be evaluated in bacthes of 1000 evaluations at one time (for efficiency and storage issues)
neval=0
print ("""Vegas parameters:
ndim = """+str(ndim)+"""
unit = """+str(unit)+"""
maxeval = """+str(maxeval)+"""
nstart = """+str(nstart)+"""
nincrease = """+str(nincrease)+"""
nbins = """+str(nbins)+"""
nbaths = """+str(nbatch)+"\n")
bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
all_nsamples = nstart
for iter in range(1000): # NEW in step 3
wgh = zeros(nbatch) # weights for each random point in the batch
fxbin = zeros((ndim,nbins)) # after each iteration we reset the average function being binned
for nsamples in range(all_nsamples,0,-nbatch): # loop over all_nsample evaluations in batches of nbatch
n = min(nbatch,nsamples) # How many evaluations in this pass?
# We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
# This is represented as 1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
# where dg_1/dx = diff*NBINS
xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
pos = xr*nbins # (x*N)
bins = array(pos,dtype=int) # which grid would it fit ? (x*N)
wgh = ones(nbatch)/all_nsamples
for dim in range(ndim):
# We want to evaluate the function f at point g(x), i.e, f(g_1(x),g_2(y),...)
# Here we transform the points x,y,z -> g_1(x), g_2(y), g_3(z)
# We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
# The discretized g(t) is defined on the grid :
# t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
# We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
# To interpolate g at x, we first compute i=int(x*N) and then we use linear interpolation
# g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-int(x*N))
gi = grid.g[dim,bins[:,dim]] # g[i]
gm = grid.g[dim,bins[:,dim]-1] # g[i-1]
diff = gi - gm # g[i]-g[i-1]
gx = gm + (pos[:,dim]-bins[:,dim])*diff # linear interpolation g(xr)
xr[:,dim] = gx*unit # xr <- g(xr)
wgh *= diff*nbins # wgh = prod_{dim} dg/dx
# Here we evaluate function f on all randomly generated x points above
fx = integrant(xr) # n function evaluations required in single call
neval += n # We just added so many fuction evaluations
# Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
wfun = wgh * fx # weight * function ~ f_i*w_i
cum.sum += sum(wfun) # sum_i f_i*w_i = <fw>
wfun *= wfun # carefull : this is like (f_i * w_i/N)^2 hence 1/N (1/N (f_i*w_i)^2)
cum.sqsum += sum(wfun) # sum_i (f_i*w_i)^2 = <fw^2>/all_nsamples
#
SetFxbin(fxbin,bins,wfun)
#for dim in range(ndim): #new2
# # Here we make a better approximation for the function, which we are integrating.
# for i in range(n): #new2
# fxbin[dim, bins[i,dim] ] += wfun[i] #new2: just bin the function f. We saved the bin position before.
w0 = sqrt(cum.sqsum*all_nsamples) # w0 = sqrt(<fw^2>)
w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
w = (all_nsamples-1)/w1 # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
        # Note that the variance of the MC sampling is Var(monte-f) = (<fw^2>-<fw>^2)/(N-1) == sigma_i^2, hence w = 1/sigma_i^2
cum.weightsum += w # weightsum ~ \sum_i 1/sigma_i^2
cum.avgsum += w*cum.sum # avgsum ~ \sum_i <fw>_i / sigma_i^2
        cum.avg2sum += w*cum.sum**2 # avg2sum ~ \sum_i <fw>_i^2/sigma_i^2
cum.avg = cum.avgsum/cum.weightsum # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
cum.err = sqrt(1/cum.weightsum) # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))
# NEW in this step3
if iter>0:
cum.chisq = (cum.avg2sum - 2*cum.avgsum*cum.avg + cum.weightsum*cum.avg**2)/iter
print ("Iteration {:3d}: I= {:10.8f} +- {:10.8f} chisq= {:10.8f} number of evaluations = {:7d} ".format(iter+1, cum.avg*unit_dim, cum.err*unit_dim, cum.chisq, neval))
imp = Smoothen(fxbin)
grid.RefineGrid(imp)
cum.sum=0 # clear the partial sum for the next step
cum.sqsum=0
all_nsamples += nincrease # for the next time, increase the number of steps a bit
if (neval>=maxeval): break
cum.avg *= unit**ndim
cum.err *= unit**ndim
from scipy import *
from numpy import *
unit=pi
ndim=3
maxeval=20000000
exact = 1.3932 # exact value of the integral
cum = Cumulants()
nbins=128
nstart =100000
nincrease=5000
grid = Grid(ndim,nbins)
random.seed(0)
Vegas_step3b(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
print (cum.avg, '+-', cum.err, 'exact=', exact, 'real error=', abs(cum.avg-exact)/exact)
plot(grid.g[0,:nbins])
plot(grid.g[1,:nbins])
plot(grid.g[2,:nbins])
show()
Vegas parameters:
ndim = 3
unit = 3.141592653589793
maxeval = 20000000
nstart = 100000
nincrease = 5000
nbins = 128
nbaths = 1000
Iteration 1: I= 1.36762346 +- 0.01023309 chisq= 0.00000000 number of evaluations = 100000
Iteration 2: I= 1.38493614 +- 0.00552045 chisq= 4.03725299 number of evaluations = 205000
Iteration 3: I= 1.39107078 +- 0.00365694 chisq= 3.11889891 number of evaluations = 315000
Iteration 4: I= 1.39220291 +- 0.00285603 chisq= 2.16116961 number of evaluations = 430000
Iteration 5: I= 1.39391923 +- 0.00246919 chisq= 1.97837591 number of evaluations = 550000
Iteration 6: I= 1.39327395 +- 0.00216911 chisq= 1.64253221 number of evaluations = 675000
Iteration 7: I= 1.39237495 +- 0.00190896 chisq= 1.49574064 number of evaluations = 805000
Iteration 8: I= 1.39084621 +- 0.00171092 chisq= 1.74779014 number of evaluations = 940000
Iteration 9: I= 1.39124007 +- 0.00159307 chisq= 1.57911830 number of evaluations = 1080000
Iteration 10: I= 1.39178313 +- 0.00148354 chisq= 1.50089423 number of evaluations = 1225000
Iteration 11: I= 1.39129403 +- 0.00138447 chisq= 1.43499906 number of evaluations = 1375000
Iteration 12: I= 1.39136789 +- 0.00129897 chisq= 1.30670608 number of evaluations = 1530000
Iteration 13: I= 1.39192235 +- 0.00122906 chisq= 1.34277224 number of evaluations = 1690000
Iteration 14: I= 1.39210209 +- 0.00116712 chisq= 1.25622458 number of evaluations = 1855000
Iteration 15: I= 1.39289525 +- 0.00112200 chisq= 1.60155180 number of evaluations = 2025000
Iteration 16: I= 1.39279461 +- 0.00107659 chisq= 1.50154462 number of evaluations = 2200000
Iteration 17: I= 1.39284042 +- 0.00106297 chisq= 1.41219776 number of evaluations = 2380000
Iteration 18: I= 1.39295317 +- 0.00103524 chisq= 1.34198096 number of evaluations = 2565000
Iteration 19: I= 1.39360680 +- 0.00099462 chisq= 1.55533457 number of evaluations = 2755000
Iteration 20: I= 1.39377022 +- 0.00095542 chisq= 1.49186042 number of evaluations = 2950000
Iteration 21: I= 1.39356055 +- 0.00091986 chisq= 1.45022690 number of evaluations = 3150000
Iteration 22: I= 1.39346800 +- 0.00088810 chisq= 1.38827277 number of evaluations = 3355000
Iteration 23: I= 1.39356357 +- 0.00085791 chisq= 1.33304529 number of evaluations = 3565000
Iteration 24: I= 1.39348336 +- 0.00082902 chisq= 1.28082649 number of evaluations = 3780000
Iteration 25: I= 1.39358231 +- 0.00080332 chisq= 1.23718491 number of evaluations = 4000000
Iteration 26: I= 1.39371675 +- 0.00078013 chisq= 1.20738380 number of evaluations = 4225000
Iteration 27: I= 1.39378904 +- 0.00075729 chisq= 1.16667047 number of evaluations = 4455000
Iteration 28: I= 1.39369588 +- 0.00073682 chisq= 1.13397047 number of evaluations = 4690000
Iteration 29: I= 1.39340503 +- 0.00071662 chisq= 1.19640849 number of evaluations = 4930000
Iteration 30: I= 1.39302942 +- 0.00069683 chisq= 1.32902196 number of evaluations = 5175000
Iteration 31: I= 1.39284884 +- 0.00067851 chisq= 1.32786718 number of evaluations = 5425000
Iteration 32: I= 1.39253524 +- 0.00066360 chisq= 1.44353551 number of evaluations = 5680000
Iteration 33: I= 1.39256824 +- 0.00064833 chisq= 1.40012422 number of evaluations = 5940000
Iteration 34: I= 1.39270298 +- 0.00063255 chisq= 1.38490760 number of evaluations = 6205000
Iteration 35: I= 1.39260687 +- 0.00061716 chisq= 1.35830699 number of evaluations = 6475000
Iteration 36: I= 1.39266202 +- 0.00060602 chisq= 1.32587503 number of evaluations = 6750000
Iteration 37: I= 1.39269778 +- 0.00059474 chisq= 1.29166805 number of evaluations = 7030000
Iteration 38: I= 1.39261544 +- 0.00058300 chisq= 1.27001715 number of evaluations = 7315000
Iteration 39: I= 1.39265521 +- 0.00057115 chisq= 1.23964002 number of evaluations = 7605000
Iteration 40: I= 1.39263452 +- 0.00055972 chisq= 1.20870268 number of evaluations = 7900000
Iteration 41: I= 1.39262307 +- 0.00054850 chisq= 1.17874879 number of evaluations = 8200000
Iteration 42: I= 1.39254852 +- 0.00053745 chisq= 1.16129750 number of evaluations = 8505000
Iteration 43: I= 1.39243857 +- 0.00052710 chisq= 1.15977315 number of evaluations = 8815000
Iteration 44: I= 1.39247021 +- 0.00051732 chisq= 1.13508077 number of evaluations = 9130000
Iteration 45: I= 1.39271351 +- 0.00050792 chisq= 1.24897036 number of evaluations = 9450000
Iteration 46: I= 1.39273124 +- 0.00049882 chisq= 1.22197782 number of evaluations = 9775000
Iteration 47: I= 1.39268508 +- 0.00048980 chisq= 1.20060371 number of evaluations = 10105000
Iteration 48: I= 1.39270826 +- 0.00048160 chisq= 1.17649486 number of evaluations = 10440000
Iteration 49: I= 1.39268980 +- 0.00047340 chisq= 1.15289105 number of evaluations = 10780000
Iteration 50: I= 1.39277335 +- 0.00046527 chisq= 1.14803488 number of evaluations = 11125000
Iteration 51: I= 1.39276749 +- 0.00045757 chisq= 1.12517076 number of evaluations = 11475000
Iteration 52: I= 1.39279949 +- 0.00045023 chisq= 1.10612186 number of evaluations = 11830000
Iteration 53: I= 1.39282711 +- 0.00044337 chisq= 1.08724099 number of evaluations = 12190000
Iteration 54: I= 1.39269319 +- 0.00043621 chisq= 1.12046711 number of evaluations = 12555000
Iteration 55: I= 1.39263143 +- 0.00043015 chisq= 1.11316659 number of evaluations = 12925000
Iteration 56: I= 1.39259639 +- 0.00042410 chisq= 1.09724364 number of evaluations = 13300000
Iteration 57: I= 1.39266669 +- 0.00041788 chisq= 1.09451755 number of evaluations = 13680000
Iteration 58: I= 1.39269151 +- 0.00041183 chisq= 1.07746788 number of evaluations = 14065000
Iteration 59: I= 1.39275558 +- 0.00040646 chisq= 1.07498678 number of evaluations = 14455000
Iteration 60: I= 1.39273543 +- 0.00040089 chisq= 1.05829837 number of evaluations = 14850000
Iteration 61: I= 1.39278217 +- 0.00039653 chisq= 1.05112443 number of evaluations = 15250000
Iteration 62: I= 1.39276559 +- 0.00039173 chisq= 1.03508563 number of evaluations = 15655000
Iteration 63: I= 1.39279028 +- 0.00038664 chisq= 1.02087328 number of evaluations = 16065000
Iteration 64: I= 1.39288265 +- 0.00038148 chisq= 1.03883198 number of evaluations = 16480000
Iteration 65: I= 1.39293693 +- 0.00037982 chisq= 1.05905661 number of evaluations = 16900000
Iteration 66: I= 1.39293378 +- 0.00037673 chisq= 1.04282903 number of evaluations = 17325000
Iteration 67: I= 1.39296399 +- 0.00037183 chisq= 1.03079886 number of evaluations = 17755000
Iteration 68: I= 1.39297885 +- 0.00036871 chisq= 1.01684201 number of evaluations = 18190000
Iteration 69: I= 1.39301103 +- 0.00036469 chisq= 1.00705290 number of evaluations = 18630000
Iteration 70: I= 1.39300538 +- 0.00035991 chisq= 0.99259139 number of evaluations = 19075000
Iteration 71: I= 1.39299462 +- 0.00035519 chisq= 0.97890211 number of evaluations = 19525000
Iteration 72: I= 1.39296368 +- 0.00035098 chisq= 0.96964893 number of evaluations = 19980000
Iteration 73: I= 1.39298080 +- 0.00034688 chisq= 0.95760365 number of evaluations = 20440000
1.3929807984198352 +- 0.00034687790730998316 exact= 1.3932 real error= 0.00015733676440191774
Homework¶
- Generalize the Vegas algorithm so that the integration limits are given by arbitrary numbers $[a,b]$. You can still assume a hypercube in which all dimensions have the same limits $[a,b]$.
- Test Vegas on the same example $f=1/(1-\cos(k_x)\cos(k_y)\cos(k_z))/\pi^3$ but use limits $[-\pi,\pi]$ instead of $[0,\pi]$.
- Speed up the part of the code that redefines the grid (RefineGrid).
- Generalize Vegas so that it works for complex functions.
- Using Vegas, evaluate the following Lindhard function (a schematic integrand sketch is given after this list): \begin{eqnarray} P(\Omega,\vec{q}) = -2\int \frac{d^3k}{(2\pi)^3} \frac{f(\varepsilon_{\vec{k}+\vec{q}}) - f(\varepsilon_{\vec{k}})}{\Omega - \varepsilon_{\vec{k}+\vec{q}} + \varepsilon_{\vec{k}} + i\delta} \end{eqnarray} Here $\varepsilon_{\vec{k}}=k^2-k_F^2$. We will use $k_F=\frac{(\frac{9\pi}{4})^{1/3}}{r_s}$ with $r_s=2$, $f(x) = 1/(\exp(x/T)+1)$, $T=0.02 k_F^2$, $\delta = 0.002 k_F^2$, $\Omega=0$, $q=0.1 k_F$; the integration limits can be set to $[-3 k_F,3 k_F]$. The result for $P(\Omega=0,q\ll k_F)$ should be close to $-n_F$, where $n_F = \frac{k_F}{2\pi^2}$.
- Optional: Change the Vegas algorithm so that it computes $P(\Omega,\vec{q})$ at once for an array of $\Omega$ points, such as
linspace(0,0.5*kF*kF,200). Use $P(\Omega=0)$ to redefine the grid, so that we have most efficient integration at $\Omega=0$ (where the function is hardest to integrate).
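For the Lindhard exercise above, here is a schematic (untested) sketch of how the integrand might be written for a vectorized Vegas, once the integration limits have been generalized to $[-3k_F,3k_F]$ and complex integrands are supported; all names below are just illustrative choices, and $\vec{q}$ is taken along the $x$ axis:
from numpy import *
rs    = 2.0
kF    = (9*pi/4.)**(1./3.)/rs
T     = 0.02*kF**2
delta = 0.002*kF**2
Omega = 0.0
q     = array([0.1*kF, 0.0, 0.0])   # q of magnitude 0.1 kF, chosen along the x axis
def ferm(x):
    # Fermi function 1/(exp(x/T)+1); the argument is clipped to avoid overflow warnings
    return 1.0/(exp(clip(x/T, -100., 100.)) + 1.0)
def lindhard(k):
    # k is an (n,3) array of momenta in [-3kF,3kF]^3; returns the complex integrand
    eps_k  = sum(k**2, axis=1) - kF**2
    eps_kq = sum((k+q)**2, axis=1) - kF**2
    return -2*(ferm(eps_kq) - ferm(eps_k))/(Omega - eps_kq + eps_k + delta*1j)/(2*pi)**3
nF = kF/(2*pi**2)   # P(Omega=0, q<<kF) obtained from Vegas should come out close to -nF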