Importance sampling is probably one of the easiest sampling algorithm and one of the most fundamental one as well. The main purpose of it is to estimate the properties of a particular distribution, while only having samples generated from a different distribution rather than the distribution of interest. Depending on the application, the term may refer to the process of sampling from this alternative distribution, the process of inference, or both.
In this lecture note it is very in depth. Also I recommend you to watch this video as well.
This is probably one of the ugliest code I have ever written :
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 | from __future__ import division import pylab from pylab import arange,pi,sin,cos,sqrt from numpy import * #This is a random uniform density function def gSample(): return random.uniform(0, 1) * g(x=0) #The density function of the sample def f(x): return 0.2 * exp(-(x - 0.2) ** 2) + 0.8 * exp(-(x - 2.0) ** 2 / 0.2) def g(x): return 4.0 def importanceSampling(nSamples): samples = zeros(nSamples, dtype=float) w = zeros(nSamples, dtype=float) for i in range(nSamples): samples[i] = gSample() w[i] = f(samples[i]) / g(samples[i]) return samples, w def getExpectation(w, samples): nCols = len(w) wVal = 0 for i in range(nCols): wVal += samples[i] * w[i] return wVal / nCols def getExpectationSelfNormalized(w, samples): nCols = len(w) totalW = 0 wVal = 0 for i in range(nCols): wVal += samples[i] * w[i] totalW += w[i] return wVal / totalW def main(): x = arange(0, 4, 0.01) y = arange(-0.5, 4.5, 0.1) noOfSamples = 10000 realdata = f(x) box = ones(len(y)) * 0.8 params = {'backend': 'ps', 'axes.labelsize': 10, 'text.fontsize': 10, 'legend.fontsize': 10, 'xtick.labelsize': 8, 'ytick.labelsize': 8, 'text.usetex': False, } pylab.rcParams.update(params) box[:5] = 0 box[-5:] = 0 pylab.plot(x, realdata, 'b', lw = 6, label='The Distribution of real data') pylab.plot(y, box, 'r--', lw = 3, label='The Upper bound of distribution') samples, w = importanceSampling(noOfSamples) # show samples in the histogram as a normalised way pylab.hist(samples, normed=1, fc='g', label='The random samples that are generated') eX = getExpectation(w, samples) print "Importance sampling expected value: " + str(eX) eXSN = getExpectationSelfNormalized(w, samples) print "Importance sampling self-normalized expected value: " + str(eXSN) pylab.xlabel('x', fontsize = 16) pylab.ylabel('p(x)', fontsize = 16) pylab.legend(); pylab.show() main() |
Also this code draws the plot of distribution of interest and the uniform distribution as well(using matplotlib):
If you look carefully to the values the script outputs, the self-normalized expected value that the program outputs is very close to the point that the distribution in the graph peaks.
Also for a gentle introduction about algorithmic MC techniques, I recommend you to have look at this book.
Correction: There was a small typo instead of totalW, I wrote wWeights while initializing variables.


Recent Comments