Wednesday, May 29, 2013

Monte Carlo

Math is pretty awesome, but some problems just have no closed-form solution. Similarly, many problems in physics are most easily solved by numerical simulation. There are many approaches to numerical simulation; one of the easiest (quite probably the easiest, actually) is called Monte Carlo, and it operates by injecting randomness into a system, seeing how things pan out, and repeating many times to reduce statistical uncertainty. That's how I would define it, in any case - you'll get different answers if you ask different people. The presence of randomness means that by increasing the number of samples, you can steadily decrease your error, which contrasts sharply with deterministic methods, where at a fixed grid or step size you're pretty much stuck with what you have.

Some examples of the usefulness of Monte Carlo techniques include risk analysis in investment, particle physics simulations (how do you think I heard about it?), and randomizing the parameters of a simulation in order to determine sources of error. It's a useful technique in many fields.

A simple application of Monte Carlo techniques is numerical integration. Some functions have no friendly antiderivative, which means we're stuck with one method of numerical integration or another. I'm by no means saying that Monte Carlo is the right way to go on this one; there are plenty of deterministic methods that will also work quite well, but it's an easy way to get your head around the power of randomness in numerical simulation. And it turns out that once you get to around 20-dimensional integrals (not too uncommon, it turns out, in theoretical physics), Monte Carlo yields lower uncertainties than deterministic methods.
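To put a rough number on that dimensional claim (this is the standard scaling argument, not anything specific to this example): the statistical error of a Monte Carlo estimate built from $N$ samples shrinks like

$$\sigma_{MC} \propto \frac{1}{\sqrt{N}},$$

independent of the dimension $d$, while a deterministic grid method such as the trapezoidal rule converges like $N^{-2/d}$ when its $N$ points are spread over a $d$-dimensional grid. Once $d$ gets past a handful of dimensions, the $1/\sqrt{N}$ scaling wins.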

Here's the plan: we have a function $f(x)$ that we want to integrate from $x_0$ to $x_1$. To do so, we'll generate a lot of random points in the rectangular region from $x_0$ to $x_1$ and $y=0$ to $y_{max}$ (suppose for the sake of simplicity that $f(x)$ is positive everywhere - the method extends to negative functions as well with a little more hassle). We'll then check each point to determine whether it's under the curve. In the end, we take the ratio of the number of points under the curve to the total number of points we generated, scale by the area of the rectangular region, and the result is our numerical integral! Here's a demonstration of how it works:
[Figure: Demonstration of Monte Carlo numerical integration on a Gaussian function with standard deviation 0.3. 1000 points (little blue dots) were generated, but only 379 were under the curve (circled in black). Thus this particular simulation yielded a value of 0.7580 for the integral, as compared with the true value, 0.7513.]
This plot was generated with the following simple MATLAB code:
function result = nIntegrate(func, minX, maxX, maxY, n)
    % Use maxY instead of finding the maximum to avoid evaluating the 
    % function unnecessarily. 
    totalArea = (maxX - minX) * maxY;
    
    % Generate and scale the random points
    xs = (rand(n,1) * (maxX - minX)) + minX;
    ys = rand(n,1) * maxY;
    pts = [xs ys];
    
    % Figure out which ones are under the curve
    filtered = pts(ys < func(xs), :);
    nUnderCurve = size(filtered, 1);  % count rows; length() returns the max
                                      % dimension, which breaks for < 2 points
    
    result = (nUnderCurve / n) * totalArea;
    
    % Make a nice plot to go with the analysis.
    figure; hold on;
    title(sprintf('Monte Carlo integration: %d points', n));
    xlabel('x');
    ylabel('y');

    % All points
    scatter(xs, ys, 1);
    
    % The function
    realX = linspace(minX, maxX, 100);
    realY = func(realX);
    plot(realX, realY, 'Color', 'red', 'LineWidth', 2);

    % The points under the curve
    scatter(filtered(:,1), filtered(:,2), 15, 'k');
    hold off;
end

The function was then called like this:
nIntegrate(@(x)gaussmf(x,[.3,0]),-1,1,1,1000)
(the at sign makes an anonymous function - in this case, a function that returns the value of a Gaussian centered at zero with standard deviation 0.3 at a given value of x.)

I can't quite figure out how to get MATLAB syntax highlighting in there, but you get the idea. The marvelous thing is that it takes only six lines of code to do the whole integration numerically. The rest is just plotting the results.
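For anyone without MATLAB handy, here's a rough Python translation of that six-line core (standard library only, plotting omitted) - a sketch of the same hit-or-miss estimator, not the original code:

```python
import math
import random

def n_integrate(func, min_x, max_x, max_y, n):
    """Hit-or-miss Monte Carlo integration of a non-negative func on [min_x, max_x]."""
    total_area = (max_x - min_x) * max_y
    under = 0
    for _ in range(n):
        x = random.uniform(min_x, max_x)
        y = random.uniform(0, max_y)
        if y < func(x):  # the point landed under the curve
            under += 1
    return (under / n) * total_area

# Same Gaussian as in the MATLAB call: height 1, centered at 0, sigma = 0.3
gauss = lambda x: math.exp(-x**2 / (2 * 0.3**2))

random.seed(0)  # fixed seed just so reruns are reproducible
print(n_integrate(gauss, -1, 1, 1, 100000))
```

With 100,000 points the answer lands within a percent or so of the exact value; the lambda plays the same role as the `@(x)` anonymous function above.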

Furthermore, you can easily see that the statistical uncertainty decreases as you increase the number of samples:
[Figure: The error in the integration decreases as the number of points increases.]
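You can check that falloff yourself with a quick sketch (again Python rather than the MATLAB above, and a compressed one-liner version of the estimator) - run the same integral at increasing $N$ and watch the error shrink roughly like $1/\sqrt{N}$:

```python
import math
import random

def hit_or_miss(func, min_x, max_x, max_y, n):
    # Count random points in the bounding box that fall under the curve,
    # then scale the hit fraction by the box area.
    under = sum(
        1 for _ in range(n)
        if random.uniform(0, max_y) < func(random.uniform(min_x, max_x))
    )
    return (under / n) * (max_x - min_x) * max_y

gauss = lambda x: math.exp(-x**2 / (2 * 0.3**2))

# Exact value: integral of exp(-x^2 / 2 sigma^2) over [-1, 1]
# is sigma * sqrt(2 pi) * erf(1 / (sigma * sqrt(2))).
true_value = 0.3 * math.sqrt(2 * math.pi) * math.erf(1 / (0.3 * math.sqrt(2)))

random.seed(1)
for n in (100, 1000, 10000, 100000):
    est = hit_or_miss(gauss, -1, 1, 1, n)
    print(n, abs(est - true_value))
```

Each tenfold increase in points buys you roughly a factor of $\sqrt{10} \approx 3$ in error, which is exactly the $1/\sqrt{N}$ behavior the plot shows.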
Cute, huh? And that's barely scratched the surface of what Monte Carlo can do.
