How to adjust for non-uniform sampling (log-scale/polar) in Monte Carlo integration?
Question
I am trying to perform Monte Carlo integration of a function while sampling non-uniformly. I need the method to work both on a logarithmic scale and in polar coordinates, since I will then combine the two and use polar coordinates with log-scale sampling in the radius.
I wrote a test script that tries to
- integrate a 2D Gaussian in polar coordinates (which should equal pi)
- integrate y(x) = x on a log scale from 10**-2 to 10**7 (which should equal ~0.5*10**14)
For testing purposes, I complement the calculation with a uniform Cartesian Monte Carlo that works. It is the non-uniformity of the sample that shifts my results.
import numpy as np

def function_to_integrate(x, y):
    return np.exp(-x**2 - y**2)

def polar_MC(polar):
    size = 100000
    integral = 0.
    integration_radius = 4.
    if polar:
        for _ in range(size):
            r = np.random.random()*integration_radius
            phi = np.random.random()*2.*np.pi
            x = r*np.cos(phi)
            y = r*np.sin(phi)
            jacobian_MC_polar = 1.
            integral += function_to_integrate(x, y) * jacobian_MC_polar
        integral = integral * np.pi * integration_radius**2 / size
    else:
        for _ in range(size):
            length = 2. * integration_radius
            x = np.random.random()*length - length/2.
            y = np.random.random()*length - length/2.
            integral += function_to_integrate(x, y)
        integral = integral * length**2 / size
    print('POLAR: True integral should be pi ', '; MC:', integral, polar)

def log_MC(log):
    size = 10000
    integral = 0.
    if log:
        for _ in range(size):
            x = np.random.uniform(-2, 7.)
            jacobian_MC_log = 1.
            integral += 10**x * jacobian_MC_log
    else:
        for _ in range(size):
            x = np.random.uniform(10**-2, 10**7)
            integral += x
    integral = integral*10**7 / size
    print('LOG: True integral should be 0.5*10**7*10**7 = 5*10**13; MC:', integral/10**13, '* 10**13', log)

polar_MC(polar=True)
polar_MC(polar=False)
log_MC(log=True)
log_MC(log=False)
I am unable to get the correct result from the polar and log-scale Monte Carlo. How should I set jacobian_MC for this to work? Or am I doing something else wrong?
I have tried using the standard Jacobians (r for polar and r*np.log(10) for logarithmic), and that did not help.
With the Jacobians set to 1, I get
POLAR: True integral should be pi ; MC: 11.041032315593327 True
POLAR: True integral should be pi ; MC: 3.108344559871783 False
LOG: True integral should be 0.5*10**7*10**7 = 5*10**13; MC: 0.48366198481209793 * 10**13 True
LOG: True integral should be 0.5*10**7*10**7 = 5*10**13; MC: 5.003437412553992 * 10**13 False
Increasing the number of samples does not help; the results are already close to converged.
What probability distribution should I divide the sampled points by?
Answer 1
Score: 0
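You got both the Jacobian and the normalization wrong for the polar integration. With r sampled uniformly on [0, R] and phi uniformly on [0, 2 pi], the Jacobian is r and the prefactor is 2 pi R, not the disc area pi R**2.
Here is the corrected code (Python 3.10, Windows x64):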
import numpy as np

rng = np.random.default_rng()

def integrand(x: np.float64, y: np.float64) -> np.float64:
    # Gaussian multiplied by the polar Jacobian r
    r = np.sqrt(x*x + y*y)
    jacobian = r
    return jacobian * np.exp(-r*r)

def sample_xy(R: np.float64):
    # r is uniform on [0, R), phi is uniform on [0, 2*pi)
    r = R * rng.random()
    phi = 2.0*np.pi*rng.random()
    return r*np.cos(phi), r*np.sin(phi)

N = 1000000
R = 100.0

s: np.float64 = 0.0
for k in range(0, N):
    x, y = sample_xy(R)
    s += integrand(x, y)

# Normalization: 2*pi from the phi PDF times R from the r PDF
print(s/N * 2.0*np.pi*R)
It consistently prints values around 3.14:
3.155748795359562
3.14192687470938
3.161890183195259
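To put a number on the spread between runs, the same estimator can be written in vectorized form together with a standard-error estimate. This is only a sketch: the function name `polar_mc`, the fixed seed, and the choice of comparing R = 4 against R = 100 are assumptions made for illustration, and phi is not sampled because the integrand is radially symmetric, so only the factor 2 pi from its PDF remains.

```python
import numpy as np

def polar_mc(R, n, rng):
    """Estimate the integral of exp(-x^2 - y^2) over a disc of radius R.

    r is drawn uniformly on [0, R). Returns (estimate, standard_error).
    """
    r = R * rng.random(n)
    # Per-sample value: prefactor 2*pi*R times Jacobian r times the Gaussian
    weights = 2.0 * np.pi * R * r * np.exp(-r * r)
    estimate = weights.mean()
    std_error = weights.std(ddof=1) / np.sqrt(n)
    return estimate, std_error

rng = np.random.default_rng(12345)  # fixed seed, chosen arbitrarily
for R in (4.0, 100.0):
    est, err = polar_mc(R, 1_000_000, rng)
    print(f"R = {R:5.1f}: {est:.4f} +/- {err:.4f}")
```

With uniform sampling in r, most samples at R = 100 land where the Gaussian is essentially zero, so the standard error should come out several times larger than at R = 4 for the same number of samples.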
UPDATE
This is not a matter of perimeter or area. It is a matter of how you construct the probability density function (PDF).
You have the function
$f(r) = r e^{-r^2}$
and the integral
$I = \int_0^{2\pi} d\phi \int_0^R dr \, f(r)$
You want to use Monte Carlo, which means sampling phi and sampling r. To sample anything at all, you must have a PDF (positive, properly normalized to 1, and so on).
So let's start with the integral over phi:
$I_\phi = \int_0^{2\pi} d\phi$
Multiply and divide it by 2 pi:
$I_\phi = 2\pi \int_0^{2\pi} \frac{d\phi}{2\pi}$
This leaves a proper PDF under the integral sign,
$\mathrm{PDF}(\phi)\, d\phi = \frac{d\phi}{2\pi}$
and gives the proper sampling
phi = 2 pi random()
What remains is the factor 2 pi, which you carry to the front of the integral. This is how it works: you build a PDF normalized to 1 under the integral, and whatever constant is left over you account for later, after the whole sampling routine. It is neither an area nor a perimeter; it is PDF construction and normalization.
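The rule described here (sample from a normalized PDF, then account for it in the result) can be sketched generically as averaging g(x) / pdf(x). The helper name `mc_integrate`, its parameters, and the test integrand cos(phi)^2 (whose integral over [0, 2 pi] is pi) are assumptions for illustration, not part of the original answer.

```python
import numpy as np

def mc_integrate(g, sample, pdf, n, rng):
    """Estimate the integral of g by averaging g(x) / pdf(x) for x drawn from pdf."""
    x = sample(rng, n)
    return np.mean(g(x) / pdf(x))

rng = np.random.default_rng(2024)  # fixed seed, chosen arbitrarily
two_pi = 2.0 * np.pi

# phi uniform on [0, 2*pi): pdf = 1/(2*pi), so dividing by it puts 2*pi up front
estimate = mc_integrate(
    g=lambda phi: np.cos(phi) ** 2,
    sample=lambda rng, n: two_pi * rng.random(n),
    pdf=lambda phi: np.full_like(phi, 1.0 / two_pi),
    n=200_000,
    rng=rng,
)
print(estimate)  # exact value of the integral is pi
```

Dividing by a constant PDF of 1/(2 pi) is the same as multiplying the sample mean by 2 pi, which is the constant carried to the front in the derivation.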
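The same holds for the radial integral:
$I_R = \int_0^R f(r)\, dr = R \int_0^R f(r)\, \frac{dr}{R}$
Here $\mathrm{PDF}(r)\, dr = dr/R$, so you can sample r = R random(), but you have to carry the factor R through to the final calculation step. Together with the 2 pi from the phi integral, this gives the prefactor 2 pi R used in the code above.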
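The same construction appears to carry over to the log-scale case from the question. If u = log10(x) is sampled uniformly on [u_min, u_max], then x has PDF 1 / (x ln(10) (u_max - u_min)), and each sample is divided by that. The sketch below is an assumption-laden illustration, not part of the original answer: the names `log_mc` and `polar_log_mc`, the lower radius cutoff 1e-4, and the seed are all hypothetical choices.

```python
import numpy as np

LN10 = np.log(10.0)

def log_mc(f, x_min, x_max, n, rng):
    """Estimate the integral of f over [x_min, x_max], sampling log10(x) uniformly."""
    u_min, u_max = np.log10(x_min), np.log10(x_max)
    x = 10.0 ** rng.uniform(u_min, u_max, n)
    # pdf(x) = 1 / (x * ln(10) * (u_max - u_min)); dividing by it gives this weight
    return np.mean(f(x) * x * LN10 * (u_max - u_min))

def polar_log_mc(r_min, R, n, rng):
    """Integral of exp(-r^2) over the annulus r_min < r < R, log-sampled radius."""
    # Area element is r dr dphi; the phi integral contributes the factor 2*pi
    return 2.0 * np.pi * log_mc(lambda r: r * np.exp(-r * r), r_min, R, n, rng)

rng = np.random.default_rng(7)  # fixed seed, chosen arbitrarily
# Exact value of the first integral is about 5e13, of the second about pi
print(log_mc(lambda x: x, 1e-2, 1e7, 1_000_000, rng) / 1e13, "* 10**13")
print(polar_log_mc(1e-4, 4.0, 1_000_000, rng))
```

Log sampling cannot reach r = 0, so the polar version integrates over an annulus; with a small enough cutoff the missing area is negligible for this integrand.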