Concept: Monte Carlo Better than Quadrature
Core Intuition
Q: Why Monte Carlo method is often the default method for high-dimensional integral approximation, instead of Quadrature methods? I got this question while reading Fearnhead et al. (2025), Scalable Monte Carlo for Bayesian Learning, p.6.
This comes directly from the following:
- Suppose we apply a cubature rule based on a grid of equally spaced points in each dimension.
- The spacing of these points will be
- Then there will be points in total
- If we have a cubature whose error decays in , for some power , then the error decays at a rate of
- Note for Monte Carlo method: its error decays at a rate of , by the central limit theorem
Mathematical Foundation
Q: Why cubature whose error decays in ? Taylor’s theorem with integral remainder.
Taylor’s Theorem with Remainder
For a function with continuous derivatives on an interval containing and :
where the remainder can be written as:
Proof of the Error Rate Bound in
Step 1: Take absolute value of the remainder:
Step 2: Apply triangle inequality:
Step 3: Use (definition of sup norm):
Step 4: Evaluate the integral. Let :
Step 5: Combine:
Step 6: Set and :
Derivation of Taylor’s Theorem with Remainder Formula
The key idea is to repeatedly apply integration by parts to express the error in terms of the highest derivative.
Starting Point: Fundamental Theorem of Calculus
This is just:
Step 1: Integration by Parts (once)
Apply integration by parts to with:
- , so
- , so
So:
Step 2: Integration by Parts (again)
Apply integration by parts to with:
- , so
- , so
So:
Pattern Recognition
After k integration by parts, we get:
Setting k = r:
Intuition
The integral form shows that the remainder is a weighted average of over the interval , with weights that give more importance to points closer to .
Key Equation
For curvature: If we have a cubature whose error decays in . for some power , then the error decays at a rate of .
As , , so .
For Monte Carlo: The error decays at a rate of , by the central limit theorem.
Component of
Insights
- Curse of dimensionality breaks quadrature: cubature error scales as , so for fixed and , doubling halves the exponent. Monte Carlo error is completely dimension-free.
- Crossover dimension: Monte Carlo beats an order- cubature rule when . For a second-order rule (), the crossover is ; for most practical integrands in ML/Bayesian settings, is in the hundreds or thousands.
- Higher-order rules don’t rescue quadrature at scale: increasing shifts the crossover dimension, but collecting enough smooth derivatives of in high is itself intractable.
- Monte Carlo’s slow 1D rate is a feature in high : looks poor compared to in 1D, but its constant does not grow exponentially with .
Pitfalls
- Monte Carlo is not always preferable: in or (for ), a well-chosen quadrature rule is faster to converge; defaulting to Monte Carlo in low dimensions wastes samples.
- Rate assumes i.i.d. samples: the CLT rate breaks down for correlated samples (e.g., MCMC chains), which introduce an effective-sample-size penalty.
- Ignoring variance: the CLT bound hides the integrand variance ; a high-variance integrand can make Monte Carlo impractical even when the dimension favors it. Variance reduction (importance sampling, control variates) is then essential.
- Confusing (total points) with (points per dimension): the grid has points, so comparing quadrature and Monte Carlo at the same (not the same ) is critical.
Connections
- Quasi-Monte Carlo todo: low-discrepancy sequences beat i.i.d. draws, achieving error for moderate .
- MCMC todo: Markov chains replace i.i.d. draws when the target is known up to a constant; rate survives asymptotically via ergodic theory.
- Importance Sampling todo: reweights a surrogate distribution to reduce variance; rate holds with a smaller constant.
- Curse of Dimensionality todo: cubature rate is the canonical example of exponential cost growth with .
- Central Limit Theorem todo: source of Monte Carlo’s guarantee for i.i.d. estimators.
References
[1] Fearnhead et al. (2025), Scalable Monte Carlo for Bayesian Learning