path integral control

Path integral control (Kappen, 2005) provides an analytic expression for the optimal path distribution of a stochastic optimal control problem, given any sampling distribution with sufficient support. In the context of GSBM, it is used to debias the Gaussian path approximation of the CondSOC solution.

Given a reference process $r(\bar{X} \mid x_{0}, x_{1})$ with drift $v_{t}$, the path-integral solution to the CondSOC problem (Proposition 4 of GSBM) is $p^{*}(\bar{X} \mid x_{0}, x_{1}) = \frac{1}{Z}\, \omega(\bar{X} \mid x_{0}, x_{1})\, r(\bar{X} \mid x_{0}, x_{1})$, where $Z$ is the normalizing constant and the importance weight is:

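$\omega(\bar{X} \mid x_{0}, x_{1}) = \exp\left(- \int_{0}^{1} \frac{1}{\sigma^{2}} \left[V_{t}(X_{t}) + \frac{1}{2} \| v_{t}(X_{t}) \|^{2}\right] dt - \int_{0}^{1} \frac{1}{\sigma}\, v_{t}(X_{t})^{\top} dW_{t}\right)$

Here $W_{t}$ is the standard Brownian motion driving the reference process, $dX_{t} = v_{t}(X_{t})\, dt + \sigma\, dW_{t}$, so the stochastic integral carries a factor $1/\sigma$ rather than $1/\sigma^{2}$.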
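The weight can be accumulated numerically along a simulated path. The following is a minimal sketch, not the GSBM implementation: it assumes an Euler-Maruyama discretization of $dX_{t} = v_{t}(X_{t})\, dt + \sigma\, dW_{t}$ with standard Brownian $W_{t}$, and the names `simulate_log_weight`, `drift`, and `potential` are hypothetical. Any conditioning on $x_{1}$ is assumed to be encoded in the drift that the caller passes in.

```python
import numpy as np

def simulate_log_weight(x0, drift, potential, sigma, n_steps=100, rng=None):
    """Simulate one path of dX = v dt + sigma dW on [0, 1]; return (path, log_omega).

    drift(t, x) -> array and potential(t, x) -> float are user-supplied
    (hypothetical) callables standing in for v_t and V_t.
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = 1.0 / n_steps
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    log_omega = 0.0
    for k in range(n_steps):
        t = k * dt
        v = np.asarray(drift(t, x), dtype=float)
        # Diffusion increment sigma * dW, with dW ~ N(0, dt * I).
        noise = sigma * rng.normal(scale=np.sqrt(dt), size=x.shape)
        # Left-point Riemann sum for the dt-integral of (V + 0.5 |v|^2) / sigma^2.
        log_omega -= (potential(t, x) + 0.5 * v @ v) / sigma**2 * dt
        # Ito sum for the stochastic integral: v . (sigma dW) / sigma^2 = v . dW / sigma.
        log_omega -= (v @ noise) / sigma**2
        # Euler-Maruyama step of the reference process.
        x = x + v * dt + noise
        path.append(x.copy())
    return np.stack(path), log_omega
```

Working with the log-weight rather than $\omega$ itself avoids underflow at small $\sigma$, where the $1/\sigma^{2}$ factor makes the exponent large in magnitude.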

The key insight is the connection to information-theoretic stochastic optimal control (Theodorou et al., 2010): because the control cost is quadratic (the squared $\ell_{2}$ norm of the control), the KL divergence between controlled and uncontrolled processes can be computed in closed form via Girsanov’s theorem, which yields the importance weight formula. When $V_{t} = 0$, the optimal solution reduces to the Brownian bridge, i.e. Brownian motion conditioned on reaching $x_{1}$, which is exactly the reference process used in DSBM.
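The Girsanov step can be sketched formally as follows. This is an informal outline under stated assumptions rather than the proof from the paper: it assumes the CondSOC objective has the standard form $\min_{u} \mathbb{E} \int_{0}^{1} \left[\frac{1}{2} \| u_{t} \|^{2} + V_{t}(X_{t})\right] dt$ over $dX_{t} = u_{t}\, dt + \sigma\, dW_{t}$, it treats the endpoint conditioning informally, and the symbol $b$ for the uncontrolled base process $dX_{t} = \sigma\, dW_{t}$ is notation introduced here.

$\mathrm{KL}(p \,\|\, b) = \mathbb{E}_{p} \int_{0}^{1} \frac{1}{2\sigma^{2}} \| u_{t} \|^{2}\, dt \quad \Rightarrow \quad p^{*} \propto \exp\left(- \frac{1}{\sigma^{2}} \int_{0}^{1} V_{t}(X_{t})\, dt\right) b$

$\frac{db}{dr} = \exp\left(- \int_{0}^{1} \frac{1}{2\sigma^{2}} \| v_{t}(X_{t}) \|^{2}\, dt - \int_{0}^{1} \frac{1}{\sigma^{2}}\, v_{t}(X_{t})^{\top} (\sigma\, dW_{t})\right)$

Multiplying the two expressions gives $p^{*} \propto \exp\left(- \frac{1}{\sigma^{2}} \int_{0}^{1} V_{t}(X_{t})\, dt\right) \frac{db}{dr}\, r$, which is the importance weight applied to $r$. In the second line $W_{t}$ is a standard Brownian motion under $r$ and $\sigma\, dW_{t}$ is the diffusion increment of the reference process. Setting $V_{t} = 0$ leaves $p^{*} \propto b$, which under the endpoint conditioning is the Brownian bridge.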

In GSBM, the Gaussian path approximation (optimized via splines) serves as the reference distribution $r$, and path integral resampling draws samples with probability proportional to $\omega$. This improves performance particularly at low noise (small $\sigma$), at the cost of ~8% additional runtime and the need for sequential simulation.
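The resampling step can be illustrated with self-normalized importance resampling over a batch of simulated reference paths. This is a hedged sketch, not the procedure from GSBM's Alg. 4: the function name `resample_paths` and its arguments are hypothetical, and it assumes the per-path log-weights $\log \omega$ have already been computed. The effective sample size (ESS) it returns is a common diagnostic, added here as an assumption, for how close $r$ is to $p^{*}$.

```python
import numpy as np

def resample_paths(paths, log_weights, n_samples, rng=None):
    """Draw paths with probability proportional to omega = exp(log_weights).

    paths: array of shape (n_paths, ...); log_weights: shape (n_paths,).
    Returns (resampled_paths, normalized_probs, effective_sample_size).
    """
    rng = np.random.default_rng() if rng is None else rng
    log_w = np.asarray(log_weights, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    w = np.exp(log_w - log_w.max())
    probs = w / w.sum()
    # Multinomial resampling with replacement, proportional to the weights.
    idx = rng.choice(len(probs), size=n_samples, replace=True, p=probs)
    # ESS = 1 / sum(p_i^2): equals n_paths for uniform weights, 1 if one path dominates.
    ess = 1.0 / np.sum(probs**2)
    return np.asarray(paths)[idx], probs, ess
```

An ESS close to the number of simulated paths indicates nearly uniform weights, meaning the reference is already close to optimal; an ESS near 1 indicates that a single path dominates the resampled set.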

Key Details

Provides exact optimal path distribution via importance reweighting of any reference process
Requires $\sigma > 0$ (stochastic processes only) and differentiable $V_{t}$
Importance weights have lower variance when the reference $r$ is closer to the optimal $p^{*}$ (motivates using optimized Gaussian paths as $r$)
Reduces to Brownian bridge conditioning when $V_{t} = 0$
Connected to linearly-solvable MDPs (Todorov, 2007) and probabilistic inference formulation of control (Levine, 2018)
Optional step in GSBM (Alg. 4): empirically helps most at low noise

method
